I am running image-processing code with OpenCV in Python. Processing about 30 images takes a long time, so I tried to process them in parallel using multiprocessing. The multiprocessing part works well on the CPU, but I want to run it on the GPU (CUDA).
I use torch.multiprocessing to run the tasks in parallel, and I call torch.device('cuda') so that the class runs entirely on that particular device. When I run the code, it reports the device as "cuda", but no GPU processing actually happens.
import cv2
import numpy as np
import torch
import torch.nn as nn
from torch.multiprocessing import Process, Pool, Manager, set_start_method
import sys
import os
class RoadShoulderWidth(nn.Module):
    def __init__(self):
        super(RoadShoulderWidth, self).__init__()
        pass

    # Want to run the method below in parallel for 30 images.
    @staticmethod
    def get_dim(image, road_shoulder_width_list):
        ..... code

    def get_road_shoulder_width(self, _root_dir, _img_path_list):
        manager = Manager()
        road_shoulder_width_list = manager.list()
        processes = []
        for img_path in _img_path_list[:30]:
            img = cv2.imread(_root_dir + '/' + img_path)
            img = img[72 * 5:72 * 6, 0:1280]
            # Do work
            p = Process(target=self.get_dim, args=(img, road_shoulder_width_list))
            p.start()
            processes.append(p)
        for p in processes:
            p.join()
        return road_shoulder_width_list
Use the code below to run the class:
if __name__ == '__main__':
    root_dir = '/home/nikhil_m/r'
    img_path_list = os.listdir(root_dir)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print('Using device:', device)
    dataloader_kwargs = {'pin_memory': True}
    set_start_method('fork')
    obj = RoadShoulderWidth().to(device)
    val = obj.get_road_shoulder_width(str(root_dir), img_path_list)
    print(val)
    print(torch.cuda.is_available())
Can anybody suggest me how to fix this?
Your class RoadShoulderWidth is an nn.Module subclass, which lets you call .to(device). This only means that all other nn.Module objects or nn.Parameters that are members of your RoadShoulderWidth object are moved to the device. In your example there are none, so nothing happens.
In general, PyTorch does not move code to the GPU, only data. If all the data of a PyTorch operation is on the GPU (e.g. a + b, where a and b are both on the GPU), then the operation is executed on the GPU. You can move the data with a.to(device), given that a is a torch.Tensor object.
PyTorch can only execute its own operations on GPU. It's not able to execute OpenCV code on GPU.
Is output buffering enabled by default in Python's interpreter for sys.stdout?
If the answer is positive, what are all the ways to disable it?
Suggestions so far:
Use the -u command line switch
Wrap sys.stdout in an object that flushes after every write
Set PYTHONUNBUFFERED env var
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
Is there any other way to set some global flag in sys/sys.stdout programmatically during execution?
If you just want to flush after a specific write using print, see How can I flush the output of the print function?.
From Magnus Lycka's answer on a mailing list:
You can skip buffering for a whole python process using python -u or by setting the environment variable PYTHONUNBUFFERED.
You could also replace sys.stdout with some other stream like wrapper which does a flush after every call.
class Unbuffered(object):
    def __init__(self, stream):
        self.stream = stream
    def write(self, data):
        self.stream.write(data)
        self.stream.flush()
    def writelines(self, datas):
        self.stream.writelines(datas)
        self.stream.flush()
    def __getattr__(self, attr):
        return getattr(self.stream, attr)

import sys
sys.stdout = Unbuffered(sys.stdout)
print 'Hello'
I would rather put my answer in How to flush output of print function? or in Python's print function that flushes the buffer when it's called?, but since they were marked as duplicates of this one (which I do not agree with), I'll answer it here.
Since Python 3.3, print() supports the keyword argument "flush" (see documentation):
print('Hello World!', flush=True)
# reopen stdout file descriptor with write mode
# and 0 as the buffer size (unbuffered)
import io, os, sys
try:
    # Python 3, open as binary, then wrap in a TextIOWrapper with write-through.
    sys.stdout = io.TextIOWrapper(open(sys.stdout.fileno(), 'wb', 0), write_through=True)
    # If flushing on newlines is sufficient, as of 3.7 you can instead just call:
    # sys.stdout.reconfigure(line_buffering=True)
except TypeError:
    # Python 2
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
Credits: "Sebastian", somewhere on the Python mailing list.
Yes, it is.
You can disable it on the commandline with the "-u" switch.
Alternatively, you could call .flush() on sys.stdout on every write (or wrap it with an object that does this automatically)
This relates to Cristóvão D. Sousa's answer, but I couldn't comment yet.
A straight-forward way of using the flush keyword argument of Python 3 in order to always have unbuffered output is:
import functools
print = functools.partial(print, flush=True)
Afterwards, print will always flush the output directly (unless flush=False is given).
Note, (a) that this answers the question only partially as it doesn't redirect all the output. But I guess print is the most common way for creating output to stdout/stderr in python, so these 2 lines cover probably most of the use cases.
Note (b) that it only works in the module/script where you defined it. This can be good when writing a module as it doesn't mess with the sys.stdout.
Python 2 doesn't provide the flush argument, but you could emulate a Python 3-type print function as described here https://stackoverflow.com/a/27991478/3734258 .
import gc, os, subprocess, sys

def disable_stdout_buffering():
    # Appending to gc.garbage is a way to stop an object from being
    # destroyed. If the old sys.stdout is ever collected, it will
    # close() stdout, which is not good.
    gc.garbage.append(sys.stdout)
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

# Then this will give output in the correct order:
disable_stdout_buffering()
print "hello"
subprocess.call(["echo", "bye"])
Without saving the old sys.stdout, disable_stdout_buffering() isn't idempotent, and multiple calls will result in an error like this:
Traceback (most recent call last):
  File "test/buffering.py", line 17, in <module>
    print "hello"
IOError: [Errno 9] Bad file descriptor
close failed: [Errno 9] Bad file descriptor
Another possibility is:
def disable_stdout_buffering():
    fileno = sys.stdout.fileno()
    temp_fd = os.dup(fileno)
    sys.stdout.close()
    os.dup2(temp_fd, fileno)
    os.close(temp_fd)
    sys.stdout = os.fdopen(fileno, "w", 0)
(Appending to gc.garbage is not such a good idea because it's where unfreeable cycles get put, and you might want to check for those.)
The following works in Python 2.6, 2.7, and 3.2:
import os
import sys
buf_arg = 0
if sys.version_info[0] == 3:
    os.environ['PYTHONUNBUFFERED'] = '1'
    buf_arg = 1
sys.stdout = os.fdopen(sys.stdout.fileno(), 'a+', buf_arg)
sys.stderr = os.fdopen(sys.stderr.fileno(), 'a+', buf_arg)
Yes, it is enabled by default. You can disable it by using the -u option on the command line when calling python.
In Python 3, you can monkey-patch the print function, to always send flush=True:
_orig_print = print

def print(*args, **kwargs):
    _orig_print(*args, flush=True, **kwargs)
As pointed out in a comment, you can simplify this by binding the flush parameter to a value, via functools.partial:
print = functools.partial(print, flush=True)
You can also run Python with stdbuf utility:
stdbuf -oL python <script>
You can create an unbuffered file and assign this file to sys.stdout.
import sys
myFile= open( "a.log", "w", 0 )
sys.stdout= myFile
You can't magically change the system-supplied stdout; since it's supplied to your python program by the OS.
You can also use fcntl to change the file flags on the fly.
fl = fcntl.fcntl(fd.fileno(), fcntl.F_GETFL)
fl |= os.O_SYNC # or os.O_DSYNC (if you don't care the file timestamp updates)
fcntl.fcntl(fd.fileno(), fcntl.F_SETFL, fl)
One way to get unbuffered output would be to use sys.stderr instead of sys.stdout or to simply call sys.stdout.flush() to explicitly force a write to occur.
You could easily redirect everything printed by doing:
import sys; sys.stdout = sys.stderr
print "Hello World!"
Or to redirect just for a particular print statement:
print >>sys.stderr, "Hello World!"
To reset stdout you can just do:
sys.stdout = sys.__stdout__
It is possible to override only the write method of sys.stdout with one that calls flush. A suggested implementation is below.
def write_flush(args, w=stdout.write):
    w(args)
    stdout.flush()
The default value of the w argument keeps a reference to the original write method. After write_flush is defined, the original write can be overridden:
stdout.write = write_flush
The code assumes that stdout is imported with from sys import stdout.
A variant that works without crashing (at least on win32; python 2.7, ipython 0.12) when called repeatedly:
def DisOutBuffering():
    if sys.stdout.name == '<stdout>':
        sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
    if sys.stderr.name == '<stderr>':
        sys.stderr = os.fdopen(sys.stderr.fileno(), 'w', 0)
(I've posted a comment, but it got lost somehow. So, again:)
As I noticed, CPython (at least on Linux) behaves differently depending on where the output goes. If it goes to a tty, then the output is flushed after each '\n'
If it goes to a pipe/process, then it is buffered and you can use the flush() based solutions or the -u option recommended above.
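As a rough sketch of that tty-versus-pipe distinction, the helpers below report whether a stream is attached to a terminal and switch it to line buffering if it is not already line-buffered. The names ensure_line_buffered and describe are made up for illustration, and the code assumes Python 3.7 or later, where io.TextIOWrapper.reconfigure() is available.

```python
import io
import sys


def ensure_line_buffered(stream):
    """Switch a text stream to line buffering if it is not already on.

    Hypothetical helper; relies on TextIOWrapper.reconfigure() (Python 3.7+).
    Streams that are not TextIOWrapper instances are returned unchanged.
    """
    if isinstance(stream, io.TextIOWrapper) and not stream.line_buffering:
        stream.reconfigure(line_buffering=True)
    return stream


def describe(stream):
    """Report whether the stream is a terminal and whether it is line-buffered."""
    return {
        "isatty": stream.isatty(),
        "line_buffering": getattr(stream, "line_buffering", None),
    }
```

Calling ensure_line_buffered(sys.stdout) early in a script should make output appear after each newline even when stdout is a pipe. It does not make stdout fully unbuffered; partial lines still wait for a newline or an explicit flush().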
Slightly related to output buffering:
If you iterate over the lines in the input with
for line in sys.stdin:
    ...
then the for implementation in CPython will collect the input for a while and then execute the loop body for a bunch of input lines. If your script is about to write output for each input line, this might look like output buffering but it's actually batching, and therefore, none of the flush(), etc. techniques will help that.
Interestingly, you don't have this behaviour in pypy.
To avoid this, you can use
while True:
    line=sys.stdin.readline()
    ...
I am using tesseract to perform OCR on screengrabs. I have an app using a tkinter window leveraging self.after in the initialization of my class to perform constant image scrapes and update label, etc values in the tkinter window. I have searched for multiple days and can't find any specific examples how to leverage CREATE_NO_WINDOW with Python3.6 on a Windows platform calling tesseract with pytesseract.
This is related to this question:
How can I hide the console window when I run tesseract with pytesser
I have only been programming in Python for 2 weeks and don't understand what the steps in the above question are or how to perform them. I opened the pytesseract.py file, reviewed it, and found the proc = subprocess.Popen(command, stderr=subprocess.PIPE) line, but when I tried editing it I got a bunch of errors that I couldn't figure out.
#!/usr/bin/env python
'''
Python-tesseract. For more information: https://github.com/madmaze/pytesseract
'''
try:
    import Image
except ImportError:
    from PIL import Image
import os
import sys
import subprocess
import tempfile
import shlex
# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'tesseract'
__all__ = ['image_to_string']
def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False,
                  config=None):
    '''
    runs the command:
        `tesseract_cmd` `input_filename` `output_filename_base`
    returns the exit status of tesseract, as well as tesseract's stderr output
    '''
    command = [tesseract_cmd, input_filename, output_filename_base]
    if lang is not None:
        command += ['-l', lang]
    if boxes:
        command += ['batch.nochop', 'makebox']
    if config:
        command += shlex.split(config)
    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
    status = proc.wait()
    error_string = proc.stderr.read()
    proc.stderr.close()
    return status, error_string

def cleanup(filename):
    ''' tries to remove the given filename. Ignores non-existent files '''
    try:
        os.remove(filename)
    except OSError:
        pass

def get_errors(error_string):
    '''
    returns all lines in the error_string that start with the string "error"
    '''
    error_string = error_string.decode('utf-8')
    lines = error_string.splitlines()
    error_lines = tuple(line for line in lines if line.find(u'Error') >= 0)
    if len(error_lines) > 0:
        return u'\n'.join(error_lines)
    else:
        return error_string.strip()

def tempnam():
    ''' returns a temporary file-name '''
    tmpfile = tempfile.NamedTemporaryFile(prefix="tess_")
    return tmpfile.name

class TesseractError(Exception):
    def __init__(self, status, message):
        self.status = status
        self.message = message
        self.args = (status, message)
def image_to_string(image, lang=None, boxes=False, config=None):
    '''
    Runs tesseract on the specified image. First, the image is written to disk,
    and then the tesseract command is run on the image. Tesseract's result is
    read, and the temporary files are erased.
    Also supports boxes and config:
    if boxes=True
        "batch.nochop makebox" gets added to the tesseract call
    if config is set, the config gets appended to the command.
        ex: config="-psm 6"
    '''
    if len(image.split()) == 4:
        # In case we have 4 channels, lets discard the Alpha.
        # Kind of a hack, should fix in the future some time.
        r, g, b, a = image.split()
        image = Image.merge("RGB", (r, g, b))
    input_file_name = '%s.bmp' % tempnam()
    output_file_name_base = tempnam()
    if not boxes:
        output_file_name = '%s.txt' % output_file_name_base
    else:
        output_file_name = '%s.box' % output_file_name_base
    try:
        image.save(input_file_name)
        status, error_string = run_tesseract(input_file_name,
                                             output_file_name_base,
                                             lang=lang,
                                             boxes=boxes,
                                             config=config)
        if status:
            errors = get_errors(error_string)
            raise TesseractError(status, errors)
        f = open(output_file_name, 'rb')
        try:
            return f.read().decode('utf-8').strip()
        finally:
            f.close()
    finally:
        cleanup(input_file_name)
        cleanup(output_file_name)
def main():
    if len(sys.argv) == 2:
        filename = sys.argv[1]
        try:
            image = Image.open(filename)
            if len(image.split()) == 4:
                # In case we have 4 channels, lets discard the Alpha.
                # Kind of a hack, should fix in the future some time.
                r, g, b, a = image.split()
                image = Image.merge("RGB", (r, g, b))
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image))
    elif len(sys.argv) == 4 and sys.argv[1] == '-l':
        lang = sys.argv[2]
        filename = sys.argv[3]
        try:
            image = Image.open(filename)
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image, lang=lang))
    else:
        sys.stderr.write('Usage: python pytesseract.py [-l lang] input_file\n')
        exit(2)

if __name__ == '__main__':
    main()
The code I am leveraging is similar to the example in the similar question:
def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)
    # Apply threshold to get image with only black and white
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
    return result
When it gets to the following line, there is a flash of a black console window for less than a second and then it closes when it runs the command.
result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
Here is the picture of the console window: [screenshot: Program Files (x86)_Tesseract]
Here is what is suggested from the other question:
You're currently working in IDLE, in which case I don't think it really matters if a console window pops up. If you're planning to develop a GUI app with this library, then you'll need to modify the subprocess.Popen call in pytesser.py to hide the console. I'd first try the CREATE_NO_WINDOW process creation flag. – eryksun
I would greatly appreciate any help for how to modify the subprocess.Popen call in the pytesseract.py library file using CREATE_NO_WINDOW. I am also not sure of the difference between pytesseract.py and pytesser.py library files. I would leave a comment on the other question to ask for clarification but I can't until I have more reputation on this site.
I did more research and decided to learn more about subprocess.Popen:
Documentation for subprocess
I also referenced the following articles:
using python subprocess.popen..can't prevent exe stopped working prompt
I changed the original line of code in pytesseract.py:
proc = subprocess.Popen(command, stderr=subprocess.PIPE)
to the following:
proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW)
I ran the code and got the following error:
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\tkinter\__init__.py", line 1699, in __call__
    return self.func(*args)
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 403, in gather_data
    update_cash_button()
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 208, in update_cash_button
    currentCash = get_string(src_path + "cash.png")
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 150, in get_string
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 125, in image_to_string
    config=config)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 49, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW)
NameError: name 'CREATE_NO_WINDOW' is not defined
I then defined the CREATE_NO_WINDOW variable:
#Assignment of the value of CREATE_NO_WINDOW
CREATE_NO_WINDOW = 0x08000000
I got the value of 0x08000000 from the above linked article. After adding the definition I ran the application and I didn't get any more console window popups.
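For anyone who prefers not to edit the library file, the same idea can be sketched as a small standalone wrapper. This is only an illustration under some assumptions: the helper names popen_kwargs_no_console and run_quietly are made up, and the flag is applied only on Windows because subprocess.Popen rejects a non-zero creationflags elsewhere. On Python 3.7 and later the constant is also available on Windows as subprocess.CREATE_NO_WINDOW.

```python
import subprocess
import sys

# Value of the Windows CREATE_NO_WINDOW process-creation flag, as used above.
CREATE_NO_WINDOW = 0x08000000


def popen_kwargs_no_console():
    """Return extra Popen keyword arguments that hide the console on Windows."""
    if sys.platform == "win32":
        return {"creationflags": CREATE_NO_WINDOW}
    return {}  # creationflags must stay at its default on other platforms


def run_quietly(command):
    """Run a command, capture its stderr, and return (exit status, stderr bytes)."""
    proc = subprocess.Popen(command, stderr=subprocess.PIPE,
                            **popen_kwargs_no_console())
    _, error_bytes = proc.communicate()
    return proc.returncode, error_bytes
```

Using communicate() rather than wait() followed by stderr.read() also avoids a possible deadlock if the child writes a lot to stderr.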
Today I worked on a simple script to checksum files with all the available hashlib algorithms (md5, sha1, ...). I wrote and debugged it in Python 2, but when I ported it to Python 3 it stopped working. The funny thing is that it works for small files but not for big ones. I thought there was a problem with the way I was buffering the file, but the error message makes me think it is related to the way I am doing the hexdigest. Here is a copy of my entire script, so feel free to copy it, use it, and help me figure out what the problem is. The error I get when checksumming a 250 MB file is:
"'utf-8' codec can't decode byte 0xf3 in position 10: invalid continuation byte"
I googled it, but couldn't find anything that fixes it. Also, if you see better ways to optimize it, please let me know. My main goal is to make it work 100% in Python 3. Thanks.
#!/usr/local/bin/python33
import hashlib
import argparse
def hashFile(algorithm = "md5", filepaths=[], blockSize=4096):
    algorithmType = getattr(hashlib, algorithm.lower())() #Default: hashlib.md5()
    #Open file and extract data in chunks
    for path in filepaths:
        try:
            with open(path) as f:
                while True:
                    dataChunk = f.read(blockSize)
                    if not dataChunk:
                        break
                    algorithmType.update(dataChunk.encode())
                yield algorithmType.hexdigest()
        except Exception as e:
            print (e)

def main():
    #DEFINE ARGUMENTS
    parser = argparse.ArgumentParser()
    parser.add_argument('filepaths', nargs="+", help='Specified the path of the file(s) to hash')
    parser.add_argument('-a', '--algorithm', action='store', dest='algorithm', default="md5",
                        help='Specifies what algorithm to use ("md5", "sha1", "sha224", "sha384", "sha512")')
    arguments = parser.parse_args()
    algo = arguments.algorithm
    if algo.lower() in ("md5", "sha1", "sha224", "sha384", "sha512"):
Here is the code that works in Python 2. I am including it in case you want to use it without having to modify the one above.
#!/usr/bin/python
import hashlib
import argparse
def hashFile(algorithm = "md5", filepaths=[], blockSize=4096):
    '''
    Hashes a file. In order to reduce the amount of memory used by the script, it hashes the file in chunks instead of putting
    the whole file in memory
    '''
    algorithmType = hashlib.new(algorithm) #getattr(hashlib, algorithm.lower())() #Default: hashlib.md5()
    #Open file and extract data in chunks
    for path in filepaths:
        try:
            with open(path, mode = 'rb') as f:
                while True:
                    dataChunk = f.read(blockSize)
                    if not dataChunk:
                        break
                    algorithmType.update(dataChunk)
                yield algorithmType.hexdigest()
        except Exception as e:
            print e

def main():
    #DEFINE ARGUMENTS
    parser = argparse.ArgumentParser()
    parser.add_argument('filepaths', nargs="+", help='Specified the path of the file(s) to hash')
    parser.add_argument('-a', '--algorithm', action='store', dest='algorithm', default="md5",
                        help='Specifies what algorithm to use ("md5", "sha1", "sha224", "sha384", "sha512")')
    arguments = parser.parse_args()
    #Call generator function to yield hash value
    algo = arguments.algorithm
    if algo.lower() in ("md5", "sha1", "sha224", "sha384", "sha512"):
        for hashValue in hashFile(algo, arguments.filepaths):
            print hashValue
    else:
        # Use algo here; the name "algorithm" is not defined inside main()
        print "Algorithm {0} is not available in this script".format(algo)

if __name__ == "__main__":
    main()
I haven't tried it in Python 3, but I get the same error in Python 2.7.5 for binary files (the only difference is that mine is with the ascii codec). Instead of encoding the data chunks, open the file directly in binary mode:
with open(path, 'rb') as f:
    while True:
        dataChunk = f.read(blockSize)
        if not dataChunk:
            break
        algorithmType.update(dataChunk)
    yield algorithmType.hexdigest()
Apart from that, I'd use the method hashlib.new instead of getattr, and hashlib.algorithms_available to check if the argument is valid.
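Putting those suggestions together, a minimal Python 3 sketch could look like the following. The function name hash_file and the 64 KiB default block size are choices made for this illustration, not something taken from the original script. Unlike the script in the question, it creates a fresh hash object for every file, so each digest covers one file only.

```python
import hashlib


def hash_file(path, algorithm="md5", block_size=65536):
    """Return the hex digest of one file, reading it in binary chunks."""
    if algorithm not in hashlib.algorithms_available:
        raise ValueError("Unsupported algorithm: {0}".format(algorithm))
    hasher = hashlib.new(algorithm)  # fresh hash object for this file
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" at EOF
        for chunk in iter(lambda: f.read(block_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Because the file is opened in binary mode, update() receives bytes directly and no encode() or decode() step is involved, which is what caused the UnicodeDecodeError on files that are not valid UTF-8.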
I have a simple GUI which runs various scripts from another Python file. Everything works fine until the GUI runs a function that includes a while loop, at which point the GUI seems to crash and becomes inactive. Does anybody have any ideas as to how this can be overcome? I believe this has something to do with the GUI being updated. Thanks. Below is a simplified version of my GUI.
GUI
#!/usr/bin/env python
# Python 3
from tkinter import *
from tkinter import ttk
from Entry import ConstrainedEntry
import tkinter.messagebox
import functions
AlarmCode = "2222"
root = Tk()
root.title("Simple Interface")
mainframe = ttk.Frame(root, padding="3 3 12 12")
mainframe.grid(column=0, row=0, sticky=(N, W, E, S))
mainframe.columnconfigure(0, weight=1)
mainframe.rowconfigure(0, weight=1)
ttk.Button(mainframe, width=12,text="ButtonTest",
           command=lambda: functions.test()).grid(
               column=5, row=5, sticky=SE)
for child in mainframe.winfo_children():
    child.grid_configure(padx=5, pady=5)
root.mainloop()
functions
import time

def test():
    period = 0
    while True:
        if (period) <=100:
            time.sleep(1)
            period +=1
            print(period)
        else:
            print("100 seconds has passed")
            break
What will happen in the above is that when the loop is running the application will crash. If I insert a break in the else statement after the period has elapsed, everything will work fine. I want users to be able to click when in loops as this GUI will run a number of different functions.
Don't use time.sleep in the same thread as your Tkinter code: it freezes the GUI until the execution of test is finished. To avoid this, you should use the widget's after method:
# GUI
ttk.Button(mainframe, width=12,text="ButtonTest",
           command=lambda: functions.test(root)).grid(
               column=5, row=5, sticky=SE)
# functions
def test(root, period=0):
    if period <= 100:
        period += 1
        print(period)
        root.after(1000, lambda: test(root, period))
    else:
        print("100 seconds has passed")
Update:
In your comment you also add that your code won't use time.sleep, so your original example may not be the most appropriate. In that case, you can create a new thread to run your intensive code.
Note that I posted the after alternative first because multithreading should be used only if it is completely necessary: it adds overhead to your application and makes your code harder to debug.
from threading import Thread
ttk.Button(mainframe, width=12,text="ButtonTest",
           command=lambda: Thread(target=functions.test).start()).grid(
               column=5, row=5, sticky=SE)
# functions
import time

def test():
    for x in range(100):
        time.sleep(1) # Simulate intense task (not real code!)
        print(x)
    print("100 seconds has passed")
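If the background thread has to hand a result back to the GUI, one common pattern (not part of the answer above, so treat it as a sketch) is to put the result on a queue and poll that queue from the Tk thread with after, since Tkinter widgets are generally not safe to touch from other threads. The names run_in_background and poll_results are made up for this example; root is assumed to be any object with Tkinter's after(ms, func, *args) method.

```python
import queue
import threading


def run_in_background(task, results):
    """Start task() on a daemon thread and put its return value on `results`."""
    worker = threading.Thread(target=lambda: results.put(task()), daemon=True)
    worker.start()
    return worker


def poll_results(root, results, on_done, interval_ms=100):
    """Check the queue from the GUI thread; reschedule via root.after until done."""
    try:
        value = results.get_nowait()
    except queue.Empty:
        # Nothing yet: look again later without blocking the event loop.
        root.after(interval_ms, poll_results, root, results, on_done, interval_ms)
    else:
        on_done(value)  # runs in the GUI thread, so updating widgets here is fine
```

A button command could then create a queue.Queue(), call run_in_background(functions.test, results), and follow it with poll_results(root, results, callback), where callback updates a label with the result.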