Sqlite3 crashes from Process on Python 3 - macos

I am trying to initialize my database from from a process but I get return code -11, which seems to be segmentation fault.
import requests
import sqlite3
from multiprocessing import Process
def initializer():
print("Before sqlite connect")
connection = sqlite3.connect("/tmp/files.db")
print("Before sqlite cursor")
connection.cursor()
if __name__ == "__main__":
response = requests.get("http://google.com")
init_process = Process(target=initializer)
init_process.start()
init_process.join()
print("Exit code: %d" % init_process.exitcode)
The process crashes before I can create the cursor:
% python3 file-scanner.py
Before sqlite connect
Exit code: -11
It only happens on Python 3 on OS X it seems. And only happens if I use requests before connecting to sqlite.
I don't have the opportunity to switch to Python 2 -- nor do I really want to.

Related

Multiprocessing runtime error freeze_support() in Mac 64 bit

I am trying to learn Threading and Multiprocessing on a MacOS. I am unable to launch the processes though, with python giving the following error message.
Error
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
my code:
parallel_processing.py
import multiprocessing
import time
start= time.perf_counter()
def do_something():
print('sleeping 1 sec....')
time.sleep(1)
return('done sleeping...')
# do_something()
p1 = multiprocessing.Process(target = do_something)
p2 = multiprocessing.Process(target = do_something)
p1.start()
p2.start()
finish= time.perf_counter()
print(f'finished in {finish-start} seconds')
It seems that you can resolve the issue by putting your code inside the block if __name__ == '__main__'. For example:
import multiprocessing as mp
import time
def do_something():
...
if __name__ == '__main__':
p1 = mp.Process(target=do_something)
p2 = mp.Process(target=do_something)
p1.start()
p2.start()
...
I'm not sure this is resulted from the recent macOS update or python, but it seems to have to do with how OS creates new threads, i.e. forking versus spwaning. Please see this blog post for details.

How to run perticular code in gpu using PyTorch?

I am using an image processing code in python opencv. Since that process is taking a lot of time to process say 30 images. I tried to process these image parallel using Multiprocessing. The multiprocessing part is working good in CPU but I want to use that multiprocessing thing in GPU(cuda).
I use torch.multiprocessing for running task in parallel. So I am using torch.device('cuda') for our class to run whole thing in to this perticular device. When I run the code it's showing device using "cuda" but not using any GPU processing.
import cv2
import numpy as np
import torch
import torch.nn as nn
from torch.multiprocessing import Process, Pool, Manager, set_start_method
import sys
import os
class RoadShoulderWidth(nn.Module):
def __init__(self):
super(RoadShoulderWidth, self).__init__()
pass
// Want to run below method in parallel for 30 images.
#staticmethod
def get_dim(image, road_shoulder_width_list):
..... code
def get_road_shoulder_width(self, _root_dir, _img_path_list):
manager = Manager()
road_shoulder_width_list = manager.list()
processes = []
for img_path in img_path_list[:30]:
img = cv2.imread(_root_dir + '/' + img_path)
img = img[72 * 5:72 * 6, 0:1280]
# Do work
p = Process(target=self.get_dim,args=(img,road_shoulder_width_list))
p.start()
processes.append(p)
for p in processes:
p.join()
return road_shoulder_width_list
Use below set of code to run your class
if __name__ == '__main__':
root_dir = '/home/nikhil_m/r'
img_path_list = os.listdir(root_dir)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
dataloader_kwargs = {'pin_memory': True}
set_start_method('fork')
obj = RoadShoulderWidth().to(device)
val = obj.get_road_shoulder_width(str(root_dir), img_path_list)
print(val)
print(torch.cuda.is_available())
Can anybody suggest me how to fix this?
Your class RoadShoulderWidth is a nn.Module subclass which lets you use .to(device). This only means that all other nn.Module objects or nn.Parameters that are members of your RoadShoulderWidth object are moved to the device. As from your example, there are none, so nothing happens.
In general PyTorch does not move code to GPU but data. If all data of a pytorch operation are on the GPU (e.g. a + b, a and b are on GPU) then the operation is executed on the GPU. You can move the data with a.to(device), given a is a torch.Tensor object.
PyTorch can only execute its own operations on GPU. It's not able to execute OpenCV code on GPU.

wxPython GUI crashing due to uncaught exception 'NSRangeException'

I have read that pub/sub mechanism is a thread-safe mean of communicating from a thread to a GUI ( https://www.blog.pythonlibrary.org/2010/05/22/wxpython-and-threads/ )
The program below, which has been reduce from a bigger one to the essence of the problem, crashes after a number of writings from the thread to the wx.TextCtrl area of the GUI through the pub/sub mechanism. For experimenting several writing rates, it can be changed in the time.sleep(x) statement. Whatever x is, it crashes (tested by myself from 0.1 to 100 seconds), it is not a matter of how frequently the thread writes to the GUI.
Basically, the GUI creates the text control and subscribes to a pub/sub mechanism. The thread writes periodically into the publisher. It works fine until crashing with exception:
2017-10-21 13:50:26.221 Python[20665:d07] An uncaught exception was raised
2017-10-21 13:50:26.222 Python[20665:d07] NSMutableRLEArray insertObject:range:: Out of bounds
2017-10-21 13:50:26.222 Python[20665:d07] ([…])
2017-10-21 13:50:26.223 Python[20665:d07] *** Terminating app due to uncaught exception 'NSRangeException', reason: 'NSMutableRLEArray
insertObject:range:: Out of bounds'
*** First throw call stack:
([…]
)
libc++abi.dylib: terminating with uncaught exception of type NSException
The ‘out of bounds’ probably relates to an index on which I have no access from the Python code… I am unable to go further. Could anybody help ?
Using Python 2.7.12 | wxPython 3.0.2.0
Running Mac OS X 10.9.5 | on platform x86_64
The code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__version__ = '04'
import sys
import threading
import time
import platform
try:
import wx
except ImportError:
raise ImportError ("The wxPython module is required to run this program")
try:
from pubsub import pub
except ImportError:
from wx.lib.pubsub import pub
class CrashFrame(wx.Frame):
def __init__(self,parent,id,title):
wx.Frame.__init__(self,parent,id,title)
self.hor_sizer = wx.BoxSizer(wx.HORIZONTAL)
self.textlogger = wx.TextCtrl(self, size=(520,110), style=wx.TE_MULTILINE | wx.VSCROLL, value="" )
self.hor_sizer.Add(self.textlogger)
self.SetSizerAndFit(self.hor_sizer)
self.Show(True)
self.crashthread = SocketClientThread()
self.run()
def run(self):
self.logthistext('Using Python {} | wxPython {}'.format(sys.version.split()[0], wx.VERSION_STRING))
self.logthistext('Is thread running ? - %s' % self.crashthread.isAlive())
# Create a listener in the GUI form
pub.subscribe(self.logthistext, 'fromSocketListener')
def logthistext(self, msg):
self.textlogger.AppendText('{}\n'.format(msg)) # a good way to write on the text area
class SocketClientThread(threading.Thread):
def __init__(self):
super(SocketClientThread, self).__init__()
self.alive = threading.Event()
self.alive.set()
self.start() # thread will start at creation of the class instance
def run(self):
while self.alive.isSet():
data = 'A bunch of bytes'
pub.sendMessage('fromSocketListener', msg=data)
time.sleep(10) # or 0.1 or 100, whatever, it crashes
continue
if __name__ == '__main__':
app = wx.App()
frame = CrashFrame(None,-1,'Crash Program - v{}'.format(__version__))
app.MainLoop()
wxPython’s Threadsafe Methods
wx.PostEvent
wx.CallAfter
wx.CallLater
def run(self):
x=0
while self.alive.isSet():
data = 'A bunch of bytes-'+str(x)
wx.CallAfter(pub.sendMessage,'fromSocketListener', msg=data)
time.sleep(0.2) # or 0.1 or 100, whatever, it crashes
x +=1

`ProcessPoolExecutor` works on Ubuntu, but fails with `BrokenProcessPool` when running Jupyter 5.0.0 notebook with Python 3.5.3 on Windows 10

I'm running Jupyter 5.0.0 notebook with Python 3.5.3 on Windows 10. The following example code fails to run:
from concurrent.futures import as_completed, ProcessPoolExecutor
import time
import numpy as np
def do_work(idx1, idx2):
time.sleep(0.2)
return np.mean([idx1, idx2])
with ProcessPoolExecutor(max_workers=4) as executor:
futures = set()
for idx in range(32):
future = winprocess.submit(
executor, do_work, idx, idx * 2
)
futures.add(future)
for future in as_completed(futures):
print(future.result())
... and throws BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
The code works perfectly fine on Ubuntu 14.04.
I've understand that Windows doesn't have os.fork, thus multiprocessing is handled differently, and doesn't always play nice with interactive mode and Jupyter.
What are some workarounds to make ProcessPoolExecutor work in this case?
There are some similar questions, but they relate to multiprocessing.Pool:
multiprocessing.Pool in jupyter notebook works on linux but not windows
Closer inspection shows that a Jupyter notebook can run external python modules which is parallelized using ProcessPoolExecutor. So, a solution is to do the parallelizable part of your code in a module and call it from the Jupyter notebook.
That said, this can be generalized as a utility. The following can be stored as a module, say, winprocess.py and imported by jupyter.
import inspect
import types
def execute_source(callback_imports, callback_name, callback_source, args):
for callback_import in callback_imports:
exec(callback_import, globals())
exec('import time' + "\n" + callback_source)
callback = locals()[callback_name]
return callback(*args)
def submit(executor, callback, *args):
callback_source = inspect.getsource(callback)
callback_imports = list(imports(callback.__globals__))
callback_name = callback.__name__
future = executor.submit(
execute_source,
callback_imports, callback_name, callback_source, args
)
return future
def imports(callback_globals):
for name, val in list(callback_globals.items()):
if isinstance(val, types.ModuleType) and val.__name__ != 'builtins' and val.__name__ != __name__:
import_line = 'import ' + val.__name__
if val.__name__ != name:
import_line += ' as ' + name
yield import_line
Here is how you would use this:
from concurrent.futures import as_completed, ProcessPoolExecutor
import time
import numpy as np
import winprocess
def do_work(idx1, idx2):
time.sleep(0.2)
return np.mean([idx1, idx2])
with ProcessPoolExecutor(max_workers=4) as executor:
futures = set()
for idx in range(32):
future = winprocess.submit(
executor, do_work, idx, idx * 2
)
futures.add(future)
for future in as_completed(futures):
print(future.result())
Notice that executor has been changed with winprocess and the original executor is passed to the submit function as a parameter.
What happens here is that the notebook function code and imports are serialized and passed to the module for execution. The code is not executed until it is safely in a new process, thus does not trip up with trying to make a new process based on the jupyter notebook itself.
Imports are handled in such a way as to maintain aliases. The import magic can be removed if you make sure to import everything needed for the function being executed inside the function itself.
Also, this solution only works if you pass all necessary variables as arguments to the function. The function should be static so to speak, but I think that's a requirement of ProcessPoolExecutor as well. Finally, make sure you don't execute other functions defined elsewhere in the notebook. Only external modules will be imported, thus other notebook functions won't be included.

send data from celery to tornado websocket

I have some periodic tasks which I execute with Celery (parse pages).
Also I established a websocket with tornado.
I want to pass data from periodic tasks to tornado, then write this data to websocket and use this data on my html page.
How can I do this?
I tried to import module with tornado websocket from my module with celery tasks, but ofcourse, that didn't work.
I know only how to return some data, if I get a message from my client-side. Here is how I cope with it:
import tornado.httpserver
import tornado.websocket
import tornado.ioloop
import tornado.web
import socket
'''
This is a simple Websocket Echo server that uses the Tornado websocket handler.
Please run `pip install tornado` with python of version 2.7.9 or greater to install tornado.
This program will echo back the reverse of whatever it recieves.
Messages are output to the terminal for debuggin purposes.
'''
class handler():
wss = []
class WSHandler(tornado.websocket.WebSocketHandler):
def open(self):
print ('new connection')
if self not in handler.wss:
handler.wss.append(self)
def on_message(self, message):
print ('message received: ' + message)
wssend('Ihaaaa')
def on_close(self):
print ('connection closed')
if self in handler.wss:
handler.wss.remove(self)
def check_origin(self, origin):
return True
def wssend(message):
print(handler.wss)
for ws in handler.wss:
if not ws.ws_connection.stream.socket:
print ("Web socket does not exist anymore!!!")
handler.wss.remove(ws)
else:
print('I am trying!')
ws.write_message(message)
print('tried')
application = tornado.web.Application([
(r'/ws', WSHandler),
])
if __name__ == "__main__":
http_server = tornado.httpserver.HTTPServer(application)
http_server.listen(8888)
myIP = socket.gethostbyname(socket.gethostname())
print ('*** Websocket Server Started at %s***' % myIP)
main_loop = tornado.ioloop.IOLoop.instance()
main_loop.start()
The option is to make a handle in tornado and then post results of celery task to this handle.
After that, there will be an opportunity to pass this data to websocket.

Resources