How to create multiprocess with regression function? - multiprocessing

I'm trying to build a regression function that call itself in a new process. The new process should not stop the parent process nor wait for it to finish, that is why I don't use join(). Do you have another way to create regression function with multi-process.
I use the following code:
import multiprocessing as mp
import concurrent.futures
import time
def do_something(c, seconds, r_list):
c += 1 # c is a counter that all processes should use
# such that no more than 20 processes are created.
print(f"Sleeping {seconds} second(s)...")
if c < 20:
P_V = mp.Value('d', 0.0, lock=False)
p = mp.Process(group=None, target=do_something, args=(c, 1, r_list,))
p.start()
if not p.is_alive():
r_list.append(P_V.value)
time.sleep(seconds)
print(f"Done Sleeping...{seconds}")
return f"Done Sleeping...{seconds}"
if __name__ == '__main__':
C = 0 # C is a counter that all processes should use
# such that no more than 20 processes are created.
Result_list = [] # results that come from all processes are saved here
Result_list.append(do_something(C, 1, Result_list))
Notice that results from all processes should be compared at the end.
In fact, this code is working well but the child processes, which are created in the recursive method, do not print anything, the list "Result_list" contains only one item from the first call, and C=0 at the end, any idea why?

Here's a simplified example of what I think you're trying to do (side note: launching processes recursively is a great way to accidentally create a "fork bomb". It is extremely more common to create multiple processes in some sort of loop instead)
from multiprocessing import Process, Queue
from time import sleep
from os import getpid
def foo(n_procs, return_Q, arg):
if __name__ == "__main__": #don't actually run the body of foo in the "main" process, just start the recursion
Process(target=foo, args=(n_procs, return_Q, arg)).start()
else:
n_procs -= 1
if n_procs > 0:
Process(target=foo, args=(n_procs, return_Q, arg)).start()
sleep(arg)
print(f"{getpid()} done sleeping {arg} seconds")
return_Q.put(f"{getpid()} done sleeping {arg} seconds") #put the result to a queue so we can get it in the main process
if __name__ == "__main__":
q = Queue()
foo(10, q, 2)
sleep(10) #do something else in the meantime
results = []
#while not q.empty(): #usually better to just know how many results you're expecting as q.empty can be unreliable
for _ in range(10):
results.append(q.get())
print("mp results:")
print("\n".join(results))

Related

"working" terminal prompt running in parallel Python 3.10

I am trying to show an animated "working" prompt while running some python code. I've been searching for a way to do so but the solutions i've found are not quite what i want (if i recall correctly tqdm and alive-progress require a for loop with a defined number of iterations) and I'd like to find a way to code it myself.
The closest I've gotten to making it is using asyncio as follows:
async def main():
dummy_task = asyncio.create_task(dummy_search())
bar_task = asyncio.create_task(progress())
test = await dummy_task
bar_task.cancel()
where dummy task can be any async task and bar_task is:
FLUSH_LINE = "\033[K"
async def progress(mode=""):
def integers():
n = 0
while True:
yield n
n += 1
progress_indicator = ["-", "\\", "|", "/"]
message = "working"
message_len = len(message)
message += "-" * message_len
try:
if not mode:
for i in (n for n in integers()):
await asyncio.sleep(0.05)
message = message[1:] + message[0]
print(f"{FLUSH_LINE}{progress_indicator[i % 4]} [{message[:message_len]}]", end="\r")
finally:
print(f"{FLUSH_LINE}")
The only problem with this approach is that asyncio does not actually run tasks in parallel, so if the dummy_task does not use await at any point, the bar_task will not run until the dummy task is complete, and it won't show the working prompt in the terminal.
How should I go about trying to run both tasks in parallel? Do I need to use multiprocessing? If so, would both tasks write to the same terminal by default?
Thanks in advance.

Multiprocessing runtime error freeze_support() in Mac 64 bit

I am trying to learn Threading and Multiprocessing on a MacOS. I am unable to launch the processes though, with python giving the following error message.
Error
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
my code:
parallel_processing.py
import multiprocessing
import time
start= time.perf_counter()
def do_something():
print('sleeping 1 sec....')
time.sleep(1)
return('done sleeping...')
# do_something()
p1 = multiprocessing.Process(target = do_something)
p2 = multiprocessing.Process(target = do_something)
p1.start()
p2.start()
finish= time.perf_counter()
print(f'finished in {finish-start} seconds')
It seems that you can resolve the issue by putting your code inside the block if __name__ == '__main__'. For example:
import multiprocessing as mp
import time
def do_something():
...
if __name__ == '__main__':
p1 = mp.Process(target=do_something)
p2 = mp.Process(target=do_something)
p1.start()
p2.start()
...
I'm not sure this is resulted from the recent macOS update or python, but it seems to have to do with how OS creates new threads, i.e. forking versus spwaning. Please see this blog post for details.

Problems interrupting a python Input (Mac)

I am trying to allow a user to input multiple answers but only within an allocated amount of time. The problem is I have it running but the program will not interrupt the input. The program will only stop the user from inputing if the user inputs an answer after the time ends. Any ideas? Is what I am trying to do even possible in python?
I have tried using threading and the signal module however they both result in the same issue.
Using Signal:
import signal
def handler(signum, frame):
raise Exception
def answer_loop():
score = 0
while True:
answer = input("Please input your answer")
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
try:
answer_loop()
except Exception:
print("end")
signal.alarm(0)
Using Threading:
from threading import Timer
def end():
print("Time is up")
def answer_loop():
score = 0
while True:
answer = input("Please input your answer")
time_limit = 5
t = Timer(time_limit, end)
t.start()
answer_loop()
t.cancel()
Your problem is that builtin input does not have a timeout parameter and, AFAIK, threads cannot be terminated by other threads. I suggest instead that you use a GUI with events to finely control user interaction. Here is a bare bones tkinter example.
import tkinter as tk
root = tk.Tk()
label = tk.Label(root, text='answer')
entry = tk.Entry(root)
label.pack()
entry.pack()
def timesup():
ans = entry.get()
entry.destroy()
label['text'] = f"Time is up. You answered {ans}"
root.after(5000, timesup)
root.mainloop()

Python multiprocessing Pool map and imap

I have a multiprocessing script with pool.map that works. The problem is that not all processes take as long to finish, so some processes fall asleep because they wait until all processes are finished (same problem as in this question). Some files are finished in less than a second, others take minutes (or hours).
If I understand the manual (and this post) correctly, pool.imap is not waiting for all the processes to finish, if one is done, it is providing a new file to process. When I try that, the script is speeding over the files to process, the small ones are processed as expected, the large files (that take more time to process) don't finish until the end (are killed without notice ?). Is this normal behavior for pool.imap, or do I need to add more commands/parameters ? When I add the time.sleep(100) in the else part as test, it is processing more large files but the other processes fall asleep. Any suggestions ? Thanks
def process_file(infile):
#read infile
#compare things in infile
#acquire Lock, save things in outfile, release Lock
#delete infile
def main():
#nprocesses = 8
global filename
pathlist = ['tmp0', 'tmp1', 'tmp2', 'tmp3', 'tmp4', 'tmp5', 'tmp6', 'tmp7', 'tmp8', 'tmp9']
for d in pathlist:
os.chdir(d)
todolist = []
for infile in os.listdir():
todolist.append(infile)
try:
p = Pool(processes=nprocesses)
p.imap(process_file, todolist)
except KeyboardInterrupt:
print("Shutting processes down")
# Optionally try to gracefully shut down the worker processes here.
p.close()
p.terminate()
p.join()
except StopIteration:
continue
else:
time.sleep(100)
os.chdir('..')
p.close()
p.join()
if __name__ == '__main__':
main()
Since you already put all your files in a list, you could put them directly into a queue. The queue is then shared with your sub-processes that take the file names from the queue and do their stuff. No need to do it twice (first into list, then pickle list by Pool.imap). Pool.imap is doing exactly the same but without you knowing it.
todolist = []
for infile in os.listdir():
todolist.append(infile)
can be replaced by:
todolist = Queue()
for infile in os.listdir():
todolist.put(infile)
The complete solution would then look like:
def process_file(inqueue):
for infile in iter(inqueue.get, "STOP"):
#do stuff until inqueue.get returns "STOP"
#read infile
#compare things in infile
#acquire Lock, save things in outfile, release Lock
#delete infile
def main():
nprocesses = 8
global filename
pathlist = ['tmp0', 'tmp1', 'tmp2', 'tmp3', 'tmp4', 'tmp5', 'tmp6', 'tmp7', 'tmp8', 'tmp9']
for d in pathlist:
os.chdir(d)
todolist = Queue()
for infile in os.listdir():
todolist.put(infile)
process = [Process(target=process_file,
args=(todolist) for x in range(nprocesses)]
for p in process:
#task the processes to stop when all files are handled
#"STOP" is at the very end of queue
todolist.put("STOP")
for p in process:
p.start()
for p in process:
p.join()
if __name__ == '__main__':
main()

Tkinter problems with GUI when entering while loop

I have a simple GUI which run various scripts from another python file, everything works fine until the GUI is running a function which includes a while loop, at which point the GUI seems to crash and become in-active. Does anybody have any ideas as to how this can be overcome, as I believe this is something to do with the GUI being updated,Thanks. Below is a simplified version of my GUI.
GUI
#!/usr/bin/env python
# Python 3
from tkinter import *
from tkinter import ttk
from Entry import ConstrainedEntry
import tkinter.messagebox
import functions
AlarmCode = "2222"
root = Tk()
root.title("Simple Interface")
mainframe = ttk.Frame(root, padding="3 3 12 12")
mainframe.grid(column=0, row=0, sticky=(N, W, E, S))
mainframe.columnconfigure(0, weight=1)
mainframe.rowconfigure(0, weight=1)
ttk.Button(mainframe, width=12,text="ButtonTest",
command=lambda: functions.test()).grid(
column=5, row=5, sticky=SE)
for child in mainframe.winfo_children():
child.grid_configure(padx=5, pady=5)
root.mainloop()
functions
def test():
period = 0
while True:
if (period) <=100:
time.sleep(1)
period +=1
print(period)
else:
print("100 seconds has passed")
break
What will happen in the above is that when the loop is running the application will crash. If I insert a break in the else statement after the period has elapsed, everything will work fine. I want users to be able to click when in loops as this GUI will run a number of different functions.
Don't use time.sleep in the same thread than your Tkinter code: it freezes the GUI until the execution of test is finished. To avoid this, you should use after widget method:
# GUI
ttk.Button(mainframe, width=12,text="ButtonTest",
command=lambda: functions.test(root))
.grid(column=5, row=5, sticky=SE)
# functions
def test(root, period=0):
if period <= 100:
period += 1
print(period)
root.after(1000, lambda: test(root, period))
else:
print("100 seconds has passed")
Update:
In your comment you also add that your code won't use time.sleep, so your original example may not be the most appropiate. In that case, you can create a new thread to run your intensive code.
Note that I posted the alternative of after first because multithreading should be used only if it is completely necessary - it adds overhead to your applicacion, as well as more difficulties to debug your code.
from threading import Thread
ttk.Button(mainframe, width=12,text="ButtonTest",
command=lambda: Thread(target=functions.test).start())
.grid(column=5, row=5, sticky=SE)
# functions
def test():
for x in range(100):
time.sleep(1) # Simulate intense task (not real code!)
print(x)
print("100 seconds has passed")

Resources