Python asyncio default selector that works best with PyZMQ - python-asyncio

In Python asyncio, Python on Ubuntu selects the best-performing selector when we create the event loop:
import asyncio
loop = asyncio.get_event_loop()
If we import PyZMQ and use any of its communication patterns, the PyZMQ authors recommend using the asyncio event loop for faster asynchronous communication. My question is: is the default selector chosen by Python's asyncio module already the optimal one when we use zmq in Python, or should we explicitly select the selector that works best with zmq?
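As a starting point for the question above, here is a minimal sketch of how to check which selector the default selector-based loop would pick on a given platform. It only uses the standard library; note that `_selector` is a private attribute of CPython's `SelectorEventLoop`, so reading it is an assumption about the implementation, not a documented API. It says nothing about which selector is fastest with zmq.

```python
import asyncio
import selectors


def default_selector_name():
    # selectors.DefaultSelector is an alias for the most efficient
    # selector implementation available on the current platform.
    return selectors.DefaultSelector.__name__


def loop_selector_name():
    # Build a selector-based loop without passing a selector, so it
    # falls back to selectors.DefaultSelector().
    loop = asyncio.SelectorEventLoop()
    try:
        # _selector is private (assumption about CPython internals).
        return type(loop._selector).__name__
    finally:
        loop.close()


print("platform default selector:", default_selector_name())
print("selector used by the loop:", loop_selector_name())
```

If a different selector is wanted for an experiment, `asyncio.SelectorEventLoop` accepts one as its argument, for example `asyncio.SelectorEventLoop(selectors.SelectSelector())`.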

Related

Can I use multiple event loops in a program where I also use multiprocessing module

Thanks for any reply in advance.
I have the entry-point program main.py:
import asyncio
from loguru import logger
from multiprocessing import Process
from app.events import type_a_tasks, type_b_tasks, type_c_tasks

def run_task(task):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(task())
    loop.run_forever()

def main():
    processes = list()
    processes.append(Process(target=run_task, args=(type_a_tasks,)))
    processes.append(Process(target=run_task, args=(type_b_tasks,)))
    processes.append(Process(target=run_task, args=(type_c_tasks,)))
    for process in processes:
        process.start()
        logger.info(f"Started process id={process.pid}, name={process.name}")
    for process in processes:
        process.join()

if __name__ == '__main__':
    main()
where the different types of tasks are similarly defined, for example type_a_tasks are:
import asyncio
from . import business_1, business_2, business_3, business_4, business_5, business_6

async def type_a_tasks():
    tasks = list()
    tasks.append(asyncio.create_task(business_1.main()))
    tasks.append(asyncio.create_task(business_2.main()))
    tasks.append(asyncio.create_task(business_3.main()))
    tasks.append(asyncio.create_task(business_4.main()))
    tasks.append(asyncio.create_task(business_5.main()))
    tasks.append(asyncio.create_task(business_6.main()))
    await asyncio.wait(tasks)
    return tasks
where the main() functions of businesses 1-6 are coroutines (wrapped into asyncio Tasks above), in which I implemented my business code.
Is my usage of multiprocessing and asyncio event loops above the correct way of doing it?
I am doing so because I have a lot of asynchronous tasks to perform, but it doesn't seem appropriate to put them all in one event loop. So I divided them into three groups (a, b and c), and I hope they can run in three different processes to make use of multiple CPU cores while still taking advantage of asyncio features.
I tried running my code, and the log records show that there actually are different processes, but they all appear to use the same thread/event loop (I know this from adding process_id and thread_id to the loguru format).
This seems OK. Just use asyncio.run(task()) inside run_task: it is simpler, and there is no need to call run_forever (also, with the run_forever call, your processes will never finish, so they can never be joined by the parent process).
IDs of other objects, such as threads and event loops, may repeat across processes. If you want, add the result of calling os.getpid() in the body of run_task to your logging.
(If these are, by chance, the same, that means that somehow multiprocessing is using a "dummy" backend due to some configuration in your project. That should not happen anyway.)
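The suggestion above can be sketched as follows. This is a minimal illustration, not the asker's actual code: `demo_tasks` is a hypothetical stand-in for `type_a_tasks`, and `run_task` is the function that would be passed as the `Process` target exactly as in the question.

```python
import asyncio
import os


async def demo_tasks():
    # Hypothetical stand-in for type_a_tasks from the question.
    await asyncio.sleep(0)
    return "done"


def run_task(task):
    # asyncio.run creates a fresh event loop, runs the coroutine to
    # completion and closes the loop, so the process can exit and the
    # parent's join() returns.
    result = asyncio.run(task())
    # os.getpid() differs per process even when thread or loop IDs repeat.
    print(f"pid={os.getpid()} finished with result={result!r}")
    return os.getpid(), result
```

When `run_task` is used as a `Process` target, each child prints its own PID, which makes it easy to confirm that the three task groups really run in separate processes.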

Inter-process communication between async and sync tasks using PyZMQ

In a single process I have a task running on a thread that produces values and broadcasts them, and several consumer async tasks that run concurrently in an asyncio loop.
I found an issue on PyZMQ's GitHub asking about async <-> sync communication with inproc sockets, which is what I also wanted. The answer there was to use .shadow(ctx.underlying) when creating the async ZMQ Context.
I prepared this example and it seems to be working fine:
import signal
import asyncio
import zmq
import threading
import zmq.asyncio
import sys
import time
import json

def producer(ctrl):
    # delay first push to give asyncio loop time
    # to start
    time.sleep(1)
    ctx = ctrl["ctx"]
    s = ctx.socket(zmq.PUB)
    s.bind(ctrl["endpoint"])
    v = 0
    while ctrl["run"]:
        payload = {"value": v, "timestamp": time.time()}
        msg = json.dumps(payload).encode("utf-8")
        s.send(msg)
        v += 1
        time.sleep(5)
    print("Bye")

def main():
    endpoint = "inproc://testendpoint"
    ctx = zmq.Context()
    actx = zmq.asyncio.Context.shadow(ctx.underlying)
    ctrl = {"run": True, "ctx": ctx, "endpoint": endpoint, }
    th = threading.Thread(target=producer, args=(ctrl,))
    th.start()
    try:
        asyncio.run(amain(actx, endpoint))
    except KeyboardInterrupt:
        pass
    print("Stopping thread")
    ctrl["run"] = False
    th.join()

async def amain(ctx, endpoint):
    s = ctx.socket(zmq.SUB)
    s.subscribe("")
    s.connect(endpoint)
    loop = asyncio.get_running_loop()
    def stop():
        try:
            print("Closing zmq async socket")
            s.close()
        except:
            pass
        raise KeyboardInterrupt
    loop.add_signal_handler(signal.SIGINT, stop)
    while True:
        event = await s.poll(1000)
        if event & zmq.POLLIN:
            msg = await s.recv()
            payload = json.loads(msg.decode("utf-8"))
            print("%f: %d" % (payload["timestamp"], payload["value"]))

if __name__ == "__main__":
    sys.exit(main())
Is it safe to use inproc://* between a thread and asyncio task in this way? The 0MQ
context is thread safe and I'm not sharing sockets between the thread and the
asyncio task, so I would say in general that this is thread safe, right? Or am I
missing something that I should consider?
Q: "Is it safe to use inproc://* between a thread and asyncio task in this way?"
A: First and foremost, I might be wrong (not only here), yet having worked with ZeroMQ since native API 2.1.1+, I dare claim that the answer here is YES, provided that newer "improvements" have not lost the core principles (the ZMTP/RFC-documented properties that any legal implementation must keep) and provided that the newer releases of the pyzmq binding have kept all mandatory properties of the inproc: transport class without compromise.
Q: "The 0MQ context is thread safe and I'm not sharing sockets between the thread and the asyncio task, so I would say in general that this is thread safe, right?"
A: Here my troubles start. ZeroMQ implementations have always been developed according to Martin SUSTRIK's and Pieter HINTJENS' Zen-of-Zero, which includes Zero-sharing, so never sharing was the principle (though sharing zmq.Context instances between threads was never a problem, in contrast to zmq.Socket instances).
Python (always, and still in 2022-Q1) avoids [CONCURRENT] code execution by means of the GIL lock, which re-[SERIAL]-ises the flow of code execution inside the interpreter and so avoids the problems that arise from truly [CONCURRENT] code execution. So even if the asyncio part is built as a pythonic (non-destructive) part of the ecosystem, your code should not meet concurrency-related issues here: unless a thread holds the GIL lock, it does nothing but wait idle.
Being inside the same process, there seems to be no advantage in spawning another Context instance at all (avoiding any increase in overheads has always been a rock-solid part of the Zen-of-Zero: (almost) Zero overhead). If performance or latency needs required it, the signalling/messaging core engine was powered with more I/O threads upon instantiation, as in zmq.Context( IOthreads ). These threads were owned by the zmq.Context, not governed or blocked by the Python GIL, so performance scaled pretty well without wasting RAM, HWM or buffer resources and without growing overheads. The I/O threads serve actual I/O work only, so they are not needed at all for the (protocol-less) inproc: transport class.
Q: "Or am I missing something that I should consider?"
A: Mixing asyncio, O/S signals (whose interaction with the native ZeroMQ API is well documented) and other layers of complexity is certainly possible, yet it comes at a cost: it makes the use case less and less readable and more and more prone to conceptual gaps and similar hard-to-decode "errors".
I remember using Tkinter's mainloop() as a very cheap and super-stable framework for rapid prototyping of the MVC (Model, Visual, Controller) parts of many-actor, truly distributed-system applications in Python. There were zero problems using ZeroMQ with a single Context instance and passing references to the respective access points into any number of event handlers, as long as we kept the ZeroMQ Zen-of-Zero, i.e. not to "share" (meaning no two parts use, or compete to use, one and the same access point one over another).
All this was designed in, at "Zero cost", by ZeroMQ by definition. So unless it was spoilt in some later phase by re-wrapping a re-wrapped native API, all this ought to still work in 2022-Q1, ought it not?

How to check for a websocket bottleneck in a python project

I currently have a script that connects to a server, makes a websocket connection and receives high-frequency messages.
I am quite sure that the processing on my client end cannot keep up with the messages, and thus I am falling behind after short periods of time.
My understanding is that the messages are queued both in the server's send buffer and in my client's receive buffer, and that if I do not process them quickly enough, the buffers will eventually fill up and I will lose messages, which will cause an out-of-sequence issue. Is my assumption correct?
My question is: what is the best way (which tools) to trace possible bottlenecks and track down whether the issue is on the server or the client? I am working with Python in Visual Studio and have a single process running for now using PM2.
I am looking for advice on ways to trace low-level bottlenecks, even if it means using tools like Wireshark.
thanks.
My advice is to use gevent and gevent-websocket so that all the connections are async. Then you can handle multiple connections asynchronously.
With GIPC, you could launch an instance per CPU core and load balance between ports.
example:
from gevent import monkey, socket, Timeout, sleep
monkey.patch_all()
import sys
pyver = sys.version_info[0]
if pyver == 3:
    import signal
    from gevent import signal_handler as sig
else:
    from gevent import signal
import bottle
from bottle import route, request, response, abort
import ujson as json
from gevent.pywsgi import WSGIServer
from geventwebsocket.handler import WebSocketHandler
from geventwebsocket import WebSocketError
import traceback

@route('/ws/app')
def handle_websocket():
    global ws_users
    ws = request.environ.get('wsgi.websocket')
    if not ws:
        abort(400, 'Expected WebSocket request.')
    while 1:
        message = None
        try:
            # give up waiting after 2 seconds so the loop stays responsive
            with Timeout(2, False) as timeout:
                message = ws.receive()
            if message:
                message = json.loads(message)
                # process message, report back with ws.send()
        except WebSocketError:
            break
        except Exception as exc:
            traceback.print_exc()
            sleep(1)

if __name__ == '__main__':
    print(socket.gethostname())
    print('Started...')
    botapp = bottle.app()
    server = WSGIServer(("0.0.0.0", int(80)), botapp , handler_class=WebSocketHandler)
    def shutdown():
        print('Shutting down ...')
        server.stop(timeout=60)
        exit(signal.SIGTERM)
    if pyver == 3:
        sig(signal.SIGTERM, shutdown)
        sig(signal.SIGINT, shutdown)
    else:
        signal(signal.SIGTERM, shutdown)
        signal(signal.SIGINT, shutdown) #CTRL C
    server.serve_forever()
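On the original question of telling whether the client is the bottleneck, one rough approach is to timestamp each message when it is received and track how deep the client-side queue gets and how long messages wait before processing finishes. The sketch below simulates this with the standard library only; `producer` stands in for the websocket receive loop and `consumer` for the message processing, and both names and all timings are hypothetical.

```python
import asyncio
import time


async def producer(queue, n, interval):
    # Stand-in for the websocket receive loop: stamp each message on arrival.
    for i in range(n):
        queue.put_nowait((i, time.monotonic()))
        await asyncio.sleep(interval)
    queue.put_nowait(None)  # sentinel: no more messages


async def consumer(queue, work):
    # Stand-in for message processing; records backlog depth and lag.
    max_depth = 0
    max_lag = 0.0
    while True:
        item = await queue.get()
        if item is None:
            break
        _, received_at = item
        max_depth = max(max_depth, queue.qsize())
        await asyncio.sleep(work)  # simulated processing time
        max_lag = max(max_lag, time.monotonic() - received_at)
    return max_depth, max_lag


async def measure(n=20, interval=0.0, work=0.002):
    queue = asyncio.Queue()
    prod = asyncio.create_task(producer(queue, n, interval))
    stats = await consumer(queue, work)
    await prod
    return stats


depth, lag = asyncio.run(measure())
print(f"max queue depth={depth}, max lag={lag:.3f}s")
```

If the queue depth and lag keep growing while the script runs, processing on the client is slower than the arrival rate; if they stay near zero and messages still arrive late, the delay is more likely on the network or server side.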

An asyncio.Future, a coroutine or an awaitable is required

After reading an article on the asyncio module, I wanted to try it in my project. The idea is to wait for a window to disappear.
import asyncio
import pyautogui
import os
workPath = os.getcwd()
filePath = workPath+"\\Projets\\Rotation_PP\\Scripts\\Screenshot\\"
# Wait for the publication to finish. The window should disappear
publicationProgress = pyautogui.locateOnScreen(filePath+'PlanningProgress.png')
loop = asyncio.get_event_loop()
loop.run_until_complete(publicationProgress != None)
I must be doing something wrong, because I get the following error:
TypeError: An asyncio.Future, a coroutine or an awaitable is required
Maybe I'm overcomplicating things and a simple while loop would be enough.
Could you please help me solve this problem?
Best regards
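The error arises because `run_until_complete` expects an awaitable, while `publicationProgress != None` is a plain boolean evaluated once. A minimal sketch of one way to express "wait until the window is gone" as a coroutine is shown below. The helper name `wait_until_gone` and the callable `is_visible` are hypothetical; in the real script `is_visible` would wrap the `pyautogui.locateOnScreen` call, whose behaviour on a miss (returning None or raising an exception) depends on the PyAutoGUI version.

```python
import asyncio


async def wait_until_gone(is_visible, poll_interval=0.5, timeout=None):
    # is_visible is a caller-supplied function returning True while the
    # window is still on screen (hypothetical; not part of pyautogui).
    async def _poll():
        while is_visible():
            # Yield to the event loop between checks instead of busy-waiting.
            await asyncio.sleep(poll_interval)

    # Raises asyncio.TimeoutError if the window never disappears in time.
    await asyncio.wait_for(_poll(), timeout)


# Usage with a fake check that reports "visible" three times, then "gone".
checks = iter([True, True, True, False])
calls = []


def fake_is_visible():
    visible = next(checks)
    calls.append(visible)
    return visible


asyncio.run(wait_until_gone(fake_is_visible, poll_interval=0.01, timeout=5))
print("window disappeared after", len(calls), "checks")
```

If nothing else needs to run concurrently, a plain `while` loop with `time.sleep` does the same job without asyncio, as the question suspects.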

Gathering coin volumes - Is my code running asynchronously?

I'm fairly new to programming in Python; I've been programming for about half a year. I've decided to try to build a functional trading bot. While trying to code this bot, I stumbled upon the asyncio module. I would really like to understand the module better, but it's hard to find any simple tutorials or documentation about asyncio.
In my script I'm gathering the volume per coin. This works perfectly, but it takes a really long time to gather all the volumes. I would like to ask whether my script is running synchronously, and if so, how do I fix this? I'm using an API wrapper to communicate with the Binance Exchange.
import binance
import asyncio
import time

s = time.time()
names = [name for name in binance.ticker_prices()] #Gathering all the coin names
loop = asyncio.get_event_loop()

async def get_volume(name):
    async def get_data():
        return binance.ticker_24hr(name) #Returns per coin a dict of the data of the last 24hr
    data = await get_data()
    return (name, data['volume'])

tasks = [asyncio.ensure_future(get_volume(name)) for name in names]
results = loop.run_until_complete(asyncio.gather(*tasks))
print('Total time:', time.time() - s)
Since binance.ticker_24hr does not look like it's a coroutine, it is almost certainly blocking the event loop and therefore preventing asyncio.gather from doing its job. As a quick fix, you can use run_in_executor to run the blocking function in a separate thread:
async def get_volume(name):
    loop = asyncio.get_event_loop()
    data = await loop.run_in_executor(None, binance.ticker_24hr, name)
    return name, data['volume']
This will work just fine for a reasonable number of parallel tasks. The downside is that it uses threads, so it might not scale to a huge number of parallel requests (or it would require unnecessary waiting). The correct solution in the long run is to use a library that natively supports asyncio.
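To see the effect of run_in_executor without touching the network, here is a self-contained sketch. `blocking_fetch` is a hypothetical stand-in for a blocking call such as binance.ticker_24hr (it just sleeps and returns a made-up volume), and `timed_run` is a helper added only for the demonstration.

```python
import asyncio
import time


def blocking_fetch(name):
    # Hypothetical stand-in for a blocking API call; sleeps like a slow request.
    time.sleep(0.1)
    return {"volume": len(name)}  # made-up value for illustration


async def get_volume(name):
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    data = await loop.run_in_executor(None, blocking_fetch, name)
    return name, data["volume"]


async def gather_volumes(names):
    return await asyncio.gather(*(get_volume(n) for n in names))


def timed_run(names):
    start = time.perf_counter()
    results = asyncio.run(gather_volumes(names))
    return results, time.perf_counter() - start


results, elapsed = timed_run(["BTC", "ETH", "LTC", "XRP"])
print(results, f"in {elapsed:.2f}s")
```

Four blocking calls of 0.1 s each would take about 0.4 s back to back; run through the executor they overlap, so the total stays close to the duration of a single call.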
Maarten, firstly, you are calling get_ticker for every symbol, which means you're making many unnecessary requests. If you call it without a symbol value, you get all tickers in one request. This also removes the need for any loops or async code if you aren't performing other tasks. It looks like the binance library you're using doesn't support this, but you can use python-binance to do it:
return client.get_ticker()
That said I've been testing an asyncio version of python-binance. It's currently in a feature branch now if you want to try it.
pip install git+https://github.com/sammchardy/python-binance@feature/asyncio
Include the asyncio version of the client and initialise the client
from binance.client_async import AsyncClient as Client
client = Client("<api_key>", "<api_secret>")
Then you can await the calls to get the ticker for a particular symbol
return await client.get_ticker(symbol=name)
Or for all symbol tickers don't pass the symbol parameter
return await client.get_ticker()
Hope that helps
