How to call async method from greenlet (playwright) - python-asyncio

My framework (Locust, https://github.com/locustio/locust) is based on gevent and greenlets. But I would like to leverage Playwright (https://playwright.dev/python/), which is built on asyncio.
Naively using Playwright's sync API doesn't work and raises an exception:
playwright._impl._api_types.Error: It looks like you are using Playwright Sync API inside the asyncio loop.
Please use the Async API instead.
I'm looking for some kind of best practice on how to use asyncio in combination with gevent.
I've tried a couple of different approaches, but I don't know if I'm close or if what I'm trying to do is even possible (I have some experience with gevent, but haven't really used asyncio before).
Edit: I kind of have something working now (I've removed Locust and just spawned some greenlets directly to make it easier to understand). Is this as good as it gets, or is there a better solution?
import asyncio
import threading

import gevent
from playwright.async_api import async_playwright


def thr(i):
    # each thread gets its own asyncio event loop
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(do_stuff(i))
    loop.close()


async def do_stuff(i):
    playwright = await async_playwright().start()
    browser = await playwright.chromium.launch(headless=False)
    page = await browser.new_page()
    await page.wait_for_timeout(5000)
    await page.goto("https://google.com")
    await page.close()
    print(i)


def green(i):
    t = threading.Thread(target=thr, args=(i,))
    t.start()
    # t.join() # joining doesn't work, but I couldn't be bothered right now :)


g1 = gevent.spawn(green, 1)
g2 = gevent.spawn(green, 2)
g1.join()
g2.join()

Inspired by @user4815162342's comment, I went with something like this:
from playwright.async_api import async_playwright  # need to import this first

from gevent import monkey, spawn
import asyncio
import gevent

monkey.patch_all()

loop = asyncio.new_event_loop()


async def f():
    print("start")
    playwright = await async_playwright().start()
    browser = await playwright.chromium.launch(headless=True)
    context = await browser.new_context()
    page = await context.new_page()
    await page.goto("https://www.google.com")
    print("done")


def greeny():
    while True:  # and not other_exit_condition
        future = asyncio.run_coroutine_threadsafe(f(), loop)
        while not future.done():
            gevent.sleep(1)


greenlet1 = spawn(greeny)
greenlet2 = spawn(greeny)

loop.run_forever()
The actual implementation will end up in Locust some day, probably after some optimization (reusing the browser instance, etc.).
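For reference, the browser-reuse optimization mentioned above could look roughly like this (a hedged sketch, not the actual Locust implementation; start_browser and visit_once are hypothetical names):

async def start_browser():
    # launch one browser process and share it across iterations
    playwright = await async_playwright().start()
    return await playwright.chromium.launch(headless=True)

async def visit_once(browser):
    # a fresh context per visit keeps cookies/cache isolated while
    # avoiding the cost of launching a new browser every time
    context = await browser.new_context()
    page = await context.new_page()
    await page.goto("https://www.google.com")
    await context.close()

async def f():
    browser = await start_browser()
    for _ in range(10):
        await visit_once(browser)
    await browser.close()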

Here's a simple way to integrate asyncio and gevent:
Run an asyncio loop in a dedicated thread
Use asyncio.run_coroutine_threadsafe() to run a coroutine
Use gevent.event.Event to wait until the coroutine resolves
import asyncio
import threading

import gevent.event

# run an asyncio loop in a dedicated thread
loop = asyncio.new_event_loop()
loop_thread = threading.Thread(target=loop.run_forever, daemon=True)
loop_thread.start()


async def your_coro():
    ...


def wait_until_complete(coro):
    # submit the coroutine to the loop running in the dedicated thread
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    # block only this greenlet, not the whole process, until it resolves
    event = gevent.event.Event()
    future.add_done_callback(lambda _: event.set())
    event.wait()
    return future.result()


result = wait_until_complete(your_coro())

Related

Make multiprocessing.Queue accessible from asyncio [duplicate]

Given a multiprocessing.Queue that is filled from different Python threads created via ThreadPoolExecutor.submit(...):
How can that Queue be accessed with asyncio / Trio / AnyIO in a safe and reliable manner (the context is FastAPI)?
I am aware of the Janus library, but I'd prefer a custom solution here.
Asked (hopefully) more concisely:
How do I implement
await <something_is_in_my_multiprocessing_queue>
so that it is accessible with async/await without blocking the event loop?
What synchronization mechanism would you suggest in general?
(Attention here: multiprocessing.Queue, not asyncio.Queue.)
Actually, I figured it out.
Given a method that reads the mp.Queue:
def read_queue_blocking():
    return queue.get()
And this is the main issue: the call to get() is blocking.
We can now either
use asyncio.loop.run_in_executor in the asyncio event loop (see https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor), or
use anyio with await anyio.to_thread.run_sync(...)
to execute the blocking retrieval of data from the queue in a separate thread.
For FastAPI:
import anyio


@app.websocket("/ws/{client_id}")
async def websocket_endpoint(websocket: WebSocket, client_id: str):
    await websocket.accept()
    while True:
        queue_result = await anyio.to_thread.run_sync(read_queue_blocking)
        await websocket.send_text(f"Message text was: {queue_result}")
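For comparison, the first option (plain asyncio, no AnyIO) might look like this; a minimal sketch assuming the same read_queue_blocking helper as above:

import asyncio

async def read_queue_async():
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor, so the
    # blocking queue.get() runs in a worker thread, not the event loop
    return await loop.run_in_executor(None, read_queue_blocking)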
I reworked the answer to showcase the case where the main thread's asyncio loop is fed with data from child processes (ProcessPoolExecutor):
from concurrent.futures import ProcessPoolExecutor
import asyncio
from random import randint
from functools import partial


def some_heavy_task() -> int:
    sum(i * i for i in range(10 ** 8))
    return randint(1, 9)


def callback(fut: asyncio.Future, q: asyncio.Queue) -> None:
    """callback is used instead of mp.Queue to get the feed from child processes."""
    loop = asyncio.get_event_loop()
    if not fut.exception() and not fut.cancelled():
        loop.call_soon(q.put_nowait, f"name-{fut.name}: {fut.result()}")


async def result_picker(q: asyncio.Queue) -> None:
    """Returns results to some outer world."""
    while True:
        res = await q.get()
        # imagine it is a websocket
        print(f"Result from heavy_work_producer: {res}")
        q.task_done()  # mark task as done here


async def heavy_work_producer(q: asyncio.Queue) -> None:
    """Wrapper around all multiprocessing work."""
    loop = asyncio.get_event_loop()
    with ProcessPoolExecutor(max_workers=4) as pool:
        heavy_tasks = [loop.run_in_executor(pool, some_heavy_task) for _ in range(12)]
        # name them before attaching callbacks, so fut.name is set when the callback fires
        [setattr(t, "name", i) for i, t in enumerate(heavy_tasks)]
        [t.add_done_callback(partial(callback, q=q)) for t in heavy_tasks]
        await asyncio.gather(*heavy_tasks)


async def amain():
    """Main entrypoint of the async app."""
    q = asyncio.Queue()
    asyncio.create_task(result_picker(q))
    await heavy_work_producer(q)
    # do not let result_picker finish when heavy_work_producer is done;
    # wait for all results to show
    await q.join()
    print("All done.")


if __name__ == '__main__':
    asyncio.run(amain())

How can ib_insync reqHistoricalDataAsync work with Asyncio?

import asyncio
import ib_insync as ibi
import symbol_list
import time

start = time.perf_counter()
stocklist = symbol_list.test
endDateTime = '20190328 09:30:00'
durationStr = '1 D'
dataDirectory = './data/tmp'


class App:
    async def run(self):
        self.ib = ibi.IB()
        with await self.ib.connectAsync():
            contracts = [
                ibi.Stock(symbol, 'SMART', 'USD')
                for symbol in ['AAPL', 'TSLA', 'AMD', 'INTC']]
            for contract in contracts:
                # self.ib.reqMktData(contract)
                bars = await self.ib.reqHistoricalDataAsync(
                    contract,
                    endDateTime=endDateTime, durationStr=durationStr,
                    barSizeSetting='5 mins', whatToShow='MIDPOINT', useRTH=True)
                df = ibi.util.df(bars)
                df.to_csv(f"{dataDirectory}/{contract.symbol}.csv")
            async for tickers in self.ib.pendingTickersEvent:
                for ticker in tickers:
                    print(ticker)

    def stop(self):
        self.ib.disconnect()


app = App()
try:
    asyncio.run(app.run())
except (KeyboardInterrupt, SystemExit):
    app.stop()

endtime = (time.perf_counter() - start) / 60
print(f"Process time: {endtime:,.2f} minutes")
Please help. I modified the example code from async-streaming-example. I don't get any error message, but it just runs without ever returning me to the shell prompt, and this code should take less than a minute if it runs properly. Essentially, instead of reqMktData, I want to use reqHistoricalDataAsync to get historical data asynchronously. I've also looked at async execution with ib_insync, but I wasn't able to get that technique to work either. Could you show me what I'm doing wrong? I welcome any async solutions. Thank you.
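A likely reason the script never returns (a hedged guess from the code alone, not an answer from the original thread): pendingTickersEvent only fires for active market data subscriptions, and the reqMktData call is commented out, so the async for loop at the end waits forever. Dropping that loop and gathering the historical requests, as in the next question's answer, should let the script finish:

class App:
    async def run(self):
        self.ib = ibi.IB()
        with await self.ib.connectAsync():
            contracts = [
                ibi.Stock(symbol, 'SMART', 'USD')
                for symbol in ['AAPL', 'TSLA', 'AMD', 'INTC']]
            # issue all four requests at once and wait for them together
            all_bars = await asyncio.gather(*[
                self.ib.reqHistoricalDataAsync(
                    contract,
                    endDateTime=endDateTime, durationStr=durationStr,
                    barSizeSetting='5 mins', whatToShow='MIDPOINT', useRTH=True)
                for contract in contracts])
            for contract, bars in zip(contracts, all_bars):
                ibi.util.df(bars).to_csv(f"{dataDirectory}/{contract.symbol}.csv")
            # no pendingTickersEvent loop: with no market data subscriptions
            # it would block forever, which is what kept the script alive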

Asyncio script performs slowly, similar to sync script

I'm writing an asyncio script to retrieve stock bars data from Interactive Brokers via the ib_insync library.
While I have the script working, the performance is similar to a serial script. I was hoping to see a drastic improvement in speed. This code will be used in production.
I am new to asyncio and feel like I'm missing an important element. Below is the full script. I would very much appreciate assistance in speeding this up. Thanks.
import asyncio
import ib_insync as ibi
import nest_asyncio
import pandas as pd

nest_asyncio.apply()


class App:
    async def run(self, symbols):
        print(f"1 start run: {symbols}")
        self.ib = ibi.IB()
        with await self.ib.connectAsync("127.0.0.1", "****", clientId="****"):
            contracts = [ibi.Stock(symbol, "SMART", "USD") for symbol in symbols]
            bars_dict = dict()
            print(f"2 start loop: {symbols}")
            for contract in contracts:
                bars = await self.ib.reqHistoricalDataAsync(
                    contract,
                    endDateTime="",
                    durationStr="1 M",
                    barSizeSetting="1 day",
                    whatToShow="ADJUSTED_LAST",
                    useRTH=True,
                )
                # Convert to dataframes.
                bars_dict[contract.symbol] = ibi.util.df(bars)
            print(f"3 End bars: {symbols}")
            return bars_dict

    async def main(self):
        res = await asyncio.gather(self.run(self.sp500(0, 100)))
        return res

    def stop(self):
        self.ib.disconnect()

    def sp500(self, start=None, end=10):
        payload = pd.read_html(
            "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
        )
        first_table = payload[0]
        sp500 = first_table["Symbol"].sort_values().to_list()
        return sp500[start:end]


if __name__ == "__main__":
    import time

    start = time.time()
    app = App()
    try:
        print("START CALL")
        res = asyncio.run(app.main())
        print("END CALL")
    except (KeyboardInterrupt, SystemExit):
        app.stop()

    for ticker, bars in res[0].items():
        print(f"{ticker}\n{bars}")
    print(f"Total time: {(time.time() - start)}")
Your script is running in sequence. The call to asyncio.gather() in main is useless because it is invoked with just one coroutine. You're supposed to call it with multiple coroutines to have them run in parallel.
For example, you could remove the asyncio.gather() from main (just await self.run(self.sp500(0, 100)) there) and instead use it to parallelize the calls to reqHistoricalDataAsync:
class App:
    async def run(self, symbols):
        print(f"1 start run: {symbols}")
        self.ib = ibi.IB()
        with await self.ib.connectAsync("127.0.0.1", "****", clientId="****"):
            contracts = [ibi.Stock(symbol, "SMART", "USD") for symbol in symbols]
            print(f"2 start loop: {symbols}")
            # issue all requests at once and wait for them in parallel
            all_bars = await asyncio.gather(*[
                self.ib.reqHistoricalDataAsync(
                    contract,
                    endDateTime="",
                    durationStr="1 M",
                    barSizeSetting="1 day",
                    whatToShow="ADJUSTED_LAST",
                    useRTH=True,
                )
                for contract in contracts
            ])
            bars_dict = {}
            for contract, bars in zip(contracts, all_bars):
                # Convert to dataframes.
                bars_dict[contract.symbol] = ibi.util.df(bars)
            print(f"3 End bars: {symbols}")
            return bars_dict
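If firing a hundred requests at once runs into IB's pacing limits, the fan-out can be throttled with a semaphore. A hedged sketch (gather_throttled is a hypothetical method on App, and the limit of 20 is an arbitrary illustration, not a documented IB number):

    async def gather_throttled(self, contracts, limit=20):
        # allow at most `limit` outstanding historical data requests
        sem = asyncio.Semaphore(limit)

        async def fetch(contract):
            async with sem:
                return await self.ib.reqHistoricalDataAsync(
                    contract,
                    endDateTime="",
                    durationStr="1 M",
                    barSizeSetting="1 day",
                    whatToShow="ADJUSTED_LAST",
                    useRTH=True,
                )

        return await asyncio.gather(*[fetch(c) for c in contracts])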

Aiohttp: Server & Client in one time

I'm trying to use aiohttp 3.6.2 as both server and client:
The webhook has to:
1) Get a JSON request from a service
2) Quickly send HTTP 200 OK back to the service
3) Do additional work afterwards: make an HTTP request to a slow web service (which answers in 2-5 seconds)
I don't understand how to perform work after the view (or handler) has returned web.Response(text="OK").
Here is the current view (it's slow because the slow HTTP request is performed before the response is sent):
views.py:
import aiohttp
import aiohttp.web


async def make_http_request(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            print(await resp.text())


async def work_on_request(request):
    url = (await request.json())['url']
    await make_http_request(url)
    return aiohttp.web.Response(text='all ok')
routes.py:
from views import work_on_request


def setup_routes(app):
    app.router.add_get('/', work_on_request)
server.py:
from aiohttp import web
from routes import setup_routes

app = web.Application()
setup_routes(app)
web.run_app(app)
So, a workaround for me would be to start one more thread with a different event loop, or maybe you know how to add some work to the current event loop?
No longer relevant: I found a solution, which is to add one more task to the main event loop:
(Additionally, I created one global queue so the coroutines can communicate with each other.)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
queue = asyncio.Queue(maxsize=100000)
loop.create_task(worker('Worker1', queue))
app = web.Application()
app['global_queue'] = queue
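The worker coroutine referenced above isn't shown in the snippet; a minimal sketch of what it might look like (worker matches the name used in loop.create_task above, and the handler change is hypothetical):

async def worker(name, queue):
    # drain the global queue forever, doing the slow follow-up request
    # only after the 200 OK has already been sent back
    while True:
        url = await queue.get()
        try:
            await make_http_request(url)  # the slow 2-5 second call
        finally:
            queue.task_done()

The handler then enqueues instead of awaiting:

async def work_on_request(request):
    url = (await request.json())['url']
    request.app['global_queue'].put_nowait(url)  # hand off, don't wait
    return aiohttp.web.Response(text='all ok')   # returns immediately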

cx_oracle with Asyncio in Python with SQLAlchemy

I am confused by the different threads posted at different times on this topic.
Is asyncio support available in the latest version (as of Dec 2019) of cx_Oracle?
I am using the code snippet below, which works, but I'm not sure whether this is the right way to make async calls to Oracle. Any pointers would be helpful.
import asyncio
from time import time

import pandas as pd
from sqlalchemy import create_engine


async def sqlalchemyoracle_fetch():
    conn_start_time = time()
    oracle_tns_conn = 'oracle+cx_oracle://{username}:{password}@{tnsname}'
    engine = create_engine(
        oracle_tns_conn.format(
            username=USERNAME,
            password=PWD,
            tnsname=TNS,
        ),
        pool_recycle=50,
    )
    for x in test:
        # query_randomizer is a custom function that builds and executes
        # Oracle queries from the parameters passed in via `test` (a list)
        pd.read_sql(query_randomizer(x), engine)


async def main():
    tasks = [sqlalchemyoracle_fetch()]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    result = asyncio.run(main())
I use the cx_Oracle library but not SQLAlchemy. As of v8.2, asyncio is not supported.
This issue tracks and confirms it: https://github.com/oracle/python-cx_Oracle/issues/178.
And no, your code block does not run asynchronously: although it is defined with async def, there is no statement in it that is actually asynchronous. To be asynchronous, your async function either needs to await another async function (one that already supports async operations) or otherwise yield control to indicate a possible context switch. Neither happens in your code block.
You can try the following package which states to have implemented async support for cx_Oracle. https://pypi.org/project/cx-Oracle-async/
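Until cx_Oracle gains native asyncio support, a common workaround is to push each blocking query into a thread via run_in_executor. A hedged sketch, reusing the engine and query_randomizer names from the question:

import asyncio
from functools import partial

import pandas as pd


async def fetch_one(engine, x):
    loop = asyncio.get_running_loop()
    # run the blocking SQLAlchemy/cx_Oracle call in the default thread pool
    return await loop.run_in_executor(
        None, partial(pd.read_sql, query_randomizer(x), engine))


async def fetch_all(engine, params):
    # the queries now genuinely overlap, each in its own worker thread
    return await asyncio.gather(*[fetch_one(engine, x) for x in params])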

Resources