There is a method EventMachine.next_tick (http://eventmachine.rubyforge.org/EventMachine.html#next_tick-class_method). How big is the tick interval? How can it be controlled? Can the tick interval be set?
EventMachine ticks basically correspond to runs of the reactor event loop. Using next_tick will run the block on the next available run of the reactor loop. Whether this means the very next run or, more likely, some run in the near future depends on whether other events are waiting to be picked up by the reactor loop. For instance, any blocks of code that were queued using add_timer or add_periodic_timer are run first; then other events, like incoming network traffic, are processed.
A "tick" in Eventmachine isn't really a measurement of time, it's a counter of the number of times the reactor loop executes. If you have blocking operations in your reactor loop, then each tick will take longer to process.
If you need to know approximately when your code will be run, use add_timer or add_periodic_timer instead of next_tick. But as there's no guarantee that the reactor loop will be available at the exact moment a timer should fire, it's almost impossible to use EventMachine for accurate timer intervals.
Is there a module to measure asyncio event loop metrics? Or, for an asyncio event loop, what metrics should we monitor for performance analysis purposes?
For example:
how many tasks are in the event loop?
how many tasks are in a waiting state?
I'm not trying to measure the coroutine functions. aiomonitor has this functionality, but it's not exactly what I need.
I hardly believe that the number of pending tasks or a summary of tasks will tell you much. Say you have 10,000 tasks, 8,000 of them pending: is that a lot? Is it not? Who knows.
The thing is, each asyncio task (or any other Python object) can consume different amounts of different machine resources.
Instead of trying to monitor asyncio-specific objects, I think it's better to monitor general metrics (see the sketch after this list):
CPU usage
RAM usage
Network I/O (in case you're dealing with it)
Hard drive I/O (in case you're dealing with it)
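A minimal sketch of sampling those general metrics, assuming the third-party psutil package (an assumption here; any metrics library would do):

```python
import psutil

def sample_metrics():
    """Snapshot the general machine metrics listed above."""
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),   # CPU usage
        "ram_percent": psutil.virtual_memory().percent,  # RAM usage
        "net_io": psutil.net_io_counters(),              # network I/O counters
        "disk_io": psutil.disk_io_counters(),            # hard drive I/O counters
    }
```

You could log this dict periodically from a background task and correlate spikes with your application's behavior.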
As for asyncio, you should probably always use asyncio.Semaphore to limit the maximum number of currently running jobs, and implement a convenient way to change the semaphore's value (for example, through a config file).
That allows you to alter the workload on a concrete machine depending on its available and actually utilized resources.
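A minimal sketch of the idea, where MAX_JOBS stands in for a value read from a config file and fetch_one() is a hypothetical job:

```python
import asyncio

MAX_JOBS = 10  # in practice, read this from your config file

async def fetch_one(sem, i):
    async with sem:               # at most MAX_JOBS bodies run at once
        await asyncio.sleep(0.1)  # stand-in for real I/O work
        return i

async def main():
    sem = asyncio.Semaphore(MAX_JOBS)
    results = await asyncio.gather(*(fetch_one(sem, i) for i in range(100)))
    print(len(results))

asyncio.run(main())
```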
Update:
"My question: will asyncio still accept new connections during this block?"
If your event loop is blocked by some CPU-bound calculation, asyncio will start to process new connections later, once the event loop is free again (assuming they haven't timed out by then).
You should always avoid freezing the event loop. An event loop frozen anywhere means that all tasks everywhere in your code are frozen too! Any kind of loop freezing defeats the whole idea of the asynchronous approach, regardless of the number of tasks. Any code that freezes the event loop will have performance issues.
As you noted, you can use ProcessPoolExecutor with run_in_executor to await CPU-bound work, but you can use ThreadPoolExecutor to avoid freezing as well. A sketch follows.
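A minimal sketch of offloading, where cpu_bound() is a hypothetical heavy function; the event loop stays responsive while it runs in another process:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_bound(n):
    # Hypothetical heavy computation that would otherwise freeze the loop.
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # The loop keeps serving other tasks while this runs elsewhere;
        # swap in ThreadPoolExecutor for work that releases the GIL.
        result = await loop.run_in_executor(pool, cpu_bound, 10_000_000)
        print(result)

if __name__ == "__main__":  # guard needed for ProcessPoolExecutor on spawn platforms
    asyncio.run(main())
```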
I'm trying to get a bit of parallelism in my app to decrease the amount of time some operations take. I noticed that Parse.Promise.when() seems to evaluate promises in parallel, but there seems to be no limit to how many promises it tries to evaluate in parallel. Is that right?
In this particular example I'm trying to do something to 1500 records. If I use .when, it looks like it's trying to make 1500 connections to the Parse API, and it seems to be failing somewhere. But when I do these 1500 operations in series, it takes forever.
How do you guys deal with this kind of problem?
One way I thought of to deal with this might be to modify Parse.Promise.when() so that when I call it, I can specify the level of parallelism I need, e.g. Parse.Promise.when(promises, 10).
Thanks
No, there is not. when does not "evaluate" or "call" promises; it just waits for already-existing promises whose tasks have been running since you created them. It's the same as with Promise.all.
Have a look at Limit concurrency of promise being run for how to deal with calling an asynchronous function multiple times with bounded concurrency. A sketch of the idea follows.
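Here is the gist of that approach, sketched with Python's asyncio rather than Parse's JavaScript promises (run_in_batches and save_record are hypothetical names). The key point matches the answer above: the helper takes factories, not already-started operations, because once an operation exists it is already running, so a limiter must delay creation:

```python
import asyncio

async def run_in_batches(factories, batch_size=10):
    """Create at most `batch_size` operations at a time and wait for
    each batch to finish before starting the next one."""
    results = []
    for i in range(0, len(factories), batch_size):
        batch = [make() for make in factories[i:i + batch_size]]
        results.extend(await asyncio.gather(*batch))
    return results

async def save_record(n):         # stand-in for one API call
    await asyncio.sleep(0.01)
    return n

async def main():
    factories = [lambda n=n: save_record(n) for n in range(1500)]
    print(len(await run_in_batches(factories, batch_size=10)))

asyncio.run(main())
```

A semaphore-based sliding window keeps all slots busy instead of waiting on the slowest item in each batch, but batching is the simplest fix.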
I'm attempting to build an algorithm for a ticker. The ticker has to emit ticks at a varying rate, between a minimum and a maximum interval between ticks. Every time a tick comes out, it triggers a load action on the system (an expensive fetch). Fetching too often will drive this load time up and potentially hurt other consumers of the server. I want an algorithm that settles on a stable rate for this ticker and avoids overloading the server.
I've looked at proportional controllers and PID controllers, but neither seems like an obvious fit. Am I missing something?
Imagine you're building something like a monitoring service, which has thousands of tasks that need to be executed in given time interval, independent of each other. This could be individual servers that need to be checked, or backups that need to be verified, or just anything at all that could be scheduled to run at a given interval.
You can't just schedule the tasks via cron though, because when a task is run it needs to determine when it's supposed to run the next time. For example:
schedule server uptime check every 1 minute
first time it's checked the server is down, schedule next check in 5 seconds
5 seconds later the server is available again, check again in 5 seconds
5 seconds later the server is still available, continue checking at 1 minute interval
A naive solution that came to mind is to simply have a worker that runs every second or so, checks all the pending jobs, and executes the ones that are due. But how would this work if the number of jobs is something like 100,000? It might take longer to check them all than the worker's tick interval, and the more tasks there are, the longer each poll takes.
Is there a better way to design a system like this? Are there any hidden challenges in implementing this, or any algorithms that deal with this sort of a problem?
Use a priority queue (with the priority based on the next execution time) to hold the tasks to execute. When you're done executing a task, you sleep until the time for the task at the front of the queue. When a task comes due, you remove and execute it, then (if it's recurring) compute the next time it needs to run and insert it back into the priority queue based on its next run time.
This way you have one sleep active at any given time. Insertions and removals have logarithmic complexity, so it remains efficient even if you have millions of tasks (e.g., inserting into a priority queue that has a million tasks should take about 20 comparisons in the worst case).
There is one point that can be a little tricky: if the execution thread is waiting until a particular time to execute the item at the head of the queue, and you insert a new item that goes at the head of the queue, ahead of the item that was previously there, you need to wake up the thread so it can re-adjust its sleep time for the item that's now at the head of the queue.
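A minimal sketch of this design in Python (heapq plus a condition variable; the Scheduler class and its method names are hypothetical). The notify() call handles exactly the tricky point above, waking the worker whenever a new item becomes the head of the queue:

```python
import heapq
import threading
import time

class Scheduler:
    def __init__(self):
        self._heap = []                   # entries: (run_at, seq, fn, interval)
        self._cond = threading.Condition()
        self._seq = 0                     # tie-breaker for equal run times
        threading.Thread(target=self._run, daemon=True).start()

    def schedule(self, fn, delay, interval=None):
        with self._cond:
            self._seq += 1
            heapq.heappush(self._heap, (time.time() + delay, self._seq, fn, interval))
            self._cond.notify()           # head may have changed: re-check sleep

    def _run(self):
        while True:
            with self._cond:
                while not self._heap:
                    self._cond.wait()
                run_at, _, fn, interval = self._heap[0]
                now = time.time()
                if run_at > now:
                    # Sleep until due, but wake early if schedule()
                    # pushes a task that becomes the new head.
                    self._cond.wait(timeout=run_at - now)
                    continue
                heapq.heappop(self._heap)
            fn()                          # run outside the lock
            if interval is not None:      # recurring: re-insert with next run time
                self.schedule(fn, interval, interval)

# Usage (hypothetical): check uptime in 1 second, then every 60 seconds.
# s = Scheduler()
# s.schedule(check_server, delay=1, interval=60)
```

In a real system you'd likely dispatch fn() to a worker pool rather than running it on the scheduler thread, so one slow task can't delay the whole queue.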
We encountered this same issue while designing Revalee, an open source project for scheduling triggered callbacks. In the end, we wrote our own priority queue class (we called ours a ScheduledDictionary) to handle the use case you outlined in your question. As a free, open source project, the complete source code (C#, in this case) is available on GitHub. I'd recommend that you check it out.
Are the simulation loops separate? By separate, I mean: does JMeter wait for all threads to be done before beginning a new iteration of the loop? Or does JMeter just let every thread perform a request X times, without stopping?
An additional question: could one change the number of threads dynamically? Running a simulation across a range of thread counts (e.g. 100-1500) would be nice.
Each thread is completely independent. So when you have a loop count set, if a thread finishes its first loop of execution, it goes for another round (as per the loop count), irrespective of the completion of other threads.
You can use a variable for the number of threads and set its value via properties, e.g. in a property file or on the command line (see the example below). But once the test is running, you cannot change the number of threads.
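For example (the property name threads is arbitrary here), put a property lookup in the Thread Group's "Number of Threads (users)" field and supply the value at launch:

```
Number of Threads (users): ${__P(threads,100)}

jmeter -n -t test.jmx -Jthreads=500
```

Running the same plan with -Jthreads=100, -Jthreads=500, and so on covers a range of thread counts without editing the test plan.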
Hope it is clear!
In addition to vIns' answer:
You CAN change the load dynamically during execution. The thread count is static, but the threads' fire rate is something you can influence.
Look into the combination of the Beanshell Server and the Constant Throughput Timer.
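One common wiring of the two (the property name throughput and the values below are examples, and enabling the Beanshell server requires setting beanshell.server.port in jmeter.properties): drive the timer from a property, then change the property mid-test through the server:

```
Target throughput (samples/min): ${__P(throughput,3600)}

setprop("throughput","1200");  // sent via the Beanshell server; setprop()
                               // is a helper from JMeter's default init files
```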