How to make Python's socketIO-client library's wait(seconds=1) non-blocking? - socket.io

I'm using socketIO-client 0.6.5 for a Python client that communicates with a Node server that uses socket.io. My problem is that in order for a client listener to receive data from the server, I have to use the wait() method. wait() hangs the program indefinitely, while wait(seconds=n) hangs the program for that number of seconds.
I'm using this for a game where the listener will be executed inside the game loop continuously, but if I use the wait() method, the game loop gets stuck for that number of seconds, which I can't have. The code is way too big to post here, but here is a snippet that is representative of the actual code.
def main():
    sc = Sock_Con()
    while True:
        sc.push_player_location(2, 3)
        socketIO.on('get_player_location', sc.on_player_location)
        socketIO.wait(seconds=1)
If I don't use the wait() method, the data never gets picked up by the client. If I do use it, the program hangs for the number of seconds. Is there something I'm missing, or is there a workaround?
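One common workaround (not from the question itself) is to move the blocking wait() call onto a daemon thread, register the listener once, and let the game loop run freely. A minimal sketch, assuming a server on localhost:8000 and a hypothetical push_player_location event name; note the library isn't documented as thread-safe, so guard any game state shared with the callback:

import time
from threading import Thread
from socketIO_client import SocketIO

def on_player_location(*args):
    # runs on the background thread; lock any shared game state
    print('player location:', args)

socketIO = SocketIO('localhost', 8000)                   # assumed server address
socketIO.on('get_player_location', on_player_location)   # register once, not every frame

# run the blocking receive loop off the game loop
Thread(target=socketIO.wait, daemon=True).start()

while True:
    # game loop continues; the listener fires on the background thread
    socketIO.emit('push_player_location', {'x': 2, 'y': 3})
    time.sleep(1 / 60)   # frame tick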

Related

Why do Ruby fibers that run sequentially without a scheduler set run concurrently when a scheduler is set?

I have the following Gemfile:
source "https://rubygems.org"
ruby "3.1.2"
gem "libev_scheduler", "~> 0.2"
and the following Ruby code in a file called main.rb:
require 'libev_scheduler'

set_sched = ARGV[0] == "--set-sched"

if set_sched then
  Fiber.set_scheduler Libev::Scheduler.new
end

N_FIBERS = 5

fibers = []

N_FIBERS.times do |i|
  n = i + 1
  fiber = Fiber.new do
    puts "Beginning calculation ##{n}..."
    sleep 1
  end
  fibers.push({fiber: fiber, n: n})
end

fibers.each do |fiber|
  fiber[:fiber].resume
end

puts "Finished all calculations!"
I'm executing the code with Ruby 3.1.2 installed via RVM.
When I run the program with time bundle exec ruby main.rb, I get the following output:
Beginning calculation #1...
Beginning calculation #2...
Beginning calculation #3...
Beginning calculation #4...
Beginning calculation #5...
Finished all calculations!
real 0m5.179s
user 0m0.146s
sys 0m0.027s
When I run the program with time bundle exec ruby main.rb --set-sched, I get the following output:
Beginning calculation #1...
Beginning calculation #2...
Beginning calculation #3...
Beginning calculation #4...
Beginning calculation #5...
Finished all calculations!
real 0m1.173s
user 0m0.150s
sys 0m0.021s
Why do my fibers only run concurrently when I've set a scheduler? Some older Stack Overflow answers (like this one) state that fibers are a construct for flow control, not concurrency, and that it is impossible to use fibers to write concurrent code. My results seem to contradict this.
My understanding so far of fibers is that they are meant for cooperative concurrency, as opposed to preemptive concurrency. Therefore, in order to get concurrency out of them, you'd need to have them yield to some other code as early as they can (e.g. when IO begins) so that the other code can be executed while the fiber waits for its next opportunity to execute.
Based on this understanding, I think I understand why my code without a scheduler isn't able to run concurrently. It sleeps, and because it lacks yield statements before and after the code in it, there are no points in time where it could yield control to any other code I've written. But when I add a scheduler, it appears to somehow yield to something. Is sleep detecting the scheduler and yielding to it so that my code resuming the fibers is immediately yielded to, making it able to immediately resume all five fibers?
Great question!
As #stefan noted above, Ruby 3.0 introduced the concept of a "non-blocking fiber." The way the actual non-blocking behavior is accomplished is left up to the scheduler implementation. There is no default scheduler as far as I know; per the Ruby docs:
If Fiber.scheduler is not set in the current thread, blocking and non-blocking fibers’ behavior is identical.
Now, to answer your last question:
But when I add a scheduler, it appears to somehow yield to something ... Is sleep detecting the scheduler and yielding to it so that my code resuming the fibers is immediately yielded to, making it able to immediately resume all five fibers?
You're onto something! When you set a fiber scheduler, it's expected to conform to Fiber::SchedulerInterface, which defines several "hooks." One of those hooks is #kernel_sleep, which is invoked by Kernel#sleep (and Mutex#sleep)!
I can't say I've read much libev code, but you can find libev_scheduler's implementation of that hook here.
The idea is (emphasis my own):
The scheduler runs into a wait loop, checking all the blocked fibers (which it has registered on hook calls) and resuming them when the awaited resource is ready (e.g. I/O ready or sleep time elapsed).
So, in summary:
Your fiber calls Kernel#sleep with some duration.
Kernel#sleep calls the scheduler's #kernel_sleep hook with that same duration.
The scheduler "somehow registers what the current fiber is waiting on, and yields control to other fibers with Fiber.yield" (quote from the docs there).
"The scheduler runs into a wait loop, checking all the blocked fibers (which it has registered on hook calls) and resuming them when the awaited resource is ready (e.g. I/O ready or sleep time elapsed)."
Hope this helps!

Kotlin coroutines slow start

I've been attempting to do a bit of a performance review on an app I have. It's a back-end Kotlin app that just pulls in some data, does a bit of data transformation, and dumps it out; nothing too fancy. One thing that caught my eye was the final bit of execution, where we dump our final data onto a queue. When we start up the app, that final network call takes a very long time at first, sometimes over a second. Normally we run this network call in a coroutine to stop that last call blocking everything, but when I started trying to time the coroutine and the network call separately I got some odd results: from what I can see, the coroutine can take forever to launch/complete compared to the network call. It's entirely possible I'm not recording things correctly, but this is the general timing approach I have:
val coroutineTime = Instant.now().toEpochMilli()
GlobalScope.launch {
    executionTime = measureTimeMillis { /* do message sending */ }
    totalTime = Instant.now().toEpochMilli() - coroutineTime
    // log out executionTime and totalTime
}
Now what I'll see is something like:
- totalTime = ~800ms
- executionTime = ~150ms
These aren't one-offs either. I have multiple of these processes going on at once (up to 10 threads, I think), and the first total times will always be significantly longer than the actual executionTime/network call. Eventually, after a few dozen messages, the overhead calms down and these times become equivalent at about 15ms, but having nearly 700ms of overhead on coroutine start-up seems insane to me.
Is this normal/expected behavior? I've tested this in a separate app and seen similar but less extreme results, where the first coroutine takes about 70ms to boot up. I'm struggling to find any other examples of this type of discussion outside of Kotlin being used in Android development.
As a first note, it's almost never a good idea to use the GlobalScope unless you really know what you're doing. This is why it was marked as a delicate API. You should instead use a scope that is appropriately closed (following the lifecycle of whatever component launches this work).
Now, AFAIK, GlobalScope runs on the default dispatcher, so maybe this is due to a cold start of that default thread pool. Later, it could also be a problem to use this dispatcher for network calls, depending on the number of concurrent coroutines you have. It would be more appropriate to use Dispatchers.IO instead for IO-bound work (or a custom thread pool).
It still doesn't explain the cold start, but I would first change that before investigating.
This is expected behavior if you use coroutines inappropriately ;-)
My guess is that your message sending is a blocking operation. By default, GlobalScope.launch() dispatches coroutines with Dispatchers.Default, which is designed for CPU-intensive operations; it has a limited number of threads, and you should never block when using it. If you do, you may run out of threads, and coroutines will need to wait until some blocking operation finishes.
If you need to run blocking or IO code, you should use Dispatchers.IO instead:
GlobalScope.launch(Dispatchers.IO) { ... }
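The same rule exists in other async runtimes. As a point of comparison (illustrative names only, not from the answer above), here is a rough Python/asyncio sketch of the mistake and the fix: blocking the event loop's thread versus handing the blocking call to a worker thread pool:

import asyncio
import time

def send_message():
    time.sleep(0.15)   # stand-in for a blocking network call

async def wrong():
    # blocks the event loop thread: no other coroutine can run for 150 ms
    send_message()

async def right():
    # hands the blocking call to a thread pool, analogous to Dispatchers.IO
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, send_message)

asyncio.run(right())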
I was facing a similar issue. I have a function that loads some data from shared prefs and makes some calculations on the data (all of this done on Dispatchers.Default), then returns the result on Dispatchers.Main. I measured how long it took the coroutine to actually start executing the block inside Dispatchers.Main.launch { } after the calculations are done (the time from tag2 to tag3 below), and got about 950ms (!!). Here is the function:
fun someName() {
    CoroutineScope(Dispatchers.Default).launch {
        val time = System.currentTimeMillis()
        // load data and do calculations
        Log.d("tag2", "load and calculations took " + (System.currentTimeMillis() - time))
        CoroutineScope(Dispatchers.Main.immediate).launch {
            Log.d("tag3", "reached main thread code " + (System.currentTimeMillis() - time))
            // do something
            Log.d("tag4", "do something took " + (System.currentTimeMillis() - time))
        }
    }
}
But then I realized this happens during app launch, while the main thread is busy creating all the UI, so even with .immediate it takes time until the main thread gets to execute the dispatched code... I then tried running this function after the app had already started and was idle, and found that tag2 to tag3 takes about 1ms (!!) (with .immediate). So it looks like when dispatching something on a coroutine, if the thread isn't busy, it will start immediately.

Safe await on function in another process

TL;DR
How do I safely await the execution of a function (it takes a str and an int as arguments and doesn't require any other context) in a separate process?
Long story
I have an aiohttp.web web API that uses a Boost.Python wrapper for a C++ extension, run under Gunicorn (I plan to deploy it on Heroku), and load-tested with Locust.
About the extension: it has just one function that performs a blocking operation: it takes one string (and one integer for timeout management), does some calculations on it, and returns a new string. For every input string there is only one possible output (except on timeout, in which case a C++ exception must be raised and translated by Boost.Python to a Python-compatible one).
In short, a handler for a specific URL executes the code below:
res = await loop.run_in_executor(executor, func, *args)
where executor is a ProcessPoolExecutor instance, and func is the function from the C++ extension module. (In the real project, this code is in a coroutine method of a class, and func is a classmethod that only executes the C++ function and returns the result.)
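For reference, a minimal self-contained sketch of that pattern; the func body here is a stand-in for the real C++ extension function:

import asyncio
from concurrent.futures import ProcessPoolExecutor

def func(data, timeout):
    # stand-in for the Boost.Python-wrapped C++ function
    return data.upper()

async def handle(executor, data):
    loop = asyncio.get_running_loop()
    # run the blocking call in a worker process and await its result
    return await loop.run_in_executor(executor, func, data, 5)

async def main():
    with ProcessPoolExecutor(max_workers=4) as executor:
        print(await handle(executor, "hello"))

if __name__ == "__main__":
    asyncio.run(main())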
Error catching
When a new request arrives, I extract its POST data with request.post() and then store that data in an instance of a custom class named Call (because I have no idea what else to name it). So the call object contains all the input data (the string), the request receiving time, and the unique id that comes with the request.
Then it proceeds to a class named Handler (not the aiohttp request handler), which passes its input to another class's method with loop.run_in_executor inside. Handler has a logging system that works like middleware: it reads the id and receiving time of every incoming call object and logs a message telling you whether it is just starting to execute, has successfully executed, or has run into trouble. Handler also has a try/except and stores all errors inside the call object, so that the logging middleware knows what error occurred, or what output the extension returned.
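A rough sketch of that shape (the class and field names follow the description above; the bodies are mine):

import asyncio
import logging
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Call:
    data: str                                           # input string from request.post()
    received: float = field(default_factory=time.time)  # request receiving time
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    result: Optional[str] = None
    error: Optional[Exception] = None

class Handler:
    def __init__(self, executor, func):
        self.executor = executor
        self.func = func

    async def run(self, call):
        logging.info("call %s: starting", call.id)
        try:
            loop = asyncio.get_running_loop()
            call.result = await loop.run_in_executor(self.executor, self.func, call.data, 5)
            logging.info("call %s: done", call.id)
        except Exception as exc:
            call.error = exc
            logging.exception("call %s: failed", call.id)
        return call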
Testing
I have a unit test that just creates 256 coroutines with this code inside and an executor with 256 workers, and it works well.
But when testing with Locust, a problem appears. I use 4 Gunicorn workers and 4 executor workers for this kind of testing. At some point the application just starts to return wrong output.
My Locust TaskSet is configured to log every failed response with all available information: output string, error string, input string (which is returned by the application too), and id. All simulated requests are the same, but the id is unique for each.
The situation is better when setting Gunicorn's max_requests option to 100 requests, but failures still occur.
Interestingly, sometimes I can trigger a "wrong output" period simply by stopping and restarting Locust's test.
I need a 100% guarantee that my web API works as I expect.
UPDATE & solution
I just asked my teammate to review the C++ code, and the problem was in global variables. Somehow this wasn't a problem for 256 parallel coroutines, but it was under Gunicorn.

Understanding Celluloid Pool

I guess my understanding of Celluloid Pool is sort of broken. I will try to explain below, but before that, a quick note.
Note: Our system is running against a very fast client passing messages over ZeroMQ.
With the following vanilla Celluloid app:
class VanillaClient
  include Celluloid::ZMQ

  def read
    loop { async.evaluate_response(socket.read_multipart) }
  end

  def evaluate_response(data)
    ## the reason for using defer can be found over here.
    Celluloid.defer do
      ExternalService.execute(data)
    end
  end
end
Our system results in failure after some time, with the reason 'Can't spawn more thread' (or something like it).
So we intended to use a Celluloid Pool (to avoid the above-mentioned problem) so that we can limit the number of threads spawned.
My understanding of Celluloid Pool is:
Celluloid Pool maintains a pool of actors for you so that you can distribute your tasks in parallel.
Hence, I decided to test it, but according to my test cases, it seems to behave serially (i.e. things never get distributed or happen in parallel).
Example to replicate this.
sender-1.rb
## Send message `1` to the the_client.rb
sender-2.rb
## Send message `2` to the the_client.rb
the_client.rb
## take message from sender-1 and sender-2 and return it back to receiver.rb
## heads up: the `sleep` is introduced to test/replicate the IO block that happens in the actual code.
receiver.rb
## print the message obtained from the_client.rb
If sender-2.rb is run before sender-1.rb, it appears that the pool gets blocked for 20 sec (the sleep time in the_client.rb, as can be seen over here) before consuming the data sent by sender-1.rb.
It behaves the same under Ruby 2.2.2 and JRuby 9.0.5.0. What could be the possible causes for the pool to act in such a manner?
Your pool call is not asynchronous.
Execution of evaluate on #pool still needs to be .async, as in your original example without pools. You still want asynchronous behavior, but you also want to have multiple handler actors.
Next you will likely hit the Pool.async bug.
https://github.com/celluloid/celluloid-pool/issues/6
This means that after 5 hits to evaluate, your pool will become unresponsive until at least one actor in the pool has finished. Worst-case scenario: if you get 6+ requests in rapid succession, the 6th will take 120 seconds, because it will take 5*20 seconds before it executes, then 20 seconds to execute itself.
Depending on what your actual operation is that's causing you delays -- you might need to adjust your pool size down the line.
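The synchronous-versus-async distinction here is the same one Python's multiprocessing pool draws between apply and apply_async; a small illustration (with a 2-second stand-in for the 20-second block) of why the blocking form serializes everything:

import time
from multiprocessing import Pool

def evaluate(msg):
    time.sleep(2)   # stand-in for the IO block in the_client.rb
    return msg

if __name__ == "__main__":
    with Pool(2) as pool:
        start = time.time()
        for msg in [1, 2]:
            pool.apply(evaluate, (msg,))    # blocks the caller: serial, ~4s
        print("apply:       %.1fs" % (time.time() - start))

        start = time.time()
        results = [pool.apply_async(evaluate, (msg,)) for msg in [1, 2]]  # returns immediately
        for r in results:
            r.get()                         # wait for both: ~2s, work ran in parallel
        print("apply_async: %.1fs" % (time.time() - start))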

Asynchronous IO server : Thin(Ruby) and Node.js. Any difference?

I want to clarify my understanding of asynchronous IO and non-blocking servers.
When dealing with Node.js, it is easy to understand the concept:
var express = require('express');
var app = express();

app.get('/test', function(req, res){
    setTimeout(function(){
        console.log("sleep doesn't block, and now return");
        res.send('success');
    }, 2000);
});

var server = app.listen(3000, function() {
    console.log('Listening on port %d', server.address().port);
});
I know that while node.js is waiting for the 2 seconds of setTimeout, it is able to serve another request at the same time; once the 2 seconds have passed, it will call its callback function.
How about in the Ruby world, with the Thin server?
require 'sinatra'
require 'thin'
set :server, %w[thin]

get '/test' do
  sleep 2    # <----
  "success"
end
The code snippet above uses the Thin server (non-blocking, asynchronous IO). Speaking of asynchronous IO, I want to ask: when reaching sleep 2, is the server able to serve another request at the same time, given that sleep 2 is blocking?
The difference between the node.js and Sinatra code is that:
node.js is written in an asynchronous way (the callback approach)
Ruby is written in a synchronous way (but works in an asynchronous way under the covers? Is that true?)
If the above statement is true, it seems that Ruby is better, as the code looks cleaner than a bunch of callback code in node.js.
Kit
Sinatra / Thin
Thin will be started in threaded mode if it is started by Sinatra (i.e. with ruby asynchtest.rb).
This means that your assumptions are correct; when reaching sleep 2, the server is able to serve another request at the same time, but on another thread.
I'd like to show this behavior with a simple test:
#asynchtest.rb
require 'sinatra'
require 'thin'
set :server, %w[thin]

get '/test' do
  puts "[#{Time.now.strftime("%H:%M:%S")}] logging /test starts on thread_id:#{Thread.current.object_id} \n"
  sleep 10
  "[#{Time.now.strftime("%H:%M:%S")}] success - id:#{Thread.current.object_id} \n"
end
Let's test it by starting three concurrent HTTP requests (here the timestamp and thread id are the relevant parts to observe):
The test demonstrates that we get three different threads (one for each concurrent request), namely:
70098572502680
70098572602260
70098572485180
Each of them starts concurrently (the start is pretty immediate, as we can see from the execution of the puts statement), then waits (sleeps) ten seconds, and after that time flushes the response to the client (the curl process).
deeper understanding
Quoting wikipedia - Asynchronous_I/O:
In computer science, asynchronous I/O, or non-blocking I/O, is a form of input/output processing that permits other processing to continue before the transmission has finished.
The above test (Sinatra/Thin) actually demonstrates that it's possible to start a first request from curl (the client) to Thin (the server) and, before we get the response to the first (before the transmission has finished), it's possible to start a second and a third request, and these last requests aren't queued but start concurrently with the first one; in other words, it permits other processing to continue.
Basically this is a confirmation of @Holger Just's comment: sleep blocks the current thread, but not the whole process. That said, in Thin, most stuff is handled in the main reactor thread, which thus works similarly to the one thread available in node.js: if you block it, nothing else scheduled in this thread will run. In Thin/EventMachine, you can however defer stuff to other threads.
These linked answers have more details: "Is Sinatra multi-threaded?" and "Single thread still handles concurrency request?"
Node.js
To compare the behavior of the two platforms, let's run an equivalent asynchtest.js on node.js; as we did in asynchtest.rb, to understand what happens we add a log line when processing starts.
Here is the code of asynchtest.js:
var express = require('express');
var app = express();

app.get('/test', function(req, res){
    console.log("[" + getTime() + "] logging /test starts\n");
    setTimeout(function(){
        console.log("sleep doesn't block, and now return");
        res.send('[' + getTime() + '] success \n');
    }, 10000);
});

var server = app.listen(3000, function(){
    console.log("listening on port %d", server.address().port);
});
Let's start three concurrent requests in node.js and observe the same behavior:
Of course, it's very similar to what we saw in the previous case.
This response doesn't claim to be exhaustive on the subject, which is very complex and deserves further study and specific evidence before drawing conclusions for your own purposes.
There are lots of subtle differences, almost too many to list here.
First, don't confuse "coding style" with "event model". There's no reason you need to use callbacks in Node.js (see the various 'promise' libraries). And Ruby has EventMachine if you like callback-structured code.
Second, Thin (and Ruby) can have many different multi-tasking models. You didn't specify which one.
In Ruby 1.8.7, "Thread" will create green threads. The language actually turns a "sleep N" into a timer call, and allows other statements to execute. But it's got a lot of limitations.
Ruby 1.9.x can create native OS threads. But those can be hard to use (spinning up 1000's is bad for performance, etc.)
Ruby 1.9.x has "Fibers" which are a much better abstraction, very similar to Node.
In any comparison, you also have to take into account the entire ecosystem: Pretty much any node.js code will work in a callback. It's really hard to write blocking code. But many Ruby libraries are not Thread-aware out of the box (require special configuration, etc). Many seemingly simple things (DNS) can block the entire ruby process.
You also need to consider the language. Node.JS, is built on JavaScript, which has a lot of dark corners to trip you up. For example, it's easy to assume that JavaScript has Integers, but it doesn't. Ruby has fewer dark corners (such as Metaprogramming).
If you are really into evented architectures, you should really consider Go. It has the best of all worlds: The evented architecture is built in (just like in Node, except it's multiprocessor-aware), there are no callbacks (just like in Ruby), plus it has first-class messaging (very similar to Erlang). As a bonus, it will use a fraction of the memory of a Node or Ruby process.
No, node.js is fully asynchronous; setTimeout will not block script execution, it just delays the part inside it. So these parts of the code are not equal. Choosing a platform for your project depends on the tasks you want to accomplish.
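For readers coming from Python, the same blocking-versus-yielding split can be sketched with asyncio (illustrative only, not from any of the answers above):

import asyncio
import time

async def blocking_handler():
    time.sleep(2)            # blocks the whole event loop, like Sinatra's sleep 2 on one thread
    return "success"

async def non_blocking_handler():
    await asyncio.sleep(2)   # yields to the loop, like node's setTimeout
    return "success"

async def main():
    start = time.monotonic()
    # two concurrent "requests" finish in ~2s, not ~4s
    await asyncio.gather(non_blocking_handler(), non_blocking_handler())
    print("elapsed: %.1fs" % (time.monotonic() - start))

asyncio.run(main())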
