Ruby multithreading queues

I have a problem synchronizing threads and I have no idea how to do it; can someone help me?
The threads have to be launched in a specific order, which is the following:
Threads 1 and 7 can run simultaneously, and when one of them finishes, the next thread is launched (i.e. thread 2 and/or thread 6); the same goes for threads 3 and 5.
Finally, after both thread 3 and thread 5 have finished running, the last thread, thread 4, runs.
This is the code I had begun with, but I am stuck at the queue implementation:
MUTEX = Mutex.new
high_condition = ConditionVariable.new
low_condition = ConditionVariable.new
threads = []

def you_shall_not_pass
  order = Thread.current["number"]
end

7.times do |i|
  threads << Thread.new {
    MUTEX.synchronize {
      Thread.current["number"] = i
      you_shall_not_pass
    }
  }
end

threads.map(&:join)

Use Ruby's Queue as a counting semaphore. Queue#pop blocks when the queue is empty, so you can use the queue to hand out a limited number of tokens to threads, requiring that each thread acquire a token before it runs and release it when it's finished. If you seed the queue with 2 tokens, you ensure only 2 threads run at a time, and you can create your threads in whatever order you like.
require 'thread'

semaphore = Queue.new
2.times { semaphore.push(1) } # Add two concurrency tokens
puts "#{semaphore.size} available tokens"

threads = []
[1, 7, 2, 6, 3, 5, 4].each do |i|
  puts "Enqueueing thread #{i}"
  threads << Thread.new do
    semaphore.pop # Acquire a token (blocks if none are available)
    puts "#{Time.now} Thread #{i} running. #{semaphore.size} available tokens. #{semaphore.num_waiting} threads waiting."
    sleep(rand(10)) # Simulate work
    semaphore.push(1) # Release the token
  end
end
threads.each(&:join)
puts "#{semaphore.size} available tokens"
$ ruby counting_semaphore.rb
2 available tokens
Enqueueing thread 1
Enqueueing thread 7
2015-12-04 08:17:11 -0800 Thread 7 running. 1 available tokens. 0 threads waiting.
2015-12-04 08:17:11 -0800 Thread 1 running. 0 available tokens. 0 threads waiting.
Enqueueing thread 2
Enqueueing thread 6
2015-12-04 08:17:11 -0800 Thread 2 running. 0 available tokens. 0 threads waiting.
Enqueueing thread 3
Enqueueing thread 5
Enqueueing thread 4
2015-12-04 08:17:19 -0800 Thread 6 running. 0 available tokens. 3 threads waiting.
2015-12-04 08:17:19 -0800 Thread 5 running. 0 available tokens. 2 threads waiting.
2015-12-04 08:17:21 -0800 Thread 3 running. 0 available tokens. 1 threads waiting.
2015-12-04 08:17:22 -0800 Thread 4 running. 0 available tokens. 0 threads waiting.
2 available tokens

Related

`loop{}` versus `loop{sleep 1}`

I am using a loop to wait for a keyboard interrupt and then allow some clean-up operations before exit, in a multithreaded environment.
begin
  loop {}
rescue Interrupt
  p "Ctrl-C Pressed.. Cleaning Up & Shutting Down"
  loop do
    break if exit_bool.false?
  end
  exit 130
end
This piece of code runs in the main thread. There are multiple threads performing several file and DB ops. exit_bool is an atomic var set by other threads to indicate they are in the middle of some operation. I check for the value and wait until it turns false and then exit.
I'm wondering what the cost of loop{} is as opposed to loop{sleep x}.
loop {} busy-waits, so it results in high CPU utilization (~100%), whereas loop { sleep x } spends most of its time sleeping and does not.
Another option is to just sleep forever:
begin
  sleep
rescue Interrupt
  # ...
end
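
If you also want to avoid busy-waiting while you wait for the other threads to finish their cleanup, you can poll the flag with a short sleep instead of a tight loop. A rough sketch, assuming exit_bool is a Concurrent::AtomicBoolean from the concurrent-ruby gem (which is what the exit_bool.false? call suggests):

require 'concurrent'

# true while worker threads are in the middle of an operation;
# the workers would call exit_bool.make_false when they are done
exit_bool = Concurrent::AtomicBoolean.new(true)

begin
  sleep # park the main thread until a signal wakes it up
rescue Interrupt
  p "Ctrl-C Pressed.. Cleaning Up & Shutting Down"
  sleep 0.1 until exit_bool.false? # poll the flag without burning CPU
  exit 130
end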

How do I allow concurrent access to the same route?

I have a simple Sinatra application with one long-running route:
get '/jobs/new' do
  logger.info "jobs/new start. Thread = #{Thread.current.inspect}"
  sleep 10
  logger.info "end new..."
  erb :'jobs/new'
end

get '/jobs' do
  erb :'jobs/index'
end
Concurrent access works between different routes, but not for the same route.
For example, while one client invokes /jobs/new (the long-running request), another client can invoke /jobs in parallel. But a parallel call to the same route doesn't work: in that case Puma, the web server, always handles the requests on the same thread:
jobs/new started. Thread = #<Thread:0x007f42b128e600 run>
10 seconds later...
jobs/new ended. Thread = #<Thread:0x007f42b128e600 run>
jobs/new started. Thread = #<Thread:0x007f42b128e600 run> <-- new call. Has to wait till first has finished
The other route is handled by a different thread, even while route 1 is still running:
jobs/new started. Thread = #<Thread:0x007f42b128e600 run>
2 seconds later...
jobs started. Thread = #<Thread:0x007f541f581a40 run> <--other thread
8 seconds later...
jobs/new ended. Thread = #<Thread:0x007f42b128e600 run>
jobs/new started. Thread = #<Thread:0x007f42b128e600 run>
I tried running the app with Thin in threaded mode and with Puma; the behavior is the same.
Whatever you did, I think it was not right.
Running this code:
# config.ru
require 'bundler'
Bundler.require

get '/jobs/new' do
  logger.info "jobs/new start. Thread = #{Thread.current.inspect}"
  sleep 10
  logger.info "end new..."
  "jobs/new"
end

run Sinatra::Application
with puma:
Puma starting in single mode...
* Version 2.7.1, codename: Earl of Sandwich Partition
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://0.0.0.0:9292
Use Ctrl-C to stop
I, [2013-12-12T14:04:48.820907 #9686] INFO -- : jobs/new start. Thread = #<Thread:0x007fa5667eb7c0 run>
I, [2013-12-12T14:04:50.282718 #9686] INFO -- : jobs/new start. Thread = #<Thread:0x007fa566731e38 run>
I, [2013-12-12T14:04:58.821509 #9686] INFO -- : end new...
127.0.0.1 - - [12/Dec/2013 14:04:58] "GET /jobs/new HTTP/1.1" 200 8 10.0132
I, [2013-12-12T14:05:00.283496 #9686] INFO -- : end new...
127.0.0.1 - - [12/Dec/2013 14:05:00] "GET /jobs/new HTTP/1.1" 200 8 10.0015
^C- Gracefully stopping, waiting for requests to finish
- Goodbye
Results in 2 different threads!
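
One thing worth ruling out when testing this: browsers often hold back a second request to the exact same URL until the first one has completed, which can make the server look single-threaded even when it isn't. A quick way to drive the route concurrently from a single Ruby script (assuming the app is listening on port 9292 as above):

require 'net/http'

# Fire two overlapping GETs at the long-running route
threads = 2.times.map do
  Thread.new { Net::HTTP.get(URI('http://localhost:9292/jobs/new')) }
end
threads.each(&:join)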

Ruby: begin, sleep, retry: where to put incrementer

I have a method 'rate_limited_follow' that takes my Twitter user account and a user and follows that user; I call it for every user in an array 'users'. Twitter has strict rate limits, so the method deals with that contingency by sleeping for 15 minutes and then retrying. (I didn't write this method; I got it from the Twitter ruby gem API.) You'll notice that it checks the number of attempts against MAX_ATTEMPTS.
My users array has about 400 users that I'm trying to follow. The loop adds about 15 users at a time (when the rate limit seems to kick in), then sleeps for 15 minutes. Since I set the MAX_ATTEMPTS constant to 3 (just to test it), I expected it to stop trying once it had added 45 users (3 times 15), but it has gone past that, continuing to add 15 users roughly every fifteen minutes. So it seems as if num_attempts is somehow staying below 3, even though the cycle has run more than 3 times. Is there something I don't understand about the code? Once 'sleep' finishes and it hits 'retry', where does execution resume? Is there some reason num_attempts isn't incrementing?
Calling the method in the loop
>> users.each do |i|
?>   rate_limited_follow(myuseraccount, i)
>> end
Method definition with constant
MAX_ATTEMPTS = 3

def rate_limited_follow(account, user)
  num_attempts = 0
  begin
    num_attempts += 1
    account.twitter.follow(user)
  rescue Twitter::Error::TooManyRequests => error
    if num_attempts <= MAX_ATTEMPTS
      sleep(15 * 60) # minutes * 60 seconds
      retry
    else
      raise
    end
  end
end
Each call to rate_limited_follow resets your number of attempts; in other words, you are tracking attempts per user rather than attempts across your entire array of users. (To answer the sub-question: retry jumps back to the top of the begin block, so num_attempts += 1 does run on every retry within a single call; the counter simply starts over at 0 on the next call.)
Hoist the initialization of num_attempts out of rate_limited_follow, so that it isn't reset by each call, and you'll get the behavior you're looking for.
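
A rough sketch of one way to do that, with the counter living in the calling loop so it survives across users. rate_limit_hits is my name, not the original code's; it counts only rate-limit hits rather than every follow attempt, and the snippet assumes the same myuseraccount, users and MAX_ATTEMPTS from the question:

rate_limit_hits = 0 # shared across the whole array, never reset per user

users.each do |user|
  begin
    myuseraccount.twitter.follow(user)
  rescue Twitter::Error::TooManyRequests
    rate_limit_hits += 1
    raise if rate_limit_hits > MAX_ATTEMPTS # give up after 3 rate-limit windows
    sleep(15 * 60) # wait out the window, then retry this user
    retry
  end
end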

Ruby TCP/IP Client Threading

I'm using ruby socket for a simple ping-pong scenario.
(The client is sending a string to the server, and the server is sending the string back - that's all)
Simple Client:
require 'socket'

socket = TCPSocket.new "localhost", 5555
socket.write "test-string\n"
puts socket.gets.inspect
It works fine until threads come into play:
socket = TCPSocket.new "localhost", 5555
threads = []

5.times do |t|
  threads << Thread.new(t) do |th|
    socket.write "#{t}\n"
    puts "THREAD: #{t} --> [ #{socket.recv(1024).inspect} ]"
  end
end

threads.each { |th| th.join }
# Output: THREAD: 3 --> [ "0\r\n1\r\n2\r\n3\r\n4\r\n" ]
The problem here is that every thread is reading from the same socket, and as a result an arbitrary thread can receive ALL of the responses from the server, as you can see from the output.
Preferably each thread should receive its own response; the output should not look like
THREAD: 3 --> [ "0\r\n1\r\n2\r\n3\r\n4\r\n" ]
but rather like:
THREAD: 0 --> [ "0\r\n" ]
THREAD: 1 --> [ "1\r\n" ]
THREAD: 2 --> [ "2\r\n" ]
THREAD: 3 --> [ "3\r\n" ]
THREAD: 4 --> [ "4\r\n" ]
What is the deal here?
All your threads share the same socket. You write your messages to the socket and then all 5 threads sit waiting for data to become available to read.
Depending on the behaviour of the other end, buffering in the network stack and so on, the data could come back in one chunk or in several. In your particular circumstances it arrives in one chunk and one thread happens to get lucky.
To get the behaviour you want you should use one socket per thread.
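
A minimal sketch of that, assuming the same echo server on localhost:5555 as in the question:

require 'socket'

threads = 5.times.map do |t|
  Thread.new do
    # Each thread opens its own connection, so replies cannot get mixed up
    socket = TCPSocket.new "localhost", 5555
    socket.write "#{t}\n"
    puts "THREAD: #{t} --> [ #{socket.gets.inspect} ]"
    socket.close
  end
end
threads.each(&:join)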

Detect which worker returned a TTR-expired job to the queue?

I have multiple workers processing requests in a beanstalkd queue using beanstalk-client-ruby.
For testing purposes, the workers randomly dive into an infinite loop after picking up a job from the queue.
Beanstalk notices that a job has been reserved for too long and returns it to the queue for other workers to process.
How could I detect that this has happened, so that I can kill the malfunctioning worker?
Looks like I can detect that a timeout has happened:
> job.timeouts
=> 0
> sleep 10
=> nil
> job.timeouts
=> 1
Now, how can I do something like this:
> job=queue.reserve
=> 189
> job.MAGICAL_INFO_STORE[:previous_worker_pid] = $$
=> extraordinary magic happened
> sleep 10
=> nil
> job=queue.reserve
=> 189
> job.timeouts
=> 1
> kill_the_sucker(job.MAGICAL_INFO_STORE[:previous_worker_pid])
=> nil
Found a working solution myself (a rough code sketch follows the steps below):
1. Reserve a job.
2. Set up a new tube named after the job_id.
3. Push a job containing your PID to that tube.
4. When a job with timeouts > 0 is reserved, pop the PID job from the job_id tube.
5. Kill the worker.
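
In code, roughly. This is only a sketch: the queue calls (use, watch, ignore, put, reserve) and job calls (id, body, delete, timeouts) follow my memory of the beanstalk-client API, do_the_work is a hypothetical stand-in for the real processing, and in practice you may need to adjust the watch list so the inner reserve only hits the PID tube:

job = queue.reserve

if job.timeouts > 0
  # The job was already reserved once and timed out, so a worker is stuck on it.
  # That worker's PID was left behind in a tube named after the job id.
  queue.watch("pid-#{job.id}")
  pid_job = queue.reserve
  Process.kill('KILL', pid_job.body.to_i) # kill the malfunctioning worker
  pid_job.delete
  queue.ignore("pid-#{job.id}")
end

# Leave our own PID behind before starting work, so the next worker can find us
queue.use("pid-#{job.id}")
queue.put(Process.pid.to_s)

do_the_work(job) # this is where a broken worker might hang past the TTR
job.delete       # (a well-behaved worker would also clean up its own PID job here)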
