Threads in Ruby - ruby

Why does this code work (I see the output 1 2 3):
for i in 1..3
  Thread.new {
    puts i
  }
end
But why does the following code not produce the same output (I do not see 1 2 3)?
for i in 1..3
  Thread.new {
    sleep(5)
    puts i
  }
end

When the main thread reaches the end of the script, Ruby exits and kills any threads that are still running. If you add sleep 10 after the final loop, you can see the output show up. (It shows up as 3 each time, though: a for loop shares a single i across all iterations, so by the time the sleeping threads wake up they all see its final value.)
You might want something like:
threads = []
for i in 1..3
  threads << Thread.new {
    sleep 5
    puts i
  }
end
threads.each { |t| t.join }
That will wait for all the threads to terminate before exiting.
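A minimal sketch of both points together: waiting for the threads, and giving each thread its own value instead of the shared i. The helper name and the shortened sleep are illustrative assumptions, and the values are collected in a Queue rather than printed from the threads so the result is easy to inspect.

```ruby
# Illustrative helper (hypothetical name): starts one thread per value and
# waits for all of them before returning what they recorded.
def collect_from_threads(range)
  seen = Queue.new                     # thread-safe container for the results

  threads = range.map do |i|           # block parameter: a fresh i per iteration
    Thread.new(i) do |n|               # n is this thread's private copy
      sleep 0.05                       # stand-in for the sleep(5) in the question
      seen << n
    end
  end

  threads.each(&:join)                 # without this, the script may exit first
  Array.new(seen.size) { seen.pop }.sort
end

p collect_from_threads(1..3)           # prints [1, 2, 3]
```

Passing the value through Thread.new (or using each with a block parameter instead of for) is what keeps every thread from seeing the final value of the loop variable.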

Related

MRI: Multi-threading + yield to main thread not working

When I run this code, I only see "Loop 1" output once, and then just a constant stream of "Loop 2", "Loop 2"...
It's like the other loop is waiting on something and ends up blocked forever.
Any suggestions as to why?
def loop1(&block)
  loop do
    block.call("Loop 1")
    sleep 1
  end
end
def loop2(&block)
  loop do
    block.call("Loop 2")
    sleep 1
  end
end
def both(&block)
  Thread.new { loop1(&block) }
  loop2(&block)
end
both(&method(:puts))
I'm not even sure if it's the puts or the block.call() that causes the threaded loop to stop.

Ruby ThreadsWait timeout

I have the following code to block until all threads have finished (Gist):
ThreadsWait.all_waits(*threads)
What's the simplest way to set a timeout here, i.e. kill any threads that are still running after, say, 3 seconds?
Thread#join accepts an optional timeout in seconds. If the thread has not finished by then, join returns nil and the thread keeps running. Try this, for example:
5.times.map do |i|
  Thread.new do
    1_000_000_000.times { |i| i } # takes more than a second
    puts "Finished" # will never print
  end
end.each { |t| t.join(1) } # times out after a second
p 'stuff I want to execute after finishing the threads' # will print
If you have some things you want to execute before joining, you can do:
5.times.map do |i|
  Thread.new do
    1_000_000_000.times { |i| i } # takes more than a second
    puts "Finished" # will never print
  end
end.each do |thread|
  puts 'Stuff I want to do before join' # Will print, multiple times
  thread.join(1)
end
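Since join(1) only stops waiting and does not stop the thread, the "kill after 3 seconds" part of the question still needs an explicit kill. The following is a sketch under assumptions: the helper name is hypothetical, and it applies one overall deadline to the whole group rather than a separate timeout per thread.

```ruby
# Hypothetical helper: waits up to `timeout` seconds in total for all threads,
# then kills whichever are still running. Returns the threads it killed.
def join_all_or_kill(threads, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  threads.each do |t|
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    t.join([remaining, 0].max)         # join returns nil if the wait timed out
  end

  stragglers = threads.select(&:alive?)
  stragglers.each(&:kill)
  stragglers.each(&:join)              # wait until the kill has taken effect
  stragglers
end

fast = Thread.new { sleep 0.01 }
slow = Thread.new { sleep 5 }
killed = join_all_or_kill([fast, slow], 0.3)
puts killed.size                       # prints 1
```

Using a shared deadline means the total wait stays close to the timeout no matter how many threads there are, whereas calling join(1) on each thread in turn can add up to one second per thread.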

How to wait for Thread notification instead of joining all the Threads?

In short, a user makes a request to my web service, and I have to forward the request to X different APIs. I should do it in parallel, so I'm creating threads for this. As soon as the first thread answers with a valid response, I should kill the rest of the threads and give the answer back to my customer right away.
One common pattern in Ruby is to create multiple threads like
threads << Thread.new {}
threads.each { |t| t.join }
The logic I already have is something like:
results = []
threads = []
valid_answer = nil
1.upto(10) do |i|
  threads << Thread.new do
    sleep(rand(60))
    results << i
  end
end
threads.each { |t| t.join }
valid_answer = results.detect { |r| r > 7 }
But with code like the above, I'm blocking the process until all of the threads finish. It could be that one thread answers in 1 second with a valid answer (at which point I should kill all the other threads and return that answer), but instead I'm joining all the threads, which doesn't make much sense.
Is there a way in Ruby to sleep or wait until one thread answers, check whether that answer is valid, and then block/sleep again until either all of the threads are done or one of them gives me back a valid response?
Edit:
It should be done in parallel. When I get a request from the customer, I can forward the request to 5 different companies.
Each company can have a timeout up to 60 seconds (insane but real, healthcare business).
As soon as one of these companies answers, I have to check the response (whether it's a real response or an error). If it's a real response, I should kill all of the other threads and answer the customer right away (no reason to make them wait 60 seconds if one of the other requests times out). Also, there's no reason to do it in a sequential loop (that would take 5 x 60 seconds in the worst case).
Perhaps by making the main thread sleep?
def do_stuff
  threads = []
  valid_answer = nil
  1.upto(10) do |i|
    threads << Thread.new do
      sleep(rand(3))
      valid_answer ||= i if i > 7
    end
  end
  sleep 0.1 while valid_answer.nil?
  threads.each { |t| t.kill if t.alive? }
  valid_answer
end
Edit: there is a better approach with wakeup, too:
def do_stuff
  threads = []
  answer = nil
  1.upto(10) do |i|
    threads << Thread.new do
      sleep(rand(3))
      answer ||= i and Thread.main.wakeup if i > 7
    end
  end
  sleep
  threads.each { |t| t.kill if t.alive? }
  answer
end
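Another option, sketched here under assumptions rather than taken from the answers above, is to have every thread push its result onto a Queue and let the main thread pop results one at a time. Queue#pop blocks until something arrives, so there is no polling and nothing depends on when the main thread goes to sleep. The method name, the short random sleep, and the "greater than 7 is valid" rule are stand-ins for the real API calls and validation.

```ruby
# Hypothetical stand-in for the real API calls: each worker sleeps briefly and
# reports its index; an index above 7 counts as a "valid" response.
def first_valid_answer(worker_count = 10)
  results = Queue.new

  threads = (1..worker_count).map do |i|
    Thread.new(i) do |n|
      sleep(rand * 0.2)                # simulate a slow remote call
      results << n                     # report every outcome, valid or not
    end
  end

  answer = nil
  worker_count.times do                # at most one result per worker
    candidate = results.pop            # blocks until some worker reports
    if candidate > 7
      answer = candidate
      break
    end
  end
  answer
ensure
  threads&.each(&:kill)                # stop whatever is still running
end

p first_valid_answer                   # one of 8, 9 or 10
```

Because the loop pops at most one result per worker, the method also returns (with nil) when every worker has reported and none of the answers was valid, instead of waiting forever.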

Multithreading calculations in ruby

I want to create a script to calculate numbers in multiple threads. Each thread will calculate the powers of 2 but the first thread must start calculating from 2, the second from 4, and the third from 8, printing some text in-between.
Example:
Im a thread and these are my results
2
4
8
Im a thread and these are my results
4
8
16
Im a thread and these are my results
8
16
32
My failing code:
def loopa(s)
  3.times do
    puts s
    s=s**2
  end
end
threads=[]
num=2
until num == 8 do
  threads << Thread.new{ loopa(num) }
  num=num**2
end
threads.each { |x| puts "Im a thread and these are my results" ; x.join }
My incorrect results:
Im a thread and these are my results
8
64
4096
8
64
4096
8
64
4096
Im a thread and these are my results
Im a thread and these are my results
I suggest you read the "Threads and Processes" chapter of the Pragmatic Programmers' Ruby book. Here's an old version online. The section called "Creating Ruby Threads" is especially relevant to your question.
To fix the problem, you need to change your Thread.new line to this:
threads << Thread.new(num){|n| loopa(n) }
Your version doesn't work because num is shared between the threads and the main loop, which changes it while the threads are running. Passing it as an argument to Thread.new gives each thread its own block-local copy, which is no longer shared.
More Info
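Also, there's an error in your math: s=s**2 squares the value rather than doubling it.
With squaring, the output values for each starting value are:
Start 2: 2 4 16
Start 4: 4 16 256
Start 8: 8 64 4096
In addition, a thread for "8" is never started, because the until condition quits as soon as num reaches 8 (and with num=num**2 as posted, num goes 2, 4, 16 and never equals 8 at all).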
If you want clearer output, use this as the body of loopa:
3.times do
  print "#{Thread.current}: #{s}\n"
  s=s**2
end
This lets you distinguish the 3 threads. Note that it's better to use a print command with a newline-terminated string versus using puts without a newline, because the latter prints the newline as a separate instruction, which may be interrupted by another thread.
This is expected. First you start 3 threads that run asynchronously, so their output can interleave in any order. Only then do you print 'Im a thread and these are my results' and join each thread. Also remember that the block closes over the variable num itself, not the value it had when the thread was created, so when you reassign num the change is visible in every thread. To avoid that, write:
threads = (1..3).map do |i|
  puts "I'm starting thread no #{i}"
  Thread.new { loopa(2**i) }
end
I feel the need to post a mathematically correct version:
def loopa(s)
  3.times do
    print "#{Thread.current}: #{s}\n"
    s *= 2
  end
end
threads=[]
num=2
while num <= 8 do
  threads << Thread.new(num){|n| loopa(n) }
  num *= 2
end
threads.each { |x| print "Im a thread and these are my results\n" ; x.join }
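To get exactly the grouped output the question asks for, one possibility is to let each thread return its numbers and do all the printing from the main thread. This is a sketch, not part of the answers above; the helper names are hypothetical. Thread#value joins the thread and returns the result of its block.

```ruby
# Hypothetical helper: the `count` values a thread should report,
# starting at `start` and doubling each time.
def doubling_sequence(start, count = 3)
  Array.new(count) { |k| start * 2**k }
end

# Hypothetical helper: runs one thread per starting value and builds the
# report lines in the order of `starts`, regardless of which thread finishes first.
def thread_report(starts)
  threads = starts.map { |s| Thread.new(s) { |n| doubling_sequence(n) } }

  threads.flat_map do |t|
    ["Im a thread and these are my results", *t.value]  # value waits for the thread
  end
end

puts thread_report([2, 4, 8])
```

Since nothing is printed inside the threads, the header lines and the numbers cannot interleave, and the output comes out as 2 4 8, then 4 8 16, then 8 16 32, each group under its own header.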
Bonus 1: threadless solution (naive)
power = 1
workers = 3
iterations = 3
(power ... power + workers).each do |pow|
  worker_pow = 2 ** pow
  puts "I'm a worker and these are my results"
  iterations.times do
    puts worker_pow
    worker_pow *= 2
  end
end
Bonus 2: threadless solution (cached)
power = 1
workers = 3
iterations = 3
cache_size = workers + iterations - 1
# generate all the values upfront
cache = []
(power ... power+cache_size).each do |i|
  cache << 2**i
end
workers.times do |wnum|
  puts "I'm a worker and these are my results"
  # use a sliding window to grab the part of the cache we want
  puts cache[wnum, iterations]
end

Reading a file N lines at a time in ruby

I have a large file (hundreds of megs) that consists of filenames, one per line.
I need to loop through the list of filenames, and fork off a process for each filename. I want a maximum of 8 forked processes at a time and I don't want to read the whole filename list into RAM at once.
I'm not even sure where to begin, can anyone help me out?
File.foreach("large_file").each_slice(8) do |eight_lines|
  # eight_lines is an array containing up to 8 lines
  # (the last slice may be shorter).
  # At this point you can iterate over these filenames
  # and spawn off your processes/threads.
end
It sounds like the Process module will be useful for this task. Here's something I quickly threw together as a starting point:
include Process
i = 0
for line in open('files.txt') do
  i += 1
  fork { `sleep #{rand} && echo "#{i} - #{line.chomp}" >> numbers.txt` }
  if i >= 8
    wait # join any single child process
    i -= 1
  end
end
waitall # join all remaining child processes
Output:
hello
goodbye
test1
test2
a
b
c
d
e
f
g
$ ruby b.rb
$ cat numbers.txt
1 - hello
3 -
2 - goodbye
5 - test2
6 - a
4 - test1
7 - b
8 - c
8 - d
8 - e
8 - f
8 - g
The way this works is that:
for line in open(XXX) will lazily iterate over the lines of the file you specify.
fork will spawn a child process executing the given block, and in this case, we use backticks to indicate something to be executed by the shell. Note that rand returns a value 0-1 here so we are sleeping less than a second, and I call line.chomp to remove the trailing newline that we get from line.
If we've accumulated 8 or more processes, call wait to stop everything until one of them returns.
Finally, outside the loop, call waitall to join all remaining processes before exiting the script.
Here's Mark's solution wrapped up as a ProcessPool class. It might be helpful to have it around (and please correct me if I made a mistake):
class ProcessPool
  def initialize(pool_size)
    @pool_size = pool_size
    @free_slots = @pool_size
  end
  def fork(&p)
    if @free_slots == 0
      Process.wait
      @free_slots += 1
    end
    @free_slots -= 1
    puts "Free slots: #{@free_slots}"
    Process.fork(&p)
  end
  def waitall
    Process.waitall
  end
end
pool = ProcessPool.new 8
for line in open('files.txt') do
  pool.fork { Kernel.sleep rand(10); puts line.chomp }
end
pool.waitall
puts 'finished'
The standard library documentation for Queue has this example:
require 'thread'
queue = Queue.new
producer = Thread.new do
  5.times do |i|
    sleep rand(i) # simulate expense
    queue << i
    puts "#{i} produced"
  end
end
consumer = Thread.new do
  5.times do |i|
    value = queue.pop
    sleep rand(i/2) # simulate expense
    puts "consumed #{value}"
  end
end
consumer.join
I do find it a little verbose though.
Wikipedia describes this as the thread pool pattern.
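A small thread pool along those lines might look like the sketch below. It is an illustration under assumptions: the helper name and the :done sentinel are hypothetical, and it uses threads rather than forked processes, so for the original question each worker would typically launch an external command per filename. A SizedQueue keeps the producer from reading far ahead of the workers, so passing a lazy source such as File.foreach("large_file") does not load the whole list into RAM.

```ruby
# Hypothetical helper: calls the block for every item using at most
# `pool_size` worker threads, pulling items lazily from the enumerable.
def each_in_pool(items, pool_size: 8, &block)
  jobs = SizedQueue.new(pool_size)     # bounded, so the producer cannot run ahead

  workers = Array.new(pool_size) do
    Thread.new do
      # :done is a sentinel value, assumed never to occur as a real item
      while (job = jobs.pop) != :done
        block.call(job)
      end
    end
  end

  items.each { |item| jobs << item }   # blocks while the queue is full
  pool_size.times { jobs << :done }    # one sentinel per worker
  workers.each(&:join)
end

handled = Queue.new
each_in_pool(%w[a.txt b.txt c.txt], pool_size: 2) { |name| handled << name.upcase }
puts handled.size                      # prints 3
```

The sketch does no error handling: if the block raises, that worker dies and the exception resurfaces when the pool is joined.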
arr = IO.readlines("filename")
