I am calling the same Ruby function from a number of threads (for example, 10 threads). Each thread passes a different argument to the function.
Example:
def test thread_no
puts "In thread no." + thread_no.to_s
end
num_threads = 6
threads=[]
for thread_no in 1..num_threads
puts "Creating thread no. "+thread_no.to_s
threads << Thread.new{test(thread_no)}
end
threads.each { |thr| thr.join }
Output:
Creating thread no. 1
Creating thread no. 2
Creating thread no. 3
Creating thread no. 4
In thread no.4
Creating thread no. 5
Creating thread no. 6
In thread no.6
In thread no.6
In thread no.6
In thread no.6
In thread no.6
Of course, the output I want is: In thread no. 1 (2, 3, 4, 5, 6). How can I make this work?
The problem is the for loop. In Ruby, it reuses a single variable for every iteration.
So the blocks of all the thread bodies access the same variable, and this variable is 6 at the end of the loop. A thread may start running only after the loop has ended.
You can solve this by using an each loop instead. With each, every iteration gets its own loop variable.
(1..num_threads).each do | thread_no |
puts "Creating thread no. "+thread_no.to_s
threads << Thread.new{test(thread_no)}
end
Unfortunately, for loops in Ruby are a source of surprises, so it is best to always use each loops.
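The difference between the two loop forms can be seen without any threads at all. Here is a minimal sketch that stores lambdas instead of starting threads, so the result does not depend on scheduling. The method names capture_with_for and capture_with_each are made up for this illustration.

```ruby
# Collect lambdas instead of threads so the result is deterministic.
def capture_with_for
  captured = []
  for n in 1..3
    captured << -> { n }   # every lambda closes over the same n
  end
  captured.map(&:call)     # called after the loop, when n is already 3
end

def capture_with_each
  captured = []
  (1..3).each do |n|
    captured << -> { n }   # each block invocation gets a fresh n
  end
  captured.map(&:call)
end

p capture_with_for    # prints [3, 3, 3]
p capture_with_each   # prints [1, 2, 3]
```

A thread body is a closure just like these lambdas, which is why threads created in a for loop can all end up seeing the last value.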
Addition:
You can also give Thread.new one or more arguments, and these arguments are passed into the thread body block. This way you can make sure that the block uses no variables from outside its own scope, so it also works with for loops.
threads << Thread.new(thread_no){|n| test(n) }
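For completeness, here is a small sketch of the whole loop using that form. It keeps the for loop but hands thread_no to Thread.new as an argument, and it collects each thread's return value with Thread#value (which also joins the thread) rather than printing from inside the threads. The names thread_label and run_with_for are hypothetical helpers, not part of the original code.

```ruby
# Hypothetical helper: returns the message instead of printing it.
def thread_label(thread_no)
  "In thread no. #{thread_no}"
end

def run_with_for(num_threads)
  threads = []
  for thread_no in 1..num_threads
    # thread_no is copied into the block parameter n when the thread is created
    threads << Thread.new(thread_no) { |n| thread_label(n) }
  end
  threads.map(&:value)   # value waits for the thread and returns the block's result
end

puts run_with_for(6)
```

Because the results are read in the order the threads were created, the output is in order from 1 to 6 even if the threads themselves finish in a different order.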
@Meier already mentioned why the for loop produces a different result than expected.
The for loop is a language syntax construct that reuses the same local variable thread_no. It yields 6 because your for loop ends before the last few threads start executing.
To get around this, you can keep a copy of the current thread_no in another scope, for example:
def test thread_no
puts "In thread no." + thread_no.to_s
end
num_threads = 6
threads = []
for thread_no in 1..num_threads
threads << ->(thread_no) { Thread.new { test(thread_no) } }.call(thread_no)
end
threads.each { |thr| thr.join }
Related
My records in a database are already categorized into buckets (0, 1, 2, 3). Rather than applying a function to each record serially, I'd like to open four threads and apply the function to the record in that thread's bucket.
If I run this:
i = 4
i.times do |n|
Thread.new {
puts "opening thread for #{n} degree"
myFunction(n)
}.join
end
I get:
opening thread for 0 degree
opening thread for 1 degree
opening thread for 2 degree
opening thread for 3 degree
with waiting in between each one. It's still going serially.
If I do the same as above, but without join:
i = 4
i.times do |n|
Thread.new {
puts "opening thread for #{n} degree"
myFunction(n)
}
end
I get:
opening thread for 3 degree
opening thread for 2 degreeopening thread for 0 degree
opening thread for 4 degree
which is closer to what I want; it seems they all run simultaneously.
It makes me nervous when my puts statements are printed haphazardly like this. If I don't have the join there, doesn't that mean that whichever thread terminates first, the rest of the script moves on and the other threads terminate early? What should I do?
What SHOULD I be doing here?
You should be joining your threads. Otherwise when the main thread (your script) exits, it takes all unfinished threads with it. The reason why execution is serial in your first case is that you wait for a thread to finish right after you start it (and before you start the next one). First create all threads, then wait on them.
i = 4
threads = i.times.map do |n|
Thread.new {
puts "opening thread for #{n} degree"
myFunction(n)
}
end
threads.each(&:join)
# or, with ThreadsWait (no longer bundled since Ruby 2.7; install the thwait gem first)
require 'thwait'
ThreadsWait.all_waits(*threads)
You will see further improvements in threading performance if you run the code on JRuby or Rubinius, because their threads are not constrained by a global interpreter lock.
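To see the difference between joining inside the loop and joining after it, here is a rough timing sketch. The method process_bucket is a hypothetical stand-in for myFunction, and the sleep only simulates I/O-bound work (sleeping threads do not hold the interpreter lock), so the exact numbers will vary by machine.

```ruby
# Hypothetical stand-in for myFunction: sleep simulates I/O-bound work.
def process_bucket(n)
  sleep 0.1
  n
end

# Runs the block and returns the elapsed wall-clock time in seconds.
def elapsed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# join inside the loop: each thread finishes before the next one starts
def run_serially(count)
  elapsed { count.times { |n| Thread.new { process_bucket(n) }.join } }
end

# join after the loop: all threads are started first, then waited on
def run_concurrently(count)
  elapsed do
    threads = count.times.map { |n| Thread.new { process_bucket(n) } }
    threads.each(&:join)
  end
end

puts format("join inside the loop: %.2fs", run_serially(4))
puts format("join after the loop:  %.2fs", run_concurrently(4))
```

With four buckets, the first variant should take roughly four times as long as the second, since it waits for each thread before creating the next.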
Why is the result all 10s instead of the numbers 0 to 10?
require 'thread'
def run(i)
puts i
end
while true
for i in 0..10
Thread.new{ run(i)}
end
sleep(100)
end
Result:
10
10
10
10
10
10
10
10
10
10
10
Why the loop? I am running a while loop because later I want to iterate through the DB table continuously and echo any records retrieved from the DB.
The block passed to Thread.new may not actually begin running until some point in the future, and by that time the value of i may have changed. In your case, i has already been incremented to 10 before any of the threads actually run.
To fix this, use the form of Thread.new that accepts a parameter, in addition to the block:
require 'thread'
def run(i)
puts i
end
while true
for i in 0..10
Thread.new(i) { |j| run(j) }
end
sleep(100)
end
This sets the block variable j to the value of i at the time new was called.
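The snapshot behaviour can be made visible with a small sketch that deliberately holds the thread back until the outer variable has changed. The Queue used as a gate and the method name snapshot_demo are my own additions for illustration.

```ruby
def snapshot_demo
  i = 0
  gate = Queue.new               # holds the thread back until we say go
  thread = Thread.new(i) do |j|
    gate.pop                     # wait until the main thread has changed i
    [j, i]                       # j is the snapshot, i is the shared variable
  end
  i = 10                         # change i while the thread is still waiting
  gate << :go
  thread.value                   # waits for the thread and returns its result
end

p snapshot_demo   # prints [0, 10]
```

The block parameter j keeps the value i had when Thread.new was called, while the closed-over i reflects the later assignment.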
@DavidGrayson is right.
This is a side effect of the for loop. In your case the scope of the variable i is the whole file, while you are expecting it to be limited to the body of the for loop. A for loop is not the idiomatic approach in Ruby; Ruby gives you iterators for this job.
(0..10).each do |i|
Thread.new{ run(i)}
end
In this case the variable i is confined to the block's scope, which means each iteration gets a new variable i that is local to the block.
The problem is that you have created 11 threads that are all trying to access the same variable i which was defined by the main thread of your program. One trick to avoid that is to call Thread.new inside a method; then the variable i that the thread has access to is just the particular i that was passed to the method, and it is not shared with other threads. This takes advantage of a closure.
require 'thread'
def run(i)
puts i
end
def start_thread(i)
Thread.new { run i }
end
for i in 0..10
start_thread i
sleep 0.1
end
Result:
0
1
2
3
4
5
6
7
8
9
10
(I added the sleep just to guarantee that the threads run in numerical order so we can have tidy output, but you could take it out and still have a valid program where each thread gets the correct argument.)
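If tidy output is the goal, one alternative to sleeping is to have each thread return its result and read the results in creation order. This is only a sketch of that idea; start_worker and collect_results are hypothetical names, and the doubling is placeholder work.

```ruby
# The thread only sees the i that was passed to this method call.
def start_worker(i)
  Thread.new { i * 2 }   # placeholder work: return a value instead of printing
end

def collect_results
  workers = []
  for i in 0..10
    workers << start_worker(i)
  end
  workers.map(&:value)   # value joins each thread, in creation order
end

p collect_results   # prints [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```

The threads may still run in any order; only the order in which the results are read is fixed.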
I have a small piece of code just for testing:
#!/usr/bin/env ruby
def A
puts "A"
sleep 2
end
def B
puts "B"
sleep 2
end
[
Thread.new(loop{A()}),
Thread.new(loop{B()})
].each do |thr|
thr.join
end
and it doesn't work as I expected.
I hoped to get
A
B
A
B
and so on, but I got just
A
A
A
A
This means that only the first thread was started.
Does it mean that Ruby waits for the first thread to finish before starting the second one?
How can I run these as real threads? I'd like to have threads in my app that do their work in parallel while the main application thread does its own job.
What would you advise?
Instead of running the loops in the threads, the code runs the first loop in the main thread, and because that loop is infinite, the threads never start.
Replace following lines (parentheses):
[
Thread.new(loop{A()}),
Thread.new(loop{B()})
]
with (braces):
[
Thread.new{loop{A()}},
Thread.new{loop{B()}}
]
to pass the block instead of the return value of the (infinite) loop.
Your call to the Thread constructor does not do what you expect. You are passing the result of the loop block to the Thread constructor, so the loop has to end before the thread could start. But since your loop never ends, you only see the output of the A() method, which is being executed in the current thread.
Try calling it this way:
[
Thread.new{loop{A()}},
Thread.new{loop{B()}}
]
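To check that both threads really run when the work is passed as a block, here is a bounded sketch. It replaces the infinite loops with a fixed number of rounds and records into a Queue instead of printing, so it terminates and can be inspected. The method name run_workers is made up for this example.

```ruby
def run_workers(rounds)
  log = Queue.new   # thread-safe, so both workers can append to it
  workers = [
    Thread.new { rounds.times { log << "A"; sleep 0.01 } },
    Thread.new { rounds.times { log << "B"; sleep 0.01 } }
  ]
  workers.each(&:join)
  Array.new(log.size) { log.pop }   # drain the queue into a plain array
end

p run_workers(3)   # three "A" and three "B" entries; the interleaving may vary
```

With braces, the code inside is not executed at the point of the call; it is handed to the new thread to run. With parentheses, Ruby has to evaluate the argument in the calling thread before Thread.new is even invoked.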
I am not fluent in ruby and am having trouble with the following code example. I want to pass the array index to the thread function. When I run this code, all threads print "4". They should instead print "0 1 2 3 4" (in any order).
It seems that the num variable is being shared between all iterations of the loop and passes a reference to the "test" function. The loop finishes before the threads start and num is left equal to 4.
What is going on and how do I get the correct behavior?
NUM_THREADS = 5
def test(num)
puts num.to_s()
end
threads = Array.new(NUM_THREADS)
for i in 0..(NUM_THREADS - 1)
num = i
threads[i] = Thread.new{test(num)}
end
for i in 0..(NUM_THREADS - 1)
threads[i].join
end
Your script does what I would expect on Unix but not on Windows, most likely because thread startup races with the for loop over the value of num. I think the reason is that the for loop does not create a new scope, so after the loop finishes i (and likewise num) is equal to 4:
for i in 0..4
end
puts i
# => 4
To fix it (and write more idiomatic Ruby), you could write something like this:
NUM_THREADS = 5
def test(num)
puts num # to_s is unnecessary
end
# Create an array of threads, each running test on its own index
threads = NUM_THREADS.times.map { |i| Thread.new { test i } }
# Call the join method on each thread
threads.each(&:join)
where i would be local to the map block.
"What is going on?" => The scope of num is the main environment, so it is shared by all threads (the only thing surrounding it is the for keyword, which does not create a scope). The puts in every thread executed after the for loop had already advanced i, and therefore num, to 4. A variable passed to a thread as an argument (such as num below) becomes a block argument, and will not be shared outside of the thread.
NUM_THREADS = 5
threads = Array.new(NUM_THREADS){|i| Thread.new(i){|num| puts num}}.each(&:join)
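That num is shared even though it is assigned inside the loop body can be checked without threads. The following sketch stores lambdas and calls them after the loop, comparing a for loop with a block-based loop. The method names are hypothetical.

```ruby
def capture_inside_for
  captured = []
  for i in 0..4
    num = i                  # num belongs to the enclosing scope, not the loop body
    captured << -> { num }
  end
  captured.map(&:call)
end

def capture_inside_times
  captured = []
  5.times do |i|
    num = i                  # num is local to this block invocation
    captured << -> { num }
  end
  captured.map(&:call)
end

p capture_inside_for     # prints [4, 4, 4, 4, 4]
p capture_inside_times   # prints [0, 1, 2, 3, 4]
```

Copying i into num does not help inside a for loop, because num itself is a single variable that every iteration overwrites.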