Simulating parallel calls to a method in Ruby

I have a method that is frequently called by different users, so I want to simulate this in order to observe its behavior.
I've read about fork and threads, and I actually think fork is better suited to this purpose, but I couldn't get anywhere with fork, so I switched to threads and got this:
module MethodBenchmark
  extend self

  def execute_with_threads
    arr = []
    3.times do |i|
      arr[i] = Thread.new {
        puts "Thread number #{i}"
        call_actual_method(3)
      }
    end
    arr.each { |t| t.join }
  end

  def call_actual_method(number_of_requests)
    number_of_requests.times do |i|
      puts "Executing request #{i}"
    end
  end
end
The result I got is:
Thread number 2
Executing request 0
Executing request 1
Executing request 2
Thread number 0
Executing request 0
Executing request 1
Executing request 2
Thread number 1
Executing request 0
Executing request 1
Executing request 2
What I want is for each thread to represent a different user, and each request to represent a new request from a random user. In other words, I would like output something like this:
Thread number 2
Executing request 0
Executing request 1
Thread number 0
Executing request 2
Executing request 0
Executing request 1
Thread number 1
Executing request 2
Executing request 0
Executing request 1
Executing request 2
The idea being that once all the threads have spawned I'll get a lot of concurrent requests from different threads and not that sequential output. How can I achieve this behavior?
P.S.
I hoped this was due to the small number of threads and requests, but I got the same sequential result with 25 threads and 100 requests per thread.
P.P.S.
Inside the body of call_actual_method I plan to call a method that makes a request to the database:
def call_actual_method(number_of_requests)
  number_of_requests.times do |i|
    method_to_call_database()
  end
end
Currently method_to_call_database() has two possible implementations in terms of how the SQL is structured and I want to measure the execution time of both implementations under a given load. The idea is to choose the faster method.
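A minimal sketch of how such a comparison could be set up, assuming the real work spends most of its time waiting on the database. The names implementation_a, implementation_b and measure_under_load are hypothetical, and the sleep calls only stand in for query time. Threads whose bodies block (on I/O or sleep) tend to interleave, whereas a short loop of puts calls will often finish before the next thread gets a turn, which is probably why the output above looks sequential.

```ruby
require 'benchmark'

# Hypothetical stand-ins for the two SQL variants; each sleeps to mimic
# time spent waiting on the database.
def implementation_a
  sleep 0.01
end

def implementation_b
  sleep 0.02
end

# Starts `users` threads, each issuing `requests` calls to the given block,
# and returns the wall-clock seconds until every thread has finished.
def measure_under_load(users:, requests:, &work)
  Benchmark.realtime do
    threads = Array.new(users) do
      Thread.new { requests.times { work.call } }
    end
    threads.each(&:join)
  end
end

time_a = measure_under_load(users: 5, requests: 3) { implementation_a }
time_b = measure_under_load(users: 5, requests: 3) { implementation_b }
puts format('A: %.3fs  B: %.3fs', time_a, time_b)
```

With real queries, each thread would also need its own database connection (or a connection pool); that part depends on the database library and is not shown here.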

Related

Ruby Timeout.timeout does not timeout in x secs

The code below
Timeout.timeout(2) do
  i = 0
  while(true)
    i = i + 1
    p "test #{i}"
  end
end
does not time out after 2 seconds, whereas the similar code below does time out after 2 seconds:
Timeout.timeout(2) do
  i = 0
  while(true)
    i = i + 1
    # p "test #{i}"
  end
end
What is the underlying difference? Please help.
I don't know exactly what's going on here, and I suspect someone who understands the underlying C code could give a complete answer, but I have an inkling. Matz's Ruby Interpreter (MRI) has a global interpreter lock, which means only one thread can actually run at any given time. When one thread is waiting on a resource it sleeps, and this gives another thread the opportunity to run.
Timeout creates a second thread that sleeps for 2 seconds and then raises an exception on the current thread, enforcing the timeout. We are guaranteed that this thread will not run before 2 seconds have passed, but not exactly when it will run after that; usually it is within a few milliseconds, with some exceptions.
The function p is unusual in that it writes directly to stdout. This is where a C programmer may be helpful, but it appears to me that it is starving the other thread of resources, possibly because the second thread needs to own stdout in order to throw an exception.
p and pp both cause this problem, whereas puts does not.
In support of the resource starvation theory, the following code works:
Timeout.timeout(2) do
  i = 0
  while(true)
    i = i + 1
    p "testing timeout #{i}"
    sleep 0.001
  end
end
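To make the watcher-thread idea described above concrete, here is a simplified sketch. It is not the real Timeout implementation (which differs between Ruby versions); sketch_timeout and SketchTimeoutError are hypothetical names used only for illustration.

```ruby
# Raised by the sketch below; a stand-in for Timeout::Error.
class SketchTimeoutError < StandardError; end

# Simplified illustration of the watcher-thread idea, not the library code.
def sketch_timeout(seconds)
  target = Thread.current
  watcher = Thread.new do
    sleep seconds
    # Interrupt the thread that called sketch_timeout.
    target.raise(SketchTimeoutError, 'execution expired')
  end
  yield
ensure
  # Stop the watcher so it cannot fire after the block has finished.
  watcher&.kill
end

begin
  sketch_timeout(0.1) { sleep 1 }
rescue SketchTimeoutError => e
  puts "timed out: #{e.message}"
end

puts sketch_timeout(1) { :finished_in_time }
```

The exception can only be delivered once the watcher thread actually gets to run, which fits the observation that adding a short sleep inside the loop makes the timeout fire.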

Ruby multi threading #join and simultaneous execution

My records in a database are already categorized into buckets (0, 1, 2, 3). Rather than applying a function to each record serially, I'd like to open four threads and apply the function to the record in that thread's bucket.
If I run this:
i = 4
i.times do |n|
  Thread.new {
    puts "opening thread for #{n} degree"
    myFunction(n)
  }.join
end
I get:
opening thread for 0 degree
opening thread for 1 degree
opening thread for 2 degree
opening thread for 3 degree
with waiting in between each one. It's still going serially.
If I do the same as above, but without join:
i = 4
i.times do |n|
  Thread.new {
    puts "opening thread for #{n} degree"
    myFunction(n)
  }
end
I get:
opening thread for 3 degree
opening thread for 2 degreeopening thread for 0 degree
opening thread for 4 degree
which is closer to what I want; it seems they all run simultaneously.
It makes me nervous when my puts statements are printed haphazardly like this. If I don't have the join there, doesn't that mean the rest of the script moves on as soon as the threads are started, and the threads may be terminated early?
What should I be doing here?
You should be joining your threads. Otherwise when the main thread (your script) exits, it takes all unfinished threads with it. The reason why execution is serial in your first case is that you wait for a thread to finish right after you start it (and before you start the next one). First create all threads, then wait on them.
i = 4
threads = i.times.map do |n|
  Thread.new {
    puts "opening thread for #{n} degree"
    myFunction(n)
  }
end
threads.each(&:join)
# or, using the thwait library (a separate gem since Ruby 2.7)
require 'thwait'
ThreadsWait.all_waits(*threads)
You will see further improvements in threading performance if you run the code on JRuby or Rubinius, as their threads are not crippled in any way by some global lock.
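A small sketch of the "create all threads, then wait" pattern with timing added, so the overlap is visible. my_function is a hypothetical stand-in for myFunction and just sleeps to mimic I/O-bound work; even on MRI, sleeping or blocked threads overlap.

```ruby
# Hypothetical stand-in for myFunction: sleeps to mimic I/O-bound work.
def my_function(n)
  sleep 0.2
  n * n
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Start every thread first...
threads = 4.times.map { |n| Thread.new { my_function(n) } }
# ...then wait for all of them. Thread#value joins the thread and
# returns the result of its block.
results = threads.map(&:value)

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts "results: #{results.inspect}"
puts format('elapsed: %.2fs', elapsed)
```

If the four calls ran one after another, the elapsed time would be about 0.8 seconds; with all threads started before any join, it should be close to a single 0.2 second sleep.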

Why are not ruby threads working as expected? [duplicate]

Why is the result all 10s rather than 0 to 10?
require 'thread'
def run(i)
  puts i
end
while true
  for i in 0..10
    Thread.new{ run(i)}
  end
  sleep(100)
end
Result:
10
10
10
10
10
10
10
10
10
10
10
Why loop? I am running while loop, because later I want to iterate through the DB table all the time and echo any records that are retrieved from the DB.
The block passed to Thread.new may not begin executing until some point in the future, and by that time the value of i may have changed. In your case, i had already reached 10 before any of the threads actually ran.
To fix this, use the form of Thread.new that accepts a parameter, in addition to the block:
require 'thread'
def run(i)
  puts i
end
while true
  for i in 0..10
    Thread.new(i) { |j| run(j) }
  end
  sleep(100)
end
This sets the block variable j to the value of i at the time new was called.
@DavidGrayson is right.
What you are seeing is a side effect of the for loop. The variable i is scoped to your whole file, whereas you expected it to be scoped to the body of the for loop. A for loop is not the idiomatic approach in Ruby; Ruby gives you iterators for this job.
(0..10).each do |i|
  Thread.new{ run(i)}
end
Here i is a block parameter, which means each iteration gets a new variable i that is local to the block.
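The scoping difference can be seen without any threads. In this small sketch (variable names are illustrative), lambdas created inside a for loop all see the same variable, while lambdas created inside an each block each see their own:

```ruby
# `for` does not open a new scope: every lambda closes over the same i.
for_procs = []
for i in 0..2
  for_procs << -> { i }
end

# `each` passes j as a block parameter, so each iteration gets a fresh j.
each_procs = []
(0..2).each do |j|
  each_procs << -> { j }
end

p for_procs.map(&:call)   # => [2, 2, 2]
p each_procs.map(&:call)  # => [0, 1, 2]
```

A thread block behaves like these lambdas: it reads the variable when it runs, not when it was created.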
The problem is that you have created 11 threads that are all trying to access the same variable i which was defined by the main thread of your program. One trick to avoid that is to call Thread.new inside a method; then the variable i that the thread has access to is just the particular i that was passed to the method, and it is not shared with other threads. This takes advantage of a closure.
require 'thread'
def run(i)
  puts i
end
def start_thread(i)
  Thread.new { run i }
end
for i in 0..10
  start_thread i
  sleep 0.1
end
Result:
0
1
2
3
4
5
6
7
8
9
10
(I added the sleep just to guarantee that the threads run in numerical order so we can have tidy output, but you could take it out and still have a valid program where each thread gets the correct argument.)

Ruby threads calling the same function with different arguments

I am calling the same Ruby function with a number of threads (for example 10 threads). Each thread passes different argument to function.
Example:
def test thread_no
  puts "In thread no." + thread_no.to_s
end
num_threads = 6
threads=[]
for thread_no in 1..num_threads
  puts "Creating thread no. "+thread_no.to_s
  threads << Thread.new{test(thread_no)}
end
threads.each { |thr| thr.join }
Output:
Creating thread no. 1
Creating thread no. 2
Creating thread no. 3
Creating thread no. 4
In thread no.4
Creating thread no. 5
Creating thread no. 6
In thread no.6
In thread no.6
In thread no.6
In thread no.6
In thread no.6
Of course, the output I want is "In thread no. 1" (then 2, 3, 4, 5, 6). Is there a way to make this work?
The problem is the for loop. In Ruby, it reuses a single variable,
so the blocks of all the thread bodies access the same variable, and this variable is 6 at the end of the loop. A thread may start only after the loop has ended.
You can solve this by using each loops. They are more cleanly implemented: each iteration gets its own loop variable.
(1..num_threads).each do |thread_no|
  puts "Creating thread no. "+thread_no.to_s
  threads << Thread.new{test(thread_no)}
end
Unfortunately, for loops in Ruby are a source of surprises, so it is best to always use each loops.
Addition:
You can also give Thread.new one or more arguments, and these are passed into the thread body block. This way you can make sure that the block uses no variables outside its own scope, so it also works with for loops.
threads << Thread.new(thread_no) { |n| test(n) }
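A short, self-contained sketch of the argument-passing form inside a for loop. Each thread returns a string instead of printing it, so the results can be collected in creation order with Thread#value; that choice is only for the sake of a tidy example.

```ruby
threads = []
for thread_no in 1..6
  # thread_no is handed to the block as n when Thread.new is called,
  # so later changes to thread_no do not affect this thread.
  threads << Thread.new(thread_no) { |n| "In thread no. #{n}" }
end

# Thread#value waits for the thread and returns its block's result.
messages = threads.map(&:value)
puts messages
```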
@Meier already mentioned why the for loop produces a different result than expected.
for is a language syntax construct that reuses the same local variable thread_no, and thread_no yields 6 because your for loop ends before the last few threads start executing.
To avoid this, you can keep a copy of the current thread_no in another scope, for example:
def test thread_no
  puts "In thread no." + thread_no.to_s
end
num_threads = 6
threads = []
for thread_no in 1..num_threads
  # the lambda parameter n holds its own copy of thread_no
  threads << ->(n) { Thread.new { test(n) } }.call(thread_no)
end
threads.each { |thr| thr.join }

Threading and Looping in Ruby

I am looking for a way for my threads to iterate through an array of email addresses without stepping on each other's toes and changing the variables (I can't use a mutex). I found some information on using "thread local variables" but can't seem to get that to work. Below is an example of my problem (this is just a small chunk of the code).
(1..threads).map { |thread_count|
  Thread.new do
    (1..messages).each do |message_count|
      email = recipients_array[recipient_count].join(", ")
      # recipient_count is shared by every thread
      if recipient_count != recipients_array.length - 1
        recipient_count += 1
      else
        recipient_count = 0
      end
      # ... (rest of the code omitted)
    end
  end
}
I've been stuck on this for a while. I'm writing a script that uses multithreading in JRuby to send emails. I tell the script how many threads to use and how many messages each thread should send. I pass in a text file of recipient addresses, which I load into an array. I then want to iterate through the array so that:
Thread 1, Message 1 will go to email 1
Thread 2, Message 1 will go to email 1
Thread 1, Message 2 will go to email 2
Thread 2, Message 2 will go to email 2
and so on. It starts off fine, but if I set it up for 5 threads x 5 messages:
Threads 1 through 5, Message 1 will go to email 1
Thread 1, Message 2 will go to email 6
because they are all accessing recipient_count variable and incrementing it +1.
Looking for some advice on how to set this up better.
I usually use Ruby multithreading this way:
require 'thread'
count = 4
result = []
mutex = Mutex.new
queue = Queue.new
# fill the queue
(0..100).each do |i|
  queue << i
end
# start the workers, then wait for all of them
count.times.map do
  Thread.new do
    begin
      loop do
        # non-blocking pop raises ThreadError once the queue is empty
        item = queue.pop(true)
        item = do_something_with_it(item)
        mutex.synchronize do
          result << item
        end
      end
    rescue ThreadError
      Thread.exit
    end
  end
end.each(&:join)
# process results
You said this script is called with threads, messages and recipients_array as arguments. I'm not sure what form the individual entries in recipients_array take or what the Array#join is for, nor why you reset recipient_count to 0 when it has reached the last index of recipients_array. I assume there is some missing code. But how about this:
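emails_handled = []
(0..threads - 1).map do |i|
  Thread.new do
    # same step and endpoint for every thread, different starting point
    (i..messages * threads - 1).step(threads) do |n|
      email = recipients_array[n].join(", ")
      emails_handled[n] = 1
      # ... do stuff with your email
    end
  end
end.each(&:join) # wait for all threads to finish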
Each thread has the same step and same endpoint, but a different starting point, so they don't clash. It's not optimal, but I'm pretty new to threads myself.
When you want to get recipient_count, you can call emails_handled.compact.reduce(:+). I wasn't sure if you needed recipient_count for anything other than the recipients_array[] lookup - if not, you can dump emails_handled entirely.
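If the goal is the mapping described in the question (message 1 of every thread goes to email 1, message 2 to email 2, and so on), another option is to avoid the shared counter altogether and derive the index from each thread's own loop variable. This is only a sketch under assumptions: the recipient list is made up, entries are plain strings rather than arrays, and sending is replaced by recording the triple in a Queue, which is safe to write to from several threads.

```ruby
recipients = ['a@example.com', 'b@example.com', 'c@example.com'] # hypothetical data
thread_total = 2
messages_per_thread = 5
sent = Queue.new # thread-safe log of [thread, message, recipient]

workers = (1..thread_total).map do |thread_no|
  Thread.new do
    (1..messages_per_thread).each do |message_no|
      # The index comes from this thread's own loop counter and wraps
      # around at the end of the list; nothing is shared or incremented.
      recipient = recipients[(message_no - 1) % recipients.length]
      sent << [thread_no, message_no, recipient]
    end
  end
end
workers.each(&:join)

# Drain the queue on the main thread once all workers are done.
log = []
log << sent.pop until sent.empty?
log.sort.each { |t, m, r| puts "Thread #{t}, Message #{m} -> #{r}" }
```

Because message_no is a block parameter, every thread has its own copy, so no mutex is needed for the index itself.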

Resources