Creating n threads, joining them to the main thread, and monitoring their status in Ruby

I have a task that is currently a controller method. I want to create n threads, where n depends on the number of inputs from the user.
How do I create a dynamic number of threads, join them, and monitor them from the main thread in Ruby?
def validate_save
  upload_thread = []
  @counter = 1
  params.each do |param_name, param_value|
    if param_name.include? "file"
      Rails.logger.debug("making a thread")
      Rails.logger.debug(@counter)
      Rails.logger.debug(param_name)
      Rails.logger.debug(param_value)
      Rails.logger.debug(param_value.original_filename)
      upload_thread << Thread.new(@counter) do
        sleep 1
        Rails.logger.debug("inside Thread:")
        Rails.logger.debug(@counter)
        if S3.upload_file(param_value)
          Rails.logger.debug('we can save into S3')
          #flash[:successful]="Successfully Uploaded : "+params[:file].original_filename
        else
          #flash[:unsuccessful]="UnSuccessful Upload of : "+params[:file].original_filename
          Rails.logger.debug('can not save into S3')
        end
      end
      @counter = @counter + 1
    end
  end
  upload_thread.each do |up_th|
    up_th.join
  end
end
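A minimal sketch of one possible approach, written outside Rails so it can run on its own. It builds one thread per input, polls Thread#status from the main thread, and then collects each result with Thread#value. The `inputs` array and the `fake_upload` helper are hypothetical stand-ins for the request parameters and S3.upload_file.

```ruby
# Hypothetical stand-in for S3.upload_file: pretends to upload and reports success.
def fake_upload(name)
  sleep 0.05
  !name.empty?
end

inputs = ["a.txt", "b.txt", "c.txt"] # hypothetical user inputs

# One thread per input; the input is passed as a thread argument so each
# thread gets its own block parameter.
threads = inputs.map do |name|
  Thread.new(name) { |n| fake_upload(n) }
end

# Monitor from the main thread: Thread#status is "run" or "sleep" while the
# thread is alive, false after a normal finish, nil after an exception.
while threads.any?(&:alive?)
  puts threads.map { |t| t.status.inspect }.join(", ")
  sleep 0.02
end

# Thread#value joins the thread and returns the block's result.
results = inputs.zip(threads.map(&:value)).to_h
puts results.inspect
```

Because the number of threads follows the size of `inputs`, nothing else needs to change when the user submits more or fewer files.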

Related

Limit the number of threads in an iteration ruby

When I have my code like this, I get "can't create thread, resource temporarily unavailable". There are over 24k files in the directory to process.
frames.each do |image|
  Thread.new do
    pipeline = ImageProcessing::MiniMagick
      .source(File.open("original/#{image}"))
      .append("-fuzz", "30%")
      .append("-transparent", "#ff00fe")
    result = pipeline.call
    puts result.path
    file_parts = image.split("_")
    frame_number = file_parts[2]
    FileUtils.cp(result.path, "transparent/image_transparent_#{frame_number}")
    puts "Done with #{image}!"
    puts "#{Dir.children("transparent").count.to_s} / #{Dir.children("original").count.to_s}"
    puts "\n"
  end
end.each{ |thread| thread.join }
So, I tried the first 1001 files by calling the index 0-1000, and did it this way:
frames[0..1000].each_with_index do |image, index|
  thread = Thread.new do
    pipeline = ImageProcessing::MiniMagick
      .source(File.open("original/#{image}"))
      .append("-fuzz", "30%")
      .append("-transparent", "#ff00fe")
    result = pipeline.call
    puts result.path
    file_parts = image.split("_")
    frame_number = file_parts[2]
    FileUtils.cp(result.path, "transparent/image_transparent_#{frame_number}")
    puts "Done with #{image}!"
    puts "#{Dir.children("transparent").count.to_s} / #{Dir.children("original").count.to_s}"
    puts "\n"
  end
  thread.join
end
While this is processing, the speed seems to be about the same as on a single thread, judging by the output in the Terminal.
But I want the code to use as many threads as the OS will allow without hitting that error, so that it can get through all the files faster.
Or at least:
Find the maximum number of threads allowed.
Get the original directory's file count, divided by the number of threads allowed.
Run the each loop in batches of that size.
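Note that the second snippet joins each thread inside the loop, so only one thread runs at a time, which matches the single-threaded speed observed. A common alternative to batching is a fixed-size worker pool fed by a Queue. The sketch below is an illustration under assumptions: `pool_size`, the `process` helper, and the generated file names are hypothetical, and the pool size is a tuning choice rather than a limit read from the OS.

```ruby
# Bounded worker pool: a fixed number of threads pull jobs from a shared queue.
pool_size = 4 # assumption: tune to the workload; not an OS-derived limit

# Hypothetical stand-in for the per-image work in the question.
def process(item)
  item.upcase
end

items = (1..20).map { |i| "frame_#{i}.png" } # hypothetical file names

jobs = Queue.new
items.each { |item| jobs << item }
jobs.close # once closed and empty, pop returns nil instead of blocking

results = Queue.new
workers = Array.new(pool_size) do
  Thread.new do
    # Each worker keeps taking jobs until the queue runs dry.
    while (item = jobs.pop)
      results << process(item)
    end
  end
end
workers.each(&:join) # join only after every worker has been started

processed = Array.new(results.size) { results.pop }
puts "#{processed.size} / #{items.size}"
```

However many files there are, at most `pool_size` threads exist at once, which avoids exhausting the thread limit.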

Reset a counter in Ruby

I have the following code to compile jobs from the GitHub Jobs API. How do I reset a counter back to 0 every time I call it for a new city? I've tried putting the reset in several different places with no luck.
def ft_count_and_percentage
  @@url += @city
  uri = URI(@@url)
  response = Net::HTTP.get(uri)
  result = JSON.parse(response)
  result.each do |job|
    if job["type"] == "Full Time"
      @@fulltime_count += 1
    end
  end
  puts "Total number of jobs in #{@city}: #{result.length}"
  if @@fulltime_count > 0
    puts ("full time percent ") + "#{(@@fulltime_count/result.length) * 100}"
  else
    puts "No FT Positions"
  end
end
@@fulltime_count is defined outside this method and starts at 0. Currently, as expected, the counter just keeps accumulating jobs every time I add a new city.
boston = Job.new("Boston")
boston.ft_count_and_percentage
sf = Job.new("San Francisco")
sf.ft_count_and_percentage
la = Job.new("Los Angeles")
la.ft_count_and_percentage
denver = Job.new("Denver")
denver.ft_count_and_percentage
boulder = Job.new("Boulder")
boulder.ft_count_and_percentage
chicago = Job.new("Chicago")
chicago.ft_count_and_percentage
ny = Job.new("New York City")
ny.ft_count_and_percentage
You may need to reset it inside Job's initialize method:
class Job
  def initialize(city)
    @city = city
    @@fulltime_count = 0
  end
  def ft_count_and_percentage
    # the blah you already have
  end
end
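Another option, sketched here as an assumption rather than as the asker's actual class, is to keep the counter in an instance variable so that each Job starts from zero on its own. To stay runnable without network access, the method takes a hypothetical `listings` array shaped like the parsed JSON instead of calling the API.

```ruby
class Job
  attr_reader :city, :fulltime_count

  def initialize(city)
    @city = city
    @fulltime_count = 0 # per-instance state, so every new Job starts at zero
  end

  # `listings` is a hypothetical array of hashes shaped like the parsed JSON.
  def ft_count_and_percentage(listings)
    @fulltime_count = listings.count { |job| job["type"] == "Full Time" }
    return 0.0 if listings.empty?
    # Multiply by 100.0 first so integer division does not truncate to zero.
    (100.0 * @fulltime_count / listings.length).round(1)
  end
end

boston = Job.new("Boston")
boston_pct = boston.ft_count_and_percentage([{ "type" => "Full Time" }, { "type" => "Part Time" }])
denver = Job.new("Denver")
denver_pct = denver.ft_count_and_percentage([{ "type" => "Part Time" }])

puts "#{boston.city}: #{boston.fulltime_count} full time (#{boston_pct}%)"
puts "#{denver.city}: #{denver.fulltime_count} full time (#{denver_pct}%)"
```

With instance state there is nothing to reset, because no count is shared between cities.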

Two version of the same code not giving the same result

I am trying to implement a simple timeout class that handles timeouts of different requests.
Here is the first version:
class MyTimer
  def handleTimeout mHash, k
    while mHash[k] > 0 do
      mHash[k] -=1
      sleep 1
      puts "#{k} : #{mHash[k]}"
    end
  end
end
MAX = 3
timeout = Hash.new
timeout[1] = 41
timeout[2] = 5
timeout[3] = 14
t1 = MyTimer.new
t2 = MyTimer.new
t3 = MyTimer.new
first = Thread.new do
  t1.handleTimeout(timeout,1)
end
second = Thread.new do
  t2.handleTimeout(timeout,2)
end
third = Thread.new do
  t3.handleTimeout(timeout,3)
end
first.join
second.join
third.join
This seems to work fine. All the timeouts work independently of each other.
Screenshot attached
The second version of the code however produces different results:
class MyTimer
  def handleTimeout mHash, k
    while mHash[k] > 0 do
      mHash[k] -=1
      sleep 1
      puts "#{k} : #{mHash[k]}"
    end
  end
end
MAX = 3
timeout = Hash.new
timers = Array.new(MAX+1)
threads = Array.new(MAX+1)
for i in 0..MAX do
  timeout[i] = rand(40)
  # To see timeout value
  puts "#{i} : #{timeout[i]}"
end
sleep 1
for i in 0..MAX do
  timers[i] = MyTimer.new
  threads[i] = Thread.new do
    timers[i].handleTimeout( timeout, i)
  end
end
for i in 0..MAX do
  threads[i].join
end
Screenshot attached
Why is this happening?
How can I implement this functionality using arrays?
Is there a better way to implement the same functionality?
In the loop that creates the threads with Thread.new, the variable i is shared between the main thread (where the threads are created) and the threads themselves. A for loop does not create a new scope, so every thread block sees the same i, and the value each one passes to handleTimeout depends on when the thread happens to run. That is why you get inconsistent results.
You can validate this by adding a debug statement in your method:
#...
def handleTimeout mHash, k
  puts "Handle timeout called for #{mHash} and #{k}"
  #...
end
#...
To fix the issue, pass the values as arguments to Thread.new and access them through block parameters, including the index used to look up the timer:
for i in 0..MAX do
  timers[i] = MyTimer.new
  threads[i] = Thread.new(timeout, i) do |a, b|
    # Use b, not the shared i, so the lookup cannot race with the loop.
    timers[b].handleTimeout(a, b)
  end
end
More on this issue is described in When do you need to pass arguments to Thread.new? and this article.
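The scoping difference can be seen without any timers. The following sketch uses illustrative variable names that are not from the question. It compares closures created in a for loop, closures created in an each block, and threads that receive the loop value as an argument.

```ruby
# `for` does not open a new scope, so every lambda closes over the same `i`.
shared = []
for i in 0..2 do
  shared << -> { i }
end
shared_values = shared.map(&:call)

# A block parameter is a fresh variable on each iteration.
fresh = []
(0..2).each { |j| fresh << -> { j } }
fresh_values = fresh.map(&:call)

# Arguments given to Thread.new are bound to the block parameters when the
# thread is created, so later changes to `i` cannot affect them.
threads = []
for i in 0..2 do
  threads << Thread.new(i) { |n| sleep 0.01; n }
end
thread_values = threads.map(&:value)

puts shared_values.inspect # => [2, 2, 2]
puts fresh_values.inspect  # => [0, 1, 2]
puts thread_values.inspect # => [0, 1, 2]
```

Replacing the for loops in the question with (0..MAX).each would therefore also avoid the problem, since each iteration would get its own i.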

Ruby: Wait for all threads completed using join and ThreadsWait.all_waits - what the difference?

Consider the following example:
threads = []
(0..10).each do |_|
  threads << Thread.new do
    # do async stuff there
    sleep Random.rand(10)
  end
end
There are two ways to wait until they are done:
Using join:
threads.each(&:join)
Using ThreadsWait:
ThreadsWait.all_waits(threads)
Is there any difference between these two ways of doing this?
I know that the ThreadsWait class has other useful methods, but I am asking specifically about the all_waits method.
The documentation clearly states that all_waits will execute any passed block after each thread's execution; join doesn't offer anything like this.
require "thwait"
threads = [Thread.new { 1 }, Thread.new { 2 }]
ThreadsWait.all_waits(threads) do |t|
  puts "#{t} complete."
end # will return nil
# output:
# #<Thread:0x00000002773268> complete.
# #<Thread:0x00000002772ea8> complete.
To accomplish the same with join, I imagine you would have to do this:
threads.each do |t|
  t.join
  puts "#{t} complete."
end # will return threads
Apart from this, the all_waits method eventually calls the join_nowait method, which processes each thread by calling join on it.
Without any block, I would imagine that directly using join would be faster since you would cut back on all ThreadsWait methods leading up to it. So I gave it a shot:
require "thwait"
require "benchmark"
loops = 100_000
Benchmark.bm do |x|
  x.report do
    loops.times do
      threads = [Thread.new { 2 * 1000 }, Thread.new { 4 * 2000 }]
      threads.each(&:join)
    end
  end
  x.report do
    loops.times do
      threads = [Thread.new { 2 * 1000 }, Thread.new { 4 * 2000 }]
      ThreadsWait.all_waits(threads)
    end
  end
end
# results:
#        user     system      total        real
#    4.030000   5.750000   9.780000 (  5.929623)
#   12.810000  17.060000  29.870000 ( 17.807242)
Using map instead of each also waits for the threads, because it needs each thread's return value to build the resulting array:
(0..10).map do |_|
  Thread.new do
    # do async stuff there
    sleep Random.rand(10)
  end
end.map(&:join).map(&:value)
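As a sketch of how a per-thread completion callback could be approximated with core classes only, each thread below pushes its name onto a Queue when it finishes, and the main thread pops the names in completion order. The `delays` hash and its names are hypothetical workloads chosen for illustration; this is not how ThreadsWait is implemented.

```ruby
# Each thread reports on a queue when its work is done, so the main thread
# can react in completion order instead of creation order.
done = Queue.new
delays = { a: 0.15, b: 0.05, c: 0.10 } # hypothetical workloads, in seconds

threads = delays.map do |name, delay|
  Thread.new(name, delay) do |n, d|
    sleep d
    done << n
    n
  end
end

# pop blocks until some thread has finished.
finish_order = Array.new(threads.size) { done.pop }

# Thread#value joins the thread itself, so no separate join is needed.
values = threads.map(&:value)

puts "finished in order: #{finish_order.inspect}"
puts "values in creation order: #{values.inspect}"
```

The values always come back in creation order, while the finish order depends on how long each thread ran.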

Ruby Pause thread

In Ruby, is it possible to pause a thread from a different, concurrently running thread?
Below is the code that I've written so far. I want the user to be able to type 'pause thread' and have the sample500 thread pause.
#!/usr/bin/env ruby
# Creates a new thread that executes the block every intervalSec for durationSec.
def DoEvery(thread, intervalSec, durationSec)
  thread = Thread.new do
    start = Time.now
    timeTakenToComplete = 0
    loopCounter = 0
    while(timeTakenToComplete < durationSec && loopCounter += 1)
      yield
      finish = Time.now
      timeTakenToComplete = finish - start
      sleep(intervalSec*loopCounter - timeTakenToComplete)
    end
  end
end
# User input loop.
exit = nil
while(!exit)
  userInput = gets
  case userInput
  when "start thread\n"
    sample500 = Thread
    beginTime = Time.now
    DoEvery(sample500, 0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")}}
  when "pause thread\n"
    sample500.stop
  when "resume thread"
    sample500.run
  when "exit\n"
    exit = true
  end
end
Passing a Thread object as an argument to DoEvery makes no sense, because you immediately overwrite it with Thread.new. Check out this modified version:
def DoEvery(intervalSec, durationSec)
  thread = Thread.new do
    start = Time.now
    Thread.current["stop"] = false
    timeTakenToComplete = 0
    loopCounter = 0
    while(timeTakenToComplete < durationSec && loopCounter += 1)
      # Pause cooperatively: the thread puts itself to sleep when asked to.
      if Thread.current["stop"]
        Thread.current["stop"] = false
        puts "paused"
        Thread.stop
      end
      yield
      finish = Time.now
      timeTakenToComplete = finish - start
      # Clamp at zero: sleep raises on a negative interval, which can happen after a pause.
      sleep([intervalSec*loopCounter - timeTakenToComplete, 0].max)
    end
  end
  thread
end
# User input loop.
exit = nil
while(!exit)
  userInput = gets
  case userInput
  when "start thread\n"
    sample500 = DoEvery(0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")} }
  when "pause thread\n"
    sample500["stop"] = true
  when "resume thread\n"
    sample500.run
  when "exit\n"
    exit = true
  end
end
Here DoEvery returns the new thread object. Also note that Thread.stop is called inside the running thread itself; you can't directly stop one thread from another because it is not safe.
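The same cooperative idea can also be expressed with a Mutex and a ConditionVariable instead of Thread.stop and Thread#run. The sketch below is one possible design, not the answer's code: the `PausableWorker` class, its method names, and the tick counter are all hypothetical and stand in for the periodic file write.

```ruby
# Cooperative pause: the worker checks a flag between units of work and
# waits on a condition variable while the flag is set.
class PausableWorker
  attr_reader :ticks

  def initialize(interval)
    @interval = interval
    @ticks = 0
    @paused = false
    @stopped = false
    @lock = Mutex.new
    @resumed = ConditionVariable.new
    @thread = Thread.new { run }
  end

  def pause
    @lock.synchronize { @paused = true }
  end

  def resume
    @lock.synchronize do
      @paused = false
      @resumed.signal
    end
  end

  def stop
    @lock.synchronize do
      @stopped = true
      @paused = false
      @resumed.signal
    end
    @thread.join
  end

  private

  def run
    loop do
      keep_going = @lock.synchronize do
        # Re-check the flag in a loop to tolerate spurious wakeups.
        @resumed.wait(@lock) while @paused
        @ticks += 1 unless @stopped
        !@stopped
      end
      break unless keep_going
      sleep @interval
    end
  end
end

worker = PausableWorker.new(0.01)
sleep 0.1
worker.pause
paused_at = worker.ticks
sleep 0.1
still_paused = worker.ticks # unchanged while paused
worker.resume
sleep 0.1
worker.stop
final = worker.ticks
puts "paused at #{paused_at}, still #{still_paused}, final #{final}"
```

Because the flag is only changed while holding the lock, the worker can never record a tick after pause has returned.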
You may be better able to accomplish what you are attempting using Ruby's Fiber object, and likely achieve better efficiency on the running system.
Fibers are primitives for implementing light weight cooperative
concurrency in Ruby. Basically they are a means of creating code
blocks that can be paused and resumed, much like threads. The main
difference is that they are never preempted and that the scheduling
must be done by the programmer and not the VM.
Keep in mind that MRI's global VM lock prevents threads from executing Ruby code in parallel, so threads there give you concurrency rather than parallelism. With that in mind, the following is a nice example:
require "fiber"
f1 = Fiber.new { |f2| f2.resume Fiber.current; while true; puts "A"; f2.transfer; end }
f2 = Fiber.new { |f1| f1.transfer; while true; puts "B"; f1.transfer; end }
f1.resume f2 # =>
# A
# B
# A
# B
# .
# .
# .
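For a smaller illustration of the pause and resume behavior described in the quote, the sketch below uses only Fiber.new, Fiber.yield, and Fiber#resume. The `counter` fiber is a hypothetical example, not part of the original answer.

```ruby
# A fiber runs until Fiber.yield, then hands control (and a value) back to
# the caller; resume continues it from the same point.
counter = Fiber.new do
  n = 0
  loop do
    Fiber.yield n
    n += 1
  end
end

first_three = 3.times.map { counter.resume }
puts first_three.inspect # => [0, 1, 2]
```

Nothing runs inside the fiber between calls to resume, which is what makes the scheduling the programmer's responsibility.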

Resources