I've started looking into multi-threading in Ruby.
I want to create a few threads and have them all run, but not display any of a thread's output until that thread has completed successfully.
Example:
#!/usr/bin/env ruby
t1 = Thread.new {
puts "Hello_1"
sleep(5)
puts "Hello_1 after 5 seconds of sleep"
}
t2 = Thread.new {
puts "Hello_2"
sleep(5)
puts "Hello_2 after 5 seconds of sleep"
}
t1.join
t2.join
puts "Hello_3"
sleep(5)
puts "Hello_3 after 5 seconds of sleep"
The first Hello_1 / Hello_2 lines print immediately. I don't want any of the output to show until the thread has completed successfully.
Because puts prints to a single output stream (stdout), you can't use it directly if you want to capture the output of each thread.
You will have to use a separate buffered stream for each thread, write to that stream inside the thread, and then dump it to stdout when the thread terminates to see the output.
Here is an example of a thread:
require 'stringio'
t = Thread.new do
  io = StringIO.new   # private buffer for this thread
  io << "mary"
  io.puts "fred"
  io.puts "fred"
  puts io.string      # dump the whole buffer at the end
end
t.join
You will have to pass io to every method in the thread.
Alternatively, have a look at this for creating a module that redirects stdout for a thread.
In each thread that you start, wrap your code with:
Thread.start do
# capture the STDOUT by storing a StringIO in the thread space
Thread.current[:stdout] = StringIO.new
# Do your stuff.. print using puts
puts 'redirected to StringIO'
# print everything before we exit
STDOUT.puts Thread.current[:stdout].string
end.join
You can share a buffer but you should 'synchronize' access to it:
buffer = ""
lock = Mutex.new
t1 = Thread.new {
lock.synchronize{buffer << "Hello_1\n"}
sleep(5)
lock.synchronize{buffer << "Hello_1 after 5 seconds of sleep\n"}
}
t2 = Thread.new {
lock.synchronize{buffer << "Hello_2\n"}
sleep(5)
lock.synchronize{buffer << "Hello_2 after 5 seconds of sleep\n"}
}
t1.join
t2.join
puts buffer
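To print each thread's output as soon as that thread finishes, rather than after all of them, the two ideas above can be combined: a private StringIO per thread, plus a lock around the final write. This is only a minimal sketch; the helper name buffered_thread, the PRINT_LOCK constant and the shortened sleep are assumptions for illustration, not part of the question.

```ruby
require 'stringio'

PRINT_LOCK = Mutex.new # serializes the final dump so two buffers never interleave

# Hypothetical helper: the block writes to a private StringIO, and the whole
# buffer is written to `out` in one locked step after the block returns.
def buffered_thread(out: $stdout, &block)
  Thread.new do
    io = StringIO.new
    block.call(io)
    PRINT_LOCK.synchronize { out.write(io.string) }
  end
end

threads = [1, 2].map do |n|
  buffered_thread do |io|
    io.puts "Hello_#{n}"
    sleep(0.1) # shortened from the question's 5 seconds
    io.puts "Hello_#{n} after a short sleep"
  end
end
threads.each(&:join)
```

If the block raises, the thread dies before the write, so nothing from that thread is shown. That matches the "only after successful completion" requirement.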
When I have my code like this, I get "can't create thread, resource temporarily unavailable". There are over 24k files in the directory to process.
frames.each do |image|
Thread.new do
pipeline = ImageProcessing::MiniMagick.
source(File.open("original/#{image}"))
.append("-fuzz", "30%")
.append("-transparent", "#ff00fe")
result = pipeline.call
puts result.path
file_parts = image.split("_")
frame_number = file_parts[2]
FileUtils.cp(result.path, "transparent/image_transparent_#{frame_number}")
puts "Done with #{image}!"
puts "#{Dir.children("transparent").count.to_s} / #{Dir.children("original").count.to_s}"
puts "\n"
end
end.each{ |thread| thread.join }
So, I tried the first 1001 files by calling the index 0-1000, and did it this way:
frames[0..1000].each_with_index do |image, index|
thread = Thread.new do
pipeline = ImageProcessing::MiniMagick.
source(File.open("original/#{image}"))
.append("-fuzz", "30%")
.append("-transparent", "#ff00fe")
result = pipeline.call
puts result.path
file_parts = image.split("_")
frame_number = file_parts[2]
FileUtils.cp(result.path, "transparent/image_transparent_#{frame_number}")
puts "Done with #{image}!"
puts "#{Dir.children("transparent").count.to_s} / #{Dir.children("original").count.to_s}"
puts "\n"
end
thread.join
end
While this is processing, the speed seems to be about the same as on a single thread, judging by what I see in the Terminal.
I want the code to use as many threads as the OS will allow, so that it can get through all the files faster.
Or at least:
Find the maximum number of threads allowed.
Divide the file count of the original directory by that number.
Run the work in batches of that size.
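Note that calling thread.join inside the loop, as in the second version, waits for each thread before starting the next, so only one runs at a time. One common way to cap the thread count instead of batching by hand is a fixed pool of workers pulling from a Queue. The sketch below assumes a hypothetical helper named process_in_pool, an arbitrary worker count of 4, and a stand-in workload in place of the image pipeline.

```ruby
# Hypothetical helper: runs the block once per item using at most
# `worker_count` threads and returns the results in completion order.
def process_in_pool(items, worker_count, &block)
  jobs = Queue.new
  items.each { |item| jobs << item }
  jobs.close # once closed and drained, pop returns nil instead of blocking

  results = Queue.new
  workers = Array.new(worker_count) do
    Thread.new do
      # Items must not be nil or false, since nil marks the end of the queue.
      while (item = jobs.pop)
        results << block.call(item)
      end
    end
  end
  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

# Stand-in workload; the real block would run the image pipeline for one file.
frames = (1..20).map { |i| "frame_#{i}.png" }
done = process_in_pool(frames, 4) { |image| "transparent/#{image}" }
puts "#{done.size} / #{frames.size}"
```

Because the pool size is fixed, the number of threads no longer grows with the number of files, which is what triggered the "can't create thread" error. How much faster this is depends on how much of the work is spent waiting on something outside Ruby.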
I created a script that checks the health check and port status of the microservices listed in a .json file.
For every microservice in the .json file, the script outputs the HTTP status, the health check body and a few other small details. I want to add multithreading so that all of the output is returned at once. Please see the script below:
#!/usr/bin/env ruby
... get the environment argument part...
file = File.read('./services.json')
data_hash = JSON.parse(file)
threads = []
service = data_hash.keys
service.each do |microservice|
threads << Thread.new do
begin
puts "Microservice: #{microservice}"
port = data_hash["#{microservice}"]['port']
puts "Port: #{port}"
nodes = "knife search 'chef_environment:#{env} AND recipe:#{microservice}' -i"
node = %x[ #{nodes} ].split
node.each do |n|
puts "Node: #{n}"
uri = URI("http://#{n}:#{port}/healthcheck?count=10")
res = Net::HTTP.get_response(uri)
status = Net::HTTP.get(uri)
puts res.code
puts status
puts res.message
end
rescue Net::ReadTimeout
puts "ReadTimeout Error"
next
end
end
end
threads.each do |thread|
thread.join
end
This way, however, the script first prints the puts "Microservice: #{microservice}" and puts "Port: #{port}" lines for every service, then the nodes, and only after that the status.
How can I return all the data for each loop iteration together?
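Instead of calling puts, write the output to a variable (a hash).
If you want to wait for all threads to finish their job before showing the output, use the ThreadsWait class (from the thwait library; Ruby 2.7 and later no longer bundle it, so it may need to be installed as a gem).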
require 'thwait'
file = File.read('./services.json')
data_hash = JSON.parse(file)
h = {}
threads = []
service = data_hash.keys
service.each do |microservice|
threads << Thread.new do
thread_id = Thread.current.object_id.to_s(36)
begin
h[thread_id] = "Microservice: #{microservice}"
port = data_hash["#{microservice}"]['port']
h[thread_id] << "Port: #{port}"
nodes = "knife search 'chef_environment:#{env} AND recipe:#{microservice}' -i"
node = %x[ #{nodes} ].split
node.each do |n|
h[thread_id]<< "Node: #{n}"
uri = URI("http://#{n}:#{port}/healthcheck?count=10")
res = Net::HTTP.get_response(uri)
status = Net::HTTP.get(uri)
h[thread_id] << res.code
h[thread_id] << status
h[thread_id] << res.message
end
rescue Net::ReadTimeout
h[thread_id] << "ReadTimeout Error"
next
end
end
end
threads.each do |thread|
thread.join
end
# wait until all threads finish their job
ThreadsWait.all_waits(*threads)
p h
[edit]
ThreadsWait.all_waits(*threads) is redundant in the above code and can be omitted, since the line threads.each do |thread| thread.join end does exactly the same thing.
Instead of outputting the data as you get it using puts, you can collect it all in a string and then puts it once at the end. Strings can take the << operator (implemented as a method in Ruby), so you can just initialize the string, add to it, and then output it at the end, like this:
report = ''
report << 'first thing'
report << 'second thing'
puts report
You could even save them all up together and print them all after all were finished if you want.
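A minimal sketch of that idea, applied per thread: each thread builds its own report string and returns it, and Thread#value hands it back once the thread is done. The helper name collect_reports and the stand-in check block are assumptions; the real block would do the port lookup and HTTP calls.

```ruby
# Hypothetical helper: builds one report string per service in its own thread.
# The block stands in for the real port lookup and HTTP requests.
def collect_reports(services, &check)
  threads = services.map do |name|
    Thread.new do
      report = +'' # unary plus gives a mutable string
      report << "Microservice: #{name}\n"
      report << check.call(name)
      report << "\n"
      report # the block's last value is what Thread#value returns
    end
  end
  threads.map(&:value) # value waits for the thread, then returns its result
end

reports = collect_reports(%w[alpha beta]) { |name| "Status for #{name}: 200" }
reports.each { |report| puts report }
```

Since each thread only touches its own string, no locking is needed, and the reports come back in the same order as the input list.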
I have something like below:
all_hosts.each do |hostname|
Thread.new {
...
}
end
# next line of execution
Each of the hosts above gets its own thread, which executes the commands. I want to wait for all threads to finish executing before moving on to the next part of the file. Is there an easy way of doing this?
Use Thread#join, which waits for the thread to terminate.
To do that you need to keep references to the threads, so use map instead of each:
threads = all_hosts.map do |hostname|
Thread.new {
# commands
}
end
threads.each(&:join)
The Thread documentation explains it:
Alternatively, you can use an array for handling multiple threads at once, like in the following example:
threads = []
threads << Thread.new { puts "Whats the big deal" }
threads << Thread.new { 3.times { puts "Threads are fun!" } }
After creating a few threads we wait for them all to finish consecutively.
threads.each { |thr| thr.join }
Applied to your code:
threads = []
all_hosts.each do |hostname|
threads << Thread.new { ... }
end
threads.each(&:join)
# next line of execution
Can somebody explain why the following code won't run the passed block?
require 'daemons'
t = Daemons.call do
# This block does not start
File.open('out.log','w') do # code don't get here to open a file
|fw|
10.times {
fw.puts "=>#{rand(100)}"
sleep 1
}
end
end
#t.start # has no effect
10.times {
puts "Running ? #{t.running?}" # prints "Running ? false" 10 times
sleep 1
}
t.stop
puts 'finished'
Ruby 1.9.3p392, x86_64 Linux
Are you sure you're not actually looking for a Thread for concurrent programming?
Here's what a Thread implementation would look like:
f = File.open('out.log', 'w')
t = Thread.new do
10.times {
f.puts "=>#{rand(100)}"
sleep 1
}
end
10.times {
puts "Running ? #{t.alive?}"
sleep 1
}
t.exit
puts 'finished'
In Ruby, is it possible to pause a thread from a different, concurrently running thread?
Below is the code that I've written so far. I want the user to be able to type 'pause thread' and have the sample500 thread pause.
#!/usr/bin/env ruby
# Creates a new thread executes the block every intervalSec for durationSec.
def DoEvery(thread, intervalSec, durationSec)
thread = Thread.new do
start = Time.now
timeTakenToComplete = 0
loopCounter = 0
while(timeTakenToComplete < durationSec && loopCounter += 1)
yield
finish = Time.now
timeTakenToComplete = finish - start
sleep(intervalSec*loopCounter - timeTakenToComplete)
end
end
end
# User input loop.
exit = nil
while(!exit)
userInput = gets
case userInput
when "start thread\n"
sample500 = Thread
beginTime = Time.now
DoEvery(sample500, 0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")}}
when "pause thread\n"
sample500.stop
when "resume thread"
sample500.run
when "exit\n"
exit = TRUE
end
end
Passing a Thread object as an argument to the DoEvery function makes no sense, because you immediately overwrite it with Thread.new. Check out this modified version:
def DoEvery(intervalSec, durationSec)
thread = Thread.new do
start = Time.now
Thread.current["stop"] = false
timeTakenToComplete = 0
loopCounter = 0
while(timeTakenToComplete < durationSec && loopCounter += 1)
if Thread.current["stop"]
Thread.current["stop"] = false
puts "paused"
Thread.stop
end
yield
finish = Time.now
timeTakenToComplete = finish - start
# clamp at zero: after a long pause the difference goes negative and sleep would raise
sleep([intervalSec*loopCounter - timeTakenToComplete, 0].max)
end
end
thread
end
# User input loop.
exit = nil
while(!exit)
userInput = gets
case userInput
when "start thread\n"
sample500 = DoEvery(0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")} }
when "pause thread\n"
sample500["stop"] = true
when "resume thread\n"
sample500.run
when "exit\n"
exit = true
end
end
Here DoEvery returns the new thread object. Also note that Thread.stop is called from inside the running thread: you can't directly stop one thread from another, because it is not safe.
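The same cooperative idea can also be written with a Mutex and a ConditionVariable instead of Thread.stop and Thread#run. This is a different technique from the code above, shown only as a sketch: the PausableWorker class and its method names are hypothetical, and the intervals are shortened so the example finishes quickly.

```ruby
class PausableWorker
  def initialize(interval, &work)
    @lock = Mutex.new
    @state_changed = ConditionVariable.new
    @paused = false
    @stopped = false
    @thread = Thread.new do
      loop do
        stop_requested = @lock.synchronize do
          # Block here, without spinning, for as long as we are paused.
          @state_changed.wait(@lock) while @paused && !@stopped
          @stopped
        end
        break if stop_requested
        work.call
        sleep(interval)
      end
    end
  end

  # Takes effect before the worker's next iteration.
  def pause
    @lock.synchronize { @paused = true }
  end

  def resume
    @lock.synchronize do
      @paused = false
      @state_changed.signal
    end
  end

  # Asks the worker to finish and waits for it.
  def stop
    @lock.synchronize do
      @stopped = true
      @state_changed.signal
    end
    @thread.join
  end
end

ticks = 0
worker = PausableWorker.new(0.01) { ticks += 1 }
sleep(0.1)
worker.pause
sleep(0.1)
worker.resume
sleep(0.1)
worker.stop
puts "ticks: #{ticks}"
```

The worker only checks the flag between iterations, so a pause never interrupts the block halfway through a file write.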
You may be better able to accomplish what you are attempting with Ruby's Fiber object, and will likely get better efficiency on the running system.
Fibers are primitives for implementing light weight cooperative
concurrency in Ruby. Basically they are a means of creating code
blocks that can be paused and resumed, much like threads. The main
difference is that they are never preempted and that the scheduling
must be done by the programmer and not the VM.
Keep in mind that the current implementation of MRI Ruby runs only one thread's Ruby code at a time because of its global VM lock, so threads give you concurrency but not parallel execution of Ruby code. With that in mind, the following is a nice example:
require "fiber"
f1 = Fiber.new { |f2| f2.resume Fiber.current; while true; puts "A"; f2.transfer; end }
f2 = Fiber.new { |f1| f1.transfer; while true; puts "B"; f1.transfer; end }
f1.resume f2 # =>
# A
# B
# A
# B
# .
# .
# .
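For comparison, here is a smaller sketch that uses only Fiber#resume and Fiber.yield, which is the simplest form of the pause-and-resume behaviour described in the quote. The make_letters method is a hypothetical name used for illustration.

```ruby
# Hypothetical helper: returns a fiber that hands back one value per resume.
def make_letters
  Fiber.new do
    Fiber.yield "A" # pause here and return "A" to whoever called resume
    Fiber.yield "B"
    "done"          # the block's final value is returned by the last resume
  end
end

letters = make_letters
first  = letters.resume # runs until the first Fiber.yield
second = letters.resume # continues from where it paused
last   = letters.resume # runs to the end of the block
puts [first, second, last].inspect
```

Nothing inside the fiber runs until the caller resumes it, which is the "scheduling must be done by the programmer" part of the quote.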