Add multithreads/concurency in script - ruby

I created a script which checks healthcheck and ports status from a .json file populated with microservices.
So for every microservice from the .json file the script will output the HTTP status and healthcheck body and other small details, and I want to add multithreading here in order to return all the output at once.Please see the script below:
#!/usr/bin/env ruby
... get the environment argument part...
file = File.read('./services.json')
data_hash = JSON.parse(file)
threads = []
service = data_hash.keys
service.each do |microservice|
threads << Thread.new do
begin
puts "Microservice: #{microservice}"
port = data_hash["#{microservice}"]['port']
puts "Port: #{port}"
nodes = "knife search 'chef_environment:#{env} AND recipe:#{microservice}' -i"
node = %x[ #{nodes} ].split
node.each do |n|
puts "Node: #{n}"
uri = URI("http://#{n}:#{port}/healthcheck?count=10")
res = Net::HTTP.get_response(uri)
status = Net::HTTP.get(uri)
puts res.code
puts status
puts res.message
end
rescue Net::ReadTimeout
puts "ReadTimeout Error"
next
end
end
end
threads.each do |thread|
thread.join
end
Anyway in this way the script return first the puts "Microservice: #{microservice}" and puts "Port: #{port}" and after this it will return the nodes and only after the STATUS.
How can I return all the data for each loop together?

Instead of puts write output to a variable (hash).
If you wand to wait for all threads to finish their job before showing the output, use ThreadsWait class.
require 'thwait'
file = File.read('./services.json')
data_hash = JSON.parse(file)
h = {}
threads = []
service = data_hash.keys
service.each do |microservice|
threads << Thread.new do
thread_id = Thread.current.object_id.to_s(36)
begin
h[thread_id] = "Microservice: #{microservice}"
port = data_hash["#{microservice}"]['port']
h[thread_id] << "Port: #{port}"
nodes = "knife search 'chef_environment:#{env} AND recipe:#{microservice}' -i"
node = %x[ #{nodes} ].split
node.each do |n|
h[thread_id]<< "Node: #{n}"
uri = URI("http://#{n}:#{port}/healthcheck?count=10")
res = Net::HTTP.get_response(uri)
status = Net::HTTP.get(uri)
h[thread_id] << res.code
h[thread_id] << status
h[thread_id] << res.message
end
rescue Net::ReadTimeout
h[thread_id] << "ReadTimeout Error"
next
end
end
end
threads.each do |thread|
thread.join
end
# wait untill all threads finish their job
ThreadsWait.all_waits(*threads)
p h
[edit]
ThreadsWait.all_waits(*threads) is redundant in above code and can be omitted, since line treads.each do |thread| thread.join end does exactely the same thing.

Instead of outputting the data as you get it using puts, you can collect it all in a string and then puts it once at the end. Strings can take the << operator (implemented as a method in Ruby), so you can just initialize the string, add to it, and then output it at the end, like this:
report = ''
report << 'first thing'
report << 'second thing'
puts report
You could even save them all up together and print them all after all were finished if you want.

Related

Limit the number of threads in an iteration ruby

When I have my code like this, I get "can't create thread, resource temporarily unavailable". There are over 24k files in the directory to process.
frames.each do |image|
Thread.new do
pipeline = ImageProcessing::MiniMagick.
source(File.open("original/#{image}"))
.append("-fuzz", "30%")
.append("-transparent", "#ff00fe")
result = pipeline.call
puts result.path
file_parts = image.split("_")
frame_number = file_parts[2]
FileUtils.cp(result.path, "transparent/image_transparent_#{frame_number}")
puts "Done with #{image}!"
puts "#{Dir.children("transparent").count.to_s} / #{Dir.children("original").count.to_s}"
puts "\n"
end
end.each{ |thread| thread.join }
So, I tried the first 1001 files by calling the index 0-1000, and did it this way:
frames[0..1000].each_with_index do |image, index|
thread = Thread.new do
pipeline = ImageProcessing::MiniMagick.
source(File.open("original/#{image}"))
.append("-fuzz", "30%")
.append("-transparent", "#ff00fe")
result = pipeline.call
puts result.path
file_parts = image.split("_")
frame_number = file_parts[2]
FileUtils.cp(result.path, "transparent/image_transparent_#{frame_number}")
puts "Done with #{image}!"
puts "#{Dir.children("transparent").count.to_s} / #{Dir.children("original").count.to_s}"
puts "\n"
end
thread.join
end
And while this is processing, the speed seems to be about the same as if it was on a single thread when I'm watching it in the Terminal.
But I want the code to be able to limit to whatever the OS will allow before it disallows, so that it can parse through them all faster.
Or at lease:
Find the maximum threads allowed
Get original directory's count, divided by the number of threads allowed.
Run this each in batches of that division.

Ruby method variable declaration

I'm trying to define methods to parse through an apache log file and pull ip addresses, URLs, requests per hour, and error codes. I've got everything working outside of methods, but when attempting to put that code into the methods I keep getting the error message "Stack level too deep." Here is the code in question.
class CommonLog
def initialize(logfile)
#logfile = logfile
end
def readfile
#readfile = File.readlines(#logfile).map { |line|
line.split()
}
#readfile = #readfile.to_s.split(" ")
end
def ip_histogram
#ip_count = 0
#readfile.each_index { |index|
if (#readfile[index] =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/ )
puts #readfile[index]
puts #ip_count += 1
end
}
end
def url_histogram
url_count = 0
cleaned_file.each_index { |index|
if (cleaned_file[index] =~ /\/{1}(([a-z]{4,})|(\~{1}))\:{0}\S+/ )
puts cleaned_file[index]
puts url_count += 1
end
}
end
def requests_per_hour
end
def sorted_list
end
end
my_file = CommonLog.new("test_log")
cleaned_file = my_file.readfile
puts cleaned_file.ip_histogram
It looks like the problem lies on you CommonLog#readfile method:
def readfile
#readfile = File.readlines(#logfile).map { |line|
line.split()
}
#readfile = readfile.to_s.split(" ")
end
Notice that inside the implementation of readfile your calling readfile recursively? When it executes it reads the lines from the file, maps them and assign the result the #readfile; then it calls readfile and the method starts to execute again; this goes on forever, until you stack blows up because of too many recursive method calls.
I assume what you actually meant is:
#readfile = #readfile.to_s.split(" ")

Ruby Mechanize Stops Working while in Each Do Loop

I am using a mechanize Ruby script to loop through about 1,000 records in a tab delimited file. Everything works as expected until i reach about 300 records.
Once I get to about 300 records, my script keeps calling rescue on every attempt and eventually stops working. I thought it was because I had not properly set max_history, but that doesn't seem to be making a difference.
Here is the error message that I start getting:
getaddrinfo: nodename nor servname provided, or not known
Any ideas on what I might be doing wrong here?
require 'mechanize'
result_counter = 0
used_file = File.open(ARGV[0])
total_rows = used_file.readlines.size
mechanize = Mechanize.new { |agent|
agent.open_timeout = 10
agent.read_timeout = 10
agent.max_history = 0
}
File.open(ARGV[0]).each do |line|
item = line.split("\t").map {|item| item.strip}
website = item[16]
name = item[11]
if website
begin
tries ||= 3
page = mechanize.get(website)
primary1 = page.link_with(text: 'text')
secondary1 = page.link_with(text: 'other_text')
contains_primary = true
contains_secondary = true
unless contains_primary || contains_secondary
1.times do |count|
result_counter+=1
STDERR.puts "Generate (#{result_counter}/#{total_rows}) #{name} - No"
end
end
for i in [primary1]
if i
page_to_visit = i.click
page_found = page_to_visit.uri
1.times do |count|
result_counter+=1
STDERR.puts "Generate (#{result_counter}/#{total_rows}) #{name}"
end
break
end
end
rescue Timeout::Error
STDERR.puts "Generate (#{result_counter}/#{total_rows}) #{name} - Timeout"
rescue => e
STDERR.puts e.message
STDERR.puts "Generate (#{result_counter}/#{total_rows}) #{name} - Rescue"
end
end
end
You get this error because you don't close the connection after you used it.
This should fix your problem:
mechanize = Mechanize.new { |agent|
agent.open_timeout = 10
agent.read_timeout = 10
agent.max_history = 0
agent.keep_alive = false
}

Put contents of array all at once

I don't understand why this won't do what the title states.
#!/usr/bin/env ruby
require 'socket'
require 'timeout'
class Scanner
def initialize(host, port)
#host = host
#port = port
end
def popen
begin
array = []
sock = Socket.new(:INET, :STREAM)
sockaddr = Socket.sockaddr_in(#port, #host)
Timeout::timeout(5) do
array.push("Port #{#port}: Open") if sock.connect(sockaddr)
end
puts array
rescue Timeout::Error
puts "Port #{#port}: Filtered"
rescue Errno::ECONNREFUSED
end
end
end # end Scanner
def main
begin
p = 1
case ARGV[0]
when '-p'
eport = ARGV[1]
host = ARGV[2]
else
eport = 65535
host = ARGV[0]
end
t1 = Time.now
puts "\n"
puts "-" * 70
puts "Scanning #{host}..."
puts "-" * 70
while p <= eport.to_i do
scan = Scanner.new(host, p)
scan.popen
p += 1
end
t2 = Time.now
time = t2 - t1
puts "\nScan completed: #{host} scanned in #{time} seconds."
rescue Errno::EHOSTUNREACH
puts "This host appears to be unreachable"
rescue Interrupt
puts "onnection terminated."
end
end
main
What I'm trying to achieve is an output similar to nmap, in the way that it scans everything, and then shows all open or closed ports at the end. Instead what happens is that it prints them out as it discovers them. I figured pushing the output into an array then printing the array would achieve such an output, yet it still prints out the ports one at a time. Why is this happening?
Also, I apologize for the formatting, the code tags are a little weird.
Your loop calls popen once per iteration. Your popen method sets array = [] each time it is called, then populates it with one item, then you print it with puts. On the next loop iteration, you reset array to [] and do it all again.
You only asked "why," but – you could solve this by setting array just once in the body of main and then passing it to popen (or any number of ways).

Ruby output is not displayed on the sinatra browser

I want to bulid a multi threaded application. If i do not use threads, everything works fine. When i try to use threads, then nothing is displayed on the browser. when i use the syntax 'puts "%s" %io.read' then it displays on the command prompt and not on the browser. Any help would be appreciated.
require 'sinatra'
require 'thread'
set :environment, :production
get '/price/:upc/:rtype' do
Webupc = "#{params[:upc]}"
Webformat = "#{params[:rtype]}"
MThread = Thread.new do
puts "inside thread"
puts "a = %s" %Webupc
puts "b = %s" %Webformat
#call the price
Maxupclen = 16
padstr = ""
padupc = ""
padlen = (Maxupclen - Webupc.length)
puts "format type: #{params[:rtype]}"
puts "UPC: #{params[:upc]}"
puts "padlen: %s" %padlen
if (Webformat == 'F')
puts "inside format"
if (padlen == 0 ) then
IO.popen("tstprcpd.exe #{Webupc}")
{ |io|
"%s" %io.read
}
elsif (padlen > 0 ) then
for i in 1 .. padlen
padstr = padstr + "0"
end
padupc = padstr + Webupc
puts "padupc %s" %padupc
IO.popen("tstprcpd.exe #{padupc}") { |io|
"%s" %io.read
}
elsif (padlen < 0 ) then
IO.popen("date /T") { |io|
"UPC length must be 16 digits or less." %io.read
}
end
end
end
end
Your code has several problems:
It is not formatted properly
You are using Uppercase names for variables; that makes them constants!
puts will not output to the browser, but to the console. The browser will recieve the return value of the block, i.e. the return value of the last statement in the block. Therefore, you need to build your output differently (see below).
You are never joining the thread
Here's a minimal sinatra app that uses a thread. However, the thread makes no sense in this case because you must wait for its termination anyway before you can output the result to the browser. In order to build the output I have used StringIO, which you can use with puts to build a multiline string conveniently. However, you could also simply initialize res with an empty string with res = "" and then append your lines to this string with res << "new line\n".
require 'sinatra'
require 'thread'
require 'stringio'
get '/' do
res = StringIO.new
th = Thread.new do
res.puts 'Hello, world!'
end
th.join
res.string
end

Resources