How to calculate duration of array combination function in Ruby - ruby

I have a function below which generates a set of combinations for an array, at various lengths, defined by a range. I'd like to be able to get data about the combination process which would include the time required to process the combinations. Given the following:
source = ("a".."z").to_a
range = 1..7
The command to generate the combinations is this:
combinations = (range).flat_map do |size|
source.combination(size).to_a
end
This command takes about 5 seconds to run on my machine, and generates 971,711 combinations. However, when I try to execute this in the context of a function, below:
def combinations(source, range)
time_start = Time.now
combinations = (range).flat_map do |size|
source.combination(size).to_a
end
time_elapsed = (Time.now - time_start).round(1)
puts "Generated #{combinations.count} in #{time_elapsed} seconds."
return combinations
end
source = ("a".."z").to_a
range = 1..7
combinations(source, range)
The function almost immediately outputs:
Generated 971711 in 0.1 seconds.
... and then 5 seconds later returns the combinations. What's going on here? And how can I calculate the duration of the time required to process the combinations?

When I run your code on ruby 2.0.0p247 on an Ubuntu 12.04 32-bit machine, I get the output:
Generated 971711 in 0.6 seconds.
and the program exits immediately after that.
Since there is only one puts line in the program what do you mean by "and then 5 seconds later returns the combinations"? Is there more code that you are not showing us? What ruby interpreter are you running? What operating system? Could you provide the full code if you have not yet?
If you want to look into this more, I recommend trying rblineprof or ruby-prof.

So it looks like the issue here is that IRB takes the ~5 seconds to load and display the returned combinations, while the "Generated X in Y seconds." information is actually correct and working. The figure was just lower than I was expecting, because I had confused the time required to calculate the combinations with the time required to load and start displaying the output of the combinations.
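A minimal sketch of how the calculation can be timed on its own, assuming the standard library's Benchmark module is acceptable. The method name timed_combinations and the smaller source array are made up for illustration. The trailing "; nil" is only there so that IRB echoes nil instead of the whole result array:

```ruby
require 'benchmark'

# Hypothetical helper: times only the generation of the combinations.
def timed_combinations(source, range)
  combos = nil
  elapsed = Benchmark.realtime do
    combos = range.flat_map { |size| source.combination(size).to_a }
  end
  puts "Generated #{combos.count} in #{elapsed.round(1)} seconds."
  combos
end

source = ("a".."f").to_a # smaller input than in the question, to keep the demo fast
result = timed_combinations(source, 1..3); nil
```

With this, the printed duration covers only the work inside the Benchmark.realtime block, not the time IRB needs to print the return value.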

Related

Ruby - how to add phonecall's length? CSV file

I have a class Call that represents a single phone call, with a certain number of minutes/seconds, the date of the call, etc. I want to sum the length of the calls for a given day.
The problem is that my data is in string format; I've been formatting it with various Time.parse options and many other things.
But my main problem is: how do I sum them? I need something like Ruby's inject/reduce, but smart enough to know that 60 seconds is one minute.
One additional problem is that I'm reading from a .CSV file, turning every row into a Hash, and making Call objects out of it.
Any hints? :)
I suggest storing the duration of a call as a number of seconds in an integer, because that would allow you to easily run calculations in the database.
But if you prefer to keep the string representation you might want to use something like this:
# assuming `calls` is an array of Call instances and the
# duration of each call is stored in an attribute `duration` ("mm:ss")
total = calls.sum do |call|
  minutes, seconds = call.duration.split(':').map(&:to_i) # strings -> integers
  minutes * 60 + seconds
end
# format output, zero-padding the seconds
format('%d:%02d', total / 60, total % 60)
Please note that on Ruby versions before 2.4 the sum method is only available through ActiveSupport. When you are using pure Ruby without Rails on such a version, you need to use this instead:
total = calls.inject(0) do |sum, call|
  minutes, seconds = call.duration.split(':').map(&:to_i) # strings -> integers
  sum + minutes * 60 + seconds
end
You could also map them all to seconds, represented as Floats via Time.parse, and then inject over the mapped array.
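Putting the pieces together, here is a rough sketch of summing per day. It assumes durations are "mm:ss" strings and that each call carries a date string; the Call struct and the helper names duration_to_seconds and format_duration are invented for the example, not taken from the question:

```ruby
# Hypothetical stand-in for the question's Call class.
Call = Struct.new(:date, :duration)

# "mm:ss" -> total seconds as an Integer
def duration_to_seconds(duration)
  minutes, seconds = duration.split(':').map(&:to_i)
  minutes * 60 + seconds
end

# total seconds -> "m:ss", so 60 seconds roll over into one minute
def format_duration(total_seconds)
  format('%d:%02d', total_seconds / 60, total_seconds % 60)
end

calls = [
  Call.new('2014-01-01', '1:30'),
  Call.new('2014-01-01', '0:45'),
  Call.new('2014-01-02', '2:00')
]

# Group the calls by day, then add up the seconds within each group.
totals = calls.group_by(&:date).map do |date, day_calls|
  seconds = day_calls.inject(0) { |sum, call| sum + duration_to_seconds(call.duration) }
  [date, format_duration(seconds)]
end.to_h
```

Keeping everything in integer seconds until the final formatting step avoids having to teach inject about minutes at all.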

How to get user to enter in 24 hour format in BBC Basic

I am making a program that will enable me to work out the average speed of something over a set distance.
For this to work, the user needs to input the start time and the end time. I am not sure how to input time in a 24-hour format.
Furthermore, I need to find the difference between the two times and then work out the speed, which is distance / time taken.
Let's say distance was 1000 meters
I lack a BBC BASIC compiler, but you should create something like this:
print str$(secondsinday("22:50:01")-secondsinday("17:09:17"))
sub secondsinday(t$)
return val(left$(t$,2))*3600+val(mid$(t$,4,2))*60+val(right$(t$,2))
end sub
I saw some BBC BASIC examples and the formula should be the same; only the function syntax is different (I'll try to convert it after some research).
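For checking the arithmetic independently of any BASIC dialect, the same formula can be sketched in Ruby. This is not BBC BASIC; the function names seconds_in_day and average_speed are invented here, and it assumes both times fall on the same day in "HH:MM:SS" format:

```ruby
# "HH:MM:SS" (24-hour) -> seconds since midnight
def seconds_in_day(time_string)
  hours, minutes, seconds = time_string.split(':').map(&:to_i)
  hours * 3600 + minutes * 60 + seconds
end

# Average speed in metres per second over the given distance.
def average_speed(distance_m, start_time, end_time)
  elapsed = seconds_in_day(end_time) - seconds_in_day(start_time)
  raise ArgumentError, 'end time must be after start time' unless elapsed > 0
  distance_m.to_f / elapsed
end

elapsed = seconds_in_day('22:50:01') - seconds_in_day('17:09:17')
speed = average_speed(1000, '17:09:17', '22:50:01')
puts "Elapsed: #{elapsed} s, average speed: #{speed.round(4)} m/s"
```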

Ruby threading vs normal

Let's say I have 4 folders with 25 folders in each. In each of those 25 folders there are 20 folders, each with 1 very long text document. The method I'm using now seems to have room to improve, and in every scenario in which I implement Ruby's threads, the result is slower than before. I have an array of the 54 names of the folders. I iterate through each and use a foreach method to get the deeply nested files. In the foreach loop I do 3 things: I get the contents of today's file, I get the contents of yesterday's file, and I use my diff algorithm to find what has changed from yesterday to today. How would you do this faster with threads?
def backup_differ_loop device_name
  device_name.strip!
  Dir.foreach("X:/Backups/#{device_name}/#{@today}") do |backup|
    if backup != "." and backup != ".."
      @today_filename = "X:/Backups/#{device_name}/#{@today}/#{backup}"
      @yesterday_filename = "X:/Backups/#{device_name}/#{@yesterday}/#{backup.gsub(@today, @yesterday)}"
      if File.exist?(@yesterday_filename)
        today_backup_content = File.read(@today_filename)
        yesterday_backup_content = File.read(@yesterday_filename)
        begin
          Diffy::Diff.new(yesterday_backup_content, today_backup_content, :include_plus_and_minus_in_html => true, :context => 1).to_s(:html)
        rescue
          # do nothing, just continue
        end
      else
        # file not found
      end
    end
  end
end
The first part of your logic is finding all files in a specific folder. Instead of doing Dir.foreach and then checking against "." and ".." you can do this in one line:
files = Dir.glob("X:/Backups/#{device_name}/#{@today}/*").select { |item| File.file?(item) }
Notice the /* at the end? This will search 1 level deep (inside the @today folder). If you want to search inside sub-folders too, replace it with /**/* so you'll get an array of all files inside all sub-folders of @today.
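To see the difference between the two patterns without touching real backups, here is a small sketch that builds a throwaway directory. The helper name files_in and the file names are made up for the demo:

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: lists regular files directly in dir, or in all of
# its sub-folders too when recursive is true.
def files_in(dir, recursive: false)
  pattern = recursive ? File.join(dir, '**', '*') : File.join(dir, '*')
  Dir.glob(pattern).select { |item| File.file?(item) }
end

root = Dir.mktmpdir
begin
  FileUtils.mkdir_p(File.join(root, 'sub'))
  File.write(File.join(root, 'a.txt'), 'top level')
  File.write(File.join(root, 'sub', 'b.txt'), 'nested')

  shallow = files_in(root).map { |f| File.basename(f) }
  deep = files_in(root, recursive: true).map { |f| File.basename(f) }.sort
ensure
  FileUtils.remove_entry(root) # clean up the temporary directory
end
```

The File.file? filter is what drops directories such as sub from the result, which replaces the manual "." and ".." checks.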
So I'd first have a method which would give me an array of [today, yesterday] filename pairs for the files that exist on both days:
def get_matching_files(device_name)
  matching_files = []
  Dir.glob("X:/Backups/#{device_name}/#{@today}/*").select { |item| File.file?(item) }.each do |backup|
    today_filename = File.absolute_path(backup) # should get you X:/Backups...converts to an absolute path
    # glob returns full paths, so build yesterday's name from the base name only
    yesterday_filename = "X:/Backups/#{device_name}/#{@yesterday}/#{File.basename(backup).gsub(@today, @yesterday)}"
    if File.exist?(yesterday_filename)
      matching_files << [today_filename, yesterday_filename]
    end
  end
  return matching_files
end
and call it:
matching_files = get_matching_files(device_name)
NOW we can start the multi-threading which is where things probably slow down. I'd first get all the files from the array matching_files into a queue, then start 5 threads which will go until the queue is empty:
queue = Queue.new
matching_files.each { |file| queue << file }
# 5 being the number of threads
5.times.map do
  Thread.new do
    until queue.empty?
      begin
        # non-blocking pop: raises ThreadError if another thread emptied the queue first
        today_filename, yesterday_filename = queue.pop(true)
        today_backup_content = File.read(today_filename)
        yesterday_backup_content = File.read(yesterday_filename)
        Diffy::Diff.new(yesterday_backup_content, today_backup_content, :include_plus_and_minus_in_html => true, :context => 1).to_s(:html)
      rescue
        # do nothing, just continue
      end
    end
  end
end.each(&:join)
I can't guarantee my code will work because I don't have the entire context of your program. I hope I've given you some ideas.
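For anyone who wants to try the queue-plus-workers pattern in isolation, here is a self-contained sketch. It has no Diffy or file access; the method name process_in_threads is invented, and the squaring block is just a stand-in for the real diff work:

```ruby
# Hypothetical worker pool: thread_count threads drain a shared queue and
# collect whatever the given block returns for each item.
def process_in_threads(items, thread_count: 5, &work)
  queue = Queue.new
  items.each { |item| queue << item }
  results = Queue.new # Queue is thread-safe, so it doubles as a collector

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        item = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break # nothing left to do, so this worker finishes
        end
        results << work.call(item)
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

squares = process_in_threads((1..20).to_a) { |n| n * n }
```

The results come back in whatever order the workers finished, so sort them or carry an index along if the order matters.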
And the MOST important thing: the standard implementation of Ruby (MRI) has a global interpreter lock, so only one thread can execute Ruby code at a time. This means that even if you implement the code above, you probably won't get a significant performance difference. So get Rubinius or JRuby, which allow more than one thread to run at a time. Or, if you prefer to use the standard MRI Ruby, you'll need to restructure your code (you can keep your original version) and start multiple processes. You'll just need something like a shared database where you can store the matching_files (as a single row, for example), and every time a process 'takes' something from that database, it marks that row as 'used'. SQLite is a good db for this I think because it's thread safe by default.
Most Ruby implementations don't have "true" multicore threading i.e. threads won't gain you any performance improvement since the interpreter can only run one thread at a time. For applications like yours with lots of disk IO this is especially true. In fact, even with real multithreading your applications might be IO-bound and still not see much of an improvement.
You are more likely to get results by finding some inefficient algorithm in your code and improving it.

Best way to convert a Mongo query to a Ruby array?

Let's say I have a large query (for the purposes of this exercise say it returns 1M records) in MongoDB, like:
users = Users.where(:last_name => 'Smith')
If I loop through this result, working with each member, with something like:
users.each do |user|
# Some manipulation to "user"
# Some calculation for "user"
...
# Saving "user"
end
I'll often get a Mongo cursor timeout (as the database cursor that is reserved exceeds the default timeout length). I know I can extend the cursor timeout, or even turn it off--but this isn't always the most efficient method. So, one way I get around this is to change the code to:
users = Users.where(:last_name => 'Smith')
user_array = []
users.each do |u|
user_array << u
end
THEN, I can loop through user_array (since it's a Ruby array), doing manipulations and calculations, without worrying about a MongoDB timeout.
This works fine, but there has to be a better way--does anyone have a suggestion?
If your result set is so large that it causes cursor timeouts, it's not a good idea to load it entirely to RAM.
A common approach is to process records in batches.
Get 1000 users (sorted by _id).
Process them.
Get another batch of 1000 users where _id is greater than _id of last processed user.
Repeat until done.
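A rough sketch of that loop, using an in-memory array in place of the database so it runs on its own. USERS, fetch_batch and each_in_batches are invented names; in a real application fetch_batch would be a query sorted by _id with a limit and an "_id greater than" condition:

```ruby
# Hypothetical stand-in for the users collection.
USERS = (1..2500).map { |id| { _id: id, last_name: 'Smith' } }

# Stand-in for a query: up to `limit` users with _id > after_id, sorted by _id.
def fetch_batch(after_id, limit)
  USERS
    .select { |user| after_id.nil? || user[:_id] > after_id }
    .sort_by { |user| user[:_id] }
    .first(limit)
end

# Yields every user, loading only one batch at a time.
def each_in_batches(batch_size: 1000)
  last_id = nil
  loop do
    batch = fetch_batch(last_id, batch_size)
    break if batch.empty?
    batch.each { |user| yield user }
    last_id = batch.last[:_id] # the next batch starts after this _id
  end
end

processed = 0
each_in_batches { |user| processed += 1 }
puts "Processed #{processed} users"
```

Because each batch is a fresh, short query, no cursor stays open long enough to time out, and only one batch is held in memory at a time.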
For a long running task, consider using rails runner.
runner runs Ruby code in the context of Rails non-interactively. For instance:
$ rails runner "Model.long_running_method"
For further details, see:
http://guides.rubyonrails.org/command_line.html

Guess A Number - Ruby Online

I have been trying to create a Ruby program that will run online, where a user can guess a number and it will say higher or lower. I know it will take a random number, store it in a variable, and then run a loop, with conditionals to check?
I'm not asking for full code, just the basic structure so I can use it to get me going.
Any idea how I would do this? I found info on creating a random number like this:
x = rand(20)
UPDATE: My code I am going to be working with is something like this: http://pastie.org/461976
I would say to do something like this:
x = rand(20)
loop {
# get the number from the user somehow, store it in num
if num == x
# they got it right
break
elsif num > x
# the guess was too high
else
# the guess was too low
end
}
If you're running it online, this structure may not be feasible. You may need to store the guess in the user's session and have a textbox for the guess, and submit it to a controller which would have the above code without the loop construct, and just redirect them to the same page with a message if they didn't get it right.
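As a sketch of the part that would live in the controller, the comparison can be pulled into a small method that handles one guess per request. The method name check_guess and the returned symbols are made up for the example, and the scripted guesses stand in for successive form submissions:

```ruby
# Hypothetical helper: compares one guess with the stored secret number.
def check_guess(secret, guess)
  if guess == secret
    :correct
  elsif guess > secret
    :too_high
  else
    :too_low
  end
end

secret = 12 # in the real app this would be rand(20), kept in the session
outcomes = [5, 18, 12].map { |guess| check_guess(secret, guess) }
outcomes.each { |outcome| puts outcome }
```

Each request would then call check_guess once and render a message based on the result, so no loop is needed on the server.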
