Trying to create nested loops dynamically in Ruby

I currently have the following method:
def generate_lineups(max_salary)
  player_combos_by_position = calc_position_combinations
  lineups = []
  player_combos_by_position[:qb].each do |qb_set|
    unless salary_of(qb_set) > max_salary
      player_combos_by_position[:rb].each do |rb_set|
        unless salary_of(qb_set, rb_set) > max_salary
          lineups << create_team_from_sets(qb_set, rb_set)
        end
      end
    end
  end
  return lineups
end
player_combos_by_position is a hash that contains groupings of players keyed by position:
{ qb: [[player1, player2], [player6, player7]], rb: [[player3, player4, player5], [player8, player9, player10]] }
salary_of() takes the sets of players and calculates their total salary.
create_team_from_sets() takes sets of players and returns a new Team built from those players.
Ideally I want to remove the hardcoded nested loops as I do not know which positions will be available. I think recursion is the answer, but I'm having a hard time wrapping my head around the solution. Any ideas would be greatly appreciated.
Some answers have recommended the use of Array#product. This is normally an elegant solution; however, I'm dealing with very large sets of data (there are about 161,000 combinations of WRs and about 5,000 combinations of RBs to combine, for those two positions alone). In my loops I use the unless salary_of(qb_set, rb_set) > max_salary check to avoid making unnecessary calculations, as this weeds out quite a few combinations. I cannot do this using Array#product, and therefore the combinations take a very long time to put together. I'm looking for a way to rule out combinations early and save on computer cycles.

You can use Array#product to get all the possible lineups and then select the ones that are within budget. This allows for a variable number of positions.
first_pos, *rest = player_combos_by_position.values
all_lineups = first_pos.product(*rest)
#=> all possible lineups

lineups = all_lineups.
  # select lineups within budget
  select { |l| salary_of(*l) <= max_salary }.
  # create teams from selected lineups
  map { |l| create_team_from_sets(*l) }
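If memory rather than iteration count is the main concern, note that Array#product can also take a block, in which case it yields each combination without materializing the full product array (it still visits every combination, so it does not prune early the way the nested loops do). A rough sketch along those lines:

first_pos, *rest = player_combos_by_position.values
lineups = []
# With a block, Array#product yields each combination instead of building
# the whole cartesian product in memory, so memory use stays flat.
first_pos.product(*rest) do |combo|
  lineups << create_team_from_sets(*combo) if salary_of(*combo) <= max_salary
end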
Another option: a recursive method (not tested, but it should get you started).
def generate_lineups(player_groups, max_salary)
  first, *rest = player_groups
  lineups = []
  first.each do |player_group|
    # Prune early: skip this position group if it already blows the budget.
    next if salary_of(player_group) > max_salary
    if rest.empty?
      lineups << [player_group]
    else
      generate_lineups(rest, max_salary).each do |lineup|
        candidate = [player_group, *lineup]
        lineups << candidate unless salary_of(*candidate) > max_salary
      end
    end
  end
  lineups
end
Usage (the recursion returns arrays of position sets, so build the teams at the end):

lineups = generate_lineups(player_combos_by_position.values, max_salary).
  map { |sets| create_team_from_sets(*sets) }

After reading your edit, I see your problem. Here I've modified my code to show you how you could impose a salary limit on the combinations for each position group, as well as for the entire team. Does this help? You may want to consider putting your data in a database and using Rails.
team_max_salary = 300
players = { player1: { position: :qb, salary: 15, rating: 9 }, player2: { position: :rb, salary: 6, rating: 6 }, ... }
group_info = { qb: { nplayers: 2, max_salary: 50 }, rb: { nplayers: 2, max_salary: 50 }, ... }

groups = group_info.keys
players_by_group = {}
groups.each { |g| players_by_group[g] = [] }
players.each { |name, info| players_by_group[info[:position]] << info.merge(name: name) }

combinations_for_team = [[]] # start with one empty partial team so the first merge works
groups.each do |g|
  combinations_by_group = players_by_group[g].combination(group_info[g][:nplayers]).select { |c| salary(c) <= group_info[g][:max_salary] }
  # Possibly employ other criteria here to further trim combinations_by_group
  combinations_for_team = combinations_for_team.product(combinations_by_group).
    map { |team, combo| team + combo }.
    select { |c| salary(c) <= team_max_salary }
end
Each pass merges every partial team with every combination from the new group and filters on the running team salary. Note I've made the player keys symbols (e.g., :AaronRogers), but you could of course use strings instead.
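For completeness, here is the kind of salary helper the sketch above assumes (hypothetical; a combination here is just an array of the player hashes built above):

# Assumed helper: total salary of a combination of player hashes
# like { name: :player1, position: :qb, salary: 15, rating: 9 }.
def salary(combination)
  combination.sum { |player| player[:salary] }
end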

Related

How to map one array to two in Ruby and perform some function if a condition is met

I’m trying to improve the readability of a piece of code and also make it more concise if possible.
I have an array that needs to be iterated over; if an item matches some criteria I want to collect it, and I also need to do some other work as we iterate, i.e. updating the balance.
need_bananas = []
need_apples = []
balance = 10

array.each do |item|
  if need_bananas?(item)
    need_bananas << item
  elsif need_apples?(item)
    need_apples << item
  end
  balance -= item.amount
end

def need_bananas?(item)
  balance >= item.amount
end

def need_apples?(item)
  balance < item.amount
end
This feels too cumbersome, and there must be a way to make it more concise. I have thoughts around using reduce or partition, etc., but I can't settle on a nice solution.
Thanks in advance.
Is this something that will work for you?
balance = 10
need_bananas, need_apples = array.partition do |item|
  (balance -= item.amount) >= 0
end
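A quick check of the idea with stand-in data (the Item struct is only an assumption to make the sketch self-contained; your real item objects just need to respond to amount):

# Hypothetical stand-in for the real items.
Item = Struct.new(:amount)
array = [Item.new(4), Item.new(3), Item.new(5), Item.new(2)]

balance = 10
need_bananas, need_apples = array.partition do |item|
  (balance -= item.amount) >= 0
end

p need_bananas.map(&:amount) #=> [4, 3]
p need_apples.map(&:amount)  #=> [5, 2]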

Iterating over big arrays with limited memory and time of execution

I'm having trouble getting my Ruby solution to pass some tests where the arrays get too big and the following error is returned:
Solution.rb: failed to allocate memory (NoMemoryError)
I have failed to pass it twice.
The problem is about scheduling meetings. The method receives two parameters, in order: an array with the first day each investor can meet with the company, and an array with the corresponding last days.
For example:
firstDay = [1,5,10]
lastDay = [4,10,10]
This means the first investor is available between days 1..4, the second between days 5..10, and the last one only on day 10.
I need to return the largest number of investors that the company will serve. In this case, all of them can be attended to, the first one on day 1, the second one on day 5, and the last one on day 10.
So far, the code works normally, but with some hidden tests with at least 1000 investors, the error I mentioned earlier appears.
Is there a best practice in Ruby to handle this?
My current code is:
def countMeetings(firstDay, lastDay)
  GC::Profiler.enable
  GC::Profiler.clear
  first = firstDay.sort.first
  last = lastDay.sort.last
  available = []

  # Construct the available days for meetings
  firstDay.each_with_index do |d, i|
    available.push((firstDay[i]..lastDay[i]).to_a)
  end
  available = available.flatten.uniq.sort

  investors = {}
  attended_day = []
  attended_investor = []

  # Construct a list of investors based on their first and last days
  firstDay.each_index do |i|
    investors[i + 1] = (firstDay[i]..lastDay[i]).to_a
  end

  for day in available
    investors.each do |key, value|
      next if attended_investor.include?(key)
      if value.include?(day)
        next if attended_day.include?(day)
        attended_day.push(day)
        attended_investor.push(key)
      end
    end
  end

  attended_investor.size
end
Using lazy enumerators, as far as I could understand them, I escaped the MemoryError, but I started receiving a runtime error:
Your code was not executed on time. Allowed time: 10s
And my code looked like this:
def countMeetings(firstDay, lastDay)
  loop_size = firstDay.size
  first = firstDay.sort.first
  last = lastDay.sort.last
  daily_attendance = {}

  (first..last).each do |day|
    for ind in 0...loop_size
      (firstDay[ind]..lastDay[ind]).lazy.each do |investor_day|
        next if daily_attendance.has_value?(ind)
        if investor_day == day
          daily_attendance[day] = ind
        end
      end
    end
  end

  daily_attendance.size
end
It got through the cases with few investors. I then thought about using multiple threads, and the code became the following:
def countMeetings(firstDay, lastDay)
  loop_size = firstDay.size
  first = firstDay.sort.first
  last = lastDay.sort.last
  threads = []
  daily_attendance = {}

  (first..last).lazy.each_slice(25000) do |slice|
    slice.each do |day|
      threads << Thread.new do
        for ind in 0...loop_size
          (firstDay[ind]..lastDay[ind]).lazy.each do |investor_day|
            next if daily_attendance.has_value?(ind)
            if investor_day == day
              daily_attendance[day] = ind
            end
          end
        end
      end
    end
  end

  threads.each { |t| t.join }
  daily_attendance.size
end
Unfortunately, it went back to the MemoryError.
This can be done without consuming any more memory than the range of days. The key is to avoid Arrays and keep things as Enumerators as much as possible.
First, rather than the awkward pair of Arrays that need to be converted into Ranges, pass in an Enumerable of Ranges. This both simplifies the method and allows it to be lazy if the list of ranges is very large. It could be read from a file, fetched from a database or an API, or generated by another lazy enumerator. This saves you from needing big arrays.
Here's an example using an Array of Ranges.
p count_meetings([(1..4), (5..10), (10..10)])
Or to demonstrate transforming your firstDay and lastDay Arrays into a lazy Enumerable of Ranges...
firstDays = [1,5,10]
lastDays = [4,10,10]

p count_meetings(
  firstDays.lazy.zip(lastDays).map { |first, last|
    (first..last)
  }
)
firstDays.lazy makes everything that comes after lazy. .zip(lastDays) iterates through both Arrays in pairs: [1,4], [5,10], and [10,10]. Then we turn them into Ranges. Because it's lazy it will only map them as needed. This avoids making another big Array.
Now that's fixed, all we need to do is iterate over each Range and increment their attendance for the day.
def count_meetings(attendee_ranges)
  # Make a Hash whose default values are 0.
  daily_attendance = Hash.new(0)
  # For each attendee
  attendee_ranges.each { |range|
    # For each day they will attend, add one to the attendance for that day.
    range.each { |day| daily_attendance[day] += 1 }
  }
  # Return the largest per-day attendance count.
  # (Hash#max compares by key, so use .values.max to get the maximum count.)
  daily_attendance.values.max
end
Memory growth is limited to how big the day range is. If the earliest attendee is on day 1 and the latest on day 1000, daily_attendance holds just 1000 entries (which is a long time for a conference).
And since you've built the whole Hash anyway, why waste it? Write one function that returns the full attendance, and another that extracts the max.
def count_meeting_attendance(attendee_ranges)
  daily_attendance = Hash.new(0)
  attendee_ranges.each { |range|
    range.each { |day| daily_attendance[day] += 1 }
  }
  return daily_attendance
end

def max_meeting_attendance(*args)
  count_meeting_attendance(*args).values.max
end
Since this is an exercise and you're stuck with the wonky arguments, we can do the same trick and lazily zip firstDays and lastDays together and turn them into Ranges.
def count_meeting_attendance(firstDays, lastDays)
  attendee_ranges = firstDays.lazy.zip(lastDays).map { |first, last|
    (first..last)
  }

  daily_attendance = Hash.new(0)
  attendee_ranges.each { |range|
    range.each { |day| daily_attendance[day] += 1 }
  }
  return daily_attendance
end
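For example, with the sample data from the question, the per-day attendance hash comes out like this:

attendance = count_meeting_attendance([1, 5, 10], [4, 10, 10])
p attendance
#=> {1=>1, 2=>1, 3=>1, 4=>1, 5=>1, 6=>1, 7=>1, 8=>1, 9=>1, 10=>2}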

How to improve enumerable find in ruby?

I am making a request to a webservice API and I have several ids pointing to a location.
In order to reassign the results from the API, which contain only the id (sid), I have the following code:
lookups = []
Location.all.each do |city|
  accommodations = Accommodation.within_distance(city.lonlat, [1, 2])
  lookups << LookUp.new(city.id, accommodations.select(:supplier, :sid).to_a.map(&:serializable_hash))
end
After the webservice call, I try reassigning result ids (sids) to the cities:
results = call_to_api
res = []
lookups.each do |lup|
  res << { :city => lup.city,
           :accommodations => lup.accommodations.map { |j|
             results.find { |i| i.sid == j['sid'] }
           } }
end
The lookups iteration is incredibly slow and takes up to 50 seconds for just 4,000 entries.
So how can I improve this from a performance point of view?
Imagine you have three lookups that all have accommodations A, B, and C.
The way it is done now, the first lookup will perform the map and search for A, B, and C.
The second lookup will perform the map and search for A, B, and C.
And so on. Given the basic nature of the search criteria, it doesn't look like the result for accommodation A is really going to change between different lookups in the same collection.
In that case I would consider caching the results of each sid search, and if you ever have an accommodation with the same sid, just pull it from the cache.
For example, something like
cache = {}

if cache.include?(yourSID)
  # use cache[yourSID]
else
  mappings = doYourMappingHere
  # cache it for future use; might need to dup
  cache[yourSID] = mappings
end
Of course this is under the assumption that the same accommodation appears several times.
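Building on that idea: since the slow part is the linear results.find inside a nested loop, one option is to index the API results by sid once and then do constant-time hash lookups. A minimal sketch, assuming each result responds to sid as in your code:

# Index the API results by sid once (O(n)) instead of scanning them
# for every accommodation in every lookup.
results_by_sid = {}
results.each { |r| results_by_sid[r.sid] = r }

res = []
lookups.each do |lup|
  res << { :city => lup.city,
           :accommodations => lup.accommodations.map { |j| results_by_sid[j['sid']] } }
end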
So here comes the boost:
ids = Rails.cache.fetch(:ids) {
  ids = {}
  Location.all.each do |city|
    Accommodation.within_distance(city.lonlat, [1, 2]).each do |acc|
      if acc.supplier == 2
        if ids.include? acc.sid
          ids[acc.sid] << city.attributes
        else
          ids[acc.sid] = [city.attributes]
        end
      end
    end
  end
  ids
}

results = Rails.cache.fetch(:results) {
  api.rates_by_ids(ids.keys)
}
p results.size

accommodations_in_cities = {}
results.each do |res|
  ids[res.sid].each do |city|
    if accommodations_in_cities.include? city['id']
      accommodations_in_cities[city['id']] << res
    else
      accommodations_in_cities[city['id']] = [res]
    end
  end
end
accommodations_in_cities

Adding elements of different arrays together

I'm trying to use CSV to calculate the average of three numbers and output it to a separate file. Specifically: open one file, take the first value (the name), and then calculate the average of the next three values, doing this for each person in the file.
Here is my Book1.csv
Tom,90,80,70
Adam,80,85,83
Mike,100,93,89
Dave,100,100,100
Rob,80,70,75
Nick,80,90,70
Justin,100,90,90
Jen,80,90,100
I'm trying to get it to output this:
Tom,80
Adam,83
Mike,94
Dave,100
Rob,75
Nick,80
Justin,93
Jen,90
I have each person in an array, and I thought I could get this to work with the basic "pseudo" code I have written, but it does not work.
Here is my code so far:
#!/usr/bin/ruby
require 'csv'

names = []
grades1 = []
grades2 = []
grades3 = []
average = []
i = 0

CSV.foreach('Book1.csv') do |students|
  names << students.values_at(0)
  grades1 << reader.values_at(1)
  grades2 << reader.values_at(2)
  grades3 << reader.values_at(3)
end

while i < 10 do
  average[i] = grades1[i] + grades2[i] + grades3[i]
  i = i + 1
end

CSV.open('Book2.csv', 'w') do |writer|
  rows.each { |record| writer << record }
end
The while loop part is the part that I am most concerned with. Any insight?
If you have an array of values that you want to sum, you can use:
sum = array.inject(:+)
If you change your data structure to:
grades = [ [], [], [] ]
...
grades[0] << reader.values_at(1)
Then you can do:
0.upto(9) do |i|
  average[i] = (0..2).map{ |n| grades[n][i] }.inject(:+) / 3
end
There are a variety of ways to improve your data structures, the above being one of the least impactful to your code.
Any time you find yourself writing:
foo1 = ...
foo2 = ...
you should recognize it as a code smell and think about how you could organize your data into better collections.
Here's a rewrite of how I might do this. Notice that it works for any number of scores, not hardcoded to 3:
require 'csv'

averages = CSV.parse(DATA.read).map do |row|
  name, *grades = *row
  [ name, grades.map(&:to_i).inject(:+) / grades.length ]
end

puts averages.map(&:to_csv)
#=> Tom,80
#=> Adam,82
#=> Mike,94
#=> Dave,100
#=> Rob,75
#=> Nick,80
#=> Justin,93
#=> Jen,90
__END__
Tom,90,80,70
Adam,80,85,83
Mike,100,93,89
Dave,100,100,100
Rob,80,70,75
Nick,80,90,70
Justin,100,90,90
Jen,80,90,100
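One small note on the output above: inject(:+) / grades.length is integer division, which is why Adam comes out as 82 here while the expected output in the question has 83 (248 / 3 is 82.67, which rounds to 83 but truncates to 82). If you want rounding instead of truncation, the row could be built with fdiv, which returns a Float:

[ name, grades.map(&:to_i).inject(:+).fdiv(grades.length).round ]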

How to return a Ruby array intersection with duplicate elements? (problem with bigrams in Dice Coefficient)

I'm trying to script Dice's Coefficient, but I'm having a bit of a problem with the array intersection.
def bigram(string)
  string.downcase!
  bgarray = []
  bgstring = "%" + string + "#"
  bgslength = bgstring.length
  0.upto(bgslength - 2) do |i|
    bgarray << bgstring[i, 2]
  end
  return bgarray
end

def approx_string_match(teststring, refstring)
  test_bigram = bigram(teststring) #.uniq
  ref_bigram = bigram(refstring)   #.uniq
  bigram_overlay = test_bigram & ref_bigram
  result = (2 * bigram_overlay.length.to_f) / (test_bigram.length.to_f + ref_bigram.length.to_f) * 100
  return result
end
The problem is, as & removes duplicates, I get stuff like this:
string1 = "Almirante Almeida Almada"
string2 = "Almirante Almeida Almada"
puts approx_string_match(string1, string2) #=> 76.0
It should return 100.
The uniq method nails it, but there is information loss, which may bring unwanted matches in the particular dataset I'm working with.
How can I get an intersection with all duplicates included?
As Yuval F said, you should use a multiset. However, there is no multiset in the Ruby standard library; take a look here and here.
If performance is not that critical for your application, you can still do it using Array with a little bit of code.
def intersect(a, b)
  a.inject([]) do |intersect, s|
    index = b.index(s)
    unless index.nil?
      intersect << s
      b.delete_at(index) # note: this mutates b
    end
    intersect
  end
end

a = ["al", "al", "lc", "lc", "ld"]
b = ["al", "al", "lc", "ef"]
puts intersect(a, b).inspect #=> ["al", "al", "lc"]
From this link I believe you should not use Ruby's sets but rather multisets, so that every bigram gets counted the number of times it appears. Maybe you can use this gem for multisets. This should give correct behavior for recurring bigrams.
I toyed with this for a while, based on the answer from #pierr, and ended up with this.
a = ["al","al","lc","lc","lc","lc","ld"]
b = ["al","al","al","al","al","lc","ef"]
result=[]
h1,h2=Hash.new(0),Hash.new(0)
a.each{|x| h1[x]+=1}
b.each{|x| h2[x]+=1}
h1.each_pair{|key,val| result<<[key]*[val,h2[key]].min if h2[key]!=0}
result.flatten
=> ["al", "al", "lc"]
This could be a kind of multiset intersection of a and b, but don't take my word for it because I haven't tested it enough to be sure.
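To tie it back to the original question, here is a sketch of how a multiset-style intersection (using the intersect helper from the earlier answer) could replace the plain & inside approx_string_match, so that identical strings score 100; it is only an illustration, not thoroughly tested:

def approx_string_match(teststring, refstring)
  test_bigram = bigram(teststring)
  ref_bigram = bigram(refstring)
  # Multiset intersection keeps repeated bigrams, so identical strings score 100.
  # Pass a dup because intersect mutates its second argument.
  bigram_overlay = intersect(test_bigram, ref_bigram.dup)
  (2 * bigram_overlay.length.to_f) / (test_bigram.length + ref_bigram.length) * 100
end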
