Simple Ruby Rate Limiting - ruby

I am trying to build a very simple rate limit algorithm using an array.
For example, take the rate limit "5 requests every 5 minutes".
I have an array of timestamps; each element is a Time.now value that is appended when an API gets called (assuming the call is under the rate limit).
I also use a Mutex so that different threads can share the timestamp array without race conditions.
However, I'd like this array to be self-cleaning. If there are 5 (or more) elements in the array AND one or more of them fall outside the 5 minute interval, those entries should be removed automatically.
This is where I am stuck.
I have the following code:
def initialize(max, interval)
  @max, @interval = max, interval
  @m = Mutex.new
  @timestamp = []
end
def validate_rate
  @m.synchronize do
    if @timestamp.count > @max && self.is_first_ts_expired
      @timestamp.shift
      if self.rate_count < @max
        @timestamp << Time.now
        return false
      else
        return true
      end
    end
  end
end
def is_first_ts_expired
  return false if @@timestamp[@name].first.nil? # no logged entries = no expired timestamps
  return @@timestamp[@name].first <= Time.now - @interval
end
# Gets the number of requests that are under the allowed interval
def rate_count
  count = 0
  @timestamp.each { |x|
    if x >= Time.now - @interval
      count += 1
    end
  }
  count
end
The following is how you will call this simple class. rl.validate_rate will return true if it's under the rate limit, but false if it's above. And ideally it will self-clean the timestamp array when it's greater than the max variable.
rl = RateLimit.new(5, 5.minutes)
raise RateLimitException unless rl.validate_rate do
# stuff
end
Is the "clean up" check (is_first_ts_expired) called in the right place?

I think this is a totally valid approach.
Two quick notes:
1) It seems like you're only allowing insertion into the array when there are fewer than the max number of elements:
if rate_count < @max
  @timestamp << Time.now
  return true
else
  return false
end
However, you're also only clearing out expired elements when there are more than the allowed number of elements in the array:
if @timestamp.count > @max && is_first_ts_expired
  @timestamp.shift
I think in order to get this working, you want to remove that first condition when you are checking whether to clear elements from the array. It will look something like this:
if is_first_ts_expired
  @timestamp.shift
2) You will only ever clean one item out of your array here:
if is_first_ts_expired
  @timestamp.shift
To make this solution more robust, you may want to replace the if with a while so you can clean out multiple expired items. For example:
while is_first_ts_expired do
  @timestamp.shift
end
Updated based on comment below:
Since you'll potentially be going through all of the timestamps if they are all expired, you'll want to slightly modify is_first_ts_expired to handle an empty timestamp array. Something like this:
def is_first_ts_expired
  current_ts = @timestamp.first
  current_ts && current_ts <= Time.now - @interval
end
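Putting both notes together, a minimal sketch of the whole limiter might look like the following. The class name SlidingWindowLimiter, the method name allow?, and the optional now parameter (added only to make the behavior easy to test) are assumptions for illustration, not part of the original code. The interval is given in plain seconds, so no ActiveSupport is needed.

```ruby
# Hypothetical sliding-window limiter; names are illustrative assumptions.
class SlidingWindowLimiter
  def initialize(max, interval)
    @max = max             # maximum calls allowed per window
    @interval = interval   # window length in seconds
    @mutex = Mutex.new
    @timestamps = []
  end

  # Returns true (and records the call) when under the limit, false otherwise.
  def allow?(now = Time.now)
    @mutex.synchronize do
      cutoff = now - @interval
      # Drop every expired entry, not just the first one; safe on an empty array.
      @timestamps.shift while @timestamps.first && @timestamps.first <= cutoff
      if @timestamps.size < @max
        @timestamps << now
        true
      else
        false
      end
    end
  end
end

limiter = SlidingWindowLimiter.new(5, 5 * 60)
puts limiter.allow? ? "request allowed" : "rate limit hit"
```

Because the cleanup runs on every call before the count is checked, the array never holds more than max live entries, and rate_count is no longer needed.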

Related

Iterating over big arrays with limited memory and time of execution

I’m having trouble using Ruby to pass some tests that make the array too big and return an error.
Solution.rb: failed to allocate memory (NoMemoryError)
I have failed to pass it twice.
The problem is about scheduling meetings. The method receives two parameters in order: a matrix with all the first days that investors can meet in the company, and a matrix with all the last days.
For example:
firstDay = [1,5,10]
lastDay = [4,10,10]
This means that the first investor can meet on any of the days 1..4, the second on any of the days 5..10, and the last one only on day 10.
I need to return the largest number of investors that the company will serve. In this case, all of them can be attended to, the first one on day 1, the second one on day 5, and the last one on day 10.
So far, the code works normally, but with some hidden tests with at least 1000 investors, the error I mentioned earlier appears.
Is there a best practice in Ruby to handle this?
My current code is:
def countMeetings(firstDay, lastDay)
  GC::Profiler.enable
  GC::Profiler.clear
  first = firstDay.sort.first
  last = lastDay.sort.last
  available = []
  # Construct the available days for meetings
  firstDay.each_with_index do |d, i|
    available.push((firstDay[i]..lastDay[i]).to_a)
  end
  available = available.flatten.uniq.sort
  investors = {}
  attended_day = []
  attended_investor = []
  # Construct a list of investors based on their first and last days
  firstDay.each_index do |i|
    investors[i+1] = (firstDay[i]..lastDay[i]).to_a
  end
  for day in available
    investors.each do |key, value|
      next if attended_investor.include?(key)
      if value.include?(day)
        next if attended_day.include?(day)
        attended_day.push(day)
        attended_investor.push(key)
      end
    end
  end
  attended_investor.size
end
Using Lazy, as far as I could understand it, I escaped the NoMemoryError, but I started receiving a runtime error:
Your code was not executed on time. Allowed time: 10s
And my code looks like this:
def countMeetings(firstDay, lastDay)
  loop_size = firstDay.size
  first = firstDay.sort.first
  last = lastDay.sort.last
  daily_attendance = {}
  (first..last).each do |day|
    for ind in 0...loop_size
      (firstDay[ind]..lastDay[ind]).lazy.each do |investor_day|
        next if daily_attendance.has_value?(ind)
        if investor_day == day
          daily_attendance[day] = ind
        end
      end
    end
  end
  daily_attendance.size
end
And it went through the cases with few investors. I thought about using multi-thread and the code became the following:
def countMeetings(firstDay, lastDay)
  loop_size = firstDay.size
  first = firstDay.sort.first
  last = lastDay.sort.last
  threads = []
  daily_attendance = {}
  (first..last).lazy.each_slice(25000) do |slice|
    slice.each do |day|
      threads << Thread.new do
        for ind in 0...loop_size
          (firstDay[ind]..lastDay[ind]).lazy.each do |investor_day|
            next if daily_attendance.has_value?(ind)
            if investor_day == day
              daily_attendance[day] = ind
            end
          end
        end
      end
    end
  end
  threads.each{|t| t.join}
  daily_attendance.size
end
Unfortunately, it went back to the MemoryError.
This can be done without consuming any more memory than the range of days. The key is to avoid Arrays and keep things as Enumerators as much as possible.
First, rather than the awkward pair of Arrays that need to be converted into Ranges, pass in an Enumerable of Ranges. This both simplifies the method, and it allows it to be Lazy if the list of ranges is very large. It could be read from a file, fetched from a database or an API, or generated by another lazy enumerator. This saves you from requiring big arrays.
Here's an example using an Array of Ranges.
p count_meetings([(1..4), (5..10), (10..10)])
Or to demonstrate transforming your firstDay and lastDay Arrays into a lazy Enumerable of Ranges...
firstDays = [1,5,10]
lastDays = [4,10,10]
p count_meetings(
  firstDays.lazy.zip(lastDays).map { |first,last|
    (first..last)
  }
)
firstDays.lazy makes everything that comes after lazy. .zip(lastDays) iterates through both Arrays in pairs: [1,4], [5,10], and [10,10]. Then we turn them into Ranges. Because it's lazy it will only map them as needed. This avoids making another big Array.
Now that's fixed, all we need to do is iterate over each Range and increment their attendance for the day.
def count_meetings(attendee_ranges)
  # Make a Hash whose default values are 0.
  daily_attendance = Hash.new(0)
  # For each attendee
  attendee_ranges.each { |range|
    # For each day they will attend, add one to the attendance for that day.
    range.each { |day| daily_attendance[day] += 1 }
  }
  # Return the highest attendance count over all days.
  # (Hash#max would compare [day, count] pairs by day first, so use the values.)
  daily_attendance.values.max
end
Memory growth is limited to the size of the day range. If the earliest attendee is on day 1 and the last is on day 1000, daily_attendance has just 1000 entries, which is a long time for a conference.
And since you've built the whole Hash anyway, why waste it? Write one function that returns the full attendance, and another that extracts the max.
def count_meeting_attendance(attendee_ranges)
  daily_attendance = Hash.new(0)
  attendee_ranges.each { |range|
    range.each { |day| daily_attendance[day] += 1 }
  }
  return daily_attendance
end
def max_meeting_attendance(*args)
  # Take the maximum over the counts, not over the [day, count] pairs.
  count_meeting_attendance(*args).values.max
end
Since this is an exercise and you're stuck with the wonky arguments, we can do the same trick and lazily zip firstDays and lastDays together and turn them into Ranges.
def count_meeting_attendance(firstDays, lastDays)
  attendee_ranges = firstDays.lazy.zip(lastDays).map { |first,last|
    (first..last)
  }
  daily_attendance = Hash.new(0)
  attendee_ranges.each { |range|
    range.each { |day| daily_attendance[day] += 1 }
  }
  return daily_attendance
end
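Note that the question as stated asks for the largest number of investors that can each be given a distinct day, which is a different quantity from the per-day attendance counted above. A hedged sketch of one common greedy approach to that reading (handle investors in order of their last day and give each the earliest still-free day in their range) follows. It is not taken from the answer; count_meetings_greedy and next_free are hypothetical names, and the "one meeting per day" rule is an assumption drawn from the question's example.

```ruby
# Greedy sketch: earliest deadline first, earliest free day first.
# Assumes at most one investor can be served per day.
def count_meetings_greedy(first_days, last_days)
  next_free = {} # taken day -> a later day that may still be free

  # Follow the chain of taken days to the first free one, compressing the path.
  find_free = lambda do |day|
    path = []
    while next_free.key?(day)
      path << day
      day = next_free[day]
    end
    path.each { |d| next_free[d] = day }
    day
  end

  served = 0
  first_days.zip(last_days).sort_by { |first, last| [last, first] }.each do |first, last|
    day = find_free.call(first)
    next if day > last        # no free day left inside this investor's range
    next_free[day] = day + 1  # mark the day as taken
    served += 1
  end
  served
end

p count_meetings_greedy([1, 5, 10], [4, 10, 10])
```

Memory use here grows with the number of investors served rather than with the width of the day ranges, since no range is ever expanded into an array.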

Infinity is returned when calculating average in array

Why does the following method return infinity when trying to find the average volume of a stock:
class Statistics
  def self.averageVolume(stocks)
    values = Array.new
    stocks.each do |stock|
      values.push(stock.volume)
    end
    values.reduce(:+).to_f / values.size
  end
end
class Stock
  attr_reader :date, :open, :high, :low, :close, :adjusted_close, :volume
  def initialize(date, open, high, low, close, adjusted_close, volume)
    @date = date
    @open = open
    @high = high
    @low = low
    @close = close
    @adjusted_close = adjusted_close
    @volume = volume
  end
  def close
    @close
  end
  def volume
    @volume
  end
end
CSV.foreach(fileName) do |stock|
  entry = Stock.new(stock[0], stock[1], stock[2], stock[3], stock[4], stock[5], stock[6])
  stocks.push(entry)
end
Here is how the method is called:
Statistics.averageVolume(stocks)
Output to console using a file that has 251 rows:
stock.rb:32: warning: Float 23624900242507002003... out of range
Infinity
Warning is called on the following line: values.reduce(:+).to_f / values.size
When writing average functions you'll want to pay close attention to the possibility of division by zero.
Here's a fixed and more Ruby-like implementation:
def self.average_volume(stocks)
  # No data in means no data out, can't calculate.
  return if stocks.empty?
  # Pick out the `volume` value from each stock, then combine
  # those with + using 0.0 as a default. This forces all of
  # the subsequent values to be floating-point.
  stocks.map(&:volume).reduce(0.0, &:+) / stocks.size
end
In Ruby it's strongly recommended to keep variable and method names in the x_y form, like average_volume here. Capitals have significant meaning and indicate constants like class, module and constant names.
You can test this method using a mock Stock:
require 'ostruct'
stocks = 10.times.map do |n|
  OpenStruct.new(volume: n)
end
Statistics.average_volume(stocks)
# => 4.5
Statistics.average_volume([ ])
# => nil
If you're still getting infinity it's probably because you have a broken value somewhere in there for volume which is messing things up. You can try and filter those out:
stocks.map(&:volume).reject(&:nan?)...
Testing with nan? (which is defined on Float values) might be what you need to strip out junk data.
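Given the warning text in the question (one very long run of digits reported as out of range), one plausible cause is that the volumes are still strings: CSV.foreach yields every field as a String unless a converter is configured, so reduce(:+) concatenates the digits and to_f then overflows. A minimal sketch under that assumption, using a stand-in Stock struct rather than the question's class:

```ruby
# Stand-in for the question's Stock class; only volume matters here.
Stock = Struct.new(:volume)

def average_volume(stocks)
  return nil if stocks.empty?
  # Float() converts "123" to 123.0 and raises ArgumentError on junk such as
  # a header cell, instead of silently treating it as 0.0 like to_f would.
  stocks.sum { |stock| Float(stock.volume) } / stocks.size
end

rows = [Stock.new("10"), Stock.new("20")]
p rows.map(&:volume).reduce(:+) # strings are concatenated, not added
p average_volume(rows)
```

If the file has a header row, it would need to be skipped (or the CSV read with headers enabled) before the conversion, since Float raises on non-numeric text.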

Manipulating Ruby class default values

When there is more than one default value, how could I change only the second initialization variable without also calling the first?
For example, a Ruby class is created to return the value akin to the roll of a single die with default values for a six sided die ranging from 1 to 6:
class Die
  def initialize(min=1, max=6)
    @min = min
    @max = max
  end
  def roll
    rand(@min..@max)
  end
end
If I wanted instead to use this code to simulate the return from rolling a 20 sided die, I could write the following:
p Die.new(min=1, max=20).roll
...but is there a way to argue only the second (max) value?
Of note - and this is where I am confused (I don't fully understand Ruby class attributes and variable scopes) - if I invoke:
p Die.new(max=20).roll
... I get nil printed. (I understand that this is because rand(20..6) returns nil, but I thought that max= would retain the default min value for the first argument. Instead, max=20 gets ingested as the integer 20 and binds to min. This seems weird to me.)
I suppose I could re-work the Die class to take a default value of the number of sides and also set the min (or max) value relative to the number of sides, but this is beside the point of my main question: How to override only the second default value without explicitly writing the first as well...
Presuming that most dice would normally have a minimum value of 1, I realize that I could reverse the order of min and max like so:
class Die2
  def initialize(max=6, min=1)
    @max = max
    @min = min
  end
  def roll
    rand(@min..@max)
  end
end
...and then invoke whatever maximum number of sides like so:
p Die2.new(20).roll
...but given the syntax of class Die (and my inclination to write the minimum before the maximum) is there a way to only enter an argument for the second value? Or, perhaps I am approaching Ruby classes poorly? Any help or guidance is appreciated - thanks!
If you write
class Die
  def initialize(min=1, max=6)
    @min, @max = min, max
  end
end
and create a new instance by passing a single argument, such as:
die = Die.new(3)
#=> #<Die:0x007fcc6902a700 @min=3, @max=6>
we can see from the return value that the argument 3 has been assigned to @min and @max gets its default value. In short, to pass a value to @max you must also pass one to @min (unless, of course, you reverse the order of the arguments).
You can do what you want by using named arguments (or named parameters), introduced in Ruby v2.0.
class Die
  def initialize(min: 1, max: 6)
    @min, @max = min, max
  end
end
die = Die.new(max: 3)
#=> #<Die:0x007fcc698ccc00 @min=1, @max=3>
(or die = Die.new(:max=>3)). As you see, @min equals its default value and @max equals the argument that is passed, 3.
Default values were required for keyword arguments in Ruby v2.0, but v2.1 extended their functionality to permit required named arguments as well. See, for example, this article.
Lastly, consider the following two cases (the second being the more interesting).
class Die
  def initialize(min=1, max: 6)
    @min, @max = min, max
  end
end
die = Die.new(max: 3)
#=> #<Die:0x007fcc69954448 @min=1, @max=3>
class Die
  def initialize(min, max: 6)
    @min, @max = min, max
  end
end
die = Die.new(max: 3)
#=> #<Die:0x007fa01b900930 @min={:max=>3}, @max=6>
# (Ruby 2.x behavior; Ruby 3.0 and later raise ArgumentError here instead.)
In Ruby 2.0 and higher you can use keyword arguments to achieve the same effect:
class Die
  def initialize(min: 1, max: 6) #<--declare arguments like this
    @min = min
    @max = max
  end
  def roll
    rand(@min..@max)
  end
end
p Die.new(max: 20).roll #<--and call like this
https://repl.it/Dyxn/0
You can read more about keyword arguments in this article.
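To see why Die.new(max=20) behaves the way the question describes: inside an argument list, max=20 is an ordinary local-variable assignment, and its value (20) is then passed positionally. The sketch below contrasts that with keyword arguments; bounds and keyword_bounds are hypothetical helpers that simply mirror the defaults used by Die.

```ruby
# Hypothetical helper with the same positional defaults as the original Die.
def bounds(min = 1, max = 6)
  [min, max]
end

# `max = 20` assigns a local variable, then passes 20 as the FIRST argument.
positional = bounds(max = 20)

# Keyword version: the name used in the call actually selects the parameter.
def keyword_bounds(min: 1, max: 6)
  [min, max]
end

by_keyword = keyword_bounds(max: 20)

p positional # min received the 20, max kept its default
p by_keyword # only max was overridden
```

So the positional call ends up with the range 20..6, which is empty, and that is why rand returned nil in the question.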

Process something in each element of a matrix and return a value

I have matrix like this:
my_matrix = [['regular', '16/03/2009', '17/03/2009', '18/03/2009'],
['regular', '20/03/2009', '21/03/2009', '22/03/2009'],
['rewards', '26/03/2009', '27/03/2009', '28/03/2009']]
For each row, I need to check whether the first element is 'regular' or 'rewards', check each of the dates that follow it, process something, and return a value.
For example:
['regular', '20/03/2009', '21/03/2009', '22/03/2009']
The first element is 'regular' and, I need to loop through the rest of the array verifying if each date is a weekday or a weekend and then process something. If there are more weekends than weekdays process something , else, process another thing.
I've tried this:
HOTELS = {
  :RIDGEWOOD => 'RidgeWood',
  :LAKEWOOD => 'LakeWood',
  :BRIDGEWOOD => 'BridgeWood'
}
def weekend?(date)
  datetime = DateTime.parse(date.to_s)
  datetime.saturday? || datetime.sunday?
end
def find_the_cheapest_hotel(text_file)
  @costumer_request = File.open(text_file){|io| io.each_line.map{|line| line.split(/[:,\s]+/)}}
  @costumer_request.each do |line|
    line.each do |value|
      if line.shift == 'regular'
        if weekend?(line)
          print 'weekend regular'
        else
          print 'weekday regular'
        end
      elsif line.shift == 'rewards'
        if weekend?(line)
          print 'weekend rewards'
        else
          print 'weekday rewards'
        end
      end
    end
  end
end
It takes ['regular', '16/03/2009', '17/03/2009', '18/03/2009'] and prints this:
weekday weekday weekday
I want to process every array, not only the first one.
You still have not said what the problem is, and your code does not match your initial description very well. But I can certainly point to a place where things are going wrong at the outset:
if line.shift == 'regular'
  # ...
elsif line.shift == 'rewards'
  # ...
end
Think about it. The initial if calls shift, and therefore it does in fact shift the array. The first element of the array is now gone forever. So suppose it was not regular. So now we get to the elsif condition. But I can tell you for a fact that this condition will never be true, because if the first element of the array was rewards, it is now lost; it has been removed from the array (the first element is now a date).
So, instead of shifting, just examine line[0] in both conditions. You can shift later when it's time to walk the rest of the array.
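A minimal sketch of that idea: split each row into its type and its dates without mutating the row, then count weekend and weekday dates. The helper name classify and the returned hash layout are assumptions for illustration; the dates are parsed with an explicit day/month/year format to match the sample data.

```ruby
require 'date'

# True when a 'dd/mm/yyyy' string falls on a Saturday or Sunday.
def weekend?(date_string)
  date = Date.strptime(date_string, '%d/%m/%Y')
  date.saturday? || date.sunday?
end

# Hypothetical helper: summarises one row without calling shift on it.
def classify(row)
  type, *dates = row # non-destructive split of type and dates
  weekends, weekdays = dates.partition { |d| weekend?(d) }
  { type: type, weekends: weekends.size, weekdays: weekdays.size }
end

my_matrix = [['regular', '16/03/2009', '17/03/2009', '18/03/2009'],
             ['regular', '20/03/2009', '21/03/2009', '22/03/2009'],
             ['rewards', '26/03/2009', '27/03/2009', '28/03/2009']]

my_matrix.each { |row| p classify(row) }
```

From the counts, the "more weekends than weekdays" branch becomes a plain comparison of the two numbers for each row.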

Lychrel numbers

First of all, for those of you, who don't know (or forgot) about Lychrel numbers, here is an entry from Wikipedia: http://en.wikipedia.org/wiki/Lychrel_number.
I want to implement the Lychrel number detector in the range from 0 to 10_000. Here is my solution:
class Integer
  # Return a reversed integer number, e.g.:
  #
  #   1632.reverse #=> 2361
  #
  def reverse
    self.to_s.reverse.to_i
  end
  # Check, whether given number
  # is the Lychrel number or not.
  #
  def lychrel?(depth=30)
    if depth == 0
      return true
    elsif self == self.reverse and depth != 30 # [1]
      return false
    end
    # In case both statements are false, try
    # recursive "reverse and add" again.
    (self + self.reverse).lychrel?(depth-1)
  end
end
puts (0..10000).find_all(&:lychrel?)
The issue with this code is the depth value [1]. So, basically, depth is a value, that defines how many times we need to proceed through the iteration process, to be sure, that current number is really a Lychrel number. The default value is 30 iterations, but I want to add more latitude, so programmer can specify his own depth through method's parameter. The 30 iterations is perfect for such small range as I need, but if I want to cover all natural numbers, I have to be more agile.
Because of the recursion, that takes a place in Integer#lychrel?, I can't be agile. If I had provided an argument to the lychrel?, there wouldn't have been any changes because of the [1] statement.
So, my question sounds like this: "How do I refactor my method, so it will accept parameters correctly?".
What you currently have is known as tail recursion. This can usually be re-written as a loop to get rid of the recursive call and eliminate the risk of running out of stack space. Try something more like this:
def lychrel?(depth=30)
  val = self
  first_iteration = true
  while depth > 0 do
    # Return false if the number has become a palindrome,
    # but allow a palindrome as input
    if first_iteration
      first_iteration = false
    else
      if val == val.reverse
        return false
      end
    end
    # Perform next iteration
    val = (val + val.reverse)
    depth = depth - 1
  end
  return true
end
I don't have Ruby installed on this machine so I can't verify whether that's 100% correct, but you get the idea. Also, I'm assuming that the purpose of the and depth != 30 bit is to allow a palindrome to be provided as input without immediately returning false.
By looping, you can use a state variable like first_iteration to keep track of whether or not you need to do the val == val.reverse check. With the recursive solution, scoping limitations prevent you from tracking this easily (you'd have to add another function parameter and pass the state variable to each recursive call in turn).
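For comparison, the recursive alternative mentioned above (carrying the state as an extra parameter) could be sketched like this. It uses standalone methods rather than reopening Integer, and the names reverse_digits, lychrel_candidate? and first_call are assumptions made for this example.

```ruby
# Reverse the decimal digits of a non-negative integer.
def reverse_digits(n)
  n.to_s.reverse.to_i
end

# Recursive variant: first_call replaces the `depth != 30` comparison,
# so any depth can be passed in without breaking the palindrome check.
def lychrel_candidate?(n, depth = 30, first_call = true)
  return true if depth.zero?
  return false if !first_call && n == reverse_digits(n)
  lychrel_candidate?(n + reverse_digits(n), depth - 1, false)
end

p lychrel_candidate?(196)     # no palindrome within 30 steps
p lychrel_candidate?(56)      # 56 + 65 = 121, a palindrome
p lychrel_candidate?(196, 60) # a custom depth now works as expected
```

The loop-based version remains the safer choice for large depths, since Ruby does not guarantee tail-call elimination.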
A cleaner, more Ruby-like solution:
class Integer
  def reverse
    self.to_s.reverse.to_i
  end
  def lychrel?(depth=50)
    n = self
    depth.times do |i|
      r = n.reverse
      return false if i > 0 and n == r
      n += r
    end
    true
  end
end
puts (0...10000).find_all(&:lychrel?) #=> 249 numbers
bta's solution with some corrections:
class Integer
  def reverse
    self.to_s.reverse.to_i
  end
  def lychrel?(depth=30)
    this = self
    first_iteration = true
    begin
      if first_iteration
        first_iteration = false
      elsif this == this.reverse
        return false
      end
      this += this.reverse
      depth -= 1
    end while depth > 0
    return true
  end
end
puts (1..10000).find_all { |num| num.lychrel?(255) }
Not so fast, but it works:
code/practice/ruby% time ruby lychrel.rb > /dev/null
ruby lychrel.rb > /dev/null 1.14s user 0.00s system 99% cpu 1.150 total
