How to find the "effective" end of a Range? - ruby

The .end of a Ruby range is the number used to specify the end of the range, whether or not the range is "exclusive of" the end. I don't understand the rationale for this design decision, but never the less, I was wondering what the most idiomatic mechanism is for determining the "effective end" of the range (i.e. greatest integer n such that range.include?(n) is true. The only mechanism I know of is last(1)[0], which seems pretty klunky.

Use the Range#max method:
r = 1...10
r.end
# => 10
r.max
# => 9
max won't work for exclusive ranges if any of the endpoints is Float (see range_max in range.c for details):
(1.0...10.0).max
# TypeError: cannot exclude non Integer end value
(1.0..10.0).max
# => 10.0
As an exception, if begin > end, no error is raised and nil is returned:
(2.0...1.0).max
# => nil
Update: max works in costant time for any inclusive range and exclusive ranges of integers, it returns the same value as end and end-1, respectively. It is also O(1) if for the range holds that begin > end, in this case nil is returned. For exlusive ranges of non integer objects (String, Date, ...) or when a block is passed, it calls Enumerable#max so it have to travel all the elements in the range!

Related

Ruby language curious integer arithmetic : (-5/2) != -(5/2)

I spent some time on a quite simple task about splitting an array. Until I found that: 2 == 5/2 and -3 == -5/2. To get -2 I need to pull the minus out of the parentheses: -2 == -(5/2). Why does this happen?
As I understand it, the result rounds to the smallest integer, but (-2.5).to_i == -2. Very curious.
# https://www.codewars.com/kata/swap-the-head-and-the-tail/train/ruby
# -5/2 != -(5/2)
def swap_head_tail a
a[-(a.size/2)..-1] + a[a.size/2...-(a.size/2)] + a[0...a.size/2]
end
Why does this happen?
It's not quite clear what kind of answer your are looking for other than because that is how it is specified (bold emphasis mine):
15.2.8.3.4 Integer#/
/(other)
Visibility: public
Behavior:
a) If other is an instance of the class Integer:
1) If the value of other is 0, raise a direct instance of the class ZeroDivisionError.
2) Otherwise, let n be the value of the receiver divided by the value of other. Return an instance of the class Integer whose value is the largest integer smaller than or equal to n.
NOTE The behavior is the same even if the receiver has a negative value. For example, -5 / 2 returns -3.
As you can see, the specification even contains your exact example.
It is also specified in the Ruby/Spec:
it "supports dividing negative numbers" do
(-1 / 10).should == -1
end
Compare this with the specification for Float#to_i (bold emphasis mine):
15.2.9.3.14 Float#to_i
to_i
Visibility: public
Behavior: The method returns an instance of the class Integer whose value is the integer part of the receiver.
And in the Ruby/Spec:
it "returns self truncated to an Integer" do
899.2.send(#method).should eql(899)
-1.122256e-45.send(#method).should eql(0)
5_213_451.9201.send(#method).should eql(5213451)
1.233450999123389e+12.send(#method).should eql(1233450999123)
-9223372036854775808.1.send(#method).should eql(-9223372036854775808)
9223372036854775808.1.send(#method).should eql(9223372036854775808)
end

`<=': comparison of String with nil failed (ArgumentError)

I am testing whether array elements are greater than or equal to elements of smaller indices.
I get the error message from the subject line if I use the following loop
return true if order.each_index {|i| order[i ] <= order[i+1]}
I understand the last element of my array(order) can't be compared to a non-existant element.
Comparing a value to nil is impossible.
I don't, however understand why the following loop doesn't return the same error
(0...(order.length - 1)).all? do |i|
order[i] <= order[i + 1]
end
It seems that at some point, i = order.length-1
This means order[i+1] is a nil value (order.length)
Apparently not?
No, because three dots ... here (0...(order.length - 1)) mean 'without last element', so last value would be order.length - 2.
You'll encounter the same error if you try (0..(order.length - 1)).
Check Range documentation:
Ranges may be constructed using the s..e and s...e literals
Those created using ... exclude the end value

Range and the mysteries that it covers going out of my range

I am trying to understand how range.cover? works and following seems confusing -
("as".."at").cover?("ass") # true and ("as".."at").cover?("ate") # false
This example in isolation is not confusing as it appears to be evaluated dictionary style where ass comes before at followed by ate.
("1".."z").cover?(":") # true
This truth seems to be based on ASCII values rather than dictionary style, because in a dictionary I'd expect all special characters to precede even digits and the confusion starts here. If what I think is true then how does cover? decide which comparison method to employ i.e. use ASCII values or dictionary based approach.
And how does range work with arrays. For example -
([1]..[10]).cover?([9,11,335]) # true
This example I expected to be false. But on the face of it looks like that when dealing with arrays, boundary values as well as cover?'s argument are converted to string and a simple dictionary style comparison yields true. Is that correct interpretation?
What kind of objects is Range equipped to handle? I know it can take numbers (except complex ones), handle strings, able to mystically work with arrays while boolean, nil and hash values among others cause it to raise ArgumentError: bad value for range
Why does ([1]..[10]).cover?([9,11,335]) return true
Let's take a look at the source. In Ruby 1.9.3 we can see a following definition.
static VALUE
range_cover(VALUE range, VALUE val)
{
VALUE beg, end;
beg = RANGE_BEG(range);
end = RANGE_END(range);
if (r_le(beg, val)) {
if (EXCL(range)) {
if (r_lt(val, end))
return Qtrue;
}
else {
if (r_le(val, end))
return Qtrue;
}
}
return Qfalse;
}
If the beginning of the range isn't lesser or equal to the given value cover? returns false. Here lesser or equal to is determined in terms of the r_lt function, which uses the <=> operator for comparison. Let's see how does it behave in case of arrays
[1] <=> [9,11,335] # => -1
So apparently [1] is indeed lesser than [9,11,335]. As a result we go into the body of the first if. Inside we check whether the range excludes its end and do a second comparison, once again using the <=> operator.
[10] <=> [9,11,335] # => 1
Therefore [10] is greater than [9,11,335]. The method returns true.
Why do you see ArgumentError: bad value for range
The function responsible for raising this error is range_failed. It's called only when range_check returns a nil. When does it happen? When the beginning and the end of the range are uncomparable (yes, once again in terms of our dear friend, the <=> operator).
true <=> false # => nil
true and false are uncomparable. The range cannot be created and the ArgumentError is raised.
On a closing note, Range.cover?'s dependence on <=> is in fact an expected and documented behaviour. See RubySpec's specification of cover?.

Generating integer within range from unique string in ruby

I have a code that should get unique string(for example, "d86c52ec8b7e8a2ea315109627888fe6228d") from client and return integer more than 2200000000 and less than 5800000000. It's important, that this generated int is not random, it should be one for one unique string. What is the best way to generate it without using DB?
Now it looks like this:
did = "d86c52ec8b7e8a2ea315109627888fe6228d"
min_cid = 2200000000
max_cid = 5800000000
cid = did.hash.abs.to_s.split.last(10).to_s.to_i
if cid < min_cid
cid += min_cid
else
while cid > max_cid
cid -= 1000000000
end
end
Here's the problem - your range of numbers has only 3.6x10^9 possible values where as your sample unique string (which looks like a hex integer with 36 digits) has 16^32 possible values (i.e. many more). So when mapping your string into your integer range there will be collisions.
The mapping function itself can be pretty straightforward, I would do something such as below (also, consider using only a part of the input string for integer conversion, e.g. the first seven digits, if performance becomes critical):
def my_hash(str, min, max)
range = (max - min).abs
(str.to_i(16) % range) + min
end
my_hash(did, min_cid, max_cid) # => 2461595789
[Edit] If you are using Ruby 1.8 and your adjusted range can be represented as a Fixnum, just use the hash value of the input string object instead of parsing it as a big integer. Note that this strategy might not be safe in Ruby 1.9 (per the comment by #DataWraith) as object hash values may be randomized between invocations of the interpreter so you would not get the same hash number for the same input string when you restart your application:
def hash_range(obj, min, max)
(obj.hash % (max-min).abs) + [min, max].min
end
hash_range(did, min_cid, max_cid) # => 3886226395
And, of course, you'll have to decide what to do about collisions. You'll likely have to persist a bucket of input strings which map to the same value and decide how to resolve the conflicts if you are looking up by the mapped value.
You could generate a 32-bit CRC, drop one bit, and add the result to 2.2M. That gives you a max value of 4.3M.
Alternately you could use all 32 bits of the CRC, but when the result is too large, append a zero to the input string and recalculate, repeating until you get a value in range.

Fastest way to get maximum value from an exclusive Range in ruby

Ok, so say you have a really big Range in ruby. I want to find a way to get the max value in the Range.
The Range is exclusive (defined with three dots) meaning that it does not include the end object in it's results. It could be made up of Integer, String, Time, or really any object that responds to #<=> and #succ. (which are the only requirements for the start/end object in Range)
Here's an example of an exclusive range:
past = Time.local(2010, 1, 1, 0, 0, 0)
now = Time.now
range = past...now
range.include?(now) # => false
Now I know I could just do something like this to get the max value:
range.max # => returns 1 second before "now" using Enumerable#max
But this will take a non-trivial amount of time to execute. I also know that I could subtract 1 second from whatever the end object is is. However, the object may be something other than Time, and it may not even support #-. I would prefer to find an efficient general solution, but I am willing to combine special case code with a fallback to a general solution (more on that later).
As mentioned above using Range#last won't work either, because it's an exclusive range and does not include the last value in it's results.
The fastest approach I could think of was this:
max = nil
range.each { |value| max = value }
# max now contains nil if the range is empty, or the max value
This is similar to what Enumerable#max does (which Range inherits), except that it exploits the fact that each value is going to be greater than the previous, so we can skip using #<=> to compare the each value with the previous (the way Range#max does) saving a tiny bit of time.
The other approach I was thinking about was to have special case code for common ruby types like Integer, String, Time, Date, DateTime, and then use the above code as a fallback. It'd be a bit ugly, but probably much more efficient when those object types are encountered because I could use subtraction from Range#last to get the max value without any iterating.
Can anyone think of a more efficient/faster approach than this?
The simplest solution that I can think of, which will work for inclusive as well as exclusive ranges:
range.max
Some other possible solutions:
range.entries.last
range.entries[-1]
These solutions are all O(n), and will be very slow for large ranges. The problem in principle is that range values in Ruby are enumerated using the succ method iteratively on all values, starting at the beginning. The elements do not have to implement a method to return the previous value (i.e. pred).
The fastest method would be to find the predecessor of the last item (an O(1) solution):
range.exclude_end? ? range.last.pred : range.last
This works only for ranges that have elements which implement pred. Later versions of Ruby implement pred for integers. You have to add the method yourself if it does not exist (essentially equivalent to special case code you suggested, but slightly simpler to implement).
Some quick benchmarking shows that this last method is the fastest by many orders of magnitude for large ranges (in this case range = 1...1000000), because it is O(1):
user system total real
r.entries.last 11.760000 0.880000 12.640000 ( 12.963178)
r.entries[-1] 11.650000 0.800000 12.450000 ( 12.627440)
last = nil; r.each { |v| last = v } 20.750000 0.020000 20.770000 ( 20.910416)
r.max 17.590000 0.010000 17.600000 ( 17.633006)
r.exclude_end? ? r.last.pred : r.last 0.000000 0.000000 0.000000 ( 0.000062)
Benchmark code is here.
In the comments it is suggested to use range.last - (range.exclude_end? ? 1 : 0). It does work for dates without additional methods, but will never work for non-numeric ranges. String#- does not exist and makes no sense with integer arguments. String#pred, however, can be implented.
I'm not sure about the speed (and initial tests don't seem incredibly fast), but the following might do what you need:
past = Time.local(2010, 1, 1, 0, 0, 0)
now = Time.now
range = past...now
range.to_a[-1]
Very basic testing (counting in my head) showed that it took about 4 seconds while the method you provided took about 5-6. Hope this helps.
Edit 1: Removed second solution as it was totally wrong.
I can't think there's any way to achieve this that doesn't involve enumerating the range, at least unless as already mentioned, you have other information about how the range will be constructed and therefore can infer the desired value without enumeration. Of all the suggestions, I'd go with #max, since it seems to be most expressive.
require 'benchmark'
N = 20
Benchmark.bm(30) do |r|
past, now = Time.local(2010, 2, 1, 0, 0, 0), Time.now
#range = past...now
r.report("range.max") do
N.times { last_in_range = #range.max }
end
r.report("explicit enumeration") do
N.times { #range.each { |value| last_in_range = value } }
end
r.report("range.entries.last") do
N.times { last_in_range = #range.entries.last }
end
r.report("range.to_a[-1]") do
N.times { last_in_range = #range.to_a[-1] }
end
end
user system total real
range.max 49.406000 1.515000 50.921000 ( 50.985000)
explicit enumeration 52.250000 1.719000 53.969000 ( 54.156000)
range.entries.last 53.422000 4.844000 58.266000 ( 58.390000)
range.to_a[-1] 49.187000 5.234000 54.421000 ( 54.500000)
I notice that the 3rd and 4th option have significantly increased system time. I expect that's related to the explicit creation of an array, which seems like a good reason to avoid them, even if they're not obviously more expensive in elapsed time.

Resources