I've just started learning Ruby and Ruby on Rails and came across validation code that uses ranges:
validates_inclusion_of :age, :in => 21..99
validates_exclusion_of :age, :in => 0...21, :message => "Sorry, you must be over 21"
At first I thought the difference was in the inclusion of endpoints, but in the API docs I looked into, it didn't seem to matter whether it was .. or ...: it always included the endpoints.
However, I did some testing in irb and it seemed to indicate that .. includes both endpoints, while ... only included the lower bound but not the upper one. Is this correct?
The documentation for Range† says this:
Ranges constructed using .. run from the beginning to the end inclusively. Those created using ... exclude the end value.
So a..b is like a <= x <= b, whereas a...b is like a <= x < b.
Note that, while to_a on a Range of integers gives a collection of integers, a Range is not a set of values, but simply a pair of start/end values:
(1..5).include?(5) #=> true
(1...5).include?(5) #=> false
(1..4).include?(4.1) #=> false
(1...5).include?(4.1) #=> true
(1..4).to_a == (1...5).to_a #=> true
(1..4) == (1...5) #=> false
†The docs used to not include this, instead requiring reading the Pickaxe’s section on Ranges. Thanks to #MarkAmery (see below) for noting this update.
That is correct.
1.9.3p0 :005 > (1...10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9]
1.9.3p0 :006 > (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
The triple-dot syntax is less common, but is nicer than (1..10-1).to_a
The API docs now describe this behaviour:
Ranges constructed using .. run from the beginning to the end inclusively. Those created using ... exclude the end value.
-- http://ruby-doc.org/core-2.1.3/Range.html
In other words:
2.1.3 :001 > ('a'...'d').to_a
=> ["a", "b", "c"]
2.1.3 :002 > ('a'..'d').to_a
=> ["a", "b", "c", "d"]
a...b excludes the end value, while a..b includes the end value.
When working with integers, a...b behaves as a..b-1.
>> (-1...3).to_a
=> [-1, 0, 1, 2]
>> (-1..2).to_a
=> [-1, 0, 1, 2]
>> (-1..2).to_a == (-1...3).to_a
=> true
But really the ranges differ on a real number line.
>> (-1..2) == (-1...3)
=> false
You can see this when incrementing in fractional steps.
>> (-1..2).step(0.5).to_a
=> [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
>> (-1...3).step(0.5).to_a
=> [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
.. and ... denote a range.
Just see it in irb:
ruby-1.9.2-p290 :032 > (1...2).each do puts "p" end
p
=> 1...2
ruby-1.9.2-p290 :033 > (1..2).each do puts "p" end
p
p
I would like to extend the Array class with a uniq_elements method which returns those elements with multiplicity of one. I also would like to use closures to my new method as with uniq. For example:
t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements # => [1,3,5,6,8]
Example with closure:
t=[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements{|z| z.round} # => [2.0, 5.1]
Neither t-t.uniq nor t.to_set-t.uniq.to_set works. I don't care of speed, I call it only once in my program, so it can be a slow.
Helper method
This method uses the helper:
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
This method is similar to Array#-. The difference is illustrated in the following example:
a = [3,1,2,3,4,3,2,2,4]
b = [2,3,4,4,3,4]
a - b #=> [1]
c = a.difference b #=> [1, 3, 2, 2]
As you see, a contains three 3's and b contains two, so the first two 3's in a are removed in constructing c (a is not mutated). When b contains as least as many instances of an element as does a, c contains no instances of that element. To remove elements beginning at the end of a:
a.reverse.difference(b).reverse #=> [3, 1, 2, 2]
Array#difference! could be defined in the obvious way.
I have found many uses for this method: here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here and here.
I have proposed that this method be added to the Ruby core.
When used with Array#-, this method makes it easy to extract the unique elements from an array a:
a = [1,3,2,4,3,4]
u = a.uniq #=> [1, 2, 3, 4]
u - a.difference(u) #=> [1, 2]
This works because
a.difference(u) #=> [3,4]
contains all the non-unique elements of a (each possibly more than once).
Problem at Hand
Code
class Array
def uniq_elements(&prc)
prc ||= ->(e) { e }
a = map { |e| prc[e] }
u = a.uniq
uniques = u - a.difference(u)
select { |e| uniques.include?(prc[e]) ? (uniques.delete(e); true) : false }
end
end
Examples
t = [1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements
#=> [1,3,5,6,8]
t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |z| z.round }
# => [2.0, 5.1]
Here's another way.
Code
require 'set'
class Array
def uniq_elements(&prc)
prc ||= ->(e) { e }
uniques, dups = {}, Set.new
each do |e|
k = prc[e]
((uniques.key?(k)) ? (dups << k; uniques.delete(k)) :
uniques[k] = e) unless dups.include?(k)
end
uniques.values
end
end
Examples
t = [1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements #=> [1,3,5,6,8]
t = [1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
t.uniq_elements { |z| z.round } # => [2.0, 5.1]
Explanation
if uniq_elements is called with a block, it is received as the proc prc.
if uniq_elements is called without a block, prc is nil, so the first statement of the method sets prc equal to the default proc (lambda).
an initially-empty hash, uniques, contains representations of the unique values. The values are the unique values of the array self, the keys are what is returned when the proc prc is passed the array value and called: k = prc[e].
the set dups contains the elements of the array that have found to not be unique. It is a set (rather than an array) to speed lookups. Alternatively, if could be a hash with the non-unique values as keys, and arbitrary values.
the following steps are performed for each element e of the array self:
k = prc[e] is computed.
if dups contains k, e is a dup, so nothing more needs to be done; else
if uniques has a key k, e is a dup, so k is added to the set dups and the element with key k is removed from uniques; else
the element k=>e is added to uniques as a candidate for a unique element.
the values of unique are returned.
class Array
def uniq_elements
counts = Hash.new(0)
arr = map do |orig_val|
converted_val = block_given? ? (yield orig_val) : orig_val
counts[converted_val] += 1
[converted_val, orig_val]
end
uniques = []
arr.each do |(converted_val, orig_val)|
uniques << orig_val if counts[converted_val] == 1
end
uniques
end
end
t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
p t.uniq_elements
t=[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
p t.uniq_elements { |elmt| elmt.round }
--output:--
[1, 3, 5, 6, 8]
[2.0, 5.1]
Array#uniq does not find non-duplicated elements, rather Array#uniq removes duplicates.
Use Enumerable#tally:
class Array
def uniq_elements
tally.select { |_obj, nb| nb == 1 }.keys
end
end
t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
t.uniq_elements # => [1,3,5,6,8]
If you are using Ruby < 2.7, you can get tally with the backports gem
require 'backports/2.7.0/enumerable/tally'
class Array
def uniq_elements
zip( block_given? ? map { |e| yield e } : self )
.each_with_object Hash.new do |(e, v), h| h[v] = h[v].nil? ? [e] : false end
.values.reject( &:! ).map &:first
end
end
[1,2,2,3,4,4,5,6,7,7,8,9,9,9].uniq_elements #=> [1, 3, 5, 6, 8]
[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2].uniq_elements &:round #=> [2.0, 5.1]
Creating and calling a default proc is a waste of time, and
Cramming everything into one line using tortured constructs doesn't make the code more efficient--it just makes the code harder to understand.
In require statements, rubyists don't capitalize file names.
....
require 'set'
class Array
def uniq_elements
uniques = {}
dups = Set.new
each do |orig_val|
converted_val = block_given? ? (yield orig_val) : orig_val
next if dups.include? converted_val
if uniques.include?(converted_val)
uniques.delete(converted_val)
dups << converted_val
else
uniques[converted_val] = orig_val
end
end
uniques.values
end
end
t=[1,2,2,3,4,4,5,6,7,7,8,9,9,9]
p t.uniq_elements
t=[1.0, 1.1, 2.0, 3.0, 3.4, 4.0, 4.2, 5.1, 5.7, 6.1, 6.2]
p t.uniq_elements {|elmt|
elmt.round
}
--output:--
[1, 3, 5, 6, 8]
[2.0, 5.1]
Is there a standard method in ruby similar to (1...4).to_a is [1,2,3,4] except reverse i.e. (4...1).to_a would be [4,3,2,1]?
I realize this can easily be defined via (1...4).to_a.reverse but it strikes me as odd that it is not already and 1) am I missing something? 2) if not, is there a functional/practical reason it is not already?
The easiest is probably this:
4.downto(1).to_a #=> [4, 3, 2, 1]
Alternatively you can use step:
4.step(1,-1).to_a #=> [4, 3, 2, 1]
Finally a rather obscure solution for fun:
(-4..-1).map(&:abs) #=> [4, 3, 2, 1]
(1...4) is a Range. Ranges in ruby are not like arrays; one if their advantages is you can create a range like
(1..1e9)
without taking up all of your machine's memory. Also, you can create this range:
r = (1.0...4.0)
Which means "the set of all floating point numbers from 1.0 to 4.0, including 1.0 but not 4.0"
In other words:
irb(main):013:0> r.include? 3.9999
=> true
irb(main):014:0> r.include? 3.99999999999
=> true
irb(main):015:0> r.include? 4.0
=> false
you can turn an integer Range into an array:
irb(main):022:0> (1..4).to_a
=> [1, 2, 3, 4]
but not a floating point range:
irb(main):023:0> (1.0...4.0).to_a
TypeError: can't iterate from Float
from (irb):23:in `each'
from (irb):23:in `to_a'
from (irb):23
from /home/mslade/rubygems1.9/bin/irb:12:in `<main>'
Because there is no natural way to iterate over floating point numbers. Instead you use #step:
irb(main):015:0> (1..4).step(0.5).to_a
=> [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
irb(main):016:0> (1...4).step(0.5).to_a
=> [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
If you need to iterate backwards through a large integer range, use Integer#downto.
You could patch Range#to_a to automatically work with reverse like this:
class Range
alias :to_a_original :to_a
def reverse
Range.new(last, first)
end
def to_a
(first < last) ? to_a_original : reverse.to_a_original.reverse
end
end
Result:
(4..1).to_a
=> [4, 3, 2, 1]
This approach is called "re-opening" the class a.k.a. "monkey-patching". Some developers like this approach because it's adding helpful functionality, some dislike it because it's messing with Ruby core.)
I've just started learning Ruby and Ruby on Rails and came across validation code that uses ranges:
validates_inclusion_of :age, :in => 21..99
validates_exclusion_of :age, :in => 0...21, :message => "Sorry, you must be over 21"
At first I thought the difference was in the inclusion of endpoints, but in the API docs I looked into, it didn't seem to matter whether it was .. or ...: it always included the endpoints.
However, I did some testing in irb and it seemed to indicate that .. includes both endpoints, while ... only included the lower bound but not the upper one. Is this correct?
The documentation for Range† says this:
Ranges constructed using .. run from the beginning to the end inclusively. Those created using ... exclude the end value.
So a..b is like a <= x <= b, whereas a...b is like a <= x < b.
Note that, while to_a on a Range of integers gives a collection of integers, a Range is not a set of values, but simply a pair of start/end values:
(1..5).include?(5) #=> true
(1...5).include?(5) #=> false
(1..4).include?(4.1) #=> false
(1...5).include?(4.1) #=> true
(1..4).to_a == (1...5).to_a #=> true
(1..4) == (1...5) #=> false
†The docs used to not include this, instead requiring reading the Pickaxe’s section on Ranges. Thanks to #MarkAmery (see below) for noting this update.
That is correct.
1.9.3p0 :005 > (1...10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9]
1.9.3p0 :006 > (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
The triple-dot syntax is less common, but is nicer than (1..10-1).to_a
The API docs now describe this behaviour:
Ranges constructed using .. run from the beginning to the end inclusively. Those created using ... exclude the end value.
-- http://ruby-doc.org/core-2.1.3/Range.html
In other words:
2.1.3 :001 > ('a'...'d').to_a
=> ["a", "b", "c"]
2.1.3 :002 > ('a'..'d').to_a
=> ["a", "b", "c", "d"]
a...b excludes the end value, while a..b includes the end value.
When working with integers, a...b behaves as a..b-1.
>> (-1...3).to_a
=> [-1, 0, 1, 2]
>> (-1..2).to_a
=> [-1, 0, 1, 2]
>> (-1..2).to_a == (-1...3).to_a
=> true
But really the ranges differ on a real number line.
>> (-1..2) == (-1...3)
=> false
You can see this when incrementing in fractional steps.
>> (-1..2).step(0.5).to_a
=> [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
>> (-1...3).step(0.5).to_a
=> [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
.. and ... denote a range.
Just see it in irb:
ruby-1.9.2-p290 :032 > (1...2).each do puts "p" end
p
=> 1...2
ruby-1.9.2-p290 :033 > (1..2).each do puts "p" end
p
p