I'm interested in implementing a custom equality method for use in an array of objects in Ruby. Here's a stripped-back example:
class Foo
  attr_accessor :a, :b
  def initialize(a, b)
    @a = a
    @b = b
  end
  def ==(other)
    puts 'doing comparison'
    @a == other.a && @b == other.b
  end
  def to_s
    "#{@a}: #{@b}"
  end
end
a = [
  Foo.new(1, 1),
  Foo.new(1, 2),
  Foo.new(2, 1),
  Foo.new(2, 2),
  Foo.new(2, 2)
]
a.uniq
I expected the uniq method to call Foo#==, and remove the last instance of Foo. Instead, I don't see the 'doing comparison' debug line and the array remains the same length.
Notes:
I'm using ruby 2.2.2
I've tried defining the method as ===
I have done it long-hand with a.uniq{|x| [x.a, x.b]}, but I don't like this solution; it makes the code look pretty cluttered.
It compares values using their hash and eql? methods for efficiency.
https://ruby-doc.org/core-2.5.0/Array.html#method-i-uniq-3F
So you should override eql? and hash (overriding == alone is not enough).
UPDATE:
I cannot fully explain why, but overriding hash and == doesn't work. I guess it's caused by the way uniq is implemented in C:
From: array.c (C Method):
Owner: Array
Visibility: public
Number of lines: 20
static VALUE
rb_ary_uniq(VALUE ary)
{
VALUE hash, uniq;
if (RARRAY_LEN(ary) <= 1)
return rb_ary_dup(ary);
if (rb_block_given_p()) {
hash = ary_make_hash_by(ary);
uniq = rb_hash_values(hash);
}
else {
hash = ary_make_hash(ary);
uniq = rb_hash_values(hash);
}
RBASIC_SET_CLASS(uniq, rb_obj_class(ary));
ary_recycle_hash(hash);
return uniq;
}
You can bypass that by using a block version of uniq:
> [Foo.new(1,2), Foo.new(1,2), Foo.new(2,3)].uniq{|f| [f.a, f.b]}
=> [#<Foo:0x0000562e48937cc8 @a=1, @b=2>, #<Foo:0x0000562e48937c78 @a=2, @b=3>]
Or use Struct instead:
F = Struct.new(:a, :b)
[F.new(1,2), F.new(1,2), F.new(2,3)].uniq
# => [#<struct F a=1, b=2>, #<struct F a=2, b=3>]
UPDATE2:
Actually, overriding == is not the same as overriding eql?. When I overrode eql?, it worked as intended:
class Foo
  attr_accessor :a, :b
  def initialize(a, b)
    @a = a
    @b = b
  end
  def eql?(other)
    (@a == other.a && @b == other.b)
  end
  def hash
    [a, b].hash
  end
  def to_s
    "#{@a}: #{@b}"
  end
end
a = [
  Foo.new(1, 1),
  Foo.new(1, 2),
  Foo.new(2, 1),
  Foo.new(2, 2),
  Foo.new(2, 2)
]
a.uniq
#=> [#<Foo:0x0000562e483bff70 @a=1, @b=1>,
#    #<Foo:0x0000562e483bff48 @a=1, @b=2>,
#    #<Foo:0x0000562e483bff20 @a=2, @b=1>,
#    #<Foo:0x0000562e483bfef8 @a=2, @b=2>]
You can find the answer in the documentation of Array#uniq (for some reason, it is not mentioned in the documentation of Enumerable#uniq):
It compares values using their hash and eql? methods for efficiency.
The contracts of hash and eql? are as follows:
hash returns an Integer that must be the same for objects which are considered equal, but does not necessarily have to be different for objects that are not equal. This means that different hashes mean that the objects are definitely not equal, but the same hash doesn't tell you anything. Ideally, hash should also be resistant to accidental and deliberate collisions.
eql? is value equality, usually stricter than == but less strict than equal? which is more or less identity: equal? should only return true if you compare an object to itself.
uniq uses the same trick that is used in hash tables, hash sets, etc. to speed up lookups:
Compare the hashes. Computing a hash should normally be fast.
If the hashes are identical, then, and only then double-check using eql?.
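A minimal sketch of that contract, assuming a hypothetical value class named Point (not from the question): == is defined once, eql? is aliased to it, and hash is derived from the same fields, so the three stay consistent with each other.

```ruby
# Hypothetical value class: equality and hashing are derived from the same fields.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def ==(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias eql? ==

  # Equal points must produce equal hashes, so hash the same fields that == compares.
  def hash
    [x, y].hash
  end
end

points = [Point.new(1, 1), Point.new(1, 1), Point.new(2, 2)]
unique_points = points.uniq
puts unique_points.size
```

Deriving hash from an array of the compared fields is a common convention; any other scheme should work as long as equal objects return equal hashes.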
Related
So I need to create an instance method for Array that takes two arguments, the size of an array and an optional object that will be appended to an array.
If the size argument is less than or equal to the array's length, or the size argument is equal to 0, then just return the array. If the optional argument is left blank, then it inputs nil.
Example output:
array = [1,2,3]
array.class_meth(0) => [1,2,3]
array.class_meth(2) => [1,2,3]
array.class_meth(5) => [1,2,3,nil,nil]
array.class_meth(5, "string") => [1,2,3,"string","string"]
Here is my code that I've been working on:
class Array
def class_meth(a ,b=nil)
self_copy = self
diff = a - self_copy.length
if diff <= 0
self_copy
elsif diff > 0
a.times {self_copy.push b}
end
self_copy
end
def class_meth!(a ,b=nil)
# self_copy = self
diff = a - self.length
if diff <= 0
self
elsif diff > 0
a.times {self.push b}
end
self
end
end
I've been able to create the destructive method, class_meth!, but can't seem to figure out a way to make it non-destructive.
Here's (IMHO) a cleaner solution:
class Array
def class_meth(a, b = nil)
clone.fill(b, size, a - size)
end
def class_meth!(a, b = nil)
fill(b, size, a - size)
end
end
I think it should meet all your needs. To avoid code duplication, you can make either method call the other one (but not both simultaneously, of course):
def class_meth(a, b = nil)
clone.class_meth!(a, b)
end
or:
def class_meth!(a, b = nil)
replace(class_meth(a, b))
end
As your problem has been diagnosed, I will just offer a suggestion for how you might do it. I assume you want to pass two and optionally three, not one and optionally two, parameters to the method.
Code
class Array
  def self.class_meth(n, arr, str=nil)
    arr + (str ? [str] : [nil]) * [n-arr.size,0].max
  end
end
Examples
Array.class_meth(0, [1,2,3])
#=> [1,2,3]
Array.class_meth(2, [1,2,3])
#=> [1,2,3]
Array.class_meth(5, [1,2,3])
#=> [1,2,3,nil,nil]
Array.class_meth(5, [1,2,3], "string")
#=> [1,2,3,"string","string"]
Array.class_meth(5, ["dog","cat","pig"])
#=> ["dog", "cat", "pig", nil, nil]
Array.class_meth(5, ["dog","cat","pig"], "string")
#=> ["dog", "cat", "pig", "string", "string"]
Before withdrawing his answer, @PatriceGahide suggested using Array#fill. That would be an improvement here; i.e., replace the operative line with:
arr.fill(str ? str : nil, arr.size, [n-arr.size,0].max)
self_copy = self does not make a new object - assignment in Ruby never "copies" or creates a new object implicitly.
Thus the non-destructive case works on the same object (the instance the method was invoked upon) as in the destructive case, with a different variable bound to the same object - that is self.equal? self_copy is true.
The simplest solution is to merely use #clone, keeping in mind it is a shallow clone operation:
def class_meth(a ,b=nil)
self_copy = self.clone # NOW we have a new object ..
# .. so we can modify the duplicate object (self_copy)
# down here without affecting the original (self) object.
end
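The difference between binding a second name and cloning can be checked directly. This is a small sketch with hypothetical variable names, not part of the original code:

```ruby
original = [1, 2, 3]
same_object = original   # a second name for the same Array, no copy is made
copy = original.clone    # a new Array holding the same elements (shallow copy)

same_object.push(4)

p original                      # the push is visible through both names
p copy                          # the clone is unaffected
p same_object.equal?(original)  # identical object
p copy.equal?(original)         # distinct object
```

Because the clone is shallow, the elements themselves are still shared; only the containing array is new.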
If #clone cannot be used, other solutions involve creating a new array, obtaining one with #slice (which returns a new array), or appending with #+ (which also returns a new array). Unlike #clone, however, these generally lock you into returning an Array rather than any subclass that may be derived from it.
After the above change is made, it should also be apparent that it can be written like so:
def class_meth(a ,b=nil)
clone.class_meth!(a, b) # create a NEW object; modify it; return it
# (assumes class_meth! returns the object)
end
A more appropriate implementation of #class_meth!, or #class_meth using one of the other forms to avoid modification of the current instance, is left as an exercise.
FWIW: Those are instance methods, which is appropriate, and not "class meth[ods]"; don't be confused by the ill-naming.
I have class Foo, and I overload two of its methods == and eql?:
class Foo
  def initialize(bar)
    @bar = bar
  end
  def bar
    @bar
  end
  def ==(o)
    self.bar == o.bar
  end
  def eql?(o)
    self == o
  end
end
I test that f1 and f2 below are equal with respect to the two methods:
u = User.find(12345)
f1 = Foo.new(u)
f2 = Foo.new(u)
f1 == f2 # => true
f1.eql?(f2) # => true
But Hash#has_key? does not render them equal:
{f1 => true}.has_key?(f2) # => false
What is the equality method used in Hash#has_key??
Most implementations of a hash type, Ruby’s included, rely on a hash first (for speed!) and then equality checks. To verify that it works, first, you can just add
def hash
1
end
After that, you should work on providing as many possible distinct return values for hash that will still be equal if the objects are considered equal (as long as it’s fast, of course).
It uses the hash method. You may combine the properties of your objects there, or something like that. In your case, you want the hash value to be the same for two objects if and only if they are equal.
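A sketch of what that could look like, assuming a hypothetical Wrapper class in place of Foo and a Struct named Owner standing in for the User model from the question:

```ruby
# Owner is a stand-in for the question's User model (an assumption for this sketch).
Owner = Struct.new(:id)

# Hypothetical wrapper: ==, eql? and hash all delegate to the wrapped object.
class Wrapper
  attr_reader :bar

  def initialize(bar)
    @bar = bar
  end

  def ==(other)
    other.is_a?(Wrapper) && bar == other.bar
  end
  alias eql? ==

  # Hash lookups call hash first, then eql? on a match.
  def hash
    bar.hash
  end
end

owner = Owner.new(12345)
w1 = Wrapper.new(owner)
w2 = Wrapper.new(owner)
lookup = { w1 => true }
puts lookup.key?(w2)
```

Delegating hash to the wrapped object only works if that object itself has a hash consistent with its ==, as Struct instances do.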
I am looking for a way to have what I would call synonym keys in a hash.
I want multiple keys to point to the same value, so I can read/write a value through any of these keys.
For example, it should work like this (let's say :foo and :bar are synonyms):
hash[:foo] = "foo"
hash[:bar] = "bar"
puts hash[:foo] # => "bar"
Update 1
Let me add a couple of details. The main reason I need these synonyms is that I receive keys from an external source that I can't control, and multiple keys may actually be associated with the same value.
Rethink Your Data Structure
Depending on how you want to access your data, you can make either the keys or the values synonyms by making them an array. Either way, you'll need to do more work to parse the synonyms than the definitional word they share.
Keys as Definitions
For example, you could use the keys as the definition for your synonyms.
# Create your synonyms.
hash = {}
hash['foo'] = %w[foo bar]
hash
# => {"foo"=>["foo", "bar"]}
# Update the "definition" of your synonyms.
hash['baz'] = hash.delete('foo')
hash
# => {"baz"=>["foo", "bar"]}
Values as Definitions
You could also invert this structure and make your keys arrays of synonyms instead. For example:
hash = {["foo", "bar"]=>"foo"}
hash[hash.rassoc('foo').first] = 'baz'
=> {["foo", "bar"]=>"baz"}
You could subclass hash and override [] and []=.
class AliasedHash < Hash
  def initialize(*args)
    super
    @aliases = {}
  end
  def alias(from,to)
    @aliases[from] = to
    self
  end
  def [](key)
    super(alias_of(key))
  end
  def []=(key,value)
    super(alias_of(key), value)
  end
  private
  def alias_of(key)
    @aliases.fetch(key,key)
  end
end
ah = AliasedHash.new.alias(:bar,:foo)
ah[:foo] = 123
ah[:bar] # => 123
ah[:bar] = 456
ah[:foo] # => 456
What you want is entirely possible, as long as you assign the same object to both keys.
variable_a = 'a'
hash = {foo: variable_a, bar: variable_a}
puts hash[:foo] #=> 'a'
hash[:bar].succ!
puts hash[:foo] #=> 'b'
This works because hash[:foo] and hash[:bar] both refer to the same instance of the letter a via variable_a. This, however, wouldn't work if you used the assignment hash = {foo: 'a', bar: 'a'}, because in that case :foo and :bar refer to two different String instances.
The answer to your original post is:
hash[:foo] = hash[:bar]
and
hash[:foo].__id__ == hash[:bar].__id__
will hold true as long as the value is a reference value (String, Array, ...).
The answer to your Update 1 could be:
input.reduce({ :k => {}, :v => {} }) { |t, (k, v)|
t[:k][t[:v][v] || k] = v;
t[:v][v] = k;
t
}[:k]
where «input» is an abstract enumerator (or array) of your input data as it comes [key, value]+, «:k» your result, and «:v» an inverted hash that serves the purpose of finding a key if its value is already present.
I'm working on a little utility written in Ruby that makes extensive use of nested hashes. Currently, I'm checking access to nested hash elements as follows:
structure = { :a => { :b => 'foo' }}
# I want structure[:a][:b]
value = nil
if structure.has_key?(:a) && structure[:a].has_key?(:b) then
value = structure[:a][:b]
end
Is there a better way to do this? I'd like to be able to say:
value = structure[:a][:b]
And get nil if :a is not a key in structure, etc.
Traditionally, you really had to do something like this:
structure[:a] && structure[:a][:b]
However, Ruby 2.3 added a method Hash#dig that makes this way more graceful:
structure.dig :a, :b # nil if it misses anywhere along the way
There is a gem called ruby_dig that will back-patch this for you.
Hash and Array have a method called dig.
value = structure.dig(:a, :b)
It returns nil if the key is missing at any level.
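A short sketch of dig on Ruby 2.3 or later, reusing the question's structure with an extra array (the :list key is added here for illustration only):

```ruby
structure = { :a => { :b => 'foo', :list => [10, 20] } }

found   = structure.dig(:a, :b)         # the nested value
missing = structure.dig(:x, :y)         # nil, because :x is absent
element = structure.dig(:a, :list, 1)   # dig also walks into arrays by index

# dig raises TypeError when an intermediate value is present but cannot be dug into.
begin
  structure.dig(:a, :b, :c)
  type_error_raised = false
rescue TypeError
  type_error_raised = true
end

p found, missing, element, type_error_raised
```

The TypeError case is worth keeping in mind: dig only returns nil for missing keys, not for values of an unexpected type along the path.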
If you are using a version of Ruby older than 2.3, you can install a gem such as ruby_dig or hash_dig_and_collect, or implement this functionality yourself:
module RubyDig
def dig(key, *rest)
if value = (self[key] rescue nil)
if rest.empty?
value
elsif value.respond_to?(:dig)
value.dig(*rest)
end
end
end
end
if RUBY_VERSION < '2.3'
Array.send(:include, RubyDig)
Hash.send(:include, RubyDig)
end
The way I usually do this these days is:
h = Hash.new { |h,k| h[k] = {} }
This will give you a hash that creates a new hash as the entry for a missing key, but returns nil for the second level of key:
h['foo'] -> {}
h['foo']['bar'] -> nil
You can nest this to add multiple layers that can be addressed this way:
h = Hash.new { |h, k| h[k] = Hash.new { |hh, kk| hh[kk] = {} } }
h['bar'] -> {}
h['tar']['zar'] -> {}
h['scar']['far']['mar'] -> nil
You can also chain indefinitely by using the default_proc method:
h = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
h['bar'] -> {}
h['tar']['star']['par'] -> {}
The above code creates a hash whose default proc creates a new Hash with the same default proc. So, a hash created as a default value when a lookup for an unseen key occurs will have the same default behavior.
EDIT: More details
Ruby hashes allow you to control how default values are created when a lookup occurs for a new key. When specified, this behavior is encapsulated as a Proc object and is reachable via the default_proc and default_proc= methods. The default proc can also be specified by passing a block to Hash.new.
Let's break this code down a little. This is not idiomatic ruby, but it's easier to break it out into multiple lines:
1. recursive_hash = Hash.new do |h, k|
2. h[k] = Hash.new(&h.default_proc)
3. end
Line 1 declares a variable recursive_hash to be a new Hash and begins a block to be recursive_hash's default_proc. The block is passed two objects: h, which is the Hash instance the key lookup is being performed on, and k, the key being looked up.
Line 2 sets the default value in the hash to a new Hash instance. The default behavior for this hash is supplied by passing a Proc created from the default_proc of the hash the lookup is occurring in; ie, the default proc the block itself is defining.
Here's an example from an IRB session:
irb(main):011:0> recursive_hash = Hash.new do |h,k|
irb(main):012:1* h[k] = Hash.new(&h.default_proc)
irb(main):013:1> end
=> {}
irb(main):014:0> recursive_hash[:foo]
=> {}
irb(main):015:0> recursive_hash
=> {:foo=>{}}
When the hash at recursive_hash[:foo] was created, its default_proc was supplied by recursive_hash's default_proc. This has two effects:
The default behavior for recursive_hash[:foo] is the same as recursive_hash.
The default behavior for hashes created by recursive_hash[:foo]'s default_proc will be the same as recursive_hash.
So, continuing in IRB, we get the following:
irb(main):016:0> recursive_hash[:foo][:bar]
=> {}
irb(main):017:0> recursive_hash
=> {:foo=>{:bar=>{}}}
irb(main):018:0> recursive_hash[:foo][:bar][:zap]
=> {}
irb(main):019:0> recursive_hash
=> {:foo=>{:bar=>{:zap=>{}}}}
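One consequence of the session above is that merely reading a key creates it. The sketch below (variable names are illustrative) shows that effect and how fetch can be used for a read that leaves the hash untouched:

```ruby
tree = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

tree[:config][:db][:host] = 'localhost'   # intermediate hashes are created on write

tree[:typo]                               # a plain read also runs the default proc
created_by_read = tree.key?(:typo)        # so the key now exists

value = tree.fetch(:other, nil)           # fetch does not invoke the default proc
untouched = !tree.key?(:other)            # and therefore creates nothing

p created_by_read, value, untouched
```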
I made a rubygem for this. Try vine.
Install:
gem install vine
Usage:
hash.access("a.b.c")
I think one of the most readable solutions is using Hashie:
require 'hashie'
myhash = Hashie::Mash.new({foo: {bar: "blah" }})
myhash.foo.bar
=> "blah"
myhash.foo?
=> true
# use "underscore dot" for multi-level testing
myhash.foo_.bar?
=> true
myhash.foo_.huh_.what?
=> false
value = structure[:a][:b] rescue nil
Solution 1
I suggested this in my question before:
class NilClass; def to_hash; {} end end
Hash#to_hash is already defined, and returns self. Then you can do:
value = structure[:a].to_hash[:b]
The to_hash ensures that you get an empty hash when the previous key search fails.
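On Ruby 2.0 and later, a similar chain is possible without reopening NilClass, because nil.to_h returns an empty hash and Hash#to_h returns the hash itself. A brief sketch of that variant, which is not part of the original answer:

```ruby
structure = { :a => { :b => 'foo' } }

present = structure[:a].to_h[:b]   # Hash#to_h returns the inner hash unchanged
absent  = structure[:c].to_h[:b]   # nil.to_h is {}, so the lookup yields nil

p present, absent
```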
Solution 2
This solution is similar in spirit to mu is too short's answer in that it uses a subclass, but it is still somewhat different. If there is no value for a certain key, it does not use a default value, but rather creates an empty hash as the value, so it does not have the problem of confusion in assignment that DigitalRoss's answer has, as was pointed out by mu is too short.
class NilFreeHash < Hash
def [] key; key?(key) ? super(key) : self[key] = NilFreeHash.new end
end
structure = NilFreeHash.new
structure[:a][:b] = 3
p structure[:a][:b] # => 3
It departs from the specification given in the question, though. When an undefined key is given, it will return an empty hash instead of nil.
p structure[:c] # => {}
If you build an instance of this NilFreeHash from the beginning and assign the key-values, it will work, but if you want to convert a hash into an instance of this class, that may be a problem.
You could just build a Hash subclass with an extra variadic method for digging all the way down with appropriate checks along the way. Something like this (with a better name of course):
class Thing < Hash
def find(*path)
path.inject(self) { |h, x| return nil if(!h.is_a?(Thing) || h[x].nil?); h[x] }
end
end
Then just use Things instead of hashes:
>> x = Thing.new
=> {}
>> x[:a] = Thing.new
=> {}
>> x[:a][:b] = 'k'
=> "k"
>> x.find(:a)
=> {:b=>"k"}
>> x.find(:a, :b)
=> "k"
>> x.find(:a, :b, :c)
=> nil
>> x.find(:a, :c, :d)
=> nil
This monkey patch for Hash should be the easiest (at least for me). It doesn't alter the structure, i.e. it doesn't change nils to {}. It also applies if you're reading a tree from a raw source, e.g. JSON, and it doesn't need to produce empty hash objects as it goes or parse a string. rescue nil was actually a good, easy solution for me, as I'm brave enough for such a low risk, but I find that it has a performance drawback.
class ::Hash
def recurse(*keys)
v = self[keys.shift]
while keys.length > 0
return nil if not v.is_a? Hash
v = v[keys.shift]
end
v
end
end
Example:
> structure = { :a => { :b => 'foo' }}
=> {:a=>{:b=>"foo"}}
> structure.recurse(:a, :b)
=> "foo"
> structure.recurse(:a, :x)
=> nil
What's also good is that you can use it with saved arrays of keys:
> keys = [:a, :b]
=> [:a, :b]
> structure.recurse(*keys)
=> "foo"
> structure.recurse(*keys, :x1, :x2)
=> nil
The XKeys gem will read and auto-vivify-on-write nested hashes (::Hash) or hashes and arrays (::Auto, based on the key/index type) with a simple, clear, readable, and compact syntax by enhancing #[] and #[]=. The sentinel symbol :[] will push onto the end of an array.
require 'xkeys'
structure = {}.extend XKeys::Hash
structure[:a, :b] # nil
structure[:a, :b, :else => 0] # 0 (contextual default)
structure[:a] # nil, even after above
structure[:a, :b] = 'foo'
structure[:a, :b] # foo
You can use the andand gem, but I'm becoming more and more wary of it:
>> structure = { :a => { :b => 'foo' }} #=> {:a=>{:b=>"foo"}}
>> require 'andand' #=> true
>> structure[:a].andand[:b] #=> "foo"
>> structure[:c].andand[:b] #=> nil
There is a cute but wrong way to do this: monkey-patch NilClass to add a [] method that returns nil. I say it is the wrong approach because you have no idea what other software may have defined a different version, or what behavior change in a future version of Ruby could break it.
A better approach is to create a new object that works a lot like nil but supports this behavior. Make this new object the default return of your hashes. And then it will just work.
Alternately you can create a simple "nested lookup" function that you pass the hash and the keys to, which traverses the hashes in order, breaking out when it can.
I would personally prefer one of the latter two approaches. Though I think it would be cute if the first was integrated into the Ruby language. (But monkey-patching is a bad idea. Don't do that. Particularly not to demonstrate what a cool hacker you are.)
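As a rough sketch of the "nested lookup" function idea, with a hypothetical helper name nested_lookup and the question's structure as input:

```ruby
# Hypothetical helper: walks nested hashes key by key and
# returns nil as soon as a level is missing or is not a Hash.
def nested_lookup(hash, *keys)
  keys.reduce(hash) do |current, key|
    return nil unless current.is_a?(Hash) && current.key?(key)
    current[key]
  end
end

structure = { :a => { :b => 'foo' } }

p nested_lookup(structure, :a, :b)
p nested_lookup(structure, :x, :y)
p nested_lookup(structure, :a, :b, :c)
```

Keeping this as a plain function avoids touching NilClass or Hash at all, at the cost of a slightly longer call site.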
Not that I would do it, but you can Monkeypatch in NilClass#[]:
> structure = { :a => { :b => 'foo' }}
#=> {:a=>{:b=>"foo"}}
> structure[:x][:y]
NoMethodError: undefined method `[]' for nil:NilClass
from (irb):2
from C:/Ruby/bin/irb:12:in `<main>'
> class NilClass; def [](*a); end; end
#=> nil
> structure[:x][:y]
#=> nil
> structure[:a][:y]
#=> nil
> structure[:a][:b]
#=> "foo"
Go with @DigitalRoss's answer. Yes, it's more typing, but that's because it's safer.
In my case, I needed a two-dimensional matrix where each cell is a list of items.
I found this technique which seems to work. It might work for the OP:
$all = Hash.new()
def $all.[](k)
v = fetch(k, nil)
return v if v
h = Hash.new()
def h.[](k2)
v = fetch(k2, nil)
return v if v
list = Array.new()
store(k2, list)
return list
end
store(k, h)
return h
end
$all['g1-a']['g2-a'] << '1'
$all['g1-a']['g2-a'] << '2'
$all['g1-a']['g2-a'] << '3'
$all['g1-a']['g2-b'] << '4'
$all['g1-b']['g2-a'] << '5'
$all['g1-b']['g2-c'] << '6'
$all.keys.each do |group1|
$all[group1].keys.each do |group2|
$all[group1][group2].each do |item|
puts "#{group1} #{group2} #{item}"
end
end
end
The output is:
$ ruby -v && ruby t.rb
ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-linux]
g1-a g2-a 1
g1-a g2-a 2
g1-a g2-a 3
g1-a g2-b 4
g1-b g2-a 5
g1-b g2-c 6
I am currently trying out this:
# --------------------------------------------------------------------
# System so that we chain methods together without worrying about nil
# values (a la Objective-c).
# Example:
# params[:foo].try?[:bar]
#
require 'singleton' # provides the Singleton mixin used by MethodMissingSink below
class Object
# Returns self, unless NilClass (see below)
def try?
self
end
end
class NilClass
class MethodMissingSink
include Singleton
def method_missing(meth, *args, &block)
end
end
def try?
MethodMissingSink.instance
end
end
I know the arguments against try, but it is useful when looking into things, like say, params.
Here's some example code:
class Obj
attr :c, true
def == that
p '=='
that.c == self.c
end
def <=> that
p '<=>'
that.c <=> self.c
end
def equal? that
p 'equal?'
that.c.equal? self.c
end
def eql? that
p 'eql?'
that.c.eql? self.c
end
end
a = Obj.new
b = Obj.new
a.c = 1
b.c = 1
p [a] | [b]
It prints 2 objects but it should print 1 object. None of the comparison methods get called. How is Array.| comparing for equality?
Array#| is implemented using hashes. So in order for your type to work well with it (as well as with hashmaps and hashsets), you'll have to implement eql? (which you did) and hash (which you did not). The most straightforward way to define hash meaningfully would be to just return c.hash.
Ruby's Array class is implemented in C, and from what I can tell, uses a custom hash table to check for equality when comparing objects in |. If you wanted to modify this behavior, you'd have to write your own version that uses an equality check of your choice.
To see the full implementation of Ruby's Array#|, search array.c in the Ruby source for "rb_ary_or(VALUE ary1, VALUE ary2)".
Ruby is calling the hash functions and they are returning different values, because they are still just returning the default object_id. You will need to def hash and return something reflecting your idea of what makes an Obj significant.
>> class Obj2 < Obj
>> def hash; t = super; p ['hash: ', t]; t; end
>> end
=> nil
>> x, y, x.c, y.c = Obj2.new, Obj2.new, 1, 1
=> [#<Obj2:0x100302568 @c=1>, #<Obj2:0x100302540 @c=1>, 1, 1]
>> p [x] | [y]
["hash: ", 2149061300]
["hash: ", 2149061280]
["hash: ", 2149061300]
["hash: ", 2149061280]
[#<Obj2:0x100302568 @c=1>, #<Obj2:0x100302540 @c=1>]
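For comparison, here is a minimal sketch with hash defined alongside eql?. The class name Item is hypothetical and only mirrors the question's Obj:

```ruby
# Hypothetical class mirroring Obj, with hash added so that
# objects with equal c values land in the same hash bucket.
class Item
  attr_accessor :c

  def initialize(c)
    @c = c
  end

  def eql?(other)
    other.is_a?(Item) && c.eql?(other.c)
  end

  def hash
    c.hash
  end
end

a = Item.new(1)
b = Item.new(1)
union = [a] | [b]
puts union.size
```

With both methods in place the union keeps only the first of the two equal objects.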