Comparing hash of hashes in Ruby - ruby

I have two structures similar to the one below.
[1 => [{'pc'=>1,'id'=>0},{'pc'=>4,'id'=>0},{'pc'=>2,'id'=>1}]]
The inner arrays need not be in the same order in both structures. How can I compare them in that case?

If the order isn't important, consider using a structure other than Array there, such as Set.
You can use Set for comparing as well, by converting Arrays to Sets before comparison:
require 'set'
a = [{:a => 2}, {:b => 3}]
b = [{:b => 3}, {:a => 2}]
sa = Set.new a
#=> #<Set: {{:a=>2}, {:b=>3}}>
sb = Set.new b
#=> #<Set: {{:b=>3}, {:a=>2}}>
a == b
#=> false
sa == sb
#=> true
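Applied to the structure from the question, the same idea might look like the sketch below. The helper name same_ignoring_order? is made up for illustration. Note that converting to a Set discards duplicate elements, so [x, x] and [x] would be treated as equal here.

```ruby
require 'set'

# Hypothetical helper: compares two hashes whose values are arrays,
# ignoring the order of the elements inside each array.
# Caveat: Set drops duplicates, so [x, x] and [x] compare as equal.
def same_ignoring_order?(left, right)
  return false unless left.keys.to_set == right.keys.to_set

  left.all? { |key, items| items.to_set == right[key].to_set }
end

first  = { 1 => [{ 'pc' => 1, 'id' => 0 }, { 'pc' => 4, 'id' => 0 }, { 'pc' => 2, 'id' => 1 }] }
second = { 1 => [{ 'pc' => 2, 'id' => 1 }, { 'pc' => 1, 'id' => 0 }, { 'pc' => 4, 'id' => 0 }] }

first == second                      #=> false (arrays differ in order)
same_ignoring_order?(first, second)  #=> true
```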

It seems a simple compare works:
data = {
1 => [{'pc'=>1,'id'=>0},{'pc'=>4,'id'=>0},{'pc'=>2,'id'=>1}],
2 => [{'pc'=>1,'id'=>0},{'pc'=>4,'id'=>0},{'pc'=>2,'id'=>1}],
3 => [{'pc'=>1,'id'=>0},{'pc'=>2,'id'=>1},{'pc'=>4,'id'=>0}]
}
data[1] == data[2]
#> true
data[2] == data[3]
#> false

If both nested hashes have the same keys in the same order, we can write a recursive method to do the comparison. I am not sure whether there is a built-in method for this.
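One possible shape for such a recursive method is sketched below. It is not a built-in: canonical and deep_equal_ignoring_order? are hypothetical names, and the sketch assumes Ruby 2.7+ for Enumerable#tally. Each array is turned into an element => count hash, so order is ignored but duplicates still count. A known limitation is that an array [x] and a hash { x => 1 } end up with the same canonical form.

```ruby
# Hypothetical helper: rewrites a nested structure into a form in which
# array order no longer matters (arrays become element => count hashes).
def canonical(obj)
  case obj
  when Hash  then obj.transform_values { |value| canonical(value) }
  when Array then obj.map { |item| canonical(item) }.tally  # Ruby 2.7+
  else obj
  end
end

# Hypothetical helper: order-insensitive deep comparison.
def deep_equal_ignoring_order?(left, right)
  canonical(left) == canonical(right)
end

a = { 1 => [{ 'pc' => 1, 'id' => 0 }, { 'pc' => 4, 'id' => 0 }] }
b = { 1 => [{ 'id' => 0, 'pc' => 4 }, { 'pc' => 1, 'id' => 0 }] }

deep_equal_ignoring_order?(a, b)  #=> true
```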

Related

Create a Hash from two arrays of different sizes and iterate until none of the keys are empty

Having two arrays of different sizes, I'd like to get the longer array as keys and the shorter one as values. However, I don't want any keys to remain empty, so that is why I need to keep iterating on the shorter array until all keys have a value.
EDIT: I want to keep array longer intact, but without empty values, that means keep iterating on shorter until all keys have a value.
longer = [1, 2, 3, 4, 5, 6, 7]
shorter = ['a', 'b', 'c']
Hash[longer.zip(shorter)]
#=> {1=>"a", 2=>"b", 3=>"c", 4=>nil, 5=>nil, 6=>nil, 7=>nil}
Expected Result
#=> {1=>"a", 2=>"b", 3=>"c", 4=>"a", 5=>"b", 6=>"c", 7=>"a"}
Here's an elegant one. You can "loop" the short array
longer = [1, 2, 3, 4, 5, 6, 7]
shorter = ['a', 'b', 'c']
longer.zip(shorter.cycle).to_h # => {1=>"a", 2=>"b", 3=>"c", 4=>"a", 5=>"b", 6=>"c", 7=>"a"}
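A short sketch of why this works: cycle without a block returns an Enumerator that repeats the array endlessly, and zip only takes as many elements from it as the receiver has. The variable names repeating and pairs are just for illustration.

```ruby
longer  = [1, 2, 3, 4, 5, 6, 7]
shorter = ['a', 'b', 'c']

repeating = shorter.cycle     # Enumerator that never runs out
repeating.first(7)            #=> ["a", "b", "c", "a", "b", "c", "a"]

# zip stops at the length of the receiver, so the endless
# enumerator is only consumed as far as needed.
pairs = longer.zip(repeating)
pairs.to_h                    #=> {1=>"a", 2=>"b", 3=>"c", 4=>"a", 5=>"b", 6=>"c", 7=>"a"}
```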
A crude way until you find something more elegant:
Slice the longer array as per length of shorter one, and iterate over it to re-map the values.
mapped = longer.each_slice(shorter.length).to_a.map do |slice|
Hash[slice.zip(shorter)]
end
#=> [{1=>"a", 2=>"b", 3=>"c"}, {4=>"a", 5=>"b", 6=>"c"}, {7=>"a"}]
Merge all hashes within the mapped array into a single hash:
final = mapped.reduce Hash.new, :merge
#=> {1=>"a", 2=>"b", 3=>"c", 4=>"a", 5=>"b", 6=>"c", 7=>"a"}
Here's a fun answer.
longer = [1, 2, 3, 4, 5, 6, 7]
shorter = ['a', 'b', 'c']
h = Hash.new do |h,k|
idx = longer.index(k)
idx ? shorter[idx % shorter.size] : nil
end
#=> {}
h[1] #=> "a"
h[2] #=> "b"
h[3] #=> "c"
h[4] #=> "a"
h[5] #=> "b"
h[6] #=> "c"
h[7] #=> "a"
h[8] #=> nil
h #=> {}
h.values_at(3,5) #=> ["c", "b"]
If this is not good enough (e.g., if you wish to use Hash methods such as keys, key?, merge, to_a and so on), you could create the associated hash quite easily:
longer.each { |n| h[n] = h[n] }
h #=> {1=>"a", 2=>"b", 3=>"c", 4=>"a", 5=>"b", 6=>"c", 7=>"a"}
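A variant worth considering, as a sketch only: if the default block assigns into the hash, each looked-up key is stored as a side effect, so the hash fills itself in as it is used. The name lookup is made up for this example.

```ruby
longer  = [1, 2, 3, 4, 5, 6, 7]
shorter = ['a', 'b', 'c']

# The default block stores the computed value, but only for keys
# that actually occur in `longer`.
lookup = Hash.new do |hash, key|
  idx = longer.index(key)
  hash[key] = shorter[idx % shorter.size] if idx
end

lookup[4]        #=> "a" (computed and stored)
lookup.key?(4)   #=> true
lookup[8]        #=> nil (unknown key, nothing stored)
lookup.key?(8)   #=> false
```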

Ruby hash order, uniq method

The variable fav_food can be "pie", "cake", or "cookie", as entered by the user. However, I want my food_qty Hash to list fav_food as the first key.
So I came up with
food_order = ([fav_food] + food_qty.keys).uniq
However, is there a better way to do this?
food_qty = {"pie" => 0, "cake" => 0, "cookie" => 0}
# make the fav_food listed first in the hash
food_order = ([fav_food] + food_qty.keys).uniq
Why do you want a particular key/value pair to be first in a hash? Hashes don't need to be ordered, because you can directly access any element at any time without any extra cost.
If you need to retrieve elements in an order, then get the keys and sort that list, then iterate over that list, or use values_at:
foo = {
'z' => 1,
'a' => 2
}
foo_keys = foo.keys.sort # => ["a", "z"]
foo_keys.map{ |k| foo[k] } # => [2, 1]
foo.values_at(*foo_keys) # => [2, 1]
Hashes remember their insertion order, but you shouldn't rely on that; ordering a hash doesn't help if you insert something later, and other languages don't support it. Instead, order the keys however you want, and use that list to retrieve the values.
If you want to force a key to be first so its value is retrieved first, then consider this:
foo = {
'z' => 1,
'a' => 2,
'w' => 3,
}
foo_keys = foo.keys # => ["z", "a", "w"]
foo_keys.unshift(foo_keys.delete('w')) # => ["w", "z", "a"]
foo_keys.map{ |k| foo[k] } # => [3, 1, 2]
foo.values_at(*foo_keys) # => [3, 1, 2]
If you want a sorted list of keys with one forced to a position:
foo_keys = foo.keys.sort # => ["a", "w", "z"]
foo_keys.unshift(foo_keys.delete('w')) # => ["w", "a", "z"]
foo_keys.map{ |k| foo[k] } # => [3, 2, 1]
foo.values_at(*foo_keys) # => [3, 2, 1]
RE your first paragraph: Hashes are ordered though, specifically because this is such a common requirement and hashes fill so many roles in Ruby. There is no harm relying on hashes being ordered in Ruby, even if other languages don't support this behavior.
Not ordered, as in sorted, instead they remember their insertion order. From the documentation:
Hashes enumerate their values in the order that the corresponding keys were inserted.
This is easily tested/proven:
foo = {z:0, a:-1} # => {:z=>0, :a=>-1}
foo.to_a # => [[:z, 0], [:a, -1]]
foo[:b] = 3
foo.merge!({w:2})
foo # => {:z=>0, :a=>-1, :b=>3, :w=>2}
foo.to_a # => [[:z, 0], [:a, -1], [:b, 3], [:w, 2]]
foo.keys # => [:z, :a, :b, :w]
foo.values # => [0, -1, 3, 2]
If a hash was ordered foo.to_a would be collated somehow, even after adding additional key/value pairs. Instead, it remains in its insertion order. An ordered hash based on keys would move a:-1 to be the first element, just as an ordered hash based on the values would do.
If hashes were ordered, and, if it was important, we'd have some way of telling a hash what its ordering is, ascending or descending or of having some sort of special order based on the keys or values. Instead we have none of those things, and only have the sort and sort_by methods inherited from Enumerable, both of which convert the hash into an array and sort it and return the array, because Arrays can benefit from having an order.
Perhaps you are thinking of Java, which has SortedMap, and provides those sort of capabilities:
A Map that further provides a total ordering on its keys. The map is ordered according to the natural ordering of its keys, or by a Comparator typically provided at sorted map creation time. This order is reflected when iterating over the sorted map's collection views (returned by the entrySet, keySet and values methods). Several additional operations are provided to take advantage of the ordering.
Again, because Ruby's Hash does not support any ordering beyond its insertion order, we have none of those capabilities.
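The insertion-order behaviour discussed above can be checked with a small sketch. The variable names are illustrative only. Updating a value leaves the key where it is, re-inserting a key moves it to the end, and any other order has to be produced explicitly as a new hash.

```ruby
foo = { z: 0, a: -1, b: 3 }

foo[:z] = 10                       # updating an existing key keeps its position
order_after_update = foo.keys      #=> [:z, :a, :b]

value = foo.delete(:z)             # deleting and re-inserting moves the key to the end
foo[:z] = value
order_after_reinsert = foo.keys    #=> [:a, :b, :z]

# A different order is only available as a new, explicitly built hash.
by_value_desc = foo.sort_by { |_key, val| -val }.to_h
by_value_desc.keys                 #=> [:z, :b, :a]
```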
You could use Hash#merge:
food_qty = { "pie" => 0, "cake" => 0, "cookie" => 0 }
fav_food = "cookie"
{ fav_food => nil }.merge(food_qty)
# => { "cookie" => 0, "pie" => 0, "cake" => 0 }
This works because Hash#merge first duplicates the original Hash and then, for keys that already exist (like "cookie"), updates the values—which preserves the order of existing keys. In case it's not clear, the above is equivalent to this:
{ "cookie" => nil }.merge("pie" => 0, "cake" => 0, "cookie" => 0)
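Wrapped up as a reusable method, the merge approach might look like this sketch. The name with_key_first is hypothetical, and the early return for a missing key is a design choice of this example rather than something the answer specifies.

```ruby
# Hypothetical helper: returns a new hash with `key` listed first.
# The original hash is left untouched.
def with_key_first(hash, key)
  return hash.dup unless hash.key?(key)

  { key => hash[key] }.merge(hash)
end

food_qty  = { "pie" => 0, "cake" => 0, "cookie" => 0 }
reordered = with_key_first(food_qty, "cookie")

reordered.keys   #=> ["cookie", "pie", "cake"]
food_qty.keys    #=> ["pie", "cake", "cookie"]
```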
Edit: The below is my original answer, before I realized that I had also stumbled upon the "real" answer above.
I don't really advocate this (the Tin Man's advice should be taken instead), but if you're using Ruby 2.0+, I present the following Stupid Ruby Trick:
food_qty = { :pie => 0, :cake => 0, :cookie => 0}
fav_food = :cookie
{ fav_food => nil, **food_qty }
# => { :cookie => 0, :pie => 0, :cake => 0 }
This works because Ruby 2.0 added the "double splat" or "keyword splat" operator, as an analogue to the splat in an Array:
arr = [ 1, 2, *[3, 4] ] # => [ 1, 2, 3, 4 ]
hsh = { a: 1, b: 2, **{ c: 3 } } # => { :a => 1, :b => 2, :c => 3 }
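...but on the Ruby version used here it appears to do a reverse merge (a la ActiveSupport's Hash#reverse_merge), merging the "outer" hash into the "inner." Newer Ruby versions resolve duplicate keys the other way round, with the splatted value winning, so check the result on your interpreter.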
{ a: 1, b: 2, **{ a: 3 } } # => { :a => 1, :b => 2 }
# ...is equivalent to:
{ a: 3 }.merge( { a: 1, b: 2 } ) # => { :a => 1, :b => 2 }
The double splat was implemented to support keyword arguments in Ruby 2.0, which is presumably the reason why it only works if the "inner" Hash's keys are all Symbols.
Like I said, I don't recommend actually doing it, but I find it interesting nonetheless.

How do I output the index of elements in an array that are also in another?

I have two arrays:
A = ["a","s","p","e","n"]
V = ["a","e","i","o","u"]
I want to output an array that shows the index of every element in array A that is also an element anywhere in V.
In other words:
some_function(A, V) == [0,3]
This is because A[0]="a" and A[3]="e" matches the elements "a" and "e" in array V. How do I do that?
Here is how I would do:
A = ["a","s","p","e","n"]
V = ["a","e","i","o","u"]
A.each_index.select{|i| V.include? A[i]} # => [0, 3]
If V is a Set of data (order doesn't matter, no duplicates), and it is large, then you might get a performance benefit by converting it to a Set so that the include? runs faster since Set is built on a hash and gets O(1) retrieval time:
require 'set'
A = ["a","s","p","e","n"]
V = Set.new ["a","e","i","o","u"]
A.each_index.select{|i| V.include? A[i]} # => [0, 3]
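Packaged as the some_function the question asks for, this could be sketched as follows. The name matching_indices is made up, and the sketch always converts the second argument to a Set, which is an assumption that only pays off when that collection is large.

```ruby
require 'set'

# Hypothetical helper: indices of the elements of `items`
# that also occur anywhere in `wanted`.
def matching_indices(items, wanted)
  lookup = wanted.to_set   # hash-based membership checks
  items.each_index.select { |i| lookup.include?(items[i]) }
end

matching_indices(%w[a s p e n], %w[a e i o u])  #=> [0, 3]
```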
As @Arup has answered your question, I thought I might elaborate a bit. Arup suggested you do this:
A.each_index.select{|i| V.include? A[i]}
where
A = ["a","s","p","e","n"]
V = ["a","e","i","o","u"]
Firstly, what is A.each_index? Try it in IRB:
e = A.each_index # => #<Enumerator: ["a", "s", "p", "e", "n"]:each_index>
e.class # => Enumerator
e.to_a # => [0, 1, 2, 3, 4]
So the enumerator e is the receiver of the method Enumerable#select, Enumerable being a mix-in module that is included by several Ruby classes, including Enumerator. Want to check that?
e.respond_to?(:select) # => true
e.respond_to?(:map) # => true
e.respond_to?(:reduce) # => true
Next, note that A.each_index does not depend on the contents of A, just its size, so we could replace that with any enumerator that iterates from 0 to A.size - 1, such as:
m = A.size
m.times.select{|i| V.include? A[i]} # => [0, 3]
0.upto(m-1).select{|i| V.include? A[i]} # => [0, 3]
We can confirm these are Enumerator objects:
m.times.class # => Enumerator
0.upto(m-1).class # => Enumerator
The other main classes that include Enumerable are Array, Hash, Set, Range and IO (but, since Ruby 1.9, not String), so we could also do this:
Array(0...m).select{|i| V.include? A[i]} # => [0, 3]
(0...m).select{|i| V.include? A[i]} # => [0, 3]
require 'set'
Set.new(0..m-1).select{|i| V.include? A[i]} # => [0, 3]
Note that, regardless of the receiver's class, select returns an array. Most (but not all) Enumerable methods that return a collection, return an array, regardless of the receiver's class.

What is the difference of Ruby's Array#to_a method

For example:
a = [1,2,3,4]
b = a
c = a.to_a
a.insert(0,0) #=> [0,1,2,3,4]
b #=> [0,1,2,3,4]
c #=> [0,1,2,3,4]
Why is the output of b and c the same? If I want a copy of array a rather than another reference to it, which method should I use?
Why is the output of array b and c the same?
Because all three local variables reference the same object, as shown below:
a = [1,2,3,4]
b = a
c = a.to_a
a.object_id
# => 72187200
b.object_id
# => 72187200
c.object_id
# => 72187200
If I want to get a copy of array a, not a reference, which method should I use?
Then use a.dup, which is documented under Object#dup:
a = [1,2,3,4]
b = a.dup
c = a.dup
a.object_id
# => 82139270
b.object_id
# => 82139210
c.object_id
# => 82134600
a.insert(0,0) # => [0, 1, 2, 3, 4]
b # => [1, 2, 3, 4]
c # => [1, 2, 3, 4]
The documentation for Array#to_a says: Returns self. If called on a subclass of Array, converts the receiver to an Array object.
So it will not help with what you need.
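Both halves of that documentation sentence can be seen in a short sketch. Stack is a hypothetical subclass defined only for this illustration.

```ruby
class Stack < Array; end   # hypothetical subclass, for illustration only

plain = [1, 2, 3]
plain.to_a.equal?(plain)     #=> true (the very same object)

stack     = Stack.new([1, 2, 3])
converted = stack.to_a
converted.class              #=> Array
converted.equal?(stack)      #=> false (a new Array was built)
converted == stack           #=> true (same contents)
```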
This is because Array#to_a returns self, so both variables contain a reference to the same Array object. In order to get a new Array with the same contents, you can use either dup or clone (read about the differences between dup and clone):
a = [1, 2, 3]
b = a.dup
a << 4
a #=> [1, 2, 3, 4]
b #=> [1, 2, 3]
Note, however, that the same object references are stored in the new array. This means if you mutate the objects themselves they will still change in both arrays:
a = ['foo', 'bar']
b = a.dup
a[0] << 'baz'
a #=> ["foobaz", "bar"]
b #=> ["foobaz", "bar"]
This is because dup and clone are shallow-copies.
Array#to_a returns the receiver. That is why b and c refer to the same thing. Regarding why to_a returns the original array, in principle, it could be defined one or the other way, but I guess one use case for to_a is to apply it to a variable that is potentially nil to ensure it becomes an array.
some_value.to_a # => `[]` if `some_value` is `nil`
In that use case, there is no need to replace the array with a new one when the receiver is already an array; returning self avoids an unnecessary allocation.
The reason is that variables are merely references to data. Objects are stored in memory, and a variable holds the address of the object it refers to. So when you do:
a = b
Those two variables point to the same memory location, hence if you alter a, b is altered as well because it is the same object.
There are a few ways to force Ruby to create another copy of the object. The most popular one is the dup method mentioned by LBg. Note however that it is only creating a shallow copy. If you run:
a = ['foo', 'bar', []]
b = a.dup
a << 'blah'
b #=> ["foo", "bar", []] as expected, but
b[2] << 'blah'
a #=> ["foo", "bar", ["blah"], "blah"]
The reason is that an array is in fact an array of references, and the nested array was not duplicated by dup, so a[2] and b[2] are the same object.
To create a deep copy of an object, you can use the Marshal module:
b = Marshal.load(Marshal.dump(a))
However, usually you don't really need to do this. Also, some objects cannot be duplicated (e.g. symbols).
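To make the difference concrete, here is a small sketch comparing dup with the Marshal round trip. The variable names are illustrative. Keep in mind that Marshal cannot serialize everything (procs and IO objects, for instance), so this is not a universal deep copy.

```ruby
original = ['foo', 'bar', []]

shallow = original.dup                            # copies the outer array only
deep    = Marshal.load(Marshal.dump(original))    # copies nested objects too

original[2] << 'blah'    # mutate the nested array in place

shallow[2]   #=> ["blah"] (shares the nested array with original)
deep[2]      #=> []       (has its own nested array)
```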
You can just do
b = a.dup
OLD POST
You can try this if there is no easier way
b = a.map {|x| x}
It works
1.9.3-p448 :001 > a = [1,2,3]
 => [1, 2, 3]
1.9.3-p448 :002 > b = a
 => [1, 2, 3]
1.9.3-p448 :003 > c = a.map{|x| x}
 => [1, 2, 3]
1.9.3-p448 :004 > a<<0
 => [1, 2, 3, 0]
1.9.3-p448 :005 > b
 => [1, 2, 3, 0]
1.9.3-p448 :006 > c
 => [1, 2, 3]
But it is a shallow copy though.
According to this post, a.dup is the easier way.

Ruby hash permutation

Is there any quick way to get a (random) permutation of a given hash? For example with arrays I can use the sample method as in
ruby-1.9.2-p180 :031 > a = (1..5).to_a
=> [1, 2, 3, 4, 5]
ruby-1.9.2-p180 :032 > a.sample(a.length)
=> [3, 5, 1, 2, 4]
For hashes I can use the same method on hash keys and build a new hash with
ruby-1.9.2-p180 :036 > h = { 1 => 'a', 2 => 'b', 3 => 'c' }
=> {1=>"a", 2=>"b", 3=>"c"}
ruby-1.9.2-p180 :037 > h.keys.sample(h.length).inject({}) { |h2, k| h2[k] = h[k]; h2 }
=> {3=>"c", 2=>"b", 1=>"a"}
but this is so ugly. Is there any 'sample' method for hashes which can avoid all that code?
Update: As pointed out by @Michael Kohl in the comments, this question is meaningful only for Ruby 1.9.x. Since hashes are unordered in 1.8.x, there is no way to do that there.
A slight refinement of mu is too short's answer:
h = Hash[h.to_a.shuffle]
Just add a to_a and Hash[] to your array version to get a Hash version:
h = Hash[h.to_a.sample(h.length)]
For example:
>> h = { 1 => 'a', 2 => 'b', 3 => 'c' }
=> {1=>"a", 2=>"b", 3=>"c"}
>> h = Hash[h.to_a.sample(h.length)]
=> {2=>"b", 1=>"a", 3=>"c"}
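On Ruby 2.1 or later, Array#to_h can stand in for Hash[], and shuffle accepts a random: option if a repeatable permutation is needed. The sketch below assumes such a Ruby version; the seed 42 is arbitrary.

```ruby
h = { 1 => 'a', 2 => 'b', 3 => 'c' }

# Same pairs, random insertion order.
shuffled = h.to_a.shuffle.to_h

# Passing a seeded generator makes the permutation repeatable,
# which can be handy in tests.
repeatable = h.to_a.shuffle(random: Random.new(42)).to_h
again      = h.to_a.shuffle(random: Random.new(42)).to_h

shuffled == h                      #=> true (Hash#== ignores order)
repeatable.keys == again.keys      #=> true (same seed, same order)
```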
Do you really need to shuffle the hash, or do you just need a way to access or iterate over its entries in random order?
If the latter, a possibly less expensive solution is to shuffle the hash keys and access your items through that permutation of the keys:
h = your_hash
shuffled_hash_keys = h.keys.shuffle
shuffled_hash_keys.each do |key|
# do something with h[key]
end
I believe (but would need a proof with a benchmark) that this avoids the need/cost to build a brand new hash and is probably more efficient if you have big hashes (you only need to pay the cost of an array permutation)
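A rough way to check that belief is sketched below using the standard Benchmark module. The hash size of 100,000 is an arbitrary assumption, Enumerable#to_h with a block needs Ruby 2.6+, and the timings will vary by machine and Ruby version, so treat the numbers as indicative only.

```ruby
require 'benchmark'

# Arbitrary test data: 100_000 integer keys with string values.
big = (1..100_000).to_h { |i| [i, i.to_s] }

# Approach 1: build a brand new shuffled hash, then iterate it.
rebuild = Benchmark.realtime do
  big.to_a.shuffle.to_h.each { |_key, _value| }
end

# Approach 2: shuffle only the keys and look values up in the original hash.
keys_only = Benchmark.realtime do
  big.keys.shuffle.each { |key| big[key] }
end

puts format('rebuild hash: %.4fs', rebuild)
puts format('shuffle keys: %.4fs', keys_only)
```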
