Compare 2 Hash with values consisting of array - ruby

I have 2 hashes, let's say A, B
A: { 'key1' => [a, b], 'key2' => 'c' }
B: { 'key1' => [b, a], 'key2' => 'c' }
What is the best possible way to compare these 2 hashes. The ordering of the array contents does not matter. So in my case, hash A and B are equal

It's not as easy as it seems at first glance.
It is necessary to take into account several nuances:
the number of elements in the hashes may not match;
items with the same key in two hashes can be of different types.
A relatively universal solution can be as follows:
def hashes_comp(hash1, hash2)
return false if hash1.size != hash2.size
hash1.each do |key, value|
if value.class == Array
return false if hash2[key].class != Array || value.sort != hash2[key].sort
else
return false if value != hash2[key]
end
end
true
end
hash_a = {'key1' => ['a', 'b'], 'key2' => 'c'}
hash_b = {'key1' => ['b', 'a'], 'key2' => 'c'}
hash_c = {'key1' => ['a', 'c'], 'key2' => 'c'}
hash_d = {'key1' => ['a', 'b'], 'key2' => 'd'}
hash_e = {'key1' => ['a', 'b'], 'key2' => ['a', 'b']}
hash_f = {'key1' => ['a', 'b'], 'key2' => 'c', 'key3' => 'd'}
hashes_comp(hash_a, hash_b) #=> true
hashes_comp(hash_a, hash_c) #=> false
hashes_comp(hash_a, hash_d) #=> false
hashes_comp(hash_a, hash_e) #=> false
hashes_comp(hash_a, hash_f) #=> false

One can sort the arrays but that can be an expensive operation if the arrays are large. If n equals the size of the array, the time complexity of heapsort, for example, is O(n log(n)). It's faster to replace arrays with counting hashes, the construction of which enjoys a time complexity of O(n).
h1 = { 'k1' => [1, 2, 1, 3, 2, 1], 'k2' => 'c' }
h2 = { 'k1' => [3, 2, 1, 2, 1, 1], 'k2' => 'c' }
def same?(h1, h2)
return false unless h1.size == h2.size
h1.all? do |k,v|
if h2.key?(k)
vo = h2[k]
if v.is_a?(Array)
if vo.is_a?(Array)
convert(v) == convert(vo)
end
else
v == vo
end
end
end
end
def convert(arr)
arr.each_with_object(Hash.new(0)) { |e,g| g[e] += 1 }
end
same?(h1, h2)
#=> true
Here
convert([1, 2, 1, 3, 2, 1])
#=> {1=>3, 2=>2, 3=>1}
convert([3, 2, 1, 2, 1, 1])
#=> {3=>1, 2=>2, 1=>3}
and
{1=>3, 2=>2, 3=>1} == {3=>1, 2=>2, 1=>3}
#=> true
See Hash::new, specifically the case where the method takes an argument that equals the default value.
The guard clause return false unless h1.size == h2.size is to ensure that h2 does not have keys that are not present in h1. Note that the following returns the falsy value nil:
if false
#...
end
#=> nil
In a couple of places I've written that rather than the more verbose but equivalent expresion
if false
#...
else
nil
end

I would definitely agree with Ivan it's not as easy as it initially seems but I figured I would try doing it with recursion. This has the added benefit of being able to compare beyond just hashes.
hash_a = {'key1' => ['a', 'b'], 'key2' => 'c'}
hash_b = {'key1' => ['b', 'a'], 'key2' => 'c'}
hash_c = {'key1' => ['a', 'c'], 'key2' => 'c'}
hash_d = {'key1' => ['a', 'b'], 'key2' => 'd'}
hash_e = {'key1' => ['a', 'b'], 'key2' => ['a', 'b']}
hash_f = {'key1' => ['a', 'b'], 'key2' => 'c', 'key3' => 'd'}
def recursive_compare(one, two)
unless one.class == two.class
return false
end
match = false
# If it's not an Array or Hash...
unless one.class == Hash || one.class == Array
return one == two
end
# If they're both Hashes...
if one.class == Hash
one.each_key do |k|
match = two.key? k
break unless match
end
two.each_key do |k|
match = one.key? k
break unless match
end
if match
one.each do |k, v|
match = recursive_compare(v, two[k])
break unless match
end
end
end
# If they're both Arrays...
if one.class == Array
one.each do |v|
match = two.include? v
break unless match
end
two.each do |v|
match = one.include? v
break unless match
end
end
match
end
puts recursive_compare(hash_a, hash_b) #=> true
puts recursive_compare(hash_a, hash_c) #=> false
puts recursive_compare(hash_a, hash_d) #=> false
puts recursive_compare(hash_a, hash_e) #=> false
puts recursive_compare(hash_a, hash_f) #=> false

I came up with this solution:
def are_equals?(a, b)
(a.keys.sort == b.keys.sort) &&
a.merge(b) { |k, o_val, n_val| [o_val, n_val].all? { |e| e.kind_of? Array} ? o_val.sort == n_val.sort : o_val == n_val }.values.all?
end
How it works.
The first part tests for key equality, using Hash#keys, which returns the array of keys, sorted of course:
a.keys.sort == b.keys.sort
For the second part I used Hash#merge to compare values related to the same key, and can be expanded in this way:
res = a.merge(b) do |k, o_val, n_val|
if [o_val, n_val].all? { |e| e.kind_of? Array}
o_val.sort == n_val.sort
else
o_val == n_val
end
end
#=> {"key1"=>true, "key2"=>true}
It returns a Hash where values are true or false, then checks if all values are true using Enumerable#all?:
res.values.all?
#=> [true, true].all? => true

Related

Ruby: Adding different array values into hash for the same key

I'm trying to add different values into an array for the same key into the hash. Instead of making a new instance in the array, my function sum ups the index values of the array elements
def dupe_indices(array)
hash = Hash.new([])
array.each.with_index { |ele, idx| hash[ele] = (idx) }
hash
end
I'm getting this
print dupe_indices(['a', 'b', 'c', 'a', 'c']) => {"a"=>3, "b"=>1,
"c"=>4}
Expected output
print dupe_indices(['a', 'b', 'c', 'a', 'c']) => { 'a' => [0, 3], 'b'
=> [1], 'c' => [2, 4] }
With two small modifications your code willl work.
change hash = Hash.new([]) to hash = Hash.new { |h,k| h[k] = [] }
you should really never use Hash.new([]), see this article for an explanation: https://mensfeld.pl/2016/09/ruby-hash-default-value-be-cautious-when-you-use-it/
Change hash[ele] = (idx) to hash[ele].push(idx)
you don't want to replace the value whenever you encounter a new index, you want to push it to the array.
array = ['a', 'b', 'c', 'a', 'c']
def dupe_indices(array)
hash = Hash.new { |h,k| h[k] = [] }
array.each.with_index { |ele, idx| hash[ele].push(idx) }
hash
end
dupe_indices(array)
# => {"a"=>[0, 3], "b"=>[1], "c"=>[2, 4]}

Compare values from multiple hashes within an array

I know for-loops should be avoided and i guess the better way to iterate through an array is instead of doing
for i in 0..array.size-1 do
puts array[i]
end
do
array.each{ |x|
puts x
}
But what if i had an array of hashes like
array = [{:value0 => 1, :value1 => 0}, {:value0 => 2, :value1 => 1}, {:value0 => 1, :value1 => 2}]
and wanted to check if :value0 is unique in all hashes.. intuitively I would do something like
for i in 0..array.size-1 do
_value_buffer = array[i][:value0]
for j in i+1..array.size-1 do
if _value_buffer == array[j][:value0]
puts "whatever"
end
end
end
Is there a better way to do this?
Why not just get all the values in question and see if they’re unique?
!array.map { |h| h[:value0] }.uniq!
(uniq! returns nil when there are no duplicates)
As what Andrew Marshall says, you can use uniq!, but with a more concise way:
!array.uniq!{|a| a[:value0]}
Here is how I would do it:
2.0.0p195 :001 > array = [{:value0 => 1, :value2 => 0}, {:value0 => 2, :value2 => 1}, {:value0 => 1, :value2 => 2}]
=> [{:value0=>1, :value2=>0}, {:value0=>2, :value2=>1}, {:value0=>1, :value2=>2}]
2.0.0p195 :002 > val0 = array.map { |hash| hash[:value0] }
=> [1, 2, 1]
2.0.0p195 :003 > puts val0.uniq == val0
false
=> nil
I would collect the values of :value0 and then compare them to the array of unique values.

How do I map an array of hashes?

I have an array of hashes:
arr = [ {:a => 1, :b => 2}, {:a => 3, :b => 4} ]
What I want to achieve is:
arr.map{|x| x[:a]}.reduce(:+)
but I think it's a bit ugly, or at least not that elegant as:
arr.map(&:a).reduce(:+)
The later one is wrong because there is no method called a in the hashes.
Are there any better ways to write map{|x| x[:a]}?
You could make actual Objects, possibly with a Struct:
MyClass = Struct.new :a, :b
arr = [MyClass.new(1, 2), MyClass.new(3, 4)]
arr.map(&:a).reduce(:+) #=> 4
Or for more flexibility, an OpenStruct:
require 'ostruct'
arr = [OpenStruct.new(a: 1, b: 2), OpenStruct.new(a: 3, b: 4)]
arr.map(&:a).reduce(:+) #=> 4
Of course either of these can be constructed from existing hashes:
arr = [{ :a => 1, :b => 2 }, { :a => 3, :b => 4 }]
ss = arr.map { |h| h.values_at :a, :b }.map { |attrs| MyClass.new(*attrs) }
ss.map(&:a).reduce(:+) #=> 4
oss = arr.map { |attrs| OpenStruct.new attrs }
oss.map(&:a).reduce(:+) #=> 4
Or, for a more creative, functional approach:
def hash_accessor attr; ->(hash) { hash[attr] }; end
arr = [{ :a => 1, :b => 2 }, { :a => 3, :b => 4 }]
arr.map(&hash_accessor(:a)).reduce(:+) #=> 4
It is unclear what you mean as "better" and why you think the correct version is ugly.
Do you like this "better"?
arr.inject(0) { |sum, h| sum + h[:a] }
There's a way, extending the Symbol.
lib/core_extensions/symbol.rb (credit goes here)
# frozen_string_literal: true
class Symbol
def with(*args, &)
->(caller, *rest) { caller.send(self, *rest, *args, &) }
end
end
Then, given:
arr = [ {:a => 1, :b => 2}, {:a => 3, :b => 4} ]
you can do this:
arr.map(&:[].with(:a)).reduce(:+)
Explanation: to access hash value under any key, you call Hash#[] method. When passed as a :[] (extended) symbol to the Array#map, you can then call .with(*args) on this symbol, effectively passing the parameter (hash key) down to the :[] method. Enjoy.

Find the (deep) complement of two hashes

The complement is the mathematical term for what I'm looking for, but for context and possibly more targeted solution: I have hash A, which can have nested hashes (i.e. they're N-dimensional), and I apply to it a process (over which I have no control) which returns hash B, which is hash A with some elements removed. From there on, I am trying to find the elements in A which have been removed in B.
For example: (note that I use symbols for simplicity. Keys will always be symbols, but values won't.)
a = {:a => :b,
:c => {:d => :e, :f => :g},
:h => :i,
:j => {:k => :l, :m => :n},
:o => {:p => :q, :r => :s},
:t => :u}
b = {:h => :i,
:j => {:k => :l, :m => :n},
:o => {:r => :s},
:t => :u}
complement(a,b)
#=> {:a => :b,
# :c => {:d => :e, :f => :g},
# :o => {:p => :q}}
What is the best (ruby-esque) way of doing this?
Came up with this
a = {a: "thing", b: [1,2,3], c:2}
b = {a: "thing", b: [1,2,3]}
c= {}
a.each do |k, v|
c[k] = v unless b[k]
end
p c
EDIT: Now checking nested hashes. But yes, there should be some better ruby way to do this.
def check_deleted(a, b)
c = Hash.new
a.each do |k, v|
if ! b.has_key? k
c[k] = v
elsif b[k].is_a? Hash
c[k] = check_deleted(v, b[k])
end
end
c
end
a = {a: "thing", b: [1,2,3], c:2, d: {e: 1, r:2}}
b = {a: "thing", b: [1,2,3], d: {r:2}}
p check_deleted(a,b) #=> {:c=>2, :d=>{:e=>1}}

Sorting a hash in Ruby by its value first then its key

I am trying to sort a document based on the number of times the word appears then alphabetically by the words so when it is outputted it will look something like this.
Unsorted:
'the', '6'
'we', '7'
'those', '5'
'have', '3'
Sorted:
'we', '7'
'the', '6'
'those', '5'
'have', '3'
Try this:
Assuming:
a = {
'the' => '6',
'we' => '7',
'those' => '5',
'have' => '3',
'hav' => '3',
'haven' => '3'
}
then after doing this:
b = a.sort_by { |x, y| [ -Integer(y), x ] }
b will look like this:
[
["we", "7"],
["the", "6"],
["those", "5"],
["hav", "3"],
["have", "3"],
["haven", "3"]
]
Edited to sort by reverse frequencies.
words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3}
sorted_words = words.sort { |a,b| b.last <=> a.last }
sorted_words.each { |k,v| puts "#{k} #{v}"}
produces:
we 7
the 6
those 5
have 3
You probably want the values to be integers rather than strings for comparison purposes.
EDIT
Oops, overlooked the requirement that it needs to be sorted by the key too. So:
words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3,'zoo' => 3,'foo' => 3}
sorted_words = words.sort do |a,b|
a.last == b.last ? a.first <=> b.first : b.last <=> a.last
end
sorted_words.each { |k,v| puts "#{k} #{v}"}
produces:
we 7
the 6
those 5
foo 3
have 3
zoo 3
When you use the sort method on a hash, you receive two element arrays in your comparison block, with which you can do comparisons in one pass.
hsh = { 'the' => '6', 'we' => '6', 'those' => '5', 'have' => '3'}
ary = hsh.sort do |a,b|
# a and b are two element arrays in the format [key,value]
value_comparison = a.last <=> b.last
if value_comparison.zero?
# compare keys if values are equal
a.first <=> b.first
else
value_comparison
end
end
# => [['have',3],['those',5],['the',6],['we',6]]
Note that the result is an array of arrays because hashes do not have intrinsic order in ruby
Try this:
words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3}
words.sort { |(x_k, x_v), (y_k, y_v)| [y_v, y_k] <=> [x_v, x_k]}
#=> [["we", 7], ["the", 6], ["those", 5], ["have", 3]]
histogram = { 'the' => 6, 'we' => 7, 'those' => 5, 'have' => 3, 'and' => 6 }
Hash[histogram.sort_by {|word, freq| [-freq, word] }]
# {
# 'we' => 7,
# 'and' => 6,
# 'the' => 6,
# 'those' => 5,
# 'have' => 3
# }
Note: this assumes that you use numbers to store the numbers. In your data model, you appear to use strings to store the numbers. I have no idea why you would want to do this, but if you do want to do this, you would obviously have to convert them to numbers before sorting and then back to strings.
Also, this assumes Ruby 1.9. In Ruby 1.8, hashes aren't ordered, so you cannot convert the sorted result back to a hash since that would lose the ordering information, you would have to keep it as an array.
1.9.1
>> words = {'the' => 6,'we' => 7, 'those' => 5, 'have' => 3}
=> {"the"=>6, "we"=>7, "those"=>5, "have"=>3}
>> words.sort_by{ |x| x.last }.reverse
=> [["we", 7], ["the", 6], ["those", 5], ["have", 3]]
word_counts = {
'the' => 6,
'we' => 7,
'those' => 5,
'have' => 3,
'and' => 6
};
word_counts_sorted = word_counts.sort do
|a,b|
# sort on last field descending, then first field ascending if necessary
b.last <=> a.last || a.first <=> b.first
end
puts "Unsorted\n"
word_counts.each do
|word,count|
puts word + " " + count.to_s
end
puts "\n"
puts "Sorted\n"
word_counts_sorted.each do
|word,count|
puts word + " " + count.to_s
end

Resources