Sorting a hash in Ruby by its value first then its key - ruby

I am trying to sort the words of a document by the number of times each word appears, then alphabetically by word, so that the output looks something like this.
Unsorted:
'the', '6'
'we', '7'
'those', '5'
'have', '3'
Sorted:
'we', '7'
'the', '6'
'those', '5'
'have', '3'

Try this:
Assuming:
a = {
'the' => '6',
'we' => '7',
'those' => '5',
'have' => '3',
'hav' => '3',
'haven' => '3'
}
then after doing this:
b = a.sort_by { |x, y| [ -Integer(y), x ] }
b will look like this:
[
["we", "7"],
["the", "6"],
["those", "5"],
["hav", "3"],
["have", "3"],
["haven", "3"]
]
Edited to sort by reverse frequencies.
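The sort_by block above relies on how Ruby compares arrays: element by element, stopping at the first difference. So the negated count decides first and the word only breaks ties. A minimal sketch of that comparison, using pairs built by hand from the sample data:

```ruby
# Array#<=> compares element by element and stops at the first difference.
higher_count = [-7, 'we']  <=> [-6, 'the']   # -7 < -6, so 'we' sorts first
tie_on_count = [-3, 'hav'] <=> [-3, 'have']  # counts tie, so the words decide

p higher_count  # prints -1
p tie_on_count  # prints -1
```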

words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3}
sorted_words = words.sort { |a,b| b.last <=> a.last }
sorted_words.each { |k,v| puts "#{k} #{v}"}
produces:
we 7
the 6
those 5
have 3
You probably want the values to be integers rather than strings for comparison purposes.
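To see why integers matter, here is a small sketch with a hypothetical two-digit count (not part of the question's data). Compared as text, "10" sorts before "9":

```ruby
# Hypothetical counts stored as strings, including a two-digit one.
counts = { 'we' => '10', 'the' => '9' }

by_string  = counts.sort_by { |_word, count| count }          # "10" < "9" as text
by_integer = counts.sort_by { |_word, count| Integer(count) } # 9 < 10 as numbers

p by_string   # prints [["we", "10"], ["the", "9"]]
p by_integer  # prints [["the", "9"], ["we", "10"]]
```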
EDIT
Oops, overlooked the requirement that it needs to be sorted by the key too. So:
words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3,'zoo' => 3,'foo' => 3}
sorted_words = words.sort do |a,b|
a.last == b.last ? a.first <=> b.first : b.last <=> a.last
end
sorted_words.each { |k,v| puts "#{k} #{v}"}
produces:
we 7
the 6
those 5
foo 3
have 3
zoo 3

When you use the sort method on a hash, you receive two element arrays in your comparison block, with which you can do comparisons in one pass.
hsh = { 'the' => '6', 'we' => '6', 'those' => '5', 'have' => '3'}
ary = hsh.sort do |a,b|
# a and b are two element arrays in the format [key,value]
value_comparison = a.last <=> b.last
if value_comparison.zero?
# compare keys if values are equal
a.first <=> b.first
else
value_comparison
end
end
# => [["have", "3"], ["those", "5"], ["the", "6"], ["we", "6"]]
Note that the result is an array of arrays: Hash#sort always returns an array, and before Ruby 1.9 hashes had no intrinsic order.

Try this:
words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3}
words.sort { |(x_k, x_v), (y_k, y_v)| [y_v, x_k] <=> [x_v, y_k] }
#=> [["we", 7], ["the", 6], ["those", 5], ["have", 3]]

histogram = { 'the' => 6, 'we' => 7, 'those' => 5, 'have' => 3, 'and' => 6 }
Hash[histogram.sort_by {|word, freq| [-freq, word] }]
# {
# 'we' => 7,
# 'and' => 6,
# 'the' => 6,
# 'those' => 5,
# 'have' => 3
# }
Note: this assumes that you use numbers to store the numbers. In your data model, you appear to use strings to store the numbers. I have no idea why you would want to do this, but if you do want to do this, you would obviously have to convert them to numbers before sorting and then back to strings.
Also, this assumes Ruby 1.9. In Ruby 1.8, hashes aren't ordered, so you cannot convert the sorted result back to a hash without losing the ordering information; you would have to keep it as an array.
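On Ruby 2.1 and later, Array#to_h can stand in for Hash[...] when rebuilding the hash. A minimal sketch with the same histogram, assuming a Ruby version where hashes keep insertion order:

```ruby
histogram = { 'the' => 6, 'we' => 7, 'those' => 5, 'have' => 3, 'and' => 6 }

# Sort by descending frequency, then by word, and rebuild a Hash from the pairs.
sorted = histogram.sort_by { |word, freq| [-freq, word] }.to_h

p sorted.keys  # prints ["we", "and", "the", "those", "have"]
```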

1.9.1
>> words = {'the' => 6,'we' => 7, 'those' => 5, 'have' => 3}
=> {"the"=>6, "we"=>7, "those"=>5, "have"=>3}
>> words.sort_by{ |x| x.last }.reverse
=> [["we", 7], ["the", 6], ["those", 5], ["have", 3]]

word_counts = {
'the' => 6,
'we' => 7,
'those' => 5,
'have' => 3,
'and' => 6
};
word_counts_sorted = word_counts.sort do
|a,b|
# sort on last field descending, then first field ascending if necessary
# (0 is truthy in Ruby, so nonzero? is needed to fall through to the key comparison)
(b.last <=> a.last).nonzero? || a.first <=> b.first
end
puts "Unsorted\n"
word_counts.each do
|word,count|
puts word + " " + count.to_s
end
puts "\n"
puts "Sorted\n"
word_counts_sorted.each do
|word,count|
puts word + " " + count.to_s
end

Related

Compare 2 Hash with values consisting of array

I have 2 hashes, let's say A, B
A: { 'key1' => [a, b], 'key2' => 'c' }
B: { 'key1' => [b, a], 'key2' => 'c' }
What is the best possible way to compare these 2 hashes? The ordering of the array contents does not matter, so in my case hashes A and B are equal.
It's not as easy as it seems at first glance.
It is necessary to take into account several nuances:
the number of elements in the hashes may not match;
items with the same key in two hashes can be of different types.
A relatively universal solution can be as follows:
def hashes_comp(hash1, hash2)
return false if hash1.size != hash2.size
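hash1.each do |key, value|
# a key missing from hash2 means the hashes differ, even when the value is nil
return false unless hash2.key?(key)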
if value.class == Array
return false if hash2[key].class != Array || value.sort != hash2[key].sort
else
return false if value != hash2[key]
end
end
true
end
hash_a = {'key1' => ['a', 'b'], 'key2' => 'c'}
hash_b = {'key1' => ['b', 'a'], 'key2' => 'c'}
hash_c = {'key1' => ['a', 'c'], 'key2' => 'c'}
hash_d = {'key1' => ['a', 'b'], 'key2' => 'd'}
hash_e = {'key1' => ['a', 'b'], 'key2' => ['a', 'b']}
hash_f = {'key1' => ['a', 'b'], 'key2' => 'c', 'key3' => 'd'}
hashes_comp(hash_a, hash_b) #=> true
hashes_comp(hash_a, hash_c) #=> false
hashes_comp(hash_a, hash_d) #=> false
hashes_comp(hash_a, hash_e) #=> false
hashes_comp(hash_a, hash_f) #=> false
One can sort the arrays but that can be an expensive operation if the arrays are large. If n equals the size of the array, the time complexity of heapsort, for example, is O(n log(n)). It's faster to replace arrays with counting hashes, the construction of which enjoys a time complexity of O(n).
h1 = { 'k1' => [1, 2, 1, 3, 2, 1], 'k2' => 'c' }
h2 = { 'k1' => [3, 2, 1, 2, 1, 1], 'k2' => 'c' }
def same?(h1, h2)
return false unless h1.size == h2.size
h1.all? do |k,v|
if h2.key?(k)
vo = h2[k]
if v.is_a?(Array)
if vo.is_a?(Array)
convert(v) == convert(vo)
end
else
v == vo
end
end
end
end
def convert(arr)
arr.each_with_object(Hash.new(0)) { |e,g| g[e] += 1 }
end
same?(h1, h2)
#=> true
Here
convert([1, 2, 1, 3, 2, 1])
#=> {1=>3, 2=>2, 3=>1}
convert([3, 2, 1, 2, 1, 1])
#=> {3=>1, 2=>2, 1=>3}
and
{1=>3, 2=>2, 3=>1} == {3=>1, 2=>2, 1=>3}
#=> true
See Hash::new, specifically the case where the method takes an argument that equals the default value.
The guard clause return false unless h1.size == h2.size is to ensure that h2 does not have keys that are not present in h1. Note that the following returns the falsy value nil:
if false
#...
end
#=> nil
In a couple of places I've written that rather than the more verbose but equivalent expression
if false
#...
else
nil
end
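On Ruby 2.7 and later, Enumerable#tally builds the same kind of counting hash as the convert method above, so it could replace that helper. A short sketch using the two arrays from h1 and h2:

```ruby
# Enumerable#tally (Ruby 2.7+) counts occurrences, like convert does.
a = [1, 2, 1, 3, 2, 1]
b = [3, 2, 1, 2, 1, 1]

same_elements = a.tally == b.tally   # order of insertion does not affect Hash#==

p a.tally[1]     # prints 3
p same_elements  # prints true
```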
I would definitely agree with Ivan it's not as easy as it initially seems but I figured I would try doing it with recursion. This has the added benefit of being able to compare beyond just hashes.
hash_a = {'key1' => ['a', 'b'], 'key2' => 'c'}
hash_b = {'key1' => ['b', 'a'], 'key2' => 'c'}
hash_c = {'key1' => ['a', 'c'], 'key2' => 'c'}
hash_d = {'key1' => ['a', 'b'], 'key2' => 'd'}
hash_e = {'key1' => ['a', 'b'], 'key2' => ['a', 'b']}
hash_f = {'key1' => ['a', 'b'], 'key2' => 'c', 'key3' => 'd'}
def recursive_compare(one, two)
unless one.class == two.class
return false
end
match = true # start true so that two empty hashes or two empty arrays compare as equal
# If it's not an Array or Hash...
unless one.class == Hash || one.class == Array
return one == two
end
# If they're both Hashes...
if one.class == Hash
one.each_key do |k|
match = two.key? k
break unless match
end
two.each_key do |k|
match = one.key? k
break unless match
end
if match
one.each do |k, v|
match = recursive_compare(v, two[k])
break unless match
end
end
end
# If they're both Arrays...
if one.class == Array
one.each do |v|
match = two.include? v
break unless match
end
two.each do |v|
match = one.include? v
break unless match
end
end
match
end
puts recursive_compare(hash_a, hash_b) #=> true
puts recursive_compare(hash_a, hash_c) #=> false
puts recursive_compare(hash_a, hash_d) #=> false
puts recursive_compare(hash_a, hash_e) #=> false
puts recursive_compare(hash_a, hash_f) #=> false
I came up with this solution:
def are_equals?(a, b)
(a.keys.sort == b.keys.sort) &&
a.merge(b) { |k, o_val, n_val| [o_val, n_val].all? { |e| e.kind_of? Array} ? o_val.sort == n_val.sort : o_val == n_val }.values.all?
end
How it works.
The first part tests for key equality using Hash#keys, which returns the array of keys; both arrays are sorted so that key order does not matter:
a.keys.sort == b.keys.sort
For the second part I used Hash#merge to compare values related to the same key, and can be expanded in this way:
res = a.merge(b) do |k, o_val, n_val|
if [o_val, n_val].all? { |e| e.kind_of? Array}
o_val.sort == n_val.sort
else
o_val == n_val
end
end
#=> {"key1"=>true, "key2"=>true}
It returns a Hash where values are true or false, then checks if all values are true using Enumerable#all?:
res.values.all?
#=> [true, true].all? => true

How to invert a hash, maintaining duplicate keys

From an initial hash t:
t = {"1"=>1, "2"=>2, "3"=>2, "6"=>3, "5"=>4, "4"=>1, "8"=>2, "9"=>2, "0"=>1, "7"=>1}
I need to swap the keys and values as follows:
t = {"1"=>1, "2"=>2, "3"=>2, "6"=>3, "5"=>4, "1"=>4, "8"=>2, "9"=>2, "1"=>0, "1"=>7}
While maintaining the structure of the hash (ie, without collapsing duplicate keys).
Then I'll make an array out of this hash.
Is there a way to do this? I tried this:
t.find_all{ |key,value| value == 1 } # pluck all elements with values of 1
#=> [["1", 1], ["4", 1], ["0", 1], ["7", 1]]
But it returns a new array, and the initial hash isn't changed.
The following doesn't work either:
t.invert.find_all{ |key,value| value == 1 }
#=> []
Here's a way to do this:
>> t = {"1" => 1, "2" => 2, "3" => 2, "6" => 3, "5" => 4, "4" => 1, "8" => 2, "9" => 2, "0" => 1, "7" => 1}
Hash#compare_by_identity allows for keys that are duplicates by value but unique by object id:
>> h = Hash.new.compare_by_identity
>> t.each_pair{ |k,v| h[v.to_s] = k.to_i }
The inverse hash of t:
>> h
#=> {"1" => 1, "2" => 2, "2" => 3, "3" => 6, "4" => 5, "1" => 4, "2" => 8, "2" => 9, "1" => 0, "1" => 7}
You can then use find_all to retrieve an array of elements without mutating h:
>> h.find_all{ |k,_| k == "1" }
#=> [["1", 1], ["1", 4], ["1", 0], ["1", 7]]
or keep_if to return the mutated h:
>> h.keep_if{ |k,_| k == "1" }
#=> {"1"=>1, "1"=>4, "1"=>0, "1"=>7}
>> h
#=> {"1"=>1, "1"=>4, "1"=>0, "1"=>7}
Note that this solution assumes you want to maintain the pattern of string keys and integer values in your hash. If you require integer keys, compare_by_identity won't be helpful to you.
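If the end goal is an array anyway, a different technique avoids duplicate hash keys altogether: swap each pair into an array, or group the original keys under each value. This is a sketch of that alternative, not the compare_by_identity approach above; transform_values assumes Ruby 2.4 or later.

```ruby
t = { "1" => 1, "2" => 2, "3" => 2, "6" => 3, "5" => 4,
      "4" => 1, "8" => 2, "9" => 2, "0" => 1, "7" => 1 }

# Swap each pair; an array of pairs keeps every duplicate "key".
swapped = t.map { |key, value| [value, key] }

# Or collect the original keys under each value.
grouped = t.group_by { |_key, value| value }
           .transform_values { |pairs| pairs.map(&:first) }

p swapped.first(3)  # prints [[1, "1"], [2, "2"], [2, "3"]]
p grouped[1]        # prints ["1", "4", "0", "7"]
```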

Ruby printing all elements of a hash but last

Say I have:
hash = {"a" => 1, "b" => 2, "c" => 3}
keys = hash.keys
In order to print all the keys you would do:
keys.each {|x| puts x}
My question is, how do you print all the keys BUT the last one?
I'd do as below using Enumerable#take :
hash = {"a" => 1, "b" => 2, "c" => 3}
hash.take(hash.size-1).each do |k,_|
p k
end
# >> "a"
# >> "b"
Or, as below :
hash = {"a" => 1, "b" => 2, "c" => 3}
hash.keys.take(hash.size-1) # => ["a", "b"]
puts hash.keys.take(hash.size-1)
# >> a
# >> b
update (As asked by OP - Alright now how do I print just the last element explicitly?)
hash = {"a" => 1, "b" => 2, "c" => 3}
hash.keys.last # => "c"
Hash#keys will give you all the keys of that hash as an array. So, now you can call Array#last on that array to get the last element.
hash = {"a" => 1, "b" => 2, "c" => 3}
keys = hash.keys
keys[0..-2]
#=> ["a", "b"]
The range [0..-2] selects everything from the first element up to the one before the last:
keys[0..-2].each { |x| puts x }
The last element of an array can be retrieved by calling the last method on the array:
keys.last
#=> "c"
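One difference between the approaches above shows up with an empty hash: a range slice returns an empty array, while take is handed a negative size and raises. A small sketch of that edge case:

```ruby
keys = {}.keys   # an empty hash has no keys

all_but_last = keys[0...-1]   # slicing an empty array is safe
p all_but_last                # prints []

begin
  keys.take(keys.size - 1)    # this is take(-1), which is not allowed
rescue ArgumentError => e
  puts "take failed: #{e.message}"
end
```

Guarding with something like `next if hash.empty?`, or preferring the range form, sidesteps the error.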

Ruby - extracting the unique values per key from an array of hashes

From an array of hashes like the one below, I need to extract the unique values per key:
array_of_hashes = [ {'a' => 1, 'b' => 2 , 'c' => 3} ,
{'a' => 4, 'b' => 5 , 'c' => 3},
{'a' => 6, 'b' => 5 , 'c' => 3} ]
Need to extract the unique values per key in an array
unique values for 'a' should give
[1,4,6]
unique values for 'b' should give
[2,5]
unique values for 'c' should give
[3]
Thoughts ?
Use Array#uniq:
array_of_hashes = [ {'a' => 1, 'b' => 2 , 'c' => 3} ,
{'a' => 4, 'b' => 5 , 'c' => 3},
{'a' => 6, 'b' => 5 , 'c' => 3} ]
array_of_hashes.map { |h| h['a'] }.uniq # => [1, 4, 6]
array_of_hashes.map { |h| h['b'] }.uniq # => [2, 5]
array_of_hashes.map { |h| h['c'] }.uniq # => [3]
This is more generic:
options = {}
distinct_keys = array_of_hashes.map(&:keys).flatten.uniq
distinct_keys.each do |k|
options[k] = array_of_hashes.map {|o| o[k]}.uniq
end
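The generic version can also be written as a single pass over the rows. A sketch under the same data, where unique_by_key is an illustrative name and the hash's default block creates an empty array for each new key:

```ruby
array_of_hashes = [ {'a' => 1, 'b' => 2, 'c' => 3},
                    {'a' => 4, 'b' => 5, 'c' => 3},
                    {'a' => 6, 'b' => 5, 'c' => 3} ]

# Walk every row once; array union (|) keeps only values not seen yet.
unique_by_key = array_of_hashes.each_with_object(Hash.new { |h, k| h[k] = [] }) do |row, acc|
  row.each { |key, value| acc[key] |= [value] }
end

p unique_by_key['a']  # prints [1, 4, 6]
p unique_by_key['b']  # prints [2, 5]
p unique_by_key['c']  # prints [3]
```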

How do I change a hash using new hash values?

I have these hashes:
{"a" => 1, "b" => 2, "c" => 3, "k" => 14}
{"b" => 51, "c" => 2, "d" => 8}
I need to write code, so that after manipulation, the result would be:
{"a" => 1, "b" => 51, "c" => 2, "k" => 14}
I tried:
h1.each do |h, j|
h2.each do |hh, jj|
if h == hh
j = jj
end
end
end
but it doesn't work. Also, I think this is ugly code, so how could it be written better?
I thought I should compare the two hashes and, if a key in the second hash is the same as a key in the first, change the first hash's value to the second hash's value.
Just iterate over the entries in h2 and update the corresponding entry in h1 only if it already exists:
h2.each { |k,v| h1[k]=v if h1.include?(k) }
h1 # => {"a"=>1, "b"=>51, "c"=>2, "k"=>14 }
Also, if you want to update the entries as above and also add new entries from h2 you can simply use the Hash#merge! method:
h1.merge!(h2)
h1 # => {"a"=>1, "b"=>51, "c"=>2, "k"=>14, "d"=>8}
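If h1 should stay untouched, the same update can be done without mutation. A minimal sketch, assuming the two hashes from the question are named h1 and h2: keep only the h2 entries whose keys already exist in h1, then merge into a new hash.

```ruby
h1 = {"a" => 1, "b" => 2, "c" => 3, "k" => 14}
h2 = {"b" => 51, "c" => 2, "d" => 8}

# Drop h2 entries with keys unknown to h1, then merge into a new hash.
updated = h1.merge(h2.select { |key, _value| h1.key?(key) })

p updated.keys    # prints ["a", "b", "c", "k"]
p updated.values  # prints [1, 51, 2, 14]
p h1["b"]         # prints 2, because h1 itself is unchanged
```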
