Sort hash by length of values (descending) - ruby

I'm having problems sorting a hash by the length of the array values in descending order. I have the following hash:
hash = {
"1" => [0,3],
"2" => [0,2],
"3" => [1,2,3,4],
"4" => [1,8,7,6,5],
"5" => [7,8],
"10" => [5]
}
I want to sort it to be in this order: 4,3,1,2,5,10.
hash.sort_by {|k,v| v.length}.reverse
What am I not doing right? Any ideas?

It seems you are looking for Enumerable#sort_by, like this. (As a note, on Ruby 2.1+ this could be hash.sort_by {|_,v| -v.length}.to_h. I used Hash[] because of its compatibility with older versions.)
Hash[hash.sort_by {|_,v| -v.length}]
#=>
# {
# "4"=>[1, 8, 7, 6, 5],
# "3"=>[1, 2, 3, 4],
# "1"=>[0, 3],
# "2"=>[0, 2],
# "5"=>[7, 8],
# "10"=>[5]
# }
Sorting a Hash using Enumerable#sort_by will return an associative array of [[key,value],[key,value],...] when called with a block (otherwise it returns an Enumerator). Since Hash understands associative Array structure, you can easily turn this back into a Hash by calling associative_array.to_h (Ruby >= 2.1) or Hash[associative_array] (for all Ruby Versions).
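One caveat worth adding: "1", "2" and "5" all hold two-element arrays, and the Ruby documentation does not guarantee that sort_by is stable, so the relative order of those ties is not guaranteed by the length alone. A minimal sketch that breaks ties by original position, assuming Ruby 2.1+ for to_h (sort_by_value_length_desc is a made-up helper name, not something from the question):

```ruby
# Hypothetical helper: sort a hash by the length of its array values,
# longest first, keeping the original order for values of equal length.
def sort_by_value_length_desc(hash)
  # The index is used as a secondary key, so ties keep insertion order.
  hash.sort_by.with_index { |(_key, value), index| [-value.length, index] }.to_h
end

hash = {
  "1" => [0, 3],
  "2" => [0, 2],
  "3" => [1, 2, 3, 4],
  "4" => [1, 8, 7, 6, 5],
  "5" => [7, 8],
  "10" => [5]
}

p sort_by_value_length_desc(hash).keys
# => ["4", "3", "1", "2", "5", "10"]
```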

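You cannot sort a hash in place - that might be causing your confusion. There is no Hash#sort!, and sort_by returns a new array of [key, value] pairs instead of reordering the hash itself. (Since Ruby 1.9 a hash does preserve insertion order, so a hash rebuilt from the sorted pairs will iterate in that order.)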
You can, however, iterate over a hash in a certain order, e.g.
hash.sort_by {|k,v| v.length}.reverse.each do |k, v|
puts "k = #{k}, v = #{v}"
end
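To make the distinction concrete, here is a small sketch, assuming Ruby 1.9 or later (where Hash preserves insertion order) and 2.1+ for Array#to_h. The helper name pairs_by_length_desc is only for illustration. It shows that sort_by leaves the receiver untouched and returns an array of pairs:

```ruby
# Hypothetical helper: returns sorted [key, value] pairs, longest value first.
def pairs_by_length_desc(hash)
  hash.sort_by { |_key, value| -value.length }
end

hash  = { "1" => [0, 3], "3" => [1, 2, 3, 4], "10" => [5] }
pairs = pairs_by_length_desc(hash)

p pairs.class      # => Array
p hash.keys        # => ["1", "3", "10"] (the original hash is unchanged)
p pairs.to_h.keys  # => ["3", "1", "10"]
```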

Related

How does `Hash#sort{|a, b| block}` work?

In ruby-doc I see this example:
h = { "a" => 20, "b" => 30, "c" => 10 }
h.sort {|a,b| a[1]<=>b[1]} #=> [["c", 10], ["a", 20], ["b", 30]]
Can anyone explain what a[1]<=>b[1] means? What are we comparing here? Is a a key and b its value? Why are we comparing index 1?
a and b are both arrays of [key, value] which come from Hash#sort.
Converts hsh to a nested array of [ key, value ] arrays and sorts it, using Array#sort.
So a[1]<=>b[1] sorts the resulting pairs by the value. If it were a[0]<=>b[0] it would be sorting by the key.
Ruby doesn't have a key-value-pair or tuple datatype, so all Hash iteration methods (each, map, select, sort, …) represent hash entries as an Array with two elements, [key, value]. (In fact, most methods aren't even implemented in Hash, they are inherited from Enumerable and don't even know anything about keys and values.)
Meditate on this:
h = { "a" => 20, "b" => 30, "c" => 10 }
h.sort { |a,b| # => {"a"=>20, "b"=>30, "c"=>10}
a[1]<=>b[1]
} # => [["c", 10], ["a", 20], ["b", 30]]
For each loop of the key/value pairs in h, Ruby passes each key/value pair into the block as an array of two elements.
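Because each block argument is a two-element [key, value] array, the block parameters can also be destructured, which avoids the a[1]/b[1] indexing. A short sketch of three equivalent spellings (the method names are made up for illustration):

```ruby
# Index into each [key, value] pair, as in the ruby-doc example.
def sort_by_index(h)
  h.sort { |a, b| a[1] <=> b[1] }
end

# Destructure each pair in the block parameters instead.
def sort_destructured(h)
  h.sort { |(_k1, v1), (_k2, v2)| v1 <=> v2 }
end

# Let sort_by do the comparison on the value alone.
def sort_with_sort_by(h)
  h.sort_by { |_key, value| value }
end

h = { "a" => 20, "b" => 30, "c" => 10 }
p sort_by_index(h)      # => [["c", 10], ["a", 20], ["b", 30]]
p sort_destructured(h)  # => [["c", 10], ["a", 20], ["b", 30]]
p sort_with_sort_by(h)  # => [["c", 10], ["a", 20], ["b", 30]]
```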

How to preserve alphabetical order of keys when sorting a hash by the value

beginner here. My first question. Go easy on me.
Given the following hash:
pets_ages = {"Eric" => 6, "Harry" => 3, "Georgie" => 12, "Bogart" => 4, "Poly" => 4,
"Annie" => 1, "Dot" => 3}
and running the following method:
pets_ages.sort {|x, y| x[1] <=> y[1]}.to_h
the following is returned:
{
"Annie" => 1,
"Dot" => 3,
"Harry" => 3,
"Poly" => 4,
"Bogart" => 4,
"Eric" => 6,
"Georgie" => 12
}
You will notice the hash is nicely sorted by the value, as intended. What I'd like to change is the ordering of the keys, so that they remain alphabetical in the case of a tie. Notice "Dot" and "Harry" are correct in that regard, but for some reason "Poly" and "Bogart" are not. My theory is that it is automatically sorting the keys by length in the case of a tie, and not alphabetically. How can I change that?
In many languages, Hashes/Dicts aren't ordered, because of how they are implemented under the covers. Ruby 1.9+ is nice enough to guarantee insertion ordering.
You can do this in a single pass - Ruby allows you to sort by arbitrary criteria.
# Given
pets_ages = {"Eric" => 6, "Harry" => 3, "Georgie" => 12, "Bogart" => 4, "Poly" => 4, "Annie" => 1, "Dot" => 3}
# Sort pets by the criteria of "If ages are equal, sort by name, else, sort by age"
pets_ages.sort {|(n1, a1), (n2, a2)| a1 == a2 ? n1 <=> n2 : a1 <=> a2 }.to_h
# => {"Annie"=>1, "Dot"=>3, "Harry"=>3, "Bogart"=>4, "Poly"=>4, "Eric"=>6, "Georgie"=>12}
Hash#sort will return an array of [k, v] pairs, but those k, v pairs can be sorted by any criteria you want in a single pass. Once we have the sorted pairs, we turn it back into a Hash with Array#to_h (Ruby 2.1+), or you can use Hash[sorted_result] in earlier versions, as Beartech points out.
You could get as complex as you want in the sort block; if you're familiar with JavaScript sorting, Ruby works the same way here. The <=> method returns -1, 0, or 1 depending on how the objects compare to each other. #sort just expects one of those return values, which tells it how the two given values relate to each other. You don't even have to use <=> at all if you don't want to - something like this is equivalent to the more compact form:
pets_ages.sort do |a, b|
  if a[1] == b[1]
    if a[0] > b[0]
      1
    elsif a[0] < b[0]
      -1
    else
      0
    end
  else
    if a[1] > b[1]
      1
    elsif a[1] < b[1]
      -1
    end
  end
end
As you can see, as long as you always return something in the set (-1, 0, 1), your sort function can do whatever you want, so you can compose them however you'd like. However, such verbose forms are practically never necessary in Ruby, because of the super handy <=> operator!
As Stefan points out, though, you have a BIG shortcut here: Array#<=> is nice enough to compare each entry between the compared arrays. This means that we can do something like:
pets_ages.sort {|a, b| a.reverse <=> b.reverse }.to_h
This takes each [k, v] pair, reverses it into [v, k], and uses Array#<=> to compare it. Since you need to perform this same operation on each [k, v] pair compared, you can shortcut it even further with #sort_by
pets_ages.sort_by {|k, v| [v, k] }.to_h
What this does is for each hash entry, it passes the key and value to the block, and the return result of the block is what is used to compare this [k, v] pair to other entries. Since comparing [v, k] to another [v, k] pair will give us the result we want, we just return an array consisting of [v, k], which sort_by collects and sorts the original [k, v] pairs by.
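To see why returning [v, k] works, it may help to look at Array#<=> directly: it compares element by element and stops at the first difference. The same trick extends to mixed directions by negating a numeric key. A small sketch (oldest_first is a hypothetical helper, not part of the question):

```ruby
p([3, "Dot"] <=> [3, "Harry"])     # => -1 (ages tie, so the names decide)
p([3, "Harry"] <=> [4, "Bogart"])  # => -1 (3 < 4, the names are never compared)

# Hypothetical helper: oldest first, names alphabetical within the same age.
def oldest_first(pets_ages)
  pets_ages.sort_by { |name, age| [-age, name] }.to_h
end

pets_ages = { "Eric" => 6, "Harry" => 3, "Georgie" => 12, "Bogart" => 4,
              "Poly" => 4, "Annie" => 1, "Dot" => 3 }

p oldest_first(pets_ages).keys
# => ["Georgie", "Eric", "Bogart", "Poly", "Dot", "Harry", "Annie"]
```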
As Philip pointed out, hashes were not meant to preserve order, though I think in the latest Ruby they might. But let's say they don't. Here's an array based solution that could then be re-hashed:
Edit: here it is as a one-liner:
new_pets_ages = Hash[pets_ages.sort.sort_by {|a| a[1]}]
previous answer:
pets_ages = {"Eric" => 6, "Harry" => 3, "Georgie" => 12, "Bogart" => 4, "Poly" => 4,
"Annie" => 1, "Dot" => 3}
arr = pets_ages.sort
# [["Annie", 1], ["Bogart", 4], ["Dot", 3], ["Eric", 6], ["Georgie", 12],
# ["Harry", 3], ["Poly", 4]]
new_arr = arr.sort_by {|a| a[1]}
#[["Annie", 1], ["Dot", 3], ["Harry", 3], ["Bogart", 4], ["Poly", 4], ["Eric", 6],
# ["Georgie", 12]]
And finally to get a hash back:
h = Hash[new_arr]
#{"Annie"=>1, "Dot"=>3, "Harry"=>3, "Bogart"=>4, "Poly"=>4, "Eric"=>6,
# "Georgie"=>12}
So when we sort a hash, it gives us an array of arrays with the items sorted by the original keys. Then we sort that array of arrays by the second value of each. Note that this relies on the second sort being stable, that is, on tied elements keeping their existing order; the Ruby documentation does not guarantee this for sort_by, so the order of ties can vary. Then we can send it back to a hash. I'm sure there's a trick way to do a two-pass sort in one line but this seems pretty simple.
You already know the Ruby sorting methods you have used, so I won't explain them in detail; here is a simple one-liner instead:
pets_ages.sort.sort_by{|pets| pets[1]}.to_h

Sort hash by order of keys in secondary array

I have a hash:
hash = {"a" => 1, "b" =>2, "c" => 3, "d" => 4}
And I have an array:
array = ["b", "a", "d"]
I would like to create a new array that is made up of the original hash values that correspond with original hash keys that are also found in the original array while maintaining the sequence of the original array. The desired array being:
desired_array = [2, 1, 3]
The idea here is to take the word "bad", assign numbers to the alphabet, and then make an array of the numbers that correspond with "b" "a" and "d" in that order.
Since your question is a little unclear I'm assuming you want desired_array to be an array (you say you want a new array and finish the sentence off with new hash). Also in your example I'm assuming you want desired_array to be [2, 1, 4] for ['b', 'a', 'd'] and not [2, 1, 3] for ['b', 'a', 'c'].
You should just use the Enumerable#map method to create an array that maps the first array to your desired array, like so:
desired_array = array.map { |k| hash[k] }
You should familiarize yourself with the Enumerable#map method, it's quite the handy method. From the rubydocs for the method: Returns a new array with the results of running block once for every element in enum. So in this case we are iterating through array and invoking hash[k] to select the value from the hash and creating a new array with values selected by the hash. Since iteration is in order, you will maintain the original sequence.
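One detail the map approach leaves open is what happens when the array contains a key the hash does not have: hash[k] quietly returns nil. If that should be an error instead, Hash#fetch can be used. A minimal sketch, where values_in_order is a made-up helper name:

```ruby
# Hypothetical helper: look up each key in order, raising KeyError
# if a key is missing from the hash.
def values_in_order(hash, keys)
  keys.map { |key| hash.fetch(key) }
end

hash = { "a" => 1, "b" => 2, "c" => 3, "d" => 4 }

p values_in_order(hash, ["b", "a", "d"])  # => [2, 1, 4]
p(["b", "z"].map { |k| hash[k] })         # => [2, nil]
```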
I would use Enumerable#map followed by Enumerable#sort_by, for example:
hash = {"d" => 4, "b" =>2, "c" => 3, "a" => 1}
order = ["b", "a", "d"]
# For each key in order, create a [key, value] pair from the hash.
# (Doing it this way instead of filtering the hash.to_a is O(n) vs O(n^2) without
# an additional hash-probe mapping. It also feels more natural.)
selected_pairs = order.map {|v| [v, hash[v]]}
# For each pair create a surrogate ordering based on the `order`-index
# (The surrogate value is only computed once, not each sort-compare step.
# This is, however, an O(n^2) operation on-top of the sort.)
sorted = selected_pairs.sort_by {|p| order.find_index(p[0]) }
p sorted
# sorted =>
# [["b", 2], ["a", 1], ["d", 4]]
I've not turned the result back into a Hash, because I am of the belief that hashes should not be treated as having any sort of order, except for debugging aids. (Do keep in mind that Ruby 2 hashes are ordered-by-insertion.)
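If the O(n^2) cost of calling find_index inside sort_by matters, one option is to precompute each key's position once. This is only a sketch under that assumption; sort_pairs_by_order is a hypothetical helper, and Enumerable#to_h needs Ruby 2.1+:

```ruby
# Hypothetical helper: sort [key, value] pairs by each key's position in `order`.
def sort_pairs_by_order(pairs, order)
  # Build a key => position lookup once, e.g. {"b"=>0, "a"=>1, "d"=>2}.
  position = order.each_with_index.to_h
  pairs.sort_by { |key, _value| position.fetch(key) }
end

hash  = { "d" => 4, "b" => 2, "c" => 3, "a" => 1 }
order = ["b", "a", "d"]

# Keep only the pairs whose key appears in `order` (still in hash order here).
pairs = hash.to_a.select { |key, _value| order.include?(key) }

p sort_pairs_by_order(pairs, order)
# => [["b", 2], ["a", 1], ["d", 4]]
```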
All you need is values_at:
hash.values_at(*array)
Enumerable#map works perfectly here:
desired_array = array.map { |k| hash[k] }
Note that each is not a substitute: array.each { |k| hash[k] } discards the results of the block and returns array itself, not the looked-up values.
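The difference between the two return values is easy to check. A quick sketch (both method names are made up for the comparison):

```ruby
# Hypothetical helpers wrapping the two calls so their results can be compared.
def lookup_with_map(hash, keys)
  keys.map { |k| hash[k] }   # collects the block's results into a new array
end

def lookup_with_each(hash, keys)
  keys.each { |k| hash[k] }  # ignores the block's results, returns `keys`
end

hash  = { "a" => 1, "b" => 2, "c" => 3, "d" => 4 }
array = ["b", "a", "d"]

p lookup_with_map(hash, array)   # => [2, 1, 4]
p lookup_with_each(hash, array)  # => ["b", "a", "d"]
```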

Having trouble with the sort method

I am building a histogram based on the number of words in a text file. I have an array of hashes whose keys are the words and whose values are the number of times the word appears per line. I need to use the sort method on this array of hashes to sort the values in order from the most frequent word to the least. This is what my sort line looks like:
twoOfArray.sort { |k, v| v <=> k }
twoOfArray.each { |key, value| puts "#{key} occurs #{value} times" "\n"}
Full code is here. If I use the sort! method, I get an undefined method error. Does anyone know why?
I would convert your data structure (an array of hashes) into just one large hash. If you want to sort the words, there's no reason to have them in separate hashes.
Then, if your hash is something like {'the' => 5, 'and' => 23, 'beer' => 2} you can sort via:
> h = {'the' => 5, 'and' => 23, 'beer' => 2}
> a = h.sort {|a, b| b[1] <=> a[1] } # sort converts a hash into an array of arrays.
> a
#=> [['and', 23], ['the', 5], ['beer', 2]]
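The first step, collapsing the array of per-line hashes into one hash, could look something like this. It is a sketch that assumes each element maps a word to its count on one line, as described in the question; total_counts and the sample data are made up:

```ruby
# Hypothetical helper: combine per-line word counts into one hash of totals.
def total_counts(line_counts)
  # Hash.new(0) makes missing words start at zero.
  line_counts.each_with_object(Hash.new(0)) do |counts, totals|
    counts.each { |word, n| totals[word] += n }
  end
end

line_counts = [
  { "the" => 2, "beer" => 1 },
  { "the" => 3, "and" => 23, "beer" => 1 }
]

totals = total_counts(line_counts)

# Most frequent word first.
totals.sort_by { |_word, n| -n }.each do |word, n|
  puts "#{word} occurs #{n} times"
end
# and occurs 23 times
# the occurs 5 times
# beer occurs 2 times
```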

Swap hash keys with values and convert keys to symbols in Ruby?

This is the input hash:
p Score.periods # => {"q1"=>0, "q2"=>1, "q3"=>2, "q4"=>3, "h1"=>4, "h2"=>5}
This is my current code to exchange the keys with the values, while converting the keys to symbols:
periods = Score.periods.inject({}) do |hsh,(k,v)|
hsh[v] = k.to_sym
hsh
end
Here is the result:
p periods # => {0=>:q1, 1=>:q2, 2=>:q3, 3=>:q4, 4=>:h1, 5=>:h2}
It just seems like my code is clunky and it shouldn't take 4 lines to do what I'm doing here. Is there a cleaner way to write this?
You can do this:
Hash[periods.values.zip(periods.keys.map(&:to_sym))]
Or if you're using a version of Ruby where to_h is available for arrays, you can do this:
periods.values.zip(periods.keys.map(&:to_sym)).to_h
What the two examples above do is make arrays of the keys and values of the original hash. Note that the string keys of the hash are mapped to symbols by passing to_sym to map as a Proc:
periods.keys.map(&:to_sym)
# => [:q1, :q2, :q3, :q4, :h1, :h2]
periods.values
# => [0, 1, 2, 3, 4, 5]
Then it zips them up into an array of [value, key] pairs, where each element of values is matched with its corresponding key in keys:
periods.values.zip(periods.keys.map(&:to_sym))
# => [[0, :q1], [1, :q2], [2, :q3], [3, :q4], [4, :h1], [5, :h2]]
Then that array can be converted back into a hash using Hash[array] or array.to_h.
The simplest way is:
data = {"q1"=>0, "q2"=>1, "q3"=>2, "q4"=>3, "h1"=>4, "h2"=>5}
Hash[data.invert.collect { |k, v| [ k, v.to_sym ] }]
The Hash[] method converts an array of key/value pairs into an actual Hash. Quite handy for situations like this.
If you're using Ruby on Rails this could be even easier:
data.symbolize_keys.invert
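On newer Rubies there are also plain-Ruby spellings that need neither Hash[] nor Rails. A sketch assuming Ruby 2.4+ for Hash#transform_values and Ruby 2.6+ for to_h with a block (both helper names are hypothetical):

```ruby
# Hypothetical helper (Ruby 2.4+): swap keys and values, then symbolize
# the new values.
def swap_and_symbolize(hash)
  hash.invert.transform_values(&:to_sym)
end

# Hypothetical helper (Ruby 2.6+): build the new pairs in a single pass.
def swap_and_symbolize_single_pass(hash)
  hash.to_h { |key, value| [value, key.to_sym] }
end

data = { "q1" => 0, "q2" => 1, "q3" => 2, "q4" => 3, "h1" => 4, "h2" => 5 }

p swap_and_symbolize(data)
# => {0=>:q1, 1=>:q2, 2=>:q3, 3=>:q4, 4=>:h1, 5=>:h2}
p swap_and_symbolize_single_pass(data) == swap_and_symbolize(data)
# => true
```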
h = {"q1"=>0, "q2"=>1, "q3"=>2, "q4"=>3, "h1"=>4, "h2"=>5}
h.each_with_object({}) { |(k,v),g| g[v] = k.to_sym }
#=> {0=>:q1, 1=>:q2, 2=>:q3, 3=>:q4, 4=>:h1, 5=>:h2}
The steps are as follows (for the benefit of Ruby newbies).
enum = h.each_with_object({})
#=> #<Enumerator: {"q1"=>0, "q2"=>1, "q3"=>2, "q4"=>3,
# "h1"=>4, "h2"=>5}:each_with_object({})>
The elements that will be generated by the enumerator and passed to the block can be seen by converting the enumerator to an array, using Enumerable#entries or Enumerable#to_a.
enum.entries
#=> [[["q1", 0], {}], [["q2", 1], {}], [["q3", 2], {}],
# [["q4", 3], {}], [["h1", 4], {}], [["h2", 5], {}]]
Continuing,
enum.each { |(k,v),g| g[v] = k.to_sym }
#=> {0=>:q1, 1=>:q2, 2=>:q3, 3=>:q4, 4=>:h1, 5=>:h2}
In the last step, Enumerator#each passes the first element generated by enum to the block and assigns the three block variables. Consider the first element of enum that is passed to the block and the associated calculation of values for the three block variables. (I must first execute enum.rewind to reinitialize enum, as each above took the enumerator to its end. See Enumerator#rewind).
(k, v), g = enum.next
#=> [["q1", 0], {}]
k #=> "q1"
v #=> 0
g #=> {}
See Enumerator#next. The block calculation is therefore
g[v] = k.to_sym
#=> :q1
Hence,
g #=> {0=>:q1}
The next element of enum is passed to the block and similar calculations are performed.
(k, v), g = enum.next
#=> [["q2", 1], {0=>:q1}]
k #=> "q2"
v #=> 1
g #=> {0=>:q1}
g[v] = k.to_sym
#=> :q2
g #=> {0=>:q1, 1=>:q2}
The remaining calculations are similar.
