Ruby sort hash of hashes - ruby

I have a set of activities that I want to sort based on an average rating, and how many times the activity has been rated.
activities = {
:one => { :avg_rating => 5, :total_ratings => 23 },
:two => { :avg_rating => 5, :total_ratings => 18 },
:three => { :avg_rating => 5, :total_ratings => 54 }
}
EDIT - updated so the results I was expecting was correct
The result of the sort would be in order of :three, :one, :two
Thanks!

activities.sort_by { |k,v|
[v[:avg_rating], v[:total_ratings]]
}.reverse
Whether you need the element names only:
activities.sort_by { |k,v|
[v[:avg_rating], v[:total_ratings]]
}.reverse.map &:first
#⇒ [
# [0] :three,
# [1] :one,
# [2] :two
#]

Related

Convert array into hash and add a counter value to the new hash

I have the following array of hashes:
[
{"BREAD" => {:price => 1.50, :discount => true }},
{"BREAD" => {:price => 1.50, :discount => true }},
{"MARMITE" => {:price => 1.60, :discount => false}}
]
And I would like to translate this array into a hash that includes the counts for each item:
Output:
{
"BREAD" => {:price => 1.50, :discount => true, :count => 2},
"MARMITE" => {:price => 1.60, :discount => false, :count => 1}
}
I have tried two approaches to translate the array into a hash.
new_cart = cart.inject(:merge)
hash = Hash[cart.collect { |item| [item, ""] } ]
Both work but then I am stumped at how to capture and pass the count value.
Expected output
{
"BREAD" => {:price => 1.50, :discount => true, :count => 2},
"MARMITE" => {:price => 1.60, :discount => false, :count => 1}
}
We are given the array:
arr = [
{"BREAD" => {:price => 1.50, :discount => true }},
{"BREAD" => {:price => 1.50, :discount => true }},
{"MARMITE" => {:price => 1.60, :discount => false}}
]
and make the assumption that each hash has a single key and if two hashes have the same (single) key, the value of that key is the same in both hashes.
The first step is create an empty hash to which will add key-value pairs:
h = {}
Now we loop through arr to build the hash h. I've added a puts statement to display intermediate values in the calculation.
arr.each do |g|
k, v = g.first
puts "k=#{k}, v=#{v}"
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count => 1 })
end
end
displays:
k=BREAD, v={:price=>1.5, :discount=>true}
k=BREAD, v={:price=>1.5, :discount=>true}
k=MARMITE, v={:price=>1.6, :discount=>false}
and returns:
#=> [{"BREAD" =>{:price=>1.5, :discount=>true}},
# {"BREAD" =>{:price=>1.5, :discount=>true}},
# {"MARMITE"=>{:price=>1.6, :discount=>false}}]
each always returns its receiver (here arr), which is not what we want.
h #=> {"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
is the result we need. See Hash#key? (aka, has_key?), Hash#[], Hash#[]= and Hash#merge.
Now let's wrap this in a method.
def hashify(arr)
h = {}
arr.each do |g|
k, v = g.first
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count=>1 })
end
end
h
end
hashify(arr)
#=> {"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
Rubyists would often use the method Enumerable#each_with_object to simplify.
def hashify(arr)
arr.each_with_object({}) do |g,h|
k, v = g.first
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count => 1 })
end
end
end
Compare the two methods to identify their differences. See Enumerable#each_with_object.
When, as here, the keys are symbols, Ruby allows you to use the shorthand { count: 1 } for { :count=>1 }. Moreover, she permits you to write :count = 1 or count: 1 without the braces when the hash is an argument. For example,
{}.merge('cat'=>'meow', dog:'woof', :pig=>'oink')
#=> {"cat"=>"meow", :dog=>"woof", :pig=>"oink"}
It's probably more common to see the form count: 1 when keys are symbols and for the braces to be omitted when a hash is an argument.
Here's a further refinement you might see. First create
h = arr.group_by { |h| h.keys.first }
#=> {"BREAD" =>[{"BREAD"=>{:price=>1.5, :discount=>true}},
# {"BREAD"=>{:price=>1.5, :discount=>true}}],
# "MARMITE"=>[{"MARMITE"=>{:price=>1.6, :discount=>false}}]}
See Enumerable#group_by. Now convert the values (arrays) to their sizes:
counts = h.transform_values { |arr| arr.size }
#=> {"BREAD"=>2, "MARMITE"=>1}
which can be written in abbreviated form:
counts = h.transform_values(&:size)
#=> {"BREAD"=>2, "MARMITE"=>1}
See Hash#transform_values. We can now write:
uniq_arr = arr.uniq
#=> [{"BREAD"=>{:price=>1.5, :discount=>true}},
#= {"MARMITE"=>{:price=>1.6, :discount=>false}}]
uniq_arr.each_with_object({}) do |g,h|
puts "g=#{g}"
k,v = g.first
puts " k=#{k}, v=#{v}"
h[k] = v.merge(counts: counts[k])
puts " h=#{h}"
end
which displays:
g={"BREAD"=>{:price=>1.5, :discount=>true}}
k=BREAD, v={:price=>1.5, :discount=>true}
h={"BREAD"=>{:price=>1.5, :discount=>true, :counts=>2}}
g={"MARMITE"=>{:price=>1.6, :discount=>false}}
k=MARMITE, v={:price=>1.6, :discount=>false}
h={"BREAD"=>{:price=>1.5, :discount=>true, :counts=>2},
"MARMITE"=>{:price=>1.6, :discount=>false, :counts=>1}}
and returns:
#=> {"BREAD"=>{:price=>1.5, :discount=>true, :counts=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :counts=>1}}
See Array#uniq.
This did the trick:
arr = [
{ bread: { price: 1.50, discount: true } },
{ bread: { price: 1.50, discount: true } },
{ marmite: { price: 1.60, discount: false } }
]
Get the count for each occurrence of hash, add as key value pair and store:
h = arr.uniq.each { |x| x[x.first.first][:count] = arr.count(x) }
Then convert hashes into arrays, flatten to a single array then construct a hash:
Hash[*h.collect(&:to_a).flatten]
#=> {:bread=>{:price=>1.50, :discount=>true, :count=>2}, :marmite=>{:price=>1.60, :discount=>false, :count=>1}}
Combined a couple of nice ideas from here:
https://raycodingdotnet.wordpress.com/2013/08/05/array-of-hashes-into-single-hash-in-ruby/
and here:
http://carol-nichols.com/2015/08/07/ruby-occurrence-couting/

Most performant way to group/summarise two hashes?

I have two hashes with some data that I need to aggregate. The first one is a mapping of which ids (id_1, id_2, id_3, id_4) belong under what category (a, b, c):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
The second hash holds values of how many events happened per id for a given date (date_1, date_2, date_3):
hash_2 = {
'id_1' => {'date_1' => 5, 'date_2' => 6, 'date_3' => 8},
'id_2' => {'date_1' => 0, 'date_3' => 6},
'id_3' => {'date_1' => 0, 'date_2' => nil, 'date_3' => 1},
'id_4' => {'date_1' => 10, 'date_2' => 1}
}
What I want is to get the total event per category (a,b,c). For the above example, the result would look something like:
hash_3 = {'a' => (5+6+8+0+6), 'b' => (0+0+1), 'c' => (10+1)}
My problem is, that there are about 5000 categories, each pointing to typically 1 to 3 ids, and each ID having event counts for 30 dates or more. So this takes quite a bit of computation. What will be the most performant (time effective) way to do this grouping in Ruby?
update
This is what I tried so far (took like 6-8 seconds!, horribly slow):
def total_clicks_per_category
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
end
def total_event_per_ids(ids)
ids.reduce(0) do |memo, id|
events = hash_2.fetch(id, {})
memo + (events.values.reduce(:+) || 0)
end
end
P.S. I’m using Ruby 2.3.
I'm writing this on a phone so I cannot test right now, but it looks OK.
g = hash_2.each_with_object({}) { |(k,v),g| g[k] = v.values.compact.sum }
hash_3 = hash_1.each_with_object({}) { |(k,v),h| h[k] = g.values_at(*v).sum }
First, create an intermediate hash that holds the sum of hash_2:
hash_4 = hash_2.map{|k, v| [k, v.values.inject(:+)]}.to_h
# => {"id_1"=>19, "id_2"=>6, "id_3"=>1, "id_4"=>11}
Then do the final summation:
hash_3 = hash_1.map{|k, v| [k, v.map{|k| hash_4[k]}.inject(:+)]}.to_h
# => {"a"=>25, "b"=>1, "c"=>11}
Theory
5000*3*30 isn't that many. Ruby probably will need a second at most for this kind of job.
Hash lookup is fast by default, you won't be able to optimize much.
You could pre-calculate hash_2_sum, though :
hash_2_sum = {
'id_1' => 5+6+8,
'id_2' => 0+6,
'id_3' => 0+0+1,
'id_4' => 10+1
}
A loop on hash1 with hash_2_sum lookup, and you're done.
Code
Your example has been updated with some nil values. You need to remove them with compact, and make sure the sum is 0 when no element is found with inject(0, :+):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
hash_2 = {
'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
'id_2' => { 'date_1' => 0, 'date_3' => 6 },
'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 },
'id_4' => { 'date_1' => 10, 'date_2' => 1 }
}
hash_2_sum = hash_2.each_with_object({}) do |(key, dates), sum|
sum[key] = dates.values.compact.inject(0, :+)
end
hash_3 = hash_1.each_with_object({}) do |(key, ids), sum|
sum[key] = hash_2_sum.values_at(*ids).inject(0, :+)
end
# {"a"=>25, "b"=>1, "c"=>11}
Note
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
isn't very readable IMHO.
You can either use each_with_object or Array#to_h :
result = [1, 2, 3].each_with_object({}) do |i, hash|
hash[i] = i * i
end
#=> {1=>1, 2=>4, 3=>9}
result = [1, 2, 3].map { |i| [i, i * i] }.to_h
#=> {1=>1, 2=>4, 3=>9}

Ruby: Link two arrays of objects by attribute value

I'm pretty new in Ruby programming. In Ruby there are plenty ways to write elegant code. Is there any elegant way to link two arrays with objects of the same type by attribute value?
It's hard to explain. Let's look at the next example:
a = [ { :id => 1, :value => 1 }, { :id => 2, :value => 2 }, { :id => 3, :value => 3 } ]
b = [ { :id => 1, :value => 2 }, { :id => 3, :value => 4 } ]
c = link a, b
# Result structure after linkage.
c = {
"1" => {
:a => { :id => 1, :value => 1 },
:b => { :id => 1, :value => 1 }
},
"3" => {
:a => { :id => 3, :value => 3 },
:b => { :id => 3, :value => 4 }
}
}
So the basic idea is to get pairs of objects from different arrays by their common ID and construct a hash, which will give this pair by ID.
Thanks in advance.
If you want to take an adventure through Enumerable, you could say this:
(a.map { |h| [:a, h] } + b.map { |h| [:b, h] })
.group_by { |_, h| h[:id] }
.select { |_, a| a.length == 2 }
.inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
And if you really want the keys to be strings, say n.to_s => Hash[v] instead of n => Hash[v].
The logic works like this:
We need to know where everything comes from we decorate the little hashes with :a and :b symbols to track their origins.
Then add the decorated arrays together into one list so that...
group_by can group things into almost-the-final-format.
Then find the groups of size two since those groups contain the entries that appeared in both a and b. Groups of size one only appeared in one of a or b so we throw those away.
Then a little injection to rearrange things into their final format. Note that the arrays we built in (1) just somehow happen to be in the format that Hash[] is looking for.
If you wanted to do this in a link method then you'd need to say things like:
link :a => a, :b => b
so that the method will know what to call a and b. This hypothetical link method also easily generalizes to more arrays:
def link(input)
input.map { |k, v| v.map { |h| [k, h] } }
.inject(:+)
.group_by { |_, h| h[:id] }
.select { |_, a| a.length == input.length }
.inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
end
link :a => [...], :b => [...], :c => [...]
I assume that, for any two elements h1 and h2 of a (or of b), h1[:id] != h2[:id].
I would do this:
def convert(arr) Hash[arr.map {|h| [h[:id], h]}] end
ah, bh = convert(a), convert(b)
c = ah.keys.each_with_object({}) {|k,h|h[k]={a: ah[k], b: bh[k]} if bh.key?(k)}
# => {1=>{:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3=>{:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}}
Note that:
ah = convert(a)
# => {1=>{:id=>1, :value=>1}, 2=>{:id=>2, :value=>2}, 3=>{:id=>3, :value=>3}}
bh = convert(b)
# => {1=>{:id=>1, :value=>2}, 3=>{:id=>3, :value=>4}}
Here's a second approach. I don't like it as well, but it represents a different way of looking at the problem.
def sort_by_id(a) a.sort_by {|h| h[:id]} end
c = Hash[*sort_by_id(a.select {|ha| b.find {|hb| hb[:id] == ha[:id]}})
.zip(sort_by_id(b))
.map {|ha,hb| [ha[:id], {a: ha, b: hb}]}
.flatten]
Here's what's happening. The first step is to select only the elements ha of a for which there is an element hb of b for which ha[:id] = hb[id]. Then we sort both (what's left of) a and b on h[:id], zip them together and then make the hash c.
r1 = a.select {|ha| b.find {|hb| hb[:id] == ha[:id]}}
# => [{:id=>1, :value=>1}, {:id=>3, :value=>3}]
r2 = sort_by_id(r1)
# => [{:id=>1, :value=>1}, {:id=>3, :value=>3}]
r3 = sort_by_id(b)
# => [{:id=>1, :value=>2}, {:id=>3, :value=>4}]
r4 = r2.zip(r3)
# => [[{:id=>1, :value=>1}, {:id=>1, :value=>2}],
# [{:id=>3, :value=>3}, {:id=>3, :value=>4}]]
r5 = r4.map {|ha,hb| [ha[:id], {a: ha, b: hb}]}
# => [[1, {:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}}],
# [3, {:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}]]
r6 = r5.flatten
# => [1, {:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3, {:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}]
c = Hash[*r6]
# => {1=>{:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3=>{:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}}
Ok, I've found the answer by myself. Here is a quite short line of code, which should do the trick:
Hash[a.product(b)
.select { |pair| pair[0][:id] == pair[1][:id] }
.map { |pair| [pair[0][:id], { :a => pair[0], :b => pair[1] }] }]
The product method gives us all possible pairs, then we filter them by equal IDs of pair elements. And then we map pairs to the special form, which will produce a Hash we are looking for.
So Hash[["key1", "value1"], ["key2", "value2"]] returns { "key1" => "value1", "key2" => "value2" }. And I use this to get the answer on my question.
Thanks.
P.S.: you can use pair.first instead of pair[0] and pair.last instead of pair[1] for better readability.
UPDATE
As Cary pointed out, it is better to replace |pair| with |ha, hb| to avoid these ugly indices:
Hash[a.product(b)
.select { |ha, hb| ha[:id] == hb[:id] }
.map { |ha, hb| [ha[:id], { :a => ha, :b => hb }] }]

How to access assoc array within assoc array

I have the following code:
def operation hash
puts hash[:three][:three][:three]
end
operation :one => 'item', :two => [1,2,3], :three => [
:one => 1,
:two => 2,
:three => [
:one => 1,
:two => 2,
:three => [
:test1,
:test2
]
]
]
I would like to access the item hash[:three][:three][:three] to output [test1, test2].
Why doesn't it work?
A Hash needs to be surrounded by braces {}, not brackets [], which are reserved for arrays.
Unlike PHP, in Ruby these are distinct types.
Try this:
def operation hash
puts hash[:three][0][:three][0][:three] #=> [:test1, :test2]
end
Notice that each of the :three keys has an array storing the values.

How to get a hash value by numeric index

Have a hash:
h = {:a => "val1", :b => "val2", :c => "val3"}
I can refer to the hash value:
h[:a], h[:c]
but I would like to refer by numeric index:
h[0] => val1
h[2] => val3
Is it possible?
h.values will give you an array requested.
> h.values
# ⇒ [
# [0] "val1",
# [1] "val2",
# [2] "val3"
# ]
UPD while the answer with h[h.keys[0]] was marked as correct, I’m a little bit curious with benchmarks:
h = {:a => "val1", :b => "val2", :c => "val3"}
Benchmark.bm do |x|
x.report { 1_000_000.times { h[h.keys[0]] = 'ghgh'} }
x.report { 1_000_000.times { h.values[0] = 'ghgh'} }
end
#
# user system total real
# 0.920000 0.000000 0.920000 ( 0.922456)
# 0.820000 0.000000 0.820000 ( 0.824592)
Looks like we’re spitting on 10% of productivity.
h = {:a => "val1", :b => "val2", :c => "val3"}
keys = h.keys
h[keys[0]] # "val1"
h[keys[2]] # "val3"
So you need both array indexing and hash indexing ?
If you need only the first one, use an array.
Otherwise, you can do the following :
h.values[0]
h.values[1]

Resources