Add up values in hashes - ruby

array1 = { "d1" => 2, "d2" => 3}
array2 = { "d1" => 3, "d3" => 10}
i want this:
array3 = { "d1" => 5, "d2" => 3, "d3" => 10}
i tried this, it doesn't work. i am getting the error: "NoMethodError: undefined method `+' for nil:NilClass"
array3 = {}
array1.each {|key, count| array3[key] += count}
array2.each {|key, count| array3[key] += count}

You're getting the error because array1.each tries to access array3['d1'], which doesn't exist yet, so it returns nil as the value. You just need to define array3 a bit more specifically, using Hash.new to tell it to assign 0 to all keys by default.
array3 = Hash.new(0)
array1.each {|key, count| array3[key] += count}
array2.each {|key, count| array3[key] += count}
Be careful going forward, though: the object you pass as the default value can be modified, so if you were to write my_hash = Hash.new(Array.new); my_hash[:some_key] << 3 then all keys that receive a default value will share the same object. This is one of those strange gotchas in Ruby, and you would want to use the block version of Hash.new in that case.

it is much more simplier
=> h1 = { "a" => 100, "b" => 200 }
{"a"=>100, "b"=>200}
=> h1 = { "a" => 100, "b" => 200 }
{"b"=>254, "c"=>300}
=> h1.merge(h2) {|key, oldval, newval| newval + oldval}
{"a"=>100, "b"=>454, "c"=>300}
it was undocumented in core-1.8.7 but you can read more here:
http://www.ruby-doc.org/core/classes/Hash.src/M000759.html
it works on both versions

User you have to initialize the key for the non-existent values in the second array:
irb(main):007:0> array1 = {"d1" => 2, "d2" => 3}
=> {"d1"=>2, "d2"=>3}
irb(main):008:0> array2 = {"d1" => 3, "d3" => 10}
=> {"d1"=>3, "d3"=>10}
irb(main):009:0> array3 = {}
=> {}
irb(main):010:0> array1.each {|key, count| array3[key] = (array3[key] || 0) + count}
=> {"d1"=>2, "d2"=>3}
irb(main):011:0> array2.each {|key, count| array3[key] = (array3[key] || 0) + count}
=> {"d1"=>3, "d3"=>10}
irb(main):012:0> array3
=> {"d1"=>5, "d2"=>3, "d3"=>10}
irb(main):013:0>

If all the keys in your hash are not integers you can use this trick which uses the fact that string.to_i gives zero
hash.to_a.flatten.inject{|sum, n| sum.to_s.to_i + n.to_s.to_i }

Related

How to find the largest value of a hash in an array of hashes

In my array, I'm trying to retrieve the key with the largest value of "value_2", so in this case, "B":
myArray = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
myArray.each do |array_hash|
array_hash.each do |key, value|
if value["value_2"] == array_hash.values.max
puts key
end
end
end
I get the error:
"comparison of Hash with Hash failed (ArgumentError)".
What am I missing?
Though equivalent, the array given in the question is generally written:
arr = [{ "A" => { "value_1" => 30, "value_2" => 240 } },
{ "B" => { "value_1" => 40, "value_2" => 250 } },
{ "C" => { "value_1" => 18, "value_2" => 60 } }]
We can find the desired key as follows:
arr.max_by { |h| h.values.first["value_2"] }.keys.first
#=> "B"
See Enumerable#max_by. The steps are:
g = arr.max_by { |h| h.values.first["value_2"] }
#=> {"B"=>{"value_1"=>40, "value_2"=>250}}
a = g.keys
#=> ["B"]
a.first
#=> "B"
In calculating g, for
h = arr[0]
#=> {"A"=>{"value_1"=>30, "value_2"=>240}}
the block calculation is
a = h.values
#=> [{"value_1"=>30, "value_2"=>240}]
b = a.first
#=> {"value_1"=>30, "value_2"=>240}
b["value_2"]
#=> 240
Suppose now arr is as follows:
arr << { "D" => { "value_1" => 23, "value_2" => 250 } }
#=> [{"A"=>{"value_1"=>30, "value_2"=>240}},
# {"B"=>{"value_1"=>40, "value_2"=>250}},
# {"C"=>{"value_1"=>18, "value_2"=>60}},
# {"D"=>{"value_1"=>23, "value_2"=>250}}]
and we wish to return an array of all keys for which the value of "value_2" is maximum (["B", "D"]). We can obtain that as follows.
max_val = arr.map { |h| h.values.first["value_2"] }.max
#=> 250
arr.select { |h| h.values.first["value_2"] == max_val }.flat_map(&:keys)
#=> ["B", "D"]
flat_map(&:keys) is shorthand for:
flat_map { |h| h.keys }
which returns the same array as:
map { |h| h.keys.first }
See Enumerable#flat_map.
Code
p myArray.pop.max_by{|k,v|v["value_2"]}.first
Output
"B"
I'd use:
my_array = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
h = Hash[*my_array]
# => {"A"=>{"value_1"=>30, "value_2"=>240},
# "B"=>{"value_1"=>40, "value_2"=>250},
# "C"=>{"value_1"=>18, "value_2"=>60}}
k = h.max_by { |k, v| v['value_2'] }.first # => "B"
Hash[*my_array] takes the array of hashes and turns it into a single hash. Then max_by will iterate each key/value pair, returning an array containing the key value "B" and the sub-hash, making it easy to grab the key using first:
k = h.max_by { |k, v| v['value_2'] } # => ["B", {"value_1"=>40, "value_2"=>250}]
I guess the idea of your solution is looping through each hash element and compare the found minimum value with hash["value_2"].
But you are getting an error at
if value["value_2"] == array_hash.values.max
Because the array_hash.values is still a hash
{"A"=>{"value_1"=>30, "value_2"=>240}}.values.max
#=> {"value_1"=>30, "value_2"=>240}
It should be like this:
max = nil
max_key = ""
myArray.each do |array_hash|
array_hash.each do |key, value|
if max.nil? || value.values.max > max
max = value.values.max
max_key = key
end
end
end
# max_key #=> "B"
Another solution:
myArray.map{ |h| h.transform_values{ |v| v["value_2"] } }.max_by{ |k| k.values }.keys.first
You asked "What am I missing?".
I think you are missing a proper understanding of the data structures that you are using. I suggest that you try printing the data structures and take a careful look at the results.
The simplest way is p myArray which gives:
[{"A"=>{"value_1"=>30, "value_2"=>240}, "B"=>{"value_1"=>40, "value_2"=>250}, "C"=>{"value_1"=>18, "value_2"=>60}}]
You can get prettier results using pp:
require 'pp'
pp myArray
yields:
[{"A"=>{"value_1"=>30, "value_2"=>240},
"B"=>{"value_1"=>40, "value_2"=>250},
"C"=>{"value_1"=>18, "value_2"=>60}}]
This helps you to see that myArray has only one element, a Hash.
You could also look at the expression array_hash.values.max inside the loop:
myArray.each do |array_hash|
p array_hash.values
end
gives:
[{"value_1"=>30, "value_2"=>240}, {"value_1"=>40, "value_2"=>250}, {"value_1"=>18, "value_2"=>60}]
Not what you expected? :-)
Given this, what would you expect to be returned by array_hash.values.max in the above loop?
Use p and/or pp liberally in your ruby code to help understand what's going on.

Convert array into hash and add a counter value to the new hash

I have the following array of hashes:
[
{"BREAD" => {:price => 1.50, :discount => true }},
{"BREAD" => {:price => 1.50, :discount => true }},
{"MARMITE" => {:price => 1.60, :discount => false}}
]
And I would like to translate this array into a hash that includes the counts for each item:
Output:
{
"BREAD" => {:price => 1.50, :discount => true, :count => 2},
"MARMITE" => {:price => 1.60, :discount => false, :count => 1}
}
I have tried two approaches to translate the array into a hash.
new_cart = cart.inject(:merge)
hash = Hash[cart.collect { |item| [item, ""] } ]
Both work but then I am stumped at how to capture and pass the count value.
Expected output
{
"BREAD" => {:price => 1.50, :discount => true, :count => 2},
"MARMITE" => {:price => 1.60, :discount => false, :count => 1}
}
We are given the array:
arr = [
{"BREAD" => {:price => 1.50, :discount => true }},
{"BREAD" => {:price => 1.50, :discount => true }},
{"MARMITE" => {:price => 1.60, :discount => false}}
]
and make the assumption that each hash has a single key and if two hashes have the same (single) key, the value of that key is the same in both hashes.
The first step is create an empty hash to which will add key-value pairs:
h = {}
Now we loop through arr to build the hash h. I've added a puts statement to display intermediate values in the calculation.
arr.each do |g|
k, v = g.first
puts "k=#{k}, v=#{v}"
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count => 1 })
end
end
displays:
k=BREAD, v={:price=>1.5, :discount=>true}
k=BREAD, v={:price=>1.5, :discount=>true}
k=MARMITE, v={:price=>1.6, :discount=>false}
and returns:
#=> [{"BREAD" =>{:price=>1.5, :discount=>true}},
# {"BREAD" =>{:price=>1.5, :discount=>true}},
# {"MARMITE"=>{:price=>1.6, :discount=>false}}]
each always returns its receiver (here arr), which is not what we want.
h #=> {"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
is the result we need. See Hash#key? (aka, has_key?), Hash#[], Hash#[]= and Hash#merge.
Now let's wrap this in a method.
def hashify(arr)
h = {}
arr.each do |g|
k, v = g.first
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count=>1 })
end
end
h
end
hashify(arr)
#=> {"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
Rubyists would often use the method Enumerable#each_with_object to simplify.
def hashify(arr)
arr.each_with_object({}) do |g,h|
k, v = g.first
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count => 1 })
end
end
end
Compare the two methods to identify their differences. See Enumerable#each_with_object.
When, as here, the keys are symbols, Ruby allows you to use the shorthand { count: 1 } for { :count=>1 }. Moreover, she permits you to write :count = 1 or count: 1 without the braces when the hash is an argument. For example,
{}.merge('cat'=>'meow', dog:'woof', :pig=>'oink')
#=> {"cat"=>"meow", :dog=>"woof", :pig=>"oink"}
It's probably more common to see the form count: 1 when keys are symbols and for the braces to be omitted when a hash is an argument.
Here's a further refinement you might see. First create
h = arr.group_by { |h| h.keys.first }
#=> {"BREAD" =>[{"BREAD"=>{:price=>1.5, :discount=>true}},
# {"BREAD"=>{:price=>1.5, :discount=>true}}],
# "MARMITE"=>[{"MARMITE"=>{:price=>1.6, :discount=>false}}]}
See Enumerable#group_by. Now convert the values (arrays) to their sizes:
counts = h.transform_values { |arr| arr.size }
#=> {"BREAD"=>2, "MARMITE"=>1}
which can be written in abbreviated form:
counts = h.transform_values(&:size)
#=> {"BREAD"=>2, "MARMITE"=>1}
See Hash#transform_values. We can now write:
uniq_arr = arr.uniq
#=> [{"BREAD"=>{:price=>1.5, :discount=>true}},
#= {"MARMITE"=>{:price=>1.6, :discount=>false}}]
uniq_arr.each_with_object({}) do |g,h|
puts "g=#{g}"
k,v = g.first
puts " k=#{k}, v=#{v}"
h[k] = v.merge(counts: counts[k])
puts " h=#{h}"
end
which displays:
g={"BREAD"=>{:price=>1.5, :discount=>true}}
k=BREAD, v={:price=>1.5, :discount=>true}
h={"BREAD"=>{:price=>1.5, :discount=>true, :counts=>2}}
g={"MARMITE"=>{:price=>1.6, :discount=>false}}
k=MARMITE, v={:price=>1.6, :discount=>false}
h={"BREAD"=>{:price=>1.5, :discount=>true, :counts=>2},
"MARMITE"=>{:price=>1.6, :discount=>false, :counts=>1}}
and returns:
#=> {"BREAD"=>{:price=>1.5, :discount=>true, :counts=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :counts=>1}}
See Array#uniq.
This did the trick:
arr = [
{ bread: { price: 1.50, discount: true } },
{ bread: { price: 1.50, discount: true } },
{ marmite: { price: 1.60, discount: false } }
]
Get the count for each occurrence of hash, add as key value pair and store:
h = arr.uniq.each { |x| x[x.first.first][:count] = arr.count(x) }
Then convert hashes into arrays, flatten to a single array then construct a hash:
Hash[*h.collect(&:to_a).flatten]
#=> {:bread=>{:price=>1.50, :discount=>true, :count=>2}, :marmite=>{:price=>1.60, :discount=>false, :count=>1}}
Combined a couple of nice ideas from here:
https://raycodingdotnet.wordpress.com/2013/08/05/array-of-hashes-into-single-hash-in-ruby/
and here:
http://carol-nichols.com/2015/08/07/ruby-occurrence-couting/

Most performant way to group/summarise two hashes?

I have two hashes with some data that I need to aggregate. The first one is a mapping of which ids (id_1, id_2, id_3, id_4) belong under what category (a, b, c):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
The second hash holds values of how many events happened per id for a given date (date_1, date_2, date_3):
hash_2 = {
'id_1' => {'date_1' => 5, 'date_2' => 6, 'date_3' => 8},
'id_2' => {'date_1' => 0, 'date_3' => 6},
'id_3' => {'date_1' => 0, 'date_2' => nil, 'date_3' => 1},
'id_4' => {'date_1' => 10, 'date_2' => 1}
}
What I want is to get the total event per category (a,b,c). For the above example, the result would look something like:
hash_3 = {'a' => (5+6+8+0+6), 'b' => (0+0+1), 'c' => (10+1)}
My problem is, that there are about 5000 categories, each pointing to typically 1 to 3 ids, and each ID having event counts for 30 dates or more. So this takes quite a bit of computation. What will be the most performant (time effective) way to do this grouping in Ruby?
update
This is what I tried so far (took like 6-8 seconds!, horribly slow):
def total_clicks_per_category
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
end
def total_event_per_ids(ids)
ids.reduce(0) do |memo, id|
events = hash_2.fetch(id, {})
memo + (events.values.reduce(:+) || 0)
end
end
P.S. I’m using Ruby 2.3.
I'm writing this on a phone so I cannot test right now, but it looks OK.
g = hash_2.each_with_object({}) { |(k,v),g| g[k] = v.values.compact.sum }
hash_3 = hash_1.each_with_object({}) { |(k,v),h| h[k] = g.values_at(*v).sum }
First, create an intermediate hash that holds the sum of hash_2:
hash_4 = hash_2.map{|k, v| [k, v.values.inject(:+)]}.to_h
# => {"id_1"=>19, "id_2"=>6, "id_3"=>1, "id_4"=>11}
Then do the final summation:
hash_3 = hash_1.map{|k, v| [k, v.map{|k| hash_4[k]}.inject(:+)]}.to_h
# => {"a"=>25, "b"=>1, "c"=>11}
Theory
5000*3*30 isn't that many. Ruby probably will need a second at most for this kind of job.
Hash lookup is fast by default, you won't be able to optimize much.
You could pre-calculate hash_2_sum, though :
hash_2_sum = {
'id_1' => 5+6+8,
'id_2' => 0+6,
'id_3' => 0+0+1,
'id_4' => 10+1
}
A loop on hash1 with hash_2_sum lookup, and you're done.
Code
Your example has been updated with some nil values. You need to remove them with compact, and make sure the sum is 0 when no element is found with inject(0, :+):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
hash_2 = {
'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
'id_2' => { 'date_1' => 0, 'date_3' => 6 },
'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 },
'id_4' => { 'date_1' => 10, 'date_2' => 1 }
}
hash_2_sum = hash_2.each_with_object({}) do |(key, dates), sum|
sum[key] = dates.values.compact.inject(0, :+)
end
hash_3 = hash_1.each_with_object({}) do |(key, ids), sum|
sum[key] = hash_2_sum.values_at(*ids).inject(0, :+)
end
# {"a"=>25, "b"=>1, "c"=>11}
Note
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
isn't very readable IMHO.
You can either use each_with_object or Array#to_h :
result = [1, 2, 3].each_with_object({}) do |i, hash|
hash[i] = i * i
end
#=> {1=>1, 2=>4, 3=>9}
result = [1, 2, 3].map { |i| [i, i * i] }.to_h
#=> {1=>1, 2=>4, 3=>9}

Add values of an array inside a hash

I created a hash that has an array as a value.
{
"0":[0,14,0,14],
"1":[0,14],
"2":[0,11,0,12],
"3":[0,11,0,12],
"4":[0,10,0,13,0,11],
"5":[0,10,0,14,0,0,0,11,12,0],
"6":[0,0,12],
"7":[],
"8":[0,14,0,12],
"9":[0,14,0,0,11,14],
"10":[0,11,0,12],
"11":[0,13,0,14]
}
I want the sum of all values in each array. I expect to have such output as:
{
"0":[24],
"1":[14],
"2":[23],
"3":[23],
"4":[34],
"5":[47],
"6":[12],
"7":[],
"8":[26],
"9":[39],
"10":[23],
"11":[27]
}
I do not know how to proceed from here. Any pointers are thankful.
I would do something like this:
hash = { "0" => [0,14,0,14], "1" => [0,14], "7" => [] }
hash.each { |k, v| hash[k] = Array(v.reduce(:+)) }
# => { "0" => [28], "1" => [14], "7" => [] }
For hash object you as this one.
hash = {"0"=>[0,14,0,14],"1"=>[0,14],"2"=>[0,11,0,12],"3"=>[0,11,0,12],"4"=>[0,10,0,13,0,11],"5"=>[0,10,0,14,0,0,0,11,12,0],"6"=>[0,0,12],"7"=>[],"8"=>[0,14,0,12],"9"=>[0,14,0,0,11,14],"10"=>[0,11,0,12],"11"=>[0,13,0,14]}
You could change value of each k => v pair
hash.each_pair do |k, v|
hash[k] = [v.reduce(0, :+)]
end
will resolve in
hash = {"0"=>[28], "1"=>[14], "2"=>[23], "3"=>[23], "4"=>[34], "5"=>[47], "6"=>[12], "7"=>[0], "8"=>[26], "9"=>[39], "10"=>[23], "11"=>[27]}
If your string is just like you mentioned you can parse it using JSON, or jump that step if you have already an Hash.
You can check the documentation of inject here
require 'json'
json_string = '
{
"0":[0,14,0,14],
"1":[0,14],
"2":[0,11,0,12],
"3":[0,11,0,12],
"4":[0,10,0,13,0,11],
"5":[0,10,0,14,0,0,0,11,12,0],
"6":[0,0,12],
"7":[],
"8":[0,14,0,12],
"9":[0,14,0,0,11,14],
"10":[0,11,0,12],
"11":[0,13,0,14]
}
'
hash = JSON.parse json_string
result = Hash.new
hash.each do |key, value|
result[key] = value.inject(:+)
end
puts result.inspect
and the result:
{"0"=>28, "1"=>14, "2"=>23, "3"=>23, "4"=>34, "5"=>47, "6"=>12, "7"=>nil, "8"=>26, "9"=>39, "10"=>23, "11"=>27}
Classic example for map/reduce. You need to iterate on the hash keys and values (map {|k,v|}), and for each value count the sum using reduce(0) {|acc, x| acc+x}
h = {1 => [1,2,3], 2 => [3,4,5], 7 => []} #example
a = h.map {|k,v| [k ,v.reduce(0) {|acc, x| acc+x}] }
=> [[1, 6], [2, 12], [7, 0]]
Hash[a]
=> {1=>6, 2=>12, 7=>0}

Create Nested Hashes from a List of Hashes in Ruby

I have a set of categories and their values stored as a list of hashes:
r = [{:A => :X}, {:A => :Y}, {:B => :X}, {:A => :X}, {:A => :Z}, {:A => :X},
{:A => :X}, {:B => :Z}, {:C => :X}, {:C => :Y}, {:B => :X}, {:C => :Y},
{:C => :Y}]
I'd like to get a count of each value coupled with its category as a hash like this:
{:A => {:X => 4, :Y => 1, :Z => 1},
:B => {:X => 2, :Z => 1},
:C => {:X => 1, :Y => 3}}
How can I do this efficiently?
Here's what I have so far (it returns inconsistent values):
r.reduce(Hash.new(Hash.new(0))) do |memo, x|
memo[x.keys.first][x.values.first] += 1
memo
end
Should I first compute the counts of all instances of specific {:cat => :val}s and then create the hash? Should I give a different base-case to reduce and change the body to check for nil cases (and assign zero when nil) instead of always adding 1?
EDIT:
I ended up changing my code and using the below method to have a cleaner way of achieving a nested hash:
r.map do |x|
[x.keys.first, x.values.last]
end.reduce({}) do |memo, x|
memo[x.first] = Hash.new(0) if memo[x.first].nil?
memo[x.first][x.last] += 1
memo
end
The problem of your code is: memo did not hold the value.
Use a variable outside the loop to hold the value would be ok:
memo = Hash.new {|h,k| h[k] = Hash.new {|hh, kk| hh[kk] = 0 } }
r.each do |x|
memo[x.keys.first][x.values.first] += 1
end
p memo
And what's more, it won't work to init a hash nested inside a hash directly like this:
# NOT RIGHT
memo = Hash.new(Hash.new(0))
memo = Hash.new({})
Here is a link for more about the set default value issue:
http://www.themomorohoax.com/2008/12/31/why-setting-the-default-value-of-a-hash-to-be-a-hash-is-wrong
Not sure what "inconsistent values" means, but your problem is the hash you're injecting into is not remembering its results
r.each_with_object(Hash.new { |h, k| h[k] = Hash.new 0 }) do |individual, consolidated|
individual.each do |key, value|
consolidated[key][value] += 1
end
end
But honestly, it would probably be better to just go to wherever you're making this array and change it to aggregate values like this.
Functional approach using some handy abstractions -no need to reinvent the wheel- from facets:
require 'facets'
r.map_by { |h| h.to_a }.mash { |k, vs| [k, vs.frequency] }
#=> {:A=>{:X=>4, :Y=>1, :Z=>1}, :B=>{:X=>2, :Z=>1}, :C=>{:X=>1, :Y=>3}}

Resources