I have a set of categories and their values stored as a list of hashes:
r = [{:A => :X}, {:A => :Y}, {:B => :X}, {:A => :X}, {:A => :Z}, {:A => :X},
{:A => :X}, {:B => :Z}, {:C => :X}, {:C => :Y}, {:B => :X}, {:C => :Y},
{:C => :Y}]
I'd like to get a count of each value coupled with its category as a hash like this:
{:A => {:X => 4, :Y => 1, :Z => 1},
:B => {:X => 2, :Z => 1},
:C => {:X => 1, :Y => 3}}
How can I do this efficiently?
Here's what I have so far (it returns inconsistent values):
r.reduce(Hash.new(Hash.new(0))) do |memo, x|
memo[x.keys.first][x.values.first] += 1
memo
end
Should I first compute the counts of all instances of specific {:cat => :val}s and then create the hash? Should I give a different base-case to reduce and change the body to check for nil cases (and assign zero when nil) instead of always adding 1?
EDIT:
I ended up changing my code and using the below method to have a cleaner way of achieving a nested hash:
r.map do |x|
[x.keys.first, x.values.last]
end.reduce({}) do |memo, x|
memo[x.first] = Hash.new(0) if memo[x.first].nil?
memo[x.first][x.last] += 1
memo
end
The problem of your code is: memo did not hold the value.
Use a variable outside the loop to hold the value would be ok:
memo = Hash.new {|h,k| h[k] = Hash.new {|hh, kk| hh[kk] = 0 } }
r.each do |x|
memo[x.keys.first][x.values.first] += 1
end
p memo
And what's more, it won't work to init a hash nested inside a hash directly like this:
# NOT RIGHT
memo = Hash.new(Hash.new(0))
memo = Hash.new({})
Here is a link for more about the set default value issue:
http://www.themomorohoax.com/2008/12/31/why-setting-the-default-value-of-a-hash-to-be-a-hash-is-wrong
Not sure what "inconsistent values" means, but your problem is the hash you're injecting into is not remembering its results
r.each_with_object(Hash.new { |h, k| h[k] = Hash.new 0 }) do |individual, consolidated|
individual.each do |key, value|
consolidated[key][value] += 1
end
end
But honestly, it would probably be better to just go to wherever you're making this array and change it to aggregate values like this.
Functional approach using some handy abstractions -no need to reinvent the wheel- from facets:
require 'facets'
r.map_by { |h| h.to_a }.mash { |k, vs| [k, vs.frequency] }
#=> {:A=>{:X=>4, :Y=>1, :Z=>1}, :B=>{:X=>2, :Z=>1}, :C=>{:X=>1, :Y=>3}}
Related
I am trying to merge an array of hashes based on a particular key/value pair.
array = [ {:id => '1', :value => '2'}, {:id => '1', :value => '5'} ]
I would want the output to be
{:id => '1', :value => '7'}
As patru stated, in sql terms this would be equivalent to:
SELECT SUM(value) FROM Hashes GROUP BY id
In other words, I have an array of hashes that contains records. I would like to obtain the sum of a particular field, but the sum would grouped by key/value pairs. In other words, if my selection criteria is :id as in the example above, then it would seperate the hashes into groups where the id was the same and the sum the other keys.
I apologize for any confusion due to the typo earlier.
Edit: The question has been clarified since I first posted my answer. As a result, I have revised my answer substantially.
Here are two "standard" ways of addressing this problem. Both use Enumerable#select to first extract the elements from the array (hashes) that contain the given key/value pair.
#1
The first method uses Hash#merge! to sequentially merge each array element (hashes) into a hash that is initially empty.
Code
def doit(arr, target_key, target_value)
qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
return nil if qualified.empty?
qualified.each_with_object({}) {|h,g|
g.merge!(h) {|k,gv,hv| k == target_key ? gv : (gv.to_i + hv.to_i).to_s}}
end
Example
arr = [{:id => '1', :value => '2'}, {:id => '2', :value => '3'},
{:id => '1', :chips => '4'}, {:zd => '1', :value => '8'},
{:cat => '2', :value => '3'}, {:id => '1', :value => '5'}]
doit(arr, :id, '1')
#=> {:id=>"1", :value=>"7", :chips=>"4"}
Explanation
The key here is to use the version of Hash#merge! that uses a block to determine the value for each key/value pair whose key appears in both of the hashes being merged. The two values for that key are represented above by the block variables hv and gv. We simply want to add them together. Note that g is the (initially empty) hash object created by each_with_object, and returned by doit.
target_key = :id
target_value = '1'
qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
#=> [{:id=>"1", :value=>"2"},{:id=>"1", :chips=>"4"},{:id=>"1", :value=>"5"}]
qualified.empty?
#=> false
qualified.each_with_object({}) {|h,g|
g.merge!(h) {|k,gv,hv| k == target_key ? gv : (gv.to_i + hv.to_i).to_s}}
#=> {:id=>"1", :value=>"7", :chips=>"4"}
#2
The other common way to do this kind of calculation is to use Enumerable#flat_map, followed by Enumerable#group_by.
Code
def doit(arr, target_key, target_value)
qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
return nil if qualified.empty?
qualified.flat_map(&:to_a)
.group_by(&:first)
.values.map { |a| a.first.first == target_key ? a.first :
[a.first.first, a.reduce(0) {|tot,s| tot + s.last}]}.to_h
end
Explanation
This may look complex, but it's not so bad if you break it down into steps. Here's what's happening. (The calculation of qualified is the same as in #1.)
target_key = :id
target_value = '1'
c = qualified.flat_map(&:to_a)
#=> [[:id,"1"],[:value,"2"],[:id,"1"],[:chips,"4"],[:id,"1"],[:value,"5"]]
d = c.group_by(&:first)
#=> {:id=>[[:id, "1"], [:id, "1"], [:id, "1"]],
# :value=>[[:value, "2"], [:value, "5"]],
# :chips=>[[:chips, "4"]]}
e = d.values
#=> [[[:id, "1"], [:id, "1"], [:id, "1"]],
# [[:value, "2"], [:value, "5"]],
# [[:chips, "4"]]]
f = e.map { |a| a.first.first == target_key ? a.first :
[a.first.first, a.reduce(0) {|tot,s| tot + s.last}] }
#=> [[:id, "1"], [:value, "7"], [:chips, "4"]]
f.to_h => {:id=>"1", :value=>"7", :chips=>"4"}
#=> {:id=>"1", :value=>"7", :chips=>"4"}
Comment
You may wish to consider makin the values in the hashes integers and exclude the target_key/target_value pairs from qualified:
arr = [{:id => 1, :value => 2}, {:id => 2, :value => 3},
{:id => 1, :chips => 4}, {:zd => 1, :value => 8},
{:cat => 2, :value => 3}, {:id => 1, :value => 5}]
target_key = :id
target_value = 1
qualified = arr.select { |h| h.key?(target_key) && h[target_key]==target_value}
.each { |h| h.delete(target_key) }
#=> [{:value=>2}, {:chips=>4}, {:value=>5}]
return nil if qualified.empty?
Then either
qualified.each_with_object({}) {|h,g| g.merge!(h) { |k,gv,hv| gv + hv } }
#=> {:value=>7, :chips=>4}
or
qualified.flat_map(&:to_a)
.group_by(&:first)
.values
.map { |a| [a.first.first, a.reduce(0) {|tot,s| tot + s.last}] }.to_h
#=> {:value=>7, :chips=>4}
here is what i got:
hash = {:a => {:b => [{:c => old_val}]}}
keys = [:a, :b, 0, :c]
new_val = 10
hash structure and set of keys can vary.
i need to get
hash[:a][:b][0][:c] == new_val
Thanks!
You can use inject to traverse your nested structures:
hash = {:a => {:b => [{:c => "foo"}]}}
keys = [:a, :b, 0, :c]
keys.inject(hash) {|structure, key| structure[key]}
# => "foo"
So, you just need to modify this to do a set on the last key. Perhaps something like
last_key = keys.pop
# => :c
nested_hash = keys.inject(hash) {|structure, key| structure[key]}
# => {:c => "foo"}
nested_hash[last_key] = "bar"
hash
# => {:a => {:b => [{:c => "bar"}]}}
Similar to Andy's, but you can use Symbol#to_proc to shorten it.
hash = {:a => {:b => [{:c => :old_val}]}}
keys = [:a, :b, 0, :c]
new_val = 10
keys[0...-1].inject(hash, &:fetch)[keys.last] = new_val
Hash.each returns an array [key, value],
but if I want a hash?
Example: {:key => value }
I'm assuming you meant "yields" where you said "return" because Hash#each already returns a hash (the receiver).
To answer your question: If you need a hash with the key and the value you can just create one. Like this:
hash.each do |key, value|
kv_hash = {key => value}
do_something_with(kv_hash)
end
There is no alternative each method that yields hashs, so the above is the best you can do.
I think you are trying to transform the hash somehow, so I will give you my solution to this problem, which may be not exactly the same. To modify a hash, you have to .map them and construct a new hash.
This is how I reverse key and values:
h = {:a => 'a', :b => 'b'}
Hash[h.map{ |k,v| [v, k] }]
# => {"a"=>:a, "b"=>:b}
Call .each with two parameters:
>> a = {1 => 2, 3 => 4}
>> a.each { |b, c|
?> puts "#{b} => #{c}"
>> }
1 => 2
3 => 4
=> {1=>2, 3=>4}
You could map the hash to a list of single-element hashes, then call each on the list:
h = {:a => 'a', :b => 'b'}
h.map{ |k,v| {k => v}}.each{ |x| puts x }
This snippet of code populates an #options hash. values is an Array which contains zero or more heterogeneous items. If you invoke populate with arguments that are Hash entries, it uses the value you specify for each entry to assume a default value.
def populate(*args)
args.each do |a|
values = nil
if (a.kind_of? Hash)
# Converts {:k => "v"} to `a = :k, values = "v"`
a, values = a.to_a.first
end
#options[:"#{a}"] ||= values ||= {}
end
end
What I'd like to do is change populate such that it recursively populates #options. There is a special case: if the values it's about to populate a key with are an Array consisting entirely of (1) Symbols or (2) Hashes whose keys are Symbols (or some combination of the two), then they should be treated as subkeys rather than the values associated with that key, and the same logic used to evaluate the original populate arguments should be recursively re-applied.
That was a little hard to put into words, so I've written some test cases. Here are some test cases and the expected value of #options afterwards:
populate :a
=> #options is {:a => {}}
populate :a => 42
=> #options is {:a => 42}
populate :a, :b, :c
=> #options is {:a => {}, :b => {}, :c => {}}
populate :a, :b => "apples", :c
=> #options is {:a => {}, :b => "apples", :c => {}}
populate :a => :b
=> #options is {:a => :b}
# Because [:b] is an Array consisting entirely of Symbols or
# Hashes whose keys are Symbols, we assume that :b is a subkey
# of #options[:a], rather than the value for #options[:a].
populate :a => [:b]
=> #options is {:a => {:b => {}}}
populate :a => [:b, :c => :d]
=> #options is {:a => {:b => {}, :c => :d}}
populate :a => [:a, :b, :c]
=> #options is {:a => {:a => {}, :b => {}, :c => {}}}
populate :a => [:a, :b, "c"]
=> #options is {:a => [:a, :b, "c"]}
populate :a => [:one], :b => [:two, :three => "four"]
=> #options is {:a => :one, :b => {:two => {}, :three => "four"}}
populate :a => [:one], :b => [:two => {:four => :five}, :three => "four"]
=> #options is {:a => :one,
:b => {
:two => {
:four => :five
}
},
:three => "four"
}
}
It is acceptable if the signature of populate needs to change to accommodate some kind of recursive version. There is no limit to the amount of nesting that could theoretically happen.
Any thoughts on how I might pull this off?
So here's some simple code that works.
def to_value args
ret = {}
# make sure we were given an array
raise unless args.class == Array
args.each do |arg|
case arg
when Symbol
ret[arg] = {}
when Hash
arg.each do |k,v|
# make sure that all the hash keys are symbols
raise unless k.class == Symbol
ret[k] = to_value v
end
else
# make sure all the array elements are symbols or symbol-keyed hashes
raise
end
end
ret
rescue
args
end
def populate *args
#options ||= {}
value = to_value(args)
if value.class == Hash
#options.merge! value
end
end
It does deviate from your test cases:
test case populate :a, :b => "apples", :c is a ruby syntax error. Ruby will assume the final argument to a method is a hash (when not given braces), but not a non-final one, as you assume here. The given code is a syntax error (no matter the definition of populate) since it assumes :c is a hash key, and finds an end of line when it's looking for :c's value. populate :a, {:b => "apples"}, :c works as expected
test case populate :a => [:one], :b => [:two, :three => "four"] returns {:a=>{:one=>{}}, :b=>{:two=>{}, :three=>"four"}}. This is consistent with the test case populate :a => [:b].
Ruby isn't Perl, => works only inside real Hash definition or as final argument in method call. Most things you want will result in a syntax error.
Are you sure that populate limited to cases supported by Ruby syntax is worth it?
We have the following datastructures:
{:a => ["val1", "val2"], :b => ["valb1", "valb2"], ...}
And I want to turn that into
[{:a => "val1", :b => "valb1"}, {:a => "val2", :b => "valb2"}, ...]
And then back into the first form. Anybody with a nice looking implementation?
This solution works with arbitrary numbers of values (val1, val2...valN):
{:a => ["val1", "val2"], :b => ["valb1", "valb2"]}.inject([]){|a, (k,vs)|
vs.each_with_index{|v,i| (a[i] ||= {})[k] = v}
a
}
# => [{:a=>"val1", :b=>"valb1"}, {:a=>"val2", :b=>"valb2"}]
[{:a=>"val1", :b=>"valb1"}, {:a=>"val2", :b=>"valb2"}].inject({}){|a, h|
h.each_pair{|k,v| (a[k] ||= []) << v}
a
}
# => {:a=>["val1", "val2"], :b=>["valb1", "valb2"]}
Using a functional approach (see Enumerable):
hs = h.values.transpose.map { |vs| h.keys.zip(vs).to_h }
#=> [{:a=>"val1", :b=>"valb1"}, {:a=>"val2", :b=>"valb2"}]
And back:
h_again = hs.first.keys.zip(hs.map(&:values).transpose).to_h
#=> {:a=>["val1", "val2"], :b=>["valb1", "valb2"]}
Let's look closely what the data structure we are trying to convert between:
#Format A
[
["val1", "val2"], :a
["valb1", "valb2"], :b
["valc1", "valc2"] :c
]
#Format B
[ :a :b :c
["val1", "valb1", "valc1"],
["val2", "valb2", "valc3"]
]
It is not diffculty to find Format B is the transpose of Format A in essential , then we can come up with this solution:
h={:a => ["vala1", "vala2"], :b => ["valb1", "valb2"], :c => ["valc1", "valc2"]}
sorted_keys = h.keys.sort_by {|a,b| a.to_s <=> b.to_s}
puts sorted_keys.inject([]) {|s,e| s << h[e]}.transpose.inject([]) {|r, a| r << Hash[*sorted_keys.zip(a).flatten]}.inspect
#[{:b=>"valb1", :c=>"valc1", :a=>"vala1"}, {:b=>"valb2", :c=>"valc2", :a=>"vala2"}]
m = {}
a,b = Array(h).transpose
b.transpose.map { |y| [a, y].transpose.inject(m) { |m,x| m.merge Hash[*x] }}
My attempt, perhaps slightly more compact.
h = { :a => ["val1", "val2"], :b => ["valb1", "valb2"] }
h.values.transpose.map { |s| Hash[h.keys.zip(s)] }
Should work in Ruby 1.9.3 or later.
Explanation:
First, 'combine' the corresponding values into 'rows'
h.values.transpose
# => [["val1", "valb1"], ["val2", "valb2"]]
Each iteration in the map block will produce one of these:
h.keys.zip(s)
# => [[:a, "val1"], [:b, "valb1"]]
and Hash[] will turn them into hashes:
Hash[h.keys.zip(s)]
# => {:a=>"val1", :b=>"valb1"} (for each iteration)
This will work assuming all the arrays in the original hash are the same size:
hash_array = hash.first[1].map { {} }
hash.each do |key,arr|
hash_array.zip(arr).each {|inner_hash, val| inner_hash[key] = val}
end
You could use inject to build an array of hashes.
hash = { :a => ["val1", "val2"], :b => ["valb1", "valb2"] }
array = hash.inject([]) do |pairs, pair|
pairs << { pair[0] => pair[1] }
pairs
end
array.inspect # => "[{:a=>["val1", "val2"]}, {:b=>["valb1", "valb2"]}]"
Ruby documentation has a few more examples of working with inject.