Merging an array of hashes - ruby

I have the following array:
x = [ { a: [1,2] }, { a: [3,4] }, { a: [5,6] } ]
and I need to get
{ a: [[1,2], [3,4], [5,6]] }
I have tried to use (among other options) merge:
x.each_with_object({}) do |a, b|
b.merge!(a) {|k, o, n| o.zip(n) }
end
But unfortunately, I get an extra array around the result.
Any suggestions?
THANKS

x.flat_map(&:to_a).group_by(&:first).map{ |k, v| [k, v.map(&:last)] }.to_h
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}

It may not be the most efficient way to get the expected result, but you can do:
h = Hash.new([])
x.each { |hash|
hash.each { |key, values|
h[key] = h[key] + [values]
}
}
That way, at the end h is {:a=>[[1, 2], [3, 4], [5, 6]]}
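One caveat with Hash.new([]) is worth spelling out: the same default array is shared by every missing key, so the snippet above works only because h[key] + [values] builds a new array and assigns it. A minimal sketch of the difference, assuming the x from the question; the names shared and safe are made up for illustration:

```ruby
x = [{ a: [1, 2] }, { a: [3, 4] }, { a: [5, 6] }]

# Pitfall: << mutates the single shared default array and never stores a key.
shared = Hash.new([])
x.each { |hash| hash.each { |key, values| shared[key] << values } }
shared        #=> {}
shared[:zzz]  #=> [[1, 2], [3, 4], [5, 6]] (the mutated default)

# Block form: each missing key gets its own fresh array, so << is safe.
safe = Hash.new { |h, k| h[k] = [] }
x.each { |hash| hash.each { |key, values| safe[key] << values } }
safe          #=> {:a=>[[1, 2], [3, 4], [5, 6]]}
```

If you prefer appending with <<, the block form is the one to reach for.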

key = x.first.first.first
#=> :a
{ key=>x.map { |h| h[key] } }
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}
Note
a = x.first
#=> {:a=>[1, 2]}
b = a.first
#=> [:a, [1, 2]]
b.first
#=> :a
Another way:
a = x.map { |h| h.merge(h) { |_,v,_| [v] } }
#=> [{:a=>[[1, 2]]}, {:a=>[[3, 4]]}, {:a=>[[5, 6]]}]
a.reduce { |t,h| t.merge(h) { |_,o,n| o+n } }
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}
Both steps use the form of Hash#merge that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for details.
These two steps could be combined into one as follows:
x.reduce({}) { |t,h| t.merge(h.merge(h) { |_,v,_| [v] }) { |_,o,n| o+n } }
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}
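To make the role of the block concrete, here is a small sketch of Hash#merge with a block on two made-up hashes (left and right are illustrative names, not from the question). The block is called only for keys present in both hashes and receives the key, the old value and the new value:

```ruby
left  = { a: [1, 2], b: [9] }
right = { a: [3, 4], c: [7] }

# Only :a exists in both hashes, so only :a goes through the block.
merged = left.merge(right) { |_key, old, new| old + new }
merged #=> {:a=>[1, 2, 3, 4], :b=>[9], :c=>[7]}
```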

Related

Group hash of arrays by array element

I have this data structure resulted from a query grouping
{
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
I want to manipulate it so I end up with a structure grouped like this
{
0 => {
'AR' => 2,
'AQ' => 6,
nil => 1
},
1 => {
'AQ' => 3,
nil => 4
},
2 => {
'BG' => 1,
nil => 1
}
}
input = {
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
result = {}
input.each do |k, v|
if result[k[0]]
result[k[0]].merge!({ k[1] => v })
else
result[k[0]] = { k[1] => v }
end
end
puts result
#{0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}
I don't think this is the most succinct way, so I'd welcome some advice!
hash = {
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
new_hash = {}
hash.each{|k, v| new_hash[k[0]] ||= {}; new_hash[k[0]].merge!({k[1] => v})}
puts new_hash # {0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}
Here is one more, very similar to the previous answers but using #each_with_object:
hash = {
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
result_hash = Hash.new { |h,k| h[k] = {} }
hash.each_with_object(result_hash) do |((parent_key, key), value), res|
res[parent_key].merge!(key => value)
end
#=> {0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}
I came up with an answer that doesn't require additional variable assignments in its enclosing scope (it has "referential transparency": https://en.wikipedia.org/wiki/Referential_transparency)
input
.group_by { |(arr, num)| arr.first }
.each_with_object(Hash.new) do |(key, vals), hsh|
vals.each do |((key, innerkey), innerval)|
hsh[key] ||= {}
hsh[key][innerkey] = innerval
end
hsh
end
# {0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}
Two high-level steps:
I noticed the output object is grouped by the first array element (here, 0/1/2). I use #group_by to create a hash with that structure.
# output of `#group_by` on first array element:
key: 0, vals: [ [[0, "AR"], 2], [[0, nil], 1], [[0, "AQ"], 6] ]
key: 1, vals: [ [[1, nil], 4], [[1, "AQ"], 3] ]
key: 2, vals: [ [[2, "BG"], 1], [[2, nil], 1] ]
I use #each_with_object to construct the nested hashes. For each vals array above, I extracted the second and third values by destructuring the arrays in the block parameter (((key, innerkey), innerval)) and then the hash assignment was straightforward.
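The destructuring idea can also be applied directly to the original hash, without the #group_by step. A minimal sketch, assuming the input hash from the question; grouped, outer, inner and acc are illustrative names:

```ruby
input = {
  [0, "AR"] => 2, [0, nil] => 1, [0, "AQ"] => 6,
  [1, nil] => 4, [1, "AQ"] => 3,
  [2, "BG"] => 1, [2, nil] => 1
}

# Each entry is yielded as [[outer, inner], value]; the nested parentheses
# in the block parameters unpack all three parts at once.
grouped = input.each_with_object({}) do |((outer, inner), value), acc|
  (acc[outer] ||= {})[inner] = value
end
```

Here grouped has the same nested shape as the results shown in the other answers.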

Create a hash out of an array where the values are the indices of the elements

I have an array and I want to create a hash whose keys are the elements of the array and whose values are (an array of) the indices of the array. I want to get something like:
array = [1,3,4,5]
... # => {1=>0, 3=>1, 4=>2, 5=>3}
array = [1,3,4,5,6,6,6]
... # => {1=>0, 3=>1, 4=>2, 5=>3, 6=>[4,5,6]}
This code:
hash = Hash.new 0
array.each_with_index do |x, y|
hash[x] = y
end
works fine only if I don't have duplicate elements. When I have duplicate elements, it does not.
Any idea on how I can get something like this?
You can change the logic to special-case the situation when the key already exists, turning it into an array and pushing the new index:
arr = %i{a a b a c}
result = arr.each.with_object({}).with_index do |(elem, memo), idx|
memo[elem] = memo.key?(elem) ? [*memo[elem], idx] : idx
end
puts result
# => {:a=>[0, 1, 3], :b=>2, :c=>4}
It's worth mentioning, though, that whatever you're trying to do here could possibly be accomplished in a different way ... we have no context. In general, it's a good idea to keep key-val data types uniform, e.g. the fact that values here can be numbers or arrays is a bit of a code smell.
Also note that it doesn't make sense to use Hash.new(0) here unless you're intentionally setting a default value (which there's no reason to do). Use {} instead
I'm adding my two cents:
array = [1,3,4,5,6,6,6,8,8,8,9,7,7,7]
hash = {}
array.map.with_index {|val, idx| [val, idx]}.group_by(&:first).map do |k, v|
hash[k] = v[0][1] if v.size == 1
hash[k] = v.map(&:last) if v.size > 1
end
p hash #=> {1=>0, 3=>1, 4=>2, 5=>3, 6=>[4, 5, 6], 8=>[7, 8, 9], 9=>10, 7=>[11, 12, 13]}
It also works when the duplicated elements are not adjacent, because group_by collects every occurrence regardless of position.
This is the expanded version, step by step, to show how it works.
The basic idea is to build a temporary array with pairs of value and index, then work on it.
array = [1,3,4,5,6,6,6]
tmp_array = []
array.each_with_index do |val, idx|
tmp_array << [val, idx]
end
p tmp_array #=> [[1, 0], [3, 1], [4, 2], [5, 3], [6, 4], [6, 5], [6, 6]]
tmp_hash = tmp_array.group_by { |e| e[0] }
p tmp_hash #=> {1=>[[1, 0]], 3=>[[3, 1]], 4=>[[4, 2]], 5=>[[5, 3]], 6=>[[6, 4], [6, 5], [6, 6]]}
hash = {}
tmp_hash.map do |k, v|
hash[k] = v[0][1] if v.size == 1
hash[k] = v.map {|e| e[1]} if v.size > 1
end
p hash #=> {1=>0, 3=>1, 4=>2, 5=>3, 6=>[4, 5, 6]}
It can be written as one line as:
hash = {}
array.map.with_index.group_by(&:first).map { |k, v| v.size == 1 ? hash[k] = v[0][1] : hash[k] = v.map(&:last) }
p hash
If you are prepared to accept
{ 1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4,5,6] }
as the return value you may write the following.
array.each_with_index.group_by(&:first).transform_values { |v| v.map(&:last) }
#=> {1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4, 5, 6]}
The first step in this calculation is the following.
array.each_with_index.group_by(&:first)
#=> {1=>[[1, 0]], 3=>[[3, 1]], 4=>[[4, 2]], 5=>[[5, 3]], 6=>[[6, 4], [6, 5], [6, 6]]}
This may help readers to follow the subsequent calculations.
I think you will find this return value generally more convenient to use than the one given in the question.
Here are a couple of examples where it's clearly preferable for all values to be arrays. Let:
h_orig = { 1=>0, 3=>1, 4=>2, 5=>3, 6=>[4,5,6] }
h_mod = { 1=>[0], 3=>[1], 4=>[2], 5=>[3], 6=>[4,5,6] }
Create a hash h whose keys are unique elements of array and whose values are the numbers of times the key appears in the array
h_mod.transform_values(&:count)
#=> {1=>1, 3=>1, 4=>1, 5=>1, 6=>3}
h_orig.transform_values { |v| v.is_a?(Array) ? v.count : 1 }
Create a hash h whose keys are unique elements of array and whose values equal the index of the first instance of the element in the array.
h_mod.transform_values(&:min)
#=> {1=>0, 3=>1, 4=>2, 5=>3, 6=>4}
h_orig.transform_values { |v| v.is_a?(Array) ? v.min : v }
In these examples, given h_orig, we could alternatively convert values that are indices to arrays containing a single index.
h_orig.transform_values { |v| [*v].count }
h_orig.transform_values { |v| [*v].min }
This is hardly proof that it is generally more convenient for all values to be arrays, but that has been my experience and the experience of many others.

Array to hash while summing values

I have an array as follows:
[[172, 3],
[173, 1],
[174, 2],
[174, 3],
[174, 1]]
That I'd like to convert into a hash, but while summing the values for matching keys. So I'd get the following:
{172 => 3, 173 => 1, 174 => 6}
How would I go about doing this?
Solve one problem at a time.
Given your array:
a = [[172, 3], [173, 1], [174, 2], [174, 3], [174, 1]]
We need an additional hash:
h = {}
Then we have to traverse the pairs in the array:
a.each do |k, v|
if h.key?(k) # If the hash already contains the key
h[k] += v # we add v to the existing value
else # otherwise
h[k] = v # we use v as the initial value
end
end
h #=> {172=>3, 173=>1, 174=>6}
Now let's refactor it. The conditional looks a bit cumbersome, what if we would just add everything?
h = {}
a.each { |k, v| h[k] += v }
#=> NoMethodError: undefined method `+' for nil:NilClass
Bummer, that doesn't work because the hash's values are initially nil. Let's fix that:
h = Hash.new(0) # <- hash with default values of 0
a.each { |k, v| h[k] += v }
h #=> {172=>3, 173=>1, 174=>6}
That looks good. We can even get rid of the temporary variable by using each_with_object:
a.each_with_object(Hash.new(0)) { |(k, v), h| h[k] += v }
#=> {172=>3, 173=>1, 174=>6}
You can try something like:
> array
#=> [[172, 3], [173, 1], [174, 2], [174, 3], [174, 1]]
array.group_by(&:first).map { |k, v| [k, v.map(&:last).inject(:+)] }.to_h
#=> {172=>3, 173=>1, 174=>6}
Ruby 2.4.0 version:
a.group_by(&:first).transform_values { |e| e.sum(&:last) }
#=> {172=>3, 173=>1, 174=>6}
For:
array = [[172, 3], [173, 1], [174, 2], [174, 3], [174, 1]]
You could use a hash with a default value 0 like this
hash = Hash.new{|h,k| h[k] = 0 }
Then use it, and sum up values:
array.each { |a, b| hash[a] += b }
hash
#=> {172=>3, 173=>1, 174=>6}
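Both the block form used here and Hash.new(0) work for summing. One difference, shown in this small sketch with made-up names (plain and storing): the block form stores the key as soon as it is read, while Hash.new(0) stores it only on assignment.

```ruby
plain   = Hash.new(0)
storing = Hash.new { |h, k| h[k] = 0 }

plain[:missing]    # returns 0, nothing is stored
storing[:missing]  # returns 0 and stores :missing => 0

plain.key?(:missing)   #=> false
storing.key?(:missing) #=> true
```

For a plain running total either form gives the same sums, so Hash.new(0) is usually enough.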
Another possible solution:
arr = [[172, 3], [173, 1], [174, 2], [174, 3], [174, 1]]
hash = arr.each_with_object({}) {|a,h| h[a[0]] = h[a[0]].to_i + a[1]}
p hash
# {172=>3, 173=>1, 174=>6}

Why can't I use |a,b| instead of |(a,b)| in arr.map { |(a, b)| !b.nil? ? a + b : a }?

In the code below, arr is meant to be a two-dimensional array, such as [[1,2],[4,5]]. It computes the sum of the elements of the sub arrays. A subarray can have only one element, in which case the sum is just that one element.
def compute(arr)
return nil unless arr
arr.map { |(a, b)| !b.nil? ? a + b : a }
end
Why does the code have to be |(a, b)| instead of |a,b|?
What does (a,b) mean in Ruby?
You could use |a,b| too; in this case it is no different from |(a,b)|.
You may also rewrite the code as below, which doesn't have the element number limit for the sub arrays:
arr.map { |a| a.inject{ |sum,x| sum + x } }
or even:
arr.map { |a| a.inject(:+) }
Both are equivalent if arr is an array:
arr = [[1, 2], [4, 5]]
arr.map { |a, b| [a, b] } #=> [[1, 2], [4, 5]]
arr.map { |(a, b)| [a, b] } #=> [[1, 2], [4, 5]]
This is because the block is called with a single argument at a time: the subarray. Something like:
yield [1, 2]
yield [4, 5]
This changes if more than one argument is yielded. each_with_index, for example, calls the block with two arguments: the item (i.e. the subarray) and its index. Something like:
yield [1, 2], 0
yield [4, 5], 1
The difference is obvious:
enum = [[1, 2], [4, 5]].each_with_index
enum.map { |a, b| [a, b] } #=> [[[1, 2], 0], [[4, 5], 1]]
enum.map { |(a, b)| [a, b] } #=> [[1, 2], [4, 5]]
Note that omitting the parentheses also allows you to set a default argument value:
arr = [[1, 2], [4]]
arr.map { |a, b = 0| a + b } #=> [3, 4]
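Parentheses become necessary when you want to unpack a subarray and still receive a second yielded argument. A short sketch building on the each_with_index case (pairs and sums_with_index are illustrative names):

```ruby
pairs = [[1, 2], [4, 5]]

# (a, b) unpacks the subarray; i receives the index yielded alongside it.
sums_with_index = pairs.each_with_index.map { |(a, b), i| [a + b, i] }
sums_with_index #=> [[3, 0], [9, 1]]
```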

Ruby: Sum selected hash values

I've got an array of hashes and would like to sum up selected values. I know how to sum all of them or one of them but not how to select more than one key.
i.e.:
[{"a"=>5, "b"=>10, "active"=>"yes"}, {"a"=>5, "b"=>10, "active"=>"no"}, {"a"=>5, "b"=>10, "action"=>"yes"}]
To sum all of them I am using:
t = h.inject{|memo, el| memo.merge( el ){|k, old_v, new_v| old_v + new_v}}
=> {"a"=>15, "b"=>30, "active"=>"yesno", "action"=>"yes"} # I do not want 'active' or 'action'
To sum one key, I do:
h.map{|x| x['a']}.reduce(:+)
=> 15
How do I go about summing up values for keys 'a' and 'b'?
You can use values_at:
hs = [{:a => 1, :b => 2, :c => ""}, {:a => 2, :b => 4, :c => ""}]
keys = [:a, :b]
hs.map { |h| h.values_at(*keys) }.inject { |a, v| a.zip(v).map { |xy| xy.compact.sum }}
# => [3, 6]
If all required keys have values it will be shorter:
hs.map { |h| h.values_at(*keys) }.inject { |a, v| a.zip(v).map(&:sum) }
# => [3, 6]
If you want Hash back:
Hash[keys.zip(hs.map { |h| h.values_at(*keys) }.inject{ |a, v| a.zip(v).map(&:sum) })]
# => {:a => 3, :b => 6}
I'd do something like this:
a.map { |h| h.values_at("a", "b") }.transpose.map { |v| v.inject(:+) }
#=> [15, 30]
Step by step:
a.map { |h| h.values_at("a", "b") } #=> [[5, 10], [5, 10], [5, 10]]
.transpose #=> [[5, 5, 5], [10, 10, 10]]
.map { |v| v.inject(:+) } #=> [15, 30]
How about this?
h = [{"a"=>5, "b"=>10, "active"=>"yes"}, {"a"=>5, "b"=>10, "active"=>"no"}, {"a"=>5, "b"=>10, "action"=>"yes"}]
p h.map{|e| e.reject{|k,v| %w(active action).include? k } }.inject{|memo, el| memo.merge( el ){|k, old_v, new_v| old_v + new_v}}
# >> {"a"=>15, "b"=>30}
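If a hash result is wanted by naming the keys to keep rather than the keys to reject, another option is to sum each wanted key across the rows. A sketch under the assumption that a missing key should count as 0; rows, wanted and totals are illustrative names, and Array#sum needs Ruby 2.4 or later:

```ruby
rows = [
  { "a" => 5, "b" => 10, "active" => "yes" },
  { "a" => 5, "b" => 10, "active" => "no" },
  { "a" => 5, "b" => 10, "action" => "yes" }
]
wanted = %w[a b]

# For each wanted key, add up its value in every row (0 when absent).
totals = wanted.map { |key| [key, rows.sum { |row| row.fetch(key, 0) }] }.to_h
totals #=> {"a"=>15, "b"=>30}
```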

Resources