Merge Ruby Hash values with same key - ruby

Is this possible to achieve with selected keys:
Eg
h = [
{a: 1, b: "Hello", c: "Test1"},
{a: 2, b: "Hey", c: "Test1"},
{a: 3, b: "Hi", c: "Test2"}
]
Expected Output
[
{a: 1, b: "Hello, Hey", c: "Test1"}, # See here, I don't want key 'a' to be merged
{a: 3, b: "Hi", c: "Test2"}
]
My Try
g = h.group_by{|k| k[:c]}.values
OUTPUT =>
[
[
{:a=>1, :b=>"Hello", :c=>"Test1"},
{:a=>2, :b=>"Hey", :c=>"Test1"}
], [
{:a=>3, :b=>"Hi", :c=>"Test2"}
]
]
g.each do |v|
if v.length > 1
c = v.reduce({}) do |s, l|
s.merge(l) { |_, a, b| [a, b].uniq.join(", ") }
end
end
p c #{:a=>"1, 2", :b=>"Hello, Hey", :c=>"Test1"}
end
So, the output I get is
{:a=>"1, 2", :b=>"Hello, Hey", :c=>"Test1"}
But, I needed
{a: 1, b: "Hello, Hey", c: "Test1"}
NOTE: This is just a test array of HASH I have taken to put my question. But, the actual hash has a lots of keys. So, please don't reply with key comparison answers
I need a less complex solution

I can't see a simpler version of your code. To make it fully work, you can use the first argument in the merge block instead of dismissing it to differentiate when you need to merge a and b or when you just use a. Your line becomes:
s.merge(l) { |key, a, b| key == :a ? a : [a, b].uniq.join(", ") }

Maybe you can consider this option, but I don't know if it is less complex:
h.group_by { |h| h[:c] }.values.map { |tmp| tmp[0].merge(*tmp[1..]) { |key, oldval, newval| key == :b ? [oldval, newval].join(' ') : oldval } }
#=> [{:a=>1, :b=>"Hello Hey", :c=>"Test1"}, {:a=>3, :b=>"Hi", :c=>"Test2"}]
The first part groups the hashes by :c
h.group_by { |h| h[:c] }.values #=> [[{:a=>1, :b=>"Hello", :c=>"Test1"}, {:a=>2, :b=>"Hey", :c=>"Test1"}], [{:a=>3, :b=>"Hi", :c=>"Test2"}]]
Then it maps to merge the first elements with others using Hash#merge

h.each_with_object({}) do |g,h|
h.update(g[:c]=>g) { |_,o,n| o.merge(b: "#{o[:b]}, #{n[:b]}") }
end.values
#=> [{:a=>1, :b=>"Hello, Hey", :c=>"Test1"},
# {:a=>3, :b=>"Hi", :c=>"Test2"}]
This uses the form of Hash#update that employs a block (here { |_,o,n| o.merge(b: "#{o[:b]}, #{n[:b]}") }) to determine the values of keys that are present in both hashes being merged. The first block variable holds the common key. I’ve used an underscore for that variable mainly to signal to the reader that it is not used in the block calculation. See the doc for definitions of the other two block variables.
Note that the receiver of values equals the following.
h.each_with_object({}) do |g,h|
h.update(g[:c]=>g) { |_,o,n| o.merge(b: "#{o[:b]}, #{n[:b]}") }
end
#=> { “Test1”=>{:a=>1, :b=>"Hello, Hey", :c=>"Test1"},
# “Test2=>{:a=>3, :b=>"Hi", :c=>"Test2"} }

Related

Merge hash of arrays into array of hashes

So, I have a hash with arrays, like this one:
{"name": ["John","Jane","Chris","Mary"], "surname": ["Doe","Doe","Smith","Martins"]}
I want to merge them into an array of hashes, combining the corresponding elements.
The results should be like that:
[{"name"=>"John", "surname"=>"Doe"}, {"name"=>"Jane", "surname"=>"Doe"}, {"name"=>"Chris", "surname"=>"Smith"}, {"name"=>"Mary", "surname"=>"Martins"}]
Any idea how to do that efficiently?
Please, note that the real-world use scenario could contain a variable number of hash keys.
Try this
h[:name].zip(h[:surname]).map do |name, surname|
{ 'name' => name, 'surname' => surname }
end
I suggest writing the code to permit arbitrary numbers of attributes. It's no more difficult than assuming there are two (:name and :surname), yet it provides greater flexibility, accommodating, for example, future changes to the number or naming of attributes:
def squish(h)
keys = h.keys.map(&:to_s)
h.values.transpose.map { |a| keys.zip(a).to_h }
end
h = { name: ["John", "Jane", "Chris"],
surname: ["Doe", "Doe", "Smith"],
age: [22, 34, 96]
}
squish(h)
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
The steps for the example above are as follows:
b = h.keys
#=> [:name, :surname, :age]
keys = b.map(&:to_s)
#=> ["name", "surname", "age"]
c = h.values
#=> [["John", "Jane", "Chris"], ["Doe", "Doe", "Smith"], [22, 34, 96]]
d = c.transpose
#=> [["John", "Doe", 22], ["Jane", "Doe", 34], ["Chris", "Smith", 96]]
d.map { |a| keys.zip(a).to_h }
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
In the last step the first value of b is passed to map's block and the block variable is assigned its value.
a = d.first
#=> ["John", "Doe", 22]
e = keys.zip(a)
#=> [["name", "John"], ["surname", "Doe"], ["age", 22]]
e.to_h
#=> {"name"=>"John", "surname"=>"Doe", "age"=>22}
The remaining calculations are similar.
If your dataset is really big, you can consider using Enumerator::Lazy.
This way Ruby will not create intermediate arrays during calculations.
This is how #Ursus answer can be improved:
h[:name]
.lazy
.zip(h[:surname])
.map { |name, surname| { 'name' => name, 'surname' => surname } }
.to_a
Other option for the case where:
[..] the real-world use scenario could contain a variable number of hash keys
h = {
'name': ['John','Jane','Chris','Mary'],
'surname': ['Doe','Doe','Smith','Martins'],
'whathever': [1, 2, 3, 4, 5]
}
You could use Object#then with a splat operator in a one liner:
h.values.then { |a, *b| a.zip *b }.map { |e| (h.keys.zip e).to_h }
#=> [{:name=>"John", :surname=>"Doe", :whathever=>1}, {:name=>"Jane", :surname=>"Doe", :whathever=>2}, {:name=>"Chris", :surname=>"Smith", :whathever=>3}, {:name=>"Mary", :surname=>"Martins", :whathever=>4}]
The first part, works this way:
h.values.then { |a, *b| a.zip *b }
#=> [["John", "Doe", 1], ["Jane", "Doe", 2], ["Chris", "Smith", 3], ["Mary", "Martins", 4]]
The last part just maps the elements zipping each with the original keys then calling Array#to_h to convert to hash.
Here I removed the call .to_h to show the intermediate result:
h.values.then { |a, *b| a.zip *b }.map { |e| h.keys.zip e }
#=> [[[:name, "John"], [:surname, "Doe"], [:whathever, 1]], [[:name, "Jane"], [:surname, "Doe"], [:whathever, 2]], [[:name, "Chris"], [:surname, "Smith"], [:whathever, 3]], [[:name, "Mary"], [:surname, "Martins"], [:whathever, 4]]]
[h[:name], h[:surname]].transpose.map do |name, surname|
{ 'name' => name, 'surname' => surname }
end

Counting unique occurrences of values in a hash

I'm trying to count occurrences of unique values matching a regex pattern in a hash.
If there's three different values, multiple times, I want to know how much each value occurs.
This is the code I've developed to achieve that so far:
def trim(results)
open = []
results.map { |k, v| v }.each { |n| open << n.to_s.scan(/^closed/) }
puts open.size
end
For some reason, it returns the length of all the values, not just the ones I tried a match on. I've also tried using results.each_value, to no avail.
Another way:
hash = {a: 'foo', b: 'bar', c: 'baz', d: 'foo'}
hash.each_with_object(Hash.new(0)) {|(k,v),h| h[v]+=1 if v.start_with?('foo')}
#=> {"foo"=>2}
or
hash.each_with_object(Hash.new(0)) {|(k,v),h| h[v]+=1 if v =~ /^foo|bar/}
#=> {"foo"=>2, "bar"=>1}
Something like this?
hash = {a: 'foo', b: 'bar', c: 'baz', d: 'foo'}
groups = hash.group_by{ |k, v| v[/(?:foo|bar)/] }
# => {"foo"=>[[:a, "foo"], [:d, "foo"]],
# "bar"=>[[:b, "bar"]],
# nil=>[[:c, "baz"]]}
Notice that there is a nil key, which means the regex didn't match anything. We can get rid of it because we (probably) don't care. Or maybe you do care, in which case, don't get rid of it.
groups.delete(nil)
This counts the number of matching "hits":
groups.map{ |k, v| [k, v.size] }
# => [["foo", 2], ["bar", 1]]
group_by is a magical method and well worthy of learning.
def count(hash, pattern)
hash.each_with_object({}) do |(k, v), counts|
counts[k] = v.count{|s| s.to_s =~ pattern}
end
end
h = { a: ['open', 'closed'], b: ['closed'] }
count(h, /^closed/)
=> {:a=>1, :b=>1}
Does that work for you?
I think it worths to update for RUBY_VERSION #=> "2.7.0" which introduces Enumerable#tally:
h = {a: 'foo', b: 'bar', c: 'baz', d: 'foo'}
h.values.tally #=> {"foo"=>2, "bar"=>1, "baz"=>1}
h.values.tally.select{ |k, _| k=~ /^foo|bar/ } #=> {"foo"=>2, "bar"=>1}

Is there a better solution to partition a hash into two hashes?

I wrote a method to split a hash into two hashes based on a criteria (a particular hash value). My question is different from another question on Hash. Here is an example of what I expect:
h={
:a => "FOO",
:b => "FOO",
:c => "BAR",
:d => "BAR",
:e => "FOO"
}
h_foo, h_bar = partition(h)
I need h_foo and h_bar to be like:
h_foo={
:a => "FOO",
:b => "FOO",
:e => "FOO"
}
h_bar={
:c => "BAR",
:d => "BAR"
}
My solution is:
def partition h
h.group_by{|k,v| v=="FOO"}.values.collect{|ary| Hash[*ary.flatten]}
end
Is there a clever solution?
There's Enumerable#partition:
h.partition { |k, v| v == "FOO" }.map(&:to_h)
#=> [{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
Or you could use Enumerable#each_with_object to avoid the intermediate arrays:
h.each_with_object([{}, {}]) { |(k, v), (h_foo, h_bar)|
v == "FOO" ? h_foo[k] = v : h_bar[k] = v
}
#=> [{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
I don't think there is a clever one liner, but you can make it slightly more generic by doing something like:
def transpose(h,k,v)
h[v] ||= []
h[v] << k
end
def partition(h)
n = {}
h.map{|k,v| transpose(n,k,v)}
result = n.map{|k,v| Hash[v.map{|e| [e, k]}] }
end
which will yield
[{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
when run against your initial hash h
Edit - TIL about partition. Wicked.
Why not use builtin partition, which is doing almost exactly what you are looking for?
h_foo, h_bar = h.partition { |key, value| value == 'FOO' }
The only downside is that you will get arrays instead of hashes (but you already know how to convert that). In ruby 2.1+ you could simply call .map(&:to_h) at the end of call chain.

Ruby: Link two arrays of objects by attribute value

I'm pretty new in Ruby programming. In Ruby there are plenty ways to write elegant code. Is there any elegant way to link two arrays with objects of the same type by attribute value?
It's hard to explain. Let's look at the next example:
a = [ { :id => 1, :value => 1 }, { :id => 2, :value => 2 }, { :id => 3, :value => 3 } ]
b = [ { :id => 1, :value => 2 }, { :id => 3, :value => 4 } ]
c = link a, b
# Result structure after linkage.
c = {
"1" => {
:a => { :id => 1, :value => 1 },
:b => { :id => 1, :value => 1 }
},
"3" => {
:a => { :id => 3, :value => 3 },
:b => { :id => 3, :value => 4 }
}
}
So the basic idea is to get pairs of objects from different arrays by their common ID and construct a hash, which will give this pair by ID.
Thanks in advance.
If you want to take an adventure through Enumerable, you could say this:
(a.map { |h| [:a, h] } + b.map { |h| [:b, h] })
.group_by { |_, h| h[:id] }
.select { |_, a| a.length == 2 }
.inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
And if you really want the keys to be strings, say n.to_s => Hash[v] instead of n => Hash[v].
The logic works like this:
We need to know where everything comes from we decorate the little hashes with :a and :b symbols to track their origins.
Then add the decorated arrays together into one list so that...
group_by can group things into almost-the-final-format.
Then find the groups of size two since those groups contain the entries that appeared in both a and b. Groups of size one only appeared in one of a or b so we throw those away.
Then a little injection to rearrange things into their final format. Note that the arrays we built in (1) just somehow happen to be in the format that Hash[] is looking for.
If you wanted to do this in a link method then you'd need to say things like:
link :a => a, :b => b
so that the method will know what to call a and b. This hypothetical link method also easily generalizes to more arrays:
def link(input)
input.map { |k, v| v.map { |h| [k, h] } }
.inject(:+)
.group_by { |_, h| h[:id] }
.select { |_, a| a.length == input.length }
.inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
end
link :a => [...], :b => [...], :c => [...]
I assume that, for any two elements h1 and h2 of a (or of b), h1[:id] != h2[:id].
I would do this:
def convert(arr) Hash[arr.map {|h| [h[:id], h]}] end
ah, bh = convert(a), convert(b)
c = ah.keys.each_with_object({}) {|k,h|h[k]={a: ah[k], b: bh[k]} if bh.key?(k)}
# => {1=>{:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3=>{:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}}
Note that:
ah = convert(a)
# => {1=>{:id=>1, :value=>1}, 2=>{:id=>2, :value=>2}, 3=>{:id=>3, :value=>3}}
bh = convert(b)
# => {1=>{:id=>1, :value=>2}, 3=>{:id=>3, :value=>4}}
Here's a second approach. I don't like it as well, but it represents a different way of looking at the problem.
def sort_by_id(a) a.sort_by {|h| h[:id]} end
c = Hash[*sort_by_id(a.select {|ha| b.find {|hb| hb[:id] == ha[:id]}})
.zip(sort_by_id(b))
.map {|ha,hb| [ha[:id], {a: ha, b: hb}]}
.flatten]
Here's what's happening. The first step is to select only the elements ha of a for which there is an element hb of b for which ha[:id] = hb[id]. Then we sort both (what's left of) a and b on h[:id], zip them together and then make the hash c.
r1 = a.select {|ha| b.find {|hb| hb[:id] == ha[:id]}}
# => [{:id=>1, :value=>1}, {:id=>3, :value=>3}]
r2 = sort_by_id(r1)
# => [{:id=>1, :value=>1}, {:id=>3, :value=>3}]
r3 = sort_by_id(b)
# => [{:id=>1, :value=>2}, {:id=>3, :value=>4}]
r4 = r2.zip(r3)
# => [[{:id=>1, :value=>1}, {:id=>1, :value=>2}],
# [{:id=>3, :value=>3}, {:id=>3, :value=>4}]]
r5 = r4.map {|ha,hb| [ha[:id], {a: ha, b: hb}]}
# => [[1, {:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}}],
# [3, {:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}]]
r6 = r5.flatten
# => [1, {:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3, {:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}]
c = Hash[*r6]
# => {1=>{:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3=>{:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}}
Ok, I've found the answer by myself. Here is a quite short line of code, which should do the trick:
Hash[a.product(b)
.select { |pair| pair[0][:id] == pair[1][:id] }
.map { |pair| [pair[0][:id], { :a => pair[0], :b => pair[1] }] }]
The product method gives us all possible pairs, then we filter them by equal IDs of pair elements. And then we map pairs to the special form, which will produce a Hash we are looking for.
So Hash[["key1", "value1"], ["key2", "value2"]] returns { "key1" => "value1", "key2" => "value2" }. And I use this to get the answer on my question.
Thanks.
P.S.: you can use pair.first instead of pair[0] and pair.last instead of pair[1] for better readability.
UPDATE
As Cary pointed out, it is better to replace |pair| with |ha, hb| to avoid these ugly indices:
Hash[a.product(b)
.select { |ha, hb| ha[:id] == hb[:id] }
.map { |ha, hb| [ha[:id], { :a => ha, :b => hb }] }]

How to merge multiple hashes in Ruby?

h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
Hash#merge works for 2 hashes: h.merge(h2)
How to merge 3 hashes?
h.merge(h2).merge(h3) works but is there a better way?
You could do it like this:
h, h2, h3 = { a: 1 }, { b: 2 }, { c: 3 }
a = [h, h2, h3]
p Hash[*a.map(&:to_a).flatten] #= > {:a=>1, :b=>2, :c=>3}
Edit: This is probably the correct way to do it if you have many hashes:
a.inject{|tot, new| tot.merge(new)}
# or just
a.inject(&:merge)
Since Ruby 2.0 on that can be accomplished more graciously:
h.merge **h1, **h2
And in case of overlapping keys - the latter ones, of course, take precedence:
h = {}
h1 = { a: 1, b: 2 }
h2 = { a: 0, c: 3 }
h.merge **h1, **h2
# => {:a=>0, :b=>2, :c=>3}
h.merge **h2, **h1
# => {:a=>1, :c=>3, :b=>2}
You can just do
[*h,*h2,*h3].to_h
# => {:a=>1, :b=>2, :c=>3}
This works whether or not the keys are Symbols.
Ruby 2.6 allows merge to take multiple arguments:
h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
h4 = { 'c' => 4 }
h5 = {}
h.merge(h2, h3, h4, h5) # => {:a=>1, :b=>2, :c=>3, "c"=>4}
This works with Hash.merge! and Hash.update too. Docs for this here.
Also takes empty hashes and keys as symbols or strings.
Much simpler :)
Answer using reduce (same as inject)
hash_arr = [{foo: "bar"}, {foo2: "bar2"}, {foo2: "bar2b", foo3: "bar3"}]
hash_arr.reduce { |acc, h| (acc || {}).merge h }
# => {:foo2=>"bar2", :foo3=>"bar3", :foo=>"bar"}
Explanation
For those beginning with Ruby or functional programming, I hope this brief explanation might help understand what's happening here.
The reduce method when called on an Array object (hash_arr) will iterate through each element of the array with the returned value of the block being stored in an accumulator (acc). Effectively, the h parameter of my block will take on the value of each hash in the array, and the acc parameter will take on the value that is returned by the block through each iteration.
We use (acc || {}) to handle the initial condition where acc is nil. Note that the merge method gives priority to keys/values in the original hash. This is why the value of "bar2b" doesn't appear in my final hash.
Hope that helps!
To build upon #Oleg Afanasyev's answer, you can also do this neat trick:
h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
z = { **h, **h2, **h3 } # => {:a=>1, :b=>2, :c=>3}
Cheers!
class Hash
def multi_merge(*args)
args.unshift(self)
args.inject { |accum, ele| accum.merge(ele) }
end
end
That should do it. You could easily monkeypatch that into Hash as I have shown.
newHash = [h, h2, h3].each_with_object({}) { |oh, nh| nh.merge!(oh)}
# => {:a=>1, :b=>2, :c=>3}
Here are the 2 monkeypatched ::Hash instance methods we use in our app. Backed by Minitest specs. They use merge! instead of merge internally, for performance reasons.
class ::Hash
# Merges multiple Hashes together. Similar to JS Object.assign.
# Returns merged hash without modifying the receiver.
#
# #param *other_hashes [Hash]
#
# #return [Hash]
def merge_multiple(*other_hashes)
other_hashes.each_with_object(self.dup) do |other_hash, new_hash|
new_hash.merge!(other_hash)
end
end
# Merges multiple Hashes together. Similar to JS Object.assign.
# Modifies the receiving hash.
# Returns self.
#
# #param *other_hashes [Hash]
#
# #return [Hash]
def merge_multiple!(*other_hashes)
other_hashes.each(&method(:merge!))
self
end
end
Tests:
describe "#merge_multiple and #merge_multiple!" do
let(:hash1) {{
:a => "a",
:b => "b"
}}
let(:hash2) {{
:b => "y",
:c => "c"
}}
let(:hash3) {{
:d => "d"
}}
let(:merged) {{
:a => "a",
:b => "y",
:c => "c",
:d => "d"
}}
describe "#merge_multiple" do
subject { hash1.merge_multiple(hash2, hash3) }
it "should merge three hashes properly" do
assert_equal(merged, subject)
end
it "shouldn't modify the receiver" do
refute_changes(->{ hash1 }) do
subject
end
end
end
describe "#merge_multiple!" do
subject { hash1.merge_multiple!(hash2, hash3) }
it "should merge three hashes properly" do
assert_equal(merged, subject)
end
it "shouldn't modify the receiver" do
assert_changes(->{ hash1 }, :to => merged) do
subject
end
end
end
end
Just for fun, you can do it also this way:
a = { a: 1 }, { b: 2 }, { c: 3 }
{}.tap { |h| a.each &h.method( :update ) }
#=> {:a=>1, :b=>2, :c=>3}
With modern Ruby, you wont even have to use merge unless you need to change the variable in place using the ! variant, you can just double splat (**) your way through.
h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
merged_hash = { **h, **h2, **h3 }
=> { a: 1, b: 2, c:3 }

Resources