Merge an array of hashes by key-value pair - ruby

I have an array of hashes as follows:
[
{'abc_id'=>'1234', 'def_id'=>[]},
{'abc_id'=>'5678', 'def_id'=>['11', '22']},
{'abc_id'=>'1234', 'def_id'=>['33', '44']},
{'abc_id'=>'5678', 'def_id'=>['55', '66']}
]
I'm trying to combine hashes that share the same key-value pair into one hash. Here, two pairs of hashes have the same value for the 'abc_id' key:
{'abc_id'=>'1234', 'def_id'=>[]} and {'abc_id'=>'1234', 'def_id'=>['33', '44']}
{'abc_id'=>'5678', 'def_id'=>['11', '22']} and {'abc_id'=>'5678', 'def_id'=>['55', '66']}
I'm expecting multiple hashes with the same key-value pairs to be merged into one individual hash. For the two pairs above, they should be respectively:
{'abc_id'=>'1234', 'def_id'=>['33', '44']}
{'abc_id'=>'5678', 'def_id'=>['11', '22', '55', '66']}

The more-or-less generic and extendable variant would be:
input.
group_by { |h| h['abc_id'] }.
map do |k, v|
v.reduce do |acc, arr|
# use `+` instead of `|` to keep duplicates ⇓⇓⇓
acc.merge(arr) { |_, v1, v2| Array === v1 ? v1 | v2 : v1 }
end
end
#⇒ [{"abc_id"=>"1234", "def_id"=>["33", "44"]},
# {"abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"]}]
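The comment about `+` versus `|` can be illustrated with a minimal sketch. The method name union_vs_concat is made up for this example and is not part of the answer above.

```ruby
# Hypothetical helper, only to contrast the two array operators.
def union_vs_concat(a, b)
  {
    union:  a | b,  # set union: drops duplicate elements
    concat: a + b   # concatenation: keeps every element
  }
end

union_vs_concat(['11', '22'], ['22', '33'])
#=> {:union=>["11", "22", "33"], :concat=>["11", "22", "22", "33"]}
```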

One more option:
array
.map.with_object({}) { |h, hh| hh[h['abc_id']].nil? ? hh[h['abc_id']] = h['def_id'] : hh[h['abc_id']] += h['def_id'] }
.map{ |k, v| {'abc_id' => k, 'def_id' => v} }
The first part returns
# {"1234"=>["33", "44"], "5678"=>["11", "22", "55", "66"]}
The second part rebuilds the original structure, returning:
#=> [{"abc_id"=>"1234", "def_id"=>["33", "44"]}, {"abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"]}]

One could use the form of Hash#update (aka merge!) and Hash#merge that employs a block to determine the values of keys that are present in both hashes being merged. Here this needs to be done at two levels.
Letting arr be the array given in the question, these methods are used as follows.
arr.each_with_object({}) do |g,h|
h.update(g['abc_id']=>g) do |_,o,n|
o.merge(n) { |k,oo,nn| k=='def_id' ? oo+nn : oo }
end
end.values
#=> [{"abc_id"=>"1234", "def_id"=>["33", "44"]},
# {"abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"]}]
See the docs for an explanation of the block variables _, o, n, k, oo and nn. I used an underscore to represent the common key with update to tell the reader that it is not used in the block calculation.
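To make the block variables concrete, here is a minimal sketch of the inner merge on its own. The method name merge_def_ids is hypothetical and the longer variable names stand in for k, oo and nn.

```ruby
# Hypothetical helper: merges two hashes that share an 'abc_id',
# concatenating the 'def_id' arrays and keeping the old value otherwise.
def merge_def_ids(old_hash, new_hash)
  # The block runs only for keys present in both hashes.
  # key: the shared key, old_val: value in the receiver, new_val: value in the argument.
  old_hash.merge(new_hash) do |key, old_val, new_val|
    key == 'def_id' ? old_val + new_val : old_val
  end
end

a = { 'abc_id' => '5678', 'def_id' => ['11', '22'] }
b = { 'abc_id' => '5678', 'def_id' => ['55', '66'] }
merge_def_ids(a, b)
#=> {"abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"]}
```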
Note that the receiver of Hash#values is the following.
{ "1234"=>{ "abc_id"=>"1234", "def_id"=>["33", "44"] },
"5678"=>{ "abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"] } }

Related

How to generate the expected output by using split method used in my code?

Question:
Create a method for Array that returns a hash having 'key' as length of the element and value as an array of all the elements of that length. Make use of Array#each.
Returned Hash should be sorted by key.
I have tried to do it through Hash sorting over length. I have almost resolved it using another method but I want to use split and hash to achieve expected output.
Can anyone suggest any amendments in my code below?
Input argument:
array-hash.rb "['abc','def',1234,234,'abcd','x','mnop',5,'zZzZ']"
Expected output:
{1=>["x", "5"], 3=>["abc", "def", "234"], 4=>["1234", "abcd", "mnop", "zZzZ"]}
class String
def key_length(v2)
hash = {}
v2.each do |item|
item_length = item.to_s.length
hash[item_length] ||= []
hash[item_length].push(item)
end
Hash[hash.sort]
end
end
reader = ''
if ARGV.empty?
puts 'Please provide an input'
else
v1 = ARGV[0]
v2 = v1.tr("'[]''",'').split
p reader.key_length(v2)
end
Actual output:
{35=>["abc,def,1234,234,abcd,x,mnop,5,zZzZ"]}
Given the array converted from the string (note that the integers end up as strings):
ary = str[1..-2].delete('\'').split(',')
ary #=> ["abc", "def", "1234", "234", "abcd", "x", "mnop", "5", "zZzZ"]
The most "idiomatic" way should be using group_by:
ary.group_by(&:size)
If you want to use each, then you could use Enumerable#each_with_object, where the object is a hash created with Hash::new and a block that gives every missing key an empty array as its default:
ary.each_with_object(Hash.new{ |h,k| h[k] = []}) { |e, h| h[e.size] << e }
Which is the same as
res = Hash.new{ |h,k| h[k] = []}
ary.each { |e| res[e.size] << e }
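None of the snippets above sorts the result by key, which the question asks for. A minimal sketch that uses each and then sorts is shown here; it is written as a standalone method rather than a patch on Array, and the name group_by_length is made up for illustration.

```ruby
# Hypothetical helper: groups elements by the length of their string form,
# then returns a hash ordered by key.
def group_by_length(items)
  groups = Hash.new { |h, k| h[k] = [] }
  items.each { |e| groups[e.to_s.length] << e.to_s }
  groups.sort.to_h # sort the [key, value] pairs by key, then rebuild a hash
end

group_by_length(['abc', 'def', 1234, 234, 'abcd', 'x', 'mnop', 5, 'zZzZ'])
#=> {1=>["x", "5"], 3=>["abc", "def", "234"], 4=>["1234", "abcd", "mnop", "zZzZ"]}
```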
Not sure why you need to monkeypatch anything here; is this a school exercise or something?
I think your bug is that you need to pass the comma delimiter to split.
I would solve the underlying problem as a reduce/inject/fold thing, myself.
s = "['abc','def',1234,234,'abcd','x','mnop',5,'zZzZ']"
splits = s.tr("'[]''",'').split(',') # need to pass in the comma for the split
Hash[splits.inject({}) { |memo, str| memo[str.length] ||= []; memo[str.length] << str; memo }.sort] # note: this uses inject rather than Array#each
#=> {1=>["x", "5"], 3=>["abc", "def", "234"], 4=>["1234", "abcd", "mnop", "zZzZ"]}

Merging Three hashes and getting this resultant hash

I have read an xls file and built these three hashes:
hash1=[{'name'=>'Firstname',
'Locator'=>'id=xxx',
'Action'=>'TypeAndWait'},
{'name'=>'Password',
'Locator'=>'id=yyy',
'Action'=>'TypeAndTab'}]
Second hash:
hash2=[{'Test Name'=>'Example',
'TestNumber'=>'Test1'},
{'Test Name'=>'Example',
'TestNumber'=>'Test2'}]
My third hash:
hash3=[{'name'=>'Firstname',
'Test1'=>'four',
'Test2'=>'Five',
'Test3'=>'Six'},
{'name'=>'Password',
'Test1'=>'Vicky',
'Test2'=>'Sujin',
'Test3'=>'Sivaram'}]
Now my resultant hash is
result={"Example"=>
{"Test1"=>
{'Firstname'=>
["id=xxx","four", "TypeAndWait"],
'Password'=>
["id=yyy","Vicky", "TypeAndTab"]},
"Test2"=>
{'Firstname'=>
["id=xxx","Five", "TypeAndWait"],
'Password'=>
["id=yyy","Sujin", "TypeAndTab"]}}}
I have gotten this result, but I had to write 60 lines of code to do it. I don't think such a long program should be necessary in Ruby, and I strongly believe there is an easier way to achieve this. Can someone help me?
The second hash determines which test cases have to be read. For example, Test3 is not present in the second hash, so the resultant hash doesn't have Test3.
We are given three arrays, which I've renamed arr1, arr2 and arr3. (hash1, hash2 and hash3 are not especially good names for arrays. :-))
arr1 = [{'name'=>'Firstname', 'Locator'=>'id=xxx', 'Action'=>'TypeAndWait'},
{'name'=>'Password', 'Locator'=>'id=yyy', 'Action'=>'TypeAndTab'}]
arr2 = [{'Test Name'=>'Example', 'TestNumber'=>'Test1'},
{'Test Name'=>'Example', 'TestNumber'=>'Test2'}]
arr3=[{'name'=>'Firstname', 'Test1'=>'four', 'Test2'=>'Five', 'Test3'=>'Six'},
{'name'=>'Password', 'Test1'=>'Vicky', 'Test2'=>'Sujin', 'Test3'=>'Sivaram'}]
The drivers are the values "Test1" and "Test2" in the hashes that are elements of arr2. Nothing else in that array is needed, so let's extract those values (of which there could be any number, but here there are just two).
a2 = arr2.map { |h| h['TestNumber'] }
#=> ["Test1", "Test2"]
Next we need to rearrange the information in arr3 by creating a hash whose keys are the elements of a2.
h3 = a2.each_with_object({}) { |test,h|
h[test] = arr3.each_with_object({}) { |f,g| g[f['name']] = f[test] } }
#=> {"Test1"=>{"Firstname"=>"four", "Password"=>"Vicky"},
# "Test2"=>{"Firstname"=>"Five", "Password"=>"Sujin"}}
Next we need to rearrange the content of arr1 by creating a hash whose keys match the keys of the values of h3.
h1 = arr1.each_with_object({}) { |g,h| h[g['name']] = g.reject { |k,_| k == 'name' } }
#=> {"Firstname"=>{"Locator"=>"id=xxx", "Action"=>"TypeAndWait"},
# "Password"=>{"Locator"=>"id=yyy", "Action"=>"TypeAndTab"}}
It is now a simple matter of extracting information from these three objects.
{ 'Example'=>
a2.each_with_object({}) do |test,h|
h[test] = h3[test].each_with_object({}) do |(k,v),g|
f = h1[k]
g[k] = [f['Locator'], v, f['Action']]
end
end
}
#=> {"Example"=>
# {"Test1"=>{"Firstname"=>["id=xxx", "four", "TypeAndWait"],
# "Password"=>["id=yyy", "Vicky", "TypeAndTab"]},
# "Test2"=>{"Firstname"=>["id=xxx", "Five", "TypeAndWait"],
# "Password"=>["id=yyy", "Sujin", "TypeAndTab"]}}}
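For comparison, the same steps can be folded into one method. This is only a sketch under a few assumptions: Ruby 2.6+ (for Array#to_h with a block), every 'name' in arr3 also appears in arr1, and the method name build_result is invented here. Unlike the code above, it groups by 'Test Name' instead of hard-coding 'Example'.

```ruby
# Hypothetical helper combining the three arrays from the question.
def build_result(arr1, arr2, arr3)
  by_name = arr1.to_h { |h| [h['name'], h] } # index arr1 by 'name'
  arr2.group_by { |h| h['Test Name'] }.transform_values do |rows|
    rows.to_h do |row|
      test = row['TestNumber']
      fields = arr3.to_h do |h|
        f = by_name[h['name']] # assumed to exist for every name in arr3
        [h['name'], [f['Locator'], h[test], f['Action']]]
      end
      [test, fields]
    end
  end
end
```

Calling build_result(arr1, arr2, arr3) with the arrays defined above should give the same nested hash as the step-by-step version.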
What you call hash{1-2-3} are arrays in the first place. Also, I am pretty sure you have mistyped hash1#Locator and/or hash3#name. The code below works for this exact data, but it should not be hard to update it to reflect any changes.
hash2.
map(&:values).
group_by(&:shift).
map do |k, v|
[k, v.flatten.map do |k, v|
[k, hash3.map do |h3|
# lookup a hash from hash1
h1 = hash1.find do |h1|
h3['name'].start_with?(h1['Locator'])
end
# can it be nil btw?
[
h1['name'],
[
h3['name'][/.*(?=-id)/],
h3[k],
h1['Action']
]
]
end.to_h]
end.to_h]
end.to_h

Ruby using regex in select block

I've been having a lot of trouble sifting out regex matches. I could use scan, but since it only operates on a string, and I don't want to join the array in question, it is much more tedious. I want to be able to do something like this:
array = ["a1d", "6dh","th3"].select{|x| x =~ /\d/}
# => ["1", "6", "3"]
However, this never seems to work. Is there a workaround, or do I just need to use scan?
Try: Array#map
> array = ["a1d", "6dh","th3"].map {|x| x[/\d+/]}
#=> ["1", "6", "3"]
Note:
select
Returns a new array containing all elements of ary for which the given
block returns a true value.
In your case each element contains a digit, so the block returns a truthy value and select gives you back the original elements. map, by contrast, performs the action on each element and returns a new array containing the results.
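The difference can be seen side by side in a small sketch. The method names with_digit and first_digits are made up for this example.

```ruby
# select only tests the block's value for truthiness and keeps the original element.
def with_digit(strings)
  strings.select { |s| s =~ /\d/ }
end

# map replaces each element with the block's value.
def first_digits(strings)
  strings.map { |s| s[/\d/] }
end

with_digit(["a1d", "6dh", "th3"])
#=> ["a1d", "6dh", "th3"]
first_digits(["a1d", "6dh", "th3"])
#=> ["1", "6", "3"]
```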
You can use grep with a block:
array = ["a1d", "6dh", "th3"]
array.grep(/(\d)/) { $1 }
#=> ["1", "6", "3"]
It passes each matching element to the block and returns an array containing the block's results.
$1 is a special global variable containing the first capture group.
Unlike map, only matching elements are returned:
array = ["a1d", "foo", "6dh", "bar", "th3"]
array.grep(/(\d)/) { $1 }
#=> ["1", "6", "3"]
array.map { |s| s[/\d/] }
#=> ["1", nil, "6", nil, "3"]
Depending on your requirements, you may wish to construct a hash.
arr = ["a1d", "6dh", "th3", "abc", "3for", "rg6", "def"]
arr.each_with_object(Hash.new { |h,k| h[k] = [] }) { |str,h| h[str[/\d+/]] << str }
#=> {"1"=>["a1d"], "6"=>["6dh", "rg6"], "3"=>["th3", "3for"], nil=>["abc", "def"]}
Hash.new { |h,k| h[k] = [] } creates an empty hash with a default block; inside that block, h is the hash and k is the missing key. That means that if the hash does not have a key k, the block is executed, adding the key-value pair k=>[] to the hash, after which h[k] << str is executed.
The above is a condensed (and Ruby-like) way of writing the following.
h = {}
arr.each do |str|
s = str[/\d+/]
h[s] = [] unless h.key?(s)
h[s] << str
end
h
# => {"1"=>["a1d"], "6"=>["6dh", "rg6"], "3"=>["th3", "3for"], nil=>["abc", "def"]}
The each_with_object expression above could alternatively be written
arr.each_with_object({}) { |str,h| (h[str[/\d+/]] ||= []) << str }
h[str[/\d+/]] ||= [] sets h[str[/\d+/]] to an empty array if the hash h does not have a key str[/\d+/].
See Enumerable#each_with_object and Hash::new.
@Stefan suggests
arr.group_by { |str| str[/\d+/] }
#=> {"1"=>["a1d"], "6"=>["6dh", "rg6"], "3"=>["th3", "3for"], nil=>["abc", "def"]}
What can I say?

add up values from 2 arrays based on duplicate values of the other one

A similar question has been answered here. However, I'd like to know how I can add up/group the numbers from one array based on the duplicate values of another array.
test_names = ["TEST1", "TEST1", "TEST2", "TEST3", "TEST2", "TEST4", "TEST4", "TEST4"]
numbers = ["5", "4", "3", "2", "9", "7", "6", "1"]
The ideal result I'd like to get is a hash or an array with:
{"TEST1" => 9, "TEST2" => 12, "TEST3" => 2, "TEST4" => 14}
Another way I found you can do:
test_names.zip(numbers).each_with_object(Hash.new(0)) {
|arr, hsh| hsh[arr[0]] += arr[1].to_i }
You can do it like this:
my_hash = Hash.new(0)
test_names.each_with_index {|name, index| my_hash[name] += numbers[index].to_i}
my_hash
#=> {"TEST1"=>9, "TEST2"=>12, "TEST3"=>2, "TEST4"=>14}
I wish to follow @squidguy's example and use Enumerable#zip, but with a different twist:
{}.tap { |h| test_names.zip(numbers.map(&:to_i)) { |a|
h.update([a].to_h) { |_,o,n| o+n } } }
#=> {"TEST1"=>9, "TEST2"=>12, "TEST3"=>2, "TEST4"=>14}
Object#tap is here just a substitute for Enumerable#each_with_object or for having h={} initially and a last line with just h.
I'm using the form of Hash#update (aka merge!) that takes a block for determining the merged value for each key that is present in both the original hash (h) and the hash being merged ([a].to_h). There are three block variables, the shared key (which we don't use here, so I've replaced it with the placeholder _), and the values for that key for the original hash (o) and for the hash being merged (n).
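If the block form of update feels dense, a different technique gives the same totals: group the zipped pairs by name, then sum each group. This is a sketch assuming Ruby 2.4+ (for Hash#transform_values and Array#sum); sum_by_name is a made-up name.

```ruby
# Hypothetical helper: pairs each name with its number, groups the pairs
# by name, and sums the integer value of the numbers in each group.
def sum_by_name(names, numbers)
  names.zip(numbers)
       .group_by(&:first)
       .transform_values { |pairs| pairs.sum { |_, n| n.to_i } }
end

test_names = ["TEST1", "TEST1", "TEST2", "TEST3", "TEST2", "TEST4", "TEST4", "TEST4"]
numbers = ["5", "4", "3", "2", "9", "7", "6", "1"]
sum_by_name(test_names, numbers)
#=> {"TEST1"=>9, "TEST2"=>12, "TEST3"=>2, "TEST4"=>14}
```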

Consolidate duplicate values of a certain key from an array of hashes into array

I have an array of hashes:
connections = [
{:name=>"John Doe", :number=>"5551234567", :count=>8},
{:name=>"Jane Doe", :number=>"5557654321", :count=>6},
{:name=>"John Doe", :number=>"5559876543", :count=>3}
]
If the :name value is a duplicate, as is the case with John Doe, it should combine the :number values into an array. The count is not important anymore, so the output should be in the following format:
{"John Doe"=>["5551234567","5559876543"],
"Jane Doe"=>["5557654321"]}
What I have so far is:
k = connections.inject(Hash.new{ |h,k| h[k[:name]] = [k[:number]] }) { |h,(k,v)| h[k] << v ; h }
But this only outputs
{"John Doe"=>["5559876543", nil], "Jane Doe"=>["5557654321", nil]}
This works:
connections.group_by do |h|
h[:name]
end.inject({}) do |h,(k,v)|
h.merge( { k => (v.map do |i| i[:number] end) } )
end
# => {"John Doe"=>["5551234567", "5559876543"], "Jane Doe"=>["5557654321"]}
Step by step...
connections is the same as in your post:
connections
# => [{:name=>"John Doe", :number=>"5551234567", :count=>8},
# {:name=>"Jane Doe", :number=>"5557654321", :count=>6}, {:name=>"John Doe",
# :number=>"5559876543", :count=>3}]
First we use group_by to combine the hash entries with the same :name:
connections.group_by do |h| h[:name] end
# => {"John Doe"=>[{:name=>"John Doe", :number=>"5551234567", :count=>8},
# {:name=>"John Doe", :number=>"5559876543", :count=>3}],
# "Jane Doe"=>[{:name=>"Jane Doe", :number=>"5557654321", :count=>6}]}
That's great, but we want the values of the result hash to be just the numbers that show up as values of the :number key, not the full original entry hashes.
Given just one of the list values, we can get the desired result this way:
[{:name=>"John Doe", :number=>"5551234567", :count=>8},
{:name=>"John Doe", :number=>"5559876543", :count=>3}].map do |i|
i[:number]
end
# => ["5551234567", "5559876543"]
But we want to do that to all of the list values at once, while keeping the association with their keys. It's basically a nested map operation, but the outer map runs across a Hash instead of an Array.
You can in fact do it with map. The only tricky part is that map on a Hash doesn't return a Hash, but an Array of nested [key,value] Arrays. By wrapping the call in a Hash[...] constructor, you can turn the result back into a Hash:
Hash[
connections.group_by do |h|
h[:name]
end.map do |k,v|
[ k, (v.map do |i| i[:number] end) ]
end
]
That returns the same result as my original full answer above, and is arguably clearer, so you might want to just use that version.
But the mechanism I used instead was inject. It's like map, but instead of just returning an Array of the return values from the block, it gives you full control over how the return value is constructed out of the individual block calls:
connections.group_by do |h|
h[:name]
end.inject({}) do |h,(k,v)|
h.merge( { k => (v.map do |i| i[:number] end) } )
end
That creates a new Hash, which starts out empty (the {} passed to inject), and passes it to the do block (where it shows up as h) along with the first key/value pair in the Hash returned by group_by. That block creates another new Hash with the single key passed in and the result of transforming the value as we did above, and merges that into the passed-in one, returning the new value - basically, it adds one new key/value pair to the Hash, with the value transformed into the desired form by the inner map. The new Hash is returned from the block, so it becomes the new value of h for the next time through the block.
(We could also just assign the entry into h directly with h[k] = v.map ..., but the block would then need to return h afterward as a separate statement, since it is the return value of the block, and not the value of h at the end of the block's execution, that gets passed to the next iteration.)
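A minimal sketch of that direct-assignment variant, wrapped in a method whose name (numbers_by_name) is invented for illustration:

```ruby
# Hypothetical helper: same grouping as above, but assigns into the
# accumulator instead of merging a one-entry hash.
def numbers_by_name(connections)
  connections.group_by { |h| h[:name] }.inject({}) do |h, (k, v)|
    h[k] = v.map { |i| i[:number] }
    h # return the accumulator; the assignment alone would return the array
  end
end

connections = [
  { name: "John Doe", number: "5551234567", count: 8 },
  { name: "Jane Doe", number: "5557654321", count: 6 },
  { name: "John Doe", number: "5559876543", count: 3 }
]
numbers_by_name(connections)
#=> {"John Doe"=>["5551234567", "5559876543"], "Jane Doe"=>["5557654321"]}
```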
As an aside: I used do...end instead of {...} around my blocks to avoid confusion with the {...} used for Hash literals. There is no semantic difference; it's purely a matter of style. In standard Ruby style, you would use {...} for single-line blocks, and restrict do...end to blocks that span more than one line.
In one line:
k = connections.each.with_object({}) {|conn,result| (result[conn[:name]] ||= []) << conn[:number] }
More readable:
result = Hash.new {|h,k| h[k] = [] }
connections.each {|conn| result[conn[:name]] << conn[:number] }
result #=> {"John Doe"=>["5551234567", "5559876543"], "Jane Doe"=>["5557654321"]}
names = {}
connections.each{ |c| names[c[:name]] ||= []; names[c[:name]].push(c[:number]) }
puts names
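As for why the attempt in the question produces nil entries: inject yields each element of connections, which is a Hash, and the (k,v) pattern in the block parameters only splits objects that respond to to_ary. A Hash does not, so k receives the whole hash and v is nil. The default block then stores a fresh [k[:number]] under k[:name] on every call (overwriting earlier numbers), and << v appends nil. The sketch below mimics that destructuring; destructure_pair is a made-up name.

```ruby
# Hypothetical helper that applies the same |(k, v)| pattern to one element.
def destructure_pair(element)
  [element].map { |(k, v)| [k, v] }.first
end

destructure_pair([:a, 1])
#=> [:a, 1]
destructure_pair({ name: "John Doe", number: "5551234567" })
#=> [{:name=>"John Doe", :number=>"5551234567"}, nil]
```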
