Extend an array of hash with values from an array - ruby

I have this array
types = ['first', 'second', 'third']
and this array of hashes
data = [{query: "A"}, {query: "B"}, {query:"C", type: 'first'}]
Now I have to "extend" each Hash of data with each type if not already exists. All existing keys of the hash must be copied too (eg. :query).
So the final result must be:
results = [
{query: "A", type: 'first'}, {query: "A", type: "second"}, {query: "A", type: "third"},
{query: "B", type: 'first'}, {query: "B", type: "second"}, {query: "D", type: "third"},
{query: "C", type: 'first'}, {query: "C", type: "second"}, {query: "C", type: "third"}
]
the data array is quite big for performance matters.

You can use Array#product to combine both arrays and Hash#merge to add the :type key:
data.product(types).map { |h, t| h.merge(type: t) }
#=> [
# {:query=>"A", :type=>"first"}, {:query=>"A", :type=>"second"}, {:query=>"A", :type=>"third"},
# {:query=>"B", :type=>"first"}, {:query=>"B", :type=>"second"}, {:query=>"B", :type=>"third"},
# {:query=>"C", :type=>"first"}, {:query=>"C", :type=>"second"}, {:query=>"C", :type=>"third"}
# ]
Note that the above will replace existing values for :type with the values from the types array. (there can only be one :type per hash)
If you need more complex logic, you can pass a block to merge which handles existing / conflicting keys, e.g.:
h = { query: 'C', type: 'first' }
t = 'third'
h.merge(type: t) { |h, v1, v2| v1 } # preserve existing value
#=> {:query=>"C", :type=>"first"}
h.merge(type: t) { |h, v1, v2| [v1, v2] } # put both values in an array
#=> {:query=>"C", :type=>["first", "third"]}

We see that each hash in data is mapped to an array of three hashes and the resulting array of three arrays is then to be flattended, suggesting we skip a step by using the method Enumerable#flat_map on data. The construct is as follows.
n = types.size
#=> 3
data.flat_map { |h| n.times.map { |i| ... } }
where ... produces a hash such as
{:query=>"A", :type=>"second"}
Next we see that the value of :type in the array of hashes returned equals :first then :second then :third then :first and so on. That is, the value cycles among the elements of types. Also, the fact that one of the hashes in data has a key :type is irrelevant, as it will be overwritten. Therefore, for each value of i (0, 1 or 2) in map's block above, we wish to merge h with { type: types[i%n] }:
n = types.size
data.flat_map { |h| n.times.map { |i| h.merge(type: types[i%n]) } }
#=> [{:query=>"A", :type=>"first"}, {:query=>"A", :type=>"second"},
# {:query=>"A", :type=>"third"},
# {:query=>"B", :type=>"first"}, {:query=>"B", :type=>"second"},
# {:query=>"B", :type=>"third"},
# {:query=>"C", :type=>"first"}, {:query=>"C", :type=>"second"},
# {:query=>"C", :type=>"third"}]
We may alternatively make use of the method Array#cycle.
enum = types.cycle
#=> #<Enumerator: ["first", "second", "third"]:cycle>
As the name of the method suggests,
enum.next
#=> "first"
enum.next
#=> "second"
enum.next
#=> "third"
enum.next
#=> "first"
enum.next
#=> "second"
...
ad infinitum. Before continuing let me reset the enumerator.
enum.rewind
See Enumerator#next and Enumerator#rewind.
n = types.size
data.flat_map { |h| n.times.map { h.merge(type: enum.next) } }
#=> <as above>

Related

Merge Ruby Hash values with same key

Is this possible to achieve with selected keys:
Eg
h = [
{a: 1, b: "Hello", c: "Test1"},
{a: 2, b: "Hey", c: "Test1"},
{a: 3, b: "Hi", c: "Test2"}
]
Expected Output
[
{a: 1, b: "Hello, Hey", c: "Test1"}, # See here, I don't want key 'a' to be merged
{a: 3, b: "Hi", c: "Test2"}
]
My Try
g = h.group_by{|k| k[:c]}.values
OUTPUT =>
[
[
{:a=>1, :b=>"Hello", :c=>"Test1"},
{:a=>2, :b=>"Hey", :c=>"Test1"}
], [
{:a=>3, :b=>"Hi", :c=>"Test2"}
]
]
g.each do |v|
if v.length > 1
c = v.reduce({}) do |s, l|
s.merge(l) { |_, a, b| [a, b].uniq.join(", ") }
end
end
p c #{:a=>"1, 2", :b=>"Hello, Hey", :c=>"Test1"}
end
So, the output I get is
{:a=>"1, 2", :b=>"Hello, Hey", :c=>"Test1"}
But, I needed
{a: 1, b: "Hello, Hey", c: "Test1"}
NOTE: This is just a test array of HASH I have taken to put my question. But, the actual hash has a lots of keys. So, please don't reply with key comparison answers
I need a less complex solution
I can't see a simpler version of your code. To make it fully work, you can use the first argument in the merge block instead of dismissing it to differentiate when you need to merge a and b or when you just use a. Your line becomes:
s.merge(l) { |key, a, b| key == :a ? a : [a, b].uniq.join(", ") }
Maybe you can consider this option, but I don't know if it is less complex:
h.group_by { |h| h[:c] }.values.map { |tmp| tmp[0].merge(*tmp[1..]) { |key, oldval, newval| key == :b ? [oldval, newval].join(' ') : oldval } }
#=> [{:a=>1, :b=>"Hello Hey", :c=>"Test1"}, {:a=>3, :b=>"Hi", :c=>"Test2"}]
The first part groups the hashes by :c
h.group_by { |h| h[:c] }.values #=> [[{:a=>1, :b=>"Hello", :c=>"Test1"}, {:a=>2, :b=>"Hey", :c=>"Test1"}], [{:a=>3, :b=>"Hi", :c=>"Test2"}]]
Then it maps to merge the first elements with others using Hash#merge
h.each_with_object({}) do |g,h|
h.update(g[:c]=>g) { |_,o,n| o.merge(b: "#{o[:b]}, #{n[:b]}") }
end.values
#=> [{:a=>1, :b=>"Hello, Hey", :c=>"Test1"},
# {:a=>3, :b=>"Hi", :c=>"Test2"}]
This uses the form of Hash#update that employs a block (here { |_,o,n| o.merge(b: "#{o[:b]}, #{n[:b]}") }) to determine the values of keys that are present in both hashes being merged. The first block variable holds the common key. I’ve used an underscore for that variable mainly to signal to the reader that it is not used in the block calculation. See the doc for definitions of the other two block variables.
Note that the receiver of values equals the following.
h.each_with_object({}) do |g,h|
h.update(g[:c]=>g) { |_,o,n| o.merge(b: "#{o[:b]}, #{n[:b]}") }
end
#=> { “Test1”=>{:a=>1, :b=>"Hello, Hey", :c=>"Test1"},
# “Test2=>{:a=>3, :b=>"Hi", :c=>"Test2"} }

Questions on implementing hashes in ruby

I'm new to ruby, I am solving a problem that involves hashes and key. The problem asks me to Implement a method, #pet_types, that accepts a hash as an argument. The hash uses people's # names as keys, and the values are arrays of pet types that the person owns. My question is about using Hash#each method to iterate through each num inside the array. I was wondering if there's any difference between solving the problem using hash#each or hash.sort.each?
I spent several hours coming up different solution and still to figure out what are the different approaches between the 2 ways of solving the problem below.
I include my code in repl.it: https://repl.it/H0xp/6 or you can see below:
# Pet Types
# ------------------------------------------------------------------------------
# Implement a method, #pet_types, that accepts a hash as an argument. The hash uses people's
# names as keys, and the values are arrays of pet types that the person owns.
# Example input:
# {
# "yi" => ["dog", "cat"],
# "cai" => ["dog", "cat", "mouse"],
# "venus" => ["mouse", "pterodactyl", "chinchilla", "cat"]
# }
def pet_types(owners_hash)
results = Hash.new {|h, k| h[k] = [ ] }
owners_hash.sort.each { |k, v| v.each { |pet| results[pet] << k } }
results
end
puts "-------Pet Types-------"
owners_1 = {
"yi" => ["cat"]
}
output_1 = {
"cat" => ["yi"]
}
owners_2 = {
"yi" => ["cat", "dog"]
}
output_2 = {
"cat" => ["yi"],
"dog" => ["yi"]
}
owners_3 = {
"yi" => ["dog", "cat"],
"cai" => ["dog", "cat", "mouse"],
"venus" => ["mouse", "pterodactyl", "chinchilla", "cat"]
}
output_3 = {
"dog" => ["cai", "yi"],
"cat" => ["cai", "venus", "yi"],
"mouse" => ["cai", "venus"],
"pterodactyl" => ["venus"],
"chinchilla" => ["venus"]
}
# method 2
# The 2nd and 3rd method should return a hash that uses the pet types as keys and the values should
# be a list of the people that own that pet type. The names in the output hash should
# be sorted alphabetically
# switched_hash = Hash.new()
# owners_hash.each do |owner, pets_array|
# pets_array.each do |pet|
# select_owners = owners_hash.select { |owner, pets_array|
owners_hash[owner].include?(pet) }
# switched_hash[pet] = select_owners.keys.sort
# end
# end
# method 3
#switched_hash
# pets = Hash.new {|h, k| h[k] = [ ] } # WORKS SAME AS: pets = Hash.new( Array.new )
# owners = owners_hash.keys.sort
# owners.each do |owner|
# owners_hash[owner].each do |pet|
# pets[pet] << owner
# end
# end
# pets
# Example output:
# output_3 = {
# "dog" => ["cai", "yi"],
# "cat" => ["cai", "venus", "yi"], ---> (sorted alphabetically!)
# "mouse" => ["cai", "venus"],
# "pterodactyl" => ["venus"],
# "chinchilla" => ["venus"]
# }
I used a hash data structure in my program to first solve this problem. Then I tried to rewrite it using the pet_hash. And my final codes is the following:
def pet_types(owners_hash)
pets_hash = Hash.new { |k, v| v = [] }
owners_hash.each do |owner, pets|
pets.each do |pet|
pets_hash[pet] += [owner]
end
end
pets_hash.values.each(&:sort!)
pets_hash
end
puts "-------Pet Types-------"
owners_1 = {
"yi" => ["cat"]
}
output_1 = {
"cat" => ["yi"]
}
owners_2 = {
"yi" => ["cat", "dog"]
}
output_2 = {
"cat" => ["yi"],
"dog" => ["yi"]
}
owners_3 = {
"yi" => ["dog", "cat"],
"cai" => ["dog", "cat", "mouse"],
"venus" => ["mouse", "pterodactyl", "chinchilla", "cat"]
}
output_3 = {
"dog" => ["cai", "yi"],
"cat" => ["cai", "venus", "yi"],
"mouse" => ["cai", "venus"],
"pterodactyl" => ["venus"],
"chinchilla" => ["venus"]
}
puts pet_types(owners_1) == output_1
puts pet_types(owners_2) == output_2
puts pet_types(owners_3) == output_3
Hash#sort has the same effect (at least for my basic test) as Hash#to_a followed by Array#sort.
hash = {b: 2, a: 1}
hash.to_a.sort # => [[:a, 1, [:b, 2]]
hash.sort # => the same
Now let's look at #each, both on Hash and Array.
When you provide two arguments to the block, that can handle both cases. For the hash, the first argument will be the key and the second will be the value. For the nested array, the values essentially get splatted out to the args:
[[:a, 1, 2], [:b, 3, 4]].each { |x, y, z| puts "#{x}-#{y}-#{z}" }
# => a-1-2
# => b-3-4
So basically, you should think of Hash#sort to be a shortcut to Hash#to_a followed by Array#sort, and recognize that #each will work the same on a hash as a hash converted to array (a nested array). In this case, it doesn't matter which approach you take. Clearly if you need to sort iteration by the keys then you should use sort.

Multiple sub-hashes out of one hash

I have a hash:
hash = {"a_1_a" => "1", "a_1_b" => "2", "a_1_c" => "3", "a_2_a" => "3",
"a_2_b" => "4", "a_2_c" => "4"}
What's the best way to get the following sub-hashes:
[{"a_1_a" => "1", "a_1_b" => "2", "a_1_c" => "3"},
{"a_2_a" => "3", "a_2_b" => "4", "a_2_c" => "4"}]
I want them grouped by the key, based on the regexp /^a_(\d+)/. I'll have 50+ key/value pairs in the original hash, so something dynamic would work best, if anyone has any suggestions.
If you're only concerned about the middle component you can use group_by to get you most of the way there:
hash.group_by do |k,v|
k.split('_')[1]
end.values.map do |list|
Hash[list]
end
# => [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"}, {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
The final step is extracting the grouped lists and combining those back into the required hashes.
Code
def partition_hash(hash)
hash.each_with_object({}) do |(k,v), h|
key = k[/(?<=_).+(?=_)/]
h[key] = (h[key] || {}).merge(k=>v)
end.values
end
Example
hash = {"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3", "a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}
partition_hash(hash)
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
Explanation
The steps are as follows.
enum = hash.each_with_object({})
#=> #<Enumerator: {"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3", "a_2_a"=>"3",
# "a_2_b"=>"4", "a_2_c"=>"4"}:each_with_object({})>
The first element of this enumerator is generated and passed to the block, and the block variables are computed using parallel assignment.
(k,v), h = enum.next
#=> [["a_1_a", "1"], {}]
k #=> "a_1_a"
v #=> "1"
h #=> {}
and the block calculation is performed.
key = k[/(?<=_).+(?=_)/]
#=> "1"
h[key] = (h[key] || {}).merge(k=>v)
#=> h["1"] = (h["1"] || {}).merge("a_1_a"=>"1")
#=> h["1"] = (nil || {}).merge("a_1_a"=>"1")
#=> h["1"] = {}.merge("a_1_a"=>"1")
#=> h["1"] = {"a_1_a"=>"1"}
so now
h #=> {"1"=>{"a_1_a"=>"1"}}
The next value of enum is now generated and passed to the block, and the following calculations are performed.
(k,v), h = enum.next
#=> [["a_1_b", "2"], {"1"=>{"a_1_a"=>"1"}}]
k #=> "a_1_b"
v #=> "2"
h #=> {"1"=>{"a_1_a"=>"1"}}
key = k[/(?<=_).+(?=_)/]
#=> "1"
h[key] = (h[key] || {}).merge(k=>v)
#=> h["1"] = (h["1"] || {}).merge("a_1_b"=>"2")
#=> h["1"] = ({"a_1_a"=>"1"}} || {}).merge("a_1_b"=>"2")
#=> h["1"] = {"a_1_a"=>"1"}}.merge("a_1_b"=>"2")
#=> h["1"] = {"a_1_a"=>"1", "a_1_b"=>"2"}
After the remaining four elements of enum have been passed to the block the following has is returned.
h #=> {"1"=>{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# "2"=>{"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}}
The final step is simply to extract the values.
h.values
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]

Ruby chain two 'group_by' methods

I have an array of objects that looks like this:
[
{day: 'Monday', class: 1, name: 'X'},
{day: 'Monday', class: 2, name: 'Y'},
{day: 'Tuesday', class: 1, name: 'Z'},
{day: 'Monday', class: 1, name: 'T'}
]
I want to group them by days, and then by classes i.e.
groupedArray['Monday'] => {'1' => [{name: 'X'}, {name: 'T'}], '2' => [{name: 'Y'}]}
I've seen there is a
group_by { |a| [a.day, a.class]}
But this creates a hash with a [day, class] key.
Is there a way I can achieve this, without having to group them first by day, and then iterate through each day, and group them by class, then pushing them into a new hash?
arr = [
{day: 'Monday', class: 1, name: 'X'},
{day: 'Monday', class: 2, name: 'Y'},
{day: 'Tuesday', class: 1, name: 'Z'},
{day: 'Monday', class: 1, name: 'T'}
]
One way of obtaining the desired hash is to use the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged. Here that is done twice, first when values of :day are the same, then for each such occurrence, when the values of :class are the same (for a given value of :day).
arr.each_with_object({}) { |g,h|
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] }) { |_,h1,h2|
h1.update(h2) { |_,p,q| p+q } } }
#=> {"Monday" =>{"1"=>[{:name=>"X"}, {:name=>"T"}], "2"=>[{:name=>"Y"}]},
# "Tuesday"=>{"1"=>[{:name=>"Z"}]}}
The steps are as follows.
enum = arr.each_with_object({})
#=> #<Enumerator: [{:day=>"Monday", :class=>1, :name=>"X"},
# {:day=>"Monday", :class=>2, :name=>"Y"},
# {:day=>"Tuesday", :class=>1, :name=>"Z"},
# {:day=>"Monday", :class=>1, :name=>"T"}]:each_with_object({})>
We can see the values that will be generated by this enumerator by converting it to an array:
enum.to_a
#=> [[{:day=>"Monday", :class=>1, :name=>"X"}, {}],
# [{:day=>"Monday", :class=>2, :name=>"Y"}, {}],
# [{:day=>"Tuesday", :class=>1, :name=>"Z"}, {}],
# [{:day=>"Monday", :class=>1, :name=>"T"}, {}]]
The empty hash in each array is the hash being built and returned. It is initially empty, but will be partially formed as each element of enum is processed.
The first element of enum is passed to the block (by Enumerator#each) and the block variables are assigned using parallel assignment (somtimes called multiple assignment):
g,h = enum.next
#=> [{:day=>"Monday", :class=>1, :name=>"X"}, {}]
g #=> {:day=>"Monday", :class=>1, :name=>"X"}
h #=> {}
We now perform the block calculation:
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] })
#=> {}.update("Monday"=>{ "1"=>[{name: "X"}] })
#=> {"Monday"=>{"1"=>[{:name=>"X"}]}}
This operation returns the updated value of h, the hash being constructed.
Note that update's argument
"Monday"=>{ "1"=>[{name: "X"}] }
is shorthand for
{ "Monday"=>{ "1"=>[{name: "X"}] } }
Because the key "Monday" was not present in both hashes being merged (h had no keys), the block
{ |_,h1,h2| h1.update(h2) { |_,p,q| p+q } } }
was not used to determine the value of "Monday".
Now the next value of enum is passed to the block and the block variables are assigned:
g,h = enum.next
#=> [{:day=>"Monday", :class=>2, :name=>"Y"},
# {"Monday"=>{"1"=>[{:name=>"X"}]}}]
g #=> {:day=>"Monday", :class=>2, :name=>"Y"}
h #=> {"Monday"=>{"1"=>[{:name=>"X"}]}}
Note that h was updated. We now perform the block calculation:
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] })
# {"Monday"=>{"1"=>[{:name=>"X"}]}}.update("Monday"=>{ "2"=>[{name: "Y"}] })
Both hashes being merged share the key "Monday". We therefore must use the block to determine the merged value of "Monday":
{ |k,h1,h2| h1.update(h2) { |m,p,q| p+q } } }
#=> {"1"=>[{:name=>"X"}]}.update("2"=>[{name: "Y"}])
#=> {"1"=>[{:name=>"X"}], "2"=>[{:name=>"Y"}]}
See the doc for update for an explanation of the block variables k, h1 and h2 for the outer update and m, p and q for the inner update. k and m are the values of the common key. As they are not used in the block calculations, I have replaced them with underscores, which is common practice.
So now:
h #=> { "Monday" => { "1"=>[{ :name=>"X" }], "2"=>[{ :name=>"Y"}] } }
Prior to this operation the hash h["Monday] did not yet have a key 2, so the second update did not require use of the block
{ |_,p,q| p+q }
This block is used, however, when the last element of enum is merged into h, since the values of both :day and :class are the same for the two hashes being merged.
The remaining calculations are similar.

Ruby - sort_by using dynamic keys

I have an array of hashes:
array = [
{
id: 1,
name: "A",
points: 20,
victories: 4,
goals: 5,
},
{
id: 1,
name: "B",
points: 20,
victories: 4,
goals: 8,
},
{
id: 1,
name: "C",
points: 21,
victories: 5,
goals: 8,
}
]
To sort them using two keys I do:
array = array.group_by do |key|
[key[:points], key[:goals]]
end.sort_by(&:first).map(&:last)
But in my program, the sort criterias are stored in a database and I can get them and store in a array for example: ["goals","victories"] or ["name","goals"].
How can I sort the array using dinamic keys?
I tried many ways with no success like this:
criterias_block = []
criterias.each do |criteria|
criterias_block << "key[:#{criteria}]"
end
array = array.group_by do |key|
criterias_block
end.sort_by(&:first).map(&:last)
Array#sort can do this
criteria = [:points, :goals]
array.sort_by { |entry|
criteria.map { |c| entry[c] }
}
#=> [{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]
This works because if you sort an array [[1,2], [1,1], [2,3]], it sorts by the first elements, using any next elements to break ties
You can use values_at:
criteria = ["goals", "victories"]
criteria = criteria.map(&:to_sym)
array = array.group_by do |key|
key.values_at(*criteria)
end.sort_by(&:first).map(&:last)
# => [[{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]]
values_at returns an array of all the keys requested:
array[0].values_at(*criteria)
# => [4, 5]
I suggest doing it like this.
Code
def sort_it(array,*keys)
array.map { |h| [h.values_at(*keys), h] }.sort_by(&:first).map(&:last)
end
Examples
For array as given by you:
sort_it(array, :goals, :victories)
#=> [{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]
sort_it(array, :name, :goals)
#=> [{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]
For the first of these examples, you could of course write:
sort_it(array, *["goals", "victories"].map(&:to_sym))

Resources