Ruby chain two 'group_by' methods - ruby

I have an array of objects that looks like this:
[
{day: 'Monday', class: 1, name: 'X'},
{day: 'Monday', class: 2, name: 'Y'},
{day: 'Tuesday', class: 1, name: 'Z'},
{day: 'Monday', class: 1, name: 'T'}
]
I want to group them by days, and then by classes i.e.
groupedArray['Monday'] => {'1' => [{name: 'X'}, {name: 'T'}], '2' => [{name: 'Y'}]}
I've seen there is a
group_by { |a| [a.day, a.class]}
But this creates a hash with a [day, class] key.
Is there a way I can achieve this, without having to group them first by day, and then iterate through each day, and group them by class, then pushing them into a new hash?

arr = [
{day: 'Monday', class: 1, name: 'X'},
{day: 'Monday', class: 2, name: 'Y'},
{day: 'Tuesday', class: 1, name: 'Z'},
{day: 'Monday', class: 1, name: 'T'}
]
One way of obtaining the desired hash is to use the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged. Here that is done twice, first when values of :day are the same, then for each such occurrence, when the values of :class are the same (for a given value of :day).
arr.each_with_object({}) { |g,h|
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] }) { |_,h1,h2|
h1.update(h2) { |_,p,q| p+q } } }
#=> {"Monday" =>{"1"=>[{:name=>"X"}, {:name=>"T"}], "2"=>[{:name=>"Y"}]},
# "Tuesday"=>{"1"=>[{:name=>"Z"}]}}
The steps are as follows.
enum = arr.each_with_object({})
#=> #<Enumerator: [{:day=>"Monday", :class=>1, :name=>"X"},
# {:day=>"Monday", :class=>2, :name=>"Y"},
# {:day=>"Tuesday", :class=>1, :name=>"Z"},
# {:day=>"Monday", :class=>1, :name=>"T"}]:each_with_object({})>
We can see the values that will be generated by this enumerator by converting it to an array:
enum.to_a
#=> [[{:day=>"Monday", :class=>1, :name=>"X"}, {}],
# [{:day=>"Monday", :class=>2, :name=>"Y"}, {}],
# [{:day=>"Tuesday", :class=>1, :name=>"Z"}, {}],
# [{:day=>"Monday", :class=>1, :name=>"T"}, {}]]
The empty hash in each array is the hash being built and returned. It is initially empty, but will be partially formed as each element of enum is processed.
The first element of enum is passed to the block (by Enumerator#each) and the block variables are assigned using parallel assignment (somtimes called multiple assignment):
g,h = enum.next
#=> [{:day=>"Monday", :class=>1, :name=>"X"}, {}]
g #=> {:day=>"Monday", :class=>1, :name=>"X"}
h #=> {}
We now perform the block calculation:
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] })
#=> {}.update("Monday"=>{ "1"=>[{name: "X"}] })
#=> {"Monday"=>{"1"=>[{:name=>"X"}]}}
This operation returns the updated value of h, the hash being constructed.
Note that update's argument
"Monday"=>{ "1"=>[{name: "X"}] }
is shorthand for
{ "Monday"=>{ "1"=>[{name: "X"}] } }
Because the key "Monday" was not present in both hashes being merged (h had no keys), the block
{ |_,h1,h2| h1.update(h2) { |_,p,q| p+q } } }
was not used to determine the value of "Monday".
Now the next value of enum is passed to the block and the block variables are assigned:
g,h = enum.next
#=> [{:day=>"Monday", :class=>2, :name=>"Y"},
# {"Monday"=>{"1"=>[{:name=>"X"}]}}]
g #=> {:day=>"Monday", :class=>2, :name=>"Y"}
h #=> {"Monday"=>{"1"=>[{:name=>"X"}]}}
Note that h was updated. We now perform the block calculation:
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] })
# {"Monday"=>{"1"=>[{:name=>"X"}]}}.update("Monday"=>{ "2"=>[{name: "Y"}] })
Both hashes being merged share the key "Monday". We therefore must use the block to determine the merged value of "Monday":
{ |k,h1,h2| h1.update(h2) { |m,p,q| p+q } } }
#=> {"1"=>[{:name=>"X"}]}.update("2"=>[{name: "Y"}])
#=> {"1"=>[{:name=>"X"}], "2"=>[{:name=>"Y"}]}
See the doc for update for an explanation of the block variables k, h1 and h2 for the outer update and m, p and q for the inner update. k and m are the values of the common key. As they are not used in the block calculations, I have replaced them with underscores, which is common practice.
So now:
h #=> { "Monday" => { "1"=>[{ :name=>"X" }], "2"=>[{ :name=>"Y"}] } }
Prior to this operation the hash h["Monday] did not yet have a key 2, so the second update did not require use of the block
{ |_,p,q| p+q }
This block is used, however, when the last element of enum is merged into h, since the values of both :day and :class are the same for the two hashes being merged.
The remaining calculations are similar.

Related

Extend an array of hash with values from an array

I have this array
types = ['first', 'second', 'third']
and this array of hashes
data = [{query: "A"}, {query: "B"}, {query:"C", type: 'first'}]
Now I have to "extend" each Hash of data with each type if not already exists. All existing keys of the hash must be copied too (eg. :query).
So the final result must be:
results = [
{query: "A", type: 'first'}, {query: "A", type: "second"}, {query: "A", type: "third"},
{query: "B", type: 'first'}, {query: "B", type: "second"}, {query: "D", type: "third"},
{query: "C", type: 'first'}, {query: "C", type: "second"}, {query: "C", type: "third"}
]
the data array is quite big for performance matters.
You can use Array#product to combine both arrays and Hash#merge to add the :type key:
data.product(types).map { |h, t| h.merge(type: t) }
#=> [
# {:query=>"A", :type=>"first"}, {:query=>"A", :type=>"second"}, {:query=>"A", :type=>"third"},
# {:query=>"B", :type=>"first"}, {:query=>"B", :type=>"second"}, {:query=>"B", :type=>"third"},
# {:query=>"C", :type=>"first"}, {:query=>"C", :type=>"second"}, {:query=>"C", :type=>"third"}
# ]
Note that the above will replace existing values for :type with the values from the types array. (there can only be one :type per hash)
If you need more complex logic, you can pass a block to merge which handles existing / conflicting keys, e.g.:
h = { query: 'C', type: 'first' }
t = 'third'
h.merge(type: t) { |h, v1, v2| v1 } # preserve existing value
#=> {:query=>"C", :type=>"first"}
h.merge(type: t) { |h, v1, v2| [v1, v2] } # put both values in an array
#=> {:query=>"C", :type=>["first", "third"]}
We see that each hash in data is mapped to an array of three hashes and the resulting array of three arrays is then to be flattended, suggesting we skip a step by using the method Enumerable#flat_map on data. The construct is as follows.
n = types.size
#=> 3
data.flat_map { |h| n.times.map { |i| ... } }
where ... produces a hash such as
{:query=>"A", :type=>"second"}
Next we see that the value of :type in the array of hashes returned equals :first then :second then :third then :first and so on. That is, the value cycles among the elements of types. Also, the fact that one of the hashes in data has a key :type is irrelevant, as it will be overwritten. Therefore, for each value of i (0, 1 or 2) in map's block above, we wish to merge h with { type: types[i%n] }:
n = types.size
data.flat_map { |h| n.times.map { |i| h.merge(type: types[i%n]) } }
#=> [{:query=>"A", :type=>"first"}, {:query=>"A", :type=>"second"},
# {:query=>"A", :type=>"third"},
# {:query=>"B", :type=>"first"}, {:query=>"B", :type=>"second"},
# {:query=>"B", :type=>"third"},
# {:query=>"C", :type=>"first"}, {:query=>"C", :type=>"second"},
# {:query=>"C", :type=>"third"}]
We may alternatively make use of the method Array#cycle.
enum = types.cycle
#=> #<Enumerator: ["first", "second", "third"]:cycle>
As the name of the method suggests,
enum.next
#=> "first"
enum.next
#=> "second"
enum.next
#=> "third"
enum.next
#=> "first"
enum.next
#=> "second"
...
ad infinitum. Before continuing let me reset the enumerator.
enum.rewind
See Enumerator#next and Enumerator#rewind.
n = types.size
data.flat_map { |h| n.times.map { h.merge(type: enum.next) } }
#=> <as above>

Merge hash of arrays into array of hashes

So, I have a hash with arrays, like this one:
{"name": ["John","Jane","Chris","Mary"], "surname": ["Doe","Doe","Smith","Martins"]}
I want to merge them into an array of hashes, combining the corresponding elements.
The results should be like that:
[{"name"=>"John", "surname"=>"Doe"}, {"name"=>"Jane", "surname"=>"Doe"}, {"name"=>"Chris", "surname"=>"Smith"}, {"name"=>"Mary", "surname"=>"Martins"}]
Any idea how to do that efficiently?
Please, note that the real-world use scenario could contain a variable number of hash keys.
Try this
h[:name].zip(h[:surname]).map do |name, surname|
{ 'name' => name, 'surname' => surname }
end
I suggest writing the code to permit arbitrary numbers of attributes. It's no more difficult than assuming there are two (:name and :surname), yet it provides greater flexibility, accommodating, for example, future changes to the number or naming of attributes:
def squish(h)
keys = h.keys.map(&:to_s)
h.values.transpose.map { |a| keys.zip(a).to_h }
end
h = { name: ["John", "Jane", "Chris"],
surname: ["Doe", "Doe", "Smith"],
age: [22, 34, 96]
}
squish(h)
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
The steps for the example above are as follows:
b = h.keys
#=> [:name, :surname, :age]
keys = b.map(&:to_s)
#=> ["name", "surname", "age"]
c = h.values
#=> [["John", "Jane", "Chris"], ["Doe", "Doe", "Smith"], [22, 34, 96]]
d = c.transpose
#=> [["John", "Doe", 22], ["Jane", "Doe", 34], ["Chris", "Smith", 96]]
d.map { |a| keys.zip(a).to_h }
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
In the last step the first value of b is passed to map's block and the block variable is assigned its value.
a = d.first
#=> ["John", "Doe", 22]
e = keys.zip(a)
#=> [["name", "John"], ["surname", "Doe"], ["age", 22]]
e.to_h
#=> {"name"=>"John", "surname"=>"Doe", "age"=>22}
The remaining calculations are similar.
If your dataset is really big, you can consider using Enumerator::Lazy.
This way Ruby will not create intermediate arrays during calculations.
This is how #Ursus answer can be improved:
h[:name]
.lazy
.zip(h[:surname])
.map { |name, surname| { 'name' => name, 'surname' => surname } }
.to_a
Other option for the case where:
[..] the real-world use scenario could contain a variable number of hash keys
h = {
'name': ['John','Jane','Chris','Mary'],
'surname': ['Doe','Doe','Smith','Martins'],
'whathever': [1, 2, 3, 4, 5]
}
You could use Object#then with a splat operator in a one liner:
h.values.then { |a, *b| a.zip *b }.map { |e| (h.keys.zip e).to_h }
#=> [{:name=>"John", :surname=>"Doe", :whathever=>1}, {:name=>"Jane", :surname=>"Doe", :whathever=>2}, {:name=>"Chris", :surname=>"Smith", :whathever=>3}, {:name=>"Mary", :surname=>"Martins", :whathever=>4}]
The first part, works this way:
h.values.then { |a, *b| a.zip *b }
#=> [["John", "Doe", 1], ["Jane", "Doe", 2], ["Chris", "Smith", 3], ["Mary", "Martins", 4]]
The last part just maps the elements zipping each with the original keys then calling Array#to_h to convert to hash.
Here I removed the call .to_h to show the intermediate result:
h.values.then { |a, *b| a.zip *b }.map { |e| h.keys.zip e }
#=> [[[:name, "John"], [:surname, "Doe"], [:whathever, 1]], [[:name, "Jane"], [:surname, "Doe"], [:whathever, 2]], [[:name, "Chris"], [:surname, "Smith"], [:whathever, 3]], [[:name, "Mary"], [:surname, "Martins"], [:whathever, 4]]]
[h[:name], h[:surname]].transpose.map do |name, surname|
{ 'name' => name, 'surname' => surname }
end

Understanding Ruby sort_by

I am new to ruby and I struggle understanding sort_by!. This method does something magically I really don't understand.
Here is a simple example:
pc= ["Z6","Z5","Z4"]
c = [
{
id: "Z4",
name: "zlah1"
},
{
id: "Z5",
name: "blah2"
},
{
id: "Z6",
name: "clah3"
}
]
c.sort_by! do |c|
pc.index c[:id]
end
This procedure returns:
=> [{:id=>"Z6", :name=>"clah3"}, {:id=>"Z5", :name=>"blah2"}, {:id=>"Z4", :name=>"zlah1"}]
It somehow reverses the array order. How does it do that? pc.index c[:id] just returns a number. What does this method do under the hood? The documentation is not very beginners friendly.
Suppose we are given the following:
order = ['Z6', 'Z5', 'Z4']
array = [{id: 'Z4', name: 'zlah1'},
{id: 'Z5', name: 'blah2'},
{id: 'Z6', name: 'clah3'},
{id: 'Z5', name: 'dlah4'}]
Notice that I added a 4th hash ({id: 'Z5', name: 'dlah4'}) to the array array given in the question. I did this so that two elements of array would the same value for the key :id ("Z5").
Now let's consider how Ruby might implement the following:
array.sort_by { |hash| order.index(hash[:id]) }
#=> [{:id=>"Z6", :naCme=>"clah3"},
# {:id=>"Z5", :name=>"blah2"},
# {:id=>"Z5", :name=>"dlah4"},
# {:id=>"Z4", :name=>"zlah1"}]
That could be done in four steps.
Step 1: Create a hash that maps the values of the sort criterion to the values of sort_by's receiver
sort_map = array.each_with_object(Hash.new { |h,k| h[k] = [] }) do |hash,h|
h[order.index(hash[:id])] << hash
end
#=> {2=>[{:id=>"Z4", :name=>"zlah1"}],
# 1=>[{:id=>"Z5", :name=>"blah2"}, {:id=>"Z5", :name=>"dlah4"}],
# 0=>[{:id=>"Z6", :name=>"clah3"}]}
h = Hash.new { |h,k| h[k] = [] } creates an empty hash with a default proc that operates as follows when evaluating:
h[k] << hash
If h has a key k this operation is performed as usual. If, however, h does not have a key k the proc is called, causing the operation h[k] = [] to be performed, after which h[k] << hash is executed as normal.
The values in this hash must be arrays, rather than individual elements of array, due the possibility that, as here, two elements of sort_by's receiver map to the same key. Note that this operation has nothing to do with the particular mapping of the elements of sort_by's receiver to the sort criterion.
Step 2: Sort the keys of sort_map
keys = sort_map.keys
#=> [2, 1, 0]
sorted_keys = keys.sort
#=> [0, 1, 2]
Step 3: Map sorted_keys to the values of sort_map
sort_map_values = sorted_keys.map { |k| sort_map[k] }
#=> [[{:id=>"Z6", :name=>"clah3"}],
# [{:id=>"Z5", :name=>"blah2"}, {:id=>"Z5", :name=>"dlah4"}],
# [{:id=>"Z4", :name=>"zlah1"}]]
Step 4: Flatten sort_map_values
sort_map_values.flatten
#=> [{:id=>"Z6", :name=>"clah3"},
# {:id=>"Z5", :name=>"blah2"},
# {:id=>"Z5", :name=>"dlah4"},
# {:id=>"Z4", :name=>"zlah1"}]
One of the advantages of using sort_by rather than sort (with a block) is that the sort criterion (here order.index(hash[:id])) is computed only once for each element of sort_by's receiver, whereas sort would recompute these values for each pairwise comparison in its block. The time savings can be considerable if this operation is computationally expensive.
order = ['Z6', 'Z5', 'Z4']
array = [{id: 'Z4', name: 'zlah1'},
{id: 'Z5', name: 'blah2'},
{id: 'Z6', name: 'clah3'}]
array.sort_by { |hash| order.index(hash[:id]) }
#=> [{:id=>"Z6", :name=>"clah3"}, {:id=>"Z5", :name=>"blah2"}, {:id=>"Z4", :name=>"zlah1"}]
This doesn't magically reverse the order of the array. To explain what happens we first need to understand what order.index(hash[:id]) does. This becomes better visible with the map method.
array.map { |hash| order.index(hash[:id]) }
#=> [2, 1, 0]
Like you can see, the first element with id 'Z4' will return the number 2 since 'Z4' in the order array has index 2. The same happens with all other array elements. The retuned value is used to sort the objects, sort_by will always sort asynchronous, so the order of the above array should become [0, 1, 2]. However, the actual content is not replaced, the number is only used for comparison vs other elements. Thus resulting in:
#=> [{:id=>"Z6", :name=>"clah3"}, {:id=>"Z5", :name=>"blah2"}, {:id=>"Z4", :name=>"zlah1"}]

Merge duplicates in array of hashes

I have an array of hashes in ruby:
[
{name: 'one', tags: 'xxx'},
{name: 'two', tags: 'yyy'},
{name: 'one', tags: 'zzz'},
]
and i'm looking for any clean ruby solution, which will make it able to simply merge all the duplicates in that array (by merging i mean concatinating the tags param) so the above example will be transformed to:
[
{name: 'one', tags: 'xxx, zzz'},
{name: 'two', tags: 'yyy'},
]
I can iterate through each array element, check if there is a duplicate, merge it with the original entry and delete the duplicate but i feel that there can be a better solution for this and that there are some caveats in such approach i don't know about. Thanks for any clue.
I can think of as
arr = [
{name: 'one', tags: 'xxx'},
{name: 'two', tags: 'yyy'},
{name: 'one', tags: 'zzz'},
]
merged_array_hash = arr.group_by { |h1| h1[:name] }.map do |k,v|
{ :name => k, :tags => v.map { |h2| h2[:tags] }.join(" ,") }
end
merged_array_hash
# => [{:name=>"one", :tags=>"xxx ,zzz"}, {:name=>"two", :tags=>"yyy"}]
Here's a way that makes use of the form of Hash#update (aka Hash.merge!) that takes a block for determining the merged value for every key that is present in both of the two hashes being merged.
Code
def combine(a)
a.each_with_object({}) { |g,h| h.update({ g[:name]=>g }) { |k,hv,gv|
{ name: k, tags: hv[:tags]+", "+gv[:tags] } } }.values
end
Example
a = [{name: 'one', tags: 'uuu'},
{name: 'two', tags: 'vvv'},
{name: 'one', tags: 'www'},
{name: 'six', tags: 'xxx'},
{name: 'one', tags: 'yyy'},
{name: 'two', tags: 'zzz'}]
combine(a)
#=> [{:name=>"one", :tags=>"uuu, www, yyy"},
# {:name=>"two", :tags=>"vvv, zzz" },
# {:name=>"six", :tags=>"xxx" }]
Explanation
Suppose
a = [{name: 'one', tags: 'uuu'},
{name: 'two', tags: 'vvv'},
{name: 'one', tags: 'www'}]
b = a.each_with_object({})
#=> #<Enumerator: [{:name=>"one", :tags=>"uuu"},
# {:name=>"two", :tags=>"vvv"},
# {:name=>"one", :tags=>"www"}]:each_with_object({})>
We can convert the enumerator b to an array to see what values it will pass into its block:
b.to_a
#=> [[{:name=>"one", :tags=>"uuu"}, {}],
# [{:name=>"two", :tags=>"vvv"}, {}],
# [{:name=>"one", :tags=>"www"}, {}]]
The first value passed to the block and assigned to the block variables is:
g,h = [{:name=>"one", :tags=>"uuu"}, {}]
g #=> {:name=>"one", :tags=>"uuu"}
h #=> {}
The first merge operation is now performed (the merged h is returned):
h.update({ g[:name] => g })
#=> h.update({ "one" => {:name=>"one", :tags=>"uuu"} })
#=> {"one"=>{:name=>"one", :tags=>"uuu"}}
h does not have the key "one", so update's block is not involed.
Next, the enumerator b passes the following into the block:
g #=> {:name=>"two", :tags=>"vvv"}
h #=> {"one"=>{:name=>"one", :tags=>"uuu"}}
so we execute:
h.update({ g[:name] => g })
#=> h.update({ "two"=>{:name=>"two", :tags=>"vvv"})
#=> {"one"=>{:name=>"one", :tags=>"uuu"},
# "two"=>{:name=>"two", :tags=>"vvv"}}
Again, h does not have the key "two", so the block is not used.
Lastly, each_with_object passes the final tuple into the block:
g #=> {:name=>"one", :tags=>"www"}
h #=> {"one"=>{:name=>"one", :tags=>"uuu"},
# "two"=>{:name=>"two", :tags=>"vvv"}}
and we execute:
h.update({ g[:name] => g })
#=> h.update({ "one"=>{:name=>"one", :tags=>"www"})
h has a key/value pair with key "one":
"one"=>{:name=>"one", :tags=>"uuu"}
update's block is therefore executed to determine the merged value. The following values are passed to that block's variables:
k #=> "one"
hv #=> {:name=>"one", :tags=>"uuu"} <h's value for "one">
gv #=> {:name=>"one", :tags=>"www"} <g's value for "one">
and the block calculation creates this hash (as the merged value for the key "one"):
{ name: k, tags: hv[:tags]+", "+gv[:tags] }
#=> { name: "one", tags: "uuu" + ", " + "www" }
#=> { name: "one", tags: "uuu, www" }
So the merged hash now becomes:
h #=> {"one"=>{:name=>"one", :tags=>"uuu, www"},
# "two"=>{:name=>"two", :tags=>"vvv" }}
All that remains is to extract the values:
h.values
#=> [{:name=>"one", :tags=>"uuu, www"}, {:name=>"two", :tags=>"vvv"}]

Search one hash for its keys, grab the values, put them in an array and merge the first hash with the second

I have a hash with items like
{'people'=>'50'},
{'chairs'=>'23'},
{'footballs'=>'5'},
{'crayons'=>'1'},
and I have another hash with
{'people'=>'http://www.thing.com/this-post'},
{'footballs'=>'http://www.thing.com/that-post'},
{'people'=>'http://www.thing.com/nice-post'},
{'footballs'=>'http://www.thing.com/other-post'},
{'people'=>'http://www.thing.com/thingy-post'},
{'footballs'=>'http://www.thing.com/the-post'},
{'people'=>'http://www.thing.com/the-post'},
{'crayons'=>'http://www.thing.com/the-blah'},
{'chairs'=>'http://www.thing.com/the-page'},
and I want something like the following that takes the first hash and then looks through the second one, grabs all the links for each word and puts them into a array appended onto the end of the hash somehow.
{'people', '50' => {'http://www.thing.com/this-post', 'http://www.thing.com/nice-post', 'http://www.thing.com/thingy-post'}},
{'footballs', '5' => {'http://www.thing.com/the-post', 'http://www.thing.com/the-post'}},
{'crayons', '1' => {'http://www.thing.com/the-blah'}},
{'chairs', '23' => {'chairs'=>'http://www.thing.com/the-page'}},
I am very new to Ruby, and I have tried quite a few combinations, and I need some help.
Excuse the example, I hope that it makes sense.
What you have is a mix of hashes, arrays, and something in the middle. I'm going to assume the following inputs:
categories = {'people'=>'50',
'chairs'=>'23',
'footballs'=>'5',
'crayons'=>'1'}
and:
posts = [['people', 'http://www.thing.com/this-post'],
['footballs','http://www.thing.com/that-post'],
['people','http://www.thing.com/nice-post'],
['footballs','http://www.thing.com/other-post'],
['people','http://www.thing.com/thingy-post'],
['footballs','http://www.thing.com/the-post'],
['people','http://www.thing.com/the-post'],
['crayons','http://www.thing.com/the-blah'],
['chairs','http://www.thing.com/the-page']]
and the following output:
[['people', '50', ['http://www.thing.com/this-post', 'http://www.thing.com/nice-post', 'http://www.thing.com/thingy-post']],
[['footballs', '5', ['http://www.thing.com/the-post', 'http://www.thing.com/the-post']],
['crayons', '1', ['http://www.thing.com/the-blah']],
['chairs', '23' => {'chairs'=>'http://www.thing.com/the-page']]]
In which case what you would need is:
categories.map do |name, count|
[name, count, posts.select do |category, _|
category == name
end.map { |_, post| post }]
end
You need to understand the different syntax for Array and Hash in Ruby:
Hash:
{ 'key1' => 'value1',
'key2' => 'value2' }
Array:
[ 'item1', 'item2', 'item3', 'item4' ]
A Hash in ruby (like in every other language) can't have more than once instance of any single key, meaning that a Hash {'key1' => 1, 'key1' => 2} is invalid and will result in an unexpected value (duplicate keys are overridden - you'll have {'key1' => 2 }).
Since there is some confusion about the format of the data, I will suggest how you might effectively structure both the input and the output. I will first present some code you could use, then give an example of how it's used, then explain what is happening.
Code
def merge_em(hash, array)
hash_keys = hash.keys
new_hash = hash_keys.each_with_object({}) { |k,h|
h[k] = { qty: hash[k], http: [] } }
array.each do |h|
h.keys.each do |k|
(new_hash.update({k=>h[k]}) { |k,g,http|
{qty: g[:qty], http: (g[:http] << http)}}) if hash_keys.include?(k)
end
end
new_hash
end
Example
Here is a hash that I have modified to include a key/value pair that does not appear in the array below:
hash = {'people' =>'50', 'chairs' =>'23', 'footballs'=>'5',
'crayons'=> '1', 'cat_lives'=> '9'}
Below is your array of hashes that is to be merged into hash. You'll see I've added a key/value pair to your hash with key "chairs". As I hope to make clear, the code is no different (i.e., not simplified) if we know in advance that each hash has only one key value pair. (Aside: if, for example, we want want the key from a hash h that is known to have only one key, we still have to pull out all the keys into an array and then take the only element of the array: h.keys.first).
I have also added a hash to the array that has no key that is among hash's keys.
array =
[{'people' =>'http://www.thing.com/this-post'},
{'footballs'=>'http://www.thing.com/that-post'},
{'people' =>'http://www.thing.com/nice-post'},
{'footballs'=>'http://www.thing.com/other-post'},
{'people' =>'http://www.thing.com/thingy-post'},
{'footballs'=>'http://www.thing.com/the-post'},
{'people' =>'http://www.thing.com/the-post'},
{'crayons' =>'http://www.thing.com/the-blah'},
{'chairs' =>'http://www.thing.com/the-page',
'crayons' =>'http://www.thing.com/blah'},
{'balloons' =>'http://www.thing.com/the-page'}
]
We now merge the information from array into hash, and at the same time change the structure of hash to something more suitable:
result = merge_em(hash, array)
#=> {"people" =>{:qty=>"50",
# :http=>["http://www.thing.com/this-post",
# "http://www.thing.com/nice-post",
# "http://www.thing.com/thingy-post",
# "http://www.thing.com/the-post"]},
# "chairs" =>{:qty=>"23",
# :http=>["http://www.thing.com/the-page"]},
# "footballs"=>{:qty=>"5",
# :http=>["http://www.thing.com/that-post",
# "http://www.thing.com/other-post",
# "http://www.thing.com/the-post"]},
# "crayons" =>{:qty=>"1",
# :http=>["http://www.thing.com/the-blah",
# "http://www.thing.com/blah"]},
# "cat_lives"=>{:qty=>"9",
# :http=>[]}}
I've assumed you want to look up the content of result with hash's keys. It is therefore convenient to make the values associated with those keys hashes themselves, with keys :qty and http. The former is for the values in hash (the naming may be wrong); the latter is an array containing the strings drawn from array.
This way, if we want the value for the key "crayons", we could write:
result["crayons"]
#=> {:qty=>"1",
# :http=>["http://www.thing.com/the-blah", "http://www.thing.com/blah"]}
or
irb(main):133:0> result["crayons"][:qty]
#=> "1"
irb(main):134:0> result["crayons"][:http]
#=> ["http://www.thing.com/the-blah", "http://www.thing.com/blah"]
Explanation
Let's go through this line-by-line. First, we need to reference hash.keys more than once, so let's make it a variable:
hash_keys = hash.keys
#=> ["people", "chairs", "footballs", "crayons", "cat_lives"]
We may as well convert this hash to the output format now. We could do it during the merge operation below, but I think is clearer to do it as a separate step:
new_hash = hash_keys.each_with_object({}) { |k,h|
h[k] = { qty: hash[k], http: [] } }
#=> {"people" =>{:qty=>"50", :http=>[]},
# "chairs" =>{:qty=>"23", :http=>[]},
# "footballs"=>{:qty=>"5", :http=>[]},
# "crayons" =>{:qty=>"1", :http=>[]},
# "cat_lives"=>{:qty=>"9", :http=>[]}}
Now we merge each (hash) element of array into new_hash:
array.each do |h|
h.keys.each do |k|
(new_hash.update({k=>h[k]}) { |k,g,http|
{ qty: g[:qty], http: (g[:http] << http) } }) if hash_keys.include?(k)
end
end
The first hash h from array that is passed into the block by each is:
{'people'=>'http://www.thing.com/this-post'}
which is assigned to the block variable h. We next construct an array of h's keys:
h.keys #=> ["people"]
The first of these keys, "people" (pretend there were more, as there would be in the penultimate element of array) is passed by its each into the inner block, whose block variable, k, is assigned the value "people". We then use Hash#update (aka merge!) to merge the hash:
{k=>h[k]} #=> {"people"=>'http://www.thing.com/this-post'}
into new_hash, but only because:
hash_keys.include?(k)
#=> ["people", "chairs", "footballs", "crayons", "cat_lives"].include?("people")
#=> true
evaluates to true. Note that this will evaluate to false for the key "balloons", so the hash in array with that key will not be merged. update's block:
{ |k,g,http| { qty: g[:qty], http: (g[:http] << http) } }
is crucial. This is update's way of determining the value of a key that is in both new_hash and in the hash being merged, {k=>h[k]}. The three block variables are assigned the following values by update:
k : the key ("people")
g : the current value of `new_hash[k]`
#=> `new_hash["people"] => {:qty=>"50", :http=>[]}`
http: the value of the key/value being merged
#=> 'http://www.thing.com/this-post'
We want the merged hash value for key "people" to be:
{ qty: g[:qty], http: (g[:http] << http) }
#=> { qty: 50, http: ([] << 'http://www.thing.com/this-post') }
#=> { qty: 50, http: ['http://www.thing.com/this-post'] }
so now:
new_hash
#=> {"people" =>{:qty=>"50", :http=>['http://www.thing.com/this-post']},
# "chairs" =>{:qty=>"23", :http=>[]},
# "footballs"=>{:qty=>"5", :http=>[]},
# "crayons" =>{:qty=>"1", :http=>[]},
# "cat_lives"=>{:qty=>"9", :http=>[]}}
We do the same for each of the other elements of array.
Lastly, we need to return the merged new_hash, so we make last line of the method:
new_hash
You could also do this
cat = [{'people'=>'50'},
{'chairs'=>'23'},
{'footballs'=>'5'},
{'crayons'=>'1'}]
pages = [{'people'=>'http://www.thing.com/this-post'},
{'footballs'=>'http://www.thing.com/that-post'},
{'people'=>'http://www.thing.com/nice-post'},
{'footballs'=>'http://www.thing.com/other-post'},
{'people'=>'http://www.thing.com/thingy-post'},
{'footballs'=>'http://www.thing.com/the-post'},
{'people'=>'http://www.thing.com/the-post'},
{'crayons'=>'http://www.thing.com/the-blah'},
{'chairs'=>'http://www.thing.com/the-page'}]
cat.map do |c|
c.merge(Hash['pages',pages.collect{|h| h[c.keys.pop]}.compact])
end
#=> [{"people"=>"50", "pages"=>["http://www.thing.com/this-post", "http://www.thing.com/nice-post", "http://www.thing.com/thingy-post", "http://www.thing.com/the-post"]},
{"chairs"=>"23", "pages"=>["http://www.thing.com/the-page"]},
{"footballs"=>"5", "pages"=>["http://www.thing.com/that-post", "http://www.thing.com/other-post", "http://www.thing.com/the-post"]},
{"crayons"=>"1", "pages"=>["http://www.thing.com/the-blah"]}]
Which is closer to your request but far less usable than some of the other posts.

Resources