Understanding Ruby sort_by - ruby

I am new to ruby and I struggle understanding sort_by!. This method does something magically I really don't understand.
Here is a simple example:
pc= ["Z6","Z5","Z4"]
c = [
{
id: "Z4",
name: "zlah1"
},
{
id: "Z5",
name: "blah2"
},
{
id: "Z6",
name: "clah3"
}
]
c.sort_by! do |c|
pc.index c[:id]
end
This procedure returns:
=> [{:id=>"Z6", :name=>"clah3"}, {:id=>"Z5", :name=>"blah2"}, {:id=>"Z4", :name=>"zlah1"}]
It somehow reverses the array order. How does it do that? pc.index c[:id] just returns a number. What does this method do under the hood? The documentation is not very beginners friendly.

Suppose we are given the following:
order = ['Z6', 'Z5', 'Z4']
array = [{id: 'Z4', name: 'zlah1'},
{id: 'Z5', name: 'blah2'},
{id: 'Z6', name: 'clah3'},
{id: 'Z5', name: 'dlah4'}]
Notice that I added a 4th hash ({id: 'Z5', name: 'dlah4'}) to the array array given in the question. I did this so that two elements of array would the same value for the key :id ("Z5").
Now let's consider how Ruby might implement the following:
array.sort_by { |hash| order.index(hash[:id]) }
#=> [{:id=>"Z6", :naCme=>"clah3"},
# {:id=>"Z5", :name=>"blah2"},
# {:id=>"Z5", :name=>"dlah4"},
# {:id=>"Z4", :name=>"zlah1"}]
That could be done in four steps.
Step 1: Create a hash that maps the values of the sort criterion to the values of sort_by's receiver
sort_map = array.each_with_object(Hash.new { |h,k| h[k] = [] }) do |hash,h|
h[order.index(hash[:id])] << hash
end
#=> {2=>[{:id=>"Z4", :name=>"zlah1"}],
# 1=>[{:id=>"Z5", :name=>"blah2"}, {:id=>"Z5", :name=>"dlah4"}],
# 0=>[{:id=>"Z6", :name=>"clah3"}]}
h = Hash.new { |h,k| h[k] = [] } creates an empty hash with a default proc that operates as follows when evaluating:
h[k] << hash
If h has a key k this operation is performed as usual. If, however, h does not have a key k the proc is called, causing the operation h[k] = [] to be performed, after which h[k] << hash is executed as normal.
The values in this hash must be arrays, rather than individual elements of array, due the possibility that, as here, two elements of sort_by's receiver map to the same key. Note that this operation has nothing to do with the particular mapping of the elements of sort_by's receiver to the sort criterion.
Step 2: Sort the keys of sort_map
keys = sort_map.keys
#=> [2, 1, 0]
sorted_keys = keys.sort
#=> [0, 1, 2]
Step 3: Map sorted_keys to the values of sort_map
sort_map_values = sorted_keys.map { |k| sort_map[k] }
#=> [[{:id=>"Z6", :name=>"clah3"}],
# [{:id=>"Z5", :name=>"blah2"}, {:id=>"Z5", :name=>"dlah4"}],
# [{:id=>"Z4", :name=>"zlah1"}]]
Step 4: Flatten sort_map_values
sort_map_values.flatten
#=> [{:id=>"Z6", :name=>"clah3"},
# {:id=>"Z5", :name=>"blah2"},
# {:id=>"Z5", :name=>"dlah4"},
# {:id=>"Z4", :name=>"zlah1"}]
One of the advantages of using sort_by rather than sort (with a block) is that the sort criterion (here order.index(hash[:id])) is computed only once for each element of sort_by's receiver, whereas sort would recompute these values for each pairwise comparison in its block. The time savings can be considerable if this operation is computationally expensive.

order = ['Z6', 'Z5', 'Z4']
array = [{id: 'Z4', name: 'zlah1'},
{id: 'Z5', name: 'blah2'},
{id: 'Z6', name: 'clah3'}]
array.sort_by { |hash| order.index(hash[:id]) }
#=> [{:id=>"Z6", :name=>"clah3"}, {:id=>"Z5", :name=>"blah2"}, {:id=>"Z4", :name=>"zlah1"}]
This doesn't magically reverse the order of the array. To explain what happens we first need to understand what order.index(hash[:id]) does. This becomes better visible with the map method.
array.map { |hash| order.index(hash[:id]) }
#=> [2, 1, 0]
Like you can see, the first element with id 'Z4' will return the number 2 since 'Z4' in the order array has index 2. The same happens with all other array elements. The retuned value is used to sort the objects, sort_by will always sort asynchronous, so the order of the above array should become [0, 1, 2]. However, the actual content is not replaced, the number is only used for comparison vs other elements. Thus resulting in:
#=> [{:id=>"Z6", :name=>"clah3"}, {:id=>"Z5", :name=>"blah2"}, {:id=>"Z4", :name=>"zlah1"}]

Related

Merging Three hashes and getting this resultant hash

I have read the xls and have formed these three hashes
hash1=[{'name'=>'Firstname',
'Locator'=>'id=xxx',
'Action'=>'TypeAndWait'},
{'name'=>'Password',
'Locator'=>'id=yyy',
'Action'=>'TypeAndTab'}]
Second Hash
hash2=[{'Test Name'=>'Example',
'TestNumber'=>'Test1'},
{'Test Name'=>'Example',
'TestNumber'=>'Test2'}]
My Thrid Hash
hash3=[{'name'=>'Firstname',
'Test1'=>'four',
'Test2'=>'Five',
'Test3'=>'Six'},
{'name'=>'Password',
'Test1'=>'Vicky',
'Test2'=>'Sujin',
'Test3'=>'Sivaram'}]
Now my resultant hash is
result={"Example"=>
{"Test1"=>
{'Firstname'=>
["id=xxx","four", "TypeAndWait"],
'Password'=>
["id=yyy","Vicky", "TypeAndTab"]},
"Test2"=>
{'Firstname'=>
["id=xxx","Five", "TypeAndWait"],
'Password'=>
["id=yyy","Sujin", "TypeAndTab"]}}}
I have gotten this result, but I had to write 60 lines of code in my program, but I don't think I have to write such a long program when I use Ruby, I strongly believe some easy way to achieve this. Can some one help me?
The second hash determines the which testcase has to be read, for an example, test3 is not present in the second testcase so resultant hash doesn't have test3.
We are given three arrays, which I've renamed arr1, arr2 and arr3. (hash1, hash2 and hash3 are not especially good names for arrays. :-))
arr1 = [{'name'=>'Firstname', 'Locator'=>'id=xxx', 'Action'=>'TypeAndWait'},
{'name'=>'Password', 'Locator'=>'id=yyy', 'Action'=>'TypeAndTab'}]
arr2 = [{'Test Name'=>'Example', 'TestNumber'=>'Test1'},
{'Test Name'=>'Example', 'TestNumber'=>'Test2'}]
arr3=[{'name'=>'Firstname', 'Test1'=>'four', 'Test2'=>'Five', 'Test3'=>'Six'},
{'name'=>'Password', 'Test1'=>'Vicky', 'Test2'=>'Sujin', 'Test3'=>'Sivaram'}]
The drivers are the values "Test1" and "Test2" in the hashes that are elements of arr2. Nothing else in that array is needed, so let's extract those values (of which there could be any number, but here there are just two).
a2 = arr2.map { |h| h['TestNumber'] }
#=> ["Test1", "Test2"]
Next we need to rearrange the information in arr3 by creating a hash whose keys are the elements of a2.
h3 = a2.each_with_object({}) { |test,h|
h[test] = arr3.each_with_object({}) { |f,g| g[f['name']] = f[test] } }
#=> {"Test1"=>{"Firstname"=>"four", "Password"=>"Vicky"},
# "Test2"=>{"Firstname"=>"Five", "Password"=>"Sujin"}}
Next we need to rearrange the content of arr1 by creating a hash whose keys match the keys of values of h3.
h1 = arr1.each_with_object({}) { |g,h| h[g['name']] = g.reject { |k,_| k == 'name' } }
#=> {"Firstname"=>{"Locator"=>"id=xxx", "Action"=>"TypeAndWait"},
# "Password"=>{"Locator"=>"id=yyy", "Action"=>"TypeAndTab"}}
It is now a simple matter of extracting information from these three objects.
{ 'Example'=>
a2.each_with_object({}) do |test,h|
h[test] = h3[test].each_with_object({}) do |(k,v),g|
f = h1[k]
g[k] = [f['Locator'], v, f['Action']]
end
end
}
#=> {"Example"=>
# {"Test1"=>{"Firstname"=>["id=xxx", "four", "TypeAndWait"],
# "Password"=>["id=yyy", "Vicky", "TypeAndTab"]},
# "Test2"=>{"Firstname"=>["id=xxx", "Five", "TypeAndWait"],
# "Password"=>["id=yyy", "Sujin", "TypeAndTab"]}}}
What do you call hash{1-2-3} are arrays in the first place. Also, I am pretty sure you have mistyped hash1#Locator and/or hash3#name. The code below works for this exact data, but it should not be hard to update it to reflect any changes.
hash2.
map(&:values).
group_by(&:shift).
map do |k, v|
[k, v.flatten.map do |k, v|
[k, hash3.map do |h3|
# lookup a hash from hash1
h1 = hash1.find do |h1|
h3['name'].start_with?(h1['Locator'])
end
# can it be nil btw?
[
h1['name'],
[
h3['name'][/.*(?=-id)/],
h3[k],
h1['Action']
]
]
end.to_h]
end.to_h]
end.to_h

Ruby chain two 'group_by' methods

I have an array of objects that looks like this:
[
{day: 'Monday', class: 1, name: 'X'},
{day: 'Monday', class: 2, name: 'Y'},
{day: 'Tuesday', class: 1, name: 'Z'},
{day: 'Monday', class: 1, name: 'T'}
]
I want to group them by days, and then by classes i.e.
groupedArray['Monday'] => {'1' => [{name: 'X'}, {name: 'T'}], '2' => [{name: 'Y'}]}
I've seen there is a
group_by { |a| [a.day, a.class]}
But this creates a hash with a [day, class] key.
Is there a way I can achieve this, without having to group them first by day, and then iterate through each day, and group them by class, then pushing them into a new hash?
arr = [
{day: 'Monday', class: 1, name: 'X'},
{day: 'Monday', class: 2, name: 'Y'},
{day: 'Tuesday', class: 1, name: 'Z'},
{day: 'Monday', class: 1, name: 'T'}
]
One way of obtaining the desired hash is to use the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged. Here that is done twice, first when values of :day are the same, then for each such occurrence, when the values of :class are the same (for a given value of :day).
arr.each_with_object({}) { |g,h|
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] }) { |_,h1,h2|
h1.update(h2) { |_,p,q| p+q } } }
#=> {"Monday" =>{"1"=>[{:name=>"X"}, {:name=>"T"}], "2"=>[{:name=>"Y"}]},
# "Tuesday"=>{"1"=>[{:name=>"Z"}]}}
The steps are as follows.
enum = arr.each_with_object({})
#=> #<Enumerator: [{:day=>"Monday", :class=>1, :name=>"X"},
# {:day=>"Monday", :class=>2, :name=>"Y"},
# {:day=>"Tuesday", :class=>1, :name=>"Z"},
# {:day=>"Monday", :class=>1, :name=>"T"}]:each_with_object({})>
We can see the values that will be generated by this enumerator by converting it to an array:
enum.to_a
#=> [[{:day=>"Monday", :class=>1, :name=>"X"}, {}],
# [{:day=>"Monday", :class=>2, :name=>"Y"}, {}],
# [{:day=>"Tuesday", :class=>1, :name=>"Z"}, {}],
# [{:day=>"Monday", :class=>1, :name=>"T"}, {}]]
The empty hash in each array is the hash being built and returned. It is initially empty, but will be partially formed as each element of enum is processed.
The first element of enum is passed to the block (by Enumerator#each) and the block variables are assigned using parallel assignment (somtimes called multiple assignment):
g,h = enum.next
#=> [{:day=>"Monday", :class=>1, :name=>"X"}, {}]
g #=> {:day=>"Monday", :class=>1, :name=>"X"}
h #=> {}
We now perform the block calculation:
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] })
#=> {}.update("Monday"=>{ "1"=>[{name: "X"}] })
#=> {"Monday"=>{"1"=>[{:name=>"X"}]}}
This operation returns the updated value of h, the hash being constructed.
Note that update's argument
"Monday"=>{ "1"=>[{name: "X"}] }
is shorthand for
{ "Monday"=>{ "1"=>[{name: "X"}] } }
Because the key "Monday" was not present in both hashes being merged (h had no keys), the block
{ |_,h1,h2| h1.update(h2) { |_,p,q| p+q } } }
was not used to determine the value of "Monday".
Now the next value of enum is passed to the block and the block variables are assigned:
g,h = enum.next
#=> [{:day=>"Monday", :class=>2, :name=>"Y"},
# {"Monday"=>{"1"=>[{:name=>"X"}]}}]
g #=> {:day=>"Monday", :class=>2, :name=>"Y"}
h #=> {"Monday"=>{"1"=>[{:name=>"X"}]}}
Note that h was updated. We now perform the block calculation:
h.update(g[:day]=>{ g[:class].to_s=>[{name: g[:name] }] })
# {"Monday"=>{"1"=>[{:name=>"X"}]}}.update("Monday"=>{ "2"=>[{name: "Y"}] })
Both hashes being merged share the key "Monday". We therefore must use the block to determine the merged value of "Monday":
{ |k,h1,h2| h1.update(h2) { |m,p,q| p+q } } }
#=> {"1"=>[{:name=>"X"}]}.update("2"=>[{name: "Y"}])
#=> {"1"=>[{:name=>"X"}], "2"=>[{:name=>"Y"}]}
See the doc for update for an explanation of the block variables k, h1 and h2 for the outer update and m, p and q for the inner update. k and m are the values of the common key. As they are not used in the block calculations, I have replaced them with underscores, which is common practice.
So now:
h #=> { "Monday" => { "1"=>[{ :name=>"X" }], "2"=>[{ :name=>"Y"}] } }
Prior to this operation the hash h["Monday] did not yet have a key 2, so the second update did not require use of the block
{ |_,p,q| p+q }
This block is used, however, when the last element of enum is merged into h, since the values of both :day and :class are the same for the two hashes being merged.
The remaining calculations are similar.

Flatten deep nested hash to array for sha1 hashing

I want to compute an unique sha1 hash from a ruby hash. I thought about
(Deep) Converting the Hash into an array
Sorting the array
Join array by empty string
calculate sha1
Consider the following hash:
hash = {
foo: "test",
bar: [1,2,3]
hello: {
world: "world",
arrays: [
{foo: "bar"}
]
}
}
How can I get this kind of nested hash into an array like
[:foo, "test", :bar, 1, 2, 3, :hello, :world, "earth", :arrays, :my, "example"]
I would then sort the array, join it with array.join("") and compute the sha1 hash like this:
require 'digest/sha1'
Digest::SHA1.hexdigest hash_string
How could I flatten the hash like I described above?
Is there already a gem for this?
Is there a quicker / easier way to solve this? I have a large amount of objects to convert (~700k), so performance does matter.
EDIT
Another problem that I figured out by the answers below are this two hashes:
a = {a: "a", b: "b"}
b = {a: "b", b: "a"}
When flattening the hash and sorting it, this two hashes produce the same output, even when a == b => false.
EDIT 2
The use case for this whole thing is product data comparison. The product data is stored inside a hash, then serialized and sent to a service that creates / updates the product data.
I want to check if anything has changed inside the product data, so I generate a hash from the product content and store it in a database. The next time the same product is loaded, I calculate the hash again, compare it to the one in the DB and decide wether the product needs an update or not.
EDIT : As you detailed, two hashes with keys in different order should give the same string. I would reopen the Hash class to add my new custom flatten method :
class Hash
def custom_flatten()
self.sort.map{|pair| ["key: #{pair[0]}", pair[1]]}.flatten.map{ |elem| elem.is_a?(Hash) ? elem.custom_flatten : elem }.flatten
end
end
Explanation :
sort converts the hash to a sorted array of pairs (for the comparison of hashes with different keys order)
.map{|pair| ["key: #{pair[0]}", pair[1]]} is a trick to differentiate keys from values in the final flatten array, to avoid the problem of {a: {b: {c: :d}}}.custom_flatten == {a: :b, c: :d}.custom_flatten
flatten converts an array of arrays into a single array of values
map{ |elem| elem.is_a?(Hash) ? elem.custom_flatten : elem } calls back fully_flatten on any sub-hash left.
Then you just need to use :
require 'digest/sha1'
Digest::SHA1.hexdigest hash.custom_flatten.to_s
I am not aware of a gem that does something like what you are looking for. There is a Hash#flatten method in ruby, but it does not flatten nested hashes recursively. Here is a straight forward recursive function that will flatten in the way that you requested in your question:
def completely_flatten(hsh)
hsh.flatten(-1).map{|el| el.is_a?(Hash) ? completely_flatten(el) : el}.flatten
end
This will yield
hash = {
foo: "test",
bar: [1,2,3]
hello: {
world: "earth",
arrays: [
{my: "example"}
]
}
}
completely_flatten(hash)
#=> [:foo, "test", :bar, 1, 2, 3, :hello, :world, "earth", :arrays, :my, "example"]
To get the string representation you are looking for (before making the sha1 hash) convert everything in the array to a string before sorting so that all of the elements can be meaningfully compared or else you will get an error:
hash_string = completely_flatten(hash).map(&:to_s).sort.join
#=> "123arraysbarearthexamplefoohellomytestworld"
The question is how to "flatten" a hash. There is a second, implicit, question concerning sha1, but, by SO rules, that needs to be addressed in a separate question. You can "flatten" any hash or array as follows.
Code
def crush(obj)
recurse(obj).flatten
end
def recurse(obj)
case obj
when Array then obj.map { |e| recurse e }
when Hash then obj.map { |k,v| [k, recurse(v)] }
else obj
end
end
Example
crush({
foo: "test",
bar: [1,2,3],
hello: {
world: "earth",
arrays: [{my: "example"}]
}
})
#=> [:foo, "test", :bar, 1, 2, 3, :hello, :world, "earth", :arrays, :my, "example"]
crush([[{ a:1, b:2 }, "cat", [3,4]], "dog", { c: [5,6] }])
#=> [:a, 1, :b, 2, "cat", 3, 4, "dog", :c, 5, 6]
Use Marshal for Fast Serialization
You haven't articulated a useful reason to change your data structure before hashing. Therefore, you should consider marshaling for speed unless your data structures contain unsupported objects like bindings or procs. For example, using your hash variable with the syntax corrected:
require 'digest/sha1'
hash = {
foo: "test",
bar: [1,2,3],
hello: {
world: "world",
arrays: [
{foo: "bar"}
]
}
}
Digest::SHA1.hexdigest Marshal.dump(hash)
#=> "f50bc3ceb514ae074a5ab9672ae5081251ae00ca"
Marshal is generally faster than other serialization options. If all you need is speed, that will be your best bet. However, you may find that JSON, YAML, or a simple #to_s or #inspect meet your needs better for other reasons. As long as you are comparing similar representations of your object, the internal format of the hashed object is largely irrelevant to ensuring you have a unique or unmodified object.
Any solution based on flattening the hash will fail for nested hashes. A robust solution is to explicitly sort the keys of each hash recursively (from ruby 1.9.x onwards, hash keys order is preserved), and then serialize it as a string and digest it.
def canonize_hash(h)
r = h.map { |k, v| [k, v.is_a?(Hash) ? canonize_hash(v) : v] }
Hash[r.sort]
end
def digest_hash(hash)
Digest::SHA1.hexdigest canonize_hash(hash).to_s
end
digest_hash({ foo: "foo", bar: "bar" })
# => "ea1154f35b34c518fda993e8bb0fe4dbb54ae74a"
digest_hash({ bar: "bar", foo: "foo" })
# => "ea1154f35b34c518fda993e8bb0fe4dbb54ae74a"

Iterating over an array of arrays

def compute(ary)
return nil unless ary
ary.map { |a, b| !b.nil? ? a + b : a }
end
compute([1,2],[3,4])
Can someone please explain to me how compute adds the inner array's values?
To me it seems that calling map on that array of arrays would add the two arrays together, not the inner elements of each array.
map basically iterates over the elements of the object:
foo = [
['a', 'b'],
['c', 'd']
]
foo.map{ |ary| puts ary.join(',') }
# >> a,b
# >> c,d
In this example it's passing each sub-array, which is assigned to ary.
Looking at it a bit differently:
foo.map{ |ary| puts "ary is a #{ary.class}" }
# >> ary is a Array
# >> ary is a Array
Because Ruby lets us assign multiple values at once, that could have been written:
foo.map{ |item1, item2| puts "item1: #{ item1 }, item2: #{ item2 }" }
# >> item1: a, item2: b
# >> item1: c, item2: d
If map is iterating over an array of hashes, each iteration yields a sub-hash to the block:
foo = [
{'a' => 1},
{'b' => 2}
]
foo.map{ |elem| puts "elem is a #{ elem.class }" }
# >> elem is a Hash
# >> elem is a Hash
If map is iterating over a hash, each iteration yields the key/value pair to the block:
foo = {
'a' => 1,
'b' => 2
}
foo.map{ |k, v| puts "k: #{k}, v: #{v}" }
# >> k: a, v: 1
# >> k: b, v: 2
However, if you only give the block a single parameter, Ruby will assign both the key and value to the variable so you'll see it as an array:
foo.map{ |ary| puts "ary is a #{ary.class}" }
# >> ary is a Array
# >> ary is a Array
So, you have to be aware of multiple things that are happening as you iterate over the container, and as Ruby passes the values into map's block.
Beyond all that, it's important to remember that map is going to return a value, or values, for each thing passed in. map, AKA collect, is used to transform the values. It shouldn't be used as a replacement for each, which only iterates. In all the examples above I didn't really show map used correctly because I was trying to show what happens to the elements passed in. Typically we'd do something like:
foo = [['a', 'b'], ['c', 'd']]
foo.map{ |ary| ary.join(',') }
# => ["a,b", "c,d"]
Or:
bar = [[1,2], [3,4]]
bar.collect{ |i, j| i * j }
# => [2, 12]
There's also map! which changes the object being iterated, rather than returns the values. I'd recommend avoiding map! until you're well aware of why it'd be useful to you, because it seems to confuse people no end unless they understand how variables are passed and how Arrays and Hashes work.
The best thing is to play with map in IRB. You'll be able to see what's happening more easily.
I think I figured this out myself.
map selects the first array of the array-of-arrays and pipes it into the block. The variables a and b therefore refer to the first array's inner elements, rather than the first array and the second array in the array-of-arrays.

Search one hash for its keys, grab the values, put them in an array and merge the first hash with the second

I have a hash with items like
{'people'=>'50'},
{'chairs'=>'23'},
{'footballs'=>'5'},
{'crayons'=>'1'},
and I have another hash with
{'people'=>'http://www.thing.com/this-post'},
{'footballs'=>'http://www.thing.com/that-post'},
{'people'=>'http://www.thing.com/nice-post'},
{'footballs'=>'http://www.thing.com/other-post'},
{'people'=>'http://www.thing.com/thingy-post'},
{'footballs'=>'http://www.thing.com/the-post'},
{'people'=>'http://www.thing.com/the-post'},
{'crayons'=>'http://www.thing.com/the-blah'},
{'chairs'=>'http://www.thing.com/the-page'},
and I want something like the following that takes the first hash and then looks through the second one, grabs all the links for each word and puts them into a array appended onto the end of the hash somehow.
{'people', '50' => {'http://www.thing.com/this-post', 'http://www.thing.com/nice-post', 'http://www.thing.com/thingy-post'}},
{'footballs', '5' => {'http://www.thing.com/the-post', 'http://www.thing.com/the-post'}},
{'crayons', '1' => {'http://www.thing.com/the-blah'}},
{'chairs', '23' => {'chairs'=>'http://www.thing.com/the-page'}},
I am very new to Ruby, and I have tried quite a few combinations, and I need some help.
Excuse the example, I hope that it makes sense.
What you have is a mix of hashes, arrays, and something in the middle. I'm going to assume the following inputs:
categories = {'people'=>'50',
'chairs'=>'23',
'footballs'=>'5',
'crayons'=>'1'}
and:
posts = [['people', 'http://www.thing.com/this-post'],
['footballs','http://www.thing.com/that-post'],
['people','http://www.thing.com/nice-post'],
['footballs','http://www.thing.com/other-post'],
['people','http://www.thing.com/thingy-post'],
['footballs','http://www.thing.com/the-post'],
['people','http://www.thing.com/the-post'],
['crayons','http://www.thing.com/the-blah'],
['chairs','http://www.thing.com/the-page']]
and the following output:
[['people', '50', ['http://www.thing.com/this-post', 'http://www.thing.com/nice-post', 'http://www.thing.com/thingy-post']],
[['footballs', '5', ['http://www.thing.com/the-post', 'http://www.thing.com/the-post']],
['crayons', '1', ['http://www.thing.com/the-blah']],
['chairs', '23' => {'chairs'=>'http://www.thing.com/the-page']]]
In which case what you would need is:
categories.map do |name, count|
[name, count, posts.select do |category, _|
category == name
end.map { |_, post| post }]
end
You need to understand the different syntax for Array and Hash in Ruby:
Hash:
{ 'key1' => 'value1',
'key2' => 'value2' }
Array:
[ 'item1', 'item2', 'item3', 'item4' ]
A Hash in ruby (like in every other language) can't have more than once instance of any single key, meaning that a Hash {'key1' => 1, 'key1' => 2} is invalid and will result in an unexpected value (duplicate keys are overridden - you'll have {'key1' => 2 }).
Since there is some confusion about the format of the data, I will suggest how you might effectively structure both the input and the output. I will first present some code you could use, then give an example of how it's used, then explain what is happening.
Code
def merge_em(hash, array)
hash_keys = hash.keys
new_hash = hash_keys.each_with_object({}) { |k,h|
h[k] = { qty: hash[k], http: [] } }
array.each do |h|
h.keys.each do |k|
(new_hash.update({k=>h[k]}) { |k,g,http|
{qty: g[:qty], http: (g[:http] << http)}}) if hash_keys.include?(k)
end
end
new_hash
end
Example
Here is a hash that I have modified to include a key/value pair that does not appear in the array below:
hash = {'people' =>'50', 'chairs' =>'23', 'footballs'=>'5',
'crayons'=> '1', 'cat_lives'=> '9'}
Below is your array of hashes that is to be merged into hash. You'll see I've added a key/value pair to your hash with key "chairs". As I hope to make clear, the code is no different (i.e., not simplified) if we know in advance that each hash has only one key value pair. (Aside: if, for example, we want want the key from a hash h that is known to have only one key, we still have to pull out all the keys into an array and then take the only element of the array: h.keys.first).
I have also added a hash to the array that has no key that is among hash's keys.
array =
[{'people' =>'http://www.thing.com/this-post'},
{'footballs'=>'http://www.thing.com/that-post'},
{'people' =>'http://www.thing.com/nice-post'},
{'footballs'=>'http://www.thing.com/other-post'},
{'people' =>'http://www.thing.com/thingy-post'},
{'footballs'=>'http://www.thing.com/the-post'},
{'people' =>'http://www.thing.com/the-post'},
{'crayons' =>'http://www.thing.com/the-blah'},
{'chairs' =>'http://www.thing.com/the-page',
'crayons' =>'http://www.thing.com/blah'},
{'balloons' =>'http://www.thing.com/the-page'}
]
We now merge the information from array into hash, and at the same time change the structure of hash to something more suitable:
result = merge_em(hash, array)
#=> {"people" =>{:qty=>"50",
# :http=>["http://www.thing.com/this-post",
# "http://www.thing.com/nice-post",
# "http://www.thing.com/thingy-post",
# "http://www.thing.com/the-post"]},
# "chairs" =>{:qty=>"23",
# :http=>["http://www.thing.com/the-page"]},
# "footballs"=>{:qty=>"5",
# :http=>["http://www.thing.com/that-post",
# "http://www.thing.com/other-post",
# "http://www.thing.com/the-post"]},
# "crayons" =>{:qty=>"1",
# :http=>["http://www.thing.com/the-blah",
# "http://www.thing.com/blah"]},
# "cat_lives"=>{:qty=>"9",
# :http=>[]}}
I've assumed you want to look up the content of result with hash's keys. It is therefore convenient to make the values associated with those keys hashes themselves, with keys :qty and http. The former is for the values in hash (the naming may be wrong); the latter is an array containing the strings drawn from array.
This way, if we want the value for the key "crayons", we could write:
result["crayons"]
#=> {:qty=>"1",
# :http=>["http://www.thing.com/the-blah", "http://www.thing.com/blah"]}
or
irb(main):133:0> result["crayons"][:qty]
#=> "1"
irb(main):134:0> result["crayons"][:http]
#=> ["http://www.thing.com/the-blah", "http://www.thing.com/blah"]
Explanation
Let's go through this line-by-line. First, we need to reference hash.keys more than once, so let's make it a variable:
hash_keys = hash.keys
#=> ["people", "chairs", "footballs", "crayons", "cat_lives"]
We may as well convert this hash to the output format now. We could do it during the merge operation below, but I think is clearer to do it as a separate step:
new_hash = hash_keys.each_with_object({}) { |k,h|
h[k] = { qty: hash[k], http: [] } }
#=> {"people" =>{:qty=>"50", :http=>[]},
# "chairs" =>{:qty=>"23", :http=>[]},
# "footballs"=>{:qty=>"5", :http=>[]},
# "crayons" =>{:qty=>"1", :http=>[]},
# "cat_lives"=>{:qty=>"9", :http=>[]}}
Now we merge each (hash) element of array into new_hash:
array.each do |h|
h.keys.each do |k|
(new_hash.update({k=>h[k]}) { |k,g,http|
{ qty: g[:qty], http: (g[:http] << http) } }) if hash_keys.include?(k)
end
end
The first hash h from array that is passed into the block by each is:
{'people'=>'http://www.thing.com/this-post'}
which is assigned to the block variable h. We next construct an array of h's keys:
h.keys #=> ["people"]
The first of these keys, "people" (pretend there were more, as there would be in the penultimate element of array) is passed by its each into the inner block, whose block variable, k, is assigned the value "people". We then use Hash#update (aka merge!) to merge the hash:
{k=>h[k]} #=> {"people"=>'http://www.thing.com/this-post'}
into new_hash, but only because:
hash_keys.include?(k)
#=> ["people", "chairs", "footballs", "crayons", "cat_lives"].include?("people")
#=> true
evaluates to true. Note that this will evaluate to false for the key "balloons", so the hash in array with that key will not be merged. update's block:
{ |k,g,http| { qty: g[:qty], http: (g[:http] << http) } }
is crucial. This is update's way of determining the value of a key that is in both new_hash and in the hash being merged, {k=>h[k]}. The three block variables are assigned the following values by update:
k : the key ("people")
g : the current value of `new_hash[k]`
#=> `new_hash["people"] => {:qty=>"50", :http=>[]}`
http: the value of the key/value being merged
#=> 'http://www.thing.com/this-post'
We want the merged hash value for key "people" to be:
{ qty: g[:qty], http: (g[:http] << http) }
#=> { qty: 50, http: ([] << 'http://www.thing.com/this-post') }
#=> { qty: 50, http: ['http://www.thing.com/this-post'] }
so now:
new_hash
#=> {"people" =>{:qty=>"50", :http=>['http://www.thing.com/this-post']},
# "chairs" =>{:qty=>"23", :http=>[]},
# "footballs"=>{:qty=>"5", :http=>[]},
# "crayons" =>{:qty=>"1", :http=>[]},
# "cat_lives"=>{:qty=>"9", :http=>[]}}
We do the same for each of the other elements of array.
Lastly, we need to return the merged new_hash, so we make last line of the method:
new_hash
You could also do this
cat = [{'people'=>'50'},
{'chairs'=>'23'},
{'footballs'=>'5'},
{'crayons'=>'1'}]
pages = [{'people'=>'http://www.thing.com/this-post'},
{'footballs'=>'http://www.thing.com/that-post'},
{'people'=>'http://www.thing.com/nice-post'},
{'footballs'=>'http://www.thing.com/other-post'},
{'people'=>'http://www.thing.com/thingy-post'},
{'footballs'=>'http://www.thing.com/the-post'},
{'people'=>'http://www.thing.com/the-post'},
{'crayons'=>'http://www.thing.com/the-blah'},
{'chairs'=>'http://www.thing.com/the-page'}]
cat.map do |c|
c.merge(Hash['pages',pages.collect{|h| h[c.keys.pop]}.compact])
end
#=> [{"people"=>"50", "pages"=>["http://www.thing.com/this-post", "http://www.thing.com/nice-post", "http://www.thing.com/thingy-post", "http://www.thing.com/the-post"]},
{"chairs"=>"23", "pages"=>["http://www.thing.com/the-page"]},
{"footballs"=>"5", "pages"=>["http://www.thing.com/that-post", "http://www.thing.com/other-post", "http://www.thing.com/the-post"]},
{"crayons"=>"1", "pages"=>["http://www.thing.com/the-blah"]}]
Which is closer to your request but far less usable than some of the other posts.

Resources