Simulate join between Hashes - ruby

I have two Arrays of Hashes which simulate two tables in a database, with one key in the first hash referencing a separately-named key in the second hash, example below:
cars = [ { id: 1, color: 'red', owner_id: 1 }, { id: 2, color: 'black', owner_id: 1 } ]
owners = [ { id: 1, name: 'Alice' }, { id: 2, name: 'Bob' } ]
I'd like to try to accomplish a "join" on these two hashes, resulting in a new Array of Hashes, so that the keys and values from owners will be merged into any of the cars hashes where the cars' :owner_id matches an owner's :id. So in the above example, the result would look like this:
[ { id: 1, color: 'red', owner_id: 1, name: 'Alice' }, { id: 2, color: 'black', owner_id: 1, name: 'Alice' } ]
Anyone have any thoughts on how I could achieve this? Thank you!
[EDIT] Updated to clarify that I would like the results to be placed in a new Array of Hashes, rather than mutating either of the original Arrays.

def join(referers, referees, on_referer, on_referee)
referers.map do |record|
referees.find do |referee_record|
record[on_referer] == referee_record[on_referee]
end.merge(record)
end
end
cars = [ { id: 1, color: 'red', owner_id: 1 }, { id: 2, color: 'black', owner_id: 1 } ]
owners = [ { id: 1, name: 'Alice' }, { id: 2, name: 'Bob' } ]
join(cars, owners, :owner_id, :id)
# => [{:id=>1, :name=>"Alice", :color=>"red", :owner_id=>1},
# {:id=>2, :name=>"Alice", :color=>"black", :owner_id=>1}]

Edit: I just noticed that it is the key :owner_id in cars that is to be matched with the :id in owners. I assumed the key :id in cars was to be matched. I will leave my answer as is, considering that the modification is trivial and that it may be easier to follow if the match is to be on the same key names.
Assuming that:
you want to modify (mutate) cars; and
for each element h of owners there is an element g of cars for which h[:id] == g[:id],
it's just
owners.each { |h| cars.find { |g| g[:id] == h[:id] }.update(h) }
cars #=> [{:id=>1, :color=>"red", :owner_id=>1, :name=>"Alice"},
# {:id=>2, :color=>"black", :owner_id=>1, :name=>"Bob"}]
On the other hand, if:
you do not wish to mutate cars or
for a given element h of owners there may be no element g of cars for which h[:id]==g[:id] or
you just want to improve efficiency,
you could first create a hash for cars or owners whose keys are values of :id.
Suppose:
owners = [ { id: 3, name: 'Alice' }, { id: 2, name: 'Bob' } ]
We could create a hash for owners:
owners_by_id = owners.each_with_object({}) { |g,h| h.update(g[:id]=>g) }
#=> {3=>{:id=>3, :name=>"Alice"}, 2=>{:id=>2, :name=>"Bob"}}
and then write:
cars.map do |h|
g = {}.merge(h)
id = g[:id]
g.update(owners_by_id[id]) if owners_by_id.key?(id)
g
end
#=> [{:id=>1, :color=>"red", :owner_id=>1},
# {:id=>2, :color=>"black", :owner_id=>1, :name=>"Bob"}]

Assuming that the hashes at the same position in the arrays correspond:
[cars, owners].transpose.map{|h1, h2| h1.merge(h2)}
Otherwise, your example is bad.

Related

Is there any way to check if hashes in an array contains similar key value pairs in ruby?

For example, I have
array = [ {name: 'robert', nationality: 'asian', age: 10},
{name: 'robert', nationality: 'asian', age: 5},
{name: 'sira', nationality: 'african', age: 15} ]
I want to get the result as
array = [ {name: 'robert', nationality: 'asian', age: 15},
{name: 'sira', nationality: 'african', age: 15} ]
since there are 2 Robert's with the same nationality.
Any help would be much appreciated.
I have tried Array.uniq! {|e| e[:name] && e[:nationality] } but I want to add both numbers in the two hashes which is 10 + 5
P.S: Array can have n number of hashes.
I would start with something like this:
array = [
{ name: 'robert', nationality: 'asian', age: 10 },
{ name: 'robert', nationality: 'asian', age: 5 },
{ name: 'sira', nationality: 'african', age: 15 }
]
array.group_by { |e| e.values_at(:name, :nationality) }
.map { |_, vs| vs.first.merge(age: vs.sum { |v| v[:age] }) }
#=> [
# {
# :name => "robert",
# :nationality => "asian",
# :age => 15
# }, {
# :name => "sira",
# :nationality => "african",
# :age => 15
# }
# ]
Let's take a look at what you want to accomplish and go from there. You have a list of some objects, and you want to merge certain objects together if they have the same ethnicity and name. So we have a key by which we will merge. Let's put that in programming terms.
key = proc { |x| [x[:name], x[:nationality]] }
We've defined a procedure which takes a hash and returns its "key" value. If this procedure returns the same value (according to eql?) for two hashes, then those two hashes need to be merged together. Now, what do we mean by "merge"? You want to add the ages together, so let's write a merge function.
merge = proc { |x, y| x.dup.tap { |x1| x1[:age] += y[:age] } }
If we have two values x and y such that key[x] and key[y] are the same, we want to merge them by making a copy of x and adding y's age to it. That's exactly what this procedure does. Now that we have our building blocks, we can write the algorithm.
We want to produce an array at the end, after merging using the key procedure we've written. Fortunately, Ruby has a handy function called each_with_object which will do something very nice for us. The method each_with_object will execute its block for each element of the array, passing in a predetermined value as the other argument. This will come in handy here.
result = array.each_with_object({}) do |x, hsh|
# ...
end.values
Since we're using keys and values to do the merge, the most efficient way to do this is going to be with a hash. Hence, we pass in an empty hash as the extra object, which we'll modify to accumulate the merge results. At the end, we don't care about the keys anymore, so we write .values to get just the objects themselves. Now for the final pieces.
if hsh.include? key[x]
hsh[ key[x] ] = merge.call hsh[ key[x] ], x
else
hsh[ key[x] ] = x
end
Let's break this down. If the hash already includes key[x], which is the key for the object x that we're looking at, then we want to merge x with the value that is currently at key[x]. This is where we add the ages together. This approach only works if the merge function is what mathematicians call a semigroup, which is a fancy way of saying that the operation is associative. You don't need to worry too much about that; addition is a very good example of a semigroup, so it works here.
Anyway, if the key doesn't exist in the hash, we want to put the current value in the hash at the key position. The resulting hash from merging is returned, and then we can get the values out of it to get the result you wanted.
key = proc { |x| [x[:name], x[:nationality]] }
merge = proc { |x, y| x.dup.tap { |x1| x1[:age] += y[:age] } }
result = array.each_with_object({}) do |x, hsh|
if hsh.include? key[x]
hsh[ key[x] ] = merge.call hsh[ key[x] ], x
else
hsh[ key[x] ] = x
end
end.values
Now, my complexity theory is a bit rusty, but if Ruby implements its hash type efficiently (which I'm fairly certain it does), then this merge algorithm is O(n), which means it will take a linear amount of time to finish, given the problem size as input.
array.each_with_object(Hash.new(0)) { |g,h| h[[g[:name], g[:nationality]]] += g[:age] }.
map { |(name, nationality),age| { name:name, nationality:nationality, age:age } }
[{ :name=>"robert", :nationality=>"asian", :age=>15 },
{ :name=>"sira", :nationality=>"african", :age=>15 }]
The two steps are as follows.
a = array.each_with_object(Hash.new(0)) { |g,h| h[[g[:name], g[:nationality]]] += g[:age] }
#=> { ["robert", "asian"]=>15, ["sira", "african"]=>15 }
This uses the class method Hash::new to create a hash with a default value of zero (represented by the block variable h). Once this hash heen obtained it is a simple matter to construct the desired hash:
a.map { |(name, nationality),age| { name:name, nationality:nationality, age:age } }

Merging hash values in an array of hashes based on key

I have an array of hashes similar to this:
[
{"student": "a","scores": [{"subject": "math","quantity": 10},{"subject": "english", "quantity": 5}]},
{"student": "b", "scores": [{"subject": "math","quantity": 1 }, {"subject": "english","quantity": 2 } ]},
{"student": "a", "scores": [ { "subject": "math", "quantity": 2},{"subject": "science", "quantity": 5 } ] }
]
Is there a simpler way of getting the output similar to this except looping through the array and finding a duplicate and then combining them?
[
{"student": "a","scores": [{"subject": "math","quantity": 12},{"subject": "english", "quantity": 5},{"subject": "science", "quantity": 5 } ]},
{"student": "b", "scores": [{"subject": "math","quantity": 1 }, {"subject": "english","quantity": 2 } ]}
]
Rules for merging duplicate objects:
Students are merged on matching "value" (e.g. student "a", student "b")
Students scores on identical subjects are added (e.g. student a's math scores 2 and 10 become 12 when merged)
Is there a simpler way of getting the output similar to this except looping through the array and finding a duplicate and then combining them?
Not that I know of. IF you explain where this data is comeing form the answer may be different but just based on the Array of Hash objects I think you will haev to iterate and combine.
While it is not elegant you could use a solution like this
arr = [
{"student"=> "a","scores"=> [{"subject"=> "math","quantity"=> 10},{"subject"=> "english", "quantity"=> 5}]},
{"student"=> "b", "scores"=> [{"subject"=> "math","quantity"=> 1 }, {"subject"=> "english","quantity"=> 2 } ]},
{"student"=> "a", "scores"=> [ { "subject"=> "math", "quantity"=> 2},{"subject"=> "science", "quantity"=> 5 } ] }
]
#Group the array by student
arr.group_by{|student| student["student"]}.map do |student_name,student_values|
{"student" => student_name,
#combine all the scores and group by subject
"scores" => student_values.map{|student| student["scores"]}.flatten.group_by{|score| score["subject"]}.map do |subject,subject_values|
{"subject" => subject,
#combine all the quantities into an array and reduce using `+`
"quantity" => subject_values.map{|h| h["quantity"]}.reduce(:+)
}
end
}
end
#=> [
{"student"=>"a", "scores"=>[
{"subject"=>"math", "quantity"=>12},
{"subject"=>"english", "quantity"=>5},
{"subject"=>"science", "quantity"=>5}]},
{"student"=>"b", "scores"=>[
{"subject"=>"math", "quantity"=>1},
{"subject"=>"english", "quantity"=>2}]}
]
I know that you specified your expected result but I wanted to point out that making the output simpler makes the code simpler.
arr.map(&:dup).group_by{|a| a.delete("student")}.each_with_object({}) do |(student, scores),record|
record[student] = scores.map(&:values).flatten.map(&:values).each_with_object(Hash.new(0)) do |(subject,score),obj|
obj[subject] += score
obj
end
record
end
#=>{"a"=>{"math"=>12, "english"=>5, "science"=>5}, "b"=>{"math"=>1, "english"=>2}}
With this structure getting the students is as easy as calling .keys and the scores would be equally as simple. I am thinking something like
above_result.each do |student,scores|
puts student
scores.each do |subject,score|
puts " #{subject.capitalize}: #{score}"
end
end
end
The console out put would be
a
Math: 12
English: 5
Science: 5
b
Math: 1
English: 2
There are two common ways of aggregating values in such instances. The first is to employ the method Enumerable#group_by, as #engineersmnky has done in his answer. The second is to build a hash using the form of the method Hash#update (a.k.a. merge!) that uses a block to resolve the values of keys which are present in both of the hashes being merged. My solution uses the latter approach, not because I prefer it to the group_by, but just to show you a different way it can be done. (Had engineersmnky used update, I would have gone with group_by.)
Your problem is complicated somewhat by the particular data structure you are using. I found that the solution could be simplfied and made easier to follow by first converting the data to a different structure, update the scores, then convert the result back to your data structure. You may want to consider changing the data structure (if that's an option for you). I've addressed that issue in the "Discussion" section.
Code
def combine_scores(arr)
reconstruct(update_scores(simplify(arr)))
end
def simplify(arr)
arr.map do |h|
hash = Hash[h[:scores].map { |g| g.values }]
hash.default = 0
{ h[:student]=> hash }
end
end
def update_scores(arr)
arr.each_with_object({}) do |g,h|
h.update(g) do |_, h_scores, g_scores|
g_scores.each { |subject,score| h_scores[subject] += score }
h_scores
end
end
end
def reconstruct(h)
h.map { |k,v| { student: k, scores: v.map { |subject, score|
{ subject: subject, score: score } } } }
end
Example
arr = [
{ student: "a", scores: [{ subject: "math", quantity: 10 },
{ subject: "english", quantity: 5 }] },
{ student: "b", scores: [{ subject: "math", quantity: 1 },
{ subject: "english", quantity: 2 } ] },
{ student: "a", scores: [{ subject: "math", quantity: 2 },
{ subject: "science", quantity: 5 } ] }]
combine_scores(arr)
#=> [{ :student=>"a",
# :scores=>[{ :subject=>"math", :score=>12 },
# { :subject=>"english", :score=> 5 },
# { :subject=>"science", :score=> 5 }] },
# { :student=>"b",
# :scores=>[{ :subject=>"math", :score=> 1 },
# { :subject=>"english", :score=> 2 }] }]
Explanation
First consider the two intermediate calculations:
a = simplify(arr)
#=> [{ "a"=>{ "math"=>10, "english"=>5 } },
# { "b"=>{ "math"=> 1, "english"=>2 } },
# { "a"=>{ "math"=> 2, "science"=>5 } }]
h = update_scores(a)
#=> {"a"=>{"math"=>12, "english"=>5, "science"=>5}
# "b"=>{"math"=> 1, "english"=>2}}
Then
reconstruct(h)
returns the result shown above.
+ simplify
arr.map do |h|
hash = Hash[h[:scores].map { |g| g.values }]
hash.default = 0
{ h[:student]=> hash }
end
This maps each hash into a simpler one. For example, the first element of arr:
h = { student: "a", scores: [{ subject: "math", quantity: 10 },
{ subject: "english", quantity: 5 }] }
is mapped to:
{ "a"=>Hash[[{ subject: "math", quantity: 10 },
{ subject: "english", quantity: 5 }].map { |g| g.values }] }
#=> { "a"=>Hash[[["math", 10], ["english", 5]]] }
#=> { "a"=>{"math"=>10, "english"=>5}}
Setting the default value of each hash to zero simplifies the update step, which follows.
+ update_scores
For the array of hashes a that is returned by simplify, we compute:
a.each_with_object({}) do |g,h|
h.update(g) do |_, h_scores, g_scores|
g_scores.each { |subject,score| h_scores[subject] += score }
h_scores
end
end
Each element of a (a hash) is merged into an initially-empty hash, h. As update (same as merge!) is used for the merge, h is modified. If both hashes share the same key (e.g., "math"), the values are summed; else subject=>score is added to h.
Notice that if h_scores does not have the key subject, then:
h_scores[subject] += score
#=> h_scores[subject] = h_scores[subject] + score
#=> h_scores[subject] = 0 + score (because the default value is zero)
#=> h_scores[subject] = score
That is, the key-value pair from g_scores is merely added to h_scores.
I've replaced the block variable representing the subject with a placeholder _, to reduce the chance of errors and to inform the reader that it is not used in the block.
+ reconstruct
The final step is to convert the hash returned by update_scores back to the original data structure, which is straightforward.
Discussion
If you change the data structure, and it meets your requirements, you may wish to consider changing it to that produced by combine_scores:
h = { "a"=>{ math: 10, english: 5 }, "b"=>{ math: 1, english: 2 } }
Then to update the scores with:
g = { "a"=>{ math: 2, science: 5 }, "b"=>{ english: 3 }, "c"=>{ science: 4 } }
you would merely to the following:
h.merge(g) { |_,oh,nh| oh.merge(nh) { |_,ohv,nhv| ohv+nhv } }
#=> { "a"=>{ :math=>12, :english=>5, :science=>5 },
# "b"=>{ :math=> 1, :english=>5 },
# "c"=>{ :science=>4 } }

Sort array with custom order

I have an array of ids order say
order = [5,2,8,6]
and another array of hash
[{id: 2,name: name2},{id: 5,name: name5}, {id: 6,name: name6}, {id: 8,name: name8}]
I want it sorted as
[{id: 5,name: name5},{id: 2,name: name2}, {id: 8,name: name8}, {id: 6,name: name6}]
What could be best way to implement this? I can implement this with iterating both and pushing it to new array but looking for better solution.
Try this
arr = [
{:id=>2, :name=>"name2"}, {:id=>5, :name=>"name5"},
{:id=>6, :name=>"name6"}, {:id=>8, :name=>"name8"}
]
order = [5,2,8,6]
arr.sort_by { |a| order.index(a[:id]) }
# => [{:id=>5, :name=>"name5"}, {:id=>2, :name=>"name2"},
#{:id=>8, :name=>"name8"}, {:id=>6, :name=>"name6"}]
Enumerable#in_order_of (Rails 7+)
Starting from Rails 7, there is a new method Enumerable#in_order_of.
A quote right from the official Rails docs:
in_order_of(key, series)
Returns a new Array where the order has been set to that provided in the series, based on the key of the objects in the original enumerable.
[ Person.find(5), Person.find(3), Person.find(1) ].in_order_of(:id, [ 1, 5, 3 ])
=> [ Person.find(1), Person.find(5), Person.find(3) ]
If the series include keys that have no corresponding element in the Enumerable, these are ignored. If the Enumerable has additional elements that aren't named in the series, these are not included in the result.
It is not perfect in a case of hashes, but you can consider something like:
require 'ostruct'
items = [{ id: 2, name: 'name2' }, { id: 5, name: 'name5' }, { id: 6, name: 'name6' }, { id: 8, name: 'name8' }]
items.map(&OpenStruct.method(:new)).in_order_of(:id, [5,2,8,6]).map(&:to_h)
# => [{:id=>5, :name=>"name5"}, {:id=>2, :name=>"name2"}, {:id=>8, :name=>"name8"}, {:id=>6, :name=>"name6"}]
Sources:
Official docs - Enumerable#in_order_of.
PR - Enumerable#in_order_of #41333.
Rails 7 adds Enumerable#in_order_of.

Check if any of array hash objects has needed value

I have an array like this:
arr = [{id: 1, name: 'John' }, {id: 2, name: 'Sam' }, {id: 3, name: 'Bob' }]
I need to check if any of arr objects have name Sam. What is the most elegant way? I can only think of cycling with each.
I need to check if any of arr objects have name Sam
Enumerable#any? is a good way to go.
arr = [ {id: 1, name: 'John' }, {id: 2, name: 'Sam' }, {id: 3, name: 'Bob' }]
arr.any? {|h| h[:name] == "Sam"}
# => true
Now if you also want to see which Array object has the value Sam in it,you can use Enumerable#find for the same:
arr.find {|h| h[:name] == "Sam"}
# => {:id=>2, :name=>"Sam"}
You can also choose select or count methods
Enumberable#select
> arr = [{id: 1, name: 'John' }, {id: 2, name: 'Sam' }, {id: 3, name: 'Bob' }]
> arr.select { | h | h[:name] == 'Sam' }
# => [{:id=>2, :name=>"Sam"}]
Enumberable#count
> arr.count { | h | h[:name] == 'Sam' }
# => 1
You can use Enumberable#find_all to return all object that match the constrain
arr = [{:id=>1,:first_name=>'sam'},{:id=>2,:first_name=>'sam'},{:id=>3,:first_name=>'samanderson'},{:id=>4,:first_name=>'samuel'}]
arr.find_all{|obj| obj.first_name == 'sam'}
# => [{:id=>1,:first_name=>'sam'},{:id=>2,:first_name=>'sam'}]

Convert Array of objects to Hash with a field as the key

I have an Array of objects:
[
#<User id: 1, name: "Kostas">,
#<User id: 2, name: "Moufa">,
...
]
And I want to convert this into an Hash with the id as the keys and the objects as the values. Right now I do it like so but I know there is a better way:
users = User.all.reduce({}) do |hash, user|
hash[user.id] = user
hash
end
The expected output:
{
1 => #<User id: 1, name: "Kostas">,
2 => #<User id: 2, name: "Moufa">,
...
}
users_by_id = User.all.map { |user| [user.id, user] }.to_h
If you are using Rails, ActiveSupport provides Enumerable#index_by:
users_by_id = User.all.index_by(&:id)
You'll get a slightly better code by using each_with_object instead of reduce.
users = User.all.each_with_object({}) do |user, hash|
hash[user.id] = user
end
You can simply do (using the Ruby 3 syntax of _1)
users_by_id = User.all.to_h { [_1.id, _1] }

Resources