Ruby : How to sort an array of hash in a given order of a particular key - ruby

I have an array of hashes, id being one of the keys in the hashes. I want to sort the array elements according to a given order of ID values.
Suppose my array(size=5) is:
[{"id"=>1. ...}, {"id"=>4. ...}, {"id"=>9. ...}, {"id"=>2. ...}, {"id"=>7. ...}]
I want to sort the array elements such that their ids are in the following order:
[1,3,5,7,9,2,4,6,8,10]
So the expected result is:
[{'id' => 1},{'id' => 7},{'id' => 9},{'id' => 2},{'id' => 4}]

Here is a solution for any custom index:
def my_index x
# Custom code can be added here to handle items not in the index.
# Currently an error will be raised if item is not part of the index.
[1,3,5,7,9,2,4,6,8,10].index(x)
end
my_collection = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
p my_collection.sort_by{|x| my_index x['id'] } #=> [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
Then you can format it in any way you want, maybe this is prettier:
my_index = [1,3,5,7,9,2,4,6,8,10]
my_collection.sort_by{|x| my_index.index x['id'] }

I would map the hash based on the values like so:
a = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
[1,3,5,7,9,2,4,6,8,10].map{|x| a[a.index({"id" => x})] }.compact
#=> [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]

General note on sorting. Use #sort_by method of the ruby's array class:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort_by {|x|x['id'] }
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
Or with usage #values method as a callback:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort_by(&:values)
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
or you can use more obvious version with #sort method:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort {|x,y| x['id'] <=> y['id'] }
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
For your case, to sort with extended condition use #% to split even and odd indexes:
[{'id' => 1},{'id'=>4},{'id'=>9},{'id'=>2},{'id'=>7}].sort do |x,y|
u = y['id'] % 2 <=> x['id'] % 2
u == 0 && y['id'] <=> x['id'] || u
end
# => [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
For your case, to sort with extended condition use #% to split according the index, even id value is absent in the index array:
index = [1,3,5,7,4,2,6,8,10] # swapped 2 and 4, 9 is absent
[{'id' => 1},{'id'=>4},{'id'=>9},{'id'=>2},{'id'=>7}].sort do |x,y|
!index.rindex( x[ 'id' ] ) && 1 || index.rindex( x[ 'id' ] ) <=> index.rindex( y[ 'id' ] ) || -1
end
# => [{"id"=>1}, {"id"=>7}, {"id"=>4}, {"id"=>2}, {"id"=>9}]

Why not just sort?
def doit(arr, order)
arr.sort { |h1,h2| order.index(h1['id']) <=> order.index(h2['id']) }
end
order = [1,3,5,7,9,2,4,6,8,10]
arr = [{'id' => 1}, {'id' => 4}, {'id' => 9}, {'id' => 2}, {'id' => 7}]
doit(arr, order)
#=> [{'id' => 1}, {'id' => 7}, {'id' => 9}, {'id' => 2}, {'id' => 4}]

a= [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
b=[1,3,5,7,9,2,4,6,8,10]
a.sort_by{|x| b.index (x['id'])}

Related

Filtering when adding elements by id

Tell me how to do this, where you can read about it, because I do not understand at all how to implement it. Thanks.
def initialize
#arr = []
end
def items(init)
arrfinish = init - #arr
#arr = (#arr + init).uniq
yield arrfinish
end
def idefine(find_text)
end
The class has a method(items) that connects arrays by removing duplicate elements. I need to make sure that the idefine
method receives the key by which filtering will be performed when adding new elements, I will give an example below.
app_handler.idefine('id')
app_handler.items([{'id' => 1}, {'id' => 1, 'test_key' => 'Some data'}, {'id' => 2}])
From this example, the second element with id = 1 should be ignored.
Ok, putting aside the class definition, what I understand of the question is:
In an array of hashes, remove the hashes that contain a duplicate value of a given key.
The following function filter the hashes and copy the content of the selected ones in a new array:
require 'set'
def no_dup_val key, arr
previous_values = Set[]
arr.each_with_object([]) do |hash,result|
next unless hash.has_key?(key)
next if previous_values.include?(hash[key])
previous_values << hash[key]
result << hash.dup
end
end
Which gives you:
no_dup_val 'id', [{'id' => 1}, {'id' => 1, 'key' => 'data'}, {'id' => 2}, {'stock' => 3}, {'e-stock'=>0}]
#=> [{"id"=>1}, {"id"=>2}]
Note that the hashes that don't contain the key are also removed, that's my choice, which leads to the following questions:
What happens when the key is not present in a hash?
What happens when the item function is called more than once? Do you take into account the hashes already in #arr?
What happens when you call idefine with a new key? Do you filter the existing elements of #arr with the new key?
As you can see, you need to be a little more specific about what you want to do.
Update
If you don't care about copying the contents of the hashes then these may fit your needs.
Hashes without the id key are removed:
def no_dup_val key, arr
arr.filter{ |h| h.has_key?(key) }.uniq{ |h| h[key] }
end
no_dup_val 'id', [{'id' => 1}, {'id' => 1, 'key' => 'data'}, {'id' => 2}, {'stock' => 3}, {'e-stock'=>0}]
#=> [{"id"=>1}, {"id"=>2}]
Hashes without the id key are treated as having "id" => nil (so the first will be kept):
def no_dup_val key, arr
arr.uniq{ |h| h[key] }
end
no_dup_val 'id', [{'id' => 1}, {'id' => 1, 'key' => 'data'}, {'id' => 2}, {'stock' => 3}, {'e-stock'=>0}]
#=> [{"id"=>1}, {"id"=>2}, {"stock"=>3}]
All the hashes without the id key are kept:

Most performant way to group/summarise two hashes?

I have two hashes with some data that I need to aggregate. The first one is a mapping of which ids (id_1, id_2, id_3, id_4) belong under what category (a, b, c):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
The second hash holds values of how many events happened per id for a given date (date_1, date_2, date_3):
hash_2 = {
'id_1' => {'date_1' => 5, 'date_2' => 6, 'date_3' => 8},
'id_2' => {'date_1' => 0, 'date_3' => 6},
'id_3' => {'date_1' => 0, 'date_2' => nil, 'date_3' => 1},
'id_4' => {'date_1' => 10, 'date_2' => 1}
}
What I want is to get the total event per category (a,b,c). For the above example, the result would look something like:
hash_3 = {'a' => (5+6+8+0+6), 'b' => (0+0+1), 'c' => (10+1)}
My problem is, that there are about 5000 categories, each pointing to typically 1 to 3 ids, and each ID having event counts for 30 dates or more. So this takes quite a bit of computation. What will be the most performant (time effective) way to do this grouping in Ruby?
update
This is what I tried so far (took like 6-8 seconds!, horribly slow):
def total_clicks_per_category
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
end
def total_event_per_ids(ids)
ids.reduce(0) do |memo, id|
events = hash_2.fetch(id, {})
memo + (events.values.reduce(:+) || 0)
end
end
P.S. I’m using Ruby 2.3.
I'm writing this on a phone so I cannot test right now, but it looks OK.
g = hash_2.each_with_object({}) { |(k,v),g| g[k] = v.values.compact.sum }
hash_3 = hash_1.each_with_object({}) { |(k,v),h| h[k] = g.values_at(*v).sum }
First, create an intermediate hash that holds the sum of hash_2:
hash_4 = hash_2.map{|k, v| [k, v.values.inject(:+)]}.to_h
# => {"id_1"=>19, "id_2"=>6, "id_3"=>1, "id_4"=>11}
Then do the final summation:
hash_3 = hash_1.map{|k, v| [k, v.map{|k| hash_4[k]}.inject(:+)]}.to_h
# => {"a"=>25, "b"=>1, "c"=>11}
Theory
5000*3*30 isn't that many. Ruby probably will need a second at most for this kind of job.
Hash lookup is fast by default, you won't be able to optimize much.
You could pre-calculate hash_2_sum, though :
hash_2_sum = {
'id_1' => 5+6+8,
'id_2' => 0+6,
'id_3' => 0+0+1,
'id_4' => 10+1
}
A loop on hash1 with hash_2_sum lookup, and you're done.
Code
Your example has been updated with some nil values. You need to remove them with compact, and make sure the sum is 0 when no element is found with inject(0, :+):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
hash_2 = {
'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
'id_2' => { 'date_1' => 0, 'date_3' => 6 },
'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 },
'id_4' => { 'date_1' => 10, 'date_2' => 1 }
}
hash_2_sum = hash_2.each_with_object({}) do |(key, dates), sum|
sum[key] = dates.values.compact.inject(0, :+)
end
hash_3 = hash_1.each_with_object({}) do |(key, ids), sum|
sum[key] = hash_2_sum.values_at(*ids).inject(0, :+)
end
# {"a"=>25, "b"=>1, "c"=>11}
Note
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
isn't very readable IMHO.
You can either use each_with_object or Array#to_h :
result = [1, 2, 3].each_with_object({}) do |i, hash|
hash[i] = i * i
end
#=> {1=>1, 2=>4, 3=>9}
result = [1, 2, 3].map { |i| [i, i * i] }.to_h
#=> {1=>1, 2=>4, 3=>9}

Merge hashes based on particular key/value pair in ruby

I am trying to merge an array of hashes based on a particular key/value pair.
array = [ {:id => '1', :value => '2'}, {:id => '1', :value => '5'} ]
I would want the output to be
{:id => '1', :value => '7'}
As patru stated, in sql terms this would be equivalent to:
SELECT SUM(value) FROM Hashes GROUP BY id
In other words, I have an array of hashes that contains records. I would like to obtain the sum of a particular field, but the sum would grouped by key/value pairs. In other words, if my selection criteria is :id as in the example above, then it would seperate the hashes into groups where the id was the same and the sum the other keys.
I apologize for any confusion due to the typo earlier.
Edit: The question has been clarified since I first posted my answer. As a result, I have revised my answer substantially.
Here are two "standard" ways of addressing this problem. Both use Enumerable#select to first extract the elements from the array (hashes) that contain the given key/value pair.
#1
The first method uses Hash#merge! to sequentially merge each array element (hashes) into a hash that is initially empty.
Code
def doit(arr, target_key, target_value)
qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
return nil if qualified.empty?
qualified.each_with_object({}) {|h,g|
g.merge!(h) {|k,gv,hv| k == target_key ? gv : (gv.to_i + hv.to_i).to_s}}
end
Example
arr = [{:id => '1', :value => '2'}, {:id => '2', :value => '3'},
{:id => '1', :chips => '4'}, {:zd => '1', :value => '8'},
{:cat => '2', :value => '3'}, {:id => '1', :value => '5'}]
doit(arr, :id, '1')
#=> {:id=>"1", :value=>"7", :chips=>"4"}
Explanation
The key here is to use the version of Hash#merge! that uses a block to determine the value for each key/value pair whose key appears in both of the hashes being merged. The two values for that key are represented above by the block variables hv and gv. We simply want to add them together. Note that g is the (initially empty) hash object created by each_with_object, and returned by doit.
target_key = :id
target_value = '1'
qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
#=> [{:id=>"1", :value=>"2"},{:id=>"1", :chips=>"4"},{:id=>"1", :value=>"5"}]
qualified.empty?
#=> false
qualified.each_with_object({}) {|h,g|
g.merge!(h) {|k,gv,hv| k == target_key ? gv : (gv.to_i + hv.to_i).to_s}}
#=> {:id=>"1", :value=>"7", :chips=>"4"}
#2
The other common way to do this kind of calculation is to use Enumerable#flat_map, followed by Enumerable#group_by.
Code
def doit(arr, target_key, target_value)
qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
return nil if qualified.empty?
qualified.flat_map(&:to_a)
.group_by(&:first)
.values.map { |a| a.first.first == target_key ? a.first :
[a.first.first, a.reduce(0) {|tot,s| tot + s.last}]}.to_h
end
Explanation
This may look complex, but it's not so bad if you break it down into steps. Here's what's happening. (The calculation of qualified is the same as in #1.)
target_key = :id
target_value = '1'
c = qualified.flat_map(&:to_a)
#=> [[:id,"1"],[:value,"2"],[:id,"1"],[:chips,"4"],[:id,"1"],[:value,"5"]]
d = c.group_by(&:first)
#=> {:id=>[[:id, "1"], [:id, "1"], [:id, "1"]],
# :value=>[[:value, "2"], [:value, "5"]],
# :chips=>[[:chips, "4"]]}
e = d.values
#=> [[[:id, "1"], [:id, "1"], [:id, "1"]],
# [[:value, "2"], [:value, "5"]],
# [[:chips, "4"]]]
f = e.map { |a| a.first.first == target_key ? a.first :
[a.first.first, a.reduce(0) {|tot,s| tot + s.last}] }
#=> [[:id, "1"], [:value, "7"], [:chips, "4"]]
f.to_h => {:id=>"1", :value=>"7", :chips=>"4"}
#=> {:id=>"1", :value=>"7", :chips=>"4"}
Comment
You may wish to consider makin the values in the hashes integers and exclude the target_key/target_value pairs from qualified:
arr = [{:id => 1, :value => 2}, {:id => 2, :value => 3},
{:id => 1, :chips => 4}, {:zd => 1, :value => 8},
{:cat => 2, :value => 3}, {:id => 1, :value => 5}]
target_key = :id
target_value = 1
qualified = arr.select { |h| h.key?(target_key) && h[target_key]==target_value}
.each { |h| h.delete(target_key) }
#=> [{:value=>2}, {:chips=>4}, {:value=>5}]
return nil if qualified.empty?
Then either
qualified.each_with_object({}) {|h,g| g.merge!(h) { |k,gv,hv| gv + hv } }
#=> {:value=>7, :chips=>4}
or
qualified.flat_map(&:to_a)
.group_by(&:first)
.values
.map { |a| [a.first.first, a.reduce(0) {|tot,s| tot + s.last}] }.to_h
#=> {:value=>7, :chips=>4}

Given array of hashes, how can I use select on one key of the hash while evaluating on another key?

With my array of hashes:
data = [{:bool => true, :val => 5}, {:bool => false, :val => 9}, {:bool => true, :val => 1}]
I would like to iterate through the data and retrieve an array of values only. I can do:
data.map{|x| x[:val] if x[:bool]}
which returns:
[5, nil, 1]
But this method requires an additional .compact call to get rid of the nil values.
Is there a better way of achieving this?
Use chaining instead to first select only those where :bool is true, then map the results to :val:
data.select { |h| h[:bool] }.map { |h| h[:val] } #=> [5, 1]
data.map { |x| x[:val] if x[:bool] }.compact is probably the easiest to read, but you can go down to one function call via reduce:
data.reduce([]) { |m,x| m << x[:val] if x[:bool]; m }
It does not seem an elegant way but you can try this:
data = [{:bool => true, :val => 5}, {:bool => false, :val => 9}, {:bool => true, :val => 1}]
result = Array.new
data.each do |b|
val = Hash.try_convert(b)[:val]
result.unshift(val)
end

ruby array includes an id

I currently want to iterate over an array of Objects (2 properties: id & name) and check if the array contains a specific Id
How would I do this?
Enumerable#detect is ok, but I think that Enumerable#any? (which returns a boolean), is strictly what you asked for:
xs = [{:id => 1, :name => 'a'}, {:id => 2, :name => 'b'}]
puts xs.any? {|x| x[:id] == 1} # true
puts xs.any? {|x| x[:id] == 5} # false
Try detect
a = [{:id => 1, :name => 'a'}, {:id => 2, :name => 'b'}]
puts a.detect {|x| x[:id] == 1}

Resources