Given array of hashes, how can I use select on one key of the hash while evaluating on another key? - ruby

With my array of hashes:
data = [{:bool => true, :val => 5}, {:bool => false, :val => 9}, {:bool => true, :val => 1}]
I would like to iterate through the data and retrieve an array of the :val values for the hashes where :bool is true. I can do:
data.map{|x| x[:val] if x[:bool]}
which returns:
[5, nil, 1]
But this method requires an additional .compact call to get rid of the nil values.
Is there a better way of achieving this?

Use chaining instead to first select only those where :bool is true, then map the results to :val:
data.select { |h| h[:bool] }.map { |h| h[:val] } #=> [5, 1]

data.map { |x| x[:val] if x[:bool] }.compact is probably the easiest to read, but you can go down to one function call via reduce:
data.reduce([]) { |m,x| m << x[:val] if x[:bool]; m }
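On Ruby 2.7 or later, Enumerable#filter_map combines the map and compact steps in a single pass. A minimal sketch using the data from the question (note that filter_map drops every falsy block result, so a :val of nil or false would be dropped as well):

```ruby
data = [{:bool => true, :val => 5}, {:bool => false, :val => 9}, {:bool => true, :val => 1}]

# The block returns the value when :bool is truthy and nil otherwise;
# filter_map keeps only the truthy results.
vals = data.filter_map { |h| h[:val] if h[:bool] }
p vals #=> [5, 1]
```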

It is not an elegant way, but you can try this:
data = [{:bool => true, :val => 5}, {:bool => false, :val => 9}, {:bool => true, :val => 1}]
result = Array.new
data.each do |b|
  hash = Hash.try_convert(b)
  # Keep the value only when :bool is truthy; << preserves the original order.
  result << hash[:val] if hash[:bool]
end

Related

Filtering when adding elements by id

Please tell me how to do this, or where I can read about it, because I do not understand how to implement it. Thanks.
def initialize
  @arr = []
end
def items(init)
  arrfinish = init - @arr
  @arr = (@arr + init).uniq
  yield arrfinish
end
def idefine(find_text)
end
The class has a method (items) that merges arrays while removing duplicate elements. I need the idefine
method to receive the key by which filtering will be performed when new elements are added. Here is an example:
app_handler.idefine('id')
app_handler.items([{'id' => 1}, {'id' => 1, 'test_key' => 'Some data'}, {'id' => 2}])
From this example, the second element with id = 1 should be ignored.
Ok, putting aside the class definition, what I understand of the question is:
In an array of hashes, remove the hashes that contain a duplicate value of a given key.
The following function filters the hashes and copies the selected ones into a new array:
require 'set'
def no_dup_val key, arr
previous_values = Set[]
arr.each_with_object([]) do |hash,result|
next unless hash.has_key?(key)
next if previous_values.include?(hash[key])
previous_values << hash[key]
result << hash.dup
end
end
Which gives you:
no_dup_val 'id', [{'id' => 1}, {'id' => 1, 'key' => 'data'}, {'id' => 2}, {'stock' => 3}, {'e-stock'=>0}]
#=> [{"id"=>1}, {"id"=>2}]
Note that hashes that don't contain the key are also removed. That's my choice, and it leads to the following questions:
What happens when the key is not present in a hash?
What happens when the items method is called more than once? Do you take into account the hashes already in @arr?
What happens when you call idefine with a new key? Do you filter the existing elements of @arr with the new key?
As you can see, you need to be a little more specific about what you want to do.
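To make those questions concrete, here is one possible sketch of the class. It rests on assumptions that the question does not confirm: the class name AppHandler is hypothetical, hashes lacking the key are skipped, key values seen in earlier calls are remembered, and calling idefine again does not re-filter what is already stored.

```ruby
require 'set'

# Hypothetical class name; the question only shows the method bodies.
class AppHandler
  def initialize
    @arr = []
    @key = nil
    @seen = Set.new
  end

  # Store the key used to detect duplicates (assumed to be called before items).
  def idefine(key)
    @key = key
  end

  # Append the hashes whose @key value has not been seen yet,
  # yield them to the optional block, and return them.
  def items(init)
    # Set#add? returns nil when the value is already present.
    added = init.select { |h| h.key?(@key) && @seen.add?(h[@key]) }
    @arr.concat(added)
    yield added if block_given?
    added
  end
end

app_handler = AppHandler.new
app_handler.idefine('id')
added = app_handler.items([{'id' => 1}, {'id' => 1, 'test_key' => 'Some data'}, {'id' => 2}])
p added #=> [{"id"=>1}, {"id"=>2}]
```

A later call such as app_handler.items([{'id' => 2}, {'id' => 3}]) would then add only the hash with id 3 under these assumptions.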
Update
If you don't care about copying the contents of the hashes then these may fit your needs.
Hashes without the id key are removed:
def no_dup_val key, arr
arr.filter{ |h| h.has_key?(key) }.uniq{ |h| h[key] }
end
no_dup_val 'id', [{'id' => 1}, {'id' => 1, 'key' => 'data'}, {'id' => 2}, {'stock' => 3}, {'e-stock'=>0}]
#=> [{"id"=>1}, {"id"=>2}]
Hashes without the id key are treated as having "id" => nil (so the first will be kept):
def no_dup_val key, arr
arr.uniq{ |h| h[key] }
end
no_dup_val 'id', [{'id' => 1}, {'id' => 1, 'key' => 'data'}, {'id' => 2}, {'stock' => 3}, {'e-stock'=>0}]
#=> [{"id"=>1}, {"id"=>2}, {"stock"=>3}]
All the hashes without the id key are kept:
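The snippet for this last case is missing from the text. A minimal sketch of one way to do it, assuming the intent is to deduplicate only the hashes that have the key and to pass the others through unchanged (the method name no_dup_val_keep_missing is made up for illustration):

```ruby
require 'set'

def no_dup_val_keep_missing(key, arr)
  seen = Set.new
  # Keep a hash when it lacks the key, or when its key value is new.
  # Set#add? returns nil if the value was already in the set.
  arr.select { |h| !h.key?(key) || seen.add?(h[key]) }
end

result = no_dup_val_keep_missing('id', [{'id' => 1}, {'id' => 1, 'key' => 'data'}, {'id' => 2}, {'stock' => 3}, {'e-stock' => 0}])
p result #=> [{"id"=>1}, {"id"=>2}, {"stock"=>3}, {"e-stock"=>0}]
```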

Most performant way to group/summarise two hashes?

I have two hashes with some data that I need to aggregate. The first one is a mapping of which ids (id_1, id_2, id_3, id_4) belong under what category (a, b, c):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
The second hash holds values of how many events happened per id for a given date (date_1, date_2, date_3):
hash_2 = {
'id_1' => {'date_1' => 5, 'date_2' => 6, 'date_3' => 8},
'id_2' => {'date_1' => 0, 'date_3' => 6},
'id_3' => {'date_1' => 0, 'date_2' => nil, 'date_3' => 1},
'id_4' => {'date_1' => 10, 'date_2' => 1}
}
What I want is to get the total event per category (a,b,c). For the above example, the result would look something like:
hash_3 = {'a' => (5+6+8+0+6), 'b' => (0+0+1), 'c' => (10+1)}
My problem is, that there are about 5000 categories, each pointing to typically 1 to 3 ids, and each ID having event counts for 30 dates or more. So this takes quite a bit of computation. What will be the most performant (time effective) way to do this grouping in Ruby?
update
This is what I tried so far (took like 6-8 seconds!, horribly slow):
def total_clicks_per_category
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
end
def total_event_per_ids(ids)
ids.reduce(0) do |memo, id|
events = hash_2.fetch(id, {})
memo + (events.values.reduce(:+) || 0)
end
end
P.S. I’m using Ruby 2.3.
I'm writing this on a phone so I cannot test it right now, but it looks OK. Array#sum requires Ruby 2.4 or later, so on Ruby 2.3 use inject(0, :+) instead:
g = hash_2.each_with_object({}) { |(k,v),acc| acc[k] = v.values.compact.inject(0, :+) }
hash_3 = hash_1.each_with_object({}) { |(k,v),h| h[k] = g.values_at(*v).inject(0, :+) }
First, create an intermediate hash that holds the per-id sums of hash_2 (compact drops the nil counts before summing):
hash_4 = hash_2.map{|k, v| [k, v.values.compact.inject(:+)]}.to_h
# => {"id_1"=>19, "id_2"=>6, "id_3"=>1, "id_4"=>11}
Then do the final summation:
hash_3 = hash_1.map{|k, v| [k, v.map{|id| hash_4[id]}.inject(:+)]}.to_h
# => {"a"=>25, "b"=>1, "c"=>11}
Theory
5000*3*30 isn't that many. Ruby probably will need a second at most for this kind of job.
Hash lookup is fast by default, you won't be able to optimize much.
You could pre-calculate hash_2_sum, though :
hash_2_sum = {
'id_1' => 5+6+8,
'id_2' => 0+6,
'id_3' => 0+0+1,
'id_4' => 10+1
}
A loop over hash_1 with a hash_2_sum lookup, and you're done.
Code
Your example has been updated with some nil values. You need to remove them with compact, and make sure the sum is 0 when no element is found with inject(0, :+):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
hash_2 = {
'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
'id_2' => { 'date_1' => 0, 'date_3' => 6 },
'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 },
'id_4' => { 'date_1' => 10, 'date_2' => 1 }
}
hash_2_sum = hash_2.each_with_object({}) do |(key, dates), sum|
sum[key] = dates.values.compact.inject(0, :+)
end
hash_3 = hash_1.each_with_object({}) do |(key, ids), sum|
sum[key] = hash_2_sum.values_at(*ids).inject(0, :+)
end
# {"a"=>25, "b"=>1, "c"=>11}
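The question's own attempt used hash_2.fetch(id, {}), which suggests that some ids may be missing from hash_2. In that case values_at returns nil for them and inject(0, :+) raises. A sketch of one way to tolerate missing ids, assuming they should count as zero (the id 'id_9' is invented for the example), that also runs on Ruby 2.3:

```ruby
hash_1 = {'a' => ['id_1', 'id_9'], 'b' => ['id_3']} # 'id_9' is absent from hash_2
hash_2 = {
  'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
  'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 }
}

# Hash.new(0) makes lookups of unknown ids return 0 instead of nil.
hash_2_sum = Hash.new(0)
hash_2.each { |id, dates| hash_2_sum[id] = dates.values.compact.inject(0, :+) }

hash_3 = hash_1.each_with_object({}) do |(category, ids), totals|
  totals[category] = ids.inject(0) { |total, id| total + hash_2_sum[id] }
end
p hash_3 #=> {"a"=>19, "b"=>1}
```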
Note
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
isn't very readable IMHO.
You can either use each_with_object or Array#to_h :
result = [1, 2, 3].each_with_object({}) do |i, hash|
hash[i] = i * i
end
#=> {1=>1, 2=>4, 3=>9}
result = [1, 2, 3].map { |i| [i, i * i] }.to_h
#=> {1=>1, 2=>4, 3=>9}

Ruby : How to sort an array of hash in a given order of a particular key

I have an array of hashes, id being one of the keys in the hashes. I want to sort the array elements according to a given order of ID values.
Suppose my array(size=5) is:
[{"id"=>1, ...}, {"id"=>4, ...}, {"id"=>9, ...}, {"id"=>2, ...}, {"id"=>7, ...}]
I want to sort the array elements such that their ids are in the following order:
[1,3,5,7,9,2,4,6,8,10]
So the expected result is:
[{'id' => 1},{'id' => 7},{'id' => 9},{'id' => 2},{'id' => 4}]
Here is a solution for any custom index:
def my_index x
# Custom code can be added here to handle items not in the index.
# Currently an error will be raised if item is not part of the index.
[1,3,5,7,9,2,4,6,8,10].index(x)
end
my_collection = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
p my_collection.sort_by{|x| my_index x['id'] } #=> [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
Then you can format it in any way you want, maybe this is prettier:
my_index = [1,3,5,7,9,2,4,6,8,10]
my_collection.sort_by{|x| my_index.index x['id'] }
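Array#index scans the order array on every call, which can add up for long lists. A sketch of a variant that builds a position lookup once; sending ids that are missing from the order to the end is an assumption of this example (the id 42 is invented to show it):

```ruby
order = [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
# Map each id to its position in the desired order: {1=>0, 3=>1, 5=>2, ...}
position = order.each_with_index.to_h

items = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}, {"id"=>42}]
# fetch falls back to order.size, so unknown ids sort after all known ones.
sorted = items.sort_by { |h| position.fetch(h["id"], order.size) }
p sorted.map { |h| h["id"] } #=> [1, 7, 9, 2, 4, 42]
```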
I would map the ordered ids to the matching hashes like so (find returns nil for ids that are not in the array, and compact drops them):
a = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
[1,3,5,7,9,2,4,6,8,10].map{|x| a.find{|h| h["id"] == x} }.compact
#=> [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
A general note on sorting: use the #sort_by method of Ruby's Array class:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort_by {|x|x['id'] }
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
Or pass the #values method as the block:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort_by(&:values)
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
Or you can use the more explicit version with the #sort method:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort {|x,y| x['id'] <=> y['id'] }
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
For your case, use #% to put odd ids before even ids, sorting in ascending order within each group:
[{'id' => 1},{'id'=>4},{'id'=>9},{'id'=>2},{'id'=>7}].sort do |x,y|
  u = y['id'] % 2 <=> x['id'] % 2
  u == 0 && x['id'] <=> y['id'] || u
end
# => [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
To sort according to an explicit index array, even when an id value is absent from that array (absent ids go last), use #rindex:
index = [1,3,5,7,4,2,6,8,10] # swapped 2 and 4, 9 is absent
[{'id' => 1},{'id'=>4},{'id'=>9},{'id'=>2},{'id'=>7}].sort do |x,y|
!index.rindex( x[ 'id' ] ) && 1 || index.rindex( x[ 'id' ] ) <=> index.rindex( y[ 'id' ] ) || -1
end
# => [{"id"=>1}, {"id"=>7}, {"id"=>4}, {"id"=>2}, {"id"=>9}]
Why not just sort?
def doit(arr, order)
arr.sort { |h1,h2| order.index(h1['id']) <=> order.index(h2['id']) }
end
order = [1,3,5,7,9,2,4,6,8,10]
arr = [{'id' => 1}, {'id' => 4}, {'id' => 9}, {'id' => 2}, {'id' => 7}]
doit(arr, order)
#=> [{'id' => 1}, {'id' => 7}, {'id' => 9}, {'id' => 2}, {'id' => 4}]
a= [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
b=[1,3,5,7,9,2,4,6,8,10]
a.sort_by{|x| b.index(x['id'])}

Ruby: Link two arrays of objects by attribute value

I'm pretty new to Ruby programming. In Ruby there are plenty of ways to write elegant code. Is there an elegant way to link two arrays of objects of the same type by attribute value?
It's hard to explain. Let's look at the next example:
a = [ { :id => 1, :value => 1 }, { :id => 2, :value => 2 }, { :id => 3, :value => 3 } ]
b = [ { :id => 1, :value => 2 }, { :id => 3, :value => 4 } ]
c = link a, b
# Result structure after linkage.
c = {
"1" => {
:a => { :id => 1, :value => 1 },
:b => { :id => 1, :value => 2 }
},
"3" => {
:a => { :id => 3, :value => 3 },
:b => { :id => 3, :value => 4 }
}
}
So the basic idea is to get pairs of objects from different arrays by their common ID and construct a hash, which will give this pair by ID.
Thanks in advance.
If you want to take an adventure through Enumerable, you could say this:
(a.map { |h| [:a, h] } + b.map { |h| [:b, h] })
.group_by { |_, h| h[:id] }
.select { |_, a| a.length == 2 }
.inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
And if you really want the keys to be strings, say n.to_s => Hash[v] instead of n => Hash[v].
The logic works like this:
1. We need to know where everything comes from, so we decorate the little hashes with :a and :b symbols to track their origins.
2. Then add the decorated arrays together into one list so that...
3. group_by can group things into almost-the-final-format.
4. Then find the groups of size two, since those groups contain the entries that appeared in both a and b. Groups of size one appeared in only one of a or b, so we throw those away.
5. Then a little injection to rearrange things into their final format. Note that the arrays we built in (1) just happen to be in the format that Hash[] is looking for.
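The steps above can be followed one at a time by naming the intermediate values. This is only an unrolled sketch of the same pipeline on the question's data; the variable names decorated, grouped and paired are made up for illustration:

```ruby
a = [ { :id => 1, :value => 1 }, { :id => 2, :value => 2 }, { :id => 3, :value => 3 } ]
b = [ { :id => 1, :value => 2 }, { :id => 3, :value => 4 } ]

# Tag each hash with its origin and merge both lists.
decorated = a.map { |h| [:a, h] } + b.map { |h| [:b, h] }

# Group the tagged pairs by id, e.g. grouped[2] holds only the entry from a.
grouped = decorated.group_by { |_, h| h[:id] }

# Keep the ids that occur in both arrays.
paired = grouped.select { |_, tagged| tagged.length == 2 }

# Turn each list of [origin, hash] pairs into a hash keyed by origin.
c = paired.inject({}) { |acc, (id, tagged)| acc.update(id => Hash[tagged]) }
p c.keys #=> [1, 3]
```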
If you wanted to do this in a link method then you'd need to say things like:
link :a => a, :b => b
so that the method will know what to call a and b. This hypothetical link method also easily generalizes to more arrays:
def link(input)
input.map { |k, v| v.map { |h| [k, h] } }
.inject(:+)
.group_by { |_, h| h[:id] }
.select { |_, a| a.length == input.length }
.inject({}) { |h, (n, v)| h.update(n => Hash[v]) }
end
link :a => [...], :b => [...], :c => [...]
I assume that, for any two elements h1 and h2 of a (or of b), h1[:id] != h2[:id].
I would do this:
def convert(arr) Hash[arr.map {|h| [h[:id], h]}] end
ah, bh = convert(a), convert(b)
c = ah.keys.each_with_object({}) {|k,h|h[k]={a: ah[k], b: bh[k]} if bh.key?(k)}
# => {1=>{:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3=>{:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}}
Note that:
ah = convert(a)
# => {1=>{:id=>1, :value=>1}, 2=>{:id=>2, :value=>2}, 3=>{:id=>3, :value=>3}}
bh = convert(b)
# => {1=>{:id=>1, :value=>2}, 3=>{:id=>3, :value=>4}}
Here's a second approach. I don't like it as well, but it represents a different way of looking at the problem.
def sort_by_id(a) a.sort_by {|h| h[:id]} end
c = Hash[*sort_by_id(a.select {|ha| b.find {|hb| hb[:id] == ha[:id]}})
.zip(sort_by_id(b))
.map {|ha,hb| [ha[:id], {a: ha, b: hb}]}
.flatten]
Here's what's happening. The first step is to select only the elements ha of a for which there is an element hb of b with ha[:id] == hb[:id]. Then we sort both (what's left of) a and b on h[:id], zip them together and then make the hash c.
r1 = a.select {|ha| b.find {|hb| hb[:id] == ha[:id]}}
# => [{:id=>1, :value=>1}, {:id=>3, :value=>3}]
r2 = sort_by_id(r1)
# => [{:id=>1, :value=>1}, {:id=>3, :value=>3}]
r3 = sort_by_id(b)
# => [{:id=>1, :value=>2}, {:id=>3, :value=>4}]
r4 = r2.zip(r3)
# => [[{:id=>1, :value=>1}, {:id=>1, :value=>2}],
# [{:id=>3, :value=>3}, {:id=>3, :value=>4}]]
r5 = r4.map {|ha,hb| [ha[:id], {a: ha, b: hb}]}
# => [[1, {:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}}],
# [3, {:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}]]
r6 = r5.flatten
# => [1, {:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3, {:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}]
c = Hash[*r6]
# => {1=>{:a=>{:id=>1, :value=>1}, :b=>{:id=>1, :value=>2}},
# 3=>{:a=>{:id=>3, :value=>3}, :b=>{:id=>3, :value=>4}}}
Ok, I've found the answer by myself. Here is a quite short line of code, which should do the trick:
Hash[a.product(b)
.select { |pair| pair[0][:id] == pair[1][:id] }
.map { |pair| [pair[0][:id], { :a => pair[0], :b => pair[1] }] }]
The product method gives us all possible pairs, then we filter them by equal IDs of the pair elements. Then we map the pairs to the special form that produces the Hash we are looking for.
So Hash[["key1", "value1"], ["key2", "value2"]] returns { "key1" => "value1", "key2" => "value2" }, and I use this to get the answer to my question.
Thanks.
P.S.: you can use pair.first instead of pair[0] and pair.last instead of pair[1] for better readability.
UPDATE
As Cary pointed out, it is better to replace |pair| with |ha, hb| to avoid these ugly indices:
Hash[a.product(b)
.select { |ha, hb| ha[:id] == hb[:id] }
.map { |ha, hb| [ha[:id], { :a => ha, :b => hb }] }]
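Since product builds every pair of elements from a and b, its cost grows with the product of the two array sizes. For larger inputs, a sketch of an alternative that indexes b by id first, assuming ids are unique within each array (the names b_by_id and linked are illustrative):

```ruby
a = [ { :id => 1, :value => 1 }, { :id => 2, :value => 2 }, { :id => 3, :value => 3 } ]
b = [ { :id => 1, :value => 2 }, { :id => 3, :value => 4 } ]

# Build a lookup table from id to the matching element of b.
b_by_id = b.each_with_object({}) { |hb, index| index[hb[:id]] = hb }

# Walk a once and keep only the ids that also exist in b.
linked = a.each_with_object({}) do |ha, out|
  hb = b_by_id[ha[:id]]
  out[ha[:id]] = { :a => ha, :b => hb } if hb
end
p linked.keys #=> [1, 3]
```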

Return hash with modified values in Ruby

I'm trying this:
{:id => 5, :foos => [1,2,3]}.each {|k,v| v.to_s}
But that's returning this:
{:id=>5, :foos=>[1, 2, 3]}
I'd like to see this:
{:id=>"5", :foos=>"[1, 2, 3]"}
I've also tried variations of Hash#collect and Hash#map. Any ideas?
you could use Object#inspect:
{ :id => 5, :foos => [1, 2, 3] }.inject({}) do |hash, (key, value)|
hash.merge key => value.inspect
end
which returns:
{ :id => "5", :foos => "[1, 2, 3]" }
or if you want it to be destructive:
hash = { :id => 5, :foos => [1, 2, 3] }
hash.each_key { |key| hash[key] = hash[key].inspect }
Your code doesn't work because v.to_s returns a new string without modifying v, and each returns the original hash regardless of what the block returns, so the block effectively does nothing.
You could do it like this:
hash = {:id => 5, :foos => [1,2,3]}
hash.each_key { |k| hash[k] = hash[k].to_s }
If you don't want to modify the hash:
hash = {:id => 5, :foos => [1,2,3]}
new_hash = {}
hash.each_key { |k| new_hash[k] = hash[k].to_s }
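On Ruby 2.4 or later, Hash#transform_values does the non-destructive version in one call. A short sketch, assuming to_s gives the string form you want (for an Array, to_s is the same as inspect):

```ruby
hash = { :id => 5, :foos => [1, 2, 3] }

# transform_values returns a new hash with the same keys; hash is left untouched.
new_hash = hash.transform_values(&:to_s)
p new_hash[:id]   #=> "5"
p new_hash[:foos] #=> "[1, 2, 3]"
```

Hash#transform_values! is the in-place counterpart if you do want to modify the original hash.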
