I implemented a tree with this code:
groups = {"al1o0"=>"A1", "al2o2"=>"A10", "al2o3"=>"A11", "al1o1"=>"A2"}
map = {}
arr = []
groups.each_with_index do |group, index|
  level = (group.first.split("o")[0].split("al")[1]).to_i - 1
  level = level == 0 ? nil : level
  order = group.first.split("o")[1]
  arr.append({ :id=> index + 1, :order => order, :name => group.last, :parent => level})
end
root = {:id => 0, :name => '', :order => 0, :parent => nil}
arr.each do |e|
  map[e[:id]] = e
end
tree = {}
arr.each do |e|
  pid = e[:parent]
  if pid == nil
    (tree[root] ||= []) << e
  else
    (tree[map[pid]] ||= []) << e
  end
end
tree now contains:
=> {{:id=>0, :name=>"", :order=>0, :parent=>nil}=>[{:id=>1, :order=>"0", :name=>"A1", :parent=>nil}, {:id=>4, :order=>"1", :name=>"A2", :parent=>nil}], {:id=>1, :order=>"0", :name=>"A1", :parent=>nil}=>[{:id=>2, :order=>"2", :name=>"A10", :parent=>1}, {:id=>3, :order=>"3", :name=>"A11", :parent=>1}]}
So far so good, but if I call tree.to_json, the output is:
=> "{\"{:id=\\u003e0, :name=\\u003e\\\"\\\", :order=\\u003e0, :parent=\\u003enil}\":[{\"id\":1,\"order\":\"0\",\"name\":\"A1\",\"parent\":null},{\"id\":4,\"order\":\"1\",\"name\":\"A2\",\"parent\":null}],\"{:id=\\u003e1, :order=\\u003e\\\"0\\\", :name=\\u003e\\\"A1\\\", :parent=\\u003enil}\":[{\"id\":2,\"order\":\"2\",\"name\":\"A10\",\"parent\":1},{\"id\":3,\"order\":\"3\",\"name\":\"A11\",\"parent\":1}]}"
Why did it change :id=>0 into :id=\u003e0?
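First of all, tree has an unusual structure:
{{:id=>0, :name=>"", :order=>0, :parent=>nil}=>[{:id=>1, :order=>"0", :name=>"A1", :parent=>nil}, ...]}
Here the key is
{:id=>0, :name=>"", :order=>0, :parent=>nil}
and the value is
[{:id=>1, :order=>"0", :name=>"A1", :parent=>nil}, ...]
A hash makes an awkward key: to look the entry up later you would have to supply an equal hash. JSON object keys must also be strings, so to_json has to turn the whole key into a string, and the \u003e in the output is simply the JSON escape sequence for the > character inside that stringified key.
You might need something like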
{"A1" => {name: 'foo', order: '0' }, 'A2' => ...}
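A minimal sketch of one alternative, assuming each node only needs an id, a name, and its parent's id: nest the children inside their parent instead of using the parent hash as a key. The method name build_tree and the :children key are illustrative and not from the question.

```ruby
require 'json'

# Hypothetical helper: turns a flat list of nodes into nested hashes.
# Each input node is expected to have :id, :name and :parent (nil for roots).
def build_tree(nodes)
  by_id = {}
  nodes.each do |n|
    by_id[n[:id]] = { id: n[:id], name: n[:name], children: [] }
  end

  roots = []
  nodes.each do |n|
    node = by_id[n[:id]]
    parent = by_id[n[:parent]] # nil when the node has no parent
    if parent
      parent[:children] << node
    else
      roots << node
    end
  end
  roots
end

nodes = [
  { id: 1, name: 'A1',  parent: nil },
  { id: 2, name: 'A10', parent: 1 },
  { id: 3, name: 'A11', parent: 1 },
  { id: 4, name: 'A2',  parent: nil }
]
puts JSON.generate(build_tree(nodes))
```

Because every key is now a symbol or a string, the JSON output contains no stringified hashes.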
I have two hashes with some data that I need to aggregate. The first one is a mapping of which ids (id_1, id_2, id_3, id_4) belong under what category (a, b, c):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
The second hash holds values of how many events happened per id for a given date (date_1, date_2, date_3):
hash_2 = {
'id_1' => {'date_1' => 5, 'date_2' => 6, 'date_3' => 8},
'id_2' => {'date_1' => 0, 'date_3' => 6},
'id_3' => {'date_1' => 0, 'date_2' => nil, 'date_3' => 1},
'id_4' => {'date_1' => 10, 'date_2' => 1}
}
What I want is the total number of events per category (a, b, c). For the above example, the result would look something like:
hash_3 = {'a' => (5+6+8+0+6), 'b' => (0+0+1), 'c' => (10+1)}
My problem is that there are about 5000 categories, each pointing to typically 1 to 3 ids, and each id has event counts for 30 dates or more, so this takes quite a bit of computation. What is the most performant (time-effective) way to do this grouping in Ruby?
update
This is what I have tried so far (it took 6 to 8 seconds, which is horribly slow):
def total_clicks_per_category
  {}.tap do |res|
    hash_1.each do |cat, ids|
      res[cat] = total_event_per_ids(ids)
    end
  end
end
def total_event_per_ids(ids)
  ids.reduce(0) do |memo, id|
    events = hash_2.fetch(id, {})
    memo + (events.values.reduce(:+) || 0)
  end
end
P.S. I’m using Ruby 2.3.
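I'm writing this on a phone, so I cannot test it right now, but it looks OK. (Array#sum needs Ruby 2.4 or later; on Ruby 2.3 without ActiveSupport, replace .sum with .inject(0, :+).)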
g = hash_2.each_with_object({}) { |(k,v),g| g[k] = v.values.compact.sum }
hash_3 = hash_1.each_with_object({}) { |(k,v),h| h[k] = g.values_at(*v).sum }
First, create an intermediate hash that holds the per-id sums of hash_2 (compact drops the nil counts):
hash_4 = hash_2.map{|k, v| [k, v.values.compact.inject(0, :+)]}.to_h
# => {"id_1"=>19, "id_2"=>6, "id_3"=>1, "id_4"=>11}
Then do the final summation:
hash_3 = hash_1.map{|k, ids| [k, ids.map{|id| hash_4[id]}.inject(0, :+)]}.to_h
# => {"a"=>25, "b"=>1, "c"=>11}
Theory
5000 * 3 * 30 is only 450,000 values, which isn't many. Ruby will probably need a second at most for this kind of job.
Hash lookup is fast by default, so you won't be able to optimize much there.
You could pre-calculate hash_2_sum, though:
hash_2_sum = {
'id_1' => 5+6+8,
'id_2' => 0+6,
'id_3' => 0+0+1,
'id_4' => 10+1
}
Loop over hash_1 with a lookup in hash_2_sum, and you're done.
Code
Your example has been updated with some nil values. You need to remove them with compact, and make sure the sum is 0 when no element is found with inject(0, :+):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
hash_2 = {
'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
'id_2' => { 'date_1' => 0, 'date_3' => 6 },
'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 },
'id_4' => { 'date_1' => 10, 'date_2' => 1 }
}
hash_2_sum = hash_2.each_with_object({}) do |(key, dates), sum|
sum[key] = dates.values.compact.inject(0, :+)
end
hash_3 = hash_1.each_with_object({}) do |(key, ids), sum|
sum[key] = hash_2_sum.values_at(*ids).inject(0, :+)
end
# {"a"=>25, "b"=>1, "c"=>11}
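To check the size estimate, here is a rough timing sketch on synthetic data of about the size described in the question. The helper names build_sample and totals_per_category are made up for this example, the generated ids and counts are arbitrary, and the extra compact after values_at is an assumption to tolerate ids that have no entry in hash_2.

```ruby
# Builds synthetic data: `categories` categories, each with
# `ids_per_category` ids, each id with `dates` counts (some nil).
def build_sample(categories, ids_per_category, dates)
  hash_1 = {}
  hash_2 = {}
  categories.times do |c|
    ids = Array.new(ids_per_category) { |i| "id_#{c}_#{i}" }
    hash_1["cat_#{c}"] = ids
    ids.each do |id|
      counts = Array.new(dates) { |d| ["date_#{d}", d % 7 == 0 ? nil : 1] }
      hash_2[id] = Hash[counts]
    end
  end
  [hash_1, hash_2]
end

# Same two-step approach as above: sum per id first, then per category.
def totals_per_category(hash_1, hash_2)
  sums = hash_2.each_with_object({}) do |(id, dates), acc|
    acc[id] = dates.values.compact.inject(0, :+)
  end
  hash_1.each_with_object({}) do |(cat, ids), acc|
    # compact skips ids that are missing from hash_2
    acc[cat] = sums.values_at(*ids).compact.inject(0, :+)
  end
end

hash_1, hash_2 = build_sample(5000, 3, 30)
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
result = totals_per_category(hash_1, hash_2)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts format('%d categories in %.3f s', result.size, elapsed)
```

The elapsed time depends on the machine and Ruby version, so treat the printed number as an indication only.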
Note
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
isn't very readable IMHO.
You can use either each_with_object or Array#to_h:
result = [1, 2, 3].each_with_object({}) do |i, hash|
hash[i] = i * i
end
#=> {1=>1, 2=>4, 3=>9}
result = [1, 2, 3].map { |i| [i, i * i] }.to_h
#=> {1=>1, 2=>4, 3=>9}
I have this hash:
HASH = {
'x' => { :amount => 0 },
'c' => { :amount => 5 },
'q' => { :amount => 10 },
'y' => { :amount => 20 },
'n' => { :amount => 50 }
}
How can I get the key with the next highest amount from the hash?
For example, if I supply x, it should return c. If there is no higher amount, then the key with the lowest amount should be returned. That means when I supply n, then x would be returned.
Can anybody help?
I'd use something like this:
def next_higher(key)
amount = HASH[key][:amount]
sorted = HASH.sort_by { |_, v| v[:amount] }
sorted.find(sorted.method(:first)) { |_, v| v[:amount] > amount }.first
end
next_higher "x" #=> "c"
next_higher "n" #=> "x"
I'd do something like this:
def find_next_by_amount(hash, key)
sorted = hash.sort_by { |_, v| v[:amount] }
index_of_next = sorted.index { |k, _| k == key }.next
sorted.fetch(index_of_next, sorted.first).first
end
find_next_by_amount(HASH, 'x')
# => "c"
find_next_by_amount(HASH, 'n')
# => "x"
Something like this:
def next_key(key)
  amount = HASH[key][:amount]
  higher = HASH.select { |_, v| v[:amount] > amount }
  candidates = higher.empty? ? HASH : higher
  candidates.min_by { |_, v| v[:amount] }.first
end
I'm curious: why would you want something like that? Maybe there is a better solution to the underlying task.
EDIT: I realized that the hash isn't necessarily sorted by amount, so I adapted the code for unsorted hashes.
One way:
A = HASH.sort_by { |_,h| h[:amount] }.map(&:first)
#=> ['x', 'c', 'q', 'y', 'n']
(If HASH's keys are already in the correct order, this is just A = HASH.keys.)
def next_one(x)
A[(A.index(x)+1)%A.size]
end
next_one 'x' #=> 'c'
next_one 'q' #=> 'y'
next_one 'n' #=> 'x'
Alternatively, you could create a hash instead of a method:
e = A.cycle
#=> #<Enumerator: ["x", "c", "q", "y", "n"]:cycle>
g = A.size.times.with_object({}) { |_,g| g.update(e.next=>e.peek) }
#=> {"x"=>"c", "c"=>"q", "q"=>"y", "y"=>"n", "n"=>"x"}
What is the best way to construct a hash-like class Case, which is initialized by a hash:
cs = Case.new(:a => 1, /b/ => 2, /c/ => 2, /d/ => 3)
and has a method Case#[] that looks up the first matching key by === (like a case statement) instead of by == (like a conventional hash) and returns the value:
cs["xxb"] # => 2
Here's a possibility.
class Case
  def initialize(h)
    @h = h
  end
  def [](key, order=:PRE)
    case order
    when :PRE
      @h[@h.keys.find { |k| key === k }]
    when :POST
      @h[@h.keys.find { |k| k === key }]
    else
      # raise exception
    end
  end
end
cs = Case.new(:a => 1, /b/ => 2, /c/ => 2, [1,2] => "cat", /d/ => 3)
cs["xxb"] #=> nil
cs["xxb",:POST] #=> 2
cs[Regexp] #=> 2
cs[Regexp,:POST] #=> nil
cs[Array] #=> "cat"
cs[Symbol] #=> 1
This assumes h does not have nil as a key.
With the understanding that the key in the hash is to come on the left side of ===, the code would be:
class Case
  def initialize(h); @h = h; end
  def [](key); @h[@h.keys.find{|k| k === key}]; end
end
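As a self-contained sketch of the same idea, the class below searches the key/value pairs directly rather than the keys, so a nil key or a nil value cannot be confused with "no match". The class name CaseLookup and the instance variable @pairs are illustrative only.

```ruby
# Hash-like lookup that matches keys with === in insertion order,
# the way a case statement tests its when clauses.
class CaseLookup
  def initialize(pairs)
    @pairs = pairs
  end

  # Returns the value of the first key for which `key === value` holds,
  # or nil when nothing matches.
  def [](value)
    match = @pairs.find { |pattern, _| pattern === value }
    match && match.last
  end
end

cs = CaseLookup.new(:a => 1, /b/ => 2, /c/ => 2, /d/ => 3)
p cs["xxb"]
p cs[:a]
p cs["zzz"]
```

Hashes keep insertion order, so "first matching key" here means the first key in the order the hash was written.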
I am trying to merge an array of hashes based on a particular key/value pair.
array = [ {:id => '1', :value => '2'}, {:id => '1', :value => '5'} ]
I would want the output to be
{:id => '1', :value => '7'}
As patru stated, in SQL terms this would be equivalent to:
SELECT SUM(value) FROM Hashes GROUP BY id
In other words, I have an array of hashes that contains records. I would like to obtain the sum of a particular field, with the sum grouped by key/value pairs. If my selection criterion is :id, as in the example above, it would separate the hashes into groups with the same id and then sum the other keys.
I apologize for any confusion due to the typo earlier.
Edit: The question has been clarified since I first posted my answer. As a result, I have revised my answer substantially.
Here are two "standard" ways of addressing this problem. Both use Enumerable#select to first extract the elements from the array (hashes) that contain the given key/value pair.
#1
The first method uses Hash#merge! to sequentially merge each array element (hashes) into a hash that is initially empty.
Code
def doit(arr, target_key, target_value)
  qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
  return nil if qualified.empty?
  qualified.each_with_object({}) {|h,g|
    g.merge!(h) {|k,gv,hv| k == target_key ? gv : (gv.to_i + hv.to_i).to_s}}
end
Example
arr = [{:id => '1', :value => '2'}, {:id => '2', :value => '3'},
{:id => '1', :chips => '4'}, {:zd => '1', :value => '8'},
{:cat => '2', :value => '3'}, {:id => '1', :value => '5'}]
doit(arr, :id, '1')
#=> {:id=>"1", :value=>"7", :chips=>"4"}
Explanation
The key here is to use the version of Hash#merge! that uses a block to determine the value for each key/value pair whose key appears in both of the hashes being merged. The two values for that key are represented above by the block variables hv and gv. We simply want to add them together. Note that g is the (initially empty) hash object created by each_with_object, and returned by doit.
target_key = :id
target_value = '1'
qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
#=> [{:id=>"1", :value=>"2"},{:id=>"1", :chips=>"4"},{:id=>"1", :value=>"5"}]
qualified.empty?
#=> false
qualified.each_with_object({}) {|h,g|
g.merge!(h) {|k,gv,hv| k == target_key ? gv : (gv.to_i + hv.to_i).to_s}}
#=> {:id=>"1", :value=>"7", :chips=>"4"}
#2
The other common way to do this kind of calculation is to use Enumerable#flat_map, followed by Enumerable#group_by.
Code
def doit(arr, target_key, target_value)
  qualified = arr.select {|h|h.key?(target_key) && h[target_key]==target_value}
  return nil if qualified.empty?
  qualified.flat_map(&:to_a)
           .group_by(&:first)
           .values.map { |a| a.first.first == target_key ? a.first :
             [a.first.first, a.reduce(0) {|tot,s| tot + s.last.to_i}.to_s]}.to_h
end
Explanation
This may look complex, but it's not so bad if you break it down into steps. Here's what's happening. (The calculation of qualified is the same as in #1.)
target_key = :id
target_value = '1'
c = qualified.flat_map(&:to_a)
#=> [[:id,"1"],[:value,"2"],[:id,"1"],[:chips,"4"],[:id,"1"],[:value,"5"]]
d = c.group_by(&:first)
#=> {:id=>[[:id, "1"], [:id, "1"], [:id, "1"]],
# :value=>[[:value, "2"], [:value, "5"]],
# :chips=>[[:chips, "4"]]}
e = d.values
#=> [[[:id, "1"], [:id, "1"], [:id, "1"]],
# [[:value, "2"], [:value, "5"]],
# [[:chips, "4"]]]
f = e.map { |a| a.first.first == target_key ? a.first :
  [a.first.first, a.reduce(0) {|tot,s| tot + s.last.to_i}.to_s] }
#=> [[:id, "1"], [:value, "7"], [:chips, "4"]]
f.to_h
#=> {:id=>"1", :value=>"7", :chips=>"4"}
Comment
You may wish to consider making the values in the hashes integers and excluding the target_key/target_value pairs from qualified:
arr = [{:id => 1, :value => 2}, {:id => 2, :value => 3},
{:id => 1, :chips => 4}, {:zd => 1, :value => 8},
{:cat => 2, :value => 3}, {:id => 1, :value => 5}]
target_key = :id
target_value = 1
qualified = arr.select { |h| h.key?(target_key) && h[target_key]==target_value}
.each { |h| h.delete(target_key) }
#=> [{:value=>2}, {:chips=>4}, {:value=>5}]
return nil if qualified.empty?
Then either
qualified.each_with_object({}) {|h,g| g.merge!(h) { |k,gv,hv| gv + hv } }
#=> {:value=>7, :chips=>4}
or
qualified.flat_map(&:to_a)
.group_by(&:first)
.values
.map { |a| [a.first.first, a.reduce(0) {|tot,s| tot + s.last}] }.to_h
#=> {:value=>7, :chips=>4}
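If the goal is the full GROUP BY from the question rather than a single target value, a sketch along these lines may help. It assumes the values are numeric strings, as in the question, and the method name sum_by is made up for this example.

```ruby
# Groups records by `key` and sums `field` within each group,
# roughly like SELECT key, SUM(field) ... GROUP BY key.
# Values are converted with to_i, so a missing field counts as 0.
def sum_by(records, key, field)
  records.group_by { |h| h[key] }.map do |group_value, group|
    total = group.inject(0) { |tot, h| tot + h[field].to_i }
    { key => group_value, field => total.to_s }
  end
end

array = [
  { :id => '1', :value => '2' },
  { :id => '2', :value => '3' },
  { :id => '1', :value => '5' }
]
p sum_by(array, :id, :value)
```

group_by keeps groups in order of first appearance, so the result lists id '1' before id '2' here.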
I have a helper module that generates an array of hashes, something like:
[{:date => d, :total_amount => 31, :first_category => 1, :second_category => 2,...},
{:date => d+1, :total_amount => 31, :first_category => 1, :second_category => 2,...}]
So I wrote the method like this:
def records_chart_data(category = nil, start = 3.weeks.ago)
  total_by_day = Record.total_grouped_by_day(start)
  category_sum_by_day = Record.sum_of_category_by_day(start)
  (start.to_date..Time.zone.today).map do |date|
    {
      :date => date,
      :total_amount => total_by_day[date].try(:first).try(:total_amount) || 0,
      Category.find(1).title => category_sum_by_day[0][date].try(:first).try(:total_amount) || 0,
      Category.find(2).title => category_sum_by_day[1][date].try(:first).try(:total_amount) || 0,
      Category.find(3).title => category_sum_by_day[2][date].try(:first).try(:total_amount) || 0,
    }
  end
end
Since the set of categories will keep changing, I tried to use a loop in this method:
def records_chart_data(category = nil, start = 3.weeks.ago)
total_by_day = Record.total_grouped_by_day(start)
category_sum_by_day = Record.sum_of_category_by_day(start)
(start.to_date..Time.zone.today).map do |date|
{
:date => date,
Category.all.each_with_index do |category, index|
category.title => category_sum_by_day[index][date].try(:first).try(:total_amount) || 0,
end
:total_amount => total_by_day[date].try(:first).try(:total_amount) || 0
}
end
end
But Ruby reports an error:
/Users/tsu/Code/CashNotes/app/helpers/records_helper.rb:10: syntax error, unexpected tASSOC, expecting keyword_end
category.title => category_sum_by_day[index][d...
Why does it say expecting keyword_end, and how should I fix it?
The method sum_of_category_by_day that it calls looks like:
def self.sum_of_category_by_day(start)
  records = where(date: start.beginning_of_day..Time.zone.today)
  records = records.group('category_id, date(date)')
  records = records.select('category_id, date, sum(amount) as total_amount')
  records = records.group_by{ |r| r.category_id }
  records.map do |category_id, value|
    value.group_by {|r| r.date.to_date}
  end
end
Or should I alter this method so that it returns something friendlier for the helper above?
Category.all.each_with_index do |category, index|
category.title => category_sum_by_day # ...snip!
end
Unfortunately, this piece of code does not adhere to Ruby's grammar. The problem is the body of the block: x => y on its own is not an expression, and the body of a block must consist of expressions.
If you want to generate a hash one key-value pair at a time, try the following combination of Hash::[], Array#flatten and the splat operator (i.e. unary *):
Hash[*5.times.map { |i| [i * 3, - i * i] }.flatten]
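For illustration, the sketch below wraps that technique in a small method and puts Array#to_h (available since Ruby 2.1) next to it. Both helper names are made up for this example. One design note: the sketch uses flatten(1) rather than a bare flatten, so that values which are themselves arrays are not flattened away.

```ruby
# Hypothetical helpers; `pairs` is an array of [key, value] arrays.
def hash_from_pairs_splat(pairs)
  # flatten(1) removes only the pair nesting, so array values stay intact.
  Hash[*pairs.flatten(1)]
end

def hash_from_pairs(pairs)
  pairs.to_h
end

pairs = 5.times.map { |i| [i * 3, -i * i] }
p hash_from_pairs_splat(pairs)
p hash_from_pairs(pairs)
```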
As a result, I'd rewrite the last expression of records_chart_data more or less as follows:
(start.to_date..Time.zone.today).map do |date|
  categories = Hash[*Category.all.each_with_index.map { |category, index|
    [ category.title, category_sum_by_day[...] ]
  }.flatten]
  { :date => date,
    :total_amount => total_by_day[date].try(:first).try(:total_amount) || 0
  }.merge categories
end
If you consider it unreadable you can do it in a less sophisticated way, i.e.:
(start.to_date..Time.zone.today).map do |date|
hash = {
:date => date,
:total_amount => total_by_day[date].try(:first).try(:total_amount) || 0
}
Category.all.each_with_index do |category, index|
hash[category.title] = category_sum_by_day[...]
end
hash
end
Another idea is to use Enumerable#reduce and adopt a more functional approach.
(start.to_date..Time.zone.today).map do |date|
  Category.all.each_with_index.reduce({
    :date => date,
    :total_amount => total_by_day[date].try(:first).try(:total_amount) || 0
  }) do |hash, (category, index)|
    hash.merge category.title => category_sum_by_day[...]
  end
end
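To try the loop-based variant outside Rails, here is a plain-Ruby sketch with stand-in data. Everything in it is an assumption made for illustration: chart_rows is a hypothetical name, the category titles are passed in as strings, and the per-day sums are plain hashes keyed by date rather than ActiveRecord results.

```ruby
# dates:            list of date-like keys
# titles:           category titles, in the same order as sums_by_category
# sums_by_category: one hash per category, mapping date => amount
# totals:           hash mapping date => total amount
def chart_rows(dates, titles, sums_by_category, totals)
  dates.map do |date|
    row = { :date => date, :total_amount => totals.fetch(date, 0) }
    titles.each_with_index do |title, index|
      # Missing dates fall back to 0, like the `|| 0` in the original helper.
      row[title] = sums_by_category[index].fetch(date, 0)
    end
    row
  end
end

rows = chart_rows(
  [1, 2],
  ['Food', 'Rent'],
  [{ 1 => 5 }, { 2 => 7 }],
  { 1 => 5, 2 => 7 }
)
p rows
```

The block passed to each_with_index only assigns into row, so it contains ordinary expressions and avoids the bare `key => value` syntax error.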