Cleanest way to count multiple values in an array of hashes? - ruby

I have an array of hashes like this:
data = [
{group: "A", result: 1},
{group: "B", result: 1},
{group: "A", result: 0},
{group: "A", result: 1},
{group: "B", result: 1},
{group: "B", result: 1},
{group: "B", result: 0},
{group: "B", result: 0}
]
The group will only be either A or B, and the result will only be 1 or 0. I want to count how many times the result is 0 or 1 for each group, i.e., to get a tally like so:
A: result is "1" 2 times
result is "0" 1 time
B: result is "1" 3 times
result is "0" 2 times
I am thinking of storing the actual results in a nested hash, like:
{ a: { pass: 2, fail: 1 }, b: { pass: 3, fail: 2 } }
but this might not be the best way, so I'm open to other ideas here.
What would be the cleanest way to do this in Ruby while iterating over the data only once? Using data.inject or data.count somehow?

stats = Hash[data.group_by{|h| [h[:group], h[:result]] }.map{|k,v| [k, v.count] }]
#=> {["A", 1]=>2, ["B", 1]=>3, ["A", 0]=>1, ["B", 0]=>2}
I'll leave the transformation to the desired format up to you ;-)
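For instance, one possible way to finish that transformation (a sketch; the lower-cased symbol keys and the :pass/:fail labels are taken from the format suggested in the question):
stats.each_with_object({}) do |((group, result), count), out|
  # create the group's bucket on first sight, then file the count under pass or fail
  bucket = (out[group.downcase.to_sym] ||= { pass: 0, fail: 0 })
  bucket[result == 1 ? :pass : :fail] = count
end
#=> {:a=>{:pass=>2, :fail=>1}, :b=>{:pass=>3, :fail=>2}}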

This way would go over the data only one time:
result = Hash.new { |h, k| h[k] = { pass: 0, fail: 0 }}
data.each do |item|
result[item[:group]][item[:result] == 0 ? :fail : :pass] += 1
end
result
# => {"A"=>{:pass=>2, :fail=>1}, "B"=>{:pass=>3, :fail=>2}}

You could use the form of Hash#update (same as Hash#merge!) that takes a block to determine the values of keys that are contained in both hashes being merged:
data.map(&:values).each_with_object({}) { |(g,r),h|
h.update({g.to_sym=>{pass: r, fail: 1-r } }) { |_,oh,nh|
{ pass: oh[:pass]+nh[:pass], fail: oh[:fail]+nh[:fail] } } }
#=> {:A=>{:pass=>2, :fail=>1}, :B=>{:pass=>3, :fail=>2}}
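For reference, the intermediate array being iterated there is just the hash values of the sample data:
data.map(&:values)
#=> [["A", 1], ["B", 1], ["A", 0], ["A", 1], ["B", 1], ["B", 1], ["B", 0], ["B", 0]]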

If that is truly your desired output, then something like this would work:
def pass_fail_hash(a=[],statuses=[:pass,:fail])
a.map(&:dup).group_by{|h| h.shift.pop.downcase.to_sym}.each_with_object({}) do |(k,v),obj|
obj[k] = Hash[statuses.zip(v.group_by{|v| v[:result]}.map{|k,v| v.count})]
statuses.each {|status| obj[k][status] ||= 0 }
end
end
Then
pass_fail_hash data
#=> {:a=>{:pass=>2, :fail=>1}, :b=>{:pass=>3, :fail=>2}}
Thank you to @CarySwoveland for pointing out that my original method did not take into account cases where there were no passing or failing values. This has now been resolved, so that a hash array like [{ group: "A", result: 1 }] will now show {a:{:pass => 1, :fail => 0}} where it would previously have been {a:{:pass => 1, :fail => nil}}.
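A quick check of that edge case:
pass_fail_hash [{ group: "A", result: 1 }]
#=> {:a=>{:pass=>1, :fail=>0}}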

Related

Ruby select latest duplicated values from an array of hashes

Let's say I have this kind of array:
a = [
{key: "cat", value: 1},
{key: "dog", value: 2},
{key: "mouse", value: 5},
{key: "rat", value: 3},
{key: "cat", value: 5},
{key: "rat", value: 2},
{key: "cat", value: 1},
{key: "cat", value: 1}
]
Given this array, I want to get only the latest values found for "cat".
I know how to select all of them, like this:
a.select do |e|
e[:key] == "cat"
end
But I'm looking for a way to get just the last 3 of them.
The desired result would be:
[
{key: "cat", value: 5},
{key: "cat", value: 1},
{key: "cat", value: 1}
]
thanks!
In a comment on the question, @Stefan suggested:
a.select { |e| e[:key] == "cat" }.last(3)
Provided a is not too large, that is likely what you should use. However, if a is large, and especially if it contains many elements (hashes) h for which h[:key] #=> "cat", it likely would be more efficient to iterate backwards from the end of the array and terminate ("short-circuit") as soon as three elements h have been found for which h[:key] #=> "cat". This also avoids the construction of a potentially large temporary array (a.select { |e| e[:key] == "cat" }).
One way to do that is as follows.
a.reverse_each.with_object([]) do |h,arr|
arr.insert(0,h) if h[:key] == "cat"
break arr if arr.size == 3
end
#=> [{:key=>"cat", :value=>5},
# {:key=>"cat", :value=>1},
# {:key=>"cat", :value=>1}]
See Array#reverse_each, Enumerator#with_object and Array#insert. Note that because reverse_each and with_object both return enumerators, chaining them produces an enumerator as well:
a.reverse_each.with_object([])
#=> #<Enumerator: #<Enumerator: [{:key=>"cat", :value=>1},
# ...
# {:key=>"cat", :value=>1}]:reverse_each>:with_object([])>
It might be ever-so-slightly faster to replace the block calculation with
arr << h if h[:key] == "cat"
break arr.reverse if arr.size == 3
If a contains fewer than three elements h for which h[:key] #=> "cat", an array arr will be returned for which arr.size < 3. It therefore is necessary to confirm that the array returned contains three elements.
This check must also be performed when @Stefan's suggested code is used, as (for example)
a.select { |e| e[:key] == "cat" }.last(99)
#=> [{:key=>"cat", :value=>1},
# {:key=>"cat", :value=>5},
# {:key=>"cat", :value=>1},
# {:key=>"cat", :value=>1}]

Creating a Ruby Hash Map in a Functional Way

I have an array I want to turn into a hash map keyed by the item and with an array of indices as the value. For example
arr = ["a", "b", "c", "a"]
would become
hsh = {"a": [0,3], "b": [1], "c": [2]}
I would like to do this in a functional way (rather than a big old for loop), but am a little stuck
lst = arr.collect.with_index { |item, i| [item, i] }
produces
[["a", 0], ["b", 1], ["c", 2], ["a", 3]]
I then tried Hash[lst], but I don't get the array in the value and lose index 0
{"a"=>3, "b"=>1, "c"=>2}
How can I get my desired output in a functional way? I feel like it's something like
Hash[arr.collect.with_index { |item, i| [item, item[i] << i || [i] }]
But that doesn't yield anything.
Note: Trying to not do it this way
hsh = {}
arr.each.with_index do |item, index|
if hsh.has_key?(item)
hsh[item] << index
else
hsh[item] = [index]
end
end
hsh
Input
arr = ["a", "b", "c", "a"]
Code
p arr.map
.with_index
.group_by(&:first)
.transform_values { |arr| arr.map(&:last) }
Output
{"a"=>[0, 3], "b"=>[1], "c"=>[2]}
I would like to do this in a functional way (rather than a big old for loop), but am a little stuck
lst = arr.collect.with_index { |item, i| [item, i] }
produces
[["a", 0], ["b", 1], ["c", 2], ["a", 3]]
This is very close. The first thing I would do is change the inner arrays to hashes:
arr.collect.with_index { |item, i| { item => i }}
#=> [{ "a" => 0 }, { "b" => 1 }, { "c" => 2 }, { "a" => 3 }]
This is one step closer. Now, actually we want the indices in arrays:
arr.collect.with_index { |item, i| { item => [i] }}
#=> [{ "a" => [0] }, { "b" => [1] }, { "c" => [2] }, { "a" => [3] }]
This is even closer. Now, all we need to do is to merge those hashes into one single hash. There is a method for that, which is called Hash#merge. It takes an optional block for deconflicting duplicate keys, and all we need to do is concatenate the arrays:
arr.collect.with_index { |item, i| { item => [i] }}.inject({}) {|acc, h| acc.merge(h) {|_, a, b| a + b } }
#=> { "a" => [0, 3], "b" => [1], "c" => [2] }
And we're done!
How can I get my desired output in a functional way? I feel like it's something like
Hash[arr.collect.with_index { |item, i| [item, item[i] << i || [i] }]
But that doesn't yield anything.
Well, it has a SyntaxError, so obviously if it cannot even be parsed, then it cannot run, and if it doesn't even run, then it cannot possibly yield anything.
However, note that even if it worked, it would still violate your constraint that it should be done "in a functional way", because Array#<< mutates its receiver and is thus not functional.
arr.map.with_index.each_with_object({}){ |(a, i), h| h[a] ? h[a] << i : (h[a] = [i]) }
#=> {"a"=>[0, 3], "b"=>[1], "c"=>[2]}
arr.map.with_index => gives an enumeration of each element with its index
each_with_object => lets you reduce the enumeration onto a provided object (represented by h above)
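Combining the two ideas above, a default-block hash avoids the ternary; note it still mutates the accumulator via <<, so it is only "functional-ish" (a sketch, not one of the original answers):
arr.each_with_index.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(item, i), h|
  h[item] << i   # first access of h[item] creates the empty array
end
#=> {"a"=>[0, 3], "b"=>[1], "c"=>[2]}
# caveat: the returned hash keeps its default block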

Merging/adding hashes in array with same key

I have an array of hashes:
array = [
{"points": 0, "block": 3},
{"points": 25, "block": 8},
{"points": 65, "block": 4}
]
I need to merge the hashes. I need the output to be:
{"points": 90, "block": 15}
You could merge the hashes together, adding the values that are in both hashes:
result = array.reduce do |memo, next_hash|
memo.merge(next_hash) do |key, memo_value, next_hash_value|
memo_value + next_hash_value
end
end
result # => {:points=>90, :block=>15}
and if your real hash has keys whose values don't respond well to +, you have access to the key in the block, so you could set up a case statement to handle those keys differently, if needed.
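For example, a sketch of that idea, assuming a hypothetical :label key whose values should not be summed:
result = array.reduce do |memo, next_hash|
  memo.merge(next_hash) do |key, memo_value, next_hash_value|
    case key
    when :label then next_hash_value      # hypothetical key: keep the latest value
    else memo_value + next_hash_value     # numeric values: add them up
    end
  end
end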
If you have the array as you mentioned in this structure:
array = [
{"points": 0, "block": 3},
{"points": 25, "block": 8},
{"points": 65, "block": 4}
]
You can use the following code to achieve your goal:
result = {
points: array.map{ |item| item[:points] }.inject(:+),
block: array.map{ |item| item[:block] }.inject(:+)
}
You will get this result:
{:points=>90, :block=>15}
Note: This will iterate twice over the array. I'm trying to figure out a better way to iterate only once and still have the same elegant/easy-to-read code.
If you want to do it more generically (more keys than :points and :block), then you can use this code:
array = [
{"points": 0, "block": 3},
{"points": 25, "block": 8},
{"points": 65, "block": 4}
]
keys = [:points, :block] # or you can make it generic with array.first.keys
result = keys.map do |key|
[key, array.map{ |item| item.fetch(key, 0) }.inject(:+)]
end.to_h
You can create a method as below to get the result:
def process_array(array)
points = array.map{|h| h[:points]}
block = array.map{|h| h[:block]}
result = {}
result['points'] = points.inject{ |sum, x| sum + x }
result['block'] = block.inject{ |sum, x| sum + x }
result
end
Calling the method with the array as input will give you the expected result:
[54] pry(main)> process_array(array)
=> {"points"=>90, "block"=>15}
You can also use Enumerable#each_with_object, using a hash as the object:
result = array.each_with_object(Hash.new(0)) {|e, h| h[:points] += e[:points]; h[:block] += e[:block] }
# => {:points=>90, :block=>15}
Hash.new(0) initialises the hash with a default value of 0 for missing keys, for example:
h = Hash.new(0)
h[:whatever_key] # => 0
I was interested in how the reduce method introduced by "Simple Lime" worked and also how it would benchmark against simple iteration over the array and over the keys of each hash.
Here is the code of the "iteration" approach:
Hash.new(0).tap do |result|
array.each do |hash|
hash.each do |key, val|
result[key] = result[key] + val
end
end
end
I was surprised that the "iteration" code performed 3 times better than the reduce approach.
Here is the benchmark code https://gist.github.com/landovsky/6a1b29cbf13d0cf81bad12b6ba472416
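For reference, a minimal sketch of such a comparison using the stdlib Benchmark module (the gist above contains the author's actual benchmark; the input size here is an arbitrary assumption):
require 'benchmark'

big = array * 50_000   # hypothetical larger input for timing

Benchmark.bm(12) do |x|
  x.report("reduce")    { big.reduce { |m, h| m.merge(h) { |_k, a, b| a + b } } }
  x.report("iteration") do
    Hash.new(0).tap do |result|
      big.each { |hash| hash.each { |key, val| result[key] += val } }
    end
  end
end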

Join an array with a block natively

Is there a native way to join all elements of an array into a single element, like so:
[
{a: "a"},
{b: "b"}
].join do | x, y |
x.merge(y)
end
To output something like:
{
a: "a",
b: "b"
}
The fact that I used hashes in my array is just an example; I could also say:
[
0,
1,
2,
3
].join do | x, y |
x + y
end
to end up with 6 as the value.
Enumerable#inject covers both of these cases:
a = [{a: "a"}, {b: "b"}]
a.inject(:merge) #=> {:a=>"a", :b=>"b"}
b = [0, 1, 2, 3]
b.inject(:+) #=> 6
inject "sums" an array using the provided method. In the first case, the "addition" of the sum and the current element is done by merging, and in the second case, through addition.
If the array is empty, inject returns nil. To make it return something else, specify an initial value (thanks @Hellfar):
[].inject(0, :+) #=> 0
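The same applies to the hash case (a small illustrative addition):
[].inject({}, :merge) #=> {}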
[
{a: "a"},
{b: "b"}
].inject({}){|sum, e| sum.merge e}

Ruby sort array of hashes by child-parent relation

So we have an array of hashes:
array = [
{id: 1, parent_id: 0},
{id: 2, parent_id: 1},
{id: 3, parent_id: 0},
{id: 4, parent_id: 2}
]
target_array = []
What is the most efficient and Ruby-like way to map/sort that array into the following result:
target_array = [
{id:1,children:
[{id: 2, children: [
{id:4, children:[]}]}]},
{id: 3, children:[]}
]
P.S. The most I am capable of is iterating over the whole thing for each item and excluding from the array any hash that is already mapped to target_array.
You can solve this with recursion:
@array = [
{id: 1, parent_id: 0},
{id: 2, parent_id: 1},
{id: 3, parent_id: 0},
{id: 4, parent_id: 2}
]
def build_hierarchy target_array, n
@array.select { |h| h[:parent_id] == n }.each do |h|
target_array << {id: h[:id], children: build_hierarchy([], h[:id])}
end
target_array
end
build_hierarchy [], 0
Output:
=> [{:id=>1, :children=>[{:id=>2, :children=>[{:id=>4, :children=>[]}]}]}, {:id=>3, :children=>[]}]
Live example in this ruby fiddle http://rubyfiddle.com/riddles/9b643
I would use recursion, but the following could easily be converted to a non-recursive method.
First construct a hash linking parents to their children (p2c). For this, use the form of Hash#update (aka merge!) that uses a block to determine the values of keys that are present in both hashes being merged:
@p2c = array.each_with_object({}) { |g,h|
h.update(g[:parent_id]=>[g[:id]]) { |_,ov,nv| ov+nv } }
#=> {0=>[1, 3], 1=>[2], 2=>[4]}
There are many other ways to construct this hash. Here's another:
@p2c = Hash[array.group_by { |h| h[:parent_id] }
.map { |k,v| [k, v.map { |g| g[:id] }] }]
Now construct a recursive method whose lone argument is a parent:
def family_tree(p=0)
return [{ id: p, children: [] }] unless @p2c.key?(p)
@p2c[p].each_with_object([]) { |c,a|
a << { id:c, children: family_tree(c) } }
end
We obtain:
family_tree
#=> [ { :id=>1, :children=>
# [
# { :id=>2, :children=>
# [
# { :id=>4, :children=>[] }
# ]
# }
# ]
# },
# { :id=>3, :children=>[] }
# ]
Constructing the hash @p2c initially should make it quite efficient.
This is what I tried, using a Hash:
array = [
{id: 1, parent_id: 0},
{id: 2, parent_id: 1},
{id: 3, parent_id: 0},
{id: 4, parent_id: 2}
]
target_hash = Hash.new { |h,k| h[k] = { id: nil, children: [ ] } }
array.each do |n|
id, parent_id = n.values_at(:id, :parent_id)
target_hash[id][:id] = n[:id]
target_hash[parent_id][:children].push(target_hash[id])
end
puts target_hash[0]
Output:
{:id=>nil, :children=>[{:id=>1, :children=>[{:id=>2, :children=>[{:id=>4, :children=>[]}]}]}, {:id=>3, :children=>[]}]}
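If only the array from the question is wanted (without the synthetic root entry), one can take just the root's children (a small addition, not part of the original answer):
target_hash[0][:children]
#=> [{:id=>1, :children=>[{:id=>2, :children=>[{:id=>4, :children=>[]}]}]}, {:id=>3, :children=>[]}]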
I think the best one will have O(n log n) time complexity at most. I'm giving my non-hash one:
array = [
{id: 1, parent_id: 0},
{id: 2, parent_id: 1},
{id: 3, parent_id: 0},
{id: 4, parent_id: 2}
]
# This takes O(n log n).
array.sort! do |a, b|
k = (b[:parent_id] <=> a[:parent_id])
k == 0 ? b[:id] <=> a[:id] : k
end
# This takes O(n)
target_array = array.map do |node|
{ id: node[:id], children: [] }
end
# This takes O(n log n)
target_array.each_with_index do |node, index|
parent = target_array[index + 1...target_array.size].bsearch do |target_node|
target_node[:id] == array[index][:parent_id]
end
if parent
parent[:children] << node
target_array[index] = nil
end
end
# O(n)
target_array.reverse.compact
# => [{:id=>1, :children=>[{:id=>2, :children=>[{:id=>4, :children=>[]}]}]},
#     {:id=>3, :children=>[]}]
So mine uses O(n log n) in general.
By the way, when I simply tested the existing solutions I found Gagan Gami's to be the most efficient (slightly ahead of mine); I believe it is O(n log n) too, though that is not obvious. The currently accepted solution, however, takes O(n^2) time.
