How do Ruby functions return a value even when there is nothing to return? - ruby

The code below converts the value stored under the provided key, in each hash of an array, from a JSON string to a hash if that value is not nil. This is demonstrated in example 1.
In example 2 the value for the provided key is nil, so no changes are made to the data. This is the behavior I want, but I can't understand why it happens. In example 2 the code never enters the if !hash[key].nil? branch, so I expected the function to return nil; instead it appears to return data_2. I understand that in Ruby a function returns the last evaluated expression. In example 2, what exactly is the last evaluated expression?
require 'json'
def convert(arr_of_hashes, key)
  arr_of_hashes.each do |hash|
    if !hash[key].nil?
      begin
        JSON.parse(hash[key])
      rescue JSON::ParserError => e
        raise "Bad"
      else
        hash[key] = JSON.parse(hash[key], {:symbolize_names => true})
      end
    end
  end
end
data_1 = [ { :key_1 => "Apple", :key_2 => "{\"one\":1, \"two\":2}", :key_3 => 200 }, { :key_1 => "Orange" } ]
data_2 = [ { :key_1 => "Apple", :key_2 => nil, :key_3 => 200 }, { :key_1 => "Orange" } ]
# Example 1
p convert(data_1, :key_2)
# [{:key_1=>"Apple", :key_2=>{:one=>1, :two=>2}, :key_3=>200}, {:key_1=>"Orange"}]
# Example 2
p convert(data_2, :key_4)
# [{:key_1=>"Apple", :key_2=>nil, :key_3=>200}, {:key_1=>"Orange"}]

Consider an extremely basic example:
irb(main):003:0> a = [1, 2, 3]
=> [1, 2, 3]
irb(main):004:0> a.each { |x| p x }
1
2
3
=> [1, 2, 3]
irb(main):005:0>
The #each method returns its receiver, here the array a.
If I wrap this in a method, the method returns the value of its last expression, which is the value of a.each: the array a itself.
irb(main):006:0> def foo(a)
irb(main):007:1> a.each { |x| puts x }
irb(main):008:1> end
=> :foo
irb(main):009:0> foo([1, 2, 3])
1
2
3
=> [1, 2, 3]
irb(main):010:0>
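Applied to the question's convert method, the same rule explains example 2: each hands back arr_of_hashes whether or not the block changes anything. Here is a minimal sketch of that, together with one way to make the return value explicit. The method names touch_nothing and touch_nothing_explicit are hypothetical and chosen only for illustration.

```ruby
# each returns its receiver even if the block body never does any work.
def touch_nothing(arr)
  arr.each do |hash|
    hash[:seen] = true if hash.key?(:missing) # never true for the data below
  end
end

# An explicit final expression replaces what each would have returned.
def touch_nothing_explicit(arr)
  arr.each { |hash| hash[:seen] = true if hash.key?(:missing) }
  nil
end

data = [{ key_1: "Apple" }, { key_1: "Orange" }]

result = touch_nothing(data)
p result.equal?(data)          # true: the very same array object comes back
p touch_nothing_explicit(data) # nil
p [].each { |x| raise x }      # []: the block is never called, yet each still returns its receiver
```

If the caller should be able to tell "nothing was converted" apart from "something was converted", ending the method with an explicit value is probably clearer than relying on what each returns.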

Related

How to find the largest value of a hash in an array of hashes

In my array, I'm trying to retrieve the key with the largest value of "value_2", so in this case, "B":
myArray = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
myArray.each do |array_hash|
array_hash.each do |key, value|
if value["value_2"] == array_hash.values.max
puts key
end
end
end
I get the error:
"comparison of Hash with Hash failed (ArgumentError)".
What am I missing?
The array given in the question actually holds a single hash with three keys. Assuming an array of single-key hashes was intended, it would generally be written:
arr = [{ "A" => { "value_1" => 30, "value_2" => 240 } },
{ "B" => { "value_1" => 40, "value_2" => 250 } },
{ "C" => { "value_1" => 18, "value_2" => 60 } }]
We can find the desired key as follows:
arr.max_by { |h| h.values.first["value_2"] }.keys.first
#=> "B"
See Enumerable#max_by. The steps are:
g = arr.max_by { |h| h.values.first["value_2"] }
#=> {"B"=>{"value_1"=>40, "value_2"=>250}}
a = g.keys
#=> ["B"]
a.first
#=> "B"
In calculating g, for
h = arr[0]
#=> {"A"=>{"value_1"=>30, "value_2"=>240}}
the block calculation is
a = h.values
#=> [{"value_1"=>30, "value_2"=>240}]
b = a.first
#=> {"value_1"=>30, "value_2"=>240}
b["value_2"]
#=> 240
Suppose now arr is as follows:
arr << { "D" => { "value_1" => 23, "value_2" => 250 } }
#=> [{"A"=>{"value_1"=>30, "value_2"=>240}},
# {"B"=>{"value_1"=>40, "value_2"=>250}},
# {"C"=>{"value_1"=>18, "value_2"=>60}},
# {"D"=>{"value_1"=>23, "value_2"=>250}}]
and we wish to return an array of all keys for which the value of "value_2" is maximum (["B", "D"]). We can obtain that as follows.
max_val = arr.map { |h| h.values.first["value_2"] }.max
#=> 250
arr.select { |h| h.values.first["value_2"] == max_val }.flat_map(&:keys)
#=> ["B", "D"]
flat_map(&:keys) is shorthand for:
flat_map { |h| h.keys }
which returns the same array as:
map { |h| h.keys.first }
See Enumerable#flat_map.
Code
p myArray.first.max_by{|k,v|v["value_2"]}.first
Output
"B"
I'd use:
my_array = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
h = Hash[*my_array]
# => {"A"=>{"value_1"=>30, "value_2"=>240},
# "B"=>{"value_1"=>40, "value_2"=>250},
# "C"=>{"value_1"=>18, "value_2"=>60}}
k = h.max_by { |k, v| v['value_2'] }.first # => "B"
Hash[*my_array] unpacks the array, which here holds a single hash, and returns that hash as a new Hash. Then max_by iterates over each key/value pair and returns a two-element array containing the key "B" and its sub-hash, making it easy to grab the key using first:
k = h.max_by { |k, v| v['value_2'] } # => ["B", {"value_1"=>40, "value_2"=>250}]
I guess the idea of your solution is to loop through each hash element and compare the maximum value found so far with value["value_2"].
But you are getting an error at
if value["value_2"] == array_hash.values.max
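because array_hash.values is an array of hashes, and max cannot compare one Hash with another. With a single element there is nothing to compare, so max just returns that hash: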
{"A"=>{"value_1"=>30, "value_2"=>240}}.values.max
#=> {"value_1"=>30, "value_2"=>240}
It should be like this:
max = nil
max_key = ""
myArray.each do |array_hash|
array_hash.each do |key, value|
if max.nil? || value["value_2"] > max
max = value["value_2"]
max_key = key
end
end
end
# max_key #=> "B"
Another solution, which assumes an array of single-key hashes rather than the single-hash array in the question:
myArray.map{ |h| h.transform_values{ |v| v["value_2"] } }.max_by{ |k| k.values }.keys.first
You asked "What am I missing?".
I think you are missing a proper understanding of the data structures that you are using. I suggest that you try printing the data structures and take a careful look at the results.
The simplest way is p myArray which gives:
[{"A"=>{"value_1"=>30, "value_2"=>240}, "B"=>{"value_1"=>40, "value_2"=>250}, "C"=>{"value_1"=>18, "value_2"=>60}}]
You can get prettier results using pp:
require 'pp'
pp myArray
yields:
[{"A"=>{"value_1"=>30, "value_2"=>240},
"B"=>{"value_1"=>40, "value_2"=>250},
"C"=>{"value_1"=>18, "value_2"=>60}}]
This helps you to see that myArray has only one element, a Hash.
You could also look at the expression array_hash.values.max inside the loop:
myArray.each do |array_hash|
p array_hash.values
end
gives:
[{"value_1"=>30, "value_2"=>240}, {"value_1"=>40, "value_2"=>250}, {"value_1"=>18, "value_2"=>60}]
Not what you expected? :-)
Given this, what would you expect to be returned by array_hash.values.max in the above loop?
Use p and/or pp liberally in your ruby code to help understand what's going on.
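The error from the question can be reproduced in isolation. Array#max compares elements with <=>, and hashes have no ordering, so calling max on two or more hashes raises ArgumentError. A small sketch, using two of the inner hashes from the question as sample data:

```ruby
values = [{ "value_1" => 30, "value_2" => 240 },
          { "value_1" => 40, "value_2" => 250 }]

begin
  values.max # tries to order one Hash against another
rescue ArgumentError => e
  puts e.message # comparison of Hash with Hash failed
end

# Comparing the numbers stored under "value_2" works, because Integers are comparable.
best = values.max_by { |v| v["value_2"] }
p best["value_2"] # 250
```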

Convert array into hash and add a counter value to the new hash

I have the following array of hashes:
[
{"BREAD" => {:price => 1.50, :discount => true }},
{"BREAD" => {:price => 1.50, :discount => true }},
{"MARMITE" => {:price => 1.60, :discount => false}}
]
And I would like to translate this array into a hash that includes the counts for each item:
Output:
{
"BREAD" => {:price => 1.50, :discount => true, :count => 2},
"MARMITE" => {:price => 1.60, :discount => false, :count => 1}
}
I have tried two approaches to translate the array into a hash.
new_cart = cart.inject(:merge)
hash = Hash[cart.collect { |item| [item, ""] } ]
Both work but then I am stumped at how to capture and pass the count value.
Expected output
{
"BREAD" => {:price => 1.50, :discount => true, :count => 2},
"MARMITE" => {:price => 1.60, :discount => false, :count => 1}
}
We are given the array:
arr = [
{"BREAD" => {:price => 1.50, :discount => true }},
{"BREAD" => {:price => 1.50, :discount => true }},
{"MARMITE" => {:price => 1.60, :discount => false}}
]
and make the assumption that each hash has a single key and if two hashes have the same (single) key, the value of that key is the same in both hashes.
The first step is to create an empty hash to which we will add key-value pairs:
h = {}
Now we loop through arr to build the hash h. I've added a puts statement to display intermediate values in the calculation.
arr.each do |g|
k, v = g.first
puts "k=#{k}, v=#{v}"
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count => 1 })
end
end
displays:
k=BREAD, v={:price=>1.5, :discount=>true}
k=BREAD, v={:price=>1.5, :discount=>true}
k=MARMITE, v={:price=>1.6, :discount=>false}
and returns:
#=> [{"BREAD" =>{:price=>1.5, :discount=>true}},
# {"BREAD" =>{:price=>1.5, :discount=>true}},
# {"MARMITE"=>{:price=>1.6, :discount=>false}}]
each always returns its receiver (here arr), which is not what we want.
h #=> {"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
is the result we need. See Hash#key? (aka, has_key?), Hash#[], Hash#[]= and Hash#merge.
Now let's wrap this in a method.
def hashify(arr)
h = {}
arr.each do |g|
k, v = g.first
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count=>1 })
end
end
h
end
hashify(arr)
#=> {"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
Rubyists would often use the method Enumerable#each_with_object to simplify.
def hashify(arr)
arr.each_with_object({}) do |g,h|
k, v = g.first
if h.key?(k)
h[k][:count] += 1
else
h[k] = v.merge({ :count => 1 })
end
end
end
Compare the two methods to identify their differences. See Enumerable#each_with_object.
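The main difference is what each call returns. A short sketch with simplified sample data (the variable names from_each and from_ewo are only illustrative):

```ruby
arr = [{ "BREAD" => { price: 1.5 } }, { "MARMITE" => { price: 1.6 } }]

# each: the block's work is a side effect on h; the receiver comes back.
h = {}
from_each = arr.each { |g| k, v = g.first; h[k] = v }
p from_each.equal?(arr) # true

# each_with_object: the object passed in (the memo) comes back,
# so no separate variable and no trailing "h" line are needed.
from_ewo = arr.each_with_object({}) { |g, memo| k, v = g.first; memo[k] = v }
p from_ewo == h # true
```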
When, as here, the keys are symbols, Ruby allows you to use the shorthand { count: 1 } for { :count=>1 }. Moreover, it permits you to write :count => 1 or count: 1 without the braces when the hash is an argument. For example,
{}.merge('cat'=>'meow', dog:'woof', :pig=>'oink')
#=> {"cat"=>"meow", :dog=>"woof", :pig=>"oink"}
It's probably more common to see the form count: 1 when keys are symbols and for the braces to be omitted when a hash is an argument.
Here's a further refinement you might see. First create
h = arr.group_by { |h| h.keys.first }
#=> {"BREAD" =>[{"BREAD"=>{:price=>1.5, :discount=>true}},
# {"BREAD"=>{:price=>1.5, :discount=>true}}],
# "MARMITE"=>[{"MARMITE"=>{:price=>1.6, :discount=>false}}]}
See Enumerable#group_by. Now convert the values (arrays) to their sizes:
counts = h.transform_values { |arr| arr.size }
#=> {"BREAD"=>2, "MARMITE"=>1}
which can be written in abbreviated form:
counts = h.transform_values(&:size)
#=> {"BREAD"=>2, "MARMITE"=>1}
See Hash#transform_values. We can now write:
uniq_arr = arr.uniq
#=> [{"BREAD"=>{:price=>1.5, :discount=>true}},
#    {"MARMITE"=>{:price=>1.6, :discount=>false}}]
uniq_arr.each_with_object({}) do |g,h|
puts "g=#{g}"
k,v = g.first
puts " k=#{k}, v=#{v}"
h[k] = v.merge(count: counts[k])
puts " h=#{h}"
end
which displays:
g={"BREAD"=>{:price=>1.5, :discount=>true}}
 k=BREAD, v={:price=>1.5, :discount=>true}
 h={"BREAD"=>{:price=>1.5, :discount=>true, :count=>2}}
g={"MARMITE"=>{:price=>1.6, :discount=>false}}
 k=MARMITE, v={:price=>1.6, :discount=>false}
 h={"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
 "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
and returns:
#=> {"BREAD"=>{:price=>1.5, :discount=>true, :count=>2},
# "MARMITE"=>{:price=>1.6, :discount=>false, :count=>1}}
See Array#uniq.
This did the trick:
arr = [
{ bread: { price: 1.50, discount: true } },
{ bread: { price: 1.50, discount: true } },
{ marmite: { price: 1.60, discount: false } }
]
Get the count for each occurrence of hash, add as key value pair and store:
h = arr.uniq.each { |x| x[x.first.first][:count] = arr.count(x) }
Then convert hashes into arrays, flatten to a single array then construct a hash:
Hash[*h.collect(&:to_a).flatten]
#=> {:bread=>{:price=>1.50, :discount=>true, :count=>2}, :marmite=>{:price=>1.60, :discount=>false, :count=>1}}
Combined a couple of nice ideas from here:
https://raycodingdotnet.wordpress.com/2013/08/05/array-of-hashes-into-single-hash-in-ruby/
and here:
http://carol-nichols.com/2015/08/07/ruby-occurrence-couting/

Most performant way to group/summarise two hashes?

I have two hashes with some data that I need to aggregate. The first one is a mapping of which ids (id_1, id_2, id_3, id_4) belong under what category (a, b, c):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
The second hash holds values of how many events happened per id for a given date (date_1, date_2, date_3):
hash_2 = {
'id_1' => {'date_1' => 5, 'date_2' => 6, 'date_3' => 8},
'id_2' => {'date_1' => 0, 'date_3' => 6},
'id_3' => {'date_1' => 0, 'date_2' => nil, 'date_3' => 1},
'id_4' => {'date_1' => 10, 'date_2' => 1}
}
What I want is to get the total number of events per category (a, b, c). For the above example, the result would look something like:
hash_3 = {'a' => (5+6+8+0+6), 'b' => (0+0+1), 'c' => (10+1)}
My problem is that there are about 5000 categories, each typically pointing to 1 to 3 ids, and each id has event counts for 30 dates or more, so this takes quite a bit of computation. What would be the most performant (time-efficient) way to do this grouping in Ruby?
update
This is what I tried so far (took like 6-8 seconds!, horribly slow):
def total_clicks_per_category
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
end
def total_event_per_ids(ids)
ids.reduce(0) do |memo, id|
events = hash_2.fetch(id, {})
memo + (events.values.reduce(:+) || 0)
end
end
P.S. I’m using Ruby 2.3.
I'm writing this on a phone so I cannot test right now, but it looks OK. Note that Array#sum needs Ruby 2.4 or later; on Ruby 2.3 replace each sum with inject(0, :+).
g = hash_2.each_with_object({}) { |(k,v),g| g[k] = v.values.compact.sum }
hash_3 = hash_1.each_with_object({}) { |(k,v),h| h[k] = g.values_at(*v).sum }
First, create an intermediate hash that holds the per-id sums from hash_2 (compact drops the nil counts, which would otherwise raise a TypeError):
hash_4 = hash_2.map{|k, v| [k, v.values.compact.inject(0, :+)]}.to_h
# => {"id_1"=>19, "id_2"=>6, "id_3"=>1, "id_4"=>11}
Then do the final summation:
hash_3 = hash_1.map{|k, v| [k, v.map{|k| hash_4[k]}.inject(:+)]}.to_h
# => {"a"=>25, "b"=>1, "c"=>11}
Theory
5000*3*30 isn't that many. Ruby probably will need a second at most for this kind of job.
Hash lookup is fast by default, you won't be able to optimize much.
You could pre-calculate hash_2_sum, though :
hash_2_sum = {
'id_1' => 5+6+8,
'id_2' => 0+6,
'id_3' => 0+0+1,
'id_4' => 10+1
}
A loop on hash_1 with a hash_2_sum lookup, and you're done.
Code
Your example has been updated with some nil values. You need to remove them with compact, and make sure the sum is 0 when no element is found with inject(0, :+):
hash_1 = {'a' => ['id_1','id_2'], 'b' => ['id_3'], 'c' => ['id_4']}
hash_2 = {
'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
'id_2' => { 'date_1' => 0, 'date_3' => 6 },
'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 },
'id_4' => { 'date_1' => 10, 'date_2' => 1 }
}
hash_2_sum = hash_2.each_with_object({}) do |(key, dates), sum|
sum[key] = dates.values.compact.inject(0, :+)
end
hash_3 = hash_1.each_with_object({}) do |(key, ids), sum|
sum[key] = hash_2_sum.values_at(*ids).inject(0, :+)
end
# {"a"=>25, "b"=>1, "c"=>11}
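To check the "a second at most" estimate on your own machine, the two steps can be wrapped in a method and timed against synthetic data of roughly the size described in the question. This is only a sketch: the method name totals_per_category and the generated category and id names are assumptions, and the extra compact after values_at is an addition that treats ids missing from hash_2 as 0. It sticks to methods available in Ruby 2.3.

```ruby
# Sum events per category. nil counts and ids missing from events_by_id count as 0.
def totals_per_category(categories, events_by_id)
  sums = events_by_id.each_with_object({}) do |(id, dates), acc|
    acc[id] = dates.values.compact.inject(0, :+)
  end
  categories.each_with_object({}) do |(cat, ids), acc|
    acc[cat] = sums.values_at(*ids).compact.inject(0, :+)
  end
end

hash_1 = { 'a' => ['id_1', 'id_2'], 'b' => ['id_3'], 'c' => ['id_4'] }
hash_2 = {
  'id_1' => { 'date_1' => 5, 'date_2' => 6, 'date_3' => 8 },
  'id_2' => { 'date_1' => 0, 'date_3' => 6 },
  'id_3' => { 'date_1' => 0, 'date_2' => nil, 'date_3' => 1 },
  'id_4' => { 'date_1' => 10, 'date_2' => 1 }
}
hash_3 = totals_per_category(hash_1, hash_2)
# hash_3 == {"a"=>25, "b"=>1, "c"=>11}

# Synthetic data: 5000 categories, 3 ids each, 30 dates per id.
big_1 = (1..5000).each_with_object({}) do |i, h|
  h["cat_#{i}"] = ["id_#{i}_a", "id_#{i}_b", "id_#{i}_c"]
end
big_2 = big_1.values.flatten.each_with_object({}) do |id, h|
  h[id] = (1..30).each_with_object({}) { |d, dates| dates["date_#{d}"] = d }
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
big_3 = totals_per_category(big_1, big_2)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts format("%d categories summed in %.3f s", big_3.size, elapsed)
```

The timing will vary by machine and Ruby version, so no figure is quoted here.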
Note
{}.tap do |res|
hash_1.each do |cat, ids|
res[cat] = total_event_per_ids(ids)
end
end
isn't very readable IMHO.
You can either use each_with_object or Array#to_h :
result = [1, 2, 3].each_with_object({}) do |i, hash|
hash[i] = i * i
end
#=> {1=>1, 2=>4, 3=>9}
result = [1, 2, 3].map { |i| [i, i * i] }.to_h
#=> {1=>1, 2=>4, 3=>9}

JSON with symbols and strings not readable

I have the following JSON:
{ :a => 1, "b" => "test" }
jsonObject[:b] does not give me any data, whereas for a JSON with all keys as strings,
{ "a" => 1, "b" => "test" }
it works fine:
jsonObject[:b] # => "test"
Is there a constraint against using a symbol key and a string key in the same JSON object?
I suggest parsing JSON into a Hash before using it, like
require 'json'
JSON.parse("{...}")
and converting a hash to a JSON string with
hash.to_json
In the resulting JSON, all keys, whether they were symbols or strings, are converted into strings.
require 'json'
a = {:a => '12', 'b' => '23'}
p aa = a.to_json #=> "{\"a\":\"12\",\"b\":\"23\"}"
p JSON.parse(aa) #=> {"a"=>"12", "b"=>"23"}
It is possible that you are sometimes dealing with a plain Hash and sometimes with a HashWithIndifferentAccess. Rails' params, for example, allow indifferent access by default. This might explain your confusion:
hash = { :a => 1, 'b' => 2 }
hash[:a]
#=> 1
hash['b']
#=> 2
hash[:b]
#=> nil
But with a HashWithIndifferentAccess (provided by ActiveSupport, so available in Rails):
hash = hash.with_indifferent_access
hash[:a]
#=> 1
hash['b']
#=> 2
hash[:b]
#=> 2
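Outside Rails, with_indifferent_access is not available on a plain Hash. One option in plain Ruby is to normalize all keys to a single type before looking anything up. The sketch below assumes a hypothetical helper named stringify_keys_plain that only handles top-level keys; JSON.parse with symbolize_names is part of the standard json library.

```ruby
require 'json'

# Hypothetical helper: copy a hash with every top-level key converted to a String.
def stringify_keys_plain(hash)
  hash.each_with_object({}) { |(k, v), out| out[k.to_s] = v }
end

mixed = { :a => 1, "b" => "test" }
plain = stringify_keys_plain(mixed)
p plain["a"] # 1
p plain["b"] # "test"
p mixed[:b]  # nil: :b and "b" are different keys in a plain Hash

# Round-tripping through JSON yields String keys by default;
# symbolize_names: true turns them into Symbols instead.
symbols = JSON.parse(mixed.to_json, symbolize_names: true)
p symbols[:b] # "test"
```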

A hash-like object that acts like a case statement

What is the best way to construct a hash-like class Case, which is initialized by a hash:
cs = Case.new(:a => 1, /b/ => 2, /c/ => 2, /d/ => 3)
and has a method Case#[] that looks up the first matching key by === (like a case statement) instead of by == (like a conventional hash) and returns the value:
cs["xxb"] # => 2
Here's a possibility.
class Case
  def initialize(h)
    @h = h
  end
  def [](key,order=:PRE)
    case order
    when :PRE
      @h[@h.keys.find { |k| key === k }]
    when :POST
      @h[@h.keys.find { |k| k === key }]
    else
      # raise exception
    end
  end
end
cs = Case.new(:a => 1, /b/ => 2, /c/ => 2, [1,2] => "cat", /d/ => 3)
cs["xxb"] #=> nil
cs["xxb",:POST] #=> 2
cs[Regexp] #=> 2
cs[Regexp,:POST] #=> nil
cs[Array] #=> "cat"
cs[Symbol] #=> 1
This assumes h does not have a key nil.
With the understanding that the key in the hash is to come on the left side of ===, the code would be:
class Case
  def initialize(h) @h = h end
  def [](key) @h[@h.keys.find{|k| k === key}] end
end
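A possible variation, sketched here under the same "hash key on the left of ===" convention, searches the key/value pairs directly instead of the keys. That removes the caveat about a nil key, because a failed search is detected before the value is read. The nil => "none" entry is added sample data to show this.

```ruby
class Case
  def initialize(h)
    @h = h
  end

  # Return the value of the first entry whose key satisfies key === arg, or nil if none does.
  def [](arg)
    pair = @h.find { |k, _v| k === arg }
    pair && pair.last
  end
end

cs = Case.new(:a => 1, /b/ => 2, /c/ => 2, /d/ => 3, nil => "none")
p cs["xxb"] # 2
p cs[:a]    # 1
p cs[nil]   # "none"
p cs["zzz"] # nil
```

Because hashes preserve insertion order, "first matching key" means the first one written in the hash literal, just as the first matching when clause wins in a case statement.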
