How to simplify hash intersection with array - ruby

I have hash data (originally in json) and an array of selected hash keys:
jsondata ='{"1":
{"name": "Tax Exempt"},
"2":
{"name": "Tax on Purchases"},
"3":
{"name": "Tax on Sales"},
"4":
{"name": "Service Tax"}
}'
require 'json'
parseddata = JSON.parse(jsondata)
selectedtax = ["2","3"]
My code iterates over the hash and collects the values whose keys exist in the array. Here is the code:
selectedtaxdetails = Array.new
parseddata.map do |key,value|
if selectedtax.include? key
selectedtaxdetails << value
end
end
Output of selectedtaxdetails is:
[{"name"=>"Tax on Purchases"}, {"name"=>"Tax on Sales"}]
How can I improve my code?

Solution:
You can do (Rails, or plain Ruby 2.5+, where Hash#slice is built in):
parseddata.slice(*selectedtax).values
or even simpler (any Ruby version):
parseddata.values_at(*selectedtax)
Explanation:
Both slice and values_at expect a list of keys. If you just pass an array, they will look for an entry whose key is that whole array, which is obviously not what you want. Instead you can use the splat operator (*). It takes each element of the array and passes it to the method as a separate argument, which is exactly what we want here.
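To make the effect of the splat concrete, here is a minimal sketch. The data is a shortened copy of the question's hash, and the variable names parsed and selected are illustrative only.

```ruby
require 'json'

# Shortened sample data mirroring the question (names are illustrative).
parsed = JSON.parse('{"1": {"name": "Tax Exempt"}, "2": {"name": "Tax on Purchases"}, "3": {"name": "Tax on Sales"}}')
selected = ["2", "3"]

# Without the splat, the whole array is looked up as one key, which does not exist.
without_splat = parsed.values_at(selected)   # one lookup, one nil

# With the splat, each element of the array becomes its own argument.
with_splat = parsed.values_at(*selected)     # two lookups, two hashes
```

Here without_splat is [nil], while with_splat holds the two selected tax hashes in the order given by selected.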
Update:
To achieve the structure [{"code":"2", "name": "Tax on Purchases"},{"code":"3", "name": "Tax on Sales"}], you can do (Rails, or Ruby 2.5+):
parseddata.slice(*selectedtax).map {|key, value| value.dup.tap {|h| h['code'] = key}}
or with pure ruby:
parseddata.select{|key,_| selectedtax.include? key}.map {|key, value| value.dup.tap {|h| h['code'] = key}}

The following should do the same in one line:
parseddata.values_at(*selectedtax)
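One difference between the two suggestions only shows up when a selected key is missing from the hash. The sketch below assumes Ruby 2.5+ (or ActiveSupport) for Hash#slice; the key "9" is a made-up example of a key that is not present.

```ruby
parsed = { "1" => { "name" => "Tax Exempt" }, "2" => { "name" => "Tax on Purchases" } }
selected = ["2", "9"]   # "9" is deliberately not a key of parsed

# values_at keeps a nil placeholder for every key it cannot find.
via_values_at = parsed.values_at(*selected)

# slice silently drops missing keys (Hash#slice needs Ruby 2.5+ or ActiveSupport).
via_slice = parsed.slice(*selected).values

# compact removes the nil placeholders if they are unwanted.
via_compact = parsed.values_at(*selected).compact
```

If every selected key is guaranteed to exist, the two forms return the same array.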

Related

Merge an array of hashes by key-value pair

I have an array of hashes as follows:
[
{'abc_id'=>'1234', 'def_id'=>[]},
{'abc_id'=>'5678', 'def_id'=>['11', '22']},
{'abc_id'=>'1234', 'def_id'=>['33', '44']},
{'abc_id'=>'5678', 'def_id'=>['55', '66']}
]
I'm trying to combine multiple hashes with the same key-value pair into one hash. Thus, we have two pairs with the same value for 'abc_id' key as follows:
{'abc_id'=>'1234', 'def_id'=>[]} and {'abc_id'=>'1234', 'def_id'=>['33', '44']}
{'abc_id'=>'5678', 'def_id'=>['11', '22']} and {'abc_id'=>'5678', 'def_id'=>['55', '66']}
I'm expecting multiple hashes with the same key-value pairs to be merged into one individual hash. For the two pairs above, they should be respectively:
{'abc_id'=>'1234', 'def_id'=>['33', '44']}
{'abc_id'=>'5678', 'def_id'=>['11', '22', '55', '66']}
The more-or-less generic and extendable variant would be:
input.
  group_by { |h| h['abc_id'] }.
  map do |k, v|
    v.reduce do |acc, arr|
      # use `v1 + v2` instead of `v1 | v2` to keep duplicates
      acc.merge(arr) { |_, v1, v2| Array === v1 ? v1 | v2 : v1 }
    end
  end
#⇒ [{"abc_id"=>"1234", "def_id"=>["33", "44"]},
# {"abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"]}]
One option more:
array
.map.with_object({}) { |h, hh| hh[h['abc_id']].nil? ? hh[h['abc_id']] = h['def_id'] : hh[h['abc_id']] += h['def_id'] }
.map{ |k, v| {'abc_id' => k, 'def_id' => v} }
The first part returns
# {"1234"=>["33", "44"], "5678"=>["11", "22", "55", "66"]}
The second part rebuilds the original structure, returning:
#=> [{"abc_id"=>"1234", "def_id"=>["33", "44"]}, {"abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"]}]
One could use the form of Hash#update (aka merge!) and Hash#merge that employs a block to determine the values of keys that are present in both hashes being merged. Here this needs to be done at two levels.
Letting arr be the array given in the question, these methods are used as follows.
arr.each_with_object({}) do |g,h|
h.update(g['abc_id']=>g) do |_,o,n|
o.merge(n) { |k,oo,nn| k=='def_id' ? oo+nn : oo }
end
end.values
#=> [{"abc_id"=>"1234", "def_id"=>["33", "44"]},
# {"abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"]}]
See the docs for an explanation of the block variables _, o, n, k, oo and nn. I used an underscore to represent the common key with update to tell the reader that it is not used in the block calculation.
Note that the receiver of Hash#values is the following.
{ "1234"=>{ "abc_id"=>"1234", "def_id"=>["33", "44"] },
"5678"=>{ "abc_id"=>"5678", "def_id"=>["11", "22", "55", "66"] } }
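As a small illustration of the block form of Hash#merge used above, here is a sketch that merges just two of the hashes from the question. The variable names a, b, merged and no_collision are illustrative.

```ruby
a = { 'abc_id' => '1234', 'def_id' => [] }
b = { 'abc_id' => '1234', 'def_id' => ['33', '44'] }

# The block receives the key, the receiver's value and the argument's value.
# It runs only for keys that exist in both hashes.
merged = a.merge(b) { |key, old, new| key == 'def_id' ? old + new : old }

# With no shared keys the block is never called, so this raise is never reached.
no_collision = { 'x' => 1 }.merge({ 'y' => 2 }) { |*| raise 'never called' }
```

merge returns a new hash and leaves a untouched; update (merge!) applies the same block logic but modifies the receiver.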

Merge duplicate values in json using ruby

I have the following item.json file
{
"items": [
{
"brand": "LEGO",
"stock": 55,
"full-price": "22.99"
},
{
"brand": "Nano Blocks",
"stock": 12,
"full-price": "49.99"
},
{
"brand": "LEGO",
"stock": 5,
"full-price": "199.99",
}
]
}
There are two items with the brand LEGO, and I want to output the total stock for each individual brand.
In the Ruby file item.rb I have code like:
require 'json'
path = File.join(File.dirname(__FILE__), '../data/products.json')
file = File.read(path)
products_hash = JSON.parse(file)
products_hash["items"].each do |brand|
puts "Stock no: #{brand["stock"]}"
end
I get the stock number for each item individually, whereas I need the stock of the two "LEGO" items summed and displayed as one.
Does anyone have a solution for this?
json = File.open(path,'r:utf-8',&:read) # in case the JSON uses UTF-8
items = JSON.parse(json)['items']
stock_by_brand = items
  .group_by{ |h| h['brand'] }
  .map do |brand,array|
    [ brand,
      array
        .map{ |item| item['stock'] }
        .inject(:+) ]
  end
  .to_h
#=> {"LEGO"=>60, "Nano Blocks"=>12}
It works like this:
Enumerable#group_by takes the array of items and creates a hash mapping the brand name to an array of all item hashes with that brand
Enumerable#map turns each brand/array pair in that hash into an array of the brand (unchanged) followed by:
Enumerable#map on the array of items picks out just the "stock" counts, and then
Enumerable#inject sums them all together
Array#to_h then turns that array of two-value arrays into a hash, mapping the brand to the sum of stock values.
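The same pipeline can be written a little more compactly. The following is only a sketch, assuming Ruby 2.4 or newer for Hash#transform_values and the block form of Enumerable#sum; the items array is a trimmed copy of the question's data and grouped is an illustrative name.

```ruby
items = [
  { 'brand' => 'LEGO',        'stock' => 55 },
  { 'brand' => 'Nano Blocks', 'stock' => 12 },
  { 'brand' => 'LEGO',        'stock' => 5 }
]

# Step 1: brand name => array of item hashes with that brand.
grouped = items.group_by { |item| item['brand'] }

# Step 2: replace each array of items with the sum of its stock counts.
stock_by_brand = grouped.transform_values do |group|
  group.sum { |item| item['stock'] }
end
```

transform_values keeps the keys as they are, so the intermediate array of pairs and the final to_h are not needed.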
If you want simpler code that's less functional and possibly easier to understand:
stock_by_brand = {} # an empty hash
items.each do |item|
stock_by_brand[ item['brand'] ] ||= 0 # initialize to zero if unset
stock_by_brand[ item['brand'] ] += item['stock']
end
p stock_by_brand #=> {"LEGO"=>60, "Nano Blocks"=>12}
To see what your JSON string looks like, let's create it from your hash, which I've denoted h:
require 'json'
j = JSON.generate(h)
#=> "{\"items\":[{\"brand\":\"LEGO\",\"stock\":55,\"full-price\":\"22.99\"},{\"brand\":\"Nano Blocks\",\"stock\":12,\"full-price\":\"49.99\"},{\"brand\":\"LEGO\",\"stock\":5,\"full-price\":\"199.99\"}]}"
After reading that from a file, into the variable j, we can now parse it to obtain the value of "items":
arr = JSON.parse(j)["items"]
#=> [{"brand"=>"LEGO", "stock"=>55, "full-price"=>"22.99"},
# {"brand"=>"Nano Blocks", "stock"=>12, "full-price"=>"49.99"},
# {"brand"=>"LEGO", "stock"=>5, "full-price"=>"199.99"}]
One way to obtain the desired tallies is to use a counting hash:
arr.each_with_object(Hash.new(0)) {|g,h| h.update(g["brand"]=>h[g["brand"]]+g["stock"])}
#=> {"LEGO"=>60, "Nano Blocks"=>12}
Hash.new(0) creates an empty hash (represented by the block variable h) with a default value of zero [1]. That means that h[k] returns zero if the hash does not have a key k.
For the first element of arr (represented by the block variable g) we have:
g["brand"] #=> "LEGO"
g["stock"] #=> 55
Within the block, therefore, the calculation is:
g["brand"] => h[g["brand"]]+g["stock"]
#=> "LEGO" => h["LEGO"] + 55
Initially h has no keys, so h["LEGO"] returns the default value of zero, resulting in { "LEGO"=>55 } being merged into the hash h. As h now has a key "LEGO", h["LEGO"] will not return the default value in subsequent calculations.
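The default-value behaviour described above can be checked in isolation. This is a minimal sketch; the names first_lookup and keys_after_lookup are illustrative.

```ruby
h = Hash.new(0)

first_lookup = h["LEGO"]      # key is absent, so the default (0) is returned
keys_after_lookup = h.keys    # reading the default does not store the key

h["LEGO"] += 55               # same as h["LEGO"] = h["LEGO"] + 55, i.e. 0 + 55
h["LEGO"] += 5                # the key now exists, so the stored 55 is used
```

After these statements h contains the single pair "LEGO" => 60.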
Another approach is to use the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged:
arr.each_with_object({}) {|g,h| h.update(g["brand"]=>g["stock"]) {|_,o,n| o+n}}
#=> {"LEGO"=>60, "Nano Blocks"=>12}
[1] k=>v is shorthand for { k=>v } when it appears as a method's argument.

Sorting an array of hashes by a date field

I have an object with many arrays of hashes, one of which I want to sort by a value in the 'date' key.
@array['info'][0] = {"name"=>"personA", "date"=>"23/09/1980"}
@array['info'][1] = {"name"=>"personB", "date"=>"01/04/1970"}
@array['info'][2] = {"name"=>"personC", "date"=>"03/04/1975"}
I have tried various methods using Date.parse and collect, but am unable to find a good solution.
Edit:
To be clear, I want to sort the original array in place:
@array['info'].sort_by { |i| Date.parse i['date'] }.collect
How might one solve this elegantly, the 'Ruby-ist' way? Thanks.
Another way, which doesn't require converting the date strings to date objects, is the following.
Code
def sort_by_date(arr)
arr.sort_by { |h| h["date"].split('/').reverse }
end
If arr is to be sorted in place, use Array#sort_by! rather than Enumerable#sort_by.
Example
arr = [{ "name"=>"personA", "date"=>"23/09/1980" },
{ "name"=>"personB", "date"=>"01/04/1970" },
{ "name"=>"personC", "date"=>"03/04/1975" }]
sort_by_date(arr)
#=> [{ "name"=>"personB", "date"=>"01/04/1970" },
# { "name"=>"personC", "date"=>"03/04/1975" },
# { "name"=>"personA", "date"=>"23/09/1980" }]
Explanation
For arr in the example, sort_by passes the first element of arr into its block and assigns it to the block variable:
h = { "name"=>"personA", "date"=>"23/09/1980" }
then computes:
a = h["date"].split('/')
#=> ["23", "09", "1980"]
and then:
b = a.reverse
#=> ["1980", "09", "23"]
Similarly, we obtain b equal to:
["1970", "04", "01"]
and
["1975", "04", "03"]
for each of the other two elements of arr.
If you look at the docs for Array#<=> you will see that these three arrays are ordered as follows:
["1970", "04", "01"] < ["1975", "04", "03"] < ["1980", "09", "23"]
There is no need to convert the string elements to integers.
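The element-wise comparison can be tried directly. Note that Array defines <=> but not <, so the ordering shown above is what <=> (and therefore sort_by) produces. The sketch also shows one assumption the approach relies on: day and month must be zero-padded, because the parts are compared as strings.

```ruby
a = "01/04/1970".split('/').reverse   # ["1970", "04", "01"]
b = "03/04/1975".split('/').reverse   # ["1975", "04", "03"]
c = "23/09/1980".split('/').reverse   # ["1980", "09", "23"]

cmp_ab = a <=> b   # -1, decided by "1970" versus "1975"
cmp_bc = b <=> c   # -1, decided by "1975" versus "1980"

# Strings compare character by character, so unpadded numbers sort incorrectly.
unpadded = ["9", "10"].sort    # "10" sorts before "9"
padded   = ["09", "10"].sort   # zero-padding restores the numeric order
```

If the date strings might not be zero-padded, converting the parts with map(&:to_i) before comparing avoids the problem.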
Looks fine overall, although you can drop the collect call, since it's not needed, and use sort_by! to modify the array in place (instead of reassigning):
@array['info'].sort_by! { |x| Date.parse x['date'] }
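If you prefer to state the day/month/year format explicitly rather than rely on how Date.parse reads the string, Date.strptime is one option. This is a sketch only; info stands in for the array stored under the 'info' key, and names is an illustrative variable.

```ruby
require 'date'

# Stand-in for the array stored under the 'info' key in the question.
info = [
  { "name" => "personA", "date" => "23/09/1980" },
  { "name" => "personB", "date" => "01/04/1970" },
  { "name" => "personC", "date" => "03/04/1975" }
]

# sort_by! reorders the array in place; the format string spells out day/month/year.
info.sort_by! { |h| Date.strptime(h["date"], "%d/%m/%Y") }

names = info.map { |h| h["name"] }
```

After the sort, names lists personB, personC and personA, oldest date first.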

add up values from 2 arrays based on duplicate values of the other one

A similar question has been answered here. However, I'd like to know how I can add up/group the numbers from one array based on the duplicate values of another array.
test_names = ["TEST1", "TEST1", "TEST2", "TEST3", "TEST2", "TEST4", "TEST4", "TEST4"]
numbers = ["5", "4", "3", "2", "9", "7", "6", "1"]
The ideal result I'd like to get is a hash or an array with:
{"TEST1" => 9, "TEST2" => 12, "TEST3" => 2, "TEST4" => 14}
Another way I found you can do:
test_names.zip(numbers).each_with_object(Hash.new(0)) {
|arr, hsh| hsh[arr[0]] += arr[1].to_i }
You can do it like this:
my_hash = Hash.new(0)
test_names.each_with_index {|name, index| my_hash[name] += numbers[index].to_i}
my_hash
#=> {"TEST1"=>9, "TEST2"=>12, "TEST3"=>2, "TEST4"=>14}
I wish to follow @squidguy's example and use Enumerable#zip, but with a different twist:
{}.tap { |h| test_names.zip(numbers.map(&:to_i)) { |a|
h.update([a].to_h) { |_,o,n| o+n } } }
#=> {"TEST1"=>9, "TEST2"=>12, "TEST3"=>2, "TEST4"=>14}
Object#tap is here just a substitute for Enumerable#each_with_object or for having h={} initially and a last line with just h.
I'm using the form of Hash#update (aka merge!) that takes a block for determining the merged value for each key that is present in both the original hash (h) and the hash being merged ([a].to_h). There are three block variables, the shared key (which we don't use here, so I've replaced it with the placeholder _), and the values for that key for the original hash (o) and for the hash being merged (n).
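To see when the update block actually runs and what it receives, the sketch below records every call. It uses the arrays from the question; the calls and totals variables are added purely for illustration.

```ruby
test_names = ["TEST1", "TEST1", "TEST2", "TEST3", "TEST2", "TEST4", "TEST4", "TEST4"]
numbers    = ["5", "4", "3", "2", "9", "7", "6", "1"]

calls  = []   # records [key, old, new] each time the block runs
totals = {}

test_names.zip(numbers.map(&:to_i)) do |name, number|
  # The block is skipped the first time a name appears, because there is no collision yet.
  totals.update(name => number) do |key, old, new|
    calls << [key, old, new]
    old + new
  end
end
```

The block runs four times here, once for each repeated name: for TEST1 (5 and 4), for TEST2 (3 and 9), and twice for TEST4 (7 and 6, then 13 and 1).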

Summing Nested Json

So, we have a json response like:
{"C":{"1":{"1":{"A":[18],"B":[18],"C":[20],"D":[24],"E":[24],"F":[2],"G":[15],"H":[21],"I":[8]},"2":{"A":[9],"B":[26],"C":[12],"D":[10],"E":[10],"F":[3],"G":[7]},"3":{"A":[6],"B":[4],"C":[5],"D":[3],"E":[4],"F":[13]},"4":{"A":[3],"B":[2],"C":[5],"D":[13],"E":[5],"F":[5],"G":[4],"H":[7]},"5":{"A":[10],"B":[10],"C":[10],"D":[10],"E":[10],"F":[15]},"6":{"A":[10],"B":[7],"C":[5],"D":[4],"E":[7],"F":[10],"G":[4],"H":[18]},"7":{"A":[2],"B":[18],"C":[6],"D":[3],"E":[2],"F":[5],"G":[7],"H":[5],"I":[17]},"8":{"A":[20],"B":[2],"C":[10],"D":[3],"E":[5],"F":[10]},"Review 1":{"A":[30]},"Review 2":{"A":[30]}},"2":{"1":{"A":[2],"B":[3],"C":[10],"D":[10],"E":[10],"F":[15]},"10":{"A":[10],"B":[3],"C":[3],"D":[3],"E":[20]},"11":{"A":[2],"B":[6],"C":[5],"D":[10],"E":[10],"F":[13]},"2":{"A":[5],"B":[5],"C":[5],"D":[6],"E":[6],"F":[12],"G":[6],"H":[8]},"3":{"A":[3],"B":[4],"C":[8],"D":[3],"E":[2],"F":[3],"G":[12]},"4":{"A":[10],"B":[10],"C":[10],"D":[11],"E":[10],"F":[20]},"5":{"A":[8],"B":[4],"C":[8],"D":[5],"E":[14]},"6":{"A":[5],"B":[10],"C":[14],"D":[14]},"7":{"A":[3],"B":[5],"C":[8],"D":[9],"E":[10],"F":[16]},"8":{"A":[2],"B":[2],"C":[4],"D":[2],"E":[3],"F":[6],"G":[8]},"9":{"A":[2],"B":[6],"C":[5],"D":[11]},"_mex":{"1":[9]},"Review 1":{"A":[31]},"Review 2":{"A":[30]},"Review 3":{"A":[30]}},"3":{"1":{"A":[1],"B":[1],"C":[1],"D":[2],"E":[6]},"2":{"A":[2],"B":[4],"C":[7],"D":[8],"E":[8],"F":[9]},"3":{"A":[5],"B":[8],"C":[11]},"4":{"A":[10],"B":[10],"C":[11]},"5":{"A":[2],"B":[4],"C":[5],"D":[1],"E":[3],"F":[8]},"6":{"A":[4],"B":[8],"C":[8],"D":[12],"E":[8],"F":[20]},"7":{"A":[25],"B":[12],"C":[13],"D":[15],"E":[12],"F":[20]},"8":{"A":[5],"B":[3],"C":[3],"D":[7],"E":[1],"F":[1],"G":[1],"H":[1],"I":[1],"J":[3],"K":[17]},"mex2":{"A":[7]},"_mex2":{"A":[7]},"Review 1":{"A":[30]},"Review 2":{"A":[30]}},"4":{"1":{"A":[10],"B":[2],"C":[2],"D":[8],"E":[3],"F":[3]},"2":{"A":[5],"B":[10],"C":[5],"D":[10],"E":[10]},"3":{"A":[6],"B":[4],"C":[3],"D":[11]},"4":{"A":[4],"B":[4],"C":[4],"D":[4],"E":[11],"F":[21]},"5":{"A":[5],"B":[8],"C":[3],"D":[4],"E":[5],"F":[7],"G":[15],"H":[5],"I":[5],"J":[6],"K":[14]},"6":{"A":[2],"B":[4],"C":[3],"D":[2],"E":[2],"F":[3],"G":[4],"H":[4],"I":[4],"J":[4],"K":[7],"L":[34]},"_mex2":{"A":[7]},"Review 1":{"A":[77]}}}}
What I want to do is sum all the numbers contained in the response.
I've tried iterating through all the nesting, but I was only able to do one section, using:
@number = 0
json["C"]["1"]["1"].each do |key, val|
  val.map do |x|
    @number += x
  end
end
@number #=> 150
Any suggestions on how I would do the same for json["C"]["1"]?
Based on the JSON, here's code that'll walk the hash:
require 'json'
hash = JSON.parse(json)
def sum_hash(h)
  sum = 0
  h.each do |k, v|
    # recurse into nested hashes; otherwise take the single integer from the array
    sum += v.is_a?(Hash) ? sum_hash(v) : v.first
  end
  sum
end
sum_hash(hash) # => 1964
The hash has to be walked, and each value inspected since it's irregular. If the value is another hash sum_hash calls itself with that sub-hash, which then begins walking the sub-hash received.
For each hash value that isn't a hash, the integer is retrieved from the array using first and added to sum. When the method exits it returns the current value of sum, so, once the hash has been descended into, successive sum values get added.
Reducing the JSON makes it a LOT easier to make sure the code is doing the right thing:
json = '
{
"C": {
"1": {
"1": {
"A": [1],
"B": [1]
},
"2": {
"A": [1],
"B": [1]
},
"Review 1": {
"A": [1]
},
"Review 2": {
"A": [1]
}
}
}
}
'
Running the above code with that says the sum is 6.
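The walk above takes only the first element of each array. If an array could ever hold more than one number, a slightly more general variant might look like the sketch below. deep_sum is a hypothetical helper name, not part of the answer above, and the reduced JSON is condensed onto one line.

```ruby
require 'json'

# The reduced JSON from above, condensed onto one line.
json = '{"C": {"1": {"1": {"A": [1], "B": [1]}, "2": {"A": [1], "B": [1]}, "Review 1": {"A": [1]}, "Review 2": {"A": [1]}}}}'

# Hypothetical helper: sums every number found anywhere in nested hashes and arrays.
def deep_sum(node)
  case node
  when Hash    then node.values.sum { |v| deep_sum(v) }
  when Array   then node.sum { |v| deep_sum(v) }
  when Numeric then node
  else 0   # strings, nil and anything else contribute nothing
  end
end

total = deep_sum(JSON.parse(json))
```

For the reduced JSON, total is 6, matching the result of sum_hash. The trade-off is that every array element is now counted, which only matters if an array contains more than one value.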
