Transform array of nested Hashes into flat array of non-nested hashes - ruby

I want to transform the given array into result array:
given = [{
"foo_v1_4" => [{
"derivate_version" => 0,
"layers" => {
"tlayer" => {
"baz" => {
"three" => 0.65
},
"bazbar" => {
"three" => 0.65
}
}
}
}]
}]
# the value of key :one is first hash key (foo_v1_4) plus underscore (_) plus derivate_version (0)
result = [{
one: 'foo_v1_4_0',
tlayer: 'baz',
three: '0.6'
},
{
one: 'foo_v1_4_0',
tlayer: 'bazbar',
three: '0.6'
}
]
What I tried:
given.each do |el |
el.each do |derivat |
derivat.each do |d |
d.each do |layer |
layer.each do |l |
derivat = "#{d}_#{l['derivate_version']}"
puts derivat
end
end
end
end
end
I'm struggling to iterate through the "layers" hash. The number of elements in layers equals the number of elements in the result array.

It helps to format the objects so we can better see their structures:
given = [
{
"foo_v1_4" => [
{ "derivate_version" => 0,
"layers" => {
"tlayer" => {
"baz" => { "three" => 0.65 },
"bazbar" => { "three" => 0.65 }
}
}
}
]
}
]
result = [
{
one: 'foo_v1_4_0',
tlayer: 'baz',
three: '0.6'
},
{
one: 'foo_v1_4_0',
tlayer: 'bazbar',
three: '0.6'
}
]
We can begin by writing the structure of result:
result = [
{
one:
tlayer:
three:
},
{
one:
tlayer:
three:
}
]
We see that
given = [ { "foo_v1_4" => <array> } ]
The value of the key :one in each hash of result therefore starts with the first key of the first element of given (the question also appends an underscore and the derivate_version, which is omitted here):
one_val = given[0].keys[0]
#=> "foo_v1_4"
result = [
{
one: one_val
tlayer:
three:
},
{
one: one_val
tlayer:
three:
}
]
All the remaining objects of interest are contained in the hash
h = given[0]["foo_v1_4"][0]["layers"]["tlayer"]
#=> {
# "baz"=>{ "three"=>0.65 },
# "bazbar"=>{ "three"=>0.65 }
# }
so it is convenient to define it. We see that:
h.keys[0]
#=> "baz"
h.keys[1]
#=> "bazbar"
h["bazbar"]["three"]
#=> 0.65
Note that it generally is not good practice to assume that hash keys are ordered in a particular way.
We may now complete the construction of result,
v = h["bazbar"]["three"].truncate(1)
#=> 0.6
result = [
{
one: one_val,
tlayer: h.keys[0],
three: v
},
{ one: one_val,
tlayer: h.keys[1],
three: v
}
]
#=> [
# { :one=>"foo_v1_4", :tlayer=>"baz", :three=>0.6 },
# { :one=>"foo_v1_4", :tlayer=>"bazbar", :three=>0.6 }
# ]
The creation of the temporary objects one_val, h, and v improves time- and space-efficiency, makes the calculations easier to test and improves the readability of the code.

Try the below:
result = []
given.each do |level1|
level1.each do |key, derivate_versions|
derivate_versions.each do |layers|
# iterate over the elements under tlayer
layers.dig('layers', 'tlayer').each do |tlayer_key, tlayer_value|
sub_result = {}
# key - foo_v1_4, layers['derivate_version'] - 0 => 'foo_v1_4_0'
sub_result[:one] = key + '_' + layers['derivate_version'].to_s
# tlayer_key - baz, bazbar
sub_result[:tlayer] = tlayer_key
# tlayer_value - { "three" => 0.65 }
sub_result[:three] = tlayer_value['three']
result << sub_result
end
end
end
end
The value of result will be:
2.6.3 :084 > p result
[{:one=>"foo_v1_4_0", :tlayer=>"baz", :three=>0.65}, {:one=>"foo_v1_4_0", :tlayer=>"bazbar", :three=>0.65}]
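The same traversal can also be written without the mutable result array. The following is only a sketch: the method name flatten_layers is made up for illustration, and it assumes every derivate has a "layers" => "tlayer" hash and that :three should be the value truncated to one decimal place and converted to a string, as in the desired result of the question.

```ruby
# Hypothetical helper: flattens the nested structure into one hash per tlayer entry.
# Assumes each derivate contains derivate['layers']['tlayer'] (a Hash).
def flatten_layers(given)
  given.flat_map do |entry|
    entry.flat_map do |name, derivates|
      derivates.flat_map do |derivate|
        # "foo_v1_4" + "_" + derivate_version
        one = "#{name}_#{derivate['derivate_version']}"
        derivate.dig('layers', 'tlayer').map do |tlayer, attrs|
          # Float#truncate with a digits argument needs Ruby 2.4 or later
          { one: one, tlayer: tlayer, three: attrs['three'].truncate(1).to_s }
        end
      end
    end
  end
end
```

Calling flatten_layers(given) with the array from the question should give one hash per key under "tlayer", so the number of results follows the number of layers automatically.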

Related

How to find the largest value of a hash in an array of hashes

In my array, I'm trying to retrieve the key with the largest value of "value_2", so in this case, "B":
myArray = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
myArray.each do |array_hash|
array_hash.each do |key, value|
if value["value_2"] == array_hash.values.max
puts key
end
end
end
I get the error:
"comparison of Hash with Hash failed (ArgumentError)".
What am I missing?
The array in the question actually holds a single hash with three keys. If the data is instead stored as an array of single-key hashes, it is written:
arr = [{ "A" => { "value_1" => 30, "value_2" => 240 } },
{ "B" => { "value_1" => 40, "value_2" => 250 } },
{ "C" => { "value_1" => 18, "value_2" => 60 } }]
We can find the desired key as follows:
arr.max_by { |h| h.values.first["value_2"] }.keys.first
#=> "B"
See Enumerable#max_by. The steps are:
g = arr.max_by { |h| h.values.first["value_2"] }
#=> {"B"=>{"value_1"=>40, "value_2"=>250}}
a = g.keys
#=> ["B"]
a.first
#=> "B"
In calculating g, for
h = arr[0]
#=> {"A"=>{"value_1"=>30, "value_2"=>240}}
the block calculation is
a = h.values
#=> [{"value_1"=>30, "value_2"=>240}]
b = a.first
#=> {"value_1"=>30, "value_2"=>240}
b["value_2"]
#=> 240
Suppose now arr is as follows:
arr << { "D" => { "value_1" => 23, "value_2" => 250 } }
#=> [{"A"=>{"value_1"=>30, "value_2"=>240}},
# {"B"=>{"value_1"=>40, "value_2"=>250}},
# {"C"=>{"value_1"=>18, "value_2"=>60}},
# {"D"=>{"value_1"=>23, "value_2"=>250}}]
and we wish to return an array of all keys for which the value of "value_2" is maximum (["B", "D"]). We can obtain that as follows.
max_val = arr.map { |h| h.values.first["value_2"] }.max
#=> 250
arr.select { |h| h.values.first["value_2"] == max_val }.flat_map(&:keys)
#=> ["B", "D"]
flat_map(&:keys) is shorthand for:
flat_map { |h| h.keys }
which returns the same array as:
map { |h| h.keys.first }
See Enumerable#flat_map.
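If you are not sure which shape the data has (one hash inside an array, as in the question, or an array of single-key hashes), a sketch like the following handles both by merging first. The method name keys_with_max is hypothetical, and it assumes the outer keys ("A", "B", ...) are unique across the hashes, since merge would otherwise overwrite duplicates.

```ruby
# Hypothetical helper: returns every outer key whose sub-hash has the
# largest value for `field`. Assumes outer keys are unique across hashes.
def keys_with_max(array_of_hashes, field)
  merged = array_of_hashes.reduce({}, :merge) # collapse into a single hash
  return [] if merged.empty?

  max_val = merged.values.map { |v| v[field] }.max
  merged.select { |_, v| v[field] == max_val }.keys
end
```

With the data from the question, keys_with_max(myArray, "value_2") should return ["B"]; with the four-element arr above it should return ["B", "D"].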
Code
p myArray.first.max_by { |k, v| v["value_2"] }.first
Output
"B"
I'd use:
my_array = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
h = Hash[*my_array]
# => {"A"=>{"value_1"=>30, "value_2"=>240},
# "B"=>{"value_1"=>40, "value_2"=>250},
# "C"=>{"value_1"=>18, "value_2"=>60}}
k = h.max_by { |k, v| v['value_2'] }.first # => "B"
Hash[*my_array] takes the array of hashes and turns it into a single hash. Then max_by will iterate each key/value pair, returning an array containing the key value "B" and the sub-hash, making it easy to grab the key using first:
k = h.max_by { |k, v| v['value_2'] } # => ["B", {"value_1"=>40, "value_2"=>250}]
I guess the idea of your solution is to loop through each hash element and compare the maximum value found so far with value["value_2"].
But you are getting an error at
if value["value_2"] == array_hash.values.max
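This is because array_hash.values is an array of hashes, so max has to compare one hash with another, which Ruby cannot do. Each element is still a hash: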
{"A"=>{"value_1"=>30, "value_2"=>240}}.values.max
#=> {"value_1"=>30, "value_2"=>240}
It should be like this:
max = nil
max_key = ""
myArray.each do |array_hash|
array_hash.each do |key, value|
if max.nil? || value["value_2"] > max
max = value["value_2"]
max_key = key
end
end
end
# max_key #=> "B"
Another solution:
myArray.first.transform_values { |v| v["value_2"] }.max_by { |_, v| v }.first
You asked "What am I missing?".
I think you are missing a proper understanding of the data structures that you are using. I suggest that you try printing the data structures and take a careful look at the results.
The simplest way is p myArray which gives:
[{"A"=>{"value_1"=>30, "value_2"=>240}, "B"=>{"value_1"=>40, "value_2"=>250}, "C"=>{"value_1"=>18, "value_2"=>60}}]
You can get prettier results using pp:
require 'pp'
pp myArray
yields:
[{"A"=>{"value_1"=>30, "value_2"=>240},
"B"=>{"value_1"=>40, "value_2"=>250},
"C"=>{"value_1"=>18, "value_2"=>60}}]
This helps you to see that myArray has only one element, a Hash.
You could also look at the expression array_hash.values.max inside the loop:
myArray.each do |array_hash|
p array_hash.values
end
gives:
[{"value_1"=>30, "value_2"=>240}, {"value_1"=>40, "value_2"=>250}, {"value_1"=>18, "value_2"=>60}]
Not what you expected? :-)
Given this, what would you expect to be returned by array_hash.values.max in the above loop?
Use p and/or pp liberally in your Ruby code to help understand what's going on.

Structuring ruby multidimensional hash by first hash

I have the hash below:
mm = {
0 => {
0 => 'p1',
1 => 'p2',
2 => 'p3'
},
1 => {
0 => 'idfp1',
1 => 'idfp2',
2 => 'idfp3'
},
2 => {
0 => 'idfp12',
1 => 'idfp22',
2 => 'idfp32'
}
}
I'm trying to sort it by the hash with a key of 0. In the first hash (0), there are key-value pairs mapping a number to an identifier.
In every subsequent hash (1 and 2), 0 points to the 0 from the first hash, 1 points to the 1 from the first hash, etc.
In each hash after 0 (1 and 2), there are IDs (id for person 1) that belong to p1 (person 1).
I've tried to sort this by creating a new hash with only the first hash in the one above to no avail. This is my attempt. The keys are correct but it's pointing to nil, when it should be pointing to the hash with each person's id.
ids = {}
org = {}
mm[0].each do |id, name|
ids[id] = name
end
mm.drop(1).each do |one|
one.each do |key, id|
org[ids[key]] = id
end
end
How can I achieve this in Ruby?
Edit:
In case the explanation doesn't suffice, here is the desired result:
org = {
'p1' => {
0 => 'idfp1',
1 => 'idfp12'
},
'p2' => {
0 => 'idfp2',
1 => 'idfp22'
},
'p3' => {
0 => 'idfp3',
1 => 'idfp32'
}
}
Two ways:
#1
Code
mm[0].invert.each_with_object({}) { |(k,i),h|
h[k] = (1...mm.size).each_with_object({}) { |j,g| g[j] = mm[j][i] } }
#=> {"p1"=>{1=>"idfp1", 2=>"idfp12"},
# "p2"=>{1=>"idfp2", 2=>"idfp22"},
# "p3"=>{1=>"idfp3", 2=>"idfp32"}}
Explanation
a = mm[0]
#=> {0=>"p1", 1=>"p2", 2=>"p3"}
b = a.invert
#=> {"p1"=>0, "p2"=>1, "p3"=>2}
b.each_with_object({}) { |(k,i),h|
h[k] = (1...mm.size).each_with_object({}) { |j,g| g[j] = mm[j][i] } }
#=> {"p1"=>{1=>"idfp1", 2=>"idfp12"},
# "p2"=>{1=>"idfp2", 2=>"idfp22"},
# "p3"=>{1=>"idfp3", 2=>"idfp32"}}
#2
Code
mm.values
.map(&:values)
.transpose
.each_with_object({}) { |a,h| h[a.shift] = Hash[[*(0...a.size)].zip(a) ] }
#=> {"p1"=>{0=>"idfp1", 1=>"idfp12"},
# "p2"=>{0=>"idfp2", 1=>"idfp22"},
# "p3"=>{0=>"idfp3", 1=>"idfp32"}}
Explanation
a = mm.values
#=> [{0=>"p1", 1=>"p2", 2=>"p3" },
# {0=>"idfp1", 1=>"idfp2", 2=>"idfp3" },
# {0=>"idfp12", 1=>"idfp22", 2=>"idfp32"}]
b = a.map(&:values)
#=> [[ "p1", "p2", "p3" ],
# [ "idfp1", "idfp2", "idfp3" ],
# [ "idfp12", "idfp22", "idfp32"]]
c = b.transpose
#=> [["p1", "idfp1", "idfp12"],
# ["p2", "idfp2", "idfp22"],
# ["p3", "idfp3", "idfp32"]]
c.each_with_object({}) { |a,h| h[a.shift] = Hash[[*(0...a.size)].zip(a) ] }
#=> {"p1"=>{0=>"idfp1", 1=>"idfp12"},
# "p2"=>{0=>"idfp2", 1=>"idfp22"},
# "p3"=>{0=>"idfp3", 1=>"idfp32"}}
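For comparison, here is a more spelled-out sketch of the same regrouping that produces the zero-based inner keys from the desired result. The method name group_by_first is made up for illustration, and the code assumes that the sub-hash with the smallest outer key holds the names and that every other sub-hash uses the same inner keys.

```ruby
# Hypothetical helper: regroups mm so each name from the first sub-hash
# points to its ids, numbered from 0 in the order of the remaining sub-hashes.
def group_by_first(mm)
  # Order sub-hashes by their outer key; the first one holds the names.
  names, *rest = mm.sort_by { |key, _| key }.map(&:last)

  names.each_with_object({}) do |(idx, name), out|
    # Pick this person's id from each remaining sub-hash.
    out[name] = rest.each_with_index.map { |ids, n| [n, ids[idx]] }.to_h
  end
end
```

Sorting by the outer key means the result does not depend on the insertion order of mm.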

How to limit an array of similar hashes to those that have more than one of the same key:value pair (details inside)

I have an array like this
arr = [ { name: "Josh", grade: 90 }, {name: "Josh", grade: 70 },
{ name: "Kevin", grade: 100 }, { name: "Kevin", grade: 95 },
{ name: "Ben", grade: 90 }, { name: "Rod", grade: 90 },
{ name: "Rod", grade: 70 }, { name: "Jack", grade: 60 } ]
I would like Ben and Jack to be removed since they only have one record in this array. What would be the most elegant way to get this done? I could manually go through it and check, but is there a better way? Like the opposite of
arr.uniq! { |person| person[:name] }
arr.reject! { |x| arr.count { |y| y[:name] == x[:name] } == 1 }
An O(n) solution:
count_hash = {}
arr.each { |x| count_hash[x[:name]] ||= 0; count_hash[x[:name]] += 1 }
arr.reject! { |x| count_hash[x[:name]] == 1 }
Here are three more ways that might be of some interest, though I prefer Robert's solution.
Each of the following returns:
#=> [{:name=>"Josh" , :grade=> 90}, {:name=>"Josh" , :grade=>70},
# {:name=>"Kevin", :grade=>100}, {:name=>"Kevin", :grade=>95},
# {:name=>"Rod" , :grade=> 90}, {:name=>"Rod" , :grade=>70}]
#1
Use the well-worn but dependable Enumerable#group_by to aggregate by name, Hash#values to extract the values then reject those that appear but once:
arr.group_by { |h| h[:name] }.values.reject { |a| a.size == 1 }.flatten
#2
Use the class method Hash.new with a default of zero to identify names with multiple entries, then select for those:
multiples = arr.each_with_object(Hash.new(0)) { |h,g| g[h[:name]] += 1 }
.reject { |_,v| v == 1 } #=> {"Josh"=>2, "Kevin"=>2, "Rod"=>2}
arr.select { |h| multiples.key?(h[:name]) }
#3
Use the form of Hash#update (aka Hash#merge!) that takes a block to determine names that appear only once, then reject for those:
singles = arr.each_with_object({}) { |h,g|
g.update({ h[:name] => 1 }) { |_,o,n| o+n } }
.select { |_,v| v == 1 } #=> {"Ben"=>1, "Jack"=>1}
arr.reject { |h| singles.key?(h[:name]) }
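On Ruby 2.7 or later, the counting step in the approaches above can be handed to Enumerable#tally. This is just a sketch, and the method name with_repeated_names is hypothetical; it leaves the original array untouched and returns a new one.

```ruby
# Hypothetical helper: keeps only records whose :name occurs more than once.
# Requires Ruby 2.7+ for Enumerable#tally.
def with_repeated_names(records)
  counts = records.map { |r| r[:name] }.tally # e.g. {"Josh"=>2, "Ben"=>1, ...}
  records.select { |r| counts[r[:name]] > 1 }
end
```

Like the count_hash solution, this makes two passes over the array, so it stays linear in the number of records.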

What is an eloquent way to sort an array of hashes based on whether a key is empty in Ruby?

array = [{ name:'Joe', foo:'bar' },
{ name:'Bob', foo:'' },
{ name:'Hal', foo:'baz' }
]
What is an eloquent way to sort so that if foo is empty, then put it at the end, and not change the order of the other elements?
Ruby 1.9.3
array.partition { |h| !h[:foo].empty? }.flatten
array.find_all{|elem| !elem[:foo].empty?} + array.find_all{|elem| elem[:foo].empty?}
returns
[{:name=>"Joe", :foo=>"bar"}, {:name=>"Hal", :foo=>"baz"}, {:name=>"Bob", :foo=>""}]
array = [
{ name:'Joe', foo:'bar' },
{ name:'Bob', foo:'' },
{ name:'Hal', foo:'baz' }
]
arraydup = array.dup
array.delete_if{ |h| h[:foo].empty? }
array += (arraydup - array)
Which results in:
[
[0] {
:name => "Joe",
:foo => "bar"
},
[1] {
:name => "Hal",
:foo => "baz"
},
[2] {
:name => "Bob",
:foo => ""
}
]
With a little refactoring:
array += ((array.dup) - array.delete_if{ |h| h[:foo].empty? })
One can produce keys as tuples, where the first part indicates null/not-null, and the second part is the original index, then sort_by [nulls_last, original_index].
def sort_nulls_last_preserving_original_order array
array.map.with_index.
sort_by { |h,i| [ (h[:foo].empty? ? 1 : 0), i ] }.
map(&:first)
end
Note this avoids all the gross array mutation of some of the other answers and is constructed from pure functional transforms.
array.each_with_index do |item, index|
array << (array.delete_at(index)) if item[:foo].blank?
end
blank? comes from ActiveSupport (Rails); in plain Ruby, use whatever check you have in its place, such as empty?.
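Moving elements while iterating over the same array can skip the element that slides into the freed slot, for example when two empty entries are adjacent. A non-mutating sketch that avoids this is below; the method name empties_last is hypothetical, and treating nil like an empty string is an assumption that goes beyond the question.

```ruby
# Hypothetical helper: returns a new array with entries whose value at `key`
# is empty (or nil, by assumption) moved to the end; relative order is kept.
def empties_last(items, key)
  present, empty = items.partition { |h| !h[key].to_s.empty? }
  present + empty
end
```

Because partition walks the array once and preserves order within both groups, the non-empty entries keep their original order.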

ruby db result set to array in a hash in a hash

I have a db query which returns results like:
db_result.each {|row| puts row}
{"IP"=>"1.2.3.4","Field1"=>"abc","Field2"=>"123"}
{"IP"=>"1.2.3.4","Field1"=>"abc","Field2"=>"234"}
{"IP"=>"1.2.3.4","Field1"=>"bcd","Field2"=>"345"}
{"IP"=>"3.4.5.6","Field1"=>"bcd","Field2"=>"456"}
{"IP"=>"3.4.5.6","Field1"=>"bcd","Field2"=>"567"}
And want to put it into a hash like:
{
"1.2.3.4" => {
"abc" => ["123", "234"],
"bcd" => "345"
},
"3.4.5.6" => {
"bcd" => ["456", "567"]
}
}
What I am currently doing is:
result_hash = Hash.new { |h, k| h[k] = {} }
db_result.each do |row|
result_hash[row["IP"]] = Hash.new { |h, k| h[k] = [] } unless result_hash.has_key? row["IP"]
result_hash[row["IP"]][row["Field1"]] << row["Field2"]
end
This works, but I was wondering whether there is a neater way.
Consider this a peer-review. As a recommendation for processing and maintenance...
I'd recommend the data structure you want be a little more consistent.
Instead of:
{
"1.2.3.4" => {
"abc" => ["123", "234"],
"bcd" => "345"
},
"3.4.5.6" => {
"bcd" => ["456", "567"]
}
}
I'd recommend:
{
"1.2.3.4" => {
"abc" => ["123", "234"],
"bcd" => ["345"]
},
"3.4.5.6" => {
"abc" => [],
"bcd" => ["456", "567"]
}
}
Keep the same keys in each sub-hash, and make the values all be arrays. The code for processing that overall hash will be more straightforward and easy to follow.
I agree with Michael, there is nothing wrong with your method. The intent behind the code can be easily seen.
If you want to get fancy, here's one (of many) ways to do it:
x = [
{"IP"=>"1.2.3.4","Field1"=>"abc","Field2"=>"123"},
{"IP"=>"1.2.3.4","Field1"=>"abc","Field2"=>"234"},
{"IP"=>"1.2.3.4","Field1"=>"bcd","Field2"=>"345"},
{"IP"=>"3.4.5.6","Field1"=>"bcd","Field2"=>"456"},
{"IP"=>"3.4.5.6","Field1"=>"bcd","Field2"=>"567"}
]
y = x.inject({}) do |result, row|
new_row = result[row["IP"]] ||= {}
(new_row[row["Field1"]] ||= []) << row["Field2"]
result
end
I think this should yield the same time complexity as your method.
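Another possible shape, sketched here with group_by and transform_values (the latter needs Ruby 2.4 or later). The method name nest_rows is made up, and the code assumes each row behaves like a Hash with the "IP", "Field1" and "Field2" keys shown in the question. Every value ends up as an array, as the peer-review answer recommends.

```ruby
# Hypothetical helper: groups rows by IP, then by Field1, collecting Field2 values.
# Assumes each row responds to [] with the keys "IP", "Field1" and "Field2".
def nest_rows(rows)
  rows.group_by { |row| row["IP"] }.transform_values do |ip_rows|
    ip_rows.group_by { |row| row["Field1"] }
           .transform_values { |field_rows| field_rows.map { |row| row["Field2"] } }
  end
end
```

This walks the rows more than once, so the original loop may still be preferable for very large result sets.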

Resources