Split values into separate Arrays based on keys in a Hash? - ruby

I have something like this:
[
{
"key": "55ffee8b6a617960010e0000",
"doc_count": 1
},
{
"key": "55fff0376a61794e190f0000",
"doc_count": 1
},
{
"key": "55fff0dd6a61794e191f0000",
"doc_count": 1
}
]
I want to separate the :key values and the :doc_count values into separate arrays, like
["55ffee8b6a617960010e0000", "55fff0376a61794e190f0000", "55fff0dd6a61794e191f0000"]
and [1,1,1]. How can I achieve this?

You can use transpose here:
keys, doc_counts = array_of_hashes.map(&:values).transpose
As D-side points out, this relies on the keys appearing in the same order in every hash. If you cannot guarantee that (for instance, your data comes from an API), you would have to take the additional step of sorting each hash's keys. Because "doc_count" sorts before "key", the order of the assigned variables flips. That would look something like:
doc_counts, keys = array_of_hashes.map{|h| Hash[h.sort].values }.transpose
In either case you'll end up with something like:
keys # => ["55ffee8b6a617960010e0000", "55fff0376a61794e190f0000", "55fff0dd6a61794e191f0000"]
doc_counts # => [1, 1, 1]
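A variant that sidesteps the ordering concern is to look the values up by name. This is only a sketch: the sample data below is made up, uses string keys, and deliberately puts the keys in a different order in the second hash.

```ruby
# Hypothetical sample data with string keys, in deliberately different key order.
array_of_hashes = [
  { "key" => "55ffee8b6a617960010e0000", "doc_count" => 1 },
  { "doc_count" => 2, "key" => "55fff0376a61794e190f0000" }
]

# values_at looks each key up by name, so insertion order does not matter.
keys, doc_counts = array_of_hashes.map { |h| h.values_at("key", "doc_count") }.transpose

p keys       # => ["55ffee8b6a617960010e0000", "55fff0376a61794e190f0000"]
p doc_counts # => [1, 2]
```

If the hashes use symbol keys, pass :key and :doc_count to values_at instead.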

You can use some of these
a = [
{
"key" => "55ffee8b6a617960010e0000",
"doc_count" => 1
},
{
"key" => "55fff0376a61794e190f0000",
"doc_count" => 1
},
{
"key" => "55fff0dd6a61794e191f0000",
"doc_count" => 1
}
]
1.
hash = Hash[a.map { |h| [h["key"], h["doc_count"]] }]
hash.keys
hash.values
2.
exp = Hash.new { |hash, key| hash[key] = [] }
a.each { |h| h.each { |k, v| exp[k] << v } }
3.
hash = a.each_with_object({}) { |arr_h, h| h[arr_h["key"]] = arr_h["doc_count"] }
hash.keys
hash.values
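Options 1 and 3 build a hash indexed by the "key" value, so a repeated "key" would overwrite an earlier entry. Option 2 collects every value per field name instead. A small sketch of what it produces, using a shortened, made-up copy of the data:

```ruby
# Shortened sample data (assumed to use string keys, as in the answer above).
a = [
  { "key" => "55ffee8b6a617960010e0000", "doc_count" => 1 },
  { "key" => "55fff0376a61794e190f0000", "doc_count" => 1 }
]

# Each previously unseen field name gets its own empty array on first access.
exp = Hash.new { |hash, key| hash[key] = [] }
a.each { |h| h.each { |k, v| exp[k] << v } }

p exp["key"]       # => ["55ffee8b6a617960010e0000", "55fff0376a61794e190f0000"]
p exp["doc_count"] # => [1, 1]
```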

You could iterate and assign it to new arrays doc_counts and keys.
array = [{"key"=>"55ffee8b6a617960010e0000", "doc_count"=>1}, {"key"=>"55fff0376a61794e190f0000", "doc_count"=>1}, {"key"=>"55fff0dd6a61794e191f0000", "doc_count"=>1}]
doc_counts, keys = [],[]
array.each do |a|
doc_counts << a["doc_count"]
keys << a["key"]
end
Result
>> doc_counts
=> [1, 1, 1]
>> keys
=> ["55ffee8b6a617960010e0000", "55fff0376a61794e190f0000", "55fff0dd6a61794e191f0000"]
Or
doc_counts = []
keys = array.map do |a|
doc_counts << a["doc_count"]
a["key"]
end

Related

How to find the largest value of a hash in an array of hashes

In my array, I'm trying to retrieve the key with the largest value of "value_2", so in this case, "B":
myArray = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
myArray.each do |array_hash|
array_hash.each do |key, value|
if value["value_2"] == array_hash.values.max
puts key
end
end
end
I get the error:
"comparison of Hash with Hash failed (ArgumentError)".
What am I missing?
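The array literal in the question actually creates a one-element array holding a single hash with three keys. The approach below assumes the data is instead an array of single-key hashes: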
arr = [{ "A" => { "value_1" => 30, "value_2" => 240 } },
{ "B" => { "value_1" => 40, "value_2" => 250 } },
{ "C" => { "value_1" => 18, "value_2" => 60 } }]
We can find the desired key as follows:
arr.max_by { |h| h.values.first["value_2"] }.keys.first
#=> "B"
See Enumerable#max_by. The steps are:
g = arr.max_by { |h| h.values.first["value_2"] }
#=> {"B"=>{"value_1"=>40, "value_2"=>250}}
a = g.keys
#=> ["B"]
a.first
#=> "B"
In calculating g, for
h = arr[0]
#=> {"A"=>{"value_1"=>30, "value_2"=>240}}
the block calculation is
a = h.values
#=> [{"value_1"=>30, "value_2"=>240}]
b = a.first
#=> {"value_1"=>30, "value_2"=>240}
b["value_2"]
#=> 240
Suppose now arr is as follows:
arr << { "D" => { "value_1" => 23, "value_2" => 250 } }
#=> [{"A"=>{"value_1"=>30, "value_2"=>240}},
# {"B"=>{"value_1"=>40, "value_2"=>250}},
# {"C"=>{"value_1"=>18, "value_2"=>60}},
# {"D"=>{"value_1"=>23, "value_2"=>250}}]
and we wish to return an array of all keys for which the value of "value_2" is maximum (["B", "D"]). We can obtain that as follows.
max_val = arr.map { |h| h.values.first["value_2"] }.max
#=> 250
arr.select { |h| h.values.first["value_2"] == max_val }.flat_map(&:keys)
#=> ["B", "D"]
flat_map(&:keys) is shorthand for:
flat_map { |h| h.keys }
which returns the same array as:
map { |h| h.keys.first }
See Enumerable#flat_map.
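The same idea can be applied to the structure the question's literal really produces (one array element holding a single hash). A sketch, assuming that shape and adding a hypothetical "D" entry that ties with "B":

```ruby
# The question's literal: one array element holding a multi-key hash,
# plus a hypothetical "D" entry that ties with "B".
my_array = [
  "A" => { "value_1" => 30, "value_2" => 240 },
  "B" => { "value_1" => 40, "value_2" => 250 },
  "C" => { "value_1" => 18, "value_2" => 60 },
  "D" => { "value_1" => 23, "value_2" => 250 }
]

h = my_array.first
max_val = h.values.map { |v| v["value_2"] }.max
winners = h.select { |_, v| v["value_2"] == max_val }.keys

p winners # => ["B", "D"]
```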
Code
p myArray.first.max_by{|k,v|v["value_2"]}.first
Output
"B"
I'd use:
my_array = [
"A" => {
"value_1" => 30,
"value_2" => 240
},
"B" => {
"value_1" => 40,
"value_2" => 250
},
"C" => {
"value_1" => 18,
"value_2" => 60
}
]
h = Hash[*my_array]
# => {"A"=>{"value_1"=>30, "value_2"=>240},
# "B"=>{"value_1"=>40, "value_2"=>250},
# "C"=>{"value_1"=>18, "value_2"=>60}}
k = h.max_by { |k, v| v['value_2'] }.first # => "B"
Hash[*my_array] splats the one-element array and returns its single hash. Then max_by iterates over each key/value pair and returns a two-element array containing the key "B" and its sub-hash, making it easy to grab the key using first:
k = h.max_by { |k, v| v['value_2'] } # => ["B", {"value_1"=>40, "value_2"=>250}]
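Hash[*my_array] depends on the array holding exactly one hash. If the data were split across several hashes, merging them first would be one option. A sketch with made-up input in that shape:

```ruby
# Hypothetical input: the same data split across several single-key hashes.
my_array = [
  { "A" => { "value_1" => 30, "value_2" => 240 } },
  { "B" => { "value_1" => 40, "value_2" => 250 } },
  { "C" => { "value_1" => 18, "value_2" => 60 } }
]

# Merge every element into one hash, then pick the pair with the largest "value_2".
h = my_array.reduce({}, :merge)
k = h.max_by { |_, v| v["value_2"] }.first

p k # => "B"
```

This also works for the one-element array from the question, since merging a single hash into an empty one simply copies it.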
I guess the idea of your solution is to loop through each hash element and compare the maximum value found so far with value["value_2"].
But you are getting an error at
if value["value_2"] == array_hash.values.max
Because array_hash.values is an array of hashes, so max works on hashes rather than on numbers:
{"A"=>{"value_1"=>30, "value_2"=>240}}.values.max
#=> {"value_1"=>30, "value_2"=>240}
It should be like this:
max = nil
max_key = ""
myArray.each do |array_hash|
array_hash.each do |key, value|
if max.nil? || value["value_2"] > max
max = value["value_2"]
max_key = key
end
end
end
# max_key #=> "B"
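Another solution, which merges the array's hashes first so that it works whether the data is one multi-key hash or several single-key hashes:
myArray.reduce(:merge).transform_values{ |v| v["value_2"] }.max_by{ |_, v| v }.first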
You asked "What am I missing?".
I think you are missing a proper understanding of the data structures that you are using. I suggest that you try printing the data structures and take a careful look at the results.
The simplest way is p myArray which gives:
[{"A"=>{"value_1"=>30, "value_2"=>240}, "B"=>{"value_1"=>40, "value_2"=>250}, "C"=>{"value_1"=>18, "value_2"=>60}}]
You can get prettier results using pp:
require 'pp'
pp myArray
yields:
[{"A"=>{"value_1"=>30, "value_2"=>240},
"B"=>{"value_1"=>40, "value_2"=>250},
"C"=>{"value_1"=>18, "value_2"=>60}}]
This helps you to see that myArray has only one element, a Hash.
You could also look at the expression array_hash.values.max inside the loop:
myArray.each do |array_hash|
p array_hash.values
end
gives:
[{"value_1"=>30, "value_2"=>240}, {"value_1"=>40, "value_2"=>250}, {"value_1"=>18, "value_2"=>60}]
Not what you expected? :-)
Given this, what would you expect to be returned by array_hash.values.max in the above loop?
Use p and/or pp liberally in your ruby code to help understand what's going on.
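The failure discussed above can be reproduced in isolation. A minimal sketch, using two of the inner hashes from the question:

```ruby
a = { "value_1" => 30, "value_2" => 240 }
b = { "value_1" => 40, "value_2" => 250 }

# Hashes have no ordering, so max cannot compare them.
error = begin
  [a, b].max
  nil
rescue ArgumentError => e
  e
end
p error.class # => ArgumentError

# Comparing the numbers stored under "value_2" works as expected.
best = [a, b].max_by { |h| h["value_2"] }
p best["value_2"] # => 250
```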

Group hash values by key and concatenate values

I need to group an array of hashes by name and combine the values. For example, given this array:
[
{"name": "FT002", "data": {"2017-11-01": 1392.0}},
{"name": "FT004", "data": {"2017-11-01": 4091.0}},
{"name": "FT002", "data": {"2017-12-01": 1279.0}},
{"name": "FT004", "data": {"2017-12-01": 3249.0}}
]
I want to produce this array:
[
{"name": "FT002", "data": {"2017-11-01": 1392.0, "2017-12-01": 1279.0}},
{"name": "FT004", "data": {"2017-11-01": 4091.0, "2017-12-01": 3249.0}}
]
Any help would be appreciated.
I tried various iterations of inject, group_by, and merge, but can't seem to get the right result.
You can accomplish this in three short one-liners, first producing a hash mapping names to data, and then producing your desired structure:
data = [
{"name":"FT002","data":{"2017-11-01":1392.0}},
{"name":"FT004","data":{"2017-11-01":4091.0}},
{"name":"FT002","data":{"2017-12-01":1279.0}},
{"name":"FT004","data":{"2017-12-01":3249.0}}
]
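hash = Hash.new { |h, key| h[key] = {} }
# index each row explicitly; destructuring a yielded hash into block keywords (|name:, data:|) fails on Ruby 3
data.each { |row| hash[row[:name]].merge!(row[:data]) }
hash = hash.map { |k,v| { name: k, data: v } }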
data.group_by { |h| h[:name] }.map do |k,arr|
{ name: k, data: arr.each_with_object({}) { |g,h| h.update(g[:data]) } }
end
#=> [{:name=>"FT002", :data=>{:"2017-11-01"=>1392.0, :"2017-12-01"=>1279.0}},
# {:name=>"FT004", :data=>{:"2017-11-01"=>4091.0, :"2017-12-01"=>3249.0}}]
The first step is to use Enumerable#group_by to produce the following hash.
data.group_by { |h| h[:name] }
#=> {"FT002"=>[
# {:name=>"FT002", :data=>{:"2017-11-01"=>1392.0}},
# {:name=>"FT002", :data=>{:"2017-12-01"=>1279.0}}
# ],
# "FT004"=>[
# {:name=>"FT004", :data=>{:"2017-11-01"=>4091.0}},
# {:name=>"FT004", :data=>{:"2017-12-01"=>3249.0}}
# ]
# }
The second step is to simply manipulate the keys and values of this hash. See Hash#update (aka merge!).
An alternative to the second step is the following.
data.group_by { |h| h[:name] }.map do |k,arr|
{ name: k, data: arr.map { |g| g[:data].flatten }.to_h }
end
Note that this uses Hash#flatten, not Array#flatten.
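Another way to write the second step, sketched here under the assumption that the keys are symbols (which is what the "name": literal syntax produces), is to merge the :data hashes of each group. Unlike the flatten variant, this does not assume one date per row.

```ruby
data = [
  { name: "FT002", data: { "2017-11-01": 1392.0 } },
  { name: "FT004", data: { "2017-11-01": 4091.0 } },
  { name: "FT002", data: { "2017-12-01": 1279.0 } },
  { name: "FT004", data: { "2017-12-01": 3249.0 } }
]

result = data.group_by { |h| h[:name] }.map do |name, rows|
  # Fold each row's :data hash into one; later rows win on duplicate dates.
  { name: name, data: rows.map { |r| r[:data] }.reduce({}, :merge) }
end

p result.map { |r| r[:name] } # => ["FT002", "FT004"]
```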
This should generate the results you're looking for:
data = [
{"name":"FT002","data":{"2017-11-01":1392.0}},
{"name":"FT004","data":{"2017-11-01":4091.0}},
{"name":"FT002","data":{"2017-12-01":1279.0}},
{"name":"FT004","data":{"2017-12-01":3249.0}}
]
newData = {}
data.each do |x|
newData[x[:name]] ||= [] # plain Ruby; present? is only available with ActiveSupport
newData[x[:name]].push x[:data]
end
combined = []
newData.each do |index,value|
dateData = {}
value.each do |dateStuff|
dateStuff.each do |dateIndex, dateValue|
dateData[dateIndex] = dateValue
end
end
values = {"name": index, "data": dateData}
combined.push values
end
combined

Best way to merge key value pairs in a hash based on number of values for that key in Ruby

I have a hash of arrays in Ruby:
@people = { "a" => ["john", "mark", "tony"], "b"=> ["tom","tim"],
"c" =>["jane"], "others"=>["rob", "ryan"] }
I would like to merge all key/value pairs whose array has fewer than 3 items. They should be merged into the key called "others", giving roughly this result:
@people = { "a" => ["john", "mark", "tony"],
"others"=> ["rob", "ryan", "tom", "tim", "jane"] }
The following code is problematic because a hash cannot contain duplicate keys:
@people = Hash[@people.map{|k,v| v.count<3 ? ["others",v] : [k,v]} ]
What's the best way to solve this elegantly?
You almost have it, the problem is, as you notice, that you can't build the Hash's key/value pairs on the fly because of duplicates. One way around the problem is to start out with the skeleton of what you're trying to build:
@people = @people.each_with_object({ 'others' => [ ] }) do |(k,v), h|
if(v.length >= 3)
h[k] = v
else
h['others'] += v
end
end
Or, if you don't like each_with_object, you could:
h = { 'others' => [ ] }
@people.each do |k, v|
# as above
end
@people = h
Or you could use pretty much the same structure with inject (taking care, as usual, to return the right thing from the block).
There are certainly other ways to do this but these approaches are pretty clear and easy to understand; IMO clarity should be your first goal.
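The inject variant mentioned above might look like the following sketch. It uses a local variable people rather than an instance variable, purely so the snippet runs on its own.

```ruby
people = {
  "a" => ["john", "mark", "tony"], "b" => ["tom", "tim"],
  "c" => ["jane"], "others" => ["rob", "ryan"]
}

merged = people.inject({ "others" => [] }) do |h, (k, v)|
  if v.length >= 3
    h[k] = v
  else
    h["others"] += v
  end
  h # inject needs the accumulator back from every iteration
end

p merged["a"]      # => ["john", "mark", "tony"]
p merged["others"] # => ["tom", "tim", "jane", "rob", "ryan"]
```

Forgetting the trailing h would hand the result of the if expression (an array) to the next iteration, which is the usual pitfall with inject.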
Try:
>> @people = { "a" => ["john", "mark", "tony"], "b"=> ["tom","tim"],
"c" =>["jane"], "others"=>["rob", "ryan"] }
>> @new_people = {"others" => []}
>> @people.each_pair {|k,v| (v.size >= 3 && k!="others") ? @new_people.merge!(k=>v) : @new_people['others']+= v}
>> @new_people
=> {"others"=>["rob", "ryan", "jane", "tom", "tim"], "a"=>["john", "mark", "tony"]}
Hash[ @people.group_by { |k,v| v.size < 3 ? 'others' : k }.
map { |k,v| [k, v.flat_map(&:last)] } ]
=> {"a"=>["john", "mark", "tony"],
"others"=>["tom", "tim", "jane", "rob", "ryan"]}
What about this:
> three_or_more, others = @people.partition {|(key, values)| values.size >= 3 }
> Hash[three_or_more]
# => {"a"=>["john", "mark", "tony"]}
> Hash["others" => others.map {|o| o.last}.flatten]
# => {"others"=>["tom", "tim", "jane", "rob", "ryan"]}
@people[:others] = []
@people.each do |k, v|
@people[:others] |= @people.delete(k) if v.size < 3
end
@people.inject({}) do |m, (k, v)|
m[i = v.size >= 3 ? k : 'others'] = m[i].to_a + v
m
end

What is an eloquent way to sort an array of hashes based on whether a key is empty in Ruby?

array = [{ name:'Joe', foo:'bar' },
{ name:'Bob', foo:'' },
{ name:'Hal', foo:'baz' }
]
What is an eloquent way to sort so that if foo is empty, then put it at the end, and not change the order of the other elements?
Ruby 1.9.3
array.partition { |h| !h[:foo].empty? }.flatten
array.find_all{|elem| !elem[:foo].empty?} + array.find_all{|elem| elem[:foo].empty?}
returns
[{:name=>"Joe", :foo=>"bar"}, {:name=>"Hal", :foo=>"baz"}, {:name=>"Bob", :foo=>""}]
array = [
{ name:'Joe', foo:'bar' },
{ name:'Bob', foo:'' },
{ name:'Hal', foo:'baz' }
]
arraydup = array.dup
array.delete_if{ |h| h[:foo].empty? }
array += (arraydup - array)
Which results in:
[
[0] {
:name => "Joe",
:foo => "bar"
},
[1] {
:name => "Hal",
:foo => "baz"
},
[2] {
:name => "Bob",
:foo => ""
}
]
With a little refactoring:
array += ((array.dup) - array.delete_if{ |h| h[:foo].empty? })
One can produce keys as tuples, where the first part indicates null/not-null, and the second part is the original index, then sort_by [nulls_last, original_index].
def sort_nulls_last_preserving_original_order array
array.map.with_index.
sort_by { |h,i| [ (h[:foo].empty? ? 1 : 0), i ] }.
map(&:first)
end
Note this avoids all the gross array mutation of some of the other answers and is constructed from pure functional transforms.
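A more compact spelling of the same idea, offered only as a sketch, chains with_index onto the enumerator that sort_by returns when called without a block:

```ruby
array = [
  { name: "Joe", foo: "bar" },
  { name: "Bob", foo: "" },
  { name: "Hal", foo: "baz" }
]

# sort_by is not guaranteed to be stable, so the original index breaks ties.
sorted = array.sort_by.with_index { |h, i| [h[:foo].empty? ? 1 : 0, i] }

p sorted.map { |h| h[:name] } # => ["Joe", "Hal", "Bob"]
```

The original array is left untouched; sorted is a new array.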
array.each_with_index do |item, index|
array << (array.delete_at(index)) if item[:foo].blank?
end
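Use whatever you have in place of blank? (it comes from Rails' ActiveSupport; plain Ruby strings have empty?). Be aware that this moves elements while iterating over the same array, so the element right after a moved one is skipped, and two consecutive empty entries will not both end up at the end.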

Convert cartesian product to nested hash in ruby

I have a structure with a cartesian product that looks like this (and could go out to arbitrary depth)...
variables = ["var1","var2",...]
myhash = {
{"var1"=>"a", "var2"=>"a", ...}=>1,
{"var1"=>"a", "var2"=>"b", ...}=>2,
{"var1"=>"b", "var2"=>"a", ...}=>3,
{"var1"=>"b", "var2"=>"b", ...}=>4,
}
... it has a fixed structure but I'd like simple indexing so I'm trying to write a method to convert it to this :
nested = {
"a"=> {
"a"=> 1,
"b"=> 2
},
"b"=> {
"a"=> 3,
"b"=> 4
}
}
Any clever ideas (that allow for arbitrary depth)?
Maybe like this (not the cleanest way):
def cartesian_to_map(myhash)
{}.tap do |hash|
myhash.each do |h|
(hash[h[0]["var1"]] ||= {}).merge!({h[0]["var2"] => h[1]})
end
end
end
Result:
puts cartesian_to_map(myhash).inspect
{"a"=>{"a"=>1, "b"=>2}, "b"=>{"a"=>3, "b"=>4}}
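The method above is tied to exactly two variables. For arbitrary depth, one possible sketch walks the variable list and creates intermediate levels on the way down. The method name nest_by and the third variable "var3" are made up for illustration.

```ruby
# Sketch: nest by any number of variables (nest_by is a hypothetical name).
def nest_by(myhash, variables)
  myhash.each_with_object({}) do |(combo, value), nested|
    *path, leaf = combo.values_at(*variables)
    # Walk (and create) the intermediate levels, then store the value at the leaf.
    inner = path.reduce(nested) { |level, key| level[key] ||= {} }
    inner[leaf] = value
  end
end

# Hypothetical three-variable input.
myhash = {
  { "var1" => "a", "var2" => "a", "var3" => "x" } => 1,
  { "var1" => "a", "var2" => "b", "var3" => "x" } => 2,
  { "var1" => "b", "var2" => "a", "var3" => "y" } => 3
}

nested = nest_by(myhash, ["var1", "var2", "var3"])
p nested["a"]["b"]["x"] # => 2
p nested["b"]["a"]["y"] # => 3
```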
Here is my example.
It uses a method index(hash, fields) that takes the hash, and the fields you want to index by.
It's dirty, and uses a local variable to pass up the current level in the index.
I bet you can make it much nicer.
def index(hash, fields)
# store the last index of the fields
last_field = fields.length - 1
# our indexed version
indexed = {}
hash.each do |key, value|
# our current point in the indexed hash
point = indexed
fields.each_with_index do |field, i|
key_field = key[field]
if i == last_field
point[key_field] = value
else
# ensure the next point is a hash
point[key_field] ||= {}
# move our point up
point = point[key_field]
end
end
end
# return our indexed hash
indexed
end
You can then just call
index(myhash, ["var1", "var2"])
And it should look like what you want
index({
{"var1"=>"a", "var2"=>"a"} => 1,
{"var1"=>"a", "var2"=>"b"} => 2,
{"var1"=>"b", "var2"=>"a"} => 3,
{"var1"=>"b", "var2"=>"b"} => 4,
}, ["var1", "var2"])
==
{
"a"=> {
"a"=> 1,
"b"=> 2
},
"b"=> {
"a"=> 3,
"b"=> 4
}
}
It seems to work.
(See it as a gist: https://gist.github.com/1126580)
Here's an ugly-but-effective solution:
nested = Hash[ myhash.group_by{ |h,n| h["var1"] } ].tap{ |nested|
nested.each do |v1,a|
nested[v1] = a.group_by{ |h,n| h["var2"] }
nested[v1].each{ |v2,a| nested[v1][v2] = a.flatten.last }
end
}
p nested
#=> {"a"=>{"a"=>1, "b"=>2}, "b"=>{"a"=>3, "b"=>4}}
You might consider an alternative representation that is easier to map to and (IMO) just as easy to index:
paired = Hash[ myhash.map{ |h,n| [ [h["var1"],h["var2"]], n ] } ]
p paired
#=> {["a", "a"]=>1, ["a", "b"]=>2, ["b", "a"]=>3, ["b", "b"]=>4}
p paired[["a","b"]]
#=> 2
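The paired representation extends to any number of variables if the key is built with values_at. A sketch, assuming a variables array like the one in the question and a made-up third variable:

```ruby
variables = ["var1", "var2", "var3"] # "var3" is a hypothetical third variable

myhash = {
  { "var1" => "a", "var2" => "a", "var3" => "x" } => 1,
  { "var1" => "a", "var2" => "b", "var3" => "x" } => 2,
  { "var1" => "b", "var2" => "a", "var3" => "y" } => 3
}

# Use the ordered list of variable values as the lookup key.
paired = myhash.map { |combo, n| [combo.values_at(*variables), n] }.to_h

p paired[["a", "b", "x"]] # => 2
```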
