Convert Table to Multidimensional Hash in Ruby

I know that there must be some simple and elegant way to do this, but I'm drawing a blank.
I have a table (or group of key value pairs)
id,val
64664,68
64665,65
64666,53
64667,68
64668,6
64668,27
64668,33
64669,12
In most cases there is one value per id. In some cases there are multiples.
I want to end up with a hash in which each id that has multiple values maps to an array of those values,
something like this:
{ 64664 => 68,
64665 => 65,
64666 => 53,
64667 => 68,
64668 => [6,27,33],
64669 => 12
}
Any brilliant ideas?

You can use Hash#merge to merge two hashes. Using Enumerable#inject, you can get what you want.
tbl = [
[64664, 68],
[64665, 65],
[64666, 53],
[64667, 68],
[64668, 6],
[64668, 27],
[64668, 33],
[64669, 12],
]
# Convert the table to array of hashes
hashes = tbl.map { |id, val|
{id => val}
}
# Merge the hashes
hashes.inject { |h1, h2|
h1.merge(h2) { |key,old,new|
(old.is_a?(Array) ? old : [old]) << new
}
}
# => {64664=>68, 64665=>65, 64666=>53, 64667=>68, 64668=>[6, 27, 33], 64669=>12}

values = [
[64664, 68],
[64665, 65],
[64666, 53],
[64667, 68],
[64668, 6],
[64668, 27],
[64668, 33],
[64669, 12],
]
# When key not present, create new empty array as default value
h = Hash.new{|hash,key| hash[key]=[]}
values.each{|(k,v)| h[k] << v}
p h #=>{64664=>[68], 64665=>[65], 64666=>[53], 64667=>[68], 64668=>[6, 27, 33], 64669=>[12]}
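Note that the Hash.new version above wraps every value in an array, including ids with a single value. If you want the mixed shape from the question, here is a minimal sketch using group_by and transform_values (this assumes Ruby 2.4 or later for transform_values; the variable names are only illustrative):

```ruby
# Sample rows copied from the question (shortened).
rows = [[64664, 68], [64665, 65], [64668, 6], [64668, 27], [64668, 33], [64669, 12]]

# Group the rows by id, then keep only the values for each id.
grouped = rows.group_by(&:first).transform_values { |pairs| pairs.map(&:last) }

# Unwrap single-element arrays to match the shape asked for in the question.
collapsed = grouped.transform_values { |vals| vals.size == 1 ? vals.first : vals }

p collapsed
```

Keeping every value as an array (the grouped hash) is usually easier to consume, since callers do not have to check whether they got a number or an array.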

Related

Merge hash of arrays into array of hashes

So, I have a hash with arrays, like this one:
{"name": ["John","Jane","Chris","Mary"], "surname": ["Doe","Doe","Smith","Martins"]}
I want to merge them into an array of hashes, combining the corresponding elements.
The results should be like that:
[{"name"=>"John", "surname"=>"Doe"}, {"name"=>"Jane", "surname"=>"Doe"}, {"name"=>"Chris", "surname"=>"Smith"}, {"name"=>"Mary", "surname"=>"Martins"}]
Any idea how to do that efficiently?
Please, note that the real-world use scenario could contain a variable number of hash keys.
Try this
h[:name].zip(h[:surname]).map do |name, surname|
{ 'name' => name, 'surname' => surname }
end
I suggest writing the code to permit arbitrary numbers of attributes. It's no more difficult than assuming there are two (:name and :surname), yet it provides greater flexibility, accommodating, for example, future changes to the number or naming of attributes:
def squish(h)
keys = h.keys.map(&:to_s)
h.values.transpose.map { |a| keys.zip(a).to_h }
end
h = { name: ["John", "Jane", "Chris"],
surname: ["Doe", "Doe", "Smith"],
age: [22, 34, 96]
}
squish(h)
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
The steps for the example above are as follows:
b = h.keys
#=> [:name, :surname, :age]
keys = b.map(&:to_s)
#=> ["name", "surname", "age"]
c = h.values
#=> [["John", "Jane", "Chris"], ["Doe", "Doe", "Smith"], [22, 34, 96]]
d = c.transpose
#=> [["John", "Doe", 22], ["Jane", "Doe", 34], ["Chris", "Smith", 96]]
d.map { |a| keys.zip(a).to_h }
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
In the last step the first element of d is passed to map's block and the block variable a is assigned its value.
a = d.first
#=> ["John", "Doe", 22]
e = keys.zip(a)
#=> [["name", "John"], ["surname", "Doe"], ["age", 22]]
e.to_h
#=> {"name"=>"John", "surname"=>"Doe", "age"=>22}
The remaining calculations are similar.
If your dataset is really big, you can consider using Enumerator::Lazy.
This way Ruby will not create intermediate arrays during calculations.
This is how @Ursus's answer can be improved:
h[:name]
.lazy
.zip(h[:surname])
.map { |name, surname| { 'name' => name, 'surname' => surname } }
.to_a
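To see the laziness at work, here is a small sketch that counts how many hashes actually get built when only the first two results are requested (the counter and the variable names are my own additions):

```ruby
h = { name: %w[John Jane Chris Mary], surname: %w[Doe Doe Smith Martins] }

built = 0 # counts how many times the map block runs
people = h[:name].lazy.zip(h[:surname]).map do |name, surname|
  built += 1
  { 'name' => name, 'surname' => surname }
end

# Nothing has been computed yet; first(2) pulls only two elements through the chain.
first_two = people.first(2)
p first_two
p built
```

With a plain (non-lazy) chain, the block would run once per element before first(2) is applied.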
Another option for the case where:
[..] the real-world use scenario could contain a variable number of hash keys
h = {
'name': ['John','Jane','Chris','Mary'],
'surname': ['Doe','Doe','Smith','Martins'],
'whathever': [1, 2, 3, 4, 5]
}
You could use Object#then with a splat operator in a one liner:
h.values.then { |a, *b| a.zip *b }.map { |e| (h.keys.zip e).to_h }
#=> [{:name=>"John", :surname=>"Doe", :whathever=>1}, {:name=>"Jane", :surname=>"Doe", :whathever=>2}, {:name=>"Chris", :surname=>"Smith", :whathever=>3}, {:name=>"Mary", :surname=>"Martins", :whathever=>4}]
The first part works this way:
h.values.then { |a, *b| a.zip *b }
#=> [["John", "Doe", 1], ["Jane", "Doe", 2], ["Chris", "Smith", 3], ["Mary", "Martins", 4]]
The last part just maps the elements zipping each with the original keys then calling Array#to_h to convert to hash.
Here I removed the call .to_h to show the intermediate result:
h.values.then { |a, *b| a.zip *b }.map { |e| h.keys.zip e }
#=> [[[:name, "John"], [:surname, "Doe"], [:whathever, 1]], [[:name, "Jane"], [:surname, "Doe"], [:whathever, 2]], [[:name, "Chris"], [:surname, "Smith"], [:whathever, 3]], [[:name, "Mary"], [:surname, "Martins"], [:whathever, 4]]]
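One thing worth knowing: in the example above the last array has five elements while the others have four, and zip silently drops the extra one. A short sketch of how zip and transpose differ on arrays of unequal length (the sample data here is made up):

```ruby
h = { name: %w[John Jane], age: [1, 2, 3] }

# zip follows the length of the receiver: extra elements in the arguments are ignored...
first, *rest = h.values
zipped = first.zip(*rest)

# ...and shorter arguments are padded with nil.
padded = [1, 2, 3].zip(%w[John Jane])

# transpose refuses ragged input and raises IndexError instead.
error_class = begin
  h.values.transpose
  nil
rescue IndexError => e
  e.class
end

p zipped, padded, error_class
```

So transpose is the stricter choice if mismatched lengths should be treated as a bug rather than quietly truncated.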
[h[:name], h[:surname]].transpose.map do |name, surname|
{ 'name' => name, 'surname' => surname }
end

Translate Ruby hash (key,value) to separate keys

I have a map function in ruby which returns an array of arrays with two values in each, which I want to have in a different format.
What I want to have:
"countries": [
{
"country": "Canada",
"count": 12
},
{and so on... }
]
But map obviously returns my values as array:
"countries": [
[
"Canada",
2
],
[
"Chile",
1
],
[
"China",
1
]
]
When using Array#to_h I am also able to bring it closer to the format I actually want to have.
"countries": {
"Canada": 2,
"Chile": 1,
"China": 1,
}
I have tried reduce/inject and each_with_object, but in both cases I do not understand how to access the incoming parameters. Searching here turns up many similar problems, but I haven't found a way to adapt them to my case.
Hope you can help to find a short and elegant solution.
You are given two arrays:
countries= [['Canada', 2], ['Chile', 1], ['China', 1]]
keys = [:country, :count]
You could write
[keys].product(countries).map { |arr| arr.transpose.to_h }
#=> [{:country=>"Canada", :count=>2},
# {:country=>"Chile", :count=>1},
# {:country=>"China", :count=>1}]
or simply
countries.map { |country, cnt| { country: country, count: cnt } }
#=> [{:country=>"Canada", :count=>2},
# {:country=>"Chile", :count=>1},
# {:country=>"China", :count=>1}]
but the first has the advantage that no code need be changed if the names of the keys change. In fact, there would be no need to change the code if the arrays countries and keys both changed, provided countries[i].size == keys.size for all i = 0..countries.size-1. (See the example at the end.)
The initial step for the first calculation is as follows.
a = [keys].product(countries)
#=> [[[:country, :count], ["Canada", 2]],
# [[:country, :count], ["Chile", 1]],
# [[:country, :count], ["China", 1]]]
See Array#product. We now have
a.map { |arr| arr.transpose.to_h }
map passes the first element of a to the block and sets the block variable arr to that value:
arr = a.first
#=> [[:country, :count], ["Canada", 2]]
The block calculation is then performed:
b = arr.transpose
#=> [[:country, "Canada"], [:count, 2]]
b.to_h
#=> {:country=>"Canada", :count=>2}
So we see that a[0] (arr) is mapped to {:country=>"Canada", :count=>2}. The next two elements of a are then passed to the block and similar calculations are made, after which map returns the desired array of three hashes. See Array#transpose and Array#to_h.
Here is a second example using the same code.
countries= [['Canada', 2, 9.09], ['Chile', 1, 0.74],
['China', 1, 9.33], ['France', 1, 0.55]]
keys = [:country, :count, :area]
[keys].product(countries).map { |arr| arr.transpose.to_h }
#=> [{:country=>"Canada", :count=>2, :area=>9.09},
# {:country=>"Chile", :count=>1, :area=>0.74},
# {:country=>"China", :count=>1, :area=>9.33},
# {:country=>"France", :count=>1, :area=>0.55}]
Just out of curiosity:
countries = [['Canada', 2], ['Chile', 1], ['China', 1]]
countries.map(&%i[country count].method(:zip)).map(&:to_h)
#⇒ [{:country=>"Canada", :count=>2},
# {:country=>"Chile", :count=>1},
# {:country=>"China", :count=>1}]
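Since the question mentions trouble accessing the incoming parameters with each_with_object, here is a minimal sketch of that variant. The parentheses in the block parameters destructure each two-element array, and the second parameter is the accumulator (the name acc is my own choice):

```ruby
countries = [['Canada', 2], ['Chile', 1], ['China', 1]]

# |(country, count), acc|: the first parameter is the current pair, unpacked
# by the parentheses; the second is the object passed to each_with_object.
result = countries.each_with_object([]) do |(country, count), acc|
  acc << { country: country, count: count }
end

p result
```

With inject/reduce the order is reversed (accumulator first, element second), and the block must return the accumulator on every iteration.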

Ruby : How to sort an array of hash in a given order of a particular key

I have an array of hashes, id being one of the keys in the hashes. I want to sort the array elements according to a given order of ID values.
Suppose my array(size=5) is:
[{"id"=>1, ...}, {"id"=>4, ...}, {"id"=>9, ...}, {"id"=>2, ...}, {"id"=>7, ...}]
I want to sort the array elements such that their ids are in the following order:
[1,3,5,7,9,2,4,6,8,10]
So the expected result is:
[{'id' => 1},{'id' => 7},{'id' => 9},{'id' => 2},{'id' => 4}]
Here is a solution for any custom index:
def my_index x
# Custom code can be added here to handle items not in the index.
# Currently an error will be raised if item is not part of the index.
[1,3,5,7,9,2,4,6,8,10].index(x)
end
my_collection = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
p my_collection.sort_by{|x| my_index x['id'] } #=> [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
Then you can format it in any way you want, maybe this is prettier:
my_index = [1,3,5,7,9,2,4,6,8,10]
my_collection.sort_by{|x| my_index.index x['id'] }
I would map over the desired order and pick the matching hash for each id (ids with no match yield nil and are removed by compact), like so:
a = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
[1,3,5,7,9,2,4,6,8,10].map{|x| a.find{|h| h["id"] == x} }.compact
#=> [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
General note on sorting. Use the #sort_by method on the array:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort_by {|x|x['id'] }
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
Or pass the #values method as the block:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort_by(&:values)
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
Or you can use the more explicit version with the #sort method:
[{'id' => 1},{'id'=>3},{'id'=>2}].sort {|x,y| x['id'] <=> y['id'] }
# => [{"id"=>1}, {"id"=>2}, {"id"=>3}]
For your case, to sort with an extended condition, use #% to put odd ids before even ids, with each group in ascending order:
[{'id' => 1},{'id'=>4},{'id'=>9},{'id'=>2},{'id'=>7}].sort do |x,y|
u = y['id'] % 2 <=> x['id'] % 2
u.zero? ? x['id'] <=> y['id'] : u
end
# => [{"id"=>1}, {"id"=>7}, {"id"=>9}, {"id"=>2}, {"id"=>4}]
To sort according to an arbitrary index array, placing ids that are absent from the index last:
index = [1,3,5,7,4,2,6,8,10] # swapped 2 and 4, 9 is absent
[{'id' => 1},{'id'=>4},{'id'=>9},{'id'=>2},{'id'=>7}].sort do |x,y|
!index.rindex( x[ 'id' ] ) && 1 || index.rindex( x[ 'id' ] ) <=> index.rindex( y[ 'id' ] ) || -1
end
# => [{"id"=>1}, {"id"=>7}, {"id"=>4}, {"id"=>2}, {"id"=>9}]
Why not just sort?
def doit(arr, order)
arr.sort { |h1,h2| order.index(h1['id']) <=> order.index(h2['id']) }
end
order = [1,3,5,7,9,2,4,6,8,10]
arr = [{'id' => 1}, {'id' => 4}, {'id' => 9}, {'id' => 2}, {'id' => 7}]
doit(arr, order)
#=> [{'id' => 1}, {'id' => 7}, {'id' => 9}, {'id' => 2}, {'id' => 4}]
a = [{"id"=>1}, {"id"=>4}, {"id"=>9}, {"id"=>2}, {"id"=>7}]
b = [1,3,5,7,9,2,4,6,8,10]
a.sort_by{|x| b.index(x['id'])}
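The answers above call index once per element (or per comparison), which rescans the order array each time and fails on ids that are not in it. A sketch of one way around both, assuming unknown ids should simply go last (the names rank and items, and the extra id 42, are my own):

```ruby
order = [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]

# Precompute each id's position once: {1=>0, 3=>1, 5=>2, ...}
rank = order.each_with_index.to_h

items = [{ 'id' => 1 }, { 'id' => 4 }, { 'id' => 9 }, { 'id' => 2 }, { 'id' => 7 }, { 'id' => 42 }]

# fetch falls back to order.size for ids missing from the order, so they sort last.
sorted = items.sort_by { |h| rank.fetch(h['id'], order.size) }

p sorted.map { |h| h['id'] }
```

Note that sort_by is not guaranteed to be stable, so several unknown ids would end up last in no particular order.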

How to remove outside array from joining two separate arrays with collect

I have an object called Grade with two attributes material and strength.
Grade.all.collect { |g| g.material }
#=> [steel, bronze, aluminium]
Grade.all.collect { |g| g.strength }
#=> [75, 22, 45]
Now I would like to combine both to get the following output:
[steel, 75], [bronze, 22], [aluminium, 45]
I currently do this
Grade.all.collect{|e| e.material}.zip(Grade.all.collect{|g| g.strength})
#=> [[steel, 75], [bronze, 22], [aluminium, 45]]
Note: I do not want the outside array [[steel, 75], [bronze, 22], [aluminium, 45]]
Any thoughts?
Splat the array to a mere list.
*Grade.all.collect{ |g| [g.material, g.strength] }
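Keep in mind that a bare splat is not a statement on its own; it only works where Ruby expects a list, such as the arguments of a call or an assignment. A minimal sketch, using a Struct as a hypothetical stand-in for the Grade model and a made-up lambda count_args:

```ruby
# Hypothetical stand-in for the Grade model, so the sketch runs on its own.
grade_class = Struct.new(:material, :strength)
grades = [grade_class.new('steel', 75), grade_class.new('bronze', 22), grade_class.new('aluminium', 45)]

# A single pass builds the pairs; no need to collect twice and zip.
pairs = grades.map { |g| [g.material, g.strength] }

# Splat in a call: each pair becomes its own argument.
count_args = ->(*args) { args.size }
arg_count = count_args.call(*pairs)

# Destructuring assignment: each pair lands in its own variable.
first, second, third = pairs

p arg_count, first, second, third
```

Outside such contexts the pairs have to live in some container, so the outer array cannot simply be dropped.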

How to Print to Different Files on the Fly?

How can I print the contents of a dynamically generated and sorted array to different files based on their content?
For example, let's say we have the following multi-dimensional array which is sorted by the second column
[ ['Steph', 'Allen', 29], ['Jon', 'Doe', 30], ['Jane', 'Doe', 30], ['Tom', 'Moore', 28] ]
The goal is to have 3 files:
last_name-Allen.txt <-- Contains Steph Allen 29
last_name-Doe.txt <-- Contains Jon Doe 30 Jane Doe 30
last_name-Moore.txt <-- Contains Tom Moore 28
ar = [ ['Steph', 'Allen', 29], ['Jon', 'Doe', 30], ['Jane', 'Doe', 30], ['Tom', 'Moore', 28] ]
grouped = ar.group_by{|el| el[1] }
# {"Allen"=>[["Steph", "Allen", 29]], "Doe"=>[["Jon", "Doe", 30], ["Jane", "Doe", 30]], "Moore"=>[["Tom", "Moore", 28]]}
grouped.each do |last_name, record|
File.open("last_name-#{last_name}.txt",'w') do |f|
f.puts record.join(' ')
end
end
If you wanted to do this in Groovy, you could use the groupBy method to get a map based on surname like so:
// Start with your list
def list = [ ['Steph', 'Allen', 29], ['Jon', 'Doe', 30], ['Jane', 'Doe', 30], ['Tom', 'Moore', 28] ]
// Group it by the second element...
def grouped = list.groupBy { it[ 1 ] }
println grouped
prints
[Allen:[[Steph, Allen, 29]], Doe:[[Jon, Doe, 30], [Jane, Doe, 30]], Moore:[[Tom, Moore, 28]]]
Then, iterate through this map, opening a new file for each surname and writing the content in (tab separated in this example)
grouped.each { surname, contents ->
new File( "last_name-${surname}.txt" ).withWriter { out ->
contents.each { person ->
out.writeLine( person.join( '\t' ) )
}
}
}
In ruby:
array.each{|first, last, age| File.open("last_name-#{last}.txt", "a"){|io| io.write([first, last, age, nil].join(" "))}}
This leaves a trailing space at the end of the file, so that entries stay separated when another one is appended.
Use a hash with the last name as the key, then iterate over the hash and write each key/value pair to its own file.
In Groovy you can do this:
def a = [['Steph', 'Allen', 29], ['Jon', 'Doe', 30], ['Jane', 'Doe', 30], ['Tom', 'Moore', 28]]
a.each {
def name = "last_name-${it[1]}.txt"
new File(name) << it.toString()
}
There is probably a shorter (groovier) way to do this.
You can create a hash with the "second column" as the key and a file handle as the value. If the key is already in the hash, just fetch the file handle and write; otherwise create a new file handle and insert it into the hash.
This answer is in Ruby:
# hash which opens appropriate file on first access
files = Hash.new { |hash, surname| hash[surname] = File.open("last_name-#{surname}.txt", "w") }
list.each do |first, last, age|
files[last].puts [first, last, age].join(" ")
end
# closes all the file handles
files.values.each(&:close)
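To try this out without cluttering the working directory, here is a sketch of the group_by approach that writes into a temporary directory and reads the files back, one record per line. Dir.mktmpdir comes from the standard tmpdir library; the names people and written are my own:

```ruby
require 'tmpdir'

people = [['Steph', 'Allen', 29], ['Jon', 'Doe', 30], ['Jane', 'Doe', 30], ['Tom', 'Moore', 28]]
written = {}

# The temporary directory is removed automatically when the block ends.
Dir.mktmpdir do |dir|
  people.group_by { |_first, last, _age| last }.each do |last, rows|
    path = File.join(dir, "last_name-#{last}.txt")
    # One line per person; File.write opens, writes and closes in one call.
    File.write(path, rows.map { |row| row.join(' ') }.join("\n") + "\n")
    written[last] = File.read(path)
  end
end

p written
```

Grouping first means each file is opened exactly once, so there are no handles to keep track of or close afterwards.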
