Array of Hashes to CSV with Ruby [closed]

I'm working on creating a CSV file of orders from an array of hashes. The name and id in the CSV output need to be the same on each line, but the number of lines for each order depends on how many SKUs are in the order.
Is there a simple way to output this orders array?
orders = []
orders << { name:"bob", id:123, sku:[ "123a", "456b", "xyz1" ], qty:[ 2, 4, 1 ] }
orders << { name:"kat", id:987, sku:[ "456b", "aaa0", "xyz1" ], qty:[ 8, 9, 5 ] }
orders << { name:"kat", id:222, sku:[ "123a" ], qty:[ 4 ] }
To a CSV file like this:
name,id,sku,qty
bob,123,123a,2
bob,123,456b,4
bob,123,xyz1,1
kat,987,456b,8
kat,987,aaa0,9
kat,987,xyz1,5
kat,222,123a,4

First of all, let’s prepare the array to dump to CSV:
asarray = orders.map { |e|
  [e[:name], e[:id], e[:sku].zip(e[:qty])]
}.map { |e|
  e.last.map { |sq| [*e[0..1], *sq] }
}
Now we have a raw array ready to be serialized to CSV:
require 'csv'
CSV.open("path/to/file.csv", "wb") do |csv|
  csv << ["name", "id", "sku", "qty"]
  asarray.each { |order|
    order.each { |row|
      csv << row
    }
  }
end
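If you prefer a single pass, the same rows can be produced with flat_map and fed to CSV in one go. A minimal sketch, assuming (as above) that each order's :sku and :qty arrays have matching lengths:
require 'csv'
rows = orders.flat_map do |o|
  # one [name, id, sku, qty] row per sku/qty pair
  o[:sku].zip(o[:qty]).map { |sku, qty| [o[:name], o[:id], sku, qty] }
end
CSV.open("path/to/file.csv", "wb") do |csv|
  csv << ["name", "id", "sku", "qty"]
  rows.each { |row| csv << row }
end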

As a variant:
orders = []
orders << { name:"bob", id:123, sku:[ "123a", "456b", "xyz1" ], qty:[ 2, 4, 1 ] }
orders << { name:"kat", id:987, sku:[ "456b", "aaa0", "xyz1" ], qty:[ 8, 9, 5 ] }
orders << { name:"kat", id:222, sku:[ "123a" ], qty:[ 4 ] }
csv = ''
orders.each do |el|
  el[:qty].length.times do |idx|
    csv += "#{el[:name]},#{el[:id]},#{el[:sku][idx]},#{el[:qty][idx]}\n"
  end
end
puts csv
Result:
#> bob,123,123a,2
#> bob,123,456b,4
#> bob,123,xyz1,1
#> kat,987,456b,8
#> kat,987,aaa0,9
#> kat,987,xyz1,5
#> kat,222,123a,4
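Note that building the CSV string by hand works for this data, but it will not quote fields that contain commas or quotes. If that could ever happen, a sketch of the same loop using CSV.generate (which handles quoting) might look like this:
require 'csv'
csv = CSV.generate do |out|
  out << ["name", "id", "sku", "qty"]
  orders.each do |el|
    el[:qty].length.times do |idx|
      out << [el[:name], el[:id], el[:sku][idx], el[:qty][idx]]
    end
  end
end
puts csv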

Try this:
module SKUSeparator
  def map_by_skus
    inject([]) do |csv, order|
      order[:sku].each_with_index do |sku, index|
        csv << [order[:name], order[:id], order[:sku][index], order[:qty][index]]
      end
      csv
    end
  end

  def to_csv
    map_by_skus.map { |line| line.join(",") }.join("\n")
  end
end
ORDERS = [
  {:name=>"bob", :id=>123, :sku=>["123a", "456b", "xyz1"], :qty=>[2, 4, 1]},
  {:name=>"kat", :id=>987, :sku=>["456b", "aaa0", "xyz1"], :qty=>[8, 9, 5]},
  {:name=>"kat", :id=>222, :sku=>["123a"], :qty=>[4]}
]
ORDERS.extend(SKUSeparator).map_by_skus # =>
# [
# ["bob", 123, "123a", 2],
# ["bob", 123, "456b", 4],
# ["bob", 123, "xyz1", 1],
# ["kat", 987, "456b", 8],
# ["kat", 987, "aaa0", 9],
# ["kat", 987, "xyz1", 5],
# ["kat", 222, "123a", 4]
# ]
ORDERS.extend(SKUSeparator).to_csv # =>
# bob,123,123a,2
# bob,123,456b,4
# bob,123,xyz1,1
# kat,987,456b,8
# kat,987,aaa0,9
# kat,987,xyz1,5
# kat,222,123a,4
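A possible way to use the module for an actual file (the header row and the file name here are illustrative, not part of the answer above):
header = "name,id,sku,qty"
File.write("orders.csv", header + "\n" + ORDERS.extend(SKUSeparator).to_csv + "\n")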

Related

Merge hash of arrays into array of hashes

So, I have a hash with arrays, like this one:
{"name": ["John","Jane","Chris","Mary"], "surname": ["Doe","Doe","Smith","Martins"]}
I want to merge them into an array of hashes, combining the corresponding elements.
The result should be like this:
[{"name"=>"John", "surname"=>"Doe"}, {"name"=>"Jane", "surname"=>"Doe"}, {"name"=>"Chris", "surname"=>"Smith"}, {"name"=>"Mary", "surname"=>"Martins"}]
Any idea how to do that efficiently?
Please note that the real-world use scenario could contain a variable number of hash keys.
Try this
h[:name].zip(h[:surname]).map do |name, surname|
  { 'name' => name, 'surname' => surname }
end
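A quick check with the hash from the question (note that the "name": literal syntax produces symbol keys, which is why h[:name] works):
h = { name: ["John", "Jane", "Chris", "Mary"], surname: ["Doe", "Doe", "Smith", "Martins"] }
h[:name].zip(h[:surname]).map do |name, surname|
  { 'name' => name, 'surname' => surname }
end
#=> [{"name"=>"John", "surname"=>"Doe"}, {"name"=>"Jane", "surname"=>"Doe"},
#    {"name"=>"Chris", "surname"=>"Smith"}, {"name"=>"Mary", "surname"=>"Martins"}]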
I suggest writing the code to permit arbitrary numbers of attributes. It's no more difficult than assuming there are two (:name and :surname), yet it provides greater flexibility, accommodating, for example, future changes to the number or naming of attributes:
def squish(h)
  keys = h.keys.map(&:to_s)
  h.values.transpose.map { |a| keys.zip(a).to_h }
end

h = { name: ["John", "Jane", "Chris"],
      surname: ["Doe", "Doe", "Smith"],
      age: [22, 34, 96]
    }
squish(h)
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
The steps for the example above are as follows:
b = h.keys
#=> [:name, :surname, :age]
keys = b.map(&:to_s)
#=> ["name", "surname", "age"]
c = h.values
#=> [["John", "Jane", "Chris"], ["Doe", "Doe", "Smith"], [22, 34, 96]]
d = c.transpose
#=> [["John", "Doe", 22], ["Jane", "Doe", 34], ["Chris", "Smith", 96]]
d.map { |a| keys.zip(a).to_h }
#=> [{"name"=>"John", "surname"=>"Doe", "age"=>22},
# {"name"=>"Jane", "surname"=>"Doe", "age"=>34},
# {"name"=>"Chris", "surname"=>"Smith", "age"=>96}]
In the last step the first element of d is passed to map's block and the block variable a is assigned its value.
a = d.first
#=> ["John", "Doe", 22]
e = keys.zip(a)
#=> [["name", "John"], ["surname", "Doe"], ["age", 22]]
e.to_h
#=> {"name"=>"John", "surname"=>"Doe", "age"=>22}
The remaining calculations are similar.
If your dataset is really big, you can consider using Enumerator::Lazy.
This way Ruby will not create intermediate arrays during calculations.
This is how @Ursus's answer can be improved:
h[:name]
  .lazy
  .zip(h[:surname])
  .map { |name, surname| { 'name' => name, 'surname' => surname } }
  .to_a
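A small illustration of the laziness, using the question's sample data: taking only the first two results touches only two pairs and never builds the full intermediate array.
h = { name: ["John", "Jane", "Chris", "Mary"], surname: ["Doe", "Doe", "Smith", "Martins"] }
h[:name]
  .lazy
  .zip(h[:surname])
  .map { |name, surname| { 'name' => name, 'surname' => surname } }
  .first(2)
#=> [{"name"=>"John", "surname"=>"Doe"}, {"name"=>"Jane", "surname"=>"Doe"}]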
Another option, for the case where:
[..] the real-world use scenario could contain a variable number of hash keys
h = {
  'name': ['John','Jane','Chris','Mary'],
  'surname': ['Doe','Doe','Smith','Martins'],
  'whathever': [1, 2, 3, 4, 5]
}
You could use Object#then with a splat operator in a one-liner:
h.values.then { |a, *b| a.zip *b }.map { |e| (h.keys.zip e).to_h }
#=> [{:name=>"John", :surname=>"Doe", :whathever=>1}, {:name=>"Jane", :surname=>"Doe", :whathever=>2}, {:name=>"Chris", :surname=>"Smith", :whathever=>3}, {:name=>"Mary", :surname=>"Martins", :whathever=>4}]
The first part works this way:
h.values.then { |a, *b| a.zip *b }
#=> [["John", "Doe", 1], ["Jane", "Doe", 2], ["Chris", "Smith", 3], ["Mary", "Martins", 4]]
The last part maps the elements, zipping each with the original keys and calling Array#to_h to convert each to a hash.
Here I removed the .to_h call to show the intermediate result:
h.values.then { |a, *b| a.zip *b }.map { |e| h.keys.zip e }
#=> [[[:name, "John"], [:surname, "Doe"], [:whathever, 1]], [[:name, "Jane"], [:surname, "Doe"], [:whathever, 2]], [[:name, "Chris"], [:surname, "Smith"], [:whathever, 3]], [[:name, "Mary"], [:surname, "Martins"], [:whathever, 4]]]
[h[:name], h[:surname]].transpose.map do |name, surname|
  { 'name' => name, 'surname' => surname }
end

Translate Ruby hash (key,value) to separate keys

I have a map call in Ruby that returns an array of arrays with two values each, which I want in a different format.
What I want to have:
"countries": [
  {
    "country": "Canada",
    "count": 12
  },
  {and so on... }
]
But map obviously returns my values as array:
"countries": [
  [
    "Canada",
    2
  ],
  [
    "Chile",
    1
  ],
  [
    "China",
    1
  ]
]
Using Array#to_h I am also able to bring it closer to the format I actually want:
"countries": {
  "Canada": 2,
  "Chile": 1,
  "China": 1
}
I have tried reduce/inject and each_with_object, but in both cases I do not understand how to access the incoming parameters. Searching here turns up many similar problems, but I haven't found a way to adapt them to my case.
I hope you can help me find a short and elegant solution.
You are given two arrays:
countries = [['Canada', 2], ['Chile', 1], ['China', 1]]
keys = [:country, :count]
You could write
[keys].product(countries).map { |arr| arr.transpose.to_h }
#=> [{:country=>"Canada", :count=>2},
# {:country=>"Chile", :count=>1},
# {:country=>"China", :count=>1}]
or simply
countries.map { |country, cnt| { country: country, count: cnt } }
#=> [{:country=>"Canada", :count=>2},
# {:country=>"Chile", :count=>1},
# {:country=>"China", :count=>1}]
but the first has the advantage that no code need be changed if the names of the keys change. In fact, there would be no need to change the code if the arrays countries and keys both changed, provided countries[i].size == keys.size for all i = 0..countries.size-1. (See the example at the end.)
The initial step for the first calculation is as follows.
a = [keys].product(countries)
#=> [[[:country, :count], ["Canada", 2]],
# [[:country, :count], ["Chile", 1]],
# [[:country, :count], ["China", 1]]]
See Array#product. We now have
a.map { |arr| arr.transpose.to_h }
map passes the first element of a to the block and sets the block variable arr to that value:
arr = a.first
#=> [[:country, :count], ["Canada", 2]]
The block calculation is then performed:
b = arr.transpose
#=> [[:country, "Canada"], [:count, 2]]
b.to_h
#=> {:country=>"Canada", :count=>2}
So we see that a[0] (arr) is mapped to {:country=>"Canada", :count=>2}. The next two elements of a are then passed to the block and similar calculations are made, after which map returns the desired array of three hashes. See Array#transpose and Array#to_h.
Here is a second example using the same code.
countries = [['Canada', 2, 9.09], ['Chile', 1, 0.74],
             ['China', 1, 9.33], ['France', 1, 0.55]]
keys = [:country, :count, :area]
[keys].product(countries).map { |arr| arr.transpose.to_h }
#=> [{:country=>"Canada", :count=>2, :area=>9.09},
# {:country=>"Chile", :count=>1, :area=>0.74},
# {:country=>"China", :count=>1, :area=>9.33},
# {:country=>"France", :count=>1, :area=>0.55}]
Just out of curiosity:
countries = [['Canada', 2], ['Chile', 1], ['China', 1]]
countries.map(&%i[country count].method(:zip)).map(&:to_h)
#⇒ [{:country=>"Canada", :count=>2},
# {:country=>"Chile", :count=>1},
# {:country=>"China", :count=>1}]
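For readers puzzled by the &...method(:zip) construct: it is roughly equivalent to the more explicit form below (a sketch with the same data), where each ["Canada", 2] pair is zipped against the key array before to_h builds the hash.
keys = %i[country count]
countries.map { |pair| keys.zip(pair) }.map(&:to_h)
#=> [{:country=>"Canada", :count=>2},
#    {:country=>"Chile", :count=>1},
#    {:country=>"China", :count=>1}]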

Convert JSON to a multidimensional array in Ruby

How can I convert a string of JSON data to a multidimensional array?
# Begin with JSON
json_data = '[
  {"id":1,"name":"Don"},
  {"id":2,"name":"Bob"},
  ...
]'
# do something here to convert the JSON data to array of arrays.
# End with multidimensional arrays
array_data = [
  ["id", "name"],
  [1, "Don"],
  [2, "Bob"],
  ...
]
For readability and efficiency, I would do it like this:
require 'json'
json_data = '[{"id":1,"name":"Don"},{"id":2,"name":"Bob"}]'
arr = JSON.parse(json_data)
#=> [{"id"=>1, "name"=>"Don"}, {"id"=>2, "name"=>"Bob"}]
keys = arr.first.keys
#=> ["id", "name"]
arr.map! { |h| h.values_at(*keys) }.unshift(keys)
#=> [["id", "name"], [1, "Don"], [2, "Bob"]]
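This assumes every parsed hash has the same keys. If the objects might differ, a hedged variant is to collect the union of keys first; values_at then fills missing keys with nil:
arr = JSON.parse(json_data)
keys = arr.flat_map(&:keys).uniq
[keys] + arr.map { |h| h.values_at(*keys) }
#=> [["id", "name"], [1, "Don"], [2, "Bob"]]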
This should do the trick:
require 'json'
json_data = '[{"id":1,"name":"Don"},{"id":2,"name":"Bob"}]'
JSON.parse(json_data).inject([]) { |result, e| result + [e.keys, e.values] }.uniq
First, we read the JSON into an array with JSON.parse. For each element in the JSON, we collect all keys and values using inject which results in the following array:
[
  ["id", "name"],
  [1, "Don"],
  ["id", "name"],
  [2, "Bob"]
]
To get rid of the repeating key-arrays, we call uniq and are done.
[
  ["id", "name"],
  [1, "Don"],
  [2, "Bob"]
]
Adding to @tessi's answer, we can avoid using uniq if we combine with_index and inject.
require 'json'
json_data = '[{"id":1,"name":"Don"},{"id":2,"name":"Bob"}]'
array_data = JSON.parse(json_data).each.with_index.inject([]) { |result, (e, i)| result + (i == 0 ? [e.keys, e.values] : [e.values]) }
puts array_data.inspect
The result is:
[["id", "name"], [1, "Don"], [2, "Bob"]]

Ruby - sort_by using dynamic keys

I have an array of hashes:
array = [
  {
    id: 1,
    name: "A",
    points: 20,
    victories: 4,
    goals: 5,
  },
  {
    id: 1,
    name: "B",
    points: 20,
    victories: 4,
    goals: 8,
  },
  {
    id: 1,
    name: "C",
    points: 21,
    victories: 5,
    goals: 8,
  }
]
To sort them using two keys I do:
array = array.group_by do |key|
  [key[:points], key[:goals]]
end.sort_by(&:first).map(&:last)
But in my program the sort criteria are stored in a database; I can fetch them and store them in an array, for example: ["goals","victories"] or ["name","goals"].
How can I sort the array using dynamic keys?
I tried many ways with no success, like this:
criterias_block = []
criterias.each do |criteria|
  criterias_block << "key[:#{criteria}]"
end
array = array.group_by do |key|
  criterias_block
end.sort_by(&:first).map(&:last)
Enumerable#sort_by can do this:
criteria = [:points, :goals]
array.sort_by { |entry|
  criteria.map { |c| entry[c] }
}
#=> [{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]
This works because when you sort an array such as [[1, 2], [1, 1], [2, 3]], Ruby compares the first elements and uses the following elements to break ties.
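For example:
[[1, 2], [1, 1], [2, 3]].sort
#=> [[1, 1], [1, 2], [2, 3]]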
You can use values_at:
criteria = ["goals", "victories"]
criteria = criteria.map(&:to_sym)
array = array.group_by do |key|
  key.values_at(*criteria)
end.sort_by(&:first).map(&:last)
# => [[{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]]
values_at returns an array of the values for all the requested keys:
array[0].values_at(*criteria)
# => [4, 5]
I suggest doing it like this.
Code
def sort_it(array, *keys)
  array.map { |h| [h.values_at(*keys), h] }.sort_by(&:first).map(&:last)
end
Examples
For array as given by you:
sort_it(array, :goals, :victories)
#=> [{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]
sort_it(array, :name, :goals)
#=> [{:id=>1, :name=>"A", :points=>20, :victories=>4, :goals=>5},
# {:id=>1, :name=>"B", :points=>20, :victories=>4, :goals=>8},
# {:id=>1, :name=>"C", :points=>21, :victories=>5, :goals=>8}]
For the first of these examples, you could of course write:
sort_it(array, *["goals", "victories"].map(&:to_sym))

How do I count items for some time period?

I have records in my database like:
id | item_name | 2013-06-05T17:55:13+03:00
I want to group them by 'items per Day', 'items per Hour', 'items per 20 minutes'.
What is the best way to implement it?
The simple way:
by_day = array.group_by { |a| a.datetime.to_date }
by_hour = array.group_by { |a| [a.datetime.to_date, a.datetime.hour] }
by_20_minutes = array.group_by { |a| [a.datetime.to_date, a.datetime.hour, a.datetime.minute / 20] }
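Since the question asks for counts, each grouping above can be reduced to counts with Hash#transform_values (available since Ruby 2.4); a small follow-up sketch:
items_per_day = by_day.transform_values(&:size)
items_per_hour = by_hour.transform_values(&:size)
items_per_20_minutes = by_20_minutes.transform_values(&:size)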
require 'time'
def group_by_period(items)
  groups = { :day => {}, :hour => {}, :t20min => {} }
  items.reduce(groups) do |memo, item|
    # Compute the correct buckets for the item's timestamp.
    timestamp = Time.parse(item[2]).utc
    item_day = timestamp.to_date.to_s
    item_hour = timestamp.iso8601[0..12]
    item_20min = timestamp.iso8601[0..15]
    item_20min[14..15] = format('%02d', item_20min[14..15].to_i / 20 * 20)
    item_20min << ':00'
    # Place the item in each bucket.
    [[:day, item_day], [:hour, item_hour], [:t20min, item_20min]].each do |k, v|
      memo[k][v] = [] unless memo[k][v]
      memo[k][v] << item
    end
    memo
  end
end
sample_db_output = [
  [1, 'foo', '2010-01-01T12:34:56Z'],
  [2, 'bar', '2010-01-02T12:34:56Z'],
  [3, 'gah', '2010-01-02T13:34:56Z'],
  [4, 'zip', '2010-01-02T13:54:56Z']
]
group_by_period(sample_db_output)
# {:day=>
# {"2010-01-01"=>[[1, "foo", "2010-01-01T12:34:56Z"]],
# "2010-01-02"=>
# [[2, "bar", "2010-01-02T12:34:56Z"],
# [3, "gah", "2010-01-02T13:34:56Z"],
# [4, "zip", "2010-01-02T13:54:56Z"]]},
# :hour=>
# {"2010-01-01T12"=>[[1, "foo", "2010-01-01T12:34:56Z"]],
# "2010-01-02T12"=>[[2, "bar", "2010-01-02T12:34:56Z"]],
# "2010-01-02T13"=>
# [[3, "gah", "2010-01-02T13:34:56Z"], [4, "zip", "2010-01-02T13:54:56Z"]]},
# :t20min=>
# {"2010-01-01T12:20:00"=>[[1, "foo", "2010-01-01T12:34:56Z"]],
# "2010-01-02T12:20:00"=>[[2, "bar", "2010-01-02T12:34:56Z"]],
# "2010-01-02T13:20:00"=>[[3, "gah", "2010-01-02T13:34:56Z"]],
# "2010-01-02T13:40:00"=>[[4, "zip", "2010-01-02T13:54:56Z"]]}}
