I have a result set with some information about clients, like this:
[{name: John, age:20, state:y, city:w, country:x,...},{name:.....}]
Now I want to loop through this list, taking only name, age, and city, and create a file with this information in the following format:
name | age | city
How can I do this? I thought about adding to a 3d list and then transposing to csv. But I don't like this idea.
CSV is not that hard to do (using the test data from @dawg):
require 'csv'
a = [{name: "John", age:20, state:"ID", city:"Boise", country:"USA"},
{name: "Bob", age:20, state:"CA", city:"LA", country:"USA"}
]
File.open("test.csv","w") do |f|
a.each{|hsh| f << hsh.values_at(:name, :age, :city).to_csv}
end
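If the file should also carry a header row, the csv library can write one for you. This is a minimal sketch, assuming the same test data as above; it builds the text in memory with CSV.generate, and the clients.csv name in the comment is only a placeholder.

```ruby
require 'csv'

a = [{name: "John", age: 20, state: "ID", city: "Boise", country: "USA"},
     {name: "Bob",  age: 20, state: "CA", city: "LA",    country: "USA"}]
headers = [:name, :age, :city]

# write_headers: true emits the header row ahead of the data rows.
csv_text = CSV.generate(write_headers: true, headers: headers.map(&:to_s)) do |csv|
  a.each { |hsh| csv << hsh.values_at(*headers) }
end

puts csv_text
# To save it (placeholder file name): File.write("clients.csv", csv_text)
```

The same two options can be passed to CSV.open if you prefer to write straight to a file.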
Given:
a=[{name: "John", age:20, state:"ID", city:"Boise", country:"USA"},
{name: "Bob", age:20, state:"CA", city:"LA", country:"USA"}
]
headers=[:name, :age, :city]
You can do:
> a.map{|hsh| hsh.slice(*headers) }
=>
[{:name=>"John", :age=>20, :city=>"Boise"},
{:name=>"Bob", :age=>20, :city=>"LA"}]
And if you want that format:
puts headers.join(" | ")
a.map{|hsh| hsh.slice(*headers) }.
each{|hsh| puts hsh.values.join(" | ")}
Prints:
name | age | city
John | 20 | Boise
Bob | 20 | LA
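To get those lines into a file rather than onto the screen, something like the following should work. It is only a sketch: the clients.txt name and the temporary directory are assumptions made to keep the example self-contained, and values_at is used instead of slice so a missing key still leaves an (empty) column.

```ruby
require 'tmpdir'

a = [{name: "John", age: 20, state: "ID", city: "Boise", country: "USA"},
     {name: "Bob",  age: 20, state: "CA", city: "LA",    country: "USA"}]
headers = [:name, :age, :city]

# Header line first, then one " | "-joined line per record.
lines = [headers.join(" | ")] +
        a.map { |hsh| hsh.values_at(*headers).join(" | ") }

# Placeholder location; replace with the real output path.
path = File.join(Dir.mktmpdir, "clients.txt")
File.write(path, lines.join("\n") + "\n")

puts File.read(path)
```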
I know my question is pretty simple, but I can't manage to find an answer on the internet.
I have a hash called sorted_frequency. I want to output it as a table using the gem hirb. So far I have only been able to print the hash under the default field names (0, 1), so it looks like this:
0 1
wordA ntimes
wordB mtimes
wordC rtimes
I'd like to rename the field names so it would be something like this:
words number of times
wordA ntimes
wordB mtimes
wordC rtimes
My actual code is this:
#needs to install 'docx' and 'hirb' gems
require 'docx'
require 'hirb'
doc = Docx::Document.open('monografia.docx')
text_listed = doc.to_s.downcase.split(" ")
forbidden_list = ["o", "si", "em", "ha", "no", "és", "amb", "i", "/","el",
"la", "els","les", "l'", "lo", "los", "en", "n'", "na", "es", "ets", "s'",
"sa", "so", "ses", "sos", "un", "una", "unes", "uns", "a", "que", "s'",
"al", "de","del", "per", "ens", "als", "com"]
clean_text= text_listed - forbidden_list
frequency = Hash.new 0
clean_text.each { |word| frequency[word] += 1 }
sorted_frequency = Hash[frequency.sort_by{ | word, times | -times }[0..20]]
puts Hirb::Helpers::AutoTable.render(sorted_frequency)
Again, I'm sorry if this is a newbie question.
EDIT:
Since I was asked for my full code, I'll explain it. It opens a docx document with the help of a gem called 'docx'. It then splits the text on spaces into an array, and I remove the words I don't want to count (those in forbidden_list). Next I create a hash where the key is the word and the value is the number of times that word appears in the document. Finally I sort that hash and output it using the gem 'hirb'. The problem is that I don't know how to name the fields of the resulting table. I hope someone can help me.
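For reference, the counting steps described above can be reproduced without the docx and hirb gems. This is a rough sketch: the sample sentence and the short forbidden_list are made up for illustration, and Enumerable#tally assumes Ruby 2.7 or later.

```ruby
# Stand-in for doc.to_s; the real text would come from the docx gem.
text = "el gat i el gos i el gat de la casa"
forbidden_list = %w[el i de la]

# Split on whitespace and drop the words that should not be counted.
words = text.downcase.split - forbidden_list

# tally builds the word => count hash in one step.
frequency = words.tally

# Highest counts first; ties are ordered alphabetically so the result is stable.
# first(21) keeps the same number of entries as the [0..20] slice.
sorted_frequency = frequency.sort_by { |word, times| [-times, word] }.first(21).to_h

p sorted_frequency
```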
According to the hirb docs, AutoTable.render() takes an argument which can be an array_of_arrays or an array_of_hashes. But because your hash argument worked, I looked at your table output, and I decided to try to merely add an option to change the column names that you got:
require 'hirb'
freqs = {
'go' => 3,
'no' => 4,
'to' => 1
}
puts Hirb::Helpers::AutoTable.render(
freqs,
fields: [1, 0], #Specify which fields to include in the table and their order.
#For a row that is an array, the field names are the integers 0, 1, 2, etc.
#For a row that is a hash, the field names are the keys.
headers: {0 => 'Word', 1 => 'Frequency'}, #Convert the field names to something more desirable for the column headers
description: false #Get rid of "3 rows in set" following the table
)
--output:--
+-----------+------+
| Frequency | Word |
+-----------+------+
| 3 | go |
| 4 | no |
| 1 | to |
+-----------+------+
It worked. What must be happening is: AutoTable.render() expects an array--either an array_of_arrays or an array_of_hashes--and if the method doesn't get an array as an argument, it calls to_a() on the argument. Take a look at what happens to your hash:
~/ruby_programs$ irb
2.4.0 :001 > freqs = {'go' => 3, 'no' => 4, 'to' => 1}
=> {"go"=>3, "no"=>4, "to"=>1}
2.4.0 :002 > freqs.to_a
=> [["go", 3], ["no", 4], ["to", 1]]
There's the array_of_arrays that AutoTable.render() needs. Rearranging a little, the array_of_arrays looks like this:
index 0 index 1
| |
[ V V
["go", 3], #row array
["no", 4],
["to", 1]
]
For an array_of_arrays, the column headers in the table are the index positions in each row array. The options for AutoTable.render() let you specify which columns/index positions to include in the table and their order, and the options allow you to convert the column headers to something more desirable.
Here's a more general example:
require 'hirb'
require 'pp'
data = [
['go', 1, '1/12/18'],
['to', 4, '1/24/18'],
['at', 2, '1/28/18']
]
puts Hirb::Helpers::AutoTable.render(
data,
fields: [2, 0], #Specify the index positions in each row array to include in the table and their column order in the table
headers: {0 => 'Word', 2 => 'Date'}, #Convert the column headers to something more desirable
description: false #Get rid of "3 rows in set" following the table
)
--output:--
+---------+------+
| Date | Word |
+---------+------+
| 1/12/18 | go |
| 1/24/18 | to |
| 1/28/18 | at |
+---------+------+
====
require 'hirb'
require 'pp'
freqs = {
'go' => 3,
'no' => 4,
'to' => 1
}
col_names = %w[word count]
new_freqs = freqs.map do |key, val|
{col_names[0] => key, col_names[1] => val}
end
pp new_freqs
puts Hirb::Helpers::AutoTable.render(
new_freqs,
fields: ['word', 'count'], #Specify which keys to include in table and their column order.
headers: {'word' => 'Good Word', 'count' => 'Frequency'}, #Convert keys to more desirable headers.
description: false #Get rid of "3 rows in set" following the table
)
--output:--
[{"word"=>"go", "count"=>3},
{"word"=>"no", "count"=>4},
{"word"=>"to", "count"=>1}]
+-----------+-----------+
| Good Word | Frequency |
+-----------+-----------+
| go | 3 |
| no | 4 |
| to | 1 |
+-----------+-----------+
====
require 'hirb'
require 'pp'
freqs = {
'go' => 3,
'no' => 4,
'to' => 1
}
col_names = %i[word count]
new_freqs = freqs.map do |key, val|
{col_names[0] => key, col_names[1] => val}
end
pp new_freqs
puts Hirb::Helpers::AutoTable.render(new_freqs)
--output:--
[{:word=>"go", :count=>3}, {:word=>"no", :count=>4}, {:word=>"to", :count=>1}]
+-------+------+
| count | word |
+-------+------+
| 3 | go |
| 4 | no |
| 1 | to |
+-------+------+
3 rows in set
===
require 'hirb'
require 'pp'
data = {
'first' => 'second',
'third' => 'fourth',
'fifth' => 'sixth'
}
col_names = %i[field1 field2]
new_data = data.map do |key, val|
{col_names[0] => key, col_names[1] => val}
end
pp new_data
puts Hirb::Helpers::AutoTable.render(new_data)
--output:--
[{:field1=>"first", :field2=>"second"},
{:field1=>"third", :field2=>"fourth"},
{:field1=>"fifth", :field2=>"sixth"}]
+--------+--------+
| field1 | field2 |
+--------+--------+
| first | second |
| third | fourth |
| fifth | sixth |
+--------+--------+
3 rows in set
=====
require 'hirb'
data = [
{field1: 'first', field2: 'second'},
{field1: 'third', field2: 'fourth'},
{field1: 'fifth', field2: 'sixth'}
]
puts Hirb::Helpers::AutoTable.render(data)
--output:--
+--------+--------+
| field1 | field2 |
+--------+--------+
| first | second |
| third | fourth |
| fifth | sixth |
+--------+--------+
3 rows in set
I have an array of hashes like this:
my_array_of_hashes = [
{ :customer=>"Matthew",
:fruit=>"Apples",
:quantity=>2,
:order_month => "January"
},
{ :customer => "Philip",
:fruit => "Oranges",
:quantity => 3,
:order_month => "July"
},
{ :customer => "Matthew",
:fruit => "Oranges",
:quantity => 1,
:order_month => "March"
},
{ :customer => "Charles",
:fruit => "Pears",
:quantity => 3,
:order_month => "January"
},
{ :customer => "Philip",
:fruit => "Apples",
:quantity => 2,
:order_month => "April"
},
{ :customer => "Philip",
:fruit => "Oranges",
:quantity => 1,
:order_month => "July"
}
]
which I would like to summarize in a row-column format. Using my sample data this would mean summing the :quantity values, with one row per unique customer, one column per unique fruit.
-----------------------------------
Customer | Apples | Oranges | Pears
Charles | | | 3
Matthew | 2 | 1 |
Philip | 2 | 4 |
-----------------------------------
This feels like something solvable with Ruby enumerables but I can't see how.
Create arrays needed to construct the table
I will construct three arrays that contain row labels (customers), column labels (fruit) and the values in the table (values).
arr_of_hash = [
{:customer=>"Matthew", :fruit=>"Apples", :quantity=>2, :order_month=>"January"},
{:customer=>"Philip", :fruit=>"Oranges", :quantity=>3, :order_month=>"July" },
{:customer=>"Matthew", :fruit=>"Oranges", :quantity=>1, :order_month=>"March" },
{:customer=>"Charles", :fruit=>"Pears", :quantity=>3, :order_month=>"January"},
{:customer=>"Philip", :fruit=>"Apples", :quantity=>2, :order_month=>"April" },
{:customer=>"Philip", :fruit=>"Oranges", :quantity=>1, :order_month=>"July" }
]
customers = arr_of_hash.flat_map { |g| g[:customer] }.uniq.sort
#=> ["Charles", "Matthew", "Philip"]
fruit = arr_of_hash.flat_map { |g| g[:fruit] }.uniq.sort
#=> ["Apples", "Oranges", "Pears"]
h = customers.each_with_object({}) { |cust,h| h[cust] = fruit.product([0]).to_h }
#=> {"Charles"=>{"Apples"=>0, "Oranges"=>0, "Pears"=>0},
# "Matthew"=>{"Apples"=>0, "Oranges"=>0, "Pears"=>0},
# "Philip" =>{"Apples"=>0, "Oranges"=>0, "Pears"=>0}}
arr_of_hash.each do |g|
customer = g[:customer]
h[customer][g[:fruit]] += g[:quantity]
end
values = h.map { |_,v| v.values }
#=> [[0, 0, 3],
# [2, 1, 0],
# [2, 4, 0]]
Note that immediately before values = h.map { |_,v| v.values }:
h #=> {"Charles"=>{"Apples"=>0, "Oranges"=>0, "Pears"=>3},
# "Matthew"=>{"Apples"=>2, "Oranges"=>1, "Pears"=>0},
# "Philip" =>{"Apples"=>2, "Oranges"=>4, "Pears"=>0}}
Print the table
def print_table(row_labels_title, row_labels, col_labels, values, gap_size=3)
  # Measure the printed width of each value (Integer#size returns bytes, not digits).
  col_width = [values.flatten.map { |v| v.to_s.size }.max, col_labels.max_by(&:size).size].max + gap_size
  row_labels_width = [row_labels_title.size, row_labels.max_by(&:size).size].max +
    gap_size
  horiz_line = '-'*(row_labels_width + col_labels.size * col_width + col_labels.size)
  puts horiz_line
  print row_labels_title.ljust(row_labels_width)
  col_labels.each do |s|
    print "|#{s.center(col_width)}"
  end
  puts
  # zip pairs each row label with its values without mutating the caller's arrays.
  row_labels.zip(values).each do |row_label, vals|
    print row_label.ljust(row_labels_width)
    vals.each do |val|
      print "|#{val.to_s.center(col_width)}"
    end
    puts
  end
  puts horiz_line
end
print_table("Customers", customers, fruit, values, 2)
-----------------------------------------
Customers  | Apples  | Oranges |  Pears
Charles    |    0    |    0    |    3
Matthew    |    2    |    1    |    0
Philip     |    2    |    4    |    0
-----------------------------------------
You can use a hash whose default value is another hash with a default value of 0:
fruits = Hash.new { |h, k| h[k] = Hash.new(0) }
fruits[:some_name]
# {}
p fruits[:some_name][:some_fruit]
# 0
That way, you don't need any logic inside your loop; you just iterate over the hashes and add the quantities:
my_array_of_hashes = [ {:customer=>"Matthew", :fruit=>"Apples", :quantity=>2, :order_month => "January"}, {:customer => "Philip", :fruit => "Oranges", :quantity => 3, :order_month => "July"}, {:customer => "Matthew", :fruit => "Oranges", :quantity => 1, :order_month => "March"}, {:customer => "Charles", :fruit => "Pears", :quantity => 3, :order_month => "January"}, {:customer => "Philip", :fruit => "Apples", :quantity => 2, :order_month => "April"}, {:customer => "Philip", :fruit => "Oranges", :quantity => 1, :order_month => "July"} ]
fruits = Hash.new { |h, k| h[k] = Hash.new(0) }
my_array_of_hashes.each do |hash|
fruits[hash[:customer]][hash[:fruit]] += hash[:quantity]
end
p fruits
# {
# "Matthew"=>{"Apples"=>2, "Oranges"=>1},
# "Philip"=>{"Oranges"=>4, "Apples"=>2},
# "Charles"=>{"Pears"=>3}
# }
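From that nested hash, the row-column layout asked for in the question is a short step away. The sketch below is one possible way to do it, not the only one: the columns and rows names are introduced here for illustration, and fetch is used so that missing fruit stay blank instead of showing the default 0.

```ruby
my_array_of_hashes = [
  {customer: "Matthew", fruit: "Apples",  quantity: 2, order_month: "January"},
  {customer: "Philip",  fruit: "Oranges", quantity: 3, order_month: "July"},
  {customer: "Matthew", fruit: "Oranges", quantity: 1, order_month: "March"},
  {customer: "Charles", fruit: "Pears",   quantity: 3, order_month: "January"},
  {customer: "Philip",  fruit: "Apples",  quantity: 2, order_month: "April"},
  {customer: "Philip",  fruit: "Oranges", quantity: 1, order_month: "July"}
]

fruits = Hash.new { |h, k| h[k] = Hash.new(0) }
my_array_of_hashes.each do |hash|
  fruits[hash[:customer]][hash[:fruit]] += hash[:quantity]
end

# Column labels: every fruit ordered by at least one customer.
columns = fruits.values.flat_map(&:keys).uniq.sort

# One row per customer; fetch bypasses the default of 0 and leaves gaps blank.
rows = fruits.keys.sort.map do |customer|
  [customer] + columns.map { |fruit| fruits[customer].fetch(fruit, "") }
end

header = ["Customer"] + columns
puts header.join(" | ")
rows.each { |row| puts row.join(" | ") }
```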
I am creating a hash, whose key is a hash and the value is an array. E.g.,
shop = Hash.new
items.each do |item|
grouping_key = {
'name'=> item['name'],
'value'=> item['value']
}
shop[grouping_key] ||= Array.new
shop[grouping_key] << item
end
Here, I am grouping each item based on grouping key. For the following items:
Item1 = {'name'=>'test', 'value'=>10, 'color'=>'black', 'description'=>'item1'}
Item2 = {'name'=>'test2', 'value'=>10, 'color'=>'blue', 'description'=>'item2'}
Item3 = {'name'=>'test', 'value'=>10, 'color'=>'black', 'description'=>'item3'}
Item4 = {'name'=>'test2', 'value'=>10, 'color'=>'blue', 'description'=>'item4'}
my shop hash will be:
shop = {{'name'=>'test', 'value'=>10}=>[Item1, Item3], {'name'=>'test2', 'value'=>10}=>[Item2, Item4]}
I wanted to add color to hash key, but not as part of grouping key. Is it possible to do so without reiterating over hash and modifying it? e.g.
shop = {{'name'=>'test', 'value'=>10, 'color'=>'black'}=>[Item1, Item3], {'name'=>'test2', 'value'=>10, 'color'=>'blue'}=>[Item2, Item4]}
Any other approach will also be helpful.
Your initial code is equivalent to
shop = items.group_by do | i |
{'name' => i['name'], 'value' => i['value'] }
end
To add the color to the key hash, simply do
shop = items.group_by do | i |
{'name' => i['name'], 'value' => i['value'], 'color' => i['color'] }
end
Now, you are grouping by color too.
If this is not your intention ("but not as part of grouping key"), i.e. if there can be items with the same name and value but different colors that should go into the same group, then you first have to decide which color should appear in the group's key.
In that case, postprocessing the hash would be simplest:
shop = items.group_by do | i |
{'name' => i['name'], 'value' => i['value'] }
end
shop.keys.each { | h | h['color'] = shop[h].sample['color'] }
shop.rehash # the keys were mutated in place, so rebuild the index before any lookups
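If you would rather not modify keys that are already in use, you can build a new hash with extended keys instead. A minimal sketch, assuming the four items from the question and taking the color of the first item in each group (an arbitrary choice made here for determinism):

```ruby
items = [
  {'name'=>'test',  'value'=>10, 'color'=>'black', 'description'=>'item1'},
  {'name'=>'test2', 'value'=>10, 'color'=>'blue',  'description'=>'item2'},
  {'name'=>'test',  'value'=>10, 'color'=>'black', 'description'=>'item3'},
  {'name'=>'test2', 'value'=>10, 'color'=>'blue',  'description'=>'item4'}
]

# Group by name and value only.
grouped = items.group_by { |i| {'name' => i['name'], 'value' => i['value']} }

# Create fresh keys that include a color; the original keys are left untouched.
shop = grouped.map do |key, group|
  [key.merge('color' => group.first['color']), group]
end.to_h

p shop.keys
```

Because the keys are never changed after insertion, lookups work without calling rehash.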
Say I have a CSV file with 4 fields,
ID,name,pay,age
and about 32,000 records.
What's the best way to stick this into a hash in Ruby?
In other words, an example record would look like:
{:rec1 => {:id=>"00001", :name => "Bob", :pay => 150, :age => 95 } }
Thanks for the help!
You can use the Excelsior rubygem for this:
csv = ...
result = Hash.new
counter = 1
Excelsior::Reader.rows(csv) do |row|
row_hash = result[("rec#{counter}".intern)] = Hash.new
row.each do |col_name, col_val|
row_hash[col_name.intern] = col_val
end
counter += 1
end
# do something with result...
Typically we'd want to use an :id field for the Hash key, since it'd be the same as a primary key in a database table:
{"00001" => {:name => "Bob", :pay => 150, :age => 95 } }
The following creates a hash in that form:
require 'ap'
# Pretend this is CSV data...
csv = [
%w[ id name pay age ],
%w[ 1 bob 150 95 ],
%w[ 2 fred 151 90 ],
%w[ 3 sam 140 85 ],
%w[ 31999 jane 150 95 ]
]
# pull headers from the first record
headers = csv.shift
# drop the first header, which is the ID. We'll use it as the key so we won't need a name for it.
headers.shift
# loop over the remaining records, adding them to a hash
data = csv.inject({}) { |h, row| h[row.shift.rjust(5, '0')] = Hash[headers.zip(row)]; h }
ap data
# >> {
# >> "00001" => {
# >> "name" => "bob",
# >> "pay" => "150",
# >> "age" => "95"
# >> },
# >> "00002" => {
# >> "name" => "fred",
# >> "pay" => "151",
# >> "age" => "90"
# >> },
# >> "00003" => {
# >> "name" => "sam",
# >> "pay" => "140",
# >> "age" => "85"
# >> },
# >> "31999" => {
# >> "name" => "jane",
# >> "pay" => "150",
# >> "age" => "95"
# >> }
# >> }
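The same id-keyed structure can also be built with Ruby's standard csv library. The sketch below assumes inline sample data in place of the real file; the id is kept as a string so its leading zeros survive, and pay and age are converted by hand.

```ruby
require 'csv'

# Sample data standing in for the real file contents.
csv_text = <<~EOS
  ID,name,pay,age
  00001,Bob,150,95
  00002,Fred,151,90
EOS

# header_converters: :symbol turns "ID" into :id, "name" into :name, and so on.
table = CSV.parse(csv_text, headers: true, header_converters: :symbol)

data = table.each_with_object({}) do |row, h|
  h[row[:id]] = { name: row[:name], pay: row[:pay].to_i, age: row[:age].to_i }
end

p data
```

For a file on disk, CSV.foreach with the same options reads one row at a time instead of loading the whole table first.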
Check out the Ruby Gem smarter_csv, which parses CSV-files and returns array(s) of hashes for the rows in the CSV-file. It can also do chunking, to more efficiently deal with large CSV-files, so you can pass the chunks to parallel Resque workers or mass-create records with Mongoid or MongoMapper.
It comes with plenty of useful options; check out the documentation on GitHub.
require 'smarter_csv'
filename = '/tmp/input.csv'
array = SmarterCSV.process(filename)
=>
[ {:id=> 1, :name => "Bob", :pay => 150, :age => 95 } ,
...
]
See also:
https://github.com/tilo/smarter_csv
http://www.unixgods.org/~tilo/Ruby/process_csv_as_hashes.html
Hash[*CSV.read(filename, :headers => true).flat_map.with_index{|r,i| ["rec#{i+1}", r.to_hash]}]
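The one-liner above splats every key and value into the argument list of Hash[], which can be fragile with tens of thousands of rows. A variant that builds key/value pairs and calls to_h avoids the long argument list. This is a sketch with inline sample data standing in for the file, and it uses symbol keys such as :rec1 to match the format in the question:

```ruby
require 'csv'

# Sample data standing in for CSV.read(filename, headers: true).
csv_text = <<~EOS
  ID,name,pay,age
  00001,Bob,150,95
  00002,Fred,151,90
EOS

table = CSV.parse(csv_text, headers: true)

# Pair each row's hash with :rec1, :rec2, ... and convert the pairs to a hash.
result = table.each_with_index.map { |row, i| [:"rec#{i + 1}", row.to_h] }.to_h

p result
```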