Using Ruby CSV header converters - ruby

Say I have the following class:
class Buyer < ActiveRecord::Base
  attr_accessible :first_name, :last_name
end
and the following in a CSV file:
First Name,Last Name
John,Doe
Jane,Doe
I want to save the contents of the CSV into the database. I have the following in a Rake file:
namespace :migration do
desc "Migrate CSV data"
task :import, [:model, :file_path] => :environment do |t, args|
require 'csv'
model = args.model.constantize
path = args.file_path
CSV.foreach(path, :headers => true,
:converters => :all,
:header_converters => lambda { |h| h.downcase.gsub(' ', '_') }
) do |row|
model.create!(row.to_hash)
end
end
end
I am getting an undefined method 'downcase' for nil:NilClass. If I exclude the header converters then I get unknown attribute 'First Name'. What's the correct syntax for converting a header from, say, First Name to first_name?

After some testing on my desktop, it seems to me the error has a different cause.
First I put the data in my "a.txt" file as below :
First Name,Last Name
John,Doe
Jane,Doe
Now I ran the code, which is saved in my so.rb file.
so.rb
require 'csv'
CSV.foreach("C:\\Users\\arup\\a.txt",
:headers => true,
:converters => :all,
:header_converters => lambda { |h| h.downcase.gsub(' ', '_') }
) do |row|
p row
end
Now running the script:
C:\Users\arup>ruby -v so.rb
ruby 1.9.3p448 (2013-06-27) [i386-mingw32]
#<CSV::Row "first_name":"John" "last_name":"Doe">
#<CSV::Row "first_name":"Jane" "last_name":"Doe">
So everything works so far. Now let me reproduce the error:
I put the data in my "a.txt" file as below (I just added a comma after the last column of the header row):
First Name,Last Name,
John,Doe
Jane,Doe
Now I ran the code, which is saved in my so.rb file, again.
C:\Users\arup>ruby -v so.rb
ruby 1.9.3p448 (2013-06-27) [i386-mingw32]
so.rb:5:in `block in <main>': undefined method `downcase' for nil:NilClass (NoMethodError)
It seems there is a blank column in your header row, which is causing the error. If you control the source CSV file, check it for a trailing comma or an empty header. Otherwise, change your code to handle nil headers, as below:
require 'csv'
CSV.foreach("C:\\Users\\arup\\a.txt",
:headers => true,
:converters => :all,
:header_converters => lambda { |h| h.downcase.gsub(' ', '_') unless h.nil? }
) do |row|
p row
end

A more general answer: if you have a value that you need to process as text, and it might sometimes be nil, call to_s on it. This turns nil into an empty string, e.g.:
h.to_s.downcase.gsub(' ', '_')
This will never blow up, whatever h is, because every Object in Ruby has a to_s method, and it always returns a string (unless you've overridden it to do something else, which would be inadvisable).
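As a minimal sketch of the to_s approach, the snippet below parses an in-memory copy of the sample data with a trailing comma in the header row. The variable names and the step that drops blank headers are illustrative assumptions. Whether the converter is called at all for a nil header depends on the CSV library version, so the blank column is filtered out explicitly before the hash would be passed to something like model.create!.

```ruby
require 'csv'

# Sample data with a trailing comma in the header row (produces a blank header).
data = "First Name,Last Name,\nJohn,Doe\nJane,Doe\n"

# to_s turns a nil header into "", so downcase and gsub never receive nil.
normalize = lambda { |h| h.to_s.downcase.gsub(' ', '_') }

rows = CSV.parse(data, :headers => true, :header_converters => normalize)

# Drop the blank column (nil or "" key, depending on the CSV version).
attrs = rows.first.to_h.reject { |key, _value| key.to_s.empty? }

p attrs
```

The resulting hash only contains keys that map onto real attribute names, so it is safe to hand to a model constructor.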

Passing :symbol to :header_converters will automatically convert the header strings to snake_case symbols as well.
options = {:headers => true,
:header_converters => :symbol}
CSV.foreach(filepath, options) ...
#<CSV::Row first_name:"John" last_name:"Doe">
#<CSV::Row first_name:"Jane" last_name:"Doe">
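For illustration, here is a self-contained version of the :symbol converter that reads the sample data from a string instead of a file. The use of CSV.parse and the variable names are assumptions made so the sketch runs without any files on disk.

```ruby
require 'csv'

# In-memory copy of the sample CSV from the question.
data = "First Name,Last Name\nJohn,Doe\nJane,Doe\n"

# :symbol downcases each header, replaces spaces with underscores,
# and converts the result to a Symbol.
table = CSV.parse(data, :headers => true, :header_converters => :symbol)

# Each row becomes a symbol-keyed hash.
attrs = table.map(&:to_h)

p attrs
# [{:first_name=>"John", :last_name=>"Doe"}, {:first_name=>"Jane", :last_name=>"Doe"}]
```

Symbol-keyed hashes like these match the attribute names declared on the model, so each one could be passed to create! directly.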

Related

Why can't my script access classes that are in different directory?

I am trying to access classes DealState and NotAnEndState that are in another directory, where I have a lib called move-to-go.
The move-to-go folder contains modules; the one in my example is named deal_state.rb. When I open deal_state.rb, it contains the code below.
Path to lib: F:\Ruby25-x64\lib\ruby\gems\2.5.0\gems\move-to-go-5.3.0\lib\move-to-go
module MoveToGo
module DealState
# This is the default, a deal with a status with this state is
# currently being worked on.
NotAnEndState = 0
# The deal has reached a positive end state, eg we have won
# the deal.
PositiveEndState = 1
# The deal has reached a negative end state, eg we have lost
# the deal.
NegativeEndState = -1
end
end
Path to my code: C:Users/Shahin/MigrationFolder/converter.rb
class Converter
def configure(rootmodel)
rootmodel.settings.with_organization do |organization|
organization.set_custom_field( { :integration_id => 'source', :title => 'Källa', :type => :Link } )
end
rootmodel.settings.with_person do |person|
person.set_custom_field( { :integration_id => 'source', :title => 'Källa', :type => :String} )
end
rootmodel.settings.with_deal do |deal|
assessment is default DealState::NotAnEndState
deal.add_status( {:label => '1. Kvalificering' })
deal.add_status( {:label => '2. Deal closed', :assessment => MoveToGo::DealState::PositiveEndState })
deal.add_status( {:label => '4. Deal lost', :assessment => MoveToGo::DealState::NegativeEndState })
end
end
When I execute my script I get this error message:
C:Users/MyUserName/MigrationFolder/converter.rb:63:in `block in configure': uninitialized constant Converter::DealState (NameError)
Did you mean? DEAL_SHEET
New things have come to light, however. The error message seems to point at the Converter class, but I can't really interpret what it is implying.
The error comes from this line: assessment is default DealState::NotAnEndState.
First, the constant should be written as MoveToGo::DealState::NotAnEndState. Second, the phrase "assessment is default" belongs in a spec file, not in this code.
If you just remove this line, the error should go away.
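To see why the message mentions Converter::DealState, here is a small sketch of Ruby's constant lookup. The MoveToGo module below is a stand-in that mirrors the nesting shown in the question rather than the real gem, and the method names are hypothetical.

```ruby
# Stand-in for the gem's module (assumption: same nesting as in the question).
module MoveToGo
  module DealState
    NotAnEndState = 0
    PositiveEndState = 1
  end
end

class Converter
  # Fully qualified: Ruby finds MoveToGo at the top level, then walks down.
  def qualified
    MoveToGo::DealState::PositiveEndState
  end

  # Unqualified: Ruby looks for DealState inside Converter, then at the top
  # level. It exists in neither place, so this raises NameError when called.
  def unqualified
    DealState::NotAnEndState
  end
end

puts Converter.new.qualified

begin
  Converter.new.unqualified
rescue NameError => e
  puts e.message
end
```

The lookup starts in the class where the code is written, which is why the error names Converter even though the real problem is the missing MoveToGo:: prefix.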

ruby object to_s gives unexpected output

What is the correct way to view the output of the puts statements below? My apologies for such a simple question; I'm a little rusty on Ruby. (GitHub repo)
require 'active_support'
require 'active_support/core_ext'
require 'indicators'
my_data = Indicators::Data.new(Securities::Stock.new(:symbol => 'AAPL', :start_date => '2012-08-25', :end_date => '2012-08-30').output)
puts my_data.to_s #expected to see Open,High,Low,Close for AAPL
temp=my_data.calc(:type => :sma, :params => 3)
puts temp.to_s #expected to see an RSI value for each data point from the data above
Maybe check out the awesome_print gem.
It provides the .ai method which can be called on anything.
An example:
my_obj = { a: "b" }
my_obj_as_string = my_obj.ai
puts my_obj_as_string
# ... this will print
# {
# :a => "b"
# }
# except the result is colored.
You can shorten all this into a single step with ap(my_obj).
There's also a way to return objects as HTML. It's the my_obj.ai(html: true) option.
Just use .inspect method instead of .to_s if you want to see internal properties of objects.
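A short sketch of the difference between to_s and inspect on a plain object. The Quote class is a hypothetical stand-in, not part of the indicators gem.

```ruby
# Hypothetical class used only to show the default to_s and inspect output.
class Quote
  def initialize(symbol, close)
    @symbol = symbol
    @close = close
  end
end

quote = Quote.new('AAPL', 663.22)

as_string = quote.to_s        # class name and object id only
as_inspect = quote.inspect    # also lists the instance variables

puts as_string
puts as_inspect
```

Unless a class overrides to_s, puts only shows the class name and an object id, which is why inspect (or p, which calls inspect) is usually more helpful when debugging.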

ruby hash.values is not working with built in method

I tried almost everything, but I am stuck.
I have a CSV and reading a line from it:
CSV.foreach(file, quote_char: '"', col_sep: ',', row_sep: :auto, headers: true) { |line|
newLine = []
newLine = line.values #undefined method .values
...
}
line is apparently a hash, because line['column_name'] works fine, and line.to_a returns ["col","value","col2","value2",...]
Please help, thank you!
You can use #fields on the class CSV::Row
http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV/Row.html
It is not a regular hash; it is an instance of CSV::Row (see its API documentation).
As you can see in the result of the following code, the method values isn't there. Your solution of using line['column_name'] is fine.
You can get all the field values by calling fields without arguments.
CSV.parse(DATA, :col_sep => ",", :headers => true).each do |row|
puts row.class
puts row.methods - Object.methods
end
__END__
kId,kName,kURL
1,Google UK,http://google.co.uk
2,Yahoo UK,http://yahoo.co.uk
It is a CSV row, which is part array and part hash, and it doesn't have a .values method. Use .to_hash first and then you can call .values. (Note that a hash keeps only one value per header, so duplicate fields are lost; on Ruby 1.8, where hashes are unordered, the field order is lost as well.)
newLine = line.to_hash.values
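The difference between the two approaches shows up when a header is repeated. The sketch below uses made-up data with a duplicated kId column, parsed from a string.

```ruby
require 'csv'

# Made-up sample with a duplicated header (kId appears twice).
data = "kId,kName,kId\n1,Google UK,99\n"

row = CSV.parse(data, :headers => true).first

all_fields = row.fields          # every value, in column order
hash_values = row.to_h.values    # one value per distinct header

p all_fields    # ["1", "Google UK", "99"]
p hash_values.length
```

fields returns all three values, while the hash holds only two entries because both kId columns share a key. Which of the duplicate values survives depends on the CSV library version, so fields is the safer choice when headers may repeat.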

import from CSV into Ruby array, with 1st field as hash key, then lookup a field's value given header row

Maybe somebody can help me.
Starting with a CSV file like so:
Ticker,"Price","Market Cap"
ZUMZ,30.00,933.90
XTEX,16.02,811.57
AAC,9.83,80.02
I manage to read them into an array:
require 'csv'
tickers = CSV.read("stocks.csv", :headers => true, :return_headers => true, :header_converters => :symbol, :converters => :all)
To verify data, this works:
puts tickers[1][:ticker]
ZUMZ
However this doesn't:
puts tickers[:ticker => "XTEX"][:price]
How would I go about turning this array into a hash using the ticker field as a unique key, such that I could easily look up any other field associatively as defined in line 1 of the input? The real file has many more columns and rows.
Much appreciated!
Like this (it works with other CSVs too, not just the one you specified):
require 'csv'
tickers = {}
CSV.foreach("stocks.csv", :headers => true, :header_converters => :symbol, :converters => :all) do |row|
tickers[row.fields[0]] = Hash[row.headers[1..-1].zip(row.fields[1..-1])]
end
Result:
{"ZUMZ"=>{:price=>30.0, :market_cap=>933.9}, "XTEX"=>{:price=>16.02, :market_cap=>811.57}, "AAC"=>{:price=>9.83, :market_cap=>80.02}}
You can access elements in this data structure like this:
puts tickers["XTEX"][:price] #=> 16.02
Edit (according to comment): For selecting elements, you can do something like
tickers.select { |ticker, vals| vals[:price] > 10.0 }
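A variant of the same idea, sketched under the assumption that the key column is called Ticker and that keeping the ticker inside each value hash is acceptable. It parses the sample data from a string so it runs without stocks.csv.

```ruby
require 'csv'

# In-memory copy of the sample data from the question.
data = <<~CSV_TEXT
  Ticker,"Price","Market Cap"
  ZUMZ,30.00,933.90
  XTEX,16.02,811.57
  AAC,9.83,80.02
CSV_TEXT

table = CSV.parse(data, :headers => true, :header_converters => :symbol, :converters => :all)

# Index every row by its ticker; each value is the full row as a hash.
tickers = table.each_with_object({}) do |row, index|
  index[row[:ticker]] = row.to_h
end

puts tickers["XTEX"][:price]   # 16.02
```

Looking a row up by header name (row[:ticker]) instead of by position keeps the code working if the column order in the file changes.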
CSV.read(file_path, headers:true, header_converters: :symbol, converters: :all).collect do |row|
Hash[row.collect { |c,r| [c,r] }]
end
CSV.read(file_path, headers:true, header_converters: :symbol, converters: :all).collect do |row|
row.to_h
end
To add on to Michael Kohl's answer, if you want to access the elements in the following manner
puts tickers[:price]["XTEX"] #=> 16.02
You can try the following code snippet:
tickers = {}
CSV.foreach("Workbook1.csv", :headers => true, :header_converters => :symbol, :converters => :all) do |row|
hash_row = row.headers[1..-1].zip( (Array.new(row.fields.length-1, row.fields[0]).zip(row.fields[1..-1])) ).to_h
hash_row.each{|key, value| tickers[key] ? tickers[key].merge!([value].to_h) : tickers[key] = [value].to_h}
end
To get the best of both worlds (very fast reading from a huge file AND the benefits of a native Ruby CSV object) my code had since evolved into this method:
$stock="XTEX"
csv_data = CSV.parse IO.read(%`|sed -n "1p; /^#{$stock},/p" stocks.csv`), {:headers => true, :return_headers => false, :header_converters => :symbol, :converters => :all}
# Now the 1-row CSV object is ready for use, eg:
$company = csv_data[:company][0]
$volatility_month = csv_data[:volatility_month][0].to_f
$sector = csv_data[:sector][0]
$industry = csv_data[:industry][0]
$rsi14d = csv_data[:relative_strength_index_14][0].to_f
which is closer to my original method, but only reads in one record plus line 1 of the input CSV file containing the headers. The inline sed instructions take care of that, and the whole thing is noticeably instant. This is better than the last version because now I can access all the fields from Ruby, and associatively, without caring about column numbers as was the case with awk.
Not as much of a one-liner, but this was clearer to me.
csv_headers = CSV.parse(STDIN.gets)
csv = CSV.new(STDIN)
kick_list = []
csv.each_with_index do |row, i|
row_hash = {}
row.each_with_index do |field, j|
row_hash[csv_headers[0][j]] = field
end
kick_list << row_hash
end
While this isn't a 100% native Ruby solution to the original question, should others stumble here and wonder what awk call I wound up using for now, here it is:
$dividend_yield = IO.readlines("|awk -F, '$1==\"#{$stock}\" {print $9}' datafile.csv")[0].to_f
where $stock is the variable I had previously assigned to a company's ticker symbol (the wannabe key field).
Conveniently, it survives problems by returning 0.0 if the ticker, the file, or field #9 is not found or empty, or if the value cannot be typecast to a float. So any trailing '%' in my case gets nicely truncated.
Note that at this point one could easily add more filters within awk to have IO.readlines return a 1-dim array of output lines from the smaller resulting CSV, eg.
awk -F, '$9 >= 2.01 && $2 > 99.99 {print $0}' datafile.csv
outputs in bash the lines that have a DivYld (col 9) over 2.01 and a price (col 2) over 99.99. (Unfortunately I'm not using the header row to determine field numbers, which is where I was ultimately hoping for some searchable associative Ruby array.)

Parse CSV file with header fields as attributes for each row

I would like to parse a CSV file so that each row is treated like an object, with the header row being the names of the attributes in the object. I could write this, but I'm sure it's already out there.
Here is my CSV input:
"foo","bar","baz"
1,2,3
"blah",7,"blam"
4,5,6
The code would look something like this:
CSV.open('my_file.csv','r') do |csv_obj|
puts csv_obj.foo #prints 1 the 1st time, "blah" 2nd time, etc
puts csv_obj.bar #prints 2 the first time, 7 the 2nd time, etc
end
With Ruby's CSV module I believe I can only access the fields by index. I think the above code would be a bit more readable. Any ideas?
Using Ruby 1.9 and above, you can get an indexable object:
CSV.foreach('my_file.csv', :headers => true) do |row|
puts row['foo'] # prints 1 the 1st time, "blah" 2nd time, etc
puts row['bar'] # prints 2 the first time, 7 the 2nd time, etc
end
It's not dot syntax but it is much nicer to work with than numeric indexes.
As an aside, for Ruby 1.8.x FasterCSV is what you need to use the above syntax.
Here is an example of the symbolic syntax using Ruby 1.9. In the examples below, the code reads a CSV file named data.csv from Rails db directory.
:headers => true treats the first row as a header instead of a data row. The :header_converters => :symbol parameter then converts each cell in the header row into a Ruby symbol.
CSV.foreach("#{Rails.root}/db/data.csv", :headers => true, :header_converters => :symbol) do |row|
puts "#{row[:foo]},#{row[:bar]},#{row[:baz]}"
end
In Ruby 1.8 (the fastercsv gem defines the class FasterCSV rather than CSV):
require 'fastercsv'
FasterCSV.foreach("#{Rails.root}/db/data.csv", :headers => true, :header_converters => :symbol) do |row|
puts "#{row[:foo]},#{row[:bar]},#{row[:baz]}"
end
Based on the CSV provided by Poul (the Stack Overflow asker), the output from the example code above will be:
1,2,3
blah,7,blam
4,5,6
Depending on the characters used in the headers of the CSV file, it may be necessary to output the headers in order to see how CSV (FasterCSV) converted the string headers to symbols. You can output the array of headers from within the CSV.foreach.
row.headers
Easy to get a hash in Ruby 2.3:
CSV.foreach('my_file.csv', headers: true, header_converters: :symbol) do |row|
puts row.to_h[:foo]
puts row.to_h[:bar]
end
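If dot syntax is really wanted, one option is to build a Struct from the header row. This is only a sketch: the variable names are hypothetical, the data is parsed from a string, and it assumes the headers are valid method names.

```ruby
require 'csv'

# In-memory copy of the CSV input from the question.
data = <<~CSV_TEXT
  "foo","bar","baz"
  1,2,3
  "blah",7,"blam"
  4,5,6
CSV_TEXT

table = CSV.parse(data, :headers => true, :header_converters => :symbol, :converters => :numeric)

# Build a Struct class whose members are the header names.
record_class = Struct.new(*table.headers)

# Wrap every row in a struct so fields are readable as methods.
records = table.map { |row| record_class.new(*row.fields) }

puts records[0].foo   # 1
puts records[1].bar   # 7
```

A Struct is created once per file, so a typo in an attribute name raises NoMethodError instead of silently returning nil as a hash lookup would.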
Although I am pretty late to the discussion, a few months ago I started a "CSV to object mapper" at https://github.com/vicentereig/virgola.
Given your CSV contents, mapping them to an array of FooBar objects is pretty straightforward:
"foo","bar","baz"
1,2,3
"blah",7,"blam"
4,5,6
require 'virgola'
class FooBar
include Virgola
attribute :foo
attribute :bar
attribute :baz
end
csv = <<CSV
"foo","bar","baz"
1,2,3
"blah",7,"blam"
4,5,6
CSV
foo_bars = FooBar.parse(csv).all
foo_bars.each { |foo_bar| puts foo_bar.foo, foo_bar.bar, foo_bar.baz }
Since I hit this question with some frequency:
array_of_hashmaps = CSV.read("path/to/file.csv", headers: true)
puts array_of_hashmaps.first["foo"] # 1
This is the non-block version, when you want to slurp the whole file.
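Strictly speaking, reading with headers: true returns a CSV::Table of CSV::Row objects rather than an array of hashes. The sketch below uses CSV.parse on a string, which returns the same kind of object as CSV.read, to show what the table supports; the variable names are illustrative.

```ruby
require 'csv'

# In-memory copy of the CSV input from the question.
data = "foo,bar,baz\n1,2,3\nblah,7,blam\n4,5,6\n"

table = CSV.parse(data, :headers => true)

puts table.class            # CSV::Table
puts table.first["foo"]     # 1 (row access by header)

# Indexing the table with a header name returns that whole column.
foo_column = table["foo"]
p foo_column                # ["1", "blah", "4"]

# Convert to plain hashes when a real array of hashes is needed.
hashes = table.map(&:to_h)
p hashes.first
```

Without converters, every value stays a string, so "1" here is the string "1" rather than an integer.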
