Parsing CSV with headers in Ruby? - ruby

This works:
require 'csv'
file = CSV.open(filename)
puts file.shift
This does not:
require 'csv'
file = CSV.open(filename, :headers=>true)
puts file.shift
I get:
C:/Program Files (x86)/IronRuby 1.1/Lib/ruby/1.9.1/csv.rb:2177:in `convert_fields': undefined method `with_index' for IronRuby.Builtins.Enumerator:Enumerator (NoMethodError)
from C:/Program Files (x86)/IronRuby 1.1/Lib/ruby/1.9.1/csv.rb:2218:in `parse_headers'
from C:/Program Files (x86)/IronRuby 1.1/Lib/ruby/1.9.1/csv.rb:1918:in `shift'
from C:/Program Files (x86)/IronRuby 1.1/Lib/ruby/1.9.1/csv.rb:1818:in `loop'
from C:/Program Files (x86)/IronRuby 1.1/Lib/ruby/1.9.1/csv.rb:1818:in `shift'
from C:/myproject/myproject/myproject/Program.rb:3
I am using IronRuby 1.1.3.
I am looking for the correct syntax to read a single line when the headers option is set.

I tested this in a different Ruby engine, where it works, so this seems to be a bug in IronRuby.
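If switching engines is not an option, one possible workaround is to skip the :headers option and pair the header line with the data line by hand. The sketch below assumes the failure is confined to the header-parsing path, as the stack trace suggests; it has not been verified on IronRuby, and first_record is a hypothetical helper name.

```ruby
require 'csv'

# Hypothetical helper: return the first data row as a Hash keyed by the
# header line, without passing :headers to CSV.
def first_record(path)
  CSV.open(path) do |csv|
    headers = csv.shift   # first line: the column names
    values  = csv.shift   # second line: the first data row
    return nil if headers.nil? || values.nil?
    Hash[headers.zip(values)]
  end
end
```

Calling first_record(filename) then gives a hash such as one mapping each column name to the value in the first data row, or nil when the file has fewer than two lines.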

You could try the following. Mind the :col_sep option and adapt it to the separator used in your CSV files.
If you want to show the headers:
require 'csv'
path = "G:/documents/musicinfo"
Dir.glob("#{path}/**/*.csv").each {|file|
CSV.foreach(file, :col_sep => ';') do |row|
puts "#{file} => #{row}"
break
end
}
This prints the header row of each file as an array:
G:/documents/musicinfo/album.csv => ["album_release_year", "album_id", "artist_id", "album"]
G:/documents/musicinfo/artist.csv => ["artist_id", "artist_name"]
G:/documents/musicinfo/Copy of all2.csv => ["album_release_year,album_id,track_id,artist_id,artist_name,duration,genre,track_popularity,disc_number,track_number,track,album"]
G:/documents/musicinfo/genres.csv => ["genreId", " genreNaam"]
G:/documents/musicinfo/songs.csv => ["album_id", " track_id", " artist_id", " track_popularity", "genre_id", "track"]
If you want to show the first data row instead, or keep the row as CSV, change the foreach line above to:
CSV.foreach(file, :col_sep => ';', :headers => true) do |row|
Which gives:
G:/documents/musicinfo/album.csv => 0000,10747312,11522976,Beyond Imagination (Bonus Track Version)
G:/documents/musicinfo/artist.csv => 10000004,Jimi Hendrix
G:/documents/musicinfo/Copy of all2.csv => "2008,12877382,24516577,10000004,Jimi Hendrix,05:39,Pop,9,16,1,10,Free spirit"
G:/documents/musicinfo/genres.csv => 1,Alternative
G:/documents/musicinfo/songs.csv => 12877382,24516577,10000004,9,16,Free spirit

Related

Gem not found when I have just installed it

I am creating a Ruby script which reads data from an Excel sheet and puts it into a MySQL database. I have written it and installed the necessary gems. However, when I try to run it on my cPanel host, I get the following error:
Array ( [0] => /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- ruby-mysql (LoadError) [1] => from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require' [2] => from ../ruby/InsertarFaltantesExcel.rb:2 )
Ruby Code:
require 'rubygems'
require 'ruby-mysql'
require 'spreadsheet'
#load './spreadsheet.rb'
con = Mysql.connect('xx', 'xx', 'xx','xx')
ARGV = "--help" if ARGV.empty?
workbook = Spreadsheet.open(ARGV[0])
sheet = workbook.worksheet(0)
# Defined before the loop below so that the method exists when it is called.
def insertar_faltantes(hash, con)
  statement = con.prepare("INSERT INTO articulos(art_id_verificador, art_id_orden, art_id_proveedor, art_shipping, art_N13, art_ISBN, art_titulo, art_SKU, art_cost, art_precio, art_master, art_cantidad, art_total, art_condition,
  art_tracking) VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?);")
  # One value per column, in the same order as the column list above.
  statement.execute "#{hash['id_verificador']}", "#{hash['order_id']}", "#{hash['id_proveedor']}", "#{hash['shipping']}", "#{hash['ean']}", "#{hash['isbn']}", "#{hash['description']}", "#{hash['sku']}", "#{hash['cost']}", "#{hash['order_price']}", "#{hash['master']}",
  "#{hash['quantity_purchased']}", "#{hash['total_price']}", "#{hash['condition']}", "#{hash['tracking']}"
end
sheet.each do |row|
  @faltantes = {
    "id_verificador" => "#{row[0]}",
    "order_id" => "#{row[1]}",
    "id_proveedor" => "#{row[28]}",
    "shipping" => "#{row[10]}",
    "ean" => "#{row[4]}",
    "isbn" => "#{row[5]}",
    "description" => "#{row[8]}",
    "sku" => "#{row[9]}",
    "cost" => "#{row[40]}",
    "order_price" => "#{row[14]}",
    "master" => "#{row[39]}",
    "quantity_purchased" => "#{row[11]}",
    "total_price" => "#{row[12]}",
    "condition" => "#{row[33]}",
    "tracking" => "#{row[29]}"
  }
  insertar_faltantes(@faltantes, con)
end
The gems in ~/ruby/gems/gems/ are not being recognised by your Ruby executable. Find where all the other gems are being kept and move these ones into there.
Alternatively, try using a different package manager. If you have installed gems successfully in the past, use the manager you used for that.
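To see where the Ruby executable actually looks for gems, a small check along these lines may help. It is only a sketch: it assumes a RubyGems version that provides Gem::Specification.find_all_by_name (older installations, such as the Ruby 1.8 setup in the question, may not), and gem_visible? is a hypothetical helper name.

```ruby
require 'rubygems'

# Hypothetical helper: true if this interpreter can see an installed gem
# with the given name.
def gem_visible?(name)
  !Gem::Specification.find_all_by_name(name).empty?
end

# Directories that this Ruby executable searches for installed gems.
puts Gem.path

# 'ruby-mysql' is the gem name taken from the question.
puts "ruby-mysql visible: #{gem_visible?('ruby-mysql')}"
```

If the directory holding the newly installed gems is missing from the printed list, the interpreter will not find them.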

ruby - get files of certain extension from a directory (windows)

I'm having trouble with the following piece of code, where a directory name entered by the user is used to fetch the list of files with a particular extension on a Windows machine:
puts "Enter the name of directory where files exist : "
directory = gets.chomp
csv_files = Dir.glob("#{directory}/*.csv")
Regardless of the directory entered (the directory does contain .csv files), the last line returns an empty array.
Ruby version: ruby 2.0.0p598 (2014-11-13) [i386-mingw32]
Additional info requested in the comments:
PS C:\test> irb
irb(main):001:0> directory = gets.chomp
C:\test
=> "C:\\test"
irb(main):002:0> directory
=> "C:\\test"
irb(main):003:0> Dir.glob("#{directory}/*.csv")
=> []
irb(main):004:0> Dir.glob("#{directory}/*.*")
=> []
irb(main):005:0> Dir.glob("C:/" + directory + "/*.csv")
=> []
irb(main):006:0> Dir.glob("C:/test/*")
=> ["C:/test/test_csv.csv"]
irb(main):007:0> Dir.entries(directory)
=> [".", "..","test_csv.csv"]
irb(main):010:0> Dir.glob('./*.csv')
=> ["./test_csv.csv"]
irb(main):011:0>
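I found the solution: use File.expand_path to convert the input string into a standard Ruby file path.
On Windows this replaces the backslashes with forward slashes. Dir.glob treats a backslash as an escape character rather than a path separator, which is why the pattern built from "C:\\test" matched nothing.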
puts "Enter the name of directory where files exist : "
directory = File.expand_path(gets.chomp)
csv_files = Dir.glob("#{directory}/*.csv")
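The same idea can be wrapped in a small method and tried against a temporary directory. This is a minimal sketch, assuming only the standard library; csv_files_in is a hypothetical helper name, and the temporary directory stands in for the user's input.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: normalise a user-supplied directory name and
# return the CSV files directly inside it.
def csv_files_in(directory)
  normalized = File.expand_path(directory)  # on Windows, also turns "\" into "/"
  Dir.glob(File.join(normalized, '*.csv')).sort
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, 'test_csv.csv'))
  FileUtils.touch(File.join(dir, 'notes.txt'))
  # Only the .csv file is listed.
  p csv_files_in(dir).map { |f| File.basename(f) }
end
```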

Working with large CSV files in Ruby

I want to parse two CSV files of the MaxMind GeoIP2 database, do some joining based on a column and merge the result into one output file.
I used the standard Ruby CSV library, but it is very slow. I think it tries to load the whole file into memory.
block_file = File.read(block_path)
block_csv = CSV.parse(block_file, :headers => true)
location_file = File.read(location_path)
location_csv = CSV.parse(location_file, :headers => true)
CSV.open(output_path, "wb",
         :write_headers => true,
         :headers => ["geoname_id", "Y", "Z"]) do |csv|
  block_csv.each do |block_row|
    puts "#{block_row['geoname_id']}"
    location_csv.each do |location_row|
      if block_row['geoname_id'] == location_row['geoname_id']
        puts " match :"
        csv << [block_row['geoname_id'], block_row['Y'], block_row['Z']]
        break
      end
    end
  end
end
Is there another Ruby library that supports processing in chunks?
block_csv is 800 MB and location_csv is 100 MB.
Just use CSV.open(block_path, 'r', :headers => true).each do |line| instead of File.read and CSV.parse. It will parse the file line by line.
In your current version, you explicitly tell it to read the whole file with File.read and then to parse that whole string with CSV.parse, so it does exactly what you told it to do.
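Streaming alone does not remove the nested loop, which rescans the location data for every block row. A sketch of one way to combine both ideas: index the smaller file's geoname_id values in a Set, then stream the large file row by row with CSV.foreach. The Set lookup is a technique swapped in here, not part of the answer above; join_blocks is a hypothetical helper name, and the column names follow the question.

```ruby
require 'csv'
require 'set'

# Hypothetical helper: write the block rows whose geoname_id also
# appears in the location file.
def join_blocks(block_path, location_path, output_path)
  # Index the smaller file once: collect every geoname_id it contains.
  location_ids = Set.new
  CSV.foreach(location_path, headers: true) { |row| location_ids << row['geoname_id'] }

  CSV.open(output_path, 'wb', write_headers: true, headers: %w[geoname_id Y Z]) do |out|
    # Stream the large file one row at a time instead of parsing it whole.
    CSV.foreach(block_path, headers: true) do |row|
      next unless location_ids.include?(row['geoname_id'])
      out << [row['geoname_id'], row['Y'], row['Z']]
    end
  end
end
```

Only the set of ids from the smaller file is held in memory; the 800 MB file is never loaded as a whole.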

Using Ruby CSV header converters

Say I have the following class:
class Buyer < ActiveRecord::Base
  attr_accessible :first_name, :last_name
end
and the following in a CSV file:
First Name,Last Name
John,Doe
Jane,Doe
I want to save the contents of the CSV into the database. I have the following in a Rake file:
namespace :migration do
desc "Migrate CSV data"
task :import, [:model, :file_path] => :environment do |t, args|
require 'csv'
model = args.model.constantize
path = args.file_path
CSV.foreach(path, :headers => true,
:converters => :all,
:header_converters => lambda { |h| h.downcase.gsub(' ', '_') }
) do |row|
model.create!(row.to_hash)
end
end
end
I am getting undefined method 'downcase' for nil:NilClass. If I exclude the header converters, I get unknown attribute 'First Name' instead. What's the correct syntax for converting a header from, say, First Name to first_name?
After some testing on my desktop, it seems to me that the error has a different cause.
First I put the data in my "a.txt" file as below:
First Name,Last Name
John,Doe
Jane,Doe
Then I ran the following code, saved as so.rb:
require 'csv'
CSV.foreach("C:\\Users\\arup\\a.txt",
:headers => true,
:converters => :all,
:header_converters => lambda { |h| h.downcase.gsub(' ', '_') }
) do |row|
p row
end
Running it gives:
C:\Users\arup>ruby -v so.rb
ruby 1.9.3p448 (2013-06-27) [i386-mingw32]
#<CSV::Row "first_name":"John" "last_name":"Doe">
#<CSV::Row "first_name":"Jane" "last_name":"Doe">
So everything works with that input. Now let me reproduce the error.
I changed the data in "a.txt" as below (I just added a , after the last column):
First Name,Last Name,
John,Doe
Jane,Doe
Then I ran so.rb again:
C:\Users\arup>ruby -v so.rb
ruby 1.9.3p448 (2013-06-27) [i386-mingw32]
so.rb:5:in `block in <main>': undefined method `downcase' for nil:NilClass (NoMethodError)
It seems that your header row contains a blank column, which is causing the error. If you have control over the source CSV file, check it for that. Otherwise, change your code to handle the nil header, as below:
require 'csv'
CSV.foreach("C:\\Users\\arup\\a.txt",
:headers => true,
:converters => :all,
:header_converters => lambda { |h| h.downcase.gsub(' ', '_') unless h.nil? }
) do |row|
p row
end
A more general answer: if you have code that needs to process values as text, and you might sometimes get a nil in there, call to_s on the object. This turns nil into an empty string, e.g.
h.to_s.downcase.gsub(' ', '_')
This will never blow up, whatever h is, because every object in Ruby has a to_s method, and it always returns a string (unless you've overridden it to do something else, which would be inadvisable).
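Passing :symbol to :header_converters will also convert the headers to snake case automatically, and it turns them into symbols.
options = {:headers => true,
           :header_converters => :symbol}
# On Ruby 2.0 and later the options hash has to be splatted into keywords.
CSV.foreach(filepath, **options) { |row| p row }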
#<CSV::Row first_name:"John" last_name:"Doe">
#<CSV::Row first_name:"Jane" last_name:"Doe">
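The nil-safe converter can be tried on an in-memory string that reproduces the trailing comma. This is a minimal sketch: the sample data mirrors the a.txt contents above, to_snake is a hypothetical name, and dropping the blank column before building the hashes is an added assumption about what the import wants.

```ruby
require 'csv'

# Sample data with the trailing comma that produces a blank header.
data = "First Name,Last Name,\nJohn,Doe\nJane,Doe\n"

# Nil-safe header converter: to_s turns a nil header into "".
to_snake = lambda { |h| h.to_s.strip.downcase.gsub(' ', '_') }

records = CSV.parse(data, headers: true, header_converters: to_snake).map do |row|
  # Drop the blank column so only real attribute names remain.
  row.to_hash.reject { |key, _value| key.to_s.empty? }
end

# Each record now has only the keys "first_name" and "last_name".
p records
```

Hashes shaped like these could then be passed to something like model.create! in the Rake task.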

Ruby csv read first line in csv file [duplicate]

In my app I'm reading a CSV file, but why does the view show only records 2 to n, without the first line?
Here is the code:
def common_uploader
require 'csv'
@csv = CSV.read("/#{Rails.public_path}/uploads_prices/"+params[:file], {:encoding => "CP1251:UTF-8", :col_sep => ";", :row_sep => :auto, :headers => :false})
end
I wrote :headers => :false, so why don't I get the first line of the CSV file? (Ruby 1.9.3)
So how can I get the first line as well?
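It should be false, not :false. The symbol :false is truthy, so CSV still treats the first line as a header row and leaves it out of the data.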
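The difference between the two values can be seen on a two-line sample. This is a minimal sketch; data_rows is a hypothetical helper name and the sample text is made up for illustration.

```ruby
require 'csv'

# Hypothetical helper: parse text with the given :headers value and
# return the rows that CSV treats as data.
def data_rows(text, headers_option)
  parsed = CSV.parse(text, col_sep: ';', headers: headers_option)
  # With headers enabled CSV.parse returns a CSV::Table of CSV::Row objects;
  # otherwise it returns a plain array of arrays.
  parsed.map { |row| row.is_a?(CSV::Row) ? row.fields : row }
end

text = "name;qty\napple;3\n"
p data_rows(text, :false)  # the symbol is truthy, so "name;qty" is consumed as headers
p data_rows(text, false)   # every line, including the first, comes back as data
```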
You can also slice the parsed result with [start, length], for example csv[0,10]:
require 'csv'
csv = CSV.parse(DATA.read, {
:col_sep => ",",
:headers => false
}
)
csv[0,10].each{|line|
p line #-> first 10 lines
}
__END__
00,a,b,c
01,a,b,c
02,a,b,c
03,a,b,c
04,a,b,c
05,a,b,c
06,a,b,c
07,a,b,c
08,a,b,c
09,a,b,c
10,a,b,c
11,a,b,c
12,a,b,c
13,a,b,c
14,a,b,c
15,a,b,c
16,a,b,c
17,a,b,c
18,a,b,c
19,a,b,c
20,a,b,c
But this still reads all lines into csv; only the output is restricted.
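To stop reading after the first few lines instead, one option is to take rows from the enumerator that CSV.foreach returns when called without a block. This is a sketch under that assumption (it holds for the csv library in current Ruby versions); first_rows is a hypothetical helper name.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: return at most `count` rows, reading no further
# into the file than necessary.
def first_rows(path, count, **csv_options)
  CSV.foreach(path, **csv_options).first(count)
end

Tempfile.create(['sample', '.csv']) do |f|
  # Same shape as the DATA section above: 00,a,b,c through 20,a,b,c
  21.times { |i| f.puts format('%02d,a,b,c', i) }
  f.flush
  rows = first_rows(f.path, 10, col_sep: ',', headers: false)
  puts rows.length  # 10
  p rows.first      # ["00", "a", "b", "c"]
end
```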
