Ruby csv read first line in csv file [duplicate] - ruby

In my app I'm reading a CSV file, but in the view I only get records 2 through n, without the first line. Why?
Here is the code:
def common_uploader
  require 'csv'
  @csv = CSV.read("/#{Rails.public_path}/uploads_prices/"+params[:file], {:encoding => "CP1251:UTF-8", :col_sep => ";", :row_sep => :auto, :headers => :false})
end
I wrote :headers => :false, so why don't I get the first line of the CSV file? (Ruby 1.9.3)
How can I get the first line as well?

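It should be false, not :false. The symbol :false is truthy in Ruby, so CSV treats the first line as a header row and leaves it out of the data.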
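A minimal sketch of the difference, using made-up inline data rather than the uploaded file from the question. It parses the same text once with the symbol and once with the literal:

```ruby
require 'csv'

# Invented sample data; the question reads a real file instead.
data = "id;name\n1;foo\n2;bar\n"

# :false is a Symbol, and every Symbol is truthy, so header handling is on.
with_symbol = CSV.parse(data, col_sep: ";", headers: :false)

# The literal false really switches header handling off.
with_false = CSV.parse(data, col_sep: ";", headers: false)

p with_symbol.class    # a CSV::Table; the first line became the headers
p with_symbol.headers
p with_false.first     # the first line comes back as an ordinary row
```

With the literal false, the result is a plain array of arrays and the first line is the first element.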

You can also use the [0,10] slice on the parsed CSV:
require 'csv'
csv = CSV.parse(DATA.read,
  :col_sep => ",",
  :headers => false
)
csv[0,10].each{|line|
  p line #-> first 10 lines
}
__END__
00,a,b,c
01,a,b,c
02,a,b,c
03,a,b,c
04,a,b,c
05,a,b,c
06,a,b,c
07,a,b,c
08,a,b,c
09,a,b,c
10,a,b,c
11,a,b,c
12,a,b,c
13,a,b,c
14,a,b,c
15,a,b,c
16,a,b,c
17,a,b,c
18,a,b,c
19,a,b,c
20,a,b,c
Note that this still reads every line into csv; only the output is limited.
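If the goal is to avoid parsing the whole file, one option is to take rows from the enumerator that CSV.foreach returns when called without a block. This is only a sketch: the temporary file and its contents are invented for the illustration.

```ruby
require 'csv'
require 'tempfile'

# Build a small sample file (invented data, mirroring the rows above).
file = Tempfile.new(['sample', '.csv'])
file.write((0...20).map { |i| format('%02d,a,b,c', i) }.join("\n") + "\n")
file.close

# Without a block, CSV.foreach returns an Enumerator; first(3) stops
# reading as soon as three rows have been parsed.
first_rows = CSV.foreach(file.path, col_sep: ',').first(3)
first_rows.each { |row| p row }
```

Only the requested rows are parsed, so this stays cheap on large files.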

Related

Working with large CSV files in Ruby

I want to parse two CSV files from the MaxMind GeoIP2 database, join them on a column and merge the result into one output file.
I used the standard Ruby CSV library, and it is very slow. I think it tries to load the whole file into memory.
block_file = File.read(block_path)
block_csv = CSV.parse(block_file, :headers => true)
location_file = File.read(location_path)
location_csv = CSV.parse(location_file, :headers => true)
CSV.open(output_path, "wb",
  :write_headers=> true,
  :headers => ["geoname_id","Y","Z"] ) do |csv|
  block_csv.each do |block_row|
    puts "#{block_row['geoname_id']}"
    location_csv.each do |location_row|
      if (block_row['geoname_id'] == location_row['geoname_id'])
        puts " match :"
        csv << [block_row['geoname_id'],block_row['Y'],block_row['Z']]
        break location_row
      end
    end
  end
end
Is there another Ruby library that supports processing in chunks?
block_csv is 800 MB and location_csv is 100 MB.
Just use CSV.open(block_path, 'r', :headers => true).each do |line| instead of File.read and CSV.parse. It will parse the file line by line.
In your current version, you explicitly tell it to read the whole file with File.read and then to parse the whole file as a string with CSV.parse. So it does exactly what you told it to do.

Taking json data and converting it to a CSV file

Okay... so new to Ruby here but loving it so far. My problem is I cannot get the data to go into the CSV files.
#!/usr/bin/env ruby
require 'date'
require_relative 'amf'
require 'json'
require 'csv'
amf = Amf.new
#This makes it go out 3 days
apps = amf.post( 'Appointments.getBetweenDates',
{ 'startDate' => Date.today, 'endDate' => Date.today + 4 }
)
apps.each do |app|
  cor_md_params = { 'appId' => app['appID'], 'relId' => 7 }
  cor_md = amf.post( 'Clinicians.getByAppIdAndRelId', cor_md_params ).first
  #this is where it breaks ----->
  CSV.open("ile.csv", "wb") do |csv|
    csv << ["column1", "column2", "etc.", "etc.."]
    csv << ([
      # if added puts ([ I can display the info and then make a csv...
      app['patFirstName'],
      app['patMiddleName'],
      app['patLastName'],
      app['patBirthdate'],
      app['patHin'],
      app['patPhone'],
      app['patCellPhone'],
      app['patBusinessPhone'],
      app['appTime'],
      app['appID'],
      app['patPostalCode'],
      app['patProvince'],
      app['locName'],
      # note that this is not exactly accurate for follow-ups,
      # where you have to replace the "1" with the actual value
      # in weeks, days, months, etc
      #app[ 'bookName' ], => not sure this is needed
      cor_md['id'],
      cor_md['providerCode'],
      cor_md['firstName'],
      cor_md['lastName']
    ].join(', '))
  end
end
Now, if I remove the attempt to make the ile.csv file and just output it with a puts, all the data shows. But I don't want to have to go into the terminal and create a CSV file by hand. I would rather just run the .rb program and have it created. Hopefully I am making the columns correctly as well.
The thought occurred to me that I could just add another puts above the output.
Or, better, insert a row into the array before I output it.
I'm really not sure what the best practice and standards are here.
This is what I have attempted. How can I get it to output cleanly to a CSV file, since my attempts are not working?
To clarify where it breaks: it does add the column names, just not the parsed JSON info. I could also be doing this completely the wrong way, or in a way that isn't possible. I just do not know.
What kind of error do you get? Is it this one?
`<<': undefined method `map' for "something":String (NoMethodError)
I think you should remove the .join(', ').
The << method of CSV accepts an array, not a String:
http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html#method-i-3C-3C
So instead of:
cor_md['lastName']
].join(', '))
write:
cor_md['lastName']
])
The problem with the loop (why it writes only one row of data):
In the body of your loop you reopen the file on every iteration, which overwrites what you wrote before. What you probably want is this:
CSV.open("ile3.csv", "wb") do |csv|
  csv << ["column1", "column2", "etc.", "etc.."]
  apps.each do |app|
    cor_md_params = { 'appId' => app['appID'], 'relId' => 7 }
    cor_md = amf.post( 'Clinicians.getByAppIdAndRelId', cor_md_params ).first
    #csv << your long array
  end
end
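Both fixes together can be sketched as follows. The apps array here is an invented stand-in for what amf.post returns, with only three of the fields, and CSV.generate is used instead of a file so the result is easy to inspect.

```ruby
require 'csv'

# Hypothetical stand-in for the appointment data from the question.
apps = [
  { 'patFirstName' => 'Ann', 'patLastName' => 'Lee',      'appID' => 1 },
  { 'patFirstName' => 'Bob', 'patLastName' => 'Ray, Jr.', 'appID' => 2 }
]
columns = %w[patFirstName patLastName appID]

# Open the output once, write the header once, then one array per row.
csv_text = CSV.generate do |csv|
  csv << columns
  apps.each do |app|
    # Pass an Array, not a joined String; CSV handles separators and quoting.
    csv << app.values_at(*columns)
  end
end

puts csv_text
```

Passing arrays also means a value that contains a comma, such as "Ray, Jr.", is quoted properly instead of being split into two columns.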

Split output data using CSV in Ruby 1.9

I have a csv file that has 7000+ records that I process/manipulate and export to a new csv file. I have no issues doing that and everything works as expected.
I would like to change the process to where it breaks the output into multiple files. So instead of writing all 7000+ rows to the new csv file it would write the first 1000 rows to newexport1.csv and the next 1000 rows to newexport2.csv until it reaches the end of the data.
Is there an easy way to do this with CSV in Ruby 1.9?
My current write method:
CSV.open("#{PATH_TO_EXPORT_FILE}/newexport.csv", "w+", :col_sep => '|', :headers => true) do |f|
  export_rows.each do |row|
    f << row
  end
end
The short answer is "no". You'll want to adjust your current code to split up the set and then dump each subset to a different file. This ought to be pretty close:
export_rows.each_slice(1000).with_index do |rows, idx|
  CSV.open("#{PATH_TO_EXPORT_FILE}/newexport-#{idx}.csv", "w+", :col_sep => '|', :headers => true) do |f|
    rows.each { |row| f << row }
  end
end
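A self-contained sketch of the same each_slice idea, numbering the files from 1 as the question asks (newexport1.csv, newexport2.csv, ...). The write_in_chunks helper, the temporary directory and the sample rows are assumptions made for the illustration.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: writes rows to numbered files of at most chunk_size
# rows each and returns the paths it created.
def write_in_chunks(rows, dir, chunk_size)
  paths = []
  # with_index(1) starts the numbering at 1 instead of 0.
  rows.each_slice(chunk_size).with_index(1) do |chunk, n|
    path = File.join(dir, "newexport#{n}.csv")
    CSV.open(path, 'w', col_sep: '|') do |csv|
      chunk.each { |row| csv << row }
    end
    paths << path
  end
  paths
end

# Invented sample data: five rows split into chunks of two.
rows = (1..5).map { |i| [i, "name#{i}"] }
dir = Dir.mktmpdir
paths = write_in_chunks(rows, dir, 2)
puts paths
```

The last file simply holds whatever rows are left over, so no special handling is needed when the row count is not a multiple of the chunk size.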
Yes, there is.
It's built into Ruby 1.9.
To read:
CSV.foreach("path/to/file.csv") do |row|
  # manipulate the content
end
To write:
CSV.open("path/to/file.csv", "wb") do |csv|
  csv << ["row", "of", "CSV", "data"]
  csv << ["another", "row"]
  # something else
end
I think you'll need to nest one inside the other.
FasterCSV has been the standard CSV library since Ruby 1.9. You can find a lot of example code in the examples folder:
https://github.com/JEG2/faster_csv/tree/master/examples
For the example code to work, replace:
require "faster_csv"
with:
require "csv"

Removing whitespaces in a CSV file

I have a string with extra whitespace:
First,Last,Email ,Mobile Phone ,Company,Title ,Street,City,State,Zip,Country, Birthday,Gender ,Contact Type
I want to parse this line and remove the whitespaces.
My code looks like:
namespace :db do
  task :populate_contacts_csv => :environment do
    require 'csv'
    csv_text = File.read('file_upload_example.csv')
    csv = CSV.parse(csv_text, :headers => true)
    csv.each do |row|
      puts "First Name: #{row['First']} \nLast Name: #{row['Last']} \nEmail: #{row['Email']}"
    end
  end
end
@prices = CSV.parse(IO.read('prices.csv'), :headers=>true,
  :header_converters=> lambda {|f| f.strip},
  :converters=> lambda {|f| f ? f.strip : nil})
The nil test is in the field converter but not in the header converter, on the assumption that headers are never nil while data fields might be, and nil doesn't have a strip method. I'm really surprised that, AFAIK, :strip is not a predefined converter!
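A runnable sketch of the converter approach on inline data. The sample row is invented, and a single lambda is reused for both headers and fields, guarded so that it leaves nil (or any non-String value) untouched.

```ruby
require 'csv'

# One converter for both headers and fields; non-String values pass through.
strip = ->(field) { field.respond_to?(:strip) ? field.strip : field }

# Invented sample with stray whitespace around headers and values.
data = "First,Last,Email ,Mobile Phone \nAnn , Lee, ann@example.com ,\n"

table = CSV.parse(data, headers: true,
                  header_converters: strip, converters: strip)
row = table.first

p table.headers
p row['Email']
```

Because the headers are stripped too, row['Email'] works even though the file spells the column as "Email " with a trailing space.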
You can strip your hash first:
csv.each do |unstripped_row|
  row = {}
  unstripped_row.each{|k, v| row[k.strip] = v && v.strip}
  puts "First Name: #{row['First']} \nLast Name: #{row['Last']} \nEmail: #{row['Email']}"
end
This strips the hash keys as well as the values.
CSV supports "converters" for the headers and fields, which let you get inside the data before it's passed to your each loop.
Writing a sample CSV file:
csv = "First,Last,Email ,Mobile Phone ,Company,Title ,Street,City,State,Zip,Country, Birthday,Gender ,Contact Type
first,last,email ,mobile phone ,company,title ,street,city,state,zip,country, birthday,gender ,contact type
"
File.write('file_upload_example.csv', csv)
Here's how I'd do it:
require 'csv'
csv = CSV.open('file_upload_example.csv', :headers => true)
[:convert, :header_convert].each { |c| csv.send(c) { |f| f.strip } }
csv.each do |row|
  puts "First Name: '#{row['First']}'\nLast Name: '#{row['Last']}'\nEmail: '#{row['Email']}'"
end
Which outputs:
First Name: 'first'
Last Name: 'last'
Email: 'email'
The converters simply strip leading and trailing whitespace from each header and each field as they're read from the file.
Also, as a programming design choice, don't read your file into memory using:
csv_text = File.read('file_upload_example.csv')
Then parse it:
csv = CSV.parse(csv_text, :headers => true)
Then loop over it:
csv.each do |row|
Ruby's IO system supports enumerating over a file line by line. Once my code calls CSV.open, the file is readable and each reads one line at a time. The entire file never has to be in memory at once. Loading it all is not scalable (though on newer machines it is becoming more reasonable), and if you test, you'll find that reading a file with each is extremely fast, probably as fast as reading it, parsing it and then iterating over the parsed result.

Can I delete columns from CSV using Ruby?

Looking at the documentation for the CSV library of Ruby, I'm pretty sure this is possible and easy.
I simply need to delete the first three columns of a CSV file using Ruby but I haven't had any success getting it run.
csv_table = CSV.read(file_path_in, :headers => true)
csv_table.delete("header_name")
csv_table.to_csv # => The new CSV in string format
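Applied to the question as asked (dropping the first three columns), the same CSV::Table#delete call could be used once per header. A sketch with invented inline data:

```ruby
require 'csv'

# Invented sample data with five columns.
table = CSV.parse("a,b,c,d,e\n1,2,3,4,5\n6,7,8,9,10\n", headers: true)

# Take a copy of the first three header names, then delete each column.
table.headers.first(3).each { |name| table.delete(name) }

result = table.to_csv
puts result
```

This relies on the file having a header row; for headerless files a positional approach is needed instead.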
Check the CSV::Table documentation: http://ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV/Table.html
csv_table = CSV.read("../path/to/file.csv", :headers => true)
keep = ["x", "y", "z"]
new_csv_table = csv_table.by_col!.delete_if do |column_name,column_values|
  !keep.include? column_name
end
new_csv_table.to_csv
What about:
require 'csv'
File.open("resfile.csv","w+") do |f|
  CSV.foreach("file.csv") do |row|
    f.puts(row[3..-1].join(","))
  end
end
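One caveat with joining the remaining fields by hand is that a field containing a comma is written without quotes. A variant of the same positional idea, sketched on invented inline data, passes the truncated row back through CSV so quoting is preserved:

```ruby
require 'csv'

# Invented sample; the fourth field of the second line contains a comma.
input = "a,b,c,d,e\n1,2,3,\"x, y\",5\n"

output = CSV.generate do |out|
  CSV.parse(input) do |row|
    # drop(3) removes the first three fields and returns [] for short rows.
    out << row.drop(3)
  end
end

puts output
```

For files on disk, CSV.foreach and CSV.open could take the place of CSV.parse and CSV.generate in the same structure.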
I have built on a few of the answers here (I really liked what @fguillen did with CSV::Table) but made it a bit simpler to drop into an existing project, target a file and make a quick change.
I added byebug because ... yes. I also retained the headers from the original file (assuming they exist, for anyone wanting to use this snippet).
The output file is overwritten each time in case you want to test or tinker.
require 'csv'
require 'byebug'
in_file = './db/data/inbox/change__to_file_name.csv'
out_file = in_file + ".out"
target_col = "change_to_column_name"
csv_table = CSV.read(in_file, headers: true)
csv_table.delete(target_col)
CSV.open(out_file, 'w+', force_quotes: true) do |csv|
  csv << csv_table.headers
  csv_table.each do |row|
    csv << row
  end
end
