Ruby CSV, using square brackets as row separators - ruby

I'm trying to use square brackets '[]' as a row separator in a CSV file. I must use this exact format for this project (output needs to match LEDES98 law invoicing format exactly).
I'm trying to do this:
CSV.open('output.txt', 'w', col_sep: '|', row_sep: '[]') do |csv|
#Do Stuff
end
But Ruby won't take row_sep: '[]' and throws this error:
lib/ruby/1.9.1/csv.rb:2309:in `initialize': empty char-class: /[]\z/ (RegexpError)
I've tried escaping the characters with backslashes, using double quotes, etc., but nothing has worked yet. What's the way to do this?

The problem is in CSV#encode_re: the parameter row_sep: "|[]\n" is converted to a Regexp.
You can redefine this method:
class CSV
def encode_re(*chunks)
encode_str(*chunks)
end
end
CSV.open('output.txt', 'w', col_sep: '|', row_sep: "|[]\n"
) do |csv|
csv << [1,2,3]
csv << [4,5,6]
end
The result is:
1|2|3|[]
4|5|6|[]
I found no side effects, but I don't feel comfortable redefining CSV, so I would recommend creating a new CSV variant:
#Class to create LEDES98
class LEDES_CSV < CSV
def encode_re(*chunks)
encode_str(*chunks)
end
end
LEDES_CSV.open('output.txt', 'w', col_sep: '|', row_sep: "|[]\n"
) do |csv|
csv << [1,2,3]
csv << [4,5,6]
end
Then you can use the 'original' CSV and for LEDES-files you can use the LEDES_CSV.
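If you would rather not override anything in CSV, another option is to avoid passing the brackets as row_sep at all. The following is only a sketch of that alternative, not the approach from the answer above: it lets CSV.generate_line format and quote the fields, then appends the LEDES terminator by hand. The helper name ledes_line is made up for this illustration.

```ruby
require 'csv'

# Hypothetical helper: format one record with CSV (so quoting still works),
# strip the default "\n" row separator, then append the "|[]" terminator.
def ledes_line(fields)
  CSV.generate_line(fields, col_sep: '|').chomp + "|[]\n"
end

output = [[1, 2, 3], [4, 5, 6]].map { |row| ledes_line(row) }.join
# output can then be written with File.write('output.txt', output)
```

Because the brackets never reach CSV as a separator, no regular expression is built from them.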

Given an input string of the form
s = "[cat][dog][horsey\nhorse]"
you could use something like
s.scan(/\[(.*?)\]/m).flatten
which would return ["cat", "dog", "horsey\nhorse"] and process that with CSV module.
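A small sketch of that idea, assuming each bracketed record holds pipe-separated fields (the sample string here is invented for the illustration):

```ruby
require 'csv'

# Assumed input: records wrapped in square brackets, fields separated by "|".
s = "[1|2|3][4|5|6]"

# Pull out the text between each pair of brackets.
records = s.scan(/\[(.*?)\]/m).flatten

# Hand each record to CSV as a single line.
rows = records.map { |record| CSV.parse_line(record, col_sep: '|') }
```

Here records holds the raw record strings and rows holds one array of fields per record.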

I just tried
require 'csv'
#Create LEDES98
CSV.open('output.txt', 'w', col_sep: '|', row_sep: '[]') do |csv|
csv << [1,2,3]
csv << [4,5,6]
end
and I got
1|2|3[]4|5|6[]
Which csv/ruby-version do you use? My CSV::VERSION is 2.4.7, my ruby version is 1.9.2p290 (2011-07-09) [i386-mingw32].
Another remark:
If I look at the example files at http://www.ledes.org/, you need additional newlines. I would recommend using:
require 'csv'
#Create LEDES98
CSV.open('output.txt', 'w', col_sep: '|', row_sep: "[]\n") do |csv|
csv << [1,2,3,nil]
csv << [4,5,6,nil]
end
Result:
1|2|3|[]
4|5|6|[]
The additional nils give you the last | before the [].
I tested on another computer with ruby 1.9.3p194 (2012-04-20) [i386-mingw32] and get the same error.
I researched a bit and can isolate the problem:
p "[]" #[]
p "\[\]" #[] <--- Problem
p "\\[\\]" #\\[\\]
You can't escape the [ inside the string literal. If you escape it once, Ruby produces a plain [ (the backslash is dropped). If you escape it twice, you escape only the backslash itself, not the bracket.
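The error from the question can be reproduced without CSV at all, which may help to see what is going on. This is a sketch of my understanding, not the library's actual code: the separator ends up inside a pattern, and an unescaped [] is an empty character class. Regexp.escape produces a string that compiles.

```ruby
# Compiling the raw separator as a pattern raises RegexpError.
raw_error = begin
  Regexp.new("[]\\z")
  nil
rescue RegexpError => e
  e.message
end

# Regexp.escape adds the backslashes the pattern needs.
escaped = Regexp.escape("[]")
line_end = Regexp.new(escaped + "\\z")

# The escaped pattern matches a line that ends with the literal brackets.
matched = !!("1|2|3[]" =~ line_end)
```

Note that escaped is a string containing backslashes, which is what a pattern wants, while row_sep itself has to stay the literal "[]" so that it is written to the file unchanged.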

Related

Is there a way to make Ruby CSV gem generate CSVs with Windows (CR LF) End Of Lines?

For some reason CSV gem is generating CSVs with Unix EOL (see screenshot) here:
https://www.dropbox.com/s/4re7tpp4pj9psov/ice_screenshot_20171230-162304.png?dl=0
Screenshot made in Notepad++ (View all Characters)
Code I use:
require 'csv'
all_the_things = []
all_the_things << ["item1.1","item1.2","item1.3"]
all_the_things << ["item2.1","item2.1","item2.1"]
all_the_things << ["item3.1","item3.1","item3.1"]
CSV.open("test.csv", "wb" ) do |row|
row << ["Column1", "Column2", "Column3"] #just headers
all_the_things.each do |data|
row << data
end
end
Is there a way to make it use Windows EOL (CR LF) instead of UNIX (LF) ones ?
I'm using Windows 10, and if I just output some lines to a file using puts, everything works just fine (albeit managing a proper data structure without the CSV gem is a nightmare):
....
File.open("test.csv", "w") do |line|
myarray.each do |data|
line.puts data
end
end
Thank you in advance for any ideas and Happy New Year !
As it is clearly stated in the documentation, one might use the row_sep option to specify the row separator:
require 'csv'
# ⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓ here
CSV.open("/tmp/file.csv", "wb", row_sep: "\r\n") do |csv|
csv << %w|1 2 3 4|
csv << %w|a b c d|
end
Also, there is no “CSV gem,” it’s ruby standard library.
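To check the line endings without opening an editor, the same option can be tried on an in-memory string. A minimal sketch, with sample data made up for the illustration:

```ruby
require 'csv'

# Build the CSV in memory with CR LF as the row separator.
csv_string = CSV.generate(row_sep: "\r\n") do |csv|
  csv << ["Column1", "Column2"]
  csv << ["item1.1", "item1.2"]
end

# Every line, including the last one, ends with CR LF.
all_crlf = csv_string.lines.all? { |line| line.end_with?("\r\n") }
```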

ruby hash.values is not working with built in method

I tried almost everything, but I am feeling cornered.
I have a CSV and reading a line from it:
CSV.foreach(file, quote_char: '"', col_sep: ',', row_sep: :auto, headers: true) { |line|
newLine = []
newLine = line.values #undefined method .values
...
}
line is apparently a hash, because line['column_name'] works fine, and line.to_a returns ["col","value","col2","value2",...]
Please help, thank you!
You can use #fields on the class CSV::Row
http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV/Row.html
It is not a regular hash, it is an instance of CSV::Row, see here for the API
As you can see in the result of the following code the method values isn't there. Your solution of using line['column_name'] is fine.
You can get all the fields with the method fields without parameter.
require 'csv'
CSV.parse(DATA, :col_sep => ",", :headers => true).each do |row|
puts row.class
puts row.methods - Object.methods
end
__END__
kId,kName,kURL
1,Google UK,http://google.co.uk
2,Yahoo UK,http://yahoo.co.uk
It is a CSV row, which is part array and part hash, and it doesn't have a .values method. Use .to_hash first and then you will be able to use .values. (Note that this collapses any duplicate fields, and on Ruby versions before 1.9 it also loses the field ordering.)
newLine = line.to_hash.values
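A short sketch comparing the two suggestions side by side, using sample data invented for the illustration:

```ruby
require 'csv'

data = "kId,kName\n1,Google UK\n2,Yahoo UK\n"
table = CSV.parse(data, headers: true)

# Each element of the table is a CSV::Row, not a Hash.
row = table[0]

via_fields = row.fields           # values in column order
via_hash   = row.to_hash.values   # same values when headers are unique
headers    = row.headers
```

With unique headers both calls give the same array; fields is the more direct of the two.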

Ruby CSV - Illegal quoting in line 1. CSV::MalformedCSVError

I have a problem with reading from the csv file. File comes from Windows, so I suppose there are some encoding issues. My code looks like this:
CSV.open(path, 'w', headers: :first_row, col_sep: ';', row_sep: "\r\n", encoding: 'utf-8') do |csv|
CSV.parse(open(doc.file.url), headers: :first_row, col_sep: ';', quote_char: "\"", row_sep: "\r\n", encoding: 'utf-8').each_with_index do |line, index|
csv << line.headers if index == 0
# do something with row
csv << line
end
end
I have to open existing file and complete some columns from it. So I just create new file. The existing file is stored on Dropbox, so I have to use open method.
The problem is that I get an error in this line:
CSV.parse(open(doc.file.url), headers: :first_row, col_sep: ';', quote_char: "\"", row_sep: "\r\n", encoding: 'utf-8').each_with_index do |line, index|
The error is:
Illegal quoting in line 1. CSV::MalformedCSVError
I checked, and it seems like I don't have BOM characters in the file (I'm not sure I checked it correctly). The problem seems to be in the quote character. The exception is thrown for every line in the file.
This is the file that causes me problems: https://dl.dropboxusercontent.com/u/3900955/geo_bez_adresu_10_do_testow_small.csv
I tried different approaches from StackOverflow but nothing helps, for example I changed my code into this:
CSV.open(path, 'w', headers: :first_row, col_sep: ';', row_sep: "\r\n", encoding: 'utf-8') do |csv|
open(doc.file.url) do |f|
f.each_line do |line|
CSV.parse(line, 'r:bom|utf-8') do |row|
csv << row
end
end
end
end
but it doesn't help. I will be grateful for any help with parsing this file.
======= edit =========
When I save the same file on Windows with the encoding "ANSI as UTF-8" (in Notepad++), I can parse the file correctly. From the discussion 'What is "ANSI as UTF-8" and how can I make fputcsv() generate UTF-8 w/BOM?', it seems like I have a BOM in the original file. How can I check in Ruby whether my file has a BOM, and how can I parse a CSV file with a BOM?
CSV.parse() requires a string as its first argument, but you're passing a File object instead. What happens is that parse() ends up parsing the value of (file object).to_s instead, and that causes the error.
Update
To read file with BOM you can have this:
CSV.new(File.open('file.csv', 'r:bom|utf-8'), col_sep: ';').each do |row|
...
end
Reference: https://stackoverflow.com/a/7780559/445221
I didn't find any way to read directly from a remote file if it contains a BOM. So I use Tempfile to create a temporary file, and then I call CSV.open with 'r:bom|utf-8':
doc = Document.find(doc_id)
path = "#{Rails.root.join('tmp')}/#{doc.name.split('.').first}_#{Time.now.to_i}.csv"
file = Tempfile.new(["#{doc.name.split('.').first}_#{Time.now.to_i}", '.csv'])
file.binmode
file << open(doc.file.url).read
file.close
CSV.open(path, 'w', headers: :first_row, col_sep: ';', row_sep: "\r\n", encoding: 'utf-8') do |csv|
CSV.open(file.path, 'r:bom|utf-8', headers: :first_row, col_sep: ';', quote_char: "\"", row_sep: "\r\n").each_with_index do |line, index|
# do something
end
end
Now, it seems to parse the file.
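To see what the 'r:bom|utf-8' mode changes, here is a small self-contained sketch. The file contents are made up for the illustration; the file is written with a UTF-8 BOM on purpose and then read twice, once with and once without the bom| prefix.

```ruby
require 'csv'
require 'tempfile'

# Write a small semicolon-separated file that starts with a UTF-8 BOM.
tmp = Tempfile.new(['bom_sample', '.csv'])
tmp.close
File.binwrite(tmp.path, "\xEF\xBB\xBFname;city\r\nAnna;Gdansk\r\n")

# Plain UTF-8 mode: the BOM stays glued to the first header.
headers_plain = File.open(tmp.path, 'r:utf-8') do |io|
  CSV.new(io, col_sep: ';', headers: true).read.headers
end

# bom|utf-8 mode: Ruby skips the BOM before CSV sees the data.
headers_bom = File.open(tmp.path, 'r:bom|utf-8') do |io|
  CSV.new(io, col_sep: ';', headers: true).read.headers
end

tmp.unlink
```

Checking whether the first header starts with "\uFEFF" in the plain read is one way to tell that a file carries a BOM.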

Split output data using CSV in Ruby 1.9

I have a csv file that has 7000+ records that I process/manipulate and export to a new csv file. I have no issues doing that and everything works as expected.
I would like to change the process to where it breaks the output into multiple files. So instead of writing all 7000+ rows to the new csv file it would write the first 1000 rows to newexport1.csv and the next 1000 rows to newexport2.csv until it reaches the end of the data.
Is there an easy way to do this with CSV in Ruby 1.9?
My current write method:
CSV.open("#{PATH_TO_EXPORT_FILE}/newexport.csv", "w+", :col_sep => '|', :headers => true) do |f|
export_rows.each do |row|
f << row
end
end
The short answer is "no". You'll want to adjust your current code to split up the set and then dump each subset to a different file. This ought to be pretty close:
export_rows.each_slice(1000).with_index do |rows, idx|
CSV.open("#{PATH_TO_EXPORT_FILE}/newexport-#{idx.to_s}.csv", "w+", :col_sep => '|', :headers => true) do |f|
rows.each { |row| f << row }
end
end
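To see how the slicing numbers the files, the same idea can be tried in memory first. This is only a sketch: the rows are generated sample data, the names follow the newexport1.csv pattern from the question, and each chunk is built as a string with CSV.generate instead of being written to disk.

```ruby
require 'csv'

# Sample data standing in for the processed export rows.
export_rows = (1..2500).map { |i| [i, "name#{i}"] }

# Split into groups of 1000 and number the groups starting at 1.
chunks = export_rows.each_slice(1000).with_index(1).map do |rows, idx|
  content = CSV.generate(col_sep: '|') do |csv|
    rows.each { |row| csv << row }
  end
  ["newexport#{idx}.csv", content]
end

# chunks is a list of [file name, CSV text] pairs; the last one is shorter.
```

Each pair could then be written out with File.write(name, content) in the export directory.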
Yes, there is.
It's embedded in Ruby 1.9
Check this link
To read:
CSV.foreach("path/to/file.csv") do |row|
# manipulate the content
end
To write:
CSV.open("path/to/file.csv", "wb") do |csv|
csv << ["row", "of", "CSV", "data"]
csv << ["another", "row"]
# something else
end
I think that you'll need to combine one inside the other.
FasterCSV is the standard CSV library since ruby 1.9, you can find a lot of example code in the examples folder:
https://github.com/JEG2/faster_csv/tree/master/examples
For the example code to work, you should change:
require "faster_csv"
for
require "csv"

Ruby 1.9.2 export a CSV string without generating a file

I just can't get the 'To a String' example under 'Writing' in the documentation to work at all.
ruby -v returns:
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin10.8.0]
The example from the documentation I can't get working is here:
csv_string = CSV.generate do |csv|
csv << ["row", "of", "CSV", "data"]
csv << ["another", "row"]
end
The error I get is:
wrong number of arguments (0 for 1)
So it seems like I am missing an argument, in the documentation here it states:
This method wraps a String you provide, or an empty default String
But when I pass in an empty string, it gives me the following error:
No such file or directory -
I am not looking to generate a csv file, I just wanted to create a string of csv that I send as text to the user.
Here is code I know works against Ruby 1.9.2 with Rails 3.0.1
def export_csv(filename, header, rows)
require 'csv'
file = CSV.generate do |csv|
csv << header if not header.blank?
rows.map {|row| csv << row}
end
send_data file, :type => 'text/csv; charset=iso-8859-1; header=present', :disposition => "attachment;filename=#{filename}.csv"
end
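Outside of Rails, the string-building part can be checked on its own. A minimal sketch, assuming the standard library csv has been required:

```ruby
require 'csv'

# CSV.generate yields a CSV writer and returns everything written as a String.
csv_string = CSV.generate do |csv|
  csv << ["row", "of", "CSV", "data"]
  csv << ["another", "row"]
end

# No file is created; csv_string can be sent to the user as text.
```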

Resources