I am trying to write a CSV "fixer".
Unfortunately, it seems that CSV.foreach is not calling the lambda I have created. The CPU sits at 100%, and I am wondering what Ruby is doing in the meantime...
Any ideas why my code is wrong?
1 require "csv"
2
3 ARGV.empty? do
4 print "usage: fixcsv.rb <filename>"
5 exit
6 end
7
8 filename_orig = Dir.pwd + "/" + ARGV[0]
9 filename_dest = filename_orig.sub(/csv$/,"tmp.csv")
10 topic = filename_orig.sub(/_entries.csv$/,"").sub(/.*\//,"")
11
12 puts "topic:" + topic
13
14 writer = CSV.open(filename_dest,"w",:col_sep=>";")
15 #i=0
16 cycler = lambda do |row|
17 #i = i + 1
18 #puts "row number:" + i.to_str
19 #row[17] = topic
20 puts "foo"
21 writer << row
22 end
23
24 begin
25 CSV.foreach(filename_orig,:col_sep=>",",&cycler)
26 rescue
27 puts "exception:" + $!.message
28 exit
29 else
30 writer.close
31 end
Here is the stack trace produced when I Ctrl-C it:
stab@ubuntu:~/wok$ ruby addtopic.rb civilpoliticalrights_entries.csv
topic:civilpoliticalrights
^C/usr/lib/ruby/1.8/csv.rb:914:in `buf_size': Interrupt
from /usr/lib/ruby/1.8/csv.rb:825:in `[]'
from /usr/lib/ruby/1.8/csv.rb:354:in `parse_body'
from /usr/lib/ruby/1.8/csv.rb:227:in `parse_row'
from /usr/lib/ruby/1.8/csv.rb:637:in `get_row'
from /usr/lib/ruby/1.8/csv.rb:556:in `each'
from /usr/lib/ruby/1.8/csv.rb:531:in `parse'
from /usr/lib/ruby/1.8/csv.rb:311:in `open_reader'
from /usr/lib/ruby/1.8/csv.rb:94:in `foreach'
from addtopic.rb:25
EDIT: Ruby version is:
$ ruby --version
ruby 1.8.7 (2010-01-10 patchlevel 249) [i486-linux]
Your program worked fine for me in Ruby 1.9.
I have a few observations:
If your input pathname does not end in csv, then the input and output file names will be the same. This could easily produce an infinite loop.
You are definitely using the 1.9 flavor of csv. If this program needs to run on 1.8.7 it would need to have patches from the snippet below...
Mods for 1.8.7:
writer = CSV.open(filename_dest, "w", ?;)
#i=0
cycler = lambda do |row|
#i = i + 1
#puts "row number:" + i.to_str
#row[17] = topic
writer << row
end
begin
CSV.open filename_orig, 'r', ?,, &cycler
The main problem with the 1.8.7 csv library is that CSV.open and CSV.foreach do not take Hash options. Worse, they expect the separator as a numeric character code (in 1.8, ?; evaluates to the integer 59), a feature of Ruby that apparently didn't work out and was withdrawn in 1.9, where ?; is a one-character String.
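On the first observation (input and output names colliding), here is a minimal sketch of a guard. The helper name dest_name_for is made up for illustration; it only assumes the same "replace the csv suffix with tmp.csv" rule as the original script and refuses to continue when the substitution changes nothing.

```ruby
# Hypothetical helper: derive the output name and refuse to overwrite the input.
def dest_name_for(source)
  # Same idea as filename_orig.sub(/csv$/, "tmp.csv"), anchored on the extension.
  dest = source.sub(/\.csv\z/, ".tmp.csv")
  # If nothing was substituted, reading and writing would hit the same file.
  raise ArgumentError, "expected a .csv file, got #{source}" if dest == source
  dest
end

puts dest_name_for("civilpoliticalrights_entries.csv")
```

With a guard like this, a file name that does not end in .csv fails fast instead of having the script read from and write to the same path.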
I have written some code out of the Ruby Pickaxe book and I am trying to get it to work.
(around page 62 of "Programming Ruby The Pragmatic Programmer's Guide")
Edit: More info on the book: (C) 2009, for Ruby 1.9
Given this error message, I am not quite sure how to identify what is going wrong. I appreciate any help in understanding what is going wrong here.
How does one know what to identify and solve?
I am wondering if Ruby's CSV functionality is really just this easy-- no gem/bundle install to run?
I would really like to be able to run my test_code.rb file, but I am unable to figure out this error.
Thank you for your time,
Patrick
Note: all of these files are in the same directory.
IRB command, followed by the error message it generates:
2.1.1 :005 > load "test_code.rb"
LoadError: cannot load such file -- csv-reader
from /Users/patrickmeaney/.rvm/rubies/ruby-2.1.1/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /Users/patrickmeaney/.rvm/rubies/ruby-2.1.1/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from test_code.rb:3:in `<top (required)>'
from (irb):5:in `load'
from (irb):5
from /Users/patrickmeaney/.rvm/rubies/ruby-2.1.1/bin/irb:11:in `<main>'
I don't know how relevant this is, based on the error message, but thought I'd include it.
kernel_require.rb line 55:
if Gem::Specification.unresolved_deps.empty? then
begin
RUBYGEMS_ACTIVATION_MONITOR.exit
return gem_original_require(path)
ensure
RUBYGEMS_ACTIVATION_MONITOR.enter
end
end
line 9-11 of irb:
require "irb"
IRB.start(__FILE__)
First file of program: csv-reader.rb
require 'csv'
require 'book-in-stock'
class CsvReader
  def initialize
    @books_in_stock = []
  end
  def read_in_csv_data(csv_file_name)
    CSV.foreach(csv_file_name, headers: true) do |row|
      @books_in_stock << BookInStock.new(row["ISBN"], row["Amount"])
    end
  end
  def total_value_in_stock
    sum = 0.0
    @books_in_stock.each {|book| sum += book.price}
  end
  def number_of_each_isbn
  end
end
Second file: book-in-stock.rb
class BookInStock
  attr_reader :isbn
  attr_accessor :price
  def initialize(isbn, price)
    @isbn = isbn
    @price = Float(price)
  end
  def price_in_cents
    Integer(price*100 + 0.5)
  end
  def price_in_cents=(cents)
    @price = cents / 100.0
  end
end
Third file: stock-stats.rb
require 'csv-reader'
reader = CsvReader.new
ARGV.each do |csv_file_name|
STDERR.puts "Processing #{csv_file_name}"
reader.read_in_csv_data(csv_file_name)
end
puts "Total value = #{reader.total_value_in_stock}"
Fourth file: test_code.rb
# this is the test code file
require 'csv-reader'
require 'book-in-stock'
require 'stock-stats'
# code to call
reader = CsvReader.new
reader.read_in_csv_data("file1.csv")
reader.read_in_csv_data("file2.csv")
puts "Total value in stock = #{reader.total_value_in_stock}"
# code to call
book = BookInStock.new("isbn1", 33.80)
puts "Price = #{book.price}"
puts "Price in cents = #{book.price_in_cents}"
book.price_in_cents = 1234
puts "Price = #{book.price}"
puts "Price in cents = #{book.price_in_cents}"
CSV files:
file1.csv
ISBN, Amount
isbn1, 49.00
isbn2, 24.54
isbn3, 33.23
isbn4, 15.55
file2.csv
ISBN, Amount
isbn5-file2, 39.98
isbn6-file2, 14.84
isbn7-file2, 43.63
isbn8-file2, 25.55
Edit
After Frederick Cheung's suggestion to change require to require_relative (for all but the first line of csv-reader.rb), the script runs, but a method is not working (see below).
(I did receive an error about this line:
@price = Float(price)
and changed it to @price = price.to_f and it runs just fine.)
3 Questions:
-> I changed the header of my csv files to "ISBN, Amount". Previously Amount was amount (not capitalized). Does this matter (i.e. the capitalizing of the header)?
-> While we're on the subject, what is the "row" keyword doing in the following #read_in_csv_data method?
-> Now that my code runs it appears that the output for "Total value in stock" is not summing up all of the prices in the csv file. Could a Rubyist please help me understand why this is happening?
The method
def read_in_csv_data(csv_file_name)
  CSV.foreach(csv_file_name, headers: true) do |row|
    @books_in_stock << BookInStock.new(row["ISBN"], row["Amount"])
  end
end
and call seem fine to me...
reader = CsvReader.new
reader.read_in_csv_data("file1.csv")
reader.read_in_csv_data("file2.csv")
Here is the current output from terminal:
Total value = []
Price = 33.8
Price in cents = 3380
Price = 12.34
Price in cents = 1234
Total value in stock = [#<BookInStock:0xb8168a60 @isbn="isbn1", @price=0.0>, #<BookInStock:0xb8168740 @isbn="isbn2", @price=0.0>, #<BookInStock:0xb8168358 @isbn="isbn3", @price=0.0>, #<BookInStock:0xb81546f0 @isbn="isbn4", @price=0.0>, #<BookInStock:0xb8156a18 @isbn="isbn5-file2", @price=0.0>, #<BookInStock:0xb8156784 @isbn="isbn6-file2", @price=0.0>, #<BookInStock:0xb81564a0 @isbn="isbn7-file2", @price=0.0>, #<BookInStock:0xb8156248 @isbn="isbn8-file2", @price=0.0>]
Thanks again.
Edit: Big thanks to 7Stud for a very thorough followup answer on every question I had. You have been exceptionally helpful. I have learned several important things thanks to your post.
Edit:
Still not able to get the code to run.
I am not sure how to add to / edit the $LOAD_PATH, so I tried putting all of the files into this directory:
directory: ~MY_RUBY_HOME/lib/ruby/site_ruby/2.1.0/csv-reader
(i.e. /Users/patrickmeaney/.rvm/rubies/ruby-2.1.1/lib/ruby/site_ruby/2.1.0/csv-reader)
However, I still receive the same error message:
✘ ~MY_RUBY_HOME/lib/ruby/site_ruby/2.1.0/csv-reader ruby test_code.rb file1.csv file2.csv
/Users/patrickmeaney/.rvm/rubies/ruby-2.1.1/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- ./csv_reader (LoadError)
from /Users/patrickmeaney/.rvm/rubies/ruby-2.1.1/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from test_code.rb:1:in `<main>'
I have written some code out of the Ruby Pickaxe book
Yeah, but there are many Ruby Pickaxe books.
IRB command, followed by the error message it generates:
NEVER run anything in IRB. Never use IRB for ANYTHING. Instead put your code in a file, and then run the file, e.g:
$ ruby my_prog.rb
LoadError: cannot load such file -- csv-reader
If the files you want to require are not located in the directories Ruby searches automatically (to see those directories, execute the line p $LOAD_PATH), then you can specify the absolute or relative path to the file in the require statement:
require './book_in_stock'
I did receive an error about this line: @price = Float(price) and
changed it to @price = price.to_f and it runs just fine.
x = 'hello'
p x.to_f
p Float(x)
--output:--
0.0
1.rb:3:in `Float': invalid value for Float(): "hello" (ArgumentError)
from 1.rb:3:in `<main>'
The difference between Float() and to_f() is that Float() raises an exception when it cannot convert the String to a Float, while to_f() silently returns 0.0. Unless you know what you are doing, it's probably best to use Float(), so that you are alerted to the fact that your data has an error in it.
While we're on the subject, what is the "row" keyword doing in the
following #read_in_csv_data method?
When you loop through the rows of your file (e.g. with CSV.foreach), csv converts one row of your file into a thing called a "CSV::Row", and then assigns the "CSV::Row" object to the loop variable, which you have named "row":
CSV.foreach(csv_file_name, headers: true) do |row|
                                               ^
                                               |
So "row" is a variable that refers to a "CSV::Row". A "CSV::Row" acts like a hash, enabling you to write things like row['ISBN'] to retrieve the value in that column.
Spaces are significant in csv files. If your header row is ISBN, Amount, then the column names are "ISBN" and " Amount" (see the leading space?). That means there is no value for
row['Amount']
i.e. it will return nil, but there is a value for
row[' Amount']
     ^
     |
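To see the leading-space effect without touching your files, here is a small sketch that parses an in-memory string laid out like your file1.csv. The header_converters lambda that strips whitespace is one possible workaround, not something from the book; note that it cleans the header names only, so the values keep their leading space (Float() tolerates that).

```ruby
require 'csv'

# Same layout as the original file1.csv: a space after each comma.
data = "ISBN, Amount\nisbn1, 49.00\n"

row = CSV.parse(data, headers: true).first
p row['Amount']    # no column has exactly this name
p row[' Amount']   # the real header keeps its leading space

# Possible workaround: strip whitespace from every header name while parsing.
stripped = CSV.parse(data, headers: true,
                     header_converters: ->(header) { header.strip }).first
p stripped['Amount']
p Float(stripped['Amount'])
```

Removing the spaces from the csv files themselves, as in the amended files in this answer, is the simpler fix.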
Now that my code runs it appears that the output for "Total value in
stock" is not summing up all of the prices in the csv file. Could a
Rubyist please help me understand why this is happening?
1) A def returns the value of the last statement that was executed in the def.
2) Array#each() returns the array.
Here is your def:
def total_value_in_stock
  sum = 0.0
  @books_in_stock.each {|book| sum += book.price}
end
That def returns the @books_in_stock array. You need to return the sum:
def total_value_in_stock
  sum = 0.0
  @books_in_stock.each {|book| sum += book.price}
  sum
end
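The two rules are easy to check in isolation. This sketch uses a plain array of prices (made-up values standing in for your BookInStock objects) to show that each hands back the array itself, and that inject is an alternative that returns the accumulated total directly.

```ruby
prices = [49.00, 24.54]   # stand-in values, not your real objects

sum = 0.0
result = prices.each { |price| sum += price }
puts result.equal?(prices)   # each returns the very same array it was called on

# inject threads the running total through the block and returns it.
total = prices.inject(0.0) { |acc, price| acc + price }
puts total
```

So total_value_in_stock could also be written as a single inject call over the books in stock, which removes the need for the trailing sum line.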
If you want to get tricky, you can have csv automatically convert any data in your file that looks like a number to a number:
CSV.foreach(
csv_file_name,
headers: true,
:converters => :numeric
) do |row| ...
...then your BookInStock class would look like this:
class BookInStock
  attr_reader :isbn
  attr_accessor :price
  def initialize(isbn, price)
    @isbn = isbn
    @price = price # already numeric thanks to the converter, so no Float(price)
  end
end
Here are all your files amended so they will run correctly:
csv_reader.rb:
require 'csv'
require './book_in_stock'
class CsvReader
  def initialize
    @books_in_stock = []
  end
  def read_in_csv_data(csv_file_name)
    CSV.foreach(csv_file_name, headers: true) do |row|
      @books_in_stock << BookInStock.new(row["ISBN"], row["Amount"])
    end
  end
  def total_value_in_stock
    sum = 0.0
    @books_in_stock.each {|book| sum += book.price}
    sum
  end
  def number_of_each_isbn
  end
end
stock_stats.rb:
require './csv_reader'
reader = CsvReader.new
ARGV.each do |csv_file_name|
STDERR.puts "Processing #{csv_file_name}"
reader.read_in_csv_data(csv_file_name)
end
puts "Total value = #{reader.total_value_in_stock}"
test_code.rb:
require './csv_reader'
require './book_in_stock'
require './stock_stats'
reader = CsvReader.new
reader.read_in_csv_data("file1.csv")
reader.read_in_csv_data("file2.csv")
puts "Total value in stock = #{reader.total_value_in_stock}"
# code to call
book = BookInStock.new("isbn1", 33.80)
puts "Price = #{book.price}"
puts "Price in cents = #{book.price_in_cents}"
book.price_in_cents = 1234
puts "Price = #{book.price}"
puts "Price in cents = #{book.price_in_cents}"
book_in_stock.rb:
class BookInStock
  attr_reader :isbn
  attr_accessor :price
  def initialize(isbn, price)
    @isbn = isbn
    @price = Float(price)
  end
  def price_in_cents
    Integer(price*100 + 0.5)
  end
  def price_in_cents=(cents)
    @price = cents / 100.0
  end
end
file1.csv:
ISBN,Amount
isbn1,49.00
isbn2,24.54
isbn3,33.23
isbn4,15.55
file2.csv:
ISBN,Amount
isbn5-file2,39.98
isbn6-file2,14.84
isbn7-file2,43.63
isbn8-file2,25.55
Now run the program:
~/ruby_programs$ ruby test_code.rb file1.csv file2.csv
Processing file1.csv
Processing file2.csv
Total value = 246.32
Total value in stock = 246.32
Price = 33.8
Price in cents = 3380
Price = 12.34
Price in cents = 1234
require searches for files in Ruby's load path (this is stored in the global variables $: or $LOAD_PATH)
The current directory is not in the load path by default (it used to be in ruby 1.8 and earlier) which is why ruby says that it can't find csv-reader
You can add to the load path either by manipulating the $: variable (it behaves just like an array) or with the -I option.
For example if you launch irb by doing
irb -I.
Then your code should run without modification (assuming there are no other problems with it)
Lastly you could switch your require statements to use require_relative - this locates files relative to the current file
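Here is a small, self-contained sketch of the "manipulate $LOAD_PATH" option. It writes a throwaway file into a temporary directory so it can run anywhere; the file name load_path_demo.rb and the module LoadPathDemo are invented for the demonstration. In your project you would add the directory that holds csv-reader.rb instead.

```ruby
require 'tmpdir'

Dir.mktmpdir do |dir|
  # Create a throwaway library file to require.
  File.write(File.join(dir, 'load_path_demo.rb'),
             "module LoadPathDemo; LOADED = true; end\n")

  # $LOAD_PATH (alias $:) is an ordinary array of directories.
  $LOAD_PATH.unshift(dir)
  require 'load_path_demo'   # found now, because dir is on the load path
  $LOAD_PATH.delete(dir)     # tidy up; the temp directory is about to vanish
end

puts LoadPathDemo::LOADED
```

For files that live next to the requiring file, require_relative 'csv-reader' is usually less fuss than editing the load path.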
I am using Ruby 1.8.7 (2010-12-23 patchlevel 330) [x86_64-linux].
When trying to create a temp file using the tempfile library, it gets stuck, and I can't exit from it:
1.8.7 :003 > require 'rubygems'
1.8.7 :004 > require 'tempfile'
1.8.7 :005 > tmp_file = Tempfile.new("new_file")
After the last line, it's stuck and not responsive.
I think it works with newer Ruby versions, but does anyone know what can cause this issue?
It's not going to be easy to find the issue without some deeper debugging. Try running:
class Tempfile
  def initialize(basename, tmpdir=Dir::tmpdir)
    if $SAFE > 0 and tmpdir.tainted?
      tmpdir = '/tmp'
    end
    lock = nil
    n = failure = 0
    begin
      puts 'Entering critical stage'
      Thread.critical = true
      begin
        tmpname = File.join(tmpdir, make_tmpname(basename, n))
        lock = tmpname + '.lock'
        n += 1
        puts n
      end while @@cleanlist.include?(tmpname) or
        File.exist?(lock) or File.exist?(tmpname)
      Dir.mkdir(lock)
    #rescue
    #  failure += 1
    #  retry if failure < MAX_TRY
    #  raise "cannot generate tempfile `%s'" % tmpname
    ensure
      Thread.critical = false
      puts 'critical stage left'
    end
    @data = [tmpname]
    @clean_proc = Tempfile.callback(@data)
    ObjectSpace.define_finalizer(self, @clean_proc)
    @tmpfile = File.open(tmpname, File::RDWR|File::CREAT|File::EXCL, 0600)
    @tmpname = tmpname
    @@cleanlist << @tmpname
    @data[1] = @tmpfile
    @data[2] = @@cleanlist
    super(@tmpfile)
    Dir.rmdir(lock)
  end
end
This should be identical to your Tempfile initialize method, with some debugging output added and the rescue/retry commented out so that any error surfaces. Could you run this and then retry creating a new tempfile? I'll update the answer when you comment with the console output.
I am learning to use Ruby threads so that the code stays portable across OS platforms.
The problem is that the console is frozen until non_join1 completes, which also prevents non_join2 from being started. non_join1 waits at the join calls until its threads complete.
The software requires multiple routines running independently. The primary program is a standalone that runs in realtime collecting data. The data collected is written to files. Different programs, using Threads, process the data in parallel. The start/stop and status is controlled from a main console.
What is the best ruby method to launch the separate programs needed to analyze the data files and get status back from the threads?
thanks,
pb
# This is the console that starts up the multiple threads.
#!/usr/bin/ruby
loop do
puts " input a command"
command = gets.chop!
control = case command
when "1" : "1"
when "2" : "2"
end
if control == "1" then
puts `date`+ "routine 1"
puts `./non_join1.rb`
puts `date`
end
if control == "2" then
puts `date` + "routine 2"
`./non_join2.rb`
end
end
#!/usr/bin/ruby
# Example of worker program 1 used to process data files
#file non_join1.rb
x = Thread.new { sleep 0.1; print "xxxxxxxxx"; print "yyyyyyyyyyy"; print "zzzzzzzzzz" }
a = Thread.new { print "aaaaaaaaa"; print "bbbbbbbbbb"; sleep 0.1; print "cccccccc" }
puts " "
(1..10).each {|i| puts i.to_s+" done #{i}"}
x.join
a.join
sleep(30)
#!/usr/bin/ruby
# Example of worker program 2 used to process data files
#file non_join2.rb
x = Thread.new { sleep 0.1; print "xxxxxxxxx"; print "yyyyyyyyyyy"; print "zzzzzzzzzz" }
a = Thread.new { print "aaaaaaaaa"; print "bbbbbbbbbb"; sleep 0.1; print "cccccccc" }
x.join
a.join
$ ./call_ruby.rb
input a command
1
Sat Feb 18 10:36:43 PST 2012
routine 1
aaaaaaaaabbbbbbbbbb
1 done 1
2 done 2
3 done 3
4 done 4
5 done 5
6 done 6
7 done 7
8 done 8
9 done 9
10 done 10
xxxxxxxxxyyyyyyyyyyyzzzzzzzzzzcccccccc
Sat Feb 18 10:37:13 PST 2012
input a command
Instead of executing with backticks, which block until the child process exits so that they can capture its output, try forking (creating a new process), for example with this function:
class Execute
def self.run(command)
pid = fork do
Kernel.exec(command)
end
return pid
end
end
Your code will look like
loop do
puts " input a command"
command = gets.chop!
control = case command
when "1" : "1"
when "2" : "2"
end
if control == "1" then
puts `date`+ "routine 1"
Execute.run("./non_join1.rb")
puts `date`
end
if control == "2" then
puts `date` + "routine 2"
Execute.run("./non_join2.rb")
end
end
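If you are on Ruby 1.9 or later (an assumption, since the case syntax above is 1.8-style), Process.spawn plus Process.detach is another way to launch a worker without blocking, and it also gives you a handle for status checks. This is only a sketch: the launch helper is made up, and the child here is a throwaway ruby -e one-liner standing in for ./non_join1.rb.

```ruby
require 'rbconfig'

# Hypothetical helper: start a command without waiting for it.
# Returns the pid and a watcher thread that reaps the child when it exits.
def launch(*command)
  pid = Process.spawn(*command)
  watcher = Process.detach(pid)   # avoids leaving a zombie process behind
  [pid, watcher]
end

# Stand-in for a worker script such as ./non_join1.rb.
pid, watcher = launch(RbConfig.ruby, '-e', 'sleep 0.2')
puts "started worker #{pid}"
puts "still running? #{watcher.alive?}"   # cheap, non-blocking status check

status = watcher.value                    # blocks until the worker exits
puts "worker finished ok? #{status.success?}"
```

The console loop can keep one watcher per worker and poll alive? on each pass, which covers the "get status back" part of the question without freezing the prompt.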
In my local environment everything works fine. When I upload to my server, I keep getting an Internal Server Error. I commented out my code until I found the offending line, which is:
dateObj = dateObj.next_month #Problem Child
Here is the complete code:
def makeCal(dateObj)
cal = Hash.new
months = 0
while months < 12
# #pass dateobj to build array
array = buildArray(dateObj)
# #save array to hash with month key
monthName = Date::MONTHNAMES[dateObj.mon]
cal[monthName] = array
# #create new date object using month and set it to the first
date = dateObj.month.to_s + '/' + 1.to_s + '/' + dateObj.year.to_s
dateObj = Date.strptime(date, "%m/%d/%Y")
puts dateObj.kind_of? Date
dateObj = dateObj.next_month #Problem Child
months = months + 1
end
cal
end
And ruby -v locally:
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin10.8.0]
and ruby -v remotely:
ruby 1.9.2p290 (2011-07-09 revision 32553) [i686-linux]
Any ideas on how to solve this?
UPDATE:
173.26.190.206 - - [03/Sep/2011 10:40:17] "POST /calendar " 500 30 0.0020
That's from nginx
and this is the stack trace:
NoMethodError - undefined method `next_month' for #<Date: 4911549/2,0,2299161>:
./main.rb:82:in `makeCal'
./main.rb:120:in `POST /calendar'
I inserted the line: puts dateObj.kind_of? Date
and I get all true. So my dateObj is of kind Date
It seems that you lack
require 'active_support'
BTW, if all you need from it is next_month, you can use
date_obj >>= 1
as Date#>> is part of the Date class in the standard library.
Edit:
For getting the first of the month, you can use:
Date.new(date_obj.year, date_obj.month)
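Putting the two suggestions together, here is a short sketch with an arbitrary example date. It relies only on the standard date library: Date.new defaults the day to 1, and >> moves forward by whole months, clamping to the last day when the target month is shorter.

```ruby
require 'date'

date_obj = Date.new(2011, 9, 3)   # arbitrary example date

# Day defaults to 1, so this is the first of the same month.
first_of_month = Date.new(date_obj.year, date_obj.month)
puts first_of_month               # 2011-09-01

# >> n advances by n months without needing next_month.
puts first_of_month >> 1          # 2011-10-01

# When the target month is shorter, the day is clamped to its last day.
puts Date.new(2011, 1, 31) >> 1   # 2011-02-28
```

This replaces the string building and Date.strptime round trip inside the makeCal loop.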
I am trying to learn ruby(1.8.7). I have programmed in php for some time and I consider myself proficient in that language. I have a book which I reference, and I have looked at many introductory ruby tutorials, but I cannot figure this out.
myFiles = ['/Users/', '/bin/bash', 'Phantom.file']
myFiles.each do |cFile|
puts "File Name: #{cFile}"
if File.exists?(cFile)
puts "#{cFile} is a file"
cFileStats = File.stat(cFile)
puts cFileStats.inspect
cFileMode = cFileStats.mode.to_s()
puts "cFileMode Class: " + cFileMode.class.to_s()
puts "length of string: " + cFileMode.length.to_s()
printf("Mode: %o\n", cFileMode)
puts "User: " + cFileMode[3,1]
puts "Group: " + cFileMode[4,1]
puts "World: " + cFileMode[5,1]
else
puts "Could not find file: #{cFile}"
end
puts
puts
end
produces the following output:
File Name: /Users/
/Users/ is a file
#<File::Stat dev=0xe000004, ino=48876, mode=040755, nlink=6, uid=0, gid=80, rdev=0x0, size=204, blksize=4096, blocks=0, atime=Sun Sep 04 12:20:09 -0400 2011, mtime=Thu Sep 01 21:29:08 -0400 2011, ctime=Thu Sep 01 21:29:08 -0400 2011>
cFileMode Class: String
length of string: 5
Mode: 40755
User: 7
Group: 7
World:
File Name: /bin/bash
/bin/bash is a file
#<File::Stat dev=0xe000004, ino=8672, mode=0100555, nlink=1, uid=0, gid=0, rdev=0x0, size=1371648, blksize=4096, blocks=1272, atime=Sun Sep 04 16:24:09 -0400 2011, mtime=Mon Jul 11 14:05:45 -0400 2011, ctime=Mon Jul 11 14:05:45 -0400 2011>
cFileMode Class: String
length of string: 5
Mode: 100555
User: 3
Group: 3
World:
File Name: Phantom.file
Could not find file: Phantom.file
Why is the string length different than expected (it should be 5 for /Users/ and 6 for /bin/bash)? Why are the substrings not pulling the correct characters? I understand World not being populated when referencing a 5 character string, but the offsets seem off, and in the case of /bin/bash, 3 does not even appear in the string.
Thanks
Scott
This is a nice one.
When a number is preceded by a 0, it is written in octal. The mode itself is an integer, and to_s with no argument prints it in decimal. What you are actually getting for /bin/bash:
0100555 to decimal = 33133
"33133".length = 5
And for /Users:
040755 to decimal = 16877
"16877".length = 5
Add the following line:
puts cFileMode
And you will see the error.
to_s takes an argument which is the base. If you call to_s(8) then it should work.
cFileMode = cFileStats.mode.to_s(8)
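A quick way to convince yourself, using the /bin/bash mode from your output as a literal so it runs without touching the filesystem. The & 0777 mask at the end is an extra suggestion, not part of the original code: it keeps only the permission bits, so the result is the same width for files and directories.

```ruby
mode = 0100555              # same value File.stat('/bin/bash').mode gave you

puts mode.to_s              # "33133"  decimal digits, which is where the 3s came from
puts mode.to_s(8)           # "100555" octal digits, what you expected

octal = mode.to_s(8)
puts octal[-3, 1]           # user  => "5"
puts octal[-2, 1]           # group => "5"
puts octal[-1, 1]           # world => "5"

# Optional: mask off the file-type bits and format the rest as octal.
puts format("%03o", mode & 0777)   # "555"
```

Indexing from the end (negative offsets) works for both the 5-digit directory mode and the 6-digit file mode.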
EDIT
files = ['/home/', '/bin/bash', 'filetest.rb']
files.each do |file|
  puts "File Name: #{file}"
  if File.exist?(file)
    puts "#{file} is a file"
    file_stats = File.stat(file)
    puts file_stats.inspect
    # to_s(8) renders the integer mode as octal digits
    file_mode = file_stats.mode.to_s(8)
    puts "file_mode Class: #{file_mode.class}"
    p file_mode
    puts "length of string: #{file_mode.length}"
    puts "Mode: #{file_mode}"
    # index from the end so 5- and 6-digit modes both work
    puts "User: #{file_mode[-3,1]}"
    puts "Group: #{file_mode[-2,1]}"
    puts "World: #{file_mode[-1,1]}"
  else
    puts "Could not find file: #{file}"
  end
  puts
  puts
end
Instead of this:
puts "length of string: " + cFileMode.length.to_s()
What you want is this:
puts "length of string: #{cFileMode.length}"
Please do not use camel case for variables in Ruby. Camel case is used for class and module names only; method and variable names should be written in snake_case, with underscores separating the words.
It's also good practice to omit empty parentheses on method calls that take no arguments, so write to_s rather than to_s().