Ruby script can't find CSV file - ruby

I have a simple crawler written in Ruby that should crawl specific sites and save data into a CSV file, but when I run it from the Windows command line I get this error:
C:/Ruby24-x64/lib/ruby/2.4.0/csv.rb:1282:in `initialize': No such file or directory @ rb_sysopen - csv/boxers.csv (Errno::ENOENT)
from C:/Ruby24-x64/lib/ruby/2.4.0/csv.rb:1282:in `open'
from C:/Ruby24-x64/lib/ruby/2.4.0/csv.rb:1282:in `open'
from boxers.rb:18:in `<main>'
This is the script:
#!/usr/bin/env ruby
require 'csv'
require 'mechanize'
agent = Mechanize.new{ |agent| agent.history.max_size=0 }
agent.user_agent = 'Mozilla/5.0'
base = "http://baseurl.com/"
division = ARGV[0]
search_url = "http://baseurl.com/ratings.php?sex=M&division=#{division}&pageID="
path='//*[@id="mainContent"]/table/tr[position()>2]'
boxers = CSV.open("csv/file.csv","w")
url = search_url+"1"
begin
  page = agent.get(url)
rescue
  print " -> error, retrying\n"
  retry
end
boxers.close
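No answer was recorded for this question, so what follows is only a sketch under one assumption: that the csv directory does not exist relative to the directory the script is run from. Opening a file in "w" mode creates the file but not its parent directories, and a relative path is resolved against the current working directory, which would produce this Errno::ENOENT. The helper name with_csv is hypothetical.

```ruby
require 'csv'
require 'fileutils'

# Hypothetical helper: create the parent directory if needed, then open the
# CSV for writing and yield it to the caller's block.
def with_csv(path)
  full_path = File.expand_path(path)          # resolved against Dir.pwd
  FileUtils.mkdir_p(File.dirname(full_path))  # no-op if the directory exists
  CSV.open(full_path, 'w') { |csv| yield csv }
  full_path
end
```

If the file should always live next to the script regardless of where it is run from, File.expand_path(path, __dir__) is one way to anchor it.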

Related

Getting an error while opening a file: I create a book_name > chapter_name folder structure in code and then create a file

In the child folder (chapter_number), I am creating a file:
#!/usr/bin/env ruby
require 'roo'
Dir.glob("**/*.xlsx") do |file|
  xlsx = Roo::Spreadsheet.open(file)
  bookname = xlsx.column(1)
  cahpter_number_array = xlsx.column(2).uniq
  cahpter_number_array.each do |chapter|
    book_name = bookname[1] if bookname
    chapter_number = chapter if (cahpter_number_array && (chapter != "Chapter"))
    Dir.mkdir(book_name) unless File.exists?(book_name)
    Dir.mkdir("#{book_name}/#{chapter_number}") unless File.exists?("#{book_name}/#{chapter_number}")
    xlsx.column(3).each do |md|
      output_name = "#{book_name}/#{chapter_number}/#{File.basename(md.partition('-').first, '.*')}.md" if (md != "Verse")
      output = File.open("#{output_name}", 'w')
      output << "hello"
    end
  end
end
Error:
`initialize': Is a directory @ rb_sysopen - . (Errno::EISDIR)
Below link is my source file:
source file link
Come on now, that isn't really your code. You can't call partition on a number:
file_name = [1,2,3,4,5,6,7]
file_name.each do |md|
... md.partition('-')
so you would have gotten an error for that before getting the error you posted.
In any case, the error message is saying that output_name is set to ".", and when Ruby tries to execute File.open(".", 'w') it finds that "." is the name of a directory on your system, and you can't open a directory for writing. You can see the same error by doing this:
~/ruby_programs$ mkdir my_dir
~/ruby_programs$ irb
2.3.0 :001 > File.open('my_dir', 'w')
Errno::EISDIR: Is a directory @ rb_sysopen - my_dir
from (irb):1:in `initialize'
from (irb):1:in `open'
from (irb):1
from /Users/7stud/.rvm/rubies/ruby-2.3.0/bin/irb:11:in `<main>'
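A minimal sketch of guarding against this, assuming the goal is to reject values that do not name a writable file before calling File.open. safe_write is a hypothetical helper, not part of the original script.

```ruby
# Hypothetical helper: refuse to write when the path is missing or names a
# directory, instead of letting File.open raise Errno::EISDIR.
def safe_write(path, content)
  raise ArgumentError, 'no output path given' if path.nil? || path.empty?
  raise ArgumentError, "#{path} is a directory" if File.directory?(path)

  # The block form closes the file even if the write raises.
  File.open(path, 'w') { |f| f << content }
  path
end
```

In the loop from the question, this would surface the bad value (for example the header row) with a clearer message than the EISDIR error.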

ChunkyPNG: Is it possible to read an image directly from a URL?

I tried (with some success)
require 'open-uri'
require 'chunky_png'
image_url = "http://res.cloudinary.com/houlihan-lokey/image/upload/c_limit,h_75,w_120/ixl7z4c1czlvrqnbt0mm.png"
# image_url = "http://res.cloudinary.com/houlihan-lokey/image/upload/c_limit,h_75,w_120/zqw2pgczdzbtyj3aib2o.png" # this works
image_file = open(image_url)
image = ChunkyPNG::Image.from_file(image_file)
puts image.width
Some images work, others don't. The error:
TypeError: no implicit conversion of StringIO into String
from /Users/theuser/.rvm/gems/ruby-2.0.0-p247/gems/chunky_png-1.3.3/lib/chunky_png/datastream.rb:66:in `initialize'
from /Users/theuser/.rvm/gems/ruby-2.0.0-p247/gems/chunky_png-1.3.3/lib/chunky_png/datastream.rb:66:in `open'
from /Users/theuser/.rvm/gems/ruby-2.0.0-p247/gems/chunky_png-1.3.3/lib/chunky_png/datastream.rb:66:in `from_file'
from /Users/theuser/.rvm/gems/ruby-2.0.0-p247/gems/chunky_png-1.3.3/lib/chunky_png/canvas/png_decoding.rb:53:in `from_file'
from (irb):5
from /Users/theuser/.rvm/rubies/ruby-2.0.0-p247/bin/irb:16:in `<main>'
I will be running this on Heroku and am wondering -- is there a reliable way to achieve this without creating temporary files?
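The issue was with images that were too small for open-uri to buffer in a temp file. For small responses open returns a StringIO rather than a Tempfile, and from_file needs something it can treat as a file path.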
The solution is to not rely on temp files but to read the image into memory and use ChunkyPNG's Image.from_blob:
require 'open-uri'
require 'chunky_png'
image_url = "http://res.cloudinary.com/houlihan-lokey/image/upload/c_limit,h_75,w_120/ixl7z4c1czlvrqnbt0mm.png"
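# Kernel#open stopped accepting URLs in Ruby 3.0; on Ruby 2.5+ use URI.open(image_url).read instead.
image_file = open(image_url).read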
image = ChunkyPNG::Image.from_blob(image_file)
puts image.width
This may not work with large images, but is OK for my application.
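The behaviour can be reproduced without a network connection. In the standard library implementation, open-uri buffers small responses (up to about 10 KB) in a StringIO and larger ones in a Tempfile. Both respond to read, which is why reading into memory works for either. read_blob is a hypothetical helper for illustration, and the byte strings below are stand-ins, not real PNG data.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: return the raw bytes from whichever object open-uri
# handed back (StringIO or Tempfile).
def read_blob(io)
  io.rewind
  io.binmode
  io.read
end

# Stand-in for a small download (open-uri would return a StringIO).
small = StringIO.new("\x89PNG small".b)

# Stand-in for a large download (open-uri would return a Tempfile).
large = Tempfile.new('image')
large.binmode
large.write("\x89PNG large".b)

small_bytes = read_blob(small)
large_bytes = read_blob(large)

large.close
large.unlink
```

Either result could then be passed to ChunkyPNG::Image.from_blob as in the answer above.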

ruby - zlib header error when uncompressing tar.gz

I wanted to write a Ruby program to unpack tar.gz files, and I ran into some issues. I read the documentation on the ruby-doc site for Zlib::GzipReader and Zlib::Inflate. Then I found a module someone wrote on GitHub, and that didn't work either. Using the examples from the Ruby documentation page for Zlib::GzipReader, I was able to run these successfully:
irb(main):027:0* File.open('zlib_inflate.tar.gz') do |f|
irb(main):028:1* gz = Zlib::GzipReader.new(f)
irb(main):029:1> print gz.read
irb(main):030:1> gz.close
irb(main):031:1> end
#
#
#
irb(main):023:0* Zlib::GzipReader.open('zlib_inflate.tar.gz') { |gz|
irb(main):024:1* print gz.read
irb(main):025:1> }
Then, when trying to use the Zlib::Inflate options, I kept running into incorrect header check errors.
irb(main):047:0* zstream = Zlib::Inflate.new
=> #<Zlib::Inflate:0x00000002b15790 @dictionaries={}>
irb(main):048:0> buf = zstream.inflate('zlib_inflate.tar.gz')
Zlib::DataError: incorrect header check
from (irb):48:in `inflate'
from (irb):48
from C:/ruby/Ruby200p353/Ruby200-x64/bin/irb:12:in `<main>'
#
#
#
irb(main):049:0> cf = File.open("zlib_inflate.tar.gz")
=> #<File:zlib_inflate.tar.gz>
irb(main):050:0> ucf = File.open("ruby_inflate.rb", "w+")
=> #<File:ruby_inflate.rb>
irb(main):051:0> zi = Zlib::Inflate.new(Zlib::MAX_WBITS)
=> #<Zlib::Inflate:0x00000002c097f0 @dictionaries={}>
irb(main):052:0> ucf << zi.inflate(cf.read)
Zlib::DataError: incorrect header check
from (irb):52:in `inflate'
from (irb):52
from C:/ruby/Ruby200p353/Ruby200-x64/bin/irb:12:in `<main>'
#
#
#
irb(main):057:0* File.open("zlib_inflate.tar.gz") {|cf|
irb(main):058:1* zi = Zlib::Inflate.new
irb(main):059:1> File.open("zlib_inflate.rb", "w+") {|ucf|
irb(main):060:2* ucf << zi.inflate(cf.read)
irb(main):061:2> }
irb(main):062:1> zi.close
irb(main):063:1> }
Zlib::DataError: incorrect header check
from (irb):60:in `inflate'
from (irb):60:in `block (2 levels) in irb_binding'
from (irb):59:in `open'
from (irb):59:in `block in irb_binding'
from (irb):57:in `open'
from (irb):57
from C:/ruby/Ruby200p353/Ruby200-x64/bin/irb:12:in `<main>'
How would I go about taking a tar.gz file and extracting its contents so that the files are not null? I did read another post about Ruby Zlib, but that did not work for me either. What am I doing wrong?
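No answer was recorded for this question, so this is a hedged sketch rather than an accepted fix. Two things stand out in the attempts above. The first call passes the file name string itself to inflate rather than the file's bytes. And Zlib::Inflate.new with the default window bits expects a zlib header, while a .gz file starts with a gzip header. Adding 32 to the window bits asks zlib to auto-detect either header. Inflating only yields the tar archive, so the sketch also walks the entries with Gem::Package::TarReader, which ships with RubyGems. The helper names build_tar_gz and extract_tar_gz are hypothetical, and the archive is built in memory so the example needs no files on disk.

```ruby
require 'zlib'
require 'stringio'
require 'rubygems/package'

# Hypothetical helper: build a small tar.gz in memory from { name => content }.
def build_tar_gz(files)
  tar_io = StringIO.new(String.new) # binary, mutable buffer
  Gem::Package::TarWriter.new(tar_io) do |tar|
    files.each do |name, content|
      tar.add_file_simple(name, 0o644, content.bytesize) { |io| io.write(content) }
    end
  end
  Zlib.gzip(tar_io.string)
end

# Hypothetical helper: return { entry name => contents } for the regular
# files inside gzip-compressed tar data.
def extract_tar_gz(data)
  contents = {}
  Zlib::GzipReader.wrap(StringIO.new(data)) do |gz|
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each { |entry| contents[entry.full_name] = entry.read if entry.file? }
    end
  end
  contents
end

data = build_tar_gz('hello.txt' => "hello\n")

# Option 1: GzipReader plus TarReader gives the individual files.
files = extract_tar_gz(data)

# Option 2: Zlib::Inflate also works once it is told to auto-detect the
# gzip header, but the result is still a raw tar archive, not the files.
zstream = Zlib::Inflate.new(Zlib::MAX_WBITS + 32)
tar_bytes = zstream.inflate(data)
zstream.close
```

For a file on disk, the same reader chain could be started with Zlib::GzipReader.open('zlib_inflate.tar.gz') instead of wrapping a StringIO.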

Ruby Spreadsheet: Bad file descriptor - test.xls (Errno::EBADF)

I have a problem with a script that creates a simple .xls file and writes data to one cell. Here is the code:
require 'spreadsheet'
class Filter
  def filter
    @excel = Spreadsheet::Workbook.new
    @sheet = @excel.create_worksheet
    @sheet[0, 0] = "test"
    @excel.write 'test.xls'
  end
end
f = Filter.new
f.filter
But it raises an error:
C:/Ruby193/lib/ruby/gems/1.9.1/gems/ruby-ole-1.2.11.5/lib/ole/storage/base.rb:62:in `write_nonblock': Bad file descriptor - test.xls (Errno::EBADF)
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/ruby-ole-1.2.11.5/lib/ole/storage/base.rb:62:in `initialize'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/ruby-ole-1.2.11.5/lib/ole/storage/base.rb:78:in `new'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/ruby-ole-1.2.11.5/lib/ole/storage/base.rb:78:in `open'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/spreadsheet-0.7.4/lib/spreadsheet/excel/writer/workbook.rb:453:in `write_from_scratch'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/spreadsheet-0.7.4/lib/spreadsheet/excel/writer/workbook.rb:631:in `write_workbook'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/spreadsheet-0.7.4/lib/spreadsheet/writer.rb:15:in `block in write'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/spreadsheet-0.7.4/lib/spreadsheet/writer.rb:14:in `open'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/spreadsheet-0.7.4/lib/spreadsheet/writer.rb:14:in `write'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/spreadsheet-0.7.4/lib/spreadsheet/workbook.rb:116:in `write'
from filter.rb:10:in `filter'
from filter.rb:15:in `<main>'
This happens because ruby-ole 1.2.11.5 doesn't support the Windows platform (more detail: ruby-ole issue).
You can pin ruby-ole 1.2.11.4 to avoid this problem:
require 'rubygems'
gem 'ruby-ole','1.2.11.4'
require 'spreadsheet'
I've seen these before. First verify that you can write to that file's location.
My guess is either the file is already open in Excel or your antivirus is blocking the 'threat'.
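One way to check the first point is to probe the target path before handing it to the library. This is a sketch only: writable_target? is a hypothetical helper, and it assumes that a file locked by another program or blocked by permissions surfaces as a SystemCallError when opened for appending.

```ruby
# Hypothetical helper: true if `path` can be opened for writing right now.
def writable_target?(path)
  existed = File.exist?(path)
  File.open(path, 'ab') {}          # append mode, so an existing file is not truncated
  File.delete(path) unless existed  # do not leave an empty probe file behind
  true
rescue SystemCallError
  false
end
```

Calling writable_target?('test.xls') before @excel.write would at least separate a location problem from a library problem.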

Ruby Net::FTP gettextfile not able to save files locally

I am trying to retrieve files (.csv) from an ftp site and save them all locally in the same folder. My code looks like this:
#! /usr/bin/ruby
require 'logger'
require 'fileutils'
require 'net/ftp'
require 'rubygems'
require 'mysql2'
require 'roo'
require 'date'
# logging setup
log = Logger.new("/path_to_logs/ftp_log.log", 10, 1024000)
log.level = Logger::INFO
export_ftp_path = '/Receive/results/'
export_work_path ='/Users/pierce/results_exports/'
Net::FTP.open('host', 'username', 'password') do |ftp|
  log.info("Logged into FTP")
  ftp.passive = true
  ftp.chdir("#{export_ftp_path}")
  ftp.list.each do |file|
    log.info("Found file #{file}")
    new_file = file[56..115] # take part of the file name and remove spaces and periods
    new_file = new_file.gsub(/[.]+/, "")
    new_file = new_file.gsub(/\s/, "0")
    ftp.gettextfile(file,"#{new_file}")
    log.info("Downloaded file #{new_file}")
  end
end
And here is the error I receive:
/Users/pierce/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/net/ftp.rb:560:in `initialize': No such file or directory - (Errno::ENOENT)
from /Users/pierce/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/net/ftp.rb:560:in `open'
from /Users/pierce/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/net/ftp.rb:560:in `gettextfile'
from ftp_test.rb:44:in `block (2 levels) in <main>'
from ftp_test.rb:33:in `each'
from ftp_test.rb:33:in `block in <main>'
from /Users/pierce/.rbenv/versions/1.9.2-p290/lib/ruby/1.9.1/net/ftp.rb:116:in `open'
As suggested, here are the values I have for puts file and puts new_file.
file = -rwxr-xr-x 1 1130419 114727 9546 May 17 08:11 results_Wed. 16 May 2012.csv
new_file = results_Wed0230May02012csv
Any suggestions on what to change in gettextfile or within my script to get the files saved correctly?
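You should use nlst instead of list when you just need the names of the files in a directory. list returns full ls -l style lines (permissions, owner, size, date, name), as your puts file output shows, so its output has to be parsed before it can be used as a filename.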
When you request the file it has to be the original filename, including all spaces. When you save the file it can be anything you want (including spaces or not). The error was because you were requesting the wrong file. Use nlst in your case instead. It will make it much easier (no conversion or parsing needed).
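A sketch of that approach, assuming the ftp argument behaves like a Net::FTP session (only nlst and gettextfile are called on it). download_csvs is a hypothetical helper, and the choice to replace spaces with underscores in the local name is just an illustration.

```ruby
require 'fileutils'

# Hypothetical helper: download every .csv in the current remote directory
# into local_dir and return the local paths that were written.
def download_csvs(ftp, local_dir)
  FileUtils.mkdir_p(local_dir)
  ftp.nlst.select { |name| name.end_with?('.csv') }.map do |remote_name|
    # The local name can be anything; the remote name must stay untouched.
    local_name = File.basename(remote_name).gsub(/\s+/, '_')
    local_path = File.join(local_dir, local_name)
    ftp.gettextfile(remote_name, local_path)
    local_path
  end
end
```

With a real session this would be called inside the Net::FTP.open block after ftp.chdir, for example download_csvs(ftp, export_work_path).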
