Can't open a file in docx gem when I call it through a string - ruby

I am using the docx gem to read a docx file. The code works when I write it like this:
require 'docx'
doc = Docx::Document.open('example.docx')
puts doc
It prints the doc perfectly. However, I need to get the path from the user through a gets. I need to do this:
require 'docx'
puts "Provide the path of the document:"
document_path = gets.chomp.tr(" ", "") #it makes sure that any accidental whitespace is removed.
doc = Docx::Document.open(document_path)
puts doc
With this code I expect to get the same result as with the former one. The only difference is that I pass the path of the document as a string variable rather than as a literal. Instead, I get this error:
/var/lib/gems/2.3.0/gems/rubyzip-1.1.7/lib/zip/file.rb:82:in `initialize': File '/root/Documents/Projects/Wordsworth/example.docx' (Zip::Error)
not found
from /var/lib/gems/2.3.0/gems/rubyzip-1.1.7/lib/zip/file.rb:96:in `new'
from /var/lib/gems/2.3.0/gems/rubyzip-1.1.7/lib/zip/file.rb:96:in `open'
from /var/lib/gems/2.3.0/gems/docx-0.2.07/lib/docx/document.rb:25:in `initialize'
from /var/lib/gems/2.3.0/gems/docx-0.2.07/lib/docx/document.rb:50:in `new'
from /var/lib/gems/2.3.0/gems/docx-0.2.07/lib/docx/document.rb:50:in `open'
from test.rb:17:in `<main>'
I visited the docx gem's GitHub page, but in the examples it gives, the document path is always written as a literal by the coder, never passed in a variable. I hope I can get some help. Thanks a lot!
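The traceback only says that no file exists at the path that was passed in, so the first thing worth checking is what the string actually contains after cleanup. Note that tr(" ", "") deletes every space, including spaces inside directory names. The sketch below is one hedged way to normalise the input before handing it to Docx::Document.open. resolve_document_path is a hypothetical helper, not part of the docx gem, and the sample input and base directory are made up for illustration.

```ruby
# Hypothetical helper: normalise a path typed by the user.
def resolve_document_path(raw_input, base_dir = Dir.pwd)
  # strip removes only leading/trailing whitespace (including the newline
  # left by gets), so spaces inside directory names survive.
  cleaned = raw_input.strip
  # Drop one pair of surrounding quotes, which often sneak in when a path
  # is pasted or dragged into a terminal.
  cleaned = cleaned.gsub(/\A["']|["']\z/, "")
  # Resolve relative paths and "~" against base_dir.
  File.expand_path(cleaned, base_dir)
end

# Example input standing in for gets (assumption: the user typed extra spaces).
path = resolve_document_path("  example.docx \n", "/tmp")
puts path.inspect
puts File.exist?(path) ? "found" : "not found: #{path}"
```

If File.exist?(path) is false, the inspect output shows exactly which string is being looked up, which usually reveals a stray character or a wrong directory.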

Related

How can I copy files and rename them according to their origin directories using Ruby?

I have many directories with generically named txt files inside. I want to make copies of the txt files, rename them according to the containing directory of each, then move them to the parent directory (that being the directory that holds the directories that hold the original txt files, designated "txts" in the script below). I want to retain the original txt files with their original names in their original directories as well so that nothing within the original directories changes.
I have an old script that I think achieved (some of) my goals once, perhaps moving instead of copying the original txt files, but I'm unable to run it successfully now:
require 'find'
require 'fileutils'
Find.find("txts") do |path|
  if FileTest.directory?(path)
    next
  end
  ret = path.scan(/.*txts\/([^\/]+)\/.*/)
  name = ret[0].to_s + ".txt"
  FileUtils.mv(path, name)
end
Years ago a friend wrote this and ran it from within a unix environment with success. When I run it now, an enormous number of errors are returned. I'm using Ruby 2.2.2 and it's entirely possible there's a placeholder somewhere that I'm too newbish to recognize, or perhaps something changed from the older version of FileUtils... I truly have no idea and am afraid I've been unable to turn up any answers with my neophyte skills.
And so I appeal to you...
Edit: Here's the error message:
C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:1328:in `stat': Invalid argument # rb_file
_s_stat - ["may2013"].txt (Errno::EINVAL)
from C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:1328:in `lstat'
from C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:1247:in `exist?'
from C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:519:in `block in mv'
from C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:1570:in `block in fu_each_src
_dest'
from C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:1586:in `fu_each_src_dest0'
from C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:1568:in `fu_each_src_dest'
from C:/Ruby22/lib/ruby/2.2.0/fileutils.rb:516:in `mv'
from extracttxt.rb:12:in `block in <main>'
from C:/Ruby22/lib/ruby/2.2.0/find.rb:48:in `block (2 levels) in find'
from C:/Ruby22/lib/ruby/2.2.0/find.rb:47:in `catch'
from C:/Ruby22/lib/ruby/2.2.0/find.rb:47:in `block in find'
from C:/Ruby22/lib/ruby/2.2.0/find.rb:42:in `each'
from C:/Ruby22/lib/ruby/2.2.0/find.rb:42:in `find'
from extracttxt.rb:6:in `<main>'
The error message shows that ret[0] is the array ["may2013"], so ret[0].to_s + ".txt" evaluates to the string ["may2013"].txt. String#scan returns an array of arrays whenever the pattern contains capture groups, so ret[0] is an array rather than a string. What most likely changed since the script was written is Array#to_s: in Ruby 1.8 it joined the elements (giving may2013), whereas since Ruby 1.9 it returns the same text as inspect (giving ["may2013"]).
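The behavior described above can be reproduced in isolation. This is a minimal sketch with a made-up path (txts/may2013/notes.txt), run on Ruby 1.9 or later:

```ruby
# Made-up path in the shape Find.find would yield for the question's layout.
path = "txts/may2013/notes.txt"

# With a capture group, scan returns one inner array per match.
ret = path.scan(/.*txts\/([^\/]+)\/.*/)
p ret      # => [["may2013"]]
p ret[0]   # => ["may2013"]

# Array#to_s gives the inspect form, which is what ended up in the filename.
broken = ret[0].to_s + ".txt"
puts broken   # prints ["may2013"].txt

# Taking the captured string itself gives the intended name.
fixed = ret[0][0] + ".txt"
puts fixed    # prints may2013.txt
```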
Something like this ought to solve the problem:
require 'find'
require 'fileutils'
Find.find("txts") do |path|
  if FileTest.directory?(path)
    next
  end
  if path =~ %r{txts/([^/]+)/}
    FileUtils.cp(path, "#{$1}.txt")
  end
end
If you want to match by file extension you could either add it to the Regexp above (e.g. %r{txts/([^/]+)/.+\.txt$}) or you could use Dir[] (a.k.a. Dir.glob) e.g.:
require 'fileutils'
# Dir is part of Ruby's core, so it needs no require.
Dir['txts/**/*.txt'].each do |path|
  next if FileTest.directory?(path)
  next unless path =~ %r{txts/([^/]+)/}
  FileUtils.cp(path, "#{$1}.txt")
end
I don't know if there will be any performance difference, but it might be worth trying.
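For completeness, here is a self-contained sketch that places the copies inside the parent txts directory, as the question asks, and leaves the originals untouched. copy_named_after_directory is a hypothetical helper, and the demo builds a throwaway directory tree rather than touching real data. It assumes one txt file per subdirectory; with several, later copies overwrite earlier ones.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: copy every <root>/<dir>/*.txt to <root>/<dir>.txt.
# Originals stay where they are.
def copy_named_after_directory(root)
  copied = []
  Dir.glob(File.join(root, "*", "*.txt")).sort.each do |path|
    next unless File.file?(path)
    # The containing directory's name becomes the new file name.
    dir_name = File.basename(File.dirname(path))
    target = File.join(root, "#{dir_name}.txt")
    FileUtils.cp(path, target)
    copied << target
  end
  copied
end

# Demo on a temporary tree: txts/may2013/notes.txt
Dir.mktmpdir do |tmp|
  root = File.join(tmp, "txts")
  FileUtils.mkdir_p(File.join(root, "may2013"))
  File.write(File.join(root, "may2013", "notes.txt"), "hello")
  p copy_named_after_directory(root).map { |t| File.basename(t) }  # => ["may2013.txt"]
end
```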

Error trying to image scrape

I'm trying to make a ruby program which will automatically download the most recent Penny-Arcade. Here's the code I have:
require 'mechanize'
agent = Mechanize.new
date_string = Date.today.to_s
page = agent.get('http://www.penny-arcade.com/comic/')
puts page
art_link = page.at('div#comicFrame > a > img')['src']
File.open(date_string, 'wb') do |fo|
  fo.write open(art_link).read
end
And the output I get from running the program is:
$ ruby grab_PA.rb
#<Mechanize::Page:0x007f38bc743af0>
grab_PA.rb:12:in `initialize': No such file or directory # rb_sysopen - http://art.penny-arcade.com/photos/i-QpzhbpN/0/1050x10000/i-QpzhbpN-1050x10000.jpg (Errno::ENOENT)
from grab_PA.rb:12:in `open'
from grab_PA.rb:12:in `block in <main>'
from grab_PA.rb:11:in `open'
from grab_PA.rb:11:in `<main>'
But if I copy that exact link and put it into Firefox, it opens up the image. What's happening here? The program does write an image file to the program's directory with today's date, but the file is empty.
open takes a filename as its argument, not a URL. To fetch a URL you would normally have to do a lot more than simply open a file.
Luckily, Ruby provides a nice wrapper for Net::HTTP, called open-uri.
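Just add the following line at the top of your program and it should work (on Ruby 3.0 and later, also change open(art_link) to URI.open(art_link), because open-uri no longer makes Kernel#open accept URLs):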
require 'open-uri'
Alternatively, get the src of the image (something like art_link.attributes['src']), then call agent.get on that URL.
After that, agent.page contains only the image. Just save it with agent.page.save('image_path_and_name').
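The open-uri approach can be sketched as follows. Both helpers are hypothetical names introduced for illustration. URI.open is the entry point that open-uri provides on current Ruby versions. The download call is left commented out so the snippet does not need network access.

```ruby
require 'open-uri'
require 'date'
require 'uri'

# Hypothetical helper: build a local filename such as "2024-01-31.jpg",
# keeping the extension of the remote image.
def local_name_for(image_url, date = Date.today)
  ext = File.extname(URI.parse(image_url).path)
  "#{date}#{ext}"
end

# Hypothetical helper: fetch image_url over HTTP and write the bytes to dest.
def download_image(image_url, dest)
  URI.open(image_url) do |remote|
    File.open(dest, "wb") { |local| local.write(remote.read) }
  end
  dest
end

url = "http://art.penny-arcade.com/photos/i-QpzhbpN/0/1050x10000/i-QpzhbpN-1050x10000.jpg"
puts local_name_for(url, Date.new(2024, 1, 31))  # prints 2024-01-31.jpg
# download_image(url, local_name_for(url))       # uncomment to actually fetch
```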

Search and Replace within one file

I'm new to Ruby and I'm trying to use regular expressions to do multiple search-and-replace operations on an input text file. My code isn't working; I think I understand why, but I don't know the syntax I need to make it work.
Here's my code:
#!/usr/bin/ruby
# encoding: utf-8
# open file to read and write
file = File.open("input.txt", "r+")
# get the contents of the file
contents = file.read
file.close
reassign = contents.gsub(/\w+/, '£££££')
# save it out as a new file
new_file = File.new("output.txt")
new_file.write(reassign)
new_file.close
These are the error messages:
C:/Users/parsonsr/RubymineProjects/Test 3/test3.rb:14:in `write': not opened for writing (IOError)
from C:/Users/parsonsr/RubymineProjects/Test 3/test3.rb:14:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
I tried using an array to pass each line through and change what's relevant, but then only the contents of the array are saved to the output, not the rest of the file.
I need to either change the text in place within the one file, or change the text and save the result to an output file, whichever is easiest.
Hope this makes sense.
Thanks
Take a look at the documentation. Note that File.new (like File.open) uses mode "r" (read-only) by default, which is why the write fails.
So the answer is: change
File.new("output.txt")
to
File.new("output.txt", "w")
Another thing you can do in Ruby is:
File.write("output.txt", reassign)
Or:
File.open("output.txt", "w") { |f| f.write(reassign) }
Also, I don't know what the purpose is, but for a big file you might want to read a batch of lines and write them to the output file each time, rather than reading everything into memory at once.
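The line-by-line variant can be sketched like this. replace_in_file is a hypothetical helper, and the demo uses temporary files and a made-up replacement string instead of the asker's input.txt:

```ruby
require 'tmpdir'

# Hypothetical helper: stream input_path to output_path one line at a time,
# applying the substitution to each line. Only one line is held in memory.
def replace_in_file(input_path, output_path, pattern, replacement)
  File.open(output_path, "w") do |out|          # "w" opens the file for writing
    File.foreach(input_path) do |line|
      out.write(line.gsub(pattern, replacement))
    end
  end
end

Dir.mktmpdir do |dir|
  input  = File.join(dir, "input.txt")
  output = File.join(dir, "output.txt")
  File.write(input, "hello world\nfoo bar\n")
  replace_in_file(input, output, /\w+/, "X")
  puts File.read(output)   # prints "X X" twice, one per line
end
```

The block form of File.open closes the file automatically, so no explicit close is needed.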

How to avoid undefined method error for Nilclass

I use the dbf gem to read data out of a dbf file. I wrote some code:
# encoding: UTF-8
require 'dbf'
widgets = DBF::Table.new("patient.dbf")
widgets.each do |record|
  puts record.vorname
end
Basically the code works, but after Ruby writes about 400 record.vorname values to the console I get this error:
...
Gisela
G?nter
mycode.rb:5:in `block in <main>': undefined method `vorname' for nil:NilClass (NoM
ethodError)
from C:/RailsInstaller/Ruby1.9.3/lib/ruby/gems/1.9.1/gems/dbf-2.0.6/lib/
dbf/table.rb:101:in `block in each'
......
My question is: how can I avoid this error? It would also be interesting to know why (as you can see in the error output) the record.vorname values containing ä, ö, ü are displayed with ? instead, e.g.:
Günter is transformed to G?nter
Thanks
For some reason, your DBF driver returns nil records. You can pretend that this problem doesn't exist by skipping those.
widgets.each do |record|
  puts record.vorname if record
end
About your question about the wrong chars, according to the dbf documentation:
Encodings (Code Pages)
dBase supports encoding non-english characters in different formats.
Unfortunately, the format used is not always set, so you may have to
specify it manually. For example, you have a DBF file from Russia and
you are getting bad data. Try using the 'Russian OEM' encoding:
table = DBF::Table.new('dbf/books.dbf', nil, 'cp866')
See doc/supported_encodings.csv for a full list of supported
encodings.
So make sure you use the right encoding to read from the DB.
To avoid the NoMethodError for nil:NilClass you can also guard the call explicitly. Note that blank? comes from ActiveSupport (Rails); in plain Ruby, use nil? instead:
require 'dbf'
widgets = DBF::Table.new("patient.dbf")
widgets.each do |record|
  puts record.vorname unless record.nil?
end
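The guard can be tried without a DBF file. In this sketch, Record is a stand-in Struct, not the gem's record class, and the list with a nil entry is made-up data that mimics what the table appears to yield:

```ruby
# Stand-in for the objects yielded by the table (assumption: they respond
# to vorname, and some entries are nil).
Record = Struct.new(:vorname)
records = [Record.new("Gisela"), nil, Record.new("Günter")]

names = []
records.each do |record|
  next if record.nil?     # skip missing records instead of calling a method on nil
  names << record.vorname
end
puts names
```

Array#compact (records.compact.each ...) removes the nil entries up front and achieves the same result on an in-memory array.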

Runtime Error using Ruby FileUtils

I've been staring at this for hours, but I'm not sure what I'm doing wrong. I'm trying to write a simple script to move 100 or so files from various locations named in an external list. It should be simple enough, and when I run the command through irb, everything works for that one file, but when running the script I get an error. Here's my script.
#! /opt/local/bin/ruby
require 'fileutils'
list_of_files = File.read "files_to_copy.txt"
source_dir = "/Volumes/data/moved_from_share/"
dest_dir = "/Volumes/data/testeroooo/"
list_of_files.each do |line|
  copy_from = source_dir + line
  copy_to = dest_dir + line
  puts copy_from
  puts copy_to
  puts
  FileUtils.cp_r(copy_from, copy_to)
end
Here is some example input from "files_to_copy.txt":
Accounting HG/Accounts Payable/2011/2011_06/ebi_Inv_218876.pdf
Accounting HG/Accounts Payable/2011/2011_06/expeditors_1050006142.tif
Accounting HG/Accounts Payable/2011/2011_06/expeditors_7050627938.tif
And lastly, here is my output with error:
/Volumes/data/moved_from_share/Accounting PG/Accounts Payable/2011/2011_07/
/Volumes/data/testeroooo/Accounting PG/Accounts Payable/2011/2011_07/
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1255:in `copy': unknown file type: /Volumes/data/moved_from_share/Accounting PG/Accounts Payable/2011/2011_07/ (RuntimeError)
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:451:in `copy_entry'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1324:in `traverse'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:448:in `copy_entry'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:423:in `cp_r'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1395:in `fu_each_src_dest'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1411:in `fu_each_src_dest0'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1393:in `fu_each_src_dest'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:422:in `cp_r'
from copy_it.rb:14
from copy_it.rb:8:in `each'
from copy_it.rb:8
If you have any suggestions, I would love to hear them! Thank you!
Your file list likely contains Accounting PG/Accounts Payable/2011/2011_07/ as an entry, which is a directory, not a file. In principle this should work fine, since cp_r copies directories recursively.
You could work around the error by copying only regular files (assuming your file list also includes the items inside each subfolder):
if File.file?(copy_from)
  FileUtils.cp_r(copy_from, copy_to)
end
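Put together, a files-only copy loop might look like the sketch below. copy_listed_files is a hypothetical helper. It also strips the trailing newline from each list entry and creates the destination directory first, two details the original script does not handle. The demo uses a temporary tree with made-up names.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: copy each regular file named in list_path from
# source_dir to the same relative location under dest_dir.
def copy_listed_files(list_path, source_dir, dest_dir)
  copied = []
  File.foreach(list_path) do |line|
    relative = line.chomp                  # drop the newline read from the list
    next if relative.empty?
    copy_from = File.join(source_dir, relative)
    next unless File.file?(copy_from)      # skip directories and missing entries
    copy_to = File.join(dest_dir, relative)
    FileUtils.mkdir_p(File.dirname(copy_to))
    FileUtils.cp(copy_from, copy_to)
    copied << relative
  end
  copied
end

Dir.mktmpdir do |tmp|
  source = File.join(tmp, "source")
  dest   = File.join(tmp, "dest")
  FileUtils.mkdir_p(File.join(source, "Accounts Payable", "2011_06"))
  File.write(File.join(source, "Accounts Payable", "2011_06", "invoice.pdf"), "pdf")
  list = File.join(tmp, "files_to_copy.txt")
  # The second entry is a directory and is skipped.
  File.write(list, "Accounts Payable/2011_06/invoice.pdf\nAccounts Payable/2011_06/\n")
  p copy_listed_files(list, source, dest)  # => ["Accounts Payable/2011_06/invoice.pdf"]
end
```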
