Search and Replace within one file - ruby

i'm new to Ruby and i'm trying to use RegEx to do multiple search and replaces in an input text file, however my code isn't working, i think i understand why it doesn't work but i don't know the syntax i need to make it work.
Heres my code:
# encoding: utf-8
#!/usr/bin/ruby
# open file to read and write
file = File.open("input.txt", "r+")
# get the contents of the file
contents = file.read
file.close
reassign = contents.gsub(/\w+/, '£££££')
# save it out as a new file
new_file = File.new("output.txt")
new_file.write(reassign)
new_file.close
this is the error messages:
C:/Users/parsonsr/RubymineProjects/Test 3/test3.rb:14:in `write': not opened for writing (IOError)
from C:/Users/parsonsr/RubymineProjects/Test 3/test3.rb:14:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
i tried using an array to pass each line through and change whats relevant but then it only saves whats in the array to the output not the rest of the file.
I either need it to change the text thats already there within one file or change the text then save the file with the new changes made into an output file, whichever is easiest.
Hope this makes sense.
Thanks

Take a look at the documentation. Note that File#open receives mode "r" by default.
So the answer is: change
File.new("output.txt")
to
File.new("output.txt", "w")
Another thing you can do in Ruby is:
File.write("output.txt", reassign)
Or:
File.open("output.txt", "w")
Also, i don't know what's the purpose, but consider a big file, you might want to read batch of lines and write them to the output file each time, not read all at once to the memory.

Related

How to replace the first few bytes of a file in Ruby without opening the whole file?

I have a 30MB XML file that contains some gibberish in the beginning, and so typically I have to remove that in order for Nokogiri to be able to parse the XML document properly.
Here's what I currently have:
contents = File.open(file_path).read
if contents[0..123].include? 'authenticate_response'
fixed_contents = File.open(file_path).read[123..-1]
File.open(file_path, 'w') { |f| f.write(fixed_contents) }
end
However, this actually causes the ruby script to open up the large XML file twice. Once to read the first 123 characters, and another time to read everything but the first 123 characters.
To solve the first issue, I was able to accomplish this:
contents = File.open(file_path).read(123)
However, now I need to remove these characters from the file without reading the entire file. How can I "trim" the beginning of this file without having to open the entire thing in memory?
You can open the file once, then read and check the "garbage" and finally pass the opened file directly to nokogiri for parsing. That way, you only need read the file once and don't need to write it at all.
File.open(file_path) do |xml_file|
if xml_file.read(123).include? 'authenticate_response'
# header found, nothing to do
else
# no header found. We rewind and let nokogiri parse the whole file
xml_file.rewind
end
xml = Nokogiri::XML.parse(xml_file)
# Now to whatever you want with the parsed XML document
end
Please refer to the documentation of IO#read, IO#rewind and Nokigiri::XML::Document.parse for details about those methods.

No such file or directory # rb_sysopen ruby

Facing below issue eventhough the file is present in the folder.
H:\Ruby_test_works>ruby hurrah.rb
hurrah.rb:7:in `read': No such file or directory # rb_sysopen - H:/Ruby_
test_works/SVNFolders.txt (Errno::ENOENT)
from hurrah.rb:7:in `block in <main>'
from hurrah.rb:4:in `each_line'
from hurrah.rb:4:in `<main>'
Input file (input.txt) Columns are tab separated.
10.3.2.021.asd 10.3.2.041.def SVNFolders.txt
SubversionNotify Subversionweelta post-commit.bat
Commit message still rake customemail.txt
mckechney.com yahoo.in ReadMe.txt
Code :
dir = 'H:/Ruby_test_works'
file = File.open("#{dir}/input.txt", "r")
file.each_line do |line|
initial, final, file_name = line.split("\t")
#puts file_name
old_value = File.read("#{dir}/#{file_name}")
replace = old_value.gsub( /#{Regexp.escape(initial)}, #{Regexp.escape(final)}/)
File.open("#{dir}/#{file_name}", "w") { |fi| fi.puts replace }
end
I have tried using both forward and backward slashes but no luck. What I'm missing, not sure. Thanks.
puts file_name gives the below values
SVNFolders.txt
post-commit.bat
customemail.txt
ReadMe.txt
The file_name contains the newline character \n at the end, which won't get printed but messes up the path. You can fix the issue by stripping the line first:
initial, final, file_name = line.strip.split("\t")
When debugging code, be careful with puts. Quoting its documentation reveals an ugly truth:
Writes the given object(s) to ios. Writes a newline after any that do not already end with a newline sequence.
Another way to put this is to say it ignores (potential) newlines at the end of the object(s). Which is why you never saw that the file name actually was SVNFolders.txt\n.
Instead of using puts, you can use p when troubleshooting issues. The very short comparison between the two is that puts calls to_s and adds a newline, while p calls inspect on the object. Here is a bit more details about the differences: http://www.garethrees.co.uk/2013/05/04/p-vs-puts-vs-print-in-ruby/
Sometimes the issue is not the file, but the path to the file. Consider compare the file path with what you think the file path is with something like:
File.expand_path('my_file.rb')

Ruby: Reading from a file written to by the system process

I'm trying to open a tmpfile in the system $EDITOR, write to it, and then read in the output. I can get it to work, but I am wondering why calling file.read returns an empty string (when the file does have content)
Basically I'd like to know the correct way of reading the file once it has been written to.
require 'tempfile'
file = Tempfile.new("note")
system("$EDITOR #{file.path}")
file.rewind
puts file.read # this puts out an empty string "" .. why?
puts IO.read(file.path) # this puts out the contents of the file
Yes, I will be running this in an ensure block to nuke the file once used ;)
I was running this on ruby 2.2.2 and using vim.
Make sure you are calling open on the file object before attempting to read it in:
require 'tempfile'
file = Tempfile.new("note")
system("$EDITOR #{file.path}")
file.open
puts file.read
file.close
file.unlink
This will also let you avoid calling rewind on the file, since your process hasn't written any bytes to it at the time you open it.
I believe IO.read will always open the file for you, which is why it worked in that case. Whereas calling .read on an IO-like object does not always open the file for you.

How to create a file in Ruby

I'm trying to create a new file and things don't seem to be working as I expect them too. Here's what I've tried:
File.new "out.txt"
File.open "out.txt"
File.new "out.txt","w"
File.open "out.txt","w"
According to everything I've read online all of those should work but every single one of them gives me this:
ERRNO::ENOENT: No such file or directory - out.txt
This happens from IRB as well as a Ruby script. What am I missing?
Use:
File.open("out.txt", [your-option-string]) {|f| f.write("write your stuff here") }
where your options are:
r - Read only. The file must exist.
w - Create an empty file for writing.
a - Append to a file.The file is created if it does not exist.
r+ - Open a file for update both reading and writing. The file must exist.
w+ - Create an empty file for both reading and writing.
a+ - Open a file for reading and appending. The file is created if it does not exist.
In your case, 'w' is preferable.
OR you could have:
out_file = File.new("out.txt", "w")
#...
out_file.puts("write your stuff here")
#...
out_file.close
Try
File.open("out.txt", "w") do |f|
f.write(data_you_want_to_write)
end
without using the
File.new "out.txt"
Try using "w+" as the write mode instead of just "w":
File.open("out.txt", "w+") { |file| file.write("boo!") }
OK, now I feel stupid. The first two definitely do not work but the second two do. Not sure how I convinced my self that I had tried them. Sorry for wasting everyone's time.
In case this helps anyone else, this can occur when you are trying to make a new file in a directory that does not exist.
If the objective is just to create a file, the most direct way I see is:
FileUtils.touch "foobar.txt"
The directory doesn't exist. Make sure it exists as open won't create those dirs for you.
I ran into this myself a while back.
File.new and File.open default to read mode ('r') as a safety mechanism, to avoid possibly overwriting a file. We have to explicitly tell Ruby to use write mode ('w' is the most common way) if we're going to output to the file.
If the text to be output is a string, rather than write:
File.open('foo.txt', 'w') { |fo| fo.puts "bar" }
or worse:
fo = File.open('foo.txt', 'w')
fo.puts "bar"
fo.close
Use the more succinct write:
File.write('foo.txt', 'bar')
write has modes allowed so we can use 'w', 'a', 'r+' if necessary.
open with a block is useful if you have to compute the output in an iterative loop and want to leave the file open as you do so. write is useful if you are going to output the content in one blast then close the file.
See the documentation for more information.
data = 'data you want inside the file'.
You can use File.write('name of file here', data)
You can also use constants instead of strings to specify the mode you want. The benefit is if you make a typo in a constant name, your program will raise an runtime exception.
The constants are File::RDONLY or File::WRONLY or File::CREAT. You can also combine them if you like.
Full description of file open modes on ruby-doc.org

Runtime Error using Ruby FileUtils

I've been staring at this for hours, but am not sure what i'm doing wrong. I'm trying to write a simple script to move 100 or so files from various locations in an external list. Should be simple enough, and when I run the command through irb, everything works for that one file, but when running the script I get an error. Here's my script.
#! /opt/local/bin/ruby
require 'fileutils.rb'
list_of_files = File.read "files_to_copy.txt"
source_dir = "/Volumes/data/moved_from_share/"
dest_dir = "/Volumes/data/testeroooo/"
list_of_files.each do |line|
copy_from = source_dir + line
copy_to = dest_dir + line
puts copy_from
puts copy_to
puts
FileUtils.cp_r(copy_from, copy_to)
end
Here is some example input from "files_to_copy.txt":
Accounting HG/Accounts Payable/2011/2011_06/ebi_Inv_218876.pdf
Accounting HG/Accounts Payable/2011/2011_06/expeditors_1050006142.tif
Accounting HG/Accounts Payable/2011/2011_06/expeditors_7050627938.tif
And lastly, here is my output with error:
/Volumes/data/moved_from_share/Accounting PG/Accounts Payable/2011/2011_07/
/Volumes/data/testeroooo/Accounting PG/Accounts Payable/2011/2011_07/
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1255:in `copy': unknown file type: /Volumes/data/moved_from_share/Accounting PG/Accounts Payable/2011/2011_07/ (RuntimeError)
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:451:in `copy_entry'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1324:in `traverse'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:448:in `copy_entry'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:423:in `cp_r'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1395:in `fu_each_src_dest'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1411:in `fu_each_src_dest0'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:1393:in `fu_each_src_dest'
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/fileutils.rb:422:in `cp_r'
from copy_it.rb:14
from copy_it.rb:8:in `each'
from copy_it.rb:8
If you have any suggestions, I would love to hear them! Thank you!
Your file list likely contains Accounting PG/Accounts Payable/2011/2011_07/ as an entry, which is a Directory, not a File. This should work perfectly fine, as you're using cp_r.
You could override it to only copy files (assuming your file list includes the subfolder items too):
if File.file?(copy_from)
FileUtils.cp_r(copy_from, copy_to)
end

Resources