How do I detect end of file in Ruby?

I wrote the following script to read a CSV file:
f = File.open("aFile.csv")
text = f.read
text.each_line do |line|
if (f.eof?)
puts "End of file reached"
else
line_num +=1
if(line_num < 6) then
puts "____SKIPPED LINE____"
next
end
end
arr = line.split(",")
puts "line number = #{line_num}"
end
This code runs fine if I take out the line:
if (f.eof?)
puts "End of file reached"
With this line in I get an exception.
I was wondering how I can detect the end of file in the code above.

Try this short example:
f = File.open(__FILE__)
text = f.read
p f.eof? # -> true
p text.class #-> String
With f.read you read the whole file into text and reach EOF.
(Remark: __FILE__ is the script file itself. You may use your CSV file instead.)
In your code you use text.each_line. This executes each_line for the string text. It has no effect on f.
You could use File#each_line without using a variable text. The test for EOF is not necessary. each_line loops on each line and detects EOF on its own.
f = File.open(__FILE__)
line_num = 0
f.each_line do |line|
line_num +=1
if (line_num < 6)
puts "____SKIPPED LINE____"
next
end
arr = line.split(",")
puts "line number = #{line_num}"
end
f.close
You should close the file after reading it. Using a block for this is more Ruby-like, because the file is closed automatically when the block ends:
line_num = 0
File.open(__FILE__) do | f|
f.each_line do |line|
line_num +=1
if (line_num < 6)
puts "____SKIPPED LINE____"
next
end
arr = line.split(",")
puts "line number = #{line_num}"
end
end
One general remark: There is a CSV library in Ruby. Normally it is better to use that.
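As a rough sketch of that last remark, assuming the file is plain comma-separated text and that the first five records are the ones to skip (as in the question), the CSV standard library could be used like this. The helper name csv_rows_after is made up for illustration, and it counts parsed records rather than physical lines, which differ if a quoted field contains a newline:

```ruby
require "csv"

# Hypothetical helper: returns the parsed rows of a CSV file,
# leaving out the first `skip` records.
def csv_rows_after(path, skip = 5)
  rows = []
  # with_index(1) numbers the records starting at 1, like line_num above.
  CSV.foreach(path).with_index(1) do |row, line_num|
    next if line_num <= skip
    rows << row # row is already an array of fields, no split(",") needed
  end
  rows
end
```

CSV.foreach stops by itself at the end of the file, so no explicit EOF test is needed here either.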

https://www.ruby-forum.com/topic/218093#946117 talks about this.
content = File.read("file.txt")
content = File.readlines("file.txt")
The above 'slurps' the entire file into memory.
File.foreach("file.txt") {|line| content << line}
You can also use IO#each_line. These last two options do not read the entire file into memory. With File.foreach (or the block form of File.open), the IO object is also closed automatically. There are other ways as well; the IO and File classes are pretty feature rich!
I refer to IO objects because File is a subclass of IO. I tend to use IO when I don't really need the added methods from the File class for the object.
This way you don't need to deal with EOF; Ruby will do it for you.
Sometimes the best handling is no handling, when you really don't need it.
Of course, Ruby has a method for this.

Without testing this, it seems you should perform a rescue rather than checking.
http://www.ruby-doc.org/core-2.0/EOFError.html
file = File.open("aFile.csv")
begin
loop do
some_line = file.readline
# some stuff
end
rescue EOFError
# You've reached the end. Handle it.
end
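A slightly fuller sketch of the same idea, using the block form of File.open so the file is closed afterwards. The method name read_until_eof is hypothetical. It relies on IO#readline raising EOFError at the end of input (IO#gets would return nil instead):

```ruby
# Hypothetical helper: collects every line using readline,
# treating EOFError as the normal "no more data" signal.
def read_until_eof(path)
  lines = []
  File.open(path) do |file|
    begin
      loop { lines << file.readline }
    rescue EOFError
      # readline raises EOFError once the end of the file is reached.
    end
  end
  lines
end
```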

Related

find the target string from a large file

I want to write a class that can find a target string in a text file and output the line number and the position.
class ReadFile
def find_string(filename, string)
line_num = 0
IO.readlines(filename).each do |line|
line_num += 1
if line.include?(string)
puts line_num
puts line.index(string)
end
end
end
end
a= ReadFile.new
a.find_string('test.txt', "abc")
If the text file is very large (1 GB, 10 GB, ...), this method performs very poorly.
Is there a better solution?
Use foreach to efficiently read a single line from the file at a time and with_index to track the line number (0-based):
IO.foreach(filename).with_index do |line, index|
if found = line.index(string)
puts "#{index+1}, #{found+1}"
break # skip this if you want to find more than 1 result
end
end
See here for a good explanation of why readlines is giving you performance problems.
This is a variant of @PinnyM's answer. It uses find, which I think is more descriptive than looping and breaking, but does the same thing. This does have a small penalty of having to determine the offset into the line where the string begins after the line is found.
line, index = IO.foreach(filename).with_index.find { |line,index|
line.include?(string) }
if line
puts "'#{string}' found in line #{index + 1}, " +
"beginning in column #{line.index(string)+1}"
else
puts "'#{string}' not found"
end
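If every occurrence is wanted rather than only the first, the same line-at-a-time approach can collect the results. This is only a sketch: find_matches is a made-up name, it reports 1-based line and column numbers, and it records just the first occurrence within each line.

```ruby
# Hypothetical helper: returns [line_number, column] pairs (both 1-based)
# for every line of the file that contains `needle`.
def find_matches(path, needle)
  matches = []
  # File.foreach reads one line at a time, so memory use stays small
  # even for very large files.
  File.foreach(path).with_index(1) do |line, line_no|
    col = line.index(needle)
    matches << [line_no, col + 1] if col
  end
  matches
end
```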

ruby read and write/change the same file

I am trying to change the content of an existing file. I have this piece of code, which works, but I would like to find a better way that does the manipulation while opening the file only once.
File.open(file_name , 'r') do |f|
content = f.read
end
File.open(file_name , 'w') do |f|
content.insert(0, "something ")
f.write(content)
end
Is there a way to do it while opening the file only once?
I have tried using File.open(file_name , 'r+'), which seems only to append to the end of the file (I was not able to insert anything at the beginning of the file).
[Edit: I misunderstood your question, but my code below can be fixed by simply inserting the line:
text_to_prepend = ''
after
line_out = text_to_prepend + buf.shift
It could be simplified a little (for your question), but I'll leave it as is to show how the same string could be prepended to each line.]
You can open the file but once, and not read the entire file before writing, but it's messy and a bit tricky. Basically, you need to move the file pointer between reading and writing and maintain a buffer that contains lines from the file that will be wholly or partially overwritten when each modified line is written.
At each step, remove the first line from the buffer and modify it in preparation for writing. Before writing, however, you may need to read one or more additional lines into the buffer, in order that the read pointer remains ahead of the write pointer after the modified line is written. After all lines have been read, each remaining line in the buffer is modified and written.
Code
def prepend_file_lines(file_name, text_to_prepend)
  # The block form closes the file, even on the early return below.
  File.open(file_name, 'r+') do |f|
    return if f.eof?
    write_pos = 0
    line_in = f.readline
    read_pos = line_in.bytesize # seek works in bytes, not characters
    buf = [line_in]
    last_line_read = f.eof?
    loop do
      break if buf.empty?
      line_out = text_to_prepend + buf.shift
      # Read ahead until the read pointer is past the bytes about to be written.
      while (!last_line_read && read_pos <= write_pos + line_out.bytesize) do
        line_in = f.readline
        buf << line_in
        read_pos += line_in.bytesize
        last_line_read = f.eof?
      end
      f.seek(write_pos, IO::SEEK_SET)
      write_pos += f.write(line_out)
      f.seek(read_pos, IO::SEEK_SET)
    end
  end
end
Example
First, create a test file.
text =<<_
Now is
the time
for all Rubiests
to raise their
glasses to Matz.
_
F_NAME = "sample.txt"
File.write(F_NAME, text)
We can confirm the file was written correctly:
File.readlines(F_NAME).each { |l| puts l }
# Now is
# the time
# for all Rubiests
# to raise their
# glasses to Matz.
Now let's try it:
prepend_file_lines("sample.txt", "Here's to Matz: ")
File.readlines(F_NAME).each { |l| puts l }
# Here's to Matz: Now is
# Here's to Matz: the time
# Here's to Matz: for all Rubiests
# Here's to Matz: to raise their
# Here's to Matz: glasses to Matz.
Note that when testing, it's necessary to write the test file before each call to prepend_file_lines, since the file is being modified.
It looks like you want IO::SEEK_SET with 0 to rewind the file pointer after reading.
file_name = "File.txt"
File.open(file_name , 'r+') do |f|
  content = f.read
  content.insert(0, "something else")
  f.seek(0, IO::SEEK_SET)
  f.write(content)
end
You can do it in the same file, but you're likely to overwrite the content of the file.
Each file operation moves the file's cursor, and later operations start from the new position. So if you read 8 bytes, you have to move the cursor back 8 bytes and write exactly 8 bytes to avoid overwriting anything else; if you write fewer bytes, the remaining bytes are left unchanged.
The Ruby File class is a subclass of IO, which is documented at http://www.ruby-doc.org/core-1.9.3/IO.html.
To open a file for read/write operations, use "r+" mode.
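A minimal sketch of prepending with a single "r+" open, assuming the whole file fits in memory. The name prepend_text is hypothetical. Because the new content is never shorter than the old, rewriting from position 0 replaces every old byte, so no truncation step is needed in this particular case:

```ruby
# Hypothetical helper: inserts `prefix` at the start of the file,
# opening the file only once.
def prepend_text(path, prefix)
  File.open(path, "r+") do |f|
    content = f.read          # cursor is now at the end of the file
    f.rewind                  # move the cursor back to position 0
    f.write(prefix + content) # overwrite from the start with the longer text
  end
end
```

If the rewritten content could be shorter than the original, leftover bytes from the old content would remain at the end of the file unless it is truncated as well.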

read file into an array excluding the commented out lines

I'm almost a Ruby-nOOb (I have just enough knowledge of Ruby to write some basic .erb templates or Puppet custom facts). My requirements look fairly simple, but I can't get my head around them.
I'm trying to write a .erb template that reads a file (with space-delimited lines) into an array and then handles each array element according to the requirements. This is what I've got so far:
$fname = "webURI.txt"
def myArray()
#if defined? $fname
if File.exist?($fname) and File.file?($fname)
IO.readlines($fname)
end
end
myArray.each_index do |i|
myLine = myArray[i].split(' ')
puts myLine[0] +"\t=> "+ myLine.last
end
Which works just fine, except (for obvious reasons) for lines that are commented out or blank. I also want to make sure that, when split up (by space), a line doesn't have more than two fields in it. The file looks like this:
# This is a COMMENT
#
# Puppet dashboard
puppet controller-all-local.example.co.uk:80
# Nagios monitoring
nagios controller-all-local.example.co.uk::80/nagios
tac talend-tac-local.example.co.uk:8080/org.talend.admin
mng console talend-mca-local.example.co.uk:8080/amc # Line with three fields
So, basically, these are the two things I'd like to achieve:
Read the lines into an array, stripping off everything after the first #
Split each element and print a message if the number of fields is more than two
Any help would be greatly appreciated. Cheers!!
Update 25/02
Thanks, guys, for your help!!
The blank? thing doesn't work at all; it throws this error, but I fail to understand why:
undefined method `blank?' for "\n":String (NoMethodError)
The array myArray that I get is actually something like this (using p instead of puts):
["\n", "puppet controller-all-local.example.co.uk:80\n", "\n", "\n", "nagios controller-all-local.example.co.uk::80/nagios\n", ..... \n"]
Hence, I had to do this to get around the problem:
$fname = "webURI.txt"
def myArray()
if File.exist?($fname) and File.file?($fname)
IO.readlines($fname).map { |arr| arr.gsub(/#.*/,'') }
end
end
# remove blank lines
SSS = myArray.reject { |ln| ln.start_with?("\n") }
SSS.each_index do |i|
myLine = SSS[i].split(' ')
if myLine.length > 2
puts "Too many arguments!!!"
elsif myLine.length == 1
puts "page"+ i.to_s + "\t=> " + myLine[0]
else
puts myLine[0] +"\t=> "+ myLine.last
end
end
You are most welcome to improve the code. cheers!!
goodArray = myArray.reject do |line|
line.start_with?('#') || line.split(' ').length > 2
end
This rejects every line that either starts with # or splits into more than two elements, leaving you with an array of only the good items.
Edit:
For your inline commenting you can then do
goodArray.map do |line|
line.gsub(/#.*/, '')
end
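The pieces above (strip comments, drop blank lines, check the field count) can be combined into one pass. This is a sketch under the assumptions stated in the question: fields are whitespace-separated and everything after the first # is a comment. The name parse_entries is invented for illustration.

```ruby
# Hypothetical helper: turns raw lines into arrays of one or two fields,
# ignoring comments and blank lines and warning about over-long lines.
def parse_entries(lines)
  entries = []
  lines.each do |raw|
    fields = raw.sub(/#.*/, "").split # drop the comment, split on whitespace
    next if fields.empty?             # blank or comment-only line
    if fields.length > 2
      warn "Too many fields: #{raw.chomp}"
      next
    end
    entries << fields
  end
  entries
end
```

It could be called as parse_entries(IO.readlines($fname)), and it does not need blank? from ActiveSupport.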

Skipping the first line when reading in a file in 1.9.3

I'm using ruby's File to open and read in a text file inside of a rake
task. Is there a setting where I can specify that I want the first line of
the file skipped?
Here's my code so far:
desc "Import users."
task :import_users => :environment do
File.open("users.txt", "r", '\r').each do |line|
id, name, age, email = line.strip.split(',')
u = User.new(:id => id, :name => name, :age => age, :email => email)
u.save
end
end
I tried line.lineno and also doing File.open("users.txt", "r", '\r').each do |line, index| and next if index == 0 but have not had any luck.
Changing each to each_with_index do |line, index| and adding next if index == 0 will work.
The drop(n) method removes n lines from the beginning:
File.readlines(filename).drop(1).each do |line|
puts line
end
This reads the whole file into an array and removes the first n lines. If you are reading the whole file anyway, it's probably the most elegant solution.
When reading larger files, foreach is more efficient, as it doesn't read all the data into memory:
File.foreach(filename).with_index do |line, line_num|
next if line_num == 0
puts line
end
File.open("users.txt", "r", '\r') do |file|
lines = file.lines # an enumerator
lines.next #skips first line
lines.each do |line|
puts line # do work
end
end
Making use of an enumerator, which 'remembers' where it is.
You probably really want to use csv:
CSV.foreach("users.txt", :headers => true, :header_converters => :symbol, :col_sep => ',') do |row|
  User.new(row.to_hash).save
end
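To see what the headers option does without a User model, here is a small sketch. It assumes the first line of the file is a header row naming the columns (the question does not say so explicitly), and load_rows is a hypothetical name:

```ruby
require "csv"

# Hypothetical helper: reads a CSV file whose first line is a header row
# and returns one hash per data row, keyed by the symbolized header names.
def load_rows(path)
  CSV.foreach(path, headers: true, header_converters: :symbol).map(&:to_h)
end
```

With headers enabled, the first line is consumed as column names and never yielded as data, which is what skips it.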
File.readlines('users.txt')[1..-1].join()
Works well too.
If you want to keep the file as IO the whole time (no array conversions) and you plan on using the data in the first line:
f = File.open('users.txt', 'r')
first_line = f.gets
body = f.readlines
More likely though, what you want is handled by CSV or FasterCSV as others have pointed out. My favorite way to handle files with a header line is to do:
FasterCSV.table('users.txt')
Since a few answers do not (or no longer) work in Ruby 1.9.3, here is a working sample of the three best methods:
# this line must be dropped
puts "using drop"
File.readlines(__FILE__).drop(1).each do |line|
puts line
end
puts ""
puts "using a range"
File.readlines(__FILE__)[1..-1].each do |line|
puts line
end
puts ""
puts "using enumerator"
File.open(__FILE__) do |file|
  lines = file.each_line # an enumerator over the open file
  lines.next # skips first line
  lines.each do |line|
    puts line
  end
end
The OP said lineno didn't work for them, but I'm guessing it wasn't applied in the correct way. There are lots of ways to achieve what the OP is asking for, but using lineno might help you shorten your code without having to use readlines, which is sometimes too memory intensive.
From the 1.9.3 docs
f = File.new("testfile")
f.each {|line| puts "#{f.lineno}: #{line}" }
produces:
1: This is line one
2: This is line two
3: This is line three
4: And so on...
Note that lineno is a method you can call on the file object, but not on the object yielded to the block.
2: require 'pry'; binding.pry
=> 3: f.each {|line| puts line.lineno }
[1] pry(#<SomeFile>)> line.lineno
NoMethodError: undefined method `lineno' for #<String:0x00007fa7d682b920>
You can find the same example with identical code in docs for the latest stable version of Ruby today (2.5.1).
So going off the example, the code might look like
f = File.new("testfile")
o = File.open("output.txt", "w")
f.each do |line|
  next if f.lineno == 1
  o << line
end
o.close
f.close

Determine last line in Ruby

I'm wondering how I can determine when I am on the last line of a file that I am reading in. My code looks like
File.open(file_name).each do |line|
  if(someway_to_determine_last_line)
  end
end
I noticed that there is a file.eof? method, but how would I call the method as the file is being read? Thanks!
If you're iterating the file with each, then the last line will be passed to the block after the end-of-file is reached, because the last line is, by definition, the line ending with EOF.
So just call file.eof? in the block.
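To call eof? inside the block you need a reference to the file object, which the block form of File.open provides. A minimal sketch follows; the method name each_line_with_last_flag is made up for illustration:

```ruby
# Hypothetical helper: yields each line together with a flag that is
# true only for the last line of the file.
def each_line_with_last_flag(path)
  File.open(path) do |file|
    file.each_line do |line|
      # After the final line has been read, nothing is left, so eof? is true.
      yield line, file.eof?
    end
  end
end
```

It could be used as each_line_with_last_flag(file_name) { |line, last| puts line if last }.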
If you'd like to determine if it's the last non-empty line in the file, you'd have to implement some kind of readahead.
Depending on what you need to do with this "last non-empty line", you might be able to do something like this:
last_line = nil
File.open(file_name).each do |line|
last_line = line if(!line.chomp.empty?)
# Do all sorts of other things
end
if(last_line)
# Do things with the last non-empty line.
end
Secret sauce is .to_a
lines = File.open(filename).to_a
Get the first line:
puts lines.first
Get the last line:
puts lines.last
Get the nth line of a file (the index is zero-based, so this is the sixth line):
puts lines.at(5)
Get the count of lines:
puts lines.count
fd.eof? works, but just for fun, here's a generic solution that works with any kind of enumerators (Ruby 1.9):
class Enumerator
def +(other)
Enumerator.new do |yielder|
each { |e| yielder << e }
other.each { |e| yielder << e }
end
end
def with_last
Enumerator.new do |yielder|
(self + [:some_flag_here]).each_cons(2) do |a, b|
yielder << [a, b == :some_flag_here]
end
end
end
end
# a.txt is a file containing "1\n2\n3\n"
open("a.txt").lines.with_last.each do |line, is_last|
p [line, is_last]
end
Which outputs:
["1\n", false]
["2\n", false]
["3\n", true]
Open your file and use the readlines method.
To simply manipulate the last line of the file, do the following:
lines = File.readlines('example.txt')
lines.each_with_index do |line, index|
  if index == lines.size - 1
    puts "LAST LINE, do something to it"
  else
    puts line
  end
end
Line 1 reads the file in as an array of lines
Line 2 iterates over each of them together with its index
Line 3 tests whether the current line is the last line (comparing positions rather than content, so duplicate lines are not mistaken for the last one)
Line 4 acts if it is
Lines 5 & 6 handle the behavior for every other line
