Can anyone tell me why this program is not producing any output? It should print:
Line read: 0
Line read: 1
Line read: 2
Line read: 3
and so on.
So far I am getting no output, even though I have fixed a number of bugs. Any help or suggestions would be much appreciated.
# takes a number and writes that number to a file then on each line
# increments from zero to the number passed
def write(aFile, number)
# You might need to fix this next line:
aFile.puts(number)
index = 0
while (index < number)
aFile.puts(index.to_s)
index += 1
end
end
# Read the data from the file and print out each line
def read(aFile)
# Defensive programming:
count = aFile.gets
if (is_numeric?(count))
count = count.to_i
index = 0
while (index < count)
line = aFile.gets
puts "line read: " + line
index+=1
end
end
end
# Write data to a file then read it in and print it out
def main
aFile = File.new("mydata.txt", "w") # open for writing
write(aFile, 10)
aFile.close
aFile = File.new("mydata.txt", "r")
read(aFile)
aFile.close
end
# returns true if a string contains only digits
def is_numeric?(obj)
if /[^0-9]/.match(obj) == nil
true
end
false
end
main
Your code isn't written in the Ruby way.
This is how I'd write it if I wanted to closely mimic your code's logic:
# takes a number and writes that number to a file then on each line
# increments from zero to the number passed
def write_data(fname, counter)
File.open(fname, 'w') do |fo|
fo.puts(counter)
counter.times do |n|
fo.puts n
end
end
end
# returns true if a string contains only digits
def is_numeric?(obj)
obj[/^\d+$/]
end
# Read the data from the file and print out each line
def read_data(fname)
File.open(fname) do |fi|
counter = fi.gets.chomp
if is_numeric?(counter)
counter.to_i.times do |n|
line_in = fi.gets
puts 'Line read: %s' % line_in
end
end
end
end
# Write data to a file then read it in and print it out
DATA_FILE = 'mydata.txt'
write_data(DATA_FILE, 10)
read_data(DATA_FILE)
Which outputs:
Line read: 0
Line read: 1
Line read: 2
Line read: 3
Line read: 4
Line read: 5
Line read: 6
Line read: 7
Line read: 8
Line read: 9
Notice these things:
Method (or variable) names are not in camelCase in Ruby, they're snake_case. ItsAReadabilityThing.
Ruby encourages us to use a block when opening files for reading or writing, so the file is closed automatically when we're finished with it. Leaving dangling file handles open in a loop in a long-running program is a great way for your program to crash in a way that's hard to figure out. SO has many questions that resulted from doing that. This is from the IO.open documentation:
With no associated block, ::open is a synonym for ::new. If the optional code block is given, it will be passed io as an argument, and the IO object will automatically be closed when the block terminates. In this instance, ::open returns the value of the block.
Usually you'll see code use File.open instead of IO.open, mostly out of habit in Ruby coders. File inherits from IO and adds some additional file-oriented methods to the class, so it's a little more full-featured.
Ruby has many methods that help us avoid using while loops. Getting the counters wrong, or missing a condition that should terminate the loop, is all too common in programming, so Ruby makes it easy to loop "n times" or to iterate over all the elements in an array. The times method accomplishes that nicely.
String's [] method is really powerful and makes it easy to look at the contents of a string and apply a pattern or a slice. /^\d+$/ matches a line made up entirely of digits, so some_string[/^\d+$/] is shorter than what you're doing and accomplishes the same thing for a single line: it returns a "truthy" value (the matched text) or nil. Note that in Ruby ^ and $ anchor at line boundaries; to validate a whole string that may contain newlines, use \A and \z instead.
We don't use a main method. That's old-school Pascal, C or Java and is artificially structured. Ruby's a little more friendly than that.
Instead of using
3.times do |n|
puts n
end
# >> 0
# >> 1
# >> 2
I'd probably use
puts (0..(3 - 1)).to_a * "\n"
# >> 0
# >> 1
# >> 2
just because I tend to think in Perl terms. It's another old habit.
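On the digit check discussed above, here is a minimal sketch of how the anchors differ. The helper names are made up for illustration and do not come from the original code.

```ruby
# Hypothetical helpers, named here only for illustration.

# True if any single line of str consists solely of digits.
def line_numeric?(str)
  !str[/^\d+$/].nil?
end

# True only if the whole string consists solely of digits.
def strictly_numeric?(str)
  str.match?(/\A\d+\z/)
end

p line_numeric?("10\n")         # => true  ($ also matches just before a newline)
p strictly_numeric?("10\n")     # => false (\z only matches at the very end)
p line_numeric?("abc\n123")     # => true  (the second line is all digits)
p strictly_numeric?("abc\n123") # => false
```

For a single chomped line read with gets, both behave the same; the difference only matters when the string can contain embedded newlines.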
I found 2 errors. Fixing them gives you the desired output.
Error #1.
Your method is_numeric? always returns false, even when the condition is true. The true inside the if is evaluated and then discarded; the last expression in the method is false, so that is what the method ALWAYS returns.
You can fix it in 2 steps.
Step #1: return false only from an else branch:
if /[^0-9]/.match(obj) == nil
true
else
false
end
Step #2: Returning boolean literals from a conditional is redundant, because the comparison already evaluates to true or false. You can simplify it this way:
def is_numeric?(obj)
/[^0-9]/.match(obj) == nil
end
or even better
def is_numeric?(obj)
/[^0-9]/.match(obj).nil?
end
Error #2 is inside your read method. If you print the value of count right after reading it from the file, you get "10\n". That trailing \n is a non-digit character, so is_numeric? rejects the string.
To get rid of the \n when you read from the file, use chomp. Your reading line then becomes:
count = aFile.gets.chomp
and the rest works like magic
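Both errors can be reproduced in isolation. The following is a small sketch with made-up method names (broken_numeric? and fixed_numeric? are not in the original program):

```ruby
# The original shape: the `true` inside the `if` is evaluated and thrown away.
def broken_numeric?(str)
  if /[^0-9]/.match(str).nil?
    true
  end
  false # last expression evaluated, so this is always the return value
end

# The simplified version: the comparison itself is the return value.
def fixed_numeric?(str)
  /[^0-9]/.match(str).nil?
end

p broken_numeric?("10")        # => false (error #1)
p fixed_numeric?("10\n")       # => false (error #2: the newline is a non-digit)
p fixed_numeric?("10\n".chomp) # => true
```

One caveat of this style of check: an empty string also passes, because it contains no non-digit characters.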
I want to write a class that can find a target string in a text file and output the line number and the position.
class ReadFile
def find_string(filename, string)
line_num = 0
IO.readlines(filename).each do |line|
line_num += 1
if line.include?(string)
puts line_num
puts line.index(string)
end
end
end
end
a = ReadFile.new
a.find_string('test.txt', "abc")
If the text file is very large (1 GB, 10 GB, ...), the performance of this method is very poor.
Is there a better solution?
Use foreach to efficiently read a single line from the file at a time and with_index to track the line number (0-based):
IO.foreach(filename).with_index do |line, index|
if found = line.index(string)
puts "#{index+1}, #{found+1}"
break # skip this if you want to find more than 1 result
end
end
See here for a good explanation of why readlines is giving you performance problems.
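If every match is wanted rather than just the first, the same foreach approach can collect results without loading the file. This is only a sketch: find_all_matches is a hypothetical name, and it reports just the first occurrence within each matching line.

```ruby
# Hypothetical helper: returns [line_number, column] pairs (both 1-based)
# for every line that contains `string`, reading one line at a time.
def find_all_matches(filename, string)
  matches = []
  File.foreach(filename).with_index(1) do |line, line_num|
    col = line.index(string) # nil when the line does not contain the string
    matches << [line_num, col + 1] if col
  end
  matches
end
```

Passing 1 to with_index makes the counter start at 1, which avoids the manual +1 on the line number.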
This is a variant of @PinnyM's answer. It uses find, which I think is more descriptive than looping and breaking, but does the same thing. It has the small penalty of having to locate the string within the line a second time, once the line is found, to get the column offset.
line, index = IO.foreach(filename).with_index.find { |line,index|
line.include?(string) }
if line
# with_index counts from 0, so add 1 to report a 1-based line number
puts "'#{string}' found in line #{index + 1}, " +
"beginning in column #{line.index(string)+1}"
else
puts "'#{string}' not found"
end
I wrote the following script to read a CSV file:
f = File.open("aFile.csv")
text = f.read
text.each_line do |line|
if (f.eof?)
puts "End of file reached"
else
line_num +=1
if(line_num < 6) then
puts "____SKIPPED LINE____"
next
end
end
arr = line.split(",")
puts "line number = #{line_num}"
end
This code runs fine if I take out the line:
if (f.eof?)
puts "End of file reached"
With this line in I get an exception.
I was wondering how I can detect the end of file in the code above.
Try this short example:
f = File.open(__FILE__)
text = f.read
p f.eof? # -> true
p text.class #-> String
With f.read you read the whole file into text and reach EOF.
(Remark: __FILE__ is the script file itself. You may use your CSV file instead.)
In your code you use text.each_line. This executes each_line for the string text. It has no effect on f.
You could use File#each_line without using a variable text. The test for EOF is not necessary. each_line loops on each line and detects EOF on its own.
f = File.open(__FILE__)
line_num = 0
f.each_line do |line|
line_num +=1
if (line_num < 6)
puts "____SKIPPED LINE____"
next
end
arr = line.split(",")
puts "line number = #{line_num}"
end
f.close
You should close the file after reading it. Using a block for this is more Ruby-like:
line_num = 0
File.open(__FILE__) do |f|
f.each_line do |line|
line_num +=1
if (line_num < 6)
puts "____SKIPPED LINE____"
next
end
arr = line.split(",")
puts "line number = #{line_num}"
end
end
One general remark: There is a CSV library in Ruby. Normally it is better to use that.
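As a rough illustration of that remark, the skip-the-first-five-lines loop could look like this with the standard csv library. The helper name rows_after_skip and the default of 5 are assumptions made for this sketch.

```ruby
require 'csv'

# Hypothetical helper: returns the parsed rows after skipping the first `skip` rows.
def rows_after_skip(path, skip = 5)
  rows = []
  CSV.foreach(path).with_index(1) do |row, row_num|
    next if row_num <= skip
    rows << row # row is already an Array of fields, no split(",") needed
  end
  rows
end
```

CSV.foreach reads one row at a time and stops at end of file by itself. Note that it counts parsed rows, which can differ from physical lines when a quoted field contains a newline.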
https://www.ruby-forum.com/topic/218093#946117 talks about this.
content = File.read("file.txt")
content = File.readlines("file.txt")
The above 'slurps' the entire file into memory.
File.foreach("file.txt") {|line| content << line}
You can also use IO#each_line. These last two options do not read the entire file into memory. With File.foreach, the file is also closed automatically once the block has finished. There are other ways as well; the IO and File classes are pretty feature rich!
I refer to IO objects, as File is a subclass of IO. I tend to use IO when I don't really need the added methods from File class for the object.
This way you don't need to deal with EOF; Ruby does it for you.
Sometimes the best way to handle something is not to handle it at all, when you really don't need to.
Of course, Ruby has a method for this.
Without testing this, it seems you should perform a rescue rather than checking.
http://www.ruby-doc.org/core-2.0/EOFError.html
file = File.open("aFile.csv")
begin
loop do
some_line = file.readline
# some stuff
end
rescue EOFError
# You've reached the end. Handle it.
end
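For comparison, here is a sketch of the two ways end of file shows up when reading line by line: readline raises EOFError, while gets simply returns nil. The method names are invented for this example.

```ruby
# Reads every line with readline, which raises EOFError at end of file.
def lines_via_readline(path)
  lines = []
  File.open(path) do |file|
    begin
      loop { lines << file.readline }
    rescue EOFError
      # nothing more to read
    end
  end
  lines
end

# Reads every line with gets, which returns nil at end of file.
def lines_via_gets(path)
  lines = []
  File.open(path) do |file|
    while (line = file.gets)
      lines << line
    end
  end
  lines
end
```

Both use the block form of File.open, so the file is closed even if an unexpected error is raised part-way through.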
I'm analyzing my IIS logs (bonus: I happen to know that IIS logs are encoded in ASCII).
Here's my Ruby code:
1. readlines
Dir.glob("*.log").each do |filename|
File.readlines(filename,:encoding => "ASCII").each do |line|
#comment line
if line[0] == '#'
next
else
line_content = line.downcase
#just care about first one
matched_keyword = keywords.select { |e| line_content.include? e }[0]
total_count += 1 if extensions.any? { |e| line_content.include? e }
hit_count[matched_keyword] += 1 unless matched_keyword.nil?
end
end
end
2. open
Dir.glob("*.log").each do |filename|
File.open(filename,:encoding => "ASCII").each_line do |line|
#comment line
if line[0] == '#'
next
else
line_content = line.downcase
#just care about first one
matched_keyword = keywords.select { |e| line_content.include? e }[0]
total_count += 1 if extensions.any? { |e| line_content.include? e }
hit_count[matched_keyword] += 1 unless matched_keyword.nil?
end
end
end
"readlines" reads the whole file into memory, so why is the "open" version always a bit faster?
I tested it a couple of times on Win7 with Ruby 1.9.3.
Both readlines and open.each_line read the file only once. Ruby buffers IO objects, so it reads a block of data (e.g. 64KB) from disk at a time to minimize the cost of disk reads. There should be little difference in the time spent on the disk-read step.
When you call readlines, Ruby constructs an empty array [], then repeatedly reads a line of the file and pushes it onto the array. Finally it returns the array containing all the lines of the file.
When you call each_line, Ruby reads a line of the file and yields it to your logic. When you have finished processing that line, Ruby reads another. It keeps reading lines until there is no more content in the file.
The difference between the two methods is that readlines has to append the lines to an array. When the file is large, Ruby might have to reallocate and copy the underlying array (at the C level) one or more times to enlarge it.
Digging into the source, readlines is implemented by io_s_readlines, which calls rb_io_readlines. rb_io_readlines calls rb_io_getline_1 to fetch a line and rb_ary_push to push the result into the array being returned.
each_line is implemented by rb_io_each_line, which calls rb_io_getline_1 to fetch a line just like readlines does, and yields the line to your logic with rb_yield.
So each_line has no need to store the lines in a growing array: no array resizing, no copying.
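To check this on your own logs, a rough timing harness might look like the following. All three method names are made up for this sketch, and the counting logic is a simplified stand-in for the keyword matching in the question.

```ruby
# Counts non-comment lines after loading every line into an array first.
def count_with_readlines(path)
  File.readlines(path).count { |line| !line.start_with?('#') }
end

# Counts non-comment lines while holding only one line at a time.
def count_with_each_line(path)
  count = 0
  File.open(path) do |file|
    file.each_line { |line| count += 1 unless line.start_with?('#') }
  end
  count
end

# Returns [block_result, elapsed_seconds] using a monotonic clock.
def time_it
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end
```

For example, time_it { count_with_readlines(path) } and time_it { count_with_each_line(path) } should return the same count, with timings that depend on file size and on the operating system's file cache, so run each several times before comparing.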
I'm almost a Ruby-nOOb (I know just enough Ruby to write a basic .erb template or Puppet custom facts). My requirements look fairly simple, but I can't get my head around them.
I'm trying to write a .erb template that reads a file (with space-delimited lines) into an array and then handles each array element according to the requirements. This is what I've got so far:
$fname = "webURI.txt"
def myArray()
#if defined? $fname
if File.exist?($fname) and File.file?($fname)
IO.readlines($fname)
end
end
myArray.each_index do |i|
myLine = myArray[i].split(' ')
puts myLine[0] +"\t=> "+ myLine.last
end
Which works just fine, except (for obvious reasons) for lines that are commented out or blank. I also want to make sure that, when split on spaces, a line doesn't have more than two fields. Take a file like this:
# This is a COMMENT
#
# Puppet dashboard
puppet controller-all-local.example.co.uk:80
# Nagios monitoring
nagios controller-all-local.example.co.uk::80/nagios
tac talend-tac-local.example.co.uk:8080/org.talend.admin
mng console talend-mca-local.example.co.uk:8080/amc # Line with three fields
So, basically these two things I'd like to achieve:
Read the lines into an array, stripping off everything after the first #
Split each element and print a message if the number of fields is more than two
Any help would be greatly appreciated. Cheers!!
Update 25/02
Thanks guys for your help!!
The blank? approach doesn't work for me at all; it throws this error, and I don't understand why:
undefined method `blank?' for "\n":String (NoMethodError)
The array myArray that I actually get looks something like this (using p instead of puts):
["\n", "puppet controller-all-local.example.co.uk:80\n", "\n", "\n", "nagios controller-all-local.example.co.uk::80/nagios\n", ..... \n"]
Hence, I had to do this to get around this prob:
$fname = "webURI.txt"
def myArray()
if File.exist?($fname) and File.file?($fname)
IO.readlines($fname).map { |arr| arr.gsub(/#.*/,'') }
end
end
# remove blank lines
SSS = myArray.reject { |ln| ln.start_with?("\n") }
SSS.each_index do |i|
myLine = SSS[i].split(' ')
if myLine.length > 2
puts "Too many arguments!!!"
elsif myLine.length == 1
puts "page"+ i.to_s + "\t=> " + myLine[0]
else
puts myLine[0] +"\t=> "+ myLine.last
end
end
You are most welcome to improve the code. cheers!!
goodArray = myArray.reject do |line|
line.start_with?('#') || line.split(' ').length > 2
end
This rejects every line that either starts with # or splits into more than two fields, leaving you with an array of only the good lines.
Edit:
For your inline commenting you can then do
goodArray.map do |line|
line.gsub(/#.*/, '')
end
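Putting the pieces of this thread together, one possible single-pass version is sketched below. It strips comments first, so a trailing inline comment does not count as extra fields. parse_uri_lines and its return shape are assumptions made for this example.

```ruby
# Hypothetical helper: takes raw lines and returns [entries, errors].
# entries is an array of field arrays (one or two fields each);
# errors lists the 1-based numbers of lines with more than two fields.
def parse_uri_lines(lines)
  entries = []
  errors = []
  lines.each_with_index do |raw, i|
    fields = raw.sub(/#.*/, '').split # drop comment, then split on whitespace
    next if fields.empty?             # blank or comment-only line
    if fields.length > 2
      errors << "line #{i + 1}: too many fields"
    else
      entries << fields
    end
  end
  [entries, errors]
end
```

It could be fed with File.readlines(fname) for a small file, or with File.foreach(fname) to avoid holding every line in memory.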
I've seen some really beautiful examples of Ruby and I'm trying to shift my thinking to be able to produce them instead of just admire them. Here's the best I could come up with for picking a random line out of a file:
def pick_random_line
random_line = nil
File.open("data.txt") do |file|
file_lines = file.readlines()
random_line = file_lines[Random.rand(0...file_lines.size())]
end
random_line
end
I feel like it's gotta be possible to do this in a shorter, more elegant way without storing the entire file's contents in memory. Is there?
There is already a random entry selector built into the Ruby Array class: sample().
def pick_random_line
File.readlines("data.txt").sample
end
You can do it without storing anything except the most recently-read line and the current candidate for the returned random line.
def pick_random_line
chosen_line = nil
File.foreach("data.txt").each_with_index do |line, number|
chosen_line = line if rand < 1.0/(number+1)
end
return chosen_line
end
So the first line is chosen with probability 1/1 = 1; the second line is chosen with probability 1/2, so half the time it keeps the first one and half the time it switches to the second.
Then the third line is chosen with probability 1/3 - so 1/3 of the time it picks it, and the other 2/3 of the time it keeps whichever one of the first two it picked. Since each of them had a 50% chance of being chosen as of line 2, they each wind up with a 1/3 chance of being chosen as of line 3.
And so on. At line N, every line from 1-N has an even 1/N chance of being chosen, and that holds all the way through the file (as long as the file isn't so huge that 1/(number of lines in file) is less than epsilon :)). And you only make one pass through the file and never store more than two lines at once.
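The argument can be written out: line k is picked with probability 1/k and then survives each later line j with probability (j-1)/j, so its final chance is (1/k) x (k/(k+1)) x ... x ((N-1)/N) = 1/N. A small sketch that takes any enumerable of lines and an injectable random number generator makes this easy to try out; reservoir_pick and the rng parameter are names invented here.

```ruby
# Hypothetical helper: single-pass uniform pick from any enumerable of lines.
# rng is injectable so that runs can be made repeatable with a seed.
def reservoir_pick(lines, rng = Random.new)
  chosen = nil
  lines.each_with_index do |line, i|
    # The (i + 1)-th line replaces the current choice with probability 1/(i + 1).
    chosen = line if rng.rand < 1.0 / (i + 1)
  end
  chosen
end

# Tally 3000 picks from three items; each count should be close to 1000.
rng = Random.new(42)
tally = Hash.new(0)
3000.times { tally[reservoir_pick(%w[a b c], rng)] += 1 }
p tally
```

Since File.foreach without a block returns an enumerator, reservoir_pick(File.foreach("data.txt")) would apply the same logic to a file.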
EDIT: You're not going to get a really concise solution with this algorithm, but you can turn it into a one-liner if you want to:
def pick_random_line
File.foreach("data.txt").each_with_index.reduce(nil) { |picked,pair|
rand < 1.0/(1+pair[1]) ? pair[0] : picked }
end
This function does exactly what you need.
It's not a one-liner. But it works with text files of any size (except zero size, maybe :).
def random_line(filename)
blocksize, line = 1024, ""
File.open(filename) do |file|
initial_position = rand(File.size(filename)-1)+1 # random pointer position. Not a line number!
pos = Array.new(2).fill( initial_position ) # array [prev_position, current_position]
# Find beginning of current line
begin
pos.push([pos[1]-blocksize, 0].max).shift # calc new position
file.pos = pos[1] # move pointer backward within file
offset = (n = file.read(pos[0] - pos[1]).rindex(/\n/) ) ? n+1 : nil
end until pos[1] == 0 || offset
file.pos = pos[1] + offset.to_i
# Collect line text till the end
begin
data = file.read(blocksize)
line.concat((p = data.index(/\n/)) ? data[0,p.to_i] : data)
end until file.eof? or p
end
line
end
Try it:
filename = "huge_text_file.txt"
100.times { puts random_line(filename).force_encoding("UTF-8") }
Negligible (imho) drawbacks:
the longer the line, the higher the chance it'll be picked.
doesn't take into account the "\r" line separator ( windows-specific ). Use files with Unix-style line endings!
This is not much better than what you came up with, but at least it's shorter:
def pick_random_line
lines = File.readlines("data.txt")
lines[rand(lines.length)]
end
One thing you can do to make your code more Rubyish is to omit the parentheses on method calls that take no arguments. Use readlines and size instead of readlines() and size().
A one liner:
def pick_random_line(file)
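# Needs a shell that provides $RANDOM (e.g. bash). $(...) is used for the inner
# command because a nested backtick would end Ruby's backtick literal early.
`head -$((${RANDOM} % $(wc -l < #{file}) + 1)) #{file} | tail -1`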
end
If you protest that it's not Ruby, go find a talk in this year's Euruko titled Ruby is unlike a Banana.
PS: Ignore SO's incorrect syntax highlighting.
Here's a shorter version of Mark's excellent answer, though not as short as Dave's:
def pick_random_line(number = 0, chosen_line = "")
# number starts at 0 so that the first line is always picked (probability 1/1)
File.foreach("data.txt") {|line| chosen_line = line if rand < 1.0 / (number += 1)}
chosen_line
end
Stat the file, pick a random number between zero and the size of the file, seek to that byte in the file. Scan until the next newline, then read and return the next line (assuming you're not at the end of the file).
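A minimal sketch of that idea follows. The method name, the injectable rng, and the decision to wrap around to the first line when the seek lands in the last line are all assumptions of this example, not part of the description above.

```ruby
# Hypothetical helper: seek to a random byte, skip the rest of that line,
# and return the line that follows. Returns nil for an empty file.
def random_line_by_seek(path, rng = Random.new)
  File.open(path) do |file|
    size = file.size
    return nil if size.zero?

    file.seek(rng.rand(size)) # random byte offset in 0...size
    file.gets                 # discard the (probably partial) current line
    line = file.gets
    if line.nil?              # we landed in the last line: wrap to the first
      file.rewind
      line = file.gets
    end
    line
  end
end
```

This never reads more than two lines, but the pick is not uniform: a line is more likely to be returned when the line before it is long, because more byte offsets lead to it.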