File.open and blocks in Ruby 1.8.7 - ruby

I'm pretty new to ruby and I'm currently reading the Pickaxe book to get familiar with everything. I came across the File.open section where it discusses taking a block as a parameter to a File.open call then guaranteeing that the file is closed. Now this sounds like an absolutely brilliant way to avoid shooting yourself in the foot and as I'm dangerously low on toes, I figure I'll give it a go. Here is what I wrote (in irb if that matters):
File.open('somefile.txt', 'r').each { |line| puts line }
My expectation was that the file somefile.txt would get opened, read, printed and closed, right? As far as I can tell wrong. If I use lsof to look at open file handles, it's still open. However, if I do
f = File.open('somefile.txt', 'r').each { |line| puts line }
f.close()
then the file is closed. Am I using blocks wrong in this example, or have I failed to understand the meaning of File.open when used with a block? I've read the section on ruby-doc.org related to File.open, but that just seems to confirm that what I'm doing ought to be working as expected.
Can anyone explain what I'm doing wrong?

To have the file closed after the block, pass the block to File.open() directly, not to each:
File.open('somefile.txt', 'r') do |f|
  f.each_line { |l| puts l }
end
File.open(…).each {…} just iterates over the opened file without closing it.
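To see the difference for yourself without lsof, here is a minimal sketch that checks closed? on the handle in both forms. The scratch file and the variable names are made up for the example.

```ruby
require 'tempfile'

# Hypothetical scratch file so the example is self-contained.
scratch = Tempfile.new('somefile')
scratch.puts 'first line', 'second line'
scratch.close

# Chained form: File.open returns the File, and #each returns that same File,
# so the handle stays open until it is closed by hand (or garbage collected).
chained = File.open(scratch.path, 'r').each { |line| line }
chained_open_after_each = !chained.closed?
chained.close

# Block form: File.open closes the handle itself when the block returns.
block_handle = nil
File.open(scratch.path, 'r') do |f|
  block_handle = f
  f.each_line { |line| line }
end
block_closed_after_block = block_handle.closed?

scratch.unlink
```

Here chained_open_after_each and block_closed_after_block should both be true.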

In ruby, how to rename a text file, keep same handle and delete the old file

I am using a constant (FLOG) as a handle to write to my log. At a given point, I have to use a temporary log, and later append that content to the regular log, all that with the same handle, which is used through a bunch of methods.
My test program is below. After closing the handle 'FLOG' associated with the temp log, when I re-assign FLOG to the new log, this somehow re-opens the temp log, and I can't delete it.
Is there a way to make sure that the old temp file stays closed (so I can delete it)?
# Pre-existing log:
final_log = "final_#{Time.now.strftime("%Y%m%d")}.txt"
#Writing something in it
File.open(final_log, "w+") { |file| file.write("This is the final log: #{final_log}\n") }
# temp log:
temp_log = "temp_#{Time.now.strftime("%Y%m%d")}.txt"
FLOG = File.new(temp_log, "w+")
# write some stuff in temp_log
FLOG.puts "Writing in temp_log named #{temp_log}"
# closing handle for temp_log
FLOG.close
# avoid constant reuse warning:
Object.send(:remove_const,'FLOG') if Object.const_defined?('FLOG')
# need to append temp_log content to final_log with handle FLOG
FLOG = File.open(final_log, "a+")
# appending old temp log to new log
File.open(temp_log, "r").readlines.each do |line|
puts "appending... #{line}"
FLOG.puts "appending... #{line}"
end
# closing handle
FLOG.close
# this tells me that 'temp_log' is somehow re-opened:
ObjectSpace.each_object(File) { |f| puts("3: #{temp_log} is open") if f.path == temp_log && !f.closed? }
File.delete(temp_log) # Can't do that:
# test_file2.rb:35:in `delete': Permission denied - temp_20150324.txt (Errno::EACCES)
If you're going to use a temp file, use Tempfile:
require 'tempfile'
# Pre-existing log:
final_log = "final_#{Time.now.strftime("%Y%m%d")}.txt"
#Writing something in it
File.open(final_log, "w+") { |file| file.write("This is the final log: #{final_log}\n") }
# give the tempfile a meaningful prefix
temp_log = Tempfile.new('foobar')
begin
  $flog = temp_log
  # write some stuff in temp_log
  $flog.puts "Writing in temp_log named #{temp_log.path}"
  # need to append temp_log content to final_log with handle $flog
  $flog = File.open(final_log, "a+")
  # reopen temp_log for reading, append to new log
  temp_log.open.readlines.each do |line|
    puts "appending... #{line}"
    $flog.puts "appending... #{line}"
  end
  # closing handle
  $flog.close
ensure
  # close and delete temp_log
  temp_log.close
  temp_log.unlink
end
And while globals are generally bad, hacking a constant so that you can use it like a global is worse.
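If a reasonably recent Ruby is available, the block form of Tempfile.create can take care of both the closing and the deleting. This is only a sketch: append_via_temp_log and the file names are invented for the example and are not from the question.

```ruby
require 'tempfile'
require 'tmpdir'

# Hypothetical helper: collect lines in a temp log, then append them to final_log.
def append_via_temp_log(final_log)
  Tempfile.create('temp_log') do |temp|
    temp.puts 'Writing in temp_log'
    temp.rewind # go back to the start so the lines just written can be read
    File.open(final_log, 'a') do |log|
      temp.each_line { |line| log.puts "appending... #{line}" }
    end
    temp.path
  end # the temp file is closed and deleted here
end

final_log = File.join(Dir.tmpdir, "final_#{Process.pid}.txt")
File.write(final_log, "This is the final log\n")
temp_path = append_via_temp_log(final_log)
final_contents = File.read(final_log)
File.delete(final_log)
```

Because no handle outlives its block, nothing is left open by the time the temp file is removed.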
temp_log is still open because you didn't close it. If you did something like:
temp_log_lines = File.open(temp_log, 'r') { |f| f.readlines }
then the I/O stream for temp_log would be closed at the end of the block. However, File.open(temp_log, "r").readlines takes the IO object returned by File.open and calls readlines on it, and you then call each on the resulting array with an accompanying block. Since the block is part of your call to each and not to File.open, the stream is not closed at the end of it, and stays open for the rest of the program.
As to why you can't delete temp_log at the end of the program, it's hard to say without knowing what's going on in the underlying file system. Neither Ruby nor the underlying (POSIX) OS will complain if you delete a file that you've opened a stream for and not closed; the file will be unlinked but the stream will persist and still have the contents of the file and so on. The error you're getting is saying that the owner of the Ruby process for this program doesn't have the rights to delete the file the program created. That's strange, but hard to diagnose just from this code. Consider the directory the program is working in, what the permissions on it are, etc. One thing worth ruling out: on Windows, unlike POSIX systems, deleting a file that still has an open handle commonly fails with this same Errno::EACCES, which would make the unclosed stream above the cause.
Speaking more generally, there are some things you could make use of in Ruby here that would make your life easier.
If you want a temporary file, there is a Tempfile class you could make use of that does a lot of legwork for you.
The idiomatic way of doing I/O with files in Ruby is to pass a block into File.open. The block is handed an I/O stream for the file that is automatically closed at the end of the block, so you don't need to do it manually. Here's an example:
File.open(temp_log, 'w+') do |f|
  f.puts "Writing in temp_log named #{temp_log}"
end
FLOG is not a true constant in your code. A constant is only a constant if its value never changes throughout the life of the program it's declared in. Ruby is a very permissive language, so it allows you to reassign constants, but warns you if you do. Stricter languages would throw an error if you did that. FLOG is really being used as a normal variable and should be written flog. A good use for a constant is a value external to the action of your program that your program needs to be able to reference. For instance, instead of writing 3.141592653589793 every time you need to refer to an approximation of pi, you could declare PI = 3.141592653589793 and use PI afterwards. In Ruby, this has been done for you in the Math module: Math::PI returns this value. User settings are another place constants often show up: they're determined before the program gets going, help to determine what it does, and should be unmodified during its execution, so storing them in constants sometimes makes sense.
You describe the program you supplied as a test program. Ruby has really great testing libraries you could make use of that will be nicer than writing scripts like this. Minitest is part of the Ruby standard library and is my favorite testing framework in Ruby, but a lot of people like RSpec too. (I'd like to link to the documentation for those frameworks, but I don't have enough reputation—sorry. You'll have to Google.)
It'll be hard to make use of those frameworks if you write your code imperatively like this, though. Ruby is a very object-oriented language and you'll get a lot out of structuring your code in an object-oriented style when working in it. If you're not familiar with OO design, some books that have been really good for me are Practical Object-Oriented Design in Ruby by Sandi Metz, Refactoring: Improving the Design of Existing Code by Martin Fowler et al., and Growing Object-Oriented Software, Guided by Tests, by Steve Freeman and Nat Pryce. (Same thing here with the lack of links.)

Ruby - how to read first n lines from file into array

For some reason, I can't find any tutorial mentioning how to do this...
So, how do I read the first n lines from a file?
I've come up with:
while File.open('file.txt') and count <= 3 do |f|
...
count += 1
end
end
but it is not working and it also doesn't look very nice to me.
Just out of curiosity, I've tried things like:
File.open('file.txt').10.times do |f|
but that didn't really work either.
So, is there a simple way to read just the first n lines without having to load the whole file?
Thank you very much!
Here is a one-line solution:
lines = File.foreach('file.txt').first(10)
I was worried that it might not close the file in a prompt manner (it might only close the file after the garbage collector deletes the Enumerator returned by File.foreach). However, I used strace and I found out that if you call File.foreach without a block, it returns an enumerator, and each time you call the first method on that enumerator it will open up the file, read as much as it needs, and then close the file. That's nice, because it means you can use the line of code above and Ruby will not keep the file open any longer than it needs to.
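A small sketch of the one-liner wrapped in a helper, in case it is useful. first_lines and the sample file are hypothetical names for the example; the only real call is File.foreach(...).first(n).

```ruby
require 'tempfile'

# Hypothetical helper: return at most n lines without reading the whole file.
def first_lines(path, n)
  File.foreach(path).first(n)
end

sample = Tempfile.new('lines')
sample.puts((1..100).map { |i| "line #{i}" })
sample.close

head = first_lines(sample.path, 3)    # three lines, each ending in "\n"
short = first_lines(sample.path, 500) # the file only has 100 lines
sample.unlink
```

Asking for more lines than the file contains is not an error; you simply get every line there is.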
There are many ways you can approach this problem in Ruby. Here's one way:
lines = File.open('Gemfile') do |f|
  # gets returns nil at end of file (readline would raise EOFError), so short files are safe
  10.times.map { f.gets }.compact
end
File.foreach('file.txt').with_index do |line, i|
  break if i >= 10
  puts line
end
File inherits from IO and IO mixes in Enumerable methods which include #first
Passing an integer to first(n) will return the first n items in the enumerable collection. For a File object, each item is a line in the file.
File.open('filename.txt', 'r').first(10)
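This returns an array of the lines, including the \n line breaks. Note that the chained form leaves the file open; File.open('filename.txt', 'r') { |f| f.first(10) } returns the same array and closes the file when the block ends.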
You may want to #join them to create a single whole string.
File.open('filename.txt', 'r').first(10).join
You could try the following:
`head -n 10 file`.split("\n")
It's not really "pure ruby" but that's rarely a requirement these days.

Learn Ruby the Hard Way ex17 extra credit 3 - consolidating to one line

For exercise 17, through searching other responses I was able to condense the following into one line (as asked in the extra credit #3)
from_file, to_file = ARGV
script = $0
input = File.open(from_file)
indata = input.read()
output = File.open(to_file, 'w')
output.write(indata)
output.close()
input.close()
I was able to condense it into:
from_file, to_file = ARGV
script = $0
File.open(to_file, 'w') {|f| f.write IO.read(from_file)}
Is there a better/different way to condense this into 1 line?
Can someone help explain the line I created? I created this from various questions/answers unrelated to this question. I have tried looking up exactly what I did but I am still a little lost and want a full understanding of it.
Similar to using IO::read to simplify "just read the whole file into a string", you can use IO::write to "just write the string to the file":
from_file, to_file = ARGV
IO.write(to_file, IO.read(from_file))
Since you don't use script, it can be removed. If you really want to get things down to one line, you can do:
IO.write(ARGV[1], IO.read(ARGV[0]))
I personally find this just as comprehensible, and the lack of error checking is equivalent.
You're using File.open with a block to open to_file in write-only mode ('w'). Inside the block you have access to the open file as f, and the file will be closed for you when the block terminates. IO::read reads the entire contents of from_file, which you then pass to IO#write on f (File is a subclass of IO), writing those contents to f (which is the open, write-only File for to_file).
There are always different ways of doing things:
Using File.open with a block is a good approach here. I like that to_file and from_file are declared in variables. So I think this is a good and readable solution that is not overly verbose.
The basic approach here is swapping out the open/close operations for the cleaner File.open method with a block. File.open with a block will open a file, run the block, and then close the file, which is exactly what is needed here. Because the method automatically opens and closes the file, we are able to remove the boilerplate code that appears in the initial example. IO.read is another shortcut method that allows us to open/read/close the file without all of the open/close boilerplate. This is an exercise to learn more about Ruby's standard File/IO library, and in this case swapping out the more verbose methods is sufficient to reduce things to a single line.
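For completeness, another one-line option is IO.copy_stream, which copies from one path to the other without building the whole file as a string first. A rough sketch, with temp files standing in for the from_file and to_file arguments (those stand-ins are assumptions of the example, not part of the exercise):

```ruby
require 'tempfile'
require 'tmpdir'

# Hypothetical stand-ins for the from_file and to_file arguments in ARGV.
source = Tempfile.new('from_file')
source.write("some data\nmore data\n")
source.close
to_file = File.join(Dir.tmpdir, "to_file_#{Process.pid}.txt")

# Same effect as IO.write(to_file, IO.read(from_file)), but the data is
# streamed between the files instead of being held in one big string.
bytes_copied = IO.copy_stream(source.path, to_file)

copied = File.read(to_file)
File.delete(to_file)
source.unlink
```

In the exercise itself this would read IO.copy_stream(ARGV[0], ARGV[1]). Whether that counts as "better" is a matter of taste; it mostly matters for large files.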
I'm just a complete beginner, but this works for me:
open(ARGV[1], 'w').write(open(ARGV[0]).read)
It doesn't look elegant to me, but it works.
Edit: This is my attempt to put the entire script into one line if it's not clear.

How to close temporary(anonymous) File object passed in as parameter?

Sometimes I find myself needing to open a file, read its content, do some functional manipulation, and store the data in a variable. This ends up as the following line of code:
@some_vars = File.open("items.txt").read.chomp!.split(',')
I have two questions here:
Is the File instance returned by File.open() closed after this line?
How can I close such a File instance without sacrificing readability?
No, File.open leaves the file handle open. You should use IO.read instead, which returns the entire contents of the file and closes it when it's done:
IO.read("items.txt").chomp.split(',')
This is a bit shorter for one-liners than passing a block to File.open. (Note chomp rather than chomp!: chomp! returns nil when there is nothing to remove, which would make the split call fail.)
The example you posted will not close the file descriptor automatically. You would have to manually call File#close on the descriptor, or let Ruby close the file automatically when the interpreter exits.
If you want to automatically close a file, you need the File.open block syntax:
File.open('items.txt') { |f| f.read.chomp.split(',') }
Ruby will then close the file whenever the block terminates.
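One caveat with one-liners like the one in the question: String#chomp! returns nil when it has nothing to strip, so chaining .split after it can blow up on a file without a trailing newline. A small sketch of the difference, using a made-up items file:

```ruby
require 'tempfile'

# Hypothetical items file with no trailing newline.
items = Tempfile.new('items')
items.write('apple,banana,cherry')
items.close

# chomp! returns nil when there is nothing to strip, so chaining .split after it can fail.
bang_result = File.read(items.path).chomp!

# chomp always returns a string, so the chain is safe; File.read closes the file itself.
some_vars = File.read(items.path).chomp.split(',')

# The block form of File.open gives the same result and also closes the file.
from_block = File.open(items.path) { |f| f.read.chomp.split(',') }
items.unlink
```

Both the File.read form and the block form leave no handle open afterwards, so readability is not sacrificed either way.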
Even using a block with File.open doesn't guarantee that the file will always be closed, for example when the application is suspended.
There's a trick to protect against this kind of situation:
f = File.open('items.txt', 'w')
at_exit { f.flush; f.close }
The at_exit block is then executed when the program exits.

What does 'a' mean in Ruby `open()`, and what does |f| mean?

What do 'a' and |f| mean below?
open('myfile.out', 'a') { |f|
  f.puts "Hello, world."
}
From the ruby IO doc:
"a" | Write-only, starts at end of file if file exists,
| otherwise creates a new file for writing.
The |f| is a variable that holds the IO object in the block (everything in the {}). So when you f.puts "Hello World" you're calling puts on the IO object which then writes to the file.
The 'a' is just a file open mode, like you'd see in C / C++. It means append, and is relatively uncommon - you're more likely to be familiar with 'r' (read), 'w' (write), etc.
The {|f| ... } bit is the exciting part. It's called a block - they're everywhere, and they're probably my favourite part of Ruby - I've gone back to C++ recently, and I find myself cursing the language for not supporting them all the time.
Think of code like foo(bar) {|baz| ... } as creating a nameless function, and passing that function as another (hidden) argument to foo (kinda like this is a hidden argument to member functions in C++) - it's just not as hidden, 'cause you specify it right there.
Now, when you pass the block to foo, it will eventually call your block (using the yield statement), and it will supply the argument baz. If my foo behaved like your File.open function, its definition would look something like this:
def foo(filename, &block)
  file = File.open(filename)
  yield(file)
  file.close
end
You can see how it opens the file, passes it to your block with yield, and then closes the file once your block returns. Very convenient - blocks are your friends!
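One detail the simplified foo leaves out: if the block raises, file.close is never reached, whereas File.open itself closes the file even then. A sketch of how that can be done with ensure follows; with_file is a hypothetical name, and this is an illustration rather than Ruby's actual implementation.

```ruby
require 'tempfile'

# Hypothetical helper mirroring foo above, with ensure so the file is closed
# even if the block raises.
def with_file(filename)
  file = File.open(filename)
  yield(file)
ensure
  file.close if file
end

sample = Tempfile.new('demo')
sample.puts 'Hello, world.'
sample.close

seen = nil
first_line = with_file(sample.path) { |f| seen = f; f.gets }
closed_after_normal_return = seen.closed?

begin
  with_file(sample.path) { |f| seen = f; raise 'boom' }
rescue RuntimeError
  closed_after_error = seen.closed?
end
sample.unlink
```

The method still returns whatever the block returns, and the handle ends up closed on both the normal and the error path.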
Another good place to start wrapping your head around them is the each function - one of the simplest and most common block functions in Ruby:
[holt@Michaela ~]$ irb
irb(main):001:0> ['Welcome', 'to', 'Ruby!'].each {|word| puts word}
Welcome
to
Ruby!
=> ["Welcome", "to", "Ruby!"]
irb(main):002:0>
This time, your block gets called three times, and each time a different array element gets yielded to your block as word - it's a super-simple way to call a function for every element of an array.
Hope this helps, and welcome to Ruby!
'a' -> Mode in which to open the file ('append' mode)
f is a parameter to the block. A block is a piece of code that can be executed later by the method it is passed to (it can be captured as a Proc object).
Here, f will be the File object for the opened file.
1) You call the open method, passing in the two arguments:
myfile.out <-- This is your file that you want to access
a <-- you are stating that you want to write to a file, starting at the end of the file(aka append)
2) The open method that exists in Kernel yields an IO stream object, aka |f|, which you can access throughout your block.
3) You are appending "hello world" to myfile.out
4) Once the block ends, the IO stream closes.
The 'a', which stands for append, opens the file in write-only mode and starts writing at the end of the file. If no file exists, a new file is created. Please see the Ruby Docs for more information.
The |f| is a block parameter, which is being passed within the {}. For more information on blocks, please see The Pragmatic Programmer's Guide.
I would highly suggest reading through the help file for the File class for starters.
You can see there the documentation for the open method.
The method signature is File.open(filename, mode)
So, in your example, 'a' is the mode, which in this case is append. Here's a list of valid values for the mode argument:
'r' - Open a file for reading. The file must exist.
'w' - Create an empty file for writing. If a file with the same name already exists its content is erased and the file is treated as a new empty file.
'a' - Append to a file. Writing operations append data at the end of the file. The file is created if it does not exist.
'r+' - Open a file for update both reading and writing. The file must exist.
'w+' - Create an empty file for both reading and writing. If a file with the same name already exists its content is erased and the file is treated as a new empty file.
'a+' - Open a file for reading and appending. All writing operations are performed at the end of the file, protecting the previous content from being overwritten. You can reposition (fseek, rewind) the internal pointer to anywhere in the file for reading, but writing operations will move it back to the end of file. The file is created if it does not exist.
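The modes listed above are easy to try out. Here is a rough sketch using a scratch path in the system temp directory (the path and variable names are made up for the example):

```ruby
require 'tmpdir'

# Hypothetical scratch path so the example does not touch real files.
path = File.join(Dir.tmpdir, "modes_demo_#{Process.pid}.out")

File.open(path, 'w') { |f| f.puts 'first' }   # 'w' creates the file (or empties it)
File.open(path, 'a') { |f| f.puts 'second' }  # 'a' adds to the end
after_append = File.read(path)

File.open(path, 'w') { |f| f.puts 'third' }   # 'w' again discards the old content
after_truncate = File.read(path)

# 'r' requires the file to exist, so opening a missing path raises Errno::ENOENT.
File.delete(path)
missing_raises = begin
  File.open(path, 'r') { |f| f.read }
  false
rescue Errno::ENOENT
  true
end
```

After the append the file holds both lines; after the second 'w' only the last one is left.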
If File.open is used in a block, such as in your example, then f becomes the block variable that points to the newly-opened file, which allows you to both read and write to the file just using f as the reference, while within the block. Using this form of File.open is nice because it handles closing the file automatically when the block ends.
open('myfile.out', 'a') -> Here 'a' means write-only access, with the pointer positioned at the end of the file.
|f| is the opened file object; the block calls puts on it to write "Hello, world."
Instead of |f|, you can write anything, say |abc| or |line|, it doesn't matter.
