What do 'a' and |f| mean below?
open('myfile.out', 'a') { |f|
f.puts "Hello, world."
}
From the ruby IO doc:
"a" | Write-only, starts at end of file if file exists,
| otherwise creates a new file for writing.
The |f| is a variable that holds the IO object inside the block (everything in the {}). So when you call f.puts "Hello, world." you're calling puts on the IO object, which then writes to the file.
The 'a' is just a file open mode, like you'd see in C / C++. It means append, and is relatively uncommon - you're more likely to be familiar with 'r' (read), 'w' (write), etc.
The {|f| ... } bit is the exciting part. It's called a block - they're everywhere, and they're probably my favourite part of Ruby - I've gone back to C++ recently, and I find myself cursing the language for not supporting them all the time.
Think of code like foo(bar) {|baz| ... } as creating a nameless function, and passing that function as another (hidden) argument to foo (kinda like this is a hidden argument to member functions in C++) - it's just not as hidden, 'cause you specify it right there.
Now, when you pass the block to foo, it will eventually call your block (using the yield statement), and it will supply the argument baz. If my foo behaved like your File.open function, its definition would look something like this:
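def foo(filename, &block)
  file = File.open(filename)
  yield(file)          # hand the open file to the caller's block
ensure
  file.close if file   # close the file even if the block raises
end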
You can see how it opens the file, passes it to your block with yield, and then closes the file once your block returns. Very convenient - blocks are your friends!
Another good place to start wrapping your head around them is the each function - one of the simplest and most common block functions in Ruby:
[holt@Michaela ~]$ irb
irb(main):001:0> ['Welcome', 'to', 'Ruby!'].each {|word| puts word}
Welcome
to
Ruby!
=> ["Welcome", "to", "Ruby!"]
irb(main):002:0>
This time, your block gets called three times, and each time a different array element gets yielded to your block as word - it's a super-simple way to call a function for every element of an array.
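To make the yield mechanics concrete, here is a minimal sketch of how an each-style method could be written by hand. The name each_item is hypothetical (it is not part of Ruby); it only illustrates that the method calls yield once per element and the block runs each time.

```ruby
# Hypothetical helper (not part of Ruby): yields each element in turn,
# roughly the way Array#each does.
def each_item(items)
  index = 0
  while index < items.length
    yield items[index]   # run the caller's block with the current element
    index += 1
  end
  items                  # like Array#each, return the original collection
end

each_item(['Welcome', 'to', 'Ruby!']) { |word| puts word }
```

In real code you would simply call each; the sketch is only there to show where the block's argument comes from.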
Hope this helps, and welcome to Ruby!
'a' -> Mode in which to open the file ('append' mode)
f is a parameter to the block. A block is a piece of code that can be executed (it can be converted to a Proc object underneath).
Here, f will be the File object for the opened file (an IO object, not a bare integer file descriptor).
1) You call the open method, passing in the two arguments:
myfile.out <-- This is your file that you want to access
a <-- you are stating that you want to write to a file, starting at the end of the file(aka append)
2) The open method, defined in Kernel, yields an IO stream object (the |f|), which you can access throughout your block.
3) You are appending "Hello, world." to myfile.out
4) Once the block ends, the IO stream closes.
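The four steps above can be sketched as follows. The helper name append_line and the temporary directory are assumptions made for illustration, and File.open is used in place of the bare open call from the question.

```ruby
require 'tmpdir'

# Hypothetical helper: appends one line to the file at `path` and returns
# the File object so the caller can inspect it after the block has ended.
def append_line(path, text)
  handle = nil
  File.open(path, 'a') do |f|   # steps 1 and 2: open in append mode, yield f
    f.puts text                 # step 3: write at the end of the file
    handle = f
  end                           # step 4: the file is closed here
  handle
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'myfile.out')
  append_line(path, 'Hello, world.')
  f = append_line(path, 'Hello again.')
  puts File.read(path)   # both lines, in the order they were appended
  puts f.closed?         # true: the block form closed the file
end
```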
The 'a', which stands for append, opens the file in write-only mode and starts writing at the end of the file. If no file exists, a new file is created. Please see the Ruby Docs for more information.
The |f| is a block parameter, which is being passed within the {}. For more information on blocks, please see The Pragmatic Programmer's Guide.
I would highly suggest reading through the help file for the File class for starters.
You can see there the documentation for the open method.
The method signature is File.open(filename, mode)
So, in your example, 'a' is the mode, which in this case means append. Here's a list of valid values for the mode argument:
'r' - Open a file for reading. The file must exist.
'w' - Create an empty file for writing. If a file with the same name already exists its content is erased and the file is treated as a new empty file.
'a' - Append to a file. Writing operations append data at the end of the file. The file is created if it does not exist.
'r+' - Open a file for update (both reading and writing). The file must exist.
'w+' - Create an empty file for both reading and writing. If a file with the same name already exists its content is erased and the file is treated as a new empty file.
'a+' - Open a file for reading and appending. All writing operations are performed at the end of the file, protecting the previous content from being overwritten. You can reposition (seek, rewind) the internal pointer to anywhere in the file for reading, but writing operations will move it back to the end of the file. The file is created if it does not exist.
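A small sketch of how 'w' and 'a' differ in practice. The helper name write_with_mode and the temporary file are hypothetical and exist only for this illustration.

```ruby
require 'tmpdir'

# Hypothetical helper: writes `text` using the given mode and returns the
# resulting file contents.
def write_with_mode(path, mode, text)
  File.open(path, mode) { |f| f.write(text) }
  File.read(path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'modes.txt')
  p write_with_mode(path, 'w', "first\n")    # "first\n"
  p write_with_mode(path, 'a', "second\n")   # "first\nsecond\n" ('a' keeps old data)
  p write_with_mode(path, 'w', "third\n")    # "third\n" ('w' truncates first)
end
```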
If File.open is given a block, as in your example, then f is the block variable that refers to the newly opened file. Within the block you can read from or write to the file through f, as far as the mode allows ('a' is write-only). Using this form of File.open is nice because it closes the file automatically when the block ends.
open('myfile.out', 'a') -> Here 'a' means Write only access. Pointer is positioned at end of file.
|f| is the block parameter holding the opened File object; calling puts on it writes "Hello, world." to the file.
Instead of |f|, you can write anything, say |abc| or |line|, it doesn't matter.
Related
What is the difference between doing:
file = open('myurl')
# Do stuff with file
And doing:
open('myurl') do |file|
# Do things with file
end
Do I need to close and remove the file when I am not using the block approach? If so, how do I close and remove it? I don't see any close/remove method in the docs
The documentation for OpenURI is a little opaque to beginners, but the docs for #open can be found here.
Those docs say:
#open returns an IO-like object if block is not given. Otherwise it yields the IO object and return the value of the block.
The key words here are "IO-like object." We can infer from that that the object (in your examples, file), will respond to the #close method.
While the documentation doesn't say so, by looking at the source we can see that #open will return either a StringIO or a Tempfile object, depending on the size of the data returned. OpenURI's internal Buffer class first initializes a StringIO object, but if the size of the output exceeds 10,240 bytes it creates a Tempfile and writes the data to it (to avoid storing large amounts of data in memory). Both StringIO and Tempfile have behavior consistent with IO, so it's good practice (when not passing a block to #open), to call #close on the object in an ensure:
begin
  file = open(url)
  # ...do some work...
ensure
  file.close if file   # file is still nil if open itself raised
end
Code in the ensure section always runs, even if code between begin and ensure raises an exception, so this will, well, ensure that file.close gets called even if an error occurs.
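The same begin/ensure pattern can be wrapped in a small helper. This is only a sketch: use_and_close is a hypothetical name, and the StringIO and Tempfile objects are created directly here to stand in for what #open returns, so no network access is needed. (On current Ruby versions the OpenURI call itself is written URI.open rather than a bare open.)

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: yields an IO-like object and always closes it
# afterwards, even if the block raises.
def use_and_close(io)
  yield io
ensure
  io.close unless io.closed?
end

small = StringIO.new('short response')   # stand-in for a small download
large = Tempfile.new('large-response')   # stand-in for a large download
large.write('x' * 20_000)
large.rewind

puts use_and_close(small) { |io| io.read.length }   # 14
puts use_and_close(large) { |io| io.read.length }   # 20000
puts small.closed? && large.closed?                 # true
large.unlink   # remove the temporary file from disk
```

As for removing the file: a Tempfile responds to unlink (and to close!, which closes and unlinks in one call), while a StringIO lives only in memory and has nothing to remove.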
For exercise 17, through searching other responses I was able to condense the following into one line (as asked in the extra credit #3)
from_file, to_file = ARGV
script = $0
input = File.open(from_file)
indata = input.read()
output = File.open(to_file, 'w')
output.write(indata)
output.close()
input.close()
I was able to condense it into:
from_file, to_file = ARGV
script = $0
File.open(to_file, 'w') {|f| f.write IO.read(from_file)}
Is there a better/different way to condense this into 1 line?
Can someone help explain the line I created? I created this from various questions/answers unrelated to this question. I have tried looking up exactly what I did but I am still a little lost and want a full understanding of it.
Similar to using IO::read to simplify "just read the whole file into a string", you can use IO::write to "just write the string to the file":
from_file, to_file = ARGV
IO.write(to_file, IO.read(from_file))
Since you don't use script, it can be removed. If you really want to get things down to one line, you can do:
IO.write(ARGV[1], IO.read(ARGV[0]))
I personally find this just as comprehensible, and the lack of error checking is equivalent.
You're using File.open with a block to open to_file in write-only mode ('w'). Inside the block you have access to the open file as f, and the file will be closed for you when the block terminates. IO::read reads the entire contents of from_file, which you then pass to IO#write on f (File is a subclass of IO), writing those contents to f (which is the open, write-only File for to_file).
There are always different ways of doing things:
Using File.open with a block is a good approach here. I like that to_file and from_file are declared in variables. So I think this is a good and readable solution that is not overly verbose.
The basic approach here is swapping out the open/close operations for the cleaner File.open method with a block. File.open with a block will open a file, run the block, and then close the file, which is exactly what is needed here. Because the method automatically opens and closes the file, we are able to remove the boilerplate code that appears in the initial example. IO.read is another shortcut method that allows us to open/read/close the file without all of the open/close boilerplate. This is an exercise to learn more about Ruby's standard File/IO library, and in this case swapping out the more verbose methods is sufficient to reduce things to a single line.
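For comparison, here is a sketch of three ways to copy a file, run against a temporary directory. The method names are hypothetical, and FileUtils.cp (from the standard library) is not mentioned in the answers above; it is included only as one more option.

```ruby
require 'fileutils'
require 'tmpdir'

# Block form: the destination is closed when the block ends.
def copy_with_block(from_file, to_file)
  File.open(to_file, 'w') { |f| f.write(File.read(from_file)) }
end

# Shortcut form: File.write opens, writes, and closes in one call.
def copy_with_write(from_file, to_file)
  File.write(to_file, File.read(from_file))
end

# Standard-library form: FileUtils.cp copies without an explicit read/write.
def copy_with_fileutils(from_file, to_file)
  FileUtils.cp(from_file, to_file)
end

Dir.mktmpdir do |dir|
  source = File.join(dir, 'from.txt')
  File.write(source, "some data\n")
  %i[copy_with_block copy_with_write copy_with_fileutils].each do |name|
    target = File.join(dir, "#{name}.txt")
    send(name, source, target)
    puts "#{name}: #{File.read(target) == File.read(source)}"
  end
end
```

The first two read the whole file into memory, which is fine for an exercise but worth keeping in mind for very large files.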
I'm just a complete beginner, but this works for me:
open(ARGV[1], 'w').write(open(ARGV[0]).read)
It doesn't look elegant to me, but it works.
Edit: This is my attempt to put the entire script into one line if it's not clear.
Sometimes I find myself needing to open a file, read its content, do some functional manipulation, and store the data in a variable. This ends up as the following line of code:
@some_vars = File.open("items.txt").read.chomp!.split(',')
I have two questions here:
Is the File instance returned by File.open() closed after this line?
How can I close such a File instance without sacrificing readability?
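No, File.open leaves the file handle open. You should use IO.read instead, which returns the entire contents of the file and closes it when it's done. Note that chomp is safer than chomp! here, because chomp! returns nil when there is nothing to remove, and calling split on nil raises NoMethodError:
IO.read("items.txt").chomp.split(',')
This is a bit shorter for one-liners than passing a block to File.open.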
The example you posted will not close the file descriptor automatically. You would have to manually call File#close on the descriptor, or let Ruby close the file automatically when the interpreter exits.
If you want to automatically close a file, you need the File.open block syntax:
File.open('items.txt') { |f| f.read.chomp.split(',') }
Ruby will then close the file whenever the block terminates.
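A quick way to check both claims is to ask the File object whether it is closed. The sketch below assumes a throwaway items.txt in a temporary directory; read_items is a hypothetical helper name.

```ruby
require 'tmpdir'

# Hypothetical helper: parses a comma-separated file using the block form and
# also reports whether the File was closed once the block had finished.
def read_items(path)
  handle = nil
  items = File.open(path) do |f|
    handle = f
    f.read.chomp.split(',')
  end
  [items, handle.closed?]
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'items.txt')
  File.write(path, "a,b,c\n")

  leaked = File.open(path)   # no block: stays open until closed explicitly
  puts leaked.closed?        # false
  leaked.close

  items, closed = read_items(path)
  p items                    # ["a", "b", "c"]
  puts closed                # true
end
```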
Even using a block with File.open doesn't guarantee that the file will always be closed, for example when the application is suspended.
There's a trick to protect against this kind of situation.
f = File.open('items.txt', 'w')
at_exit { f.close unless f.closed? }   # close also flushes buffered data
The at_exit block will then be executed when the program exits (though not if the process is killed outright, for example with SIGKILL).
What This Question Is Not About
This question is not about how to auto-close a file with File#close or the File.open block syntax. It's a question about where Ruby stores its list of open file descriptors at runtime.
The Actual Question
If you have a program with open descriptors, but you don't have access to the related File or IO object, how can you find a reference to the currently-open file descriptors? Take this example:
filename='/tmp/foo'
%x( touch "#{filename}" )
File.open(filename)
filehandle = File.open(filename)
The first File instance is opened, but the reference to the object is not stored in a variable. The second instance is stored in filehandle, where I can easily access it with #inspect or #close.
However, the discarded File object isn't gone; it's just not accessible in any obvious way. Until the object is finalized, Ruby must be keeping track of it somewhere...but where?
TL; DR
All File and IO objects can be reached through ObjectSpace.
Answer
The ObjectSpace documentation says:
The ObjectSpace module contains a number of routines that interact with the garbage collection facility and allow you to traverse all living objects with an iterator.
How I Tested This
I tested this at the console on Ruby 1.9.3p194.
The test fixture is really simple. The idea is to have two File objects with different object identities, but only one is directly accessible through a variable. The other is "out there somewhere."
# Don't save a reference to the first object.
filename='/tmp/foo'
File.open(filename)
filehandle = File.open(filename)
I then explored different ways I could interact with the File objects even if I didn't use an explicit object reference. This was surprisingly easy once I knew about ObjectSpace.
# List all open File objects.
ObjectSpace.each_object(File) do |f|
puts "%s: %d" % [f.path, f.fileno] unless f.closed?
end
# List the "dangling" File object which we didn't store in a variable.
ObjectSpace.each_object(File) do |f|
unless f.closed?
printf "%s: %d\n", f.path, f.fileno unless f === filehandle
end
end
# Close any dangling File objects. Ignore already-closed files, and leave
# the "accessible" object stored in *filehandle* alone.
ObjectSpace.each_object(File) {|f| f.close unless f === filehandle rescue nil}
Conclusion
There may be other ways to do this, but this is the answer I came up with to scratch my own itch. If you know a better way, please post another answer. The world will be a better place for it.
Take a line of code like the following:
sites = YAML::load(File.open(SITESPATH))
Is it necessary to change it to
File.open(SITESPATH) do |file|
sites = YAML::load(file)
end
just to make sure that the file gets closed?
Yes, you should close the file, so your second example is the correct one.
Just as a side-note, remember that the sites variable will not be visible outside the block, unless you already created it before the block.
Because File.open, when called with a block, returns the value of the block, you may use:
sites = File.open(SITESPATH) {|file| YAML::load(file) }
You could use YAML.load_file(filename) instead.
This isn't really about YAML::load so much as it is about File/IO streams generally.
Called without a block, File.open is exactly the same as File.new. This doesn't close a file on its own, so you would need to close it yourself.
From the documentation:
With no associated block, [File.]open is a synonym for IO.new. If the
optional code block is given, it will be passed io as an argument,
and the IO object will automatically be closed when the block
terminates. In this instance, IO.open returns the value of the block.
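To see the documented behaviour in action, here is a sketch that loads the same YAML file two ways. The method names and the sample sites.yml content are made up for illustration.

```ruby
require 'yaml'
require 'tmpdir'

# Block form: File.open returns the block's value, so the parsed data can be
# returned directly, and the file is closed when the block ends.
def load_sites_with_block(path)
  File.open(path) { |file| YAML.load(file) }
end

# Shortcut: YAML.load_file opens, parses, and closes the file itself.
def load_sites_with_load_file(path)
  YAML.load_file(path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'sites.yml')
  File.write(path, "- example.com\n- example.org\n")
  p load_sites_with_block(path)       # ["example.com", "example.org"]
  p load_sites_with_load_file(path)   # ["example.com", "example.org"]
end
```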