Ruby has two ways of referring to the standard input: The STDIN constant , and the $stdin global variable.
Aside from the fact that I can assign a different IO object to $stdin because it's not a constant (e.g. before forking to redirect IO in my children), what's the difference between STDIN and $stdin? When should I use each in my code?
If I reassign $stdin, does it affect STDIN?
And does this also apply to STDOUT/$stdout and STDER/$stderr?
If $stdin is reassigned, STDIN is not affected. Likewise $stdin is not affected when STDIN is reassigned (which is perfectly possible (though pointless), but will produce a warning). However if neither variable has been reassigned, they both point to the same IO object, so calling reopen¹ on one will affect the other.
All the built-in ruby methods use $< (a.k.a. ARGF) to read input. If ARGV is empty, ARGF reads from $stdin, so if you reassign $stdin, that will affect all built-in methods. If you reassign STDIN it will have no effect unless some 3rd party method uses STDIN.
In your own code you should use $stdin to be consistent with the built-in methods².
¹ reopen is a method which can redirect an IO object to another stream or file. However you can't use it to redirect an IO to a StringIO, so it does not eliminate all uses cases of reassigning $stdin.
² You may of course also use $</ARGF to be even more consistent with the built-in methods, but most of the time you don't want the ARGF behavior if you're explicitly using the stdin stream.
STDERR and $stderr are pointing to the same thing initially; you can reassign the global variable but you shouldn't mess with the constant. $stdin and STDIN, $stdout and STDOUT pairs are likewise.
I had to change STDERR a couple of times as an alternative to monkey-patching some gems outputting error messages with STDERR.puts. If you reassign with STDERR = $stdout you get a warning while STDERR.reopen('nul', 'w') goes without saying.
Related
Here is a simple ruby script, that takes input from a user and provides an output(yes, it will be refactored). I wanted this script to provide output into a text file, rather than console window. That was accomplished, by simply adding $stdout = File.new('out.txt', 'w'), but I thought that this line will just describe a variable which I will use later to tell script to use it to write output into created file.
I cant find much documentation about this method and wondering how does this program knows how to write generated output into that file?
$stdout is a global variable. By default it stores an object of type IO associated with the standard output of the program (which is, by default, the console).
puts is a method of the Kernel module that actually calls $stdout.send() and pass it the list of arguments it receives. As the documentation explains, puts(obj, ...) is equivalent to $stdout.puts(obj, ...).
Your code replaces $stdout with an object of type File that extends class IO. When it is created, your object opens the file out.txt for writing and together with its inheritance from IO it is fully compatible with the default behaviour of $stdout.
Since by default, all the output goes to $stdout, your new definition of $stdout ensures the output is written to the file out.txt without other changes in the code.
$stdout is a global variable (as indicated by $) and according to the documentation, puts is:
Equivalent to
$stdout.puts(obj, ...)
If you assign another object to $stdout, then Kernel#puts will simply send puts to that object. Likewise, print will send write:
class Foo < BasicObject
def puts(*args)
::STDOUT.puts "Foo#puts called with #{args.inspect}"
end
def write(*args)
::STDOUT.puts "Foo#write called with #{args.inspect}"
end
end
$stdout = Foo.new
puts 'hello', 'world'
# Foo#puts called with ["hello", "world"]
print "\n"
# Foo#write called with ["\n"]
Note that if you assign to $stdout, Ruby checks whether the object responds to write. If not, a TypeError will be raised.
It's probably worth highlighting the fact, mentioned by the other posts, that this is a global variable, and modifying it is not thread safe. For instance, if you point $stdout to a new IO stream, anything across the app (on that process thread) that would be logged to stdout will now be logged to your new IO stream. This can lead to multiple streams of unexpected and possibly sensitive input landing in your new IO stream.
I wonder if there is any real difference between STDIN and $stdin. I do in irb:
STDIN == $stdin
and get back true. Are they just two names for the same thing? Or is there some difference?
From Ruby globals:
STDIN
The standard input. The default value for $stdin.
They are the same object by default.
[1] pry(main)> $stdin.object_id
=> 13338048
[2] pry(main)> STDIN.object_id
=> 13338048
[3] pry(main)> $stdin.object_id == STDIN.object_id
=> true
As #shivam commented, $stdin is a global variable and it may be assigned to something different, while STDIN is a constant.
STDIN is a constant and therefore you'll get a ruby warning if you try to replace it. Otherwise the two are just normal ruby variables in that they can point to the same object (and are by default) and if they do, doing something with one will affect the other variable, but if you assign something else to one of the variables, they will be different.
Standard ruby methods like get will read from $stdin (not STDIN) by default. That means you can override $stdin ($stdout, $stderr) for standard methods and use the constant versions to see what $stdin, $stdout or $stderr originaly were.
Note that overriding $stdin, $stdout, or $stderr won't affect the standard streams of newly spawned programs (the actual filedescriptors 0, 1, and 2 respectively). To do that you'd need to call IO#reopen on the stream you'd want to change, e.g. (assuming the constant version hasn't been forcibly replaced),
STDOUT.reopen("newfile") #Write all output to "newfile" including the output of newly spawned processes (`%x{}`,`system`, `spawn`, `IO.popen`, etc.)
Now with reopen, you can replace the streams only to actual OS-level files/file descriptors (e.g., no StringIO), but if you're on UNIX, there's not much you can't do with OS-level files (you can change them to pipes which you can read elsewhere in your program, for example).
I've been trying to understand the differences between the two, but as of now I have a very rudimentary understanding. ARGV is a subset of IO I believe. I know that ARGV returns an array when called into a Ruby script, but IO does the same thing as well. Can anyone explain this topic to me or direct me to a good explanation? I've searched multiple blogs but to no avail.
Thanks!
ARGV is an array:
> ARGV.class
=> Array
This array contains the command line arguments for your script. For example:
$ cat pancakes.rb
puts ARGV.inspect
$ ruby pancakes.rb -where is house
["-where", "is", "house"]
IO is quite a bit different. IO is:
[...] the basis for all input and output in Ruby. [...]
Many of the examples in this section use the File class, the only standard subclass of IO. The two classes are closely associated. Like the File class, the Socket library subclasses from IO (such as TCPSocket or UDPSocket).
IO is the base class for file-like things in Ruby.
Perhaps you're thinking of ARGF rather than ARGV:
ARGF is a stream designed for use in scripts that process files given as command-line arguments or passed in via STDIN.
The arguments passed to your script are stored in the ARGV Array, one argument per element. ARGF assumes that any arguments that aren't filenames have been removed from ARGV.
[...]
If ARGV is empty, ARGF acts as if it contained STDIN, i.e. the data piped to your script.
So you can use ARGF like an IO that lets you say:
$ your_script some_file
and
$ some_command | your_script
without your_script really having to care about which way it is called.
The IO class is the basis for all input and output in Ruby. An I/O stream may be duplexed (that is, bidirectional), and so may use more than one native operating system stream.
Many of the examples in this section use the File class, the only
standard subclass of IO. The two classes are closely associated. Like
the File class, the Socket library subclasses from IO (such as
TCPSocket or UDPSocket).
The Kernel#open method can create an IO (or File) object for these
types of arguments:
A plain string represents a filename suitable for the underlying
operating system. A string starting with "|" indicates a subprocess.
The remainder of the string following the "|" is invoked as a process
with appropriate input/output channels connected to it. A string equal
to "|-" will create another Ruby instance as a subprocess. The IO may
be opened with different file modes (read-only, write-only) and
encodings for proper conversion. See ::new for these options. See
Kernel#open for details of the various command formats described
above.
Source: http://www.ruby-doc.org/core-2.1.2/IO.html
ARGF is a stream designed for use in scripts that process files given as command-line arguments or
passed in via STDIN.
The arguments passed to your script are stored in the ARGV Array, one
argument per element. ARGF assumes that any arguments that aren't
filenames have been removed from ARGV.
argv → ARGV Returns the ARGV array, which contains the arguments
passed to your script, one per element.
Source: http://www.ruby-doc.org/core-2.1.2/ARGF.html
I've often read that Ruby is a pure object oriented language since commands are typically given as messages passed to the object.
For example:
In Ruby one writes: "A".ord to get the ascii code for A and 0x41.chr to emit the character given its ascii code.
This is in contrast to Python's: ord("A") and chr(0x41)
So far so good --- Ruby's syntax is message passing.
But the apparent inconsistency appears when considering the string output command:
Now one has: puts str or puts(str) instead of str.puts
Given the pure object orientation expectation for Ruby's syntax, I would have expected the output command to be a message passed to the string object, i.e. calling a method from the string class, hence str.puts
Any explanations? Am I missing something?
Thanks
I would have expected the output command to be a message passed to the string object, i.e. calling a method from the string class, hence str.puts
This is incorrect expectation, let's start with that. Why would you tell a string to puts itself? What would it print itself to? It knows nothing (and should know nothing) of files, I/O streams, sockets and other places you can print things to.
When you say puts str, it's actually seen as self.puts str (implicit receiver). That is, the message is sent to the current object.
Now, all objects include Kernel module. Therefore, all objects have Kernel#puts in their lists of methods. Any object can puts (including current object, self).
As the doc says,
puts str
is translated to
$stdout.puts str
That is, by default, the implementation is delegated to standard output (print to console). If you want to print to a file or a socket, you have to invoke puts on an instance of file or socket classes. This is totally OO.
Ruby isn't entirely OO (for example, methods are not objects), but in this case, it is. puts is Kernel#puts, which is shorthand for $stdout.puts. That is, you're calling the puts method of the $stdout stream and passing a string as the parameter to be output to the stream. So, when you call
puts "foo"
You're really calling:
$stdout.puts("foo")
Which is entirely consistent with OO.
puts is a method on an output streams e.g.
$stdout.puts("this", "is", "a", "test")
Printing something to somewhere at least involves two things: what is written and where it is written to. Depending on what you focus on, there can be different implementations, even in OOP. Besides that, Ruby has a way to make a method look more like a function (i.e., not being particularly tied to a receiver as in OOP) for methods that are used all over the place. So there are at least three logical options that could be thought of for such methods like printing.
An OOP method defined on the object to be printed
An OOP method defined on the object where it should be printed
A function-style method
For the second option, IO#write is one example; The receiver is the destination of writing.
The puts without an explicit receiver is actually Kernel#puts, and takes neither of the two as the arguments; it is an example of the third option; you are correct to point out that this is not so OOP, but Matz especially provided the Kernel module to be able to do things like this: a function-style method.
The first option is what you are expecting; it is nothing wrong. It happens that there is no well known method of this type, but it was proposed in the Ruby core by one of the developers, but unfortunately, it did not make it. Actually, I felt the same thing as you, and have something similar in my personal library called Object#intercept. A simplified version is this:
class Object
def intercept
tap{|x| p x}
end
end
:foo.intercept # => :foo
You can replace p with puts if you want.
I'm working on a ruby script that ultimately starts up a system process that takes quite a while. I need to read from the stderr of this process and react to it depending on what is output.
I'm currently doing it as such:
Open3.popen3(cmd_to_run) do |stdin, stdout, stderr, waitthread|
stderr.each_line do |line|
# look out for specific lines and react to them accordingly
end
end
But I've also seen implementations to achieve something similar but doing it with kernel#select:
Open3.popen3(cmd_to_run) do |stdin, stdout, stderr, waitthread|
io = select([stderr], nil, nil, 30)
if io.nil?
log("Command timed out during Kernel#select")
return
end
io[0][0].each_line do |line|
# look out for specific lines and react to them accordingly
end
end
I've read the pickaxe description of what select does, but I'm confused as to why I should (of if I should) use it? The first method works just the same.
Probably two reasons:
You can use timeout, which you can't with each_line
You can wait for more than one IO object, e. g. io = select([stdout, stderr]) and more than one event (e.g. write event or exception too)