Ruby: What's the difference between IO and ARGV? - ruby

I've been trying to understand the differences between the two, but as of now I have a very rudimentary understanding. ARGV is a subset of IO I believe. I know that ARGV returns an array when called into a Ruby script, but IO does the same thing as well. Can anyone explain this topic to me or direct me to a good explanation? I've searched multiple blogs but to no avail.
Thanks!

ARGV is an array:
> ARGV.class
=> Array
This array contains the command line arguments for your script. For example:
$ cat pancakes.rb
puts ARGV.inspect
$ ruby pancakes.rb -where is house
["-where", "is", "house"]
IO is quite a bit different. IO is:
[...] the basis for all input and output in Ruby. [...]
Many of the examples in this section use the File class, the only standard subclass of IO. The two classes are closely associated. Like the File class, the Socket library subclasses from IO (such as TCPSocket or UDPSocket).
IO is the base class for file-like things in Ruby.
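The relationship between these names can be checked directly. The following is a minimal sketch; the method name hierarchy_report is made up for illustration and is not part of Ruby.

```ruby
require "socket"

# Hypothetical helper (not part of Ruby): collects a few class checks
# that show how ARGV, ARGF, and the IO family relate to each other.
def hierarchy_report
  {
    argv_class:     ARGV.class,                       # Array
    argv_is_io:     ARGV.is_a?(IO),                   # false, it is a plain Array
    file_is_io:     File.ancestors.include?(IO),      # true
    socket_is_io:   TCPSocket.ancestors.include?(IO), # true
    stdin_is_io:    STDIN.is_a?(IO),                  # true
    argf_is_io:     ARGF.is_a?(IO),                   # false, it only behaves like a stream
    argf_each_line: ARGF.respond_to?(:each_line)      # true
  }
end

hierarchy_report.each { |name, value| puts "#{name}: #{value}" }
```

So ARGV is not a subset of IO at all: it is an ordinary Array of strings, while File, sockets, and STDIN are IO objects.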
Perhaps you're thinking of ARGF rather than ARGV:
ARGF is a stream designed for use in scripts that process files given as command-line arguments or passed in via STDIN.
The arguments passed to your script are stored in the ARGV Array, one argument per element. ARGF assumes that any arguments that aren't filenames have been removed from ARGV.
[...]
If ARGV is empty, ARGF acts as if it contained STDIN, i.e. the data piped to your script.
So you can use ARGF like an IO that lets you say:
$ your_script some_file
and
$ some_command | your_script
without your_script really having to care about which way it is called.
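As a rough sketch of that idea, the code below counts lines from whatever source it is handed. The helper name count_lines and the temporary file are assumptions made for the demo; the file stands in for some_file, and ARGV is filled in by hand to simulate the first way of calling the script.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper: counts lines from any stream-like source
# (a File, STDIN, ARGF, or a StringIO all respond to each_line).
def count_lines(source)
  source.each_line.count
end

# Simulate `ruby your_script some_file` by placing a filename in ARGV.
# ARGF consumes the filenames in ARGV the first time it is read.
DEMO_LINE_COUNT = Tempfile.create("argf_demo") do |file|
  file.write("first\nsecond\nthird\n")
  file.flush
  ARGV.replace([file.path])
  count_lines(ARGF)
end

puts DEMO_LINE_COUNT                                    # 3
puts count_lines(StringIO.new("only\ntwo lines\n"))     # 2
```

In a real script you would not touch ARGV; you would just call count_lines(ARGF) and let the caller decide between a filename and a pipe.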

The IO class is the basis for all input and output in Ruby. An I/O stream may be duplexed (that is, bidirectional), and so may use more than one native operating system stream.
Many of the examples in this section use the File class, the only
standard subclass of IO. The two classes are closely associated. Like
the File class, the Socket library subclasses from IO (such as
TCPSocket or UDPSocket).
The Kernel#open method can create an IO (or File) object for these
types of arguments:
A plain string represents a filename suitable for the underlying
operating system. A string starting with "|" indicates a subprocess.
The remainder of the string following the "|" is invoked as a process
with appropriate input/output channels connected to it. A string equal
to "|-" will create another Ruby instance as a subprocess. The IO may
be opened with different file modes (read-only, write-only) and
encodings for proper conversion. See ::new for these options. See
Kernel#open for details of the various command formats described
above.
Source: http://www.ruby-doc.org/core-2.1.2/IO.html
ARGF is a stream designed for use in scripts that process files given as command-line arguments or
passed in via STDIN.
The arguments passed to your script are stored in the ARGV Array, one
argument per element. ARGF assumes that any arguments that aren't
filenames have been removed from ARGV.
argv → ARGV Returns the ARGV array, which contains the arguments
passed to your script, one per element.
Source: http://www.ruby-doc.org/core-2.1.2/ARGF.html

Related

GO cobra: space separated values in StringArray flags

In Go's Cobra, a library for building CLIs, there are two flag types that accept multiple values. One of them is StringArray; when used as follows:
--flag=value1 --flag=value2
it yields an array ["value1", "value2"].
I am working on a drop-in replacement for a tool that expects somewhat more complex input:
--flag=valueA1 valueB1 --flag=valueA2 valueB2
The array it should yield would be ["valueA1 valueB1", "valueA2 valueB2"].
Is there a way in Cobra to parse the entire string up to the next flag and include it in the StringArray value, as above?
There isn't a built-in way in cobra to do this, because it would be ambiguous. For example, if there is also a sub-command named valueB1 or valueB2, it's not clear whether those should be executed as subcommands or interpreted as additional arguments to --flag.
The standard way to support input like this is to expect the input values to be quoted, and cobra supports that. E.g.:
--flag="valueA1 valueB1" --flag="valueA2 valueB2"

Where is Stat_t defined for plan9?

In the plan9 specific Go code for syscall, there is no Stat_t like with other GOOS. Where is Stat_t, or its equivalent defined?
TL;DR: It's the *syscall.Dir type. Read on for details.
The source for os.Stat on Plan9 is here. It calls dirstat, which is defined here. It feeds the return value of dirstat into fileInfoFromStat, which is defined in the same file here.
In the case of paths (as opposed to *File objects), dirstat just calls syscall.Stat, which is basically just a thin wrapper around stat. syscall.Stat expects a byte buffer to be able to write into. This buffer is processed a bit (see dirstat for details), and then fed into syscall.UnmarshalDir, which is where the magic happens. The documentation states that it "decodes a single 9P stat message" from a buffer and returns a *syscall.Dir.
dirstat then passes this *syscall.Dir to fileInfoFromStat, which is what processes it into a FileInfo. It's this *syscall.Dir value that is obtained through the Sys() method on the FileInfo object.

Difference between `STDIN` and `$stdin`

I wonder if there is any real difference between STDIN and $stdin. I do in irb:
STDIN == $stdin
and get back true. Are they just two names for the same thing? Or is there some difference?
From Ruby globals:
STDIN
The standard input. The default value for $stdin.
They are the same object by default.
[1] pry(main)> $stdin.object_id
=> 13338048
[2] pry(main)> STDIN.object_id
=> 13338048
[3] pry(main)> $stdin.object_id == STDIN.object_id
=> true
As @shivam commented, $stdin is a global variable and may be assigned to something different, while STDIN is a constant.
STDIN is a constant and therefore you'll get a ruby warning if you try to replace it. Otherwise the two are just normal ruby variables in that they can point to the same object (and are by default) and if they do, doing something with one will affect the other variable, but if you assign something else to one of the variables, they will be different.
Standard Ruby methods like gets will read from $stdin (not STDIN) by default. That means you can override $stdin ($stdout, $stderr) for standard methods and use the constant versions to see what $stdin, $stdout or $stderr originally were.
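A small sketch of that override, assuming no filenames are left in ARGV (otherwise gets would read from those files instead). The helper name read_line_with_fake_stdin is invented for this example.

```ruby
require "stringio"

# Kernel#gets reads from the files named in ARGV first, so make sure
# there are none left; this is an assumption of the demo.
ARGV.clear

# Hypothetical helper: temporarily points $stdin at an in-memory string,
# reads one line with the ordinary Kernel#gets, then restores $stdin.
def read_line_with_fake_stdin(text)
  original = $stdin
  $stdin = StringIO.new(text)
  gets
ensure
  $stdin = original
end

line = read_line_with_fake_stdin("typed by nobody\n")
puts line.inspect            # "typed by nobody\n"
puts STDIN.is_a?(StringIO)   # false, the constant was never touched
```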
Note that overriding $stdin, $stdout, or $stderr won't affect the standard streams of newly spawned programs (the actual file descriptors 0, 1, and 2 respectively). To do that you'd need to call IO#reopen on the stream you want to change, e.g. (assuming the constant version hasn't been forcibly replaced),
STDOUT.reopen("newfile") #Write all output to "newfile" including the output of newly spawned processes (`%x{}`,`system`, `spawn`, `IO.popen`, etc.)
Now with reopen, you can replace the streams only to actual OS-level files/file descriptors (e.g., no StringIO), but if you're on UNIX, there's not much you can't do with OS-level files (you can change them to pipes which you can read elsewhere in your program, for example).
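To make the reopen behaviour concrete, here is a hedged sketch that redirects file descriptor 1 to a temporary file, runs a child process, and then restores the original stream. The helper name capture_fd1 and the echo command are assumptions chosen for the demo.

```ruby
require "tempfile"

# Hypothetical helper: redirects the real stdout (file descriptor 1) to a
# temporary file while the block runs, then restores it and returns
# everything that was written, including output from child processes.
def capture_fd1
  saved = STDOUT.dup          # keep a handle to the original stream
  STDOUT.flush
  Tempfile.create("reopen_demo") do |file|
    begin
      STDOUT.reopen(file.path, "w")
      yield
      STDOUT.flush
    ensure
      STDOUT.reopen(saved)    # point descriptor 1 back at the original
    end
    File.read(file.path)
  end
ensure
  saved.close if saved
end

captured = capture_fd1 do
  puts "from ruby"              # Kernel#puts, goes through $stdout
  system("echo from child")     # the child inherits descriptor 1
end

puts captured
```

Had the block reassigned $stdout instead of calling reopen, only the puts line would have been captured; the child's output would still have gone to the original stdout.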

Ruby as a "pure" object oriented language --- inconsistency with Ruby puts?

I've often read that Ruby is a pure object oriented language since commands are typically given as messages passed to the object.
For example:
In Ruby one writes: "A".ord to get the ascii code for A and 0x41.chr to emit the character given its ascii code.
This is in contrast to Python's: ord("A") and chr(0x41)
So far so good --- Ruby's syntax is message passing.
But the apparent inconsistency appears when considering the string output command:
Now one has: puts str or puts(str) instead of str.puts
Given the pure object orientation expectation for Ruby's syntax, I would have expected the output command to be a message passed to the string object, i.e. calling a method from the string class, hence str.puts
Any explanations? Am I missing something?
Thanks
I would have expected the output command to be a message passed to the string object, i.e. calling a method from the string class, hence str.puts
This is an incorrect expectation, let's start with that. Why would you tell a string to puts itself? What would it print itself to? It knows nothing (and should know nothing) of files, I/O streams, sockets and other places you can print things to.
When you say puts str, it's actually seen as self.puts str (implicit receiver). That is, the message is sent to the current object.
Now, all objects include the Kernel module. Therefore, all objects have Kernel#puts in their lists of methods. Any object can puts (including the current object, self).
As the doc says,
puts str
is translated to
$stdout.puts str
That is, by default, the implementation is delegated to standard output (print to console). If you want to print to a file or a socket, you have to invoke puts on an instance of file or socket classes. This is totally OO.
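The delegation can be observed with a short sketch. The helper with_captured_stdout is a made-up name; it swaps $stdout for a StringIO so that a bare puts can be inspected afterwards.

```ruby
require "stringio"

# Hypothetical helper: replaces $stdout with an in-memory buffer while the
# block runs and returns whatever a receiver-less puts wrote to it.
def with_captured_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

# Explicit receiver: the destination object receives the puts message.
buffer = StringIO.new
buffer.puts("sent to a StringIO")

# No receiver: Kernel#puts forwards to whatever $stdout currently is.
captured = with_captured_stdout { puts "sent via Kernel#puts" }

puts buffer.string.inspect   # "sent to a StringIO\n"
puts captured.inspect        # "sent via Kernel#puts\n"
```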
Ruby isn't entirely OO (for example, methods are not objects), but in this case, it is. puts is Kernel#puts, which is shorthand for $stdout.puts. That is, you're calling the puts method of the $stdout stream and passing a string as the parameter to be output to the stream. So, when you call
puts "foo"
You're really calling:
$stdout.puts("foo")
Which is entirely consistent with OO.
puts is a method on an output stream, e.g.
$stdout.puts("this", "is", "a", "test")
Printing something to somewhere at least involves two things: what is written and where it is written to. Depending on what you focus on, there can be different implementations, even in OOP. Besides that, Ruby has a way to make a method look more like a function (i.e., not being particularly tied to a receiver as in OOP) for methods that are used all over the place. So there are at least three logical options that could be thought of for such methods like printing.
An OOP method defined on the object to be printed
An OOP method defined on the object where it should be printed
A function-style method
For the second option, IO#write is one example; the receiver is the destination of the write.
The puts without an explicit receiver is actually Kernel#puts, and takes neither of the two as its receiver; it is an example of the third option. You are correct to point out that this is not so OOP, but Matz specifically provided the Kernel module to make things like this possible: a function-style method.
The first option is what you are expecting, and there is nothing wrong with it. It happens that there is no well-known method of this type. One was proposed for Ruby core by one of the developers, but unfortunately it did not make it in. Actually, I felt the same way as you, and have something similar in my personal library called Object#intercept. A simplified version is this:
class Object
  def intercept
    tap { |x| p x }
  end
end
:foo.intercept # => :foo
You can replace p with puts if you want.

What is the difference between STDIN and $stdin in Ruby?

Ruby has two ways of referring to the standard input: the STDIN constant and the $stdin global variable.
Aside from the fact that I can assign a different IO object to $stdin because it's not a constant (e.g. before forking to redirect IO in my children), what's the difference between STDIN and $stdin? When should I use each in my code?
If I reassign $stdin, does it affect STDIN?
And does this also apply to STDOUT/$stdout and STDERR/$stderr?
If $stdin is reassigned, STDIN is not affected. Likewise $stdin is not affected when STDIN is reassigned (which is perfectly possible (though pointless), but will produce a warning). However if neither variable has been reassigned, they both point to the same IO object, so calling reopen¹ on one will affect the other.
All the built-in ruby methods use $< (a.k.a. ARGF) to read input. If ARGV is empty, ARGF reads from $stdin, so if you reassign $stdin, that will affect all built-in methods. If you reassign STDIN it will have no effect unless some 3rd party method uses STDIN.
In your own code you should use $stdin to be consistent with the built-in methods².
¹ reopen is a method which can redirect an IO object to another stream or file. However, you can't use it to redirect an IO to a StringIO, so it does not eliminate all use cases of reassigning $stdin.
² You may of course also use $</ARGF to be even more consistent with the built-in methods, but most of the time you don't want the ARGF behavior if you're explicitly using the stdin stream.
STDERR and $stderr are pointing to the same thing initially; you can reassign the global variable but you shouldn't mess with the constant. $stdin and STDIN, $stdout and STDOUT pairs are likewise.
I had to change STDERR a couple of times as an alternative to monkey-patching some gems that output error messages with STDERR.puts. If you reassign it with STDERR = $stdout you get a warning, while STDERR.reopen('nul', 'w') goes through without one.
