Difference between ways to use gets method - ruby

I saw two ways to use gets, a simple form:
print 'Insert your name: '
name = gets()
puts "Your name is #{name}"
and a form that drew my attention:
print 'Insert your name: '
STDOUT.flush
name = gets.chomp
puts "Your name is #{name}"
The second sample looks like Perl in that it calls flush on the default output stream; Perl makes manipulation of the default output stream explicit. The flush method is a mystery to me, and it may behave differently from what I'm inferring. The sample also uses chomp to remove the newline character.
What happens behind the scenes in the second form? In what situations is it useful or necessary to use the second form?

"Flushing" the output ensures that the printed message is shown before the program waits for your input; this may just be someone being unnecessarily cautious, or it may be that on certain operating systems you need it. Alternatively, you can use STDOUT.sync = true to force a flush after every output. (You may wonder, "Why wouldn't I always use this?" Well, if your code is outputting a lot of content, repeatedly flushing it may slow it down.)
chomp removes the newline from the end of the input. If you want the newline (the result of the user pressing "Enter" after typing their name) then don't chomp it.
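To make the two points above concrete, here is a minimal sketch. The ask_name helper and the StringIO objects are my own stand-ins (they are not in the question) so the snippet can run without a terminal; with a real console you would pass STDIN and STDOUT instead.

```ruby
require 'stringio'

# Hypothetical helper: print a prompt on `out`, then read one line from `input`.
def ask_name(input, out)
  out.print 'Insert your name: '
  out.flush          # push the buffered prompt out before blocking on input
  input.gets.chomp   # gets keeps the trailing "\n"; chomp removes it
end

fake_input  = StringIO.new("Alice\n")  # simulates the user typing Alice + Enter
fake_output = StringIO.new

name      = ask_name(fake_input, fake_output)
unchomped = StringIO.new("Alice\n").gets  # what you get without chomp
```

Here name holds the text without the newline, while unchomped still ends in "\n". Note that gets returns nil at end of input, so gets.chomp raises NoMethodError if there is nothing left to read.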

Looking at some GitHub code I can see that STDOUT.flush is used mostly for server-side/multi-threaded jobs, and not in everyday use.
Generally speaking, when you want to accept input from the user, you'd want to use gets.chomp. Just remember, no matter what the user enters, Ruby will ALWAYS interpret that as a string.
To convert it to an integer, you need to call to_i, or to_f for a float. You don't need chomp in these cases, since to_i and to_f ignore the trailing "\n" automatically. There are a lot of subtle things going on implicitly as you'll see, and figuring them out is simply a matter of practice.
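A small sketch of the conversions described above. The literal strings stand in for what gets would return after the user presses Enter; the variable names are only for illustration.

```ruby
raw_int   = "42\n"    # what gets returns if the user types 42 and presses Enter
raw_float = "3.5\n"

as_int   = raw_int.to_i     # trailing "\n" is ignored
as_float = raw_float.to_f   # same for floats

# to_i never raises on bad input: it parses a leading number and ignores the rest,
# and falls back to 0 when there is no leading number at all.
partial = "12abc\n".to_i
garbage = "abc\n".to_i
```

The last two lines are worth keeping in mind: to_i silently returns 0 for non-numeric input, so a 0 result does not tell you whether the user typed 0 or typed nonsense.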

I've rarely seen someone use STDOUT.flush except in multi-threaded code. It also makes things confusing, defeating the whole purpose of writing elegant code.

Related

What is the difference between "hello".length and "hello" .length?

I was surprised when I ran the following examples in the Ruby console. They both produce the same output.
"hello".length
and
"hello" .length
How does the ruby console remove the space and provide the right output?
You can put spaces wherever you want; the interpreter looks for the end of the line. For example:
Valid
"hello".
length
Invalid
"hello"
.length
The interpreter sees the dot at the end of the line and knows something has to follow it, while in the second case it thinks the line is finished. The same goes for the number of spaces within one line. Does it matter how the interpreter removes the spaces? What matters is that you know the behavior.
If you want you can even
"hello" . length
and it will still work.
I know this is not an answer to your question, but does the "how" matter?
EDIT: I was corrected in the comments below. The multi-line examples given above are both valid when run in a script instead of IRB. I had mixed them up with operators, where the following applies even when running a script:
Valid
result = true || false
Valid
result = true ||
false
Invalid
result = true
|| false
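As a quick check of the cases described as valid in a script, here is a sketch you can save to a file and run (the variable names are mine). It only covers the forms the answer calls valid, since the invalid one would stop the file from parsing at all.

```ruby
# Trailing dot: the parser knows the expression continues on the next line.
trailing_dot = "hello".
  length

# Leading dot on the next line: also accepted when run as a script.
leading_dot = "hello"
  .length

# Trailing operator: the parser again knows more must follow.
trailing_operator = true ||
  false
```

All three statements parse as single expressions; the two length calls give the same result as "hello".length on one line.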
This doesn't have as much to do with the console as it has to do with how the language itself is parsed by the compiler.
Most languages are parsed in such a way that items to be parsed are first grouped into TOKENS. Then the compiler is defined to expect a certain SEQUENCE of tokens in order to interpret each programming statement.
Because the compiler is only looking for a TOKEN SEQUENCE, it doesn't matter if there is space in between or not.
In this case the compiler is looking for:
STRING DOT METHOD_NAME
So it won't matter if you write "hello".length, or even "hello" . length. The same sequence of tokens is present in both, and that is all that matters to the compiler.
If you are curious how these token sequences are defined in the Ruby source code, you can look at parse.y starting around line 1042:
https://github.com/ruby/ruby/blob/trunk/parse.y#L1042
This file is written in YACC, a language used to define parsers.
Even without knowing anything about YACC, you should already be able to get some clues on how it works by just looking around the file a bit.
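If you would rather see the tokens than read parse.y, Ruby's standard library ships Ripper, which exposes the lexer. The sketch below is only an illustration: the token_types helper is my own, and the token type names are whatever Ripper reports, not the STRING DOT METHOD_NAME labels used above.

```ruby
require 'ripper'

# Hypothetical helper: lex a snippet and keep only the token types,
# dropping the whitespace tokens (Ripper reports those as :on_sp).
def token_types(source)
  Ripper.lex(source)
        .map { |token| token[1] }          # token[1] is the token type
        .reject { |type| type == :on_sp }
end

compact = token_types('"hello".length')
spaced  = token_types('"hello" . length')

same_sequence = (compact == spaced)
```

Once the whitespace tokens are filtered out, both spellings give the same sequence of token types, which is the point made above: the parser only cares about the sequence.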

Difference between 2 ways working with pipes using ARGF?

Using ARGF I can create Ruby programs that respect pipelines. Suppose I want to constantly read new entries:
$ tail -f log/test.log | my_prog
I can do this using:
ARGF.each_line do |line|
  ...
end
Also, I found another way:
while input = ARGF.gets
  input.each_line do |line|
    ...
  end
end
Do both variants do the same thing, or is there a difference between them? If so, what is it?
Thanks in advance.
As Stefan mentioned, you made a small mistake in the second case. The proper way to use the ARGF.gets approach in your case looks like this:
while input = ARGF.gets
  # input here represents a line
end
If you rewrite the second example as above, there will be no difference in behavior.
The actual difference between ARGF#gets and ARGF#each_line is in semantics: each_line accepts a block (or returns an enumerator when no block is given), while gets returns the next line if one is available and nil otherwise.
Another option is to use Kernel#gets. Beware that its behavior may differ from ARGF#gets in some cases, especially if you change the separator:
A separator of nil reads the entire contents, and a zero-length separator reads the input one paragraph at a time, where paragraphs are divided by two consecutive newlines.
But for reading (and then printing) constantly from stdin you may use it as follows:
print while gets
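To see that the two loops visit the same lines, here is a sketch that uses StringIO in place of ARGF. That substitution is my assumption, made so the snippet runs without a pipe; StringIO offers the same gets and each_line calls.

```ruby
require 'stringio'

log = "first\nsecond\nthird\n"   # stand-in for the piped log output

# Variant 1: each_line yields every line to the block.
via_each_line = []
StringIO.new(log).each_line { |line| via_each_line << line }

# Variant 2: gets returns one line per call and nil at end of input,
# which is what ends the while loop.
via_gets = []
io = StringIO.new(log)
while (line = io.gets)
  via_gets << line
end
```

Both arrays end up with the same three lines, each still carrying its trailing "\n".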

.downcase! syntax shorthand

Can someone explain what the difference is between the two pieces of code below? Both feature an ! at the end. Is the first version just the shorthand?
print "Who are you?"
user_input = gets.chomp.downcase!
and
print "Who are you?"
user_input = gets.chomp
user_input.downcase!
Edit: Having an exclamation point (aka "bang") at the end of a method name in Ruby means "handle with care". From Matz himself:
The bang (!) does not mean "destructive" nor lack of it mean non
destructive either. The bang sign means "the bang version is more
dangerous than its non bang counterpart; handle with care". Since
Ruby has a lot of "destructive" methods, if bang signs follow your
opinion, every Ruby program would be full of bangs, thus ugly.
(For the full thread, see @sawa's link in the comments.)
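For the method in question, downcase makes a copy of the given string, modifies the copy, and returns it as the result, whereas downcase! modifies the string itself.
In the first case, you're modifying the temporary string returned by gets.chomp and assigning the return value of downcase! to user_input; in the second, you're modifying the string that user_input already refers to. The two are not equivalent: downcase! returns nil when it makes no changes, so the first version leaves user_input set to nil if the input was already lowercase.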
Note that if you call user_input.downcase on the last line (instead of user_input.downcase!) it won't actually change user_input, it just returns a copy of the string and makes the copy lowercase.
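A short sketch of the difference, using literal strings in place of what gets would return (the variable names are only for illustration):

```ruby
# Stand-ins for user input; chomp returns a fresh, mutable string.
mixed_case = "Alice\n".chomp
lower_case = "bob\n".chomp

copy = mixed_case.downcase       # returns a new string; mixed_case is untouched
before_bang = mixed_case.dup     # remember the value before modifying in place

bang_result = mixed_case.downcase!   # modifies mixed_case itself

# The pattern from the first snippet in the question, with lowercase input:
# downcase! had nothing to change, so it returns nil.
user_input = lower_case.downcase!
```

After this runs, copy is "alice", before_bang is still "Alice", mixed_case has become "alice", and user_input is nil. That last result is why chaining downcase! onto gets.chomp in an assignment is risky; chaining the non-bang downcase avoids it.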

Why does Ruby's 'gets' include the closing newline?

I never need the ending newline I get from gets. Half of the time I forget to chomp it and it is a pain in the....
Why is it there?
Like puts (which sounds similar), it is designed to work with lines, using the \n character.
gets takes an optional argument that is used for "splitting" the input (or "just reading till it arrives"). It defaults to the special global variable $/, which contains a \n by default.
gets is a pretty generic method for reading streams, and it includes this separator in what it returns. If it did not, parts of the stream content would be lost.
var = gets.chomp
This puts it all on one line for you.
If you look at the documentation of IO#gets, you'll notice that the method takes an optional parameter sep which defaults to $/ (the input record separator). You can decide to split input on other things than newlines, e.g. paragraphs ("a zero-length separator reads the input a paragraph at a time (two successive newlines in the input separate paragraphs)"):
>> gets('')
dsfasdf
fasfds
dsafadsf #=> "dsfasdf\nfasfds\n\n"
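The same separator behavior can be tried without typing into a console. This sketch uses StringIO as a stand-in for standard input (my assumption, so the snippet is self-contained) and reuses the sample text from above:

```ruby
require 'stringio'

text = "dsfasdf\nfasfds\n\ndsafadsf\n"

# Default separator ($/, a newline): one line, newline included.
line = StringIO.new(text).gets

# Zero-length separator: paragraph mode, reads up to the blank line.
paragraph = StringIO.new(text).gets('')

# nil separator: reads the entire remaining input.
everything = StringIO.new(text).gets(nil)
```

In each case the separator that ended the read is kept in the returned string, which is the same reason the default form keeps its "\n".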
From a performance perspective, the better question would be "why should I get rid of it?". It's not a big cost, but under the hood you have to pay to chomp the string being returned. While you may never have had a case where you need it, you've surely had plenty of cases where you don't care -- s = gets; puts stuff() if s =~ /y/i, etc. In those cases, you'll see a (tiny, tiny) performance improvement by not chomping.
How I auto-detect line endings:
# file open in binary mode
line_ending = 13.chr + 10.chr   # default: "\r\n"
check = file.read(1000)
case check
when /\r\n/
  # already set
when /\n/
  line_ending = 10.chr          # "\n"
when /\r/
  line_ending = 13.chr          # "\r"
end
file.rewind
until file.eof?
  line = file.gets(line_ending).chomp
  ...
end

Ruby Array#puts not using overridden implementation?

I am using Ruby 1.8.6 for the following code:
# Create an array and override the #to_s on that object
thing = [1, 2, 3]
def thing.to_s
  'one'
end
print "Using print: "
print thing
puts
puts "Using puts: "
puts thing
Output:
Using print: one
Using puts:
1
2
3
So thing is an Array and I have overridden thing#to_s. print seems to use my overridden implementation while puts does not. Why?
I have followed the source code of Kernel#puts and Kernel#print (which are C implementations) and see that they are implemented very differently. What might be the design decision (if any) behind this?
By the way, if I create thing as an instance of another class I wrote (or as a Hash/String/other-classes I tried), both print and puts use the overridden implementation of to_s.
Oh boy ... This has already been the topic of countless endless threads on the ruby-talk mailing list, the ruby-core mailing list and a gazillion blogs.
The gist of it is that puts special-cases Arrays. Why it special-cases those, why it special-cases only those (as opposed to, say, all Enumerables), why only puts special-cases them (and not, say, print), nobody really knows. It is the way it is.
BTW, since you mentioned the POLS: the Ruby community has always made it very clear that the POLS only applies to matz. So, Ruby is about not surprising matz. If you or me or anybody else is surprised, that doesn't count.
From the Ruby Programming Language:
Output streams are appendable, like strings and arrays are, and you can write values to them with the << operator. puts is one of the most common output methods. It converts each of its arguments to a string, and writes each one to the stream. If the string does not already end with a newline character, it adds one. If any of the arguments to puts is an array, the array is recursively expanded, and each element is printed on its own line as if it were passed directly as an argument to puts. The print method converts its arguments to strings, and outputs them to the stream. If the global field separator $, has been changed from its default value of nil, then that value is output between each of the arguments to print. If the output record separator $\ has been changed from its default value of nil, then that value is output after all arguments are printed.
As for design decisions, that I do not know.
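The behavior described above can be reproduced without a console by writing to a StringIO instead of STDOUT. That substitution is mine, for the sake of a self-contained sketch; StringIO has the same puts and print methods.

```ruby
require 'stringio'

thing = [1, 2, 3]
def thing.to_s
  'one'
end

# print converts its argument with to_s, so the override is used.
printed = StringIO.new
printed.print thing

# puts expands arrays element by element, so the array's own to_s is never called.
put = StringIO.new
put.puts thing

# Calling to_s yourself hands puts a String, which sidesteps the array special case.
explicit = StringIO.new
explicit.puts thing.to_s
```

After this runs, printed.string is "one", put.string is "1\n2\n3\n", and explicit.string is "one\n". So if you want puts to respect an overridden to_s on an array, pass it thing.to_s explicitly.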
