Ruby Array#puts not using overridden implementation?

I am using Ruby 1.8.6 for the following code:
# Create an array and override the #to_s on that object
thing = [1,2,3]
def thing.to_s
  'one'
end
print "Using print: "
print thing
puts
puts "Using puts: "
puts thing
Output:
Using print: one
Using puts:
1
2
3
So thing is an Array and I have overridden thing#to_s. print seems to use my overridden implementation while puts does not. Why?
I have followed the source code of Kernel#puts and Kernel#print (which are C-implementations) and see that they are very different implementations. I want to know what might be the design-decision (if any) behind this?
By the way, if I create thing as an instance of another class I wrote (or as a Hash/String/other-classes I tried), both print and puts use the overridden implementation of to_s.

Oh boy ... This has already been the topic of countless endless threads on the ruby-talk mailing list, the ruby-core mailing list and a gazillion blogs.
The gist of it is that puts special-cases Arrays. Why it special-cases those, why it special-cases only those (as opposed to, say, all Enumerables), why only puts does so (and not, say, print), nobody really knows. It is the way it is.
BTW, since you mentioned the POLS (Principle of Least Surprise): the Ruby community has always made it very clear that the POLS only applies to matz. So, Ruby is about not surprising matz. If you or I or anybody else is surprised, that doesn't count.

From the Ruby Programming Language:
Output streams are appendable, like strings and arrays are, and you can write values to them with the << operator. puts is one of the most common output methods. It converts each of its arguments to a string, and writes each one to the stream. If the string does not already end with a newline character, it adds one. If any of the arguments to puts is an array, the array is recursively expanded, and each element is printed on its own line as if it were passed directly as an argument to puts. The print method converts its arguments to strings, and outputs them to the stream. If the global field separator $, has been changed from its default value of nil, then that value is output between each of the arguments to print. If the output record separator $\ has been changed from its default value of nil, then that value is output after all arguments are printed.
As for design decisions, that I do not know.
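The array special-casing described above can be checked with a minimal sketch. It captures output in a StringIO instead of writing to the console; the helper names overridden_array and captured are hypothetical and exist only for this illustration.

```ruby
require "stringio"

# Build an array whose to_s is overridden on that one object,
# mirroring the `def thing.to_s` from the question.
def overridden_array
  thing = [1, 2, 3]
  thing.define_singleton_method(:to_s) { "one" }
  thing
end

# Call the given output method (:print or :puts) on a StringIO
# and return everything that was written.
def captured(method_name, value)
  out = StringIO.new
  out.public_send(method_name, value)
  out.string
end

captured(:print, overridden_array)      # print calls to_s on the array
captured(:puts, overridden_array)       # puts expands the array element by element
captured(:puts, overridden_array.to_s)  # passing a String sidesteps the special case
```

If the overridden text is what you want from puts, calling to_s yourself (as in the last line) is the simplest way around the special case.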

Related

Difference between 2 ways working with pipes using ARGF?

Using ARGF I can create Ruby programs that respect pipelines. Suppose I want to constantly read new entries:
$ tail -f log/test.log | my_prog
I can do this using:
ARGF.each_line do |line|
  ...
end
Also, I found another way:
while input = ARGF.gets
  input.each_line do |line|
    ...
  end
end
It looks like both variants do the same thing. Or is there a difference between them? If so, what is it?
Thanks in advance.
As Stefan mentioned, you made a small mistake in the second case. The proper way to use the ARGF.gets approach in your case looks like this:
while input = ARGF.gets
  # input here represents a line
end
If you rewrite the second example as above, there will be no difference in behavior.
The actual difference between ARGF#gets and ARGF#each_line is in semantics: each_line accepts a block (or returns an enumerator when none is given), while gets returns the next line if one is available and nil at the end of input.
Another option is to use Kernel#gets. Beware that its behavior may differ from ARGF#gets in some cases, especially if you change the separator:
A separator of nil reads the entire contents, and a zero-length separator reads the input one paragraph at a time, where paragraphs are divided by two consecutive newlines.
But for reading (and then printing) constantly from stdin you may use it as follows:
print while gets
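To make the each_line/gets comparison concrete, here is a small sketch. It uses StringIO as a stand-in for ARGF (an assumption, so the code runs without a pipe); both respond to each_line and gets. The method names are hypothetical.

```ruby
require "stringio"

# Collect lines with each_line and a block.
def lines_via_each_line(io)
  collected = []
  io.each_line { |line| collected << line }
  collected
end

# Collect lines with a gets loop; gets returns nil at end of input,
# which ends the loop.
def lines_via_gets(io)
  collected = []
  while (line = io.gets)
    collected << line
  end
  collected
end

# A nil separator makes gets return the entire remaining contents.
def slurp(io)
  io.gets(nil)
end

sample = "first\nsecond\n"
lines_via_each_line(StringIO.new(sample)) # => ["first\n", "second\n"]
lines_via_gets(StringIO.new(sample))      # => ["first\n", "second\n"]
slurp(StringIO.new(sample))               # => "first\nsecond\n"
```

Both loops yield the same lines; the choice is mostly a matter of whether you want a block/enumerator style or an explicit loop you can break out of.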

Difference between ways to use gets method

I saw two ways to use gets, a simple form:
print 'Insert your name: '
name = gets()
puts "Your name is #{name}"
and a form that drew my attention:
print 'Insert your name: '
STDOUT.flush
name = gets.chomp
puts "Your name is #{name}"
The second sample looks like Perl in its use of the flush method on the default output stream. Perl makes manipulation of the default output stream explicit, but the flush method is a mystery to me; it may behave differently from what I'm inferring. The sample also uses chomp to remove the newline character.
What happens behind the scenes in the second form? In what situations is it useful or necessary?
"Flushing" the output ensures that it shows the printed message before it waits for your input; this may be just someone being certain unnecessarily, or it may be that on certain operating systems you need it. Alternatively you can use STDOUT.sync = true to force a flush after every output. (You may wonder, "Why wouldn't I always use this?" Well, if your code is outputting a lot of content, repeatedly flushing it may slow it down.)
chomp removes the newline from the end of the input. If you want the newline (the result of the user pressing "Enter" after typing their name) then don't chomp it.
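A minimal sketch of the prompt-flush-read sequence described above. The method name ask_name and its input/output parameters are assumptions added so the example can be run against StringIO objects as well as the real console.

```ruby
require "stringio"

# Prompt for a name, make sure the prompt is visible, then read one line.
# input and output default to the real console streams.
def ask_name(input = $stdin, output = $stdout)
  output.print "Insert your name: "
  output.flush                   # push the prompt out before blocking on input
  name = input.gets.to_s.chomp   # to_s guards against nil at end of input
  output.puts "Your name is #{name}"
  name
end

out = StringIO.new
ask_name(StringIO.new("Alice\n"), out) # => "Alice"
out.string                             # => "Insert your name: Your name is Alice\n"
```

Setting output.sync = true once would make the explicit flush unnecessary, at the cost of flushing after every write.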
Looking at some code on GitHub, I can see that STDOUT.flush is used mostly for server-side/multi-threaded jobs, and not in everyday use.
Generally speaking, when you want to accept input from the user, you'd want to use gets.chomp. Just remember, no matter what the user enters, Ruby will ALWAYS interpret it as a string.
To convert it to an integer, you need to call to_i, or to_f for a float. You don't need chomp in these cases, since to_i and to_f ignore the trailing "\n" (they stop parsing at the first character that is not part of a number). There are a lot of subtle things going on implicitly, as you'll see, and figuring them out is simply a matter of practice.
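One of those subtle things, sketched below: to_i never fails, it simply returns 0 when no number can be parsed. The helper names lenient_int and strict_int are hypothetical; strict_int shows Kernel#Integer as a stricter alternative, which is an addition to the advice above rather than part of it.

```ruby
# to_i ignores the trailing newline and anything else that is not numeric.
def lenient_int(raw)
  raw.to_i
end

# Integer() raises ArgumentError on non-numeric input; return nil instead.
def strict_int(raw)
  Integer(raw.chomp)
rescue ArgumentError
  nil
end

lenient_int("42\n")  # => 42
lenient_int("abc\n") # => 0
strict_int("42\n")   # => 42
strict_int("abc\n")  # => nil
"3.5\n".to_f         # => 3.5
```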
I've rarely seen someone use STDOUT.flush except in multi-threading. It also makes things confusing, defeating the whole purpose of writing elegant code.

.downcase! syntax shorthand

Can someone explain what the difference is between the two pieces of code below? Both feature an ! at the end. Is the first version just the shorthand?
print "Who are you?"
user_input = gets.chomp.downcase!
print "Who are you?"
user_input = gets.chomp
user_input.downcase!
Edit: Having an exclamation point (aka "bang") at the end of a method name in Ruby means "handle with care". From Matz himself:
The bang (!) does not mean "destructive" nor lack of it mean non
destructive either. The bang sign means "the bang version is more
dangerous than its non bang counterpart; handle with care". Since
Ruby has a lot of "destructive" methods, if bang signs follow your
opinion, every Ruby program would be full of bangs, thus ugly.
(For the full thread, see @sawa's link in the comments.)
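For the method in question, downcase makes a copy of the given string, modifies that copy, and returns it as the result, whereas downcase! modifies the string itself.
In the first case, you're modifying the string returned by gets.chomp and assigning the return value of downcase! to user_input; in the second, you're modifying the string that user_input already refers to. The two are not equivalent: downcase! returns nil when the string is already all lowercase, so the first version can leave user_input set to nil.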
Note that if you call user_input.downcase on the last line (instead of user_input.downcase!) it won't actually change user_input, it just returns a copy of the string and makes the copy lowercase.
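The difference between the two forms shows up when the input is already lowercase. This is a small sketch; the method names chained and two_step are hypothetical stand-ins for the two snippets in the question, taking the raw input as an argument instead of calling gets.

```ruby
# First form: the result is whatever downcase! returns.
def chained(raw)
  raw.chomp.downcase!
end

# Second form: downcase! mutates the string, and the string itself is kept.
def two_step(raw)
  user_input = raw.chomp
  user_input.downcase!
  user_input
end

chained("Alice\n")  # => "alice"
chained("alice\n")  # => nil, because downcase! made no changes
two_step("Alice\n") # => "alice"
two_step("alice\n") # => "alice"
```

If you want a one-liner, gets.chomp.downcase (without the bang) avoids the nil case.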

What does "$," mean in ruby?

I stumbled upon this piece of code in the rails source:
# File actionpack/lib/action_view/helpers/output_safety_helper.rb, line 30
def safe_join(array, sep=$,)
  sep ||= "".html_safe
  sep = ERB::Util.html_escape(sep)
  array.map { |i| ERB::Util.html_escape(i) }.join(sep).html_safe
end
What does $, do? I read the Regexp-documentation but I couldn't find anything about it.
The official documentation for the system variables is in:
http://www.ruby-doc.org/stdlib-2.0/libdoc/English/rdoc/English.html
A lot of Ruby's special variables are accessible via methods in various modules and classes, which hides the fact that a variable holds the value. For instance, lineno, available in IO and inherited by File, is the line number of the last line read by an IO stream. It relies on $/ and $.
The "English" module provides long versions of the cryptic variables, making them more readable. The cryptic variables aren't as idiomatic in Ruby as they are in Perl, which is why they look more curious when you run into them.
They come from a variety of sources: most, if not all, come directly from Perl, but Perl inherited them from sed, awk, and the rest of its kitchen-sink collection of code. (It's a great language, really.)
There are other variables set by classes like Regexp, which defines variables for pre and post match, plus captures. This is from the documentation:
$~ is equivalent to ::last_match;
$& contains the complete matched text;
$` contains string before match;
$' contains string after match;
$1, $2 and so on contain text matching first, second, etc capture group;
$+ contains last capture group.
Though Ruby defines the short, cryptic, versions of the variables, it's recommended that we use require "English" to provide the long names. It's a readability thing, which translates to a long-term ease-of-maintenance thing.
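As an illustration of the Regexp variables listed above and their long names, here is a short sketch. The helper match_parts is hypothetical; the long names ($MATCH, $PREMATCH, $POSTMATCH, $LAST_PAREN_MATCH) come from the English module.

```ruby
require "English"

# Return the match-related special variables as a hash,
# or nil when the pattern does not match.
def match_parts(text, pattern)
  return nil unless text =~ pattern

  {
    match:      $&,   # complete matched text
    pre:        $`,   # string before the match
    post:       $',   # string after the match
    first:      $1,   # first capture group
    last_group: $+,   # last capture group
    english:    [$MATCH, $PREMATCH, $POSTMATCH, $LAST_PAREN_MATCH]
  }
end

parts = match_parts("hello world", /(o) (w)/)
parts[:match]   # => "o w"
parts[:pre]     # => "hell"
parts[:post]    # => "orld"
parts[:english] # => ["o w", "hell", "orld", "w"]
```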
I finally found the answer myself here:
The output field separator for the print. Also, it is the default separator for Array#join. (Mnemonic: what is printed when there is a , in your print statement.)
The following code snippet shows the effect:
a = [1,2,3]
puts a.join # => 123
$, = ','
puts a.join # => 1,2,3
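The sep=$, default in safe_join can be sketched without Rails. The method plain_join below is a hypothetical, simplified stand-in (no HTML escaping), and the example assumes $, is still at its default of nil. As far as I know, Ruby 2.7 and later emit a deprecation warning when $, is assigned a non-nil value, so passing the separator explicitly is the safer habit.

```ruby
# Default the separator to $, and fall back to "" when $, is nil,
# following the same pattern as safe_join.
def plain_join(array, sep = $,)
  sep ||= ""
  array.map(&:to_s).join(sep)
end

plain_join([1, 2, 3])      # => "123" while $, is nil
plain_join([1, 2, 3], ",") # => "1,2,3"
```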

How to use ':' to break out words in %w'dog:cat:bird' w/o split

I am trying to do %w'dog:cat:bird' but I want the character that breaks apart the words to be a : rather than whitespace as %w currently does.
I do not want to use .split as in the actual code I am using a few different % idioms for different needs and I would like to use just one syntax.
I just checked in "The Ruby Programming Language" by Matz and David Flanagan, and it appears that array literals created with %w must use spaces to delimit the elements. If you really want to have arrays of strings, delimited by ":", and you don't want to use "split" in the code, I suggest you define a method of your own which will allow you to simulate the desired behavior, maybe something like:
class Object
  def w(str)
    str.split(":")
  end
end
Then you can write something like:
w'a:b:c'
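If reopening Object feels too broad, one possible variation is to keep the helper in a module and make the delimiter configurable. The module name Words and the sep parameter are assumptions for this sketch, not something from the book.

```ruby
# Hypothetical helper module; include it only where the w shorthand is wanted.
module Words
  module_function

  # Split str on sep (":" by default) and return the pieces as an array.
  def w(str, sep = ":")
    str.split(sep)
  end
end

Words.w('dog:cat:bird') # => ["dog", "cat", "bird"]
Words.w('a|b|c', '|')   # => ["a", "b", "c"]
```

After include Words, the bare w'dog:cat:bird' form works as well.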
