I was goofing around in a pry REPL and found some very interesting behavior: the tilde method.
It appears Ruby syntax has a built-in literal unary operator, ~, just sitting around.
This means ~Object.new sends the message ~ to an instance of Object:
    class Object
      def ~
        puts 'what are you doing, ruby?'
      end
    end

    ~Object.new #=> what are you doing, ruby?
This seems really cool, but mysterious. Is Matz essentially trying to give us our own customizable unary operator?
The only reference I can find to this in the Ruby docs is in the operator precedence notes, where it's ranked at the highest level of precedence, alongside ! and unary +. This makes sense for unary operators. (For interesting errata about the next two levels of precedence, ** then unary -, check out this question.) Aside from that, no mention of this utility.
The two notable references to this operator I can find by searching, amidst the ~=, !~, and ~> questions, are this and this. They both note its usefulness, oddity, and obscurity without going into its history.
Just as I was about to write off ~ as a cool way to provide custom unary operator behavior for your objects, I found a place where it's actually used in Ruby: Fixnum (integers).
~2 returns -3. ~-1 returns 1. So it negates an integer and subtracts one... for some reason?
Can anyone enlighten me as to the purpose of the tilde operator's unique and unexpected behavior in Ruby at large?
Using pry to inspect the method:
show-method 1.~
From: numeric.c (C Method):
Owner: Fixnum
Visibility: public
Number of lines: 5
    static VALUE
    fix_rev(VALUE num)
    {
        return ~num | FIXNUM_FLAG;
    }
While this is impenetrable to me, it prompted me to look for a C unary ~ operator. One exists: it's the bitwise NOT operator, which flips the bits of a binary integer (~1010 => 0101). For some reason this translates to one less than the negation of a decimal integer in Ruby.
More importantly, since ruby is an object oriented language, the proper way to encode the behavior of ~0b1010 is to define a method (let's call it ~) that performs bitwise negation on a binary integer object. To realize this, the ruby parser (this is all conjecture here) has to interpret ~obj for any object as obj.~, so you get a unary operator for all objects.
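The conjecture above, that ~obj is just another way of writing a method call, is easy to check with a small sketch. The class name Toggle is hypothetical and exists only for this example; the point is that all three call forms reach the same method.

```ruby
# Hypothetical class used only for illustration.
class Toggle
  attr_reader :on

  def initialize(on)
    @on = on
  end

  # Defining a method named ~ is all it takes to support the unary operator.
  def ~
    Toggle.new(!on)
  end
end

t = Toggle.new(true)

operator_form = ~t          # operator syntax
explicit_form = t.~         # explicit method call with the same name
send_form     = t.send(:~)  # dynamic dispatch by symbol

puts operator_form.on  # => false
puts explicit_form.on  # => false
puts send_form.on      # => false
```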
This is just a hunch; anyone with a more authoritative or elucidating answer, please enlighten me!
--EDIT--
As @7stud points out, the Regexp class makes use of it as well, essentially matching the regex against $_, the last string received by gets in the current scope.
As @Daiku points out, the bitwise negation of Fixnums is also documented.
I think my parser explanation solves the bigger question of why ruby allows ~ as global unary operator that calls Object#~.
For fixnum, it's the one's complement, which in binary, flips all the ones and zeros to the opposite value. Here's the doc: http://www.ruby-doc.org/core-2.0/Fixnum.html#method-i-7E. To understand why it gives the values it does in your examples, you need to understand how negative numbers are represented in binary. Why ruby provides this, I don't know. Two's complement is generally the one used in modern computers. It has the advantage that the same rules for basic mathematical operations work for both positive and negative numbers.
The ~ is the binary one's complement operator in Ruby. One's complement is just flipping the bits of a number, to the effect that the number is now arithmetically negative.
For example, 2 in binary (shown here with 16 bits for brevity) is 0000 0000 0000 0010, thus ~2 would be equal to 1111 1111 1111 1101 in one's complement.
However, as you have noticed and this article discusses in further detail, Ruby's version of one's complement seems to be implemented differently, in that it not only makes the integer negative but also subtracts 1 from it. I have no idea why this is, but it does seem to be the case.
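The "subtracts 1" effect is probably best explained by two's complement arithmetic: there, negating a number means flipping its bits and adding one, so flipping the bits alone leaves you at -n - 1. Here is a quick check of that identity, a sketch that uses only core Integer methods and a handful of arbitrary sample values:

```ruby
# In two's complement, -n == ~n + 1, which rearranges to ~n == -n - 1.
samples = [2, -1, 0, 41, -100]

samples.each do |n|
  puts "~(#{n}) = #{~n}, -(#{n}) - 1 = #{-n - 1}"
end

identity_holds = samples.all? { |n| ~n == -n - 1 }
puts identity_holds  # => true

# Masking to 8 bits makes the flipped bit pattern of 2 visible.
flipped = (~2 & 0xFF).to_s(2)
puts flipped  # => 11111101
```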
It's mentioned in several places in pickaxe 1.8, e.g. the String class. However, in ruby 1.8.7 it doesn't work on the String class as advertised. It does work for the Regexp class:
print "Enter something: "
input = gets
pattern = 'hello'
puts ~ /#{pattern}/
--output:--
Enter something: 01hello
2
It is supposed to work similarly for the String class.
~ (Bignum)
~ (Complex)
~ (Fixnum)
~ (Regexp)
~ (IPAddr)
~ (Integer)
Each of these is documented in the documentation.
This list is from the documentation for Ruby 2.6
The behavior of this method "at large" is basically anything you want it to be, as you showed yourself by defining a method called ~ on the Object class. The behavior on the core classes, where it is defined by the implementation's maintainers, seems to be pretty well documented, so it should not behave unexpectedly for those objects.
Related
Apart from making a nice symmetry with unary minus, why is unary plus operator defined on Numeric class? Is there some practical value in it, except for causing confusion allowing writing things like ++i (which, unlike most non-Rubyists would think, doesn't increment i).
I can think of a scenario where defining unary plus on a custom class could be useful (say if you're creating some sexy DSL), so being able to define it is ok, but why is it already defined on Ruby numbers?
Perhaps it's just a matter of consistency, both with other programming languages, and to mirror the unary minus.
Found support for this in The Ruby Programming Language (co-written by Yukihiro Matsumoto, who designed Ruby):
The unary plus is allowed, but it has no effect on numeric operands—it simply returns the value of its operand. It is provided for symmetry with unary minus, and can, of course, be redefined.
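As a rough illustration of the "can be redefined" part, unary plus and minus are defined under the method names +@ and -@. The Temperature class below is hypothetical and only serves as an example; the last lines show why ++i is a no-op rather than an increment.

```ruby
# Hypothetical value object used only for illustration.
class Temperature
  attr_reader :degrees

  def initialize(degrees)
    @degrees = degrees
  end

  # Unary plus and minus are defined under the names +@ and -@.
  def +@
    self
  end

  def -@
    Temperature.new(-degrees)
  end
end

t = Temperature.new(21)
puts((+t).degrees)  # => 21
puts((-t).degrees)  # => -21

# ++i parses as +(+i), two unary pluses in a row; it does not increment i.
i = 5
result = ++i
puts result  # => 5
puts i       # => 5
```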
As mentioned in the docs, if a string is frozen the unary plus operator will return a mutable string.
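A minimal sketch of that String behavior, assuming a Ruby version that has String#+@ (2.3 or later): applied to a frozen string it returns an unfrozen duplicate, and applied to a string that is already mutable it returns the receiver itself.

```ruby
frozen = "abc".freeze

thawed = +frozen   # unfrozen duplicate of the frozen string
thawed << "def"    # safe: only the duplicate is modified

puts frozen.frozen?  # => true
puts thawed.frozen?  # => false
puts thawed          # => abcdef

# On a string that is not frozen, unary plus returns the same object.
already_mutable = String.new("xyz")
same_object = (+already_mutable).equal?(already_mutable)
puts same_object     # => true
```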
One possible reason I see is to explicitly state that a number is positive (even though it is positive by default).
ruby-1.9.2-p136 :051 > +3
=> 3
ruby-1.9.2-p136 :052 > 3
=> 3
DrRacket running R5RS says that 1### is a perfectly valid Scheme number and prints a value of 1000.0. This leads me to believe that the pound signs (#) specify inexactness in a number, but I'm not certain. The spec also says that it is valid syntax for a number literal, but it does not say what those signs mean.
Any ideas as to what the # signs in Scheme number literals signify?
The hash syntax was introduced in 1989. There was a discussion on inexact numbers on the Scheme authors mailing list, which contains several nice ideas. Some caught on and some didn't.
http://groups.csail.mit.edu/mac/ftpdir/scheme-mail/HTML/rrrs-1989/msg00178.html
One idea that stuck was introducing the # to stand for an unknown digit.
If you have a measurement with two significant digits, you can indicate with 23## that the digits 2 and 3 are known but that the last digits are unknown. If you write 2300, then you can't see that the two zeros aren't to be trusted. When I saw the syntax I expected 23## to evaluate to 2350, but (I believe) the interpretation is implementation dependent. Many implementations interpret 23## as 2300.
The syntax was formally introduced here:
http://groups.csail.mit.edu/mac/ftpdir/scheme-mail/HTML/rrrs-1989/msg00324.html
EDIT
From http://groups.csail.mit.edu/mac/ftpdir/scheme-reports/r3rs-html/r3rs_8.html#SEC52
An attempt to produce more digits than are available in the internal
machine representation of a number will be marked with a "#" filling
the extra digits. This is not a statement that the implementation
knows or keeps track of the significance of a number, just that the
machine will flag attempts to produce 20 digits of a number that has
only 15 digits of machine representation:
3.14158265358979##### ; (flo 20 (exactness s))
EDIT2
Gerald Jay Sussman writes why they introduced the syntax here:
http://groups.csail.mit.edu/mac/ftpdir/scheme-mail/HTML/rrrs-1994/msg00096.html
Here's the R4RS and R5RS docs regarding numerical constants:
R4RS 6.5.4 Syntax of numerical constants
R5RS 6.2.4 Syntax of numerical constants.
To wit:
If the written representation of a number has no exactness prefix, the constant may be either inexact or exact. It is inexact if it contains a decimal point, an exponent, or a "#" character in the place of a digit, otherwise it is exact.
Not sure they mean anything beyond that, other than 0.
I know how each of them can be converted to one another but never really understood what their applications are. The usual infix notation is quite readable, but where does it fail, such that it led to the inception of prefix and postfix notation?
Infix notation is easy to read for humans, whereas pre-/postfix notation is easier to parse for a machine. The big advantage in pre-/postfix notation is that there never arise any questions like operator precedence.
For example, consider the infix expression 1 # 2 $ 3. Now, we don't know what those operators mean, so there are two possible corresponding postfix expressions: 1 2 # 3 $ and 1 2 3 $ #. Without knowing the rules governing the use of these operators, the infix expression is essentially worthless.
Or, to put it in more general terms: it is possible to restore the original (parse) tree from a pre-/postfix expression without any additional knowledge, but the same isn't true for infix expressions.
Postfix notation, also known as RPN, is very easy to process left-to-right. An operand is pushed onto a stack; an operator pops its operand(s) from the stack and pushes the result. Little or no parsing is necessary. It's used by Forth and by some calculators (HP calculators are noted for using RPN).
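The stack procedure described above can be sketched in a few lines. This is only an illustration: the names eval_rpn and OPERATORS are made up for this example, tokens are assumed to be whitespace-separated integers, and only four binary operators are supported.

```ruby
# Evaluate a postfix (RPN) expression given as a whitespace-separated string.
# Only the four binary operators below are supported in this sketch.
OPERATORS = %w[+ - * /].freeze

def eval_rpn(expression)
  stack = []
  expression.split.each do |token|
    if OPERATORS.include?(token)
      # An operator pops its two operands and pushes the result.
      right = stack.pop
      left = stack.pop
      raise ArgumentError, "not enough operands for #{token}" if left.nil? || right.nil?
      stack.push(left.public_send(token, right))
    else
      # Anything else is treated as an integer operand and pushed.
      stack.push(Integer(token))
    end
  end
  raise ArgumentError, "malformed expression" unless stack.size == 1
  stack.first
end

puts eval_rpn("1 2 + 3 *")  # (1 + 2) * 3 => 9
puts eval_rpn("1 2 3 * +")  # 1 + (2 * 3) => 7
```

Note that the two inputs differ only in token order, and no precedence table is consulted anywhere: the order of the tokens alone determines the grouping.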
Prefix notation is nearly as easy to process; it's used in Lisp.
At least for the case of prefix notation: the advantage of using a prefix operator is that syntactically, it reads as if the operator is a function call.
Another aspect of prefix/postfix vs. infix is that the arity of the operator (how many arguments it is applied to) no longer has to be limited to exactly 2. It can be more, or sometimes less (0 or 1 when defaults are implied naturally, like zero for addition/subtraction, one for multiplication/division).
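To make the arity point concrete, here is a small sketch of a Lisp-style prefix evaluator written over nested Ruby arrays. The names eval_prefix and IDENTITY are hypothetical, and only + and * are supported so that the identity-element defaults stay simple.

```ruby
# Evaluate a prefix expression written as nested arrays, e.g. [:+, 1, 2, 3].
# The operator comes first and may take any number of operands.
IDENTITY = { :+ => 0, :* => 1 }.freeze

def eval_prefix(expr)
  return expr unless expr.is_a?(Array)

  operator, *operands = expr
  values = operands.map { |operand| eval_prefix(operand) }
  # Starting from the identity element keeps zero or one operand meaningful.
  values.reduce(IDENTITY.fetch(operator)) { |acc, value| acc.public_send(operator, value) }
end

puts eval_prefix([:+, 1, 2, 3])          # => 6
puts eval_prefix([:*, 2, [:+, 1, 2], 4]) # => 24
puts eval_prefix([:+])                   # => 0
puts eval_prefix([:*])                   # => 1
```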
What is the difference between
10.6.to_i
and
10.6.to_int
?
This is excellently explained here:
First of all, neither to_i or to_int is meant to do something fancier than the other. Generally, those 2 methods don’t differ in their implementation that much, they only differ in what they announce to the outside world. The to_int method is generally used with conditional expressions and should be understood the following way : “Is this object can be considered like a real integer at all time?”
The to_i method is generally used to do the actual conversion and should be understood that way : “Give me the most accurate integer representation of this object please”
String, for example, is a class that implements to_i but does not implement to_int. It makes sense because a string cannot be seen as an integer in its own right. However, on some occasions, it can have a representation in integer form. If we write x = "123", we can very well do x.to_i and continue working with the resulting Fixnum instance. But it only worked because the characters in x could be translated into a numerical value. What if we had written: x = "the horse outside is funny"? That's right, a string just cannot be considered like an Integer all the time.
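The distinction described above can be observed directly. In this sketch the Position class is hypothetical; the rest uses core methods only. Array#[] is one of the places where Ruby performs the implicit conversion through to_int, so it accepts a Position but rejects a String.

```ruby
# String offers an explicit conversion (to_i) but no implicit one (to_int).
string_has_to_i   = "123".respond_to?(:to_i)
string_has_to_int = "123".respond_to?(:to_int)
puts string_has_to_i    # => true
puts string_has_to_int  # => false

# Hypothetical class that declares itself usable as an integer.
class Position
  def initialize(n)
    @n = n
  end

  def to_int
    @n
  end
end

letters = %w[a b c]
picked = letters[Position.new(1)]  # implicit conversion via to_int
puts picked  # => b

# A String index is rejected because String does not define to_int.
string_index_error =
  begin
    letters["1"]
    nil
  rescue TypeError => e
    e.class
  end
puts string_index_error  # => TypeError
```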
For a Float such as 10.6 there is no difference: Float#to_i and Float#to_int are synonymous, and both return 10.