Ruby and :symbols - ruby

I have just started using Ruby and I am reading "Programming Ruby 1.9 - The Pragmatic Programmer's Guide". I came across something called symbols, but as a PHP developer I don't understand what they do and what they are good for.
Can anyone help me out with this?

It's useful to think of symbols in terms of "the thing called." In other words, :banana is referring to "the thing called banana." They're used extensively in Ruby, mostly as Hash (associative array) keys.
They really are similar to strings, but behind the scenes, very different. One key difference is that only one of a particular symbol exists in memory. So if you refer to :banana 10 times in your code, only one instance of :banana is created and they all refer to that one. This also implies they're immutable.
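A minimal sketch of that difference, assuming a plain Ruby script without the frozen_string_literal magic comment. The variable names are only illustrative, and String.new is used so that each string is guaranteed to be a fresh object.

```ruby
# Two strings with the same text are equal in content but are separate objects.
first_string  = String.new("banana")
second_string = String.new("banana")

puts first_string == second_string        # => true  (same characters)
puts first_string.equal?(second_string)   # => false (identity check: two objects)

# Every mention of :banana refers to one and the same Symbol object.
puts :banana.equal?(:banana)                           # => true
puts :banana.object_id == "banana".to_sym.object_id    # => true

# Symbols cannot be modified in place; ordinary strings can.
puts :banana.frozen?                      # => true
first_string << "s"
puts first_string                         # => bananas
```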

Symbols are similar to string literals, except that every occurrence of the same symbol shares one memory location; it is important to note that they are not string equivalents.
In Ruby, when you type "this" and "this" you're using two different memory locations; by using symbols you'll use only one name during the program execution. So if you type :this in several places in your program, you'll be using only one.
From the Symbol documentation:
Symbol objects represent names and some strings inside the Ruby interpreter. They are generated using the :name and :"string" literals syntax, and by the various to_sym methods. The same Symbol object will be created for a given name or string for the duration of a program's execution, regardless of the context or meaning of that name. Thus if Fred is a constant in one context, a method in another, and a class in a third, the Symbol :Fred will be the same object in all three contexts.
So, you basically use it where you want to treat a string as a constant.
For instance, it is very common to use it with the attr_accessor method, to define getter/setter for an attribute.
class Person
attr_accessor :name
end
p = Person.new
p.name= "Oscar"
But this would do the same:
class DontDoThis
attr_accessor( "name" )
end
ddt = DontDoThis.new
ddt.name= "Dont do it"
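As a rough illustration of what attr_accessor :name provides, the sketch below spells out an equivalent getter and setter by hand. ManualPerson is a hypothetical class added only for comparison; it is not how Ruby implements attr_accessor internally.

```ruby
class Person
  attr_accessor :name   # generates Person#name and Person#name=
end

# Hand-written equivalent (hypothetical class, for comparison only).
class ManualPerson
  def name
    @name
  end

  def name=(value)
    @name = value
  end
end

person = Person.new
person.name = "Oscar"
puts person.name   # => Oscar

# Both classes expose the same two methods, and Ruby reports their names as symbols.
p Person.instance_methods(false).sort
p ManualPerson.instance_methods(false).sort
```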

Think of a Symbol as either:
A method name that you plan to use later
A constant / enumeration that you want to store and compare against
For example:
s = "FooBar"
length = s.send(:length)
# => 6
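To make both uses concrete, here is a short sketch. VALID_STATES and valid_state? are made-up names for illustration, and public_send is used instead of send so that only public methods can be reached.

```ruby
s = "FooBar"

# A symbol names the method to call; public_send looks it up at run time.
puts s.public_send(:length)   # => 6
puts s.public_send(:upcase)   # => FOOBAR

# The method name can be chosen dynamically.
%i[upcase downcase swapcase].each do |operation|
  puts "#{operation}: #{s.public_send(operation)}"
end

# Symbols as a lightweight enumeration to store and compare against.
VALID_STATES = %i[pending active archived].freeze

def valid_state?(state)
  VALID_STATES.include?(state)
end

puts valid_state?(:active)    # => true
puts valid_state?(:deleted)   # => false
```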

@AboutRuby has a good answer, using the phrase "the thing called":
:banana is referring to "the thing called banana."
He notes that you can refer to :banana many times in the code and it's the same object, even in different scopes or off in some weird library. :banana is the thing called banana, whatever that might mean when you use it.
They are used as:
Hash keys, so when you look up :banana there is only one entry. In most languages, if these are Strings, you run the risk of having multiple Strings in memory with the text "banana" and the code not detecting that they are the same.
Method/proc names. Most people are familiar with how C distinguishes a method from its call with parentheses: my_method vs. my_method(). In Ruby, since parentheses are optional, these both indicate a call to that method. The symbol, however, is convenient to use as a stand-in for methods (even though there really is no relationship between a symbol and a method).
Enums (and other constants). Since they don't change, they exhibit many of the properties of these features from other languages.

Related

Single symbol to multiple identifiers in Ruby

In the "Well grounded Rubyist 2nd Edition", David Black states that (p.239):
The symbol table is just that: a symbol table. It’s not an object table. If you use an identifier for more than one purpose—say, as a local variable and also as a method name— the corresponding symbol will still only appear once in the symbol table
Then the author goes ahead and gives the following example:
>> Symbol.all_symbols.size
=> 3118
>> abc = 1
=> 1
>> Symbol.all_symbols.size
=> 3119
>> def abc; end
=> :abc
>> Symbol.all_symbols.size
=> 3119
My question is two-fold:
How is it possible to have the same identifier for more than one purpose!? - I understand that Ruby knows which one is which based on context but is this enough?
The symbol that was created in the example above, which identifier does it refer to? The local variable or the method name?
Great question.
Let's untangle this one by one.
"How is it possible to have the same identifier for more than one purpose!?"
"The symbol that was created, which identifier does it refer to?"
Let me first start with code that might look more familiar.
Obviously we can use the same string str to store two objects in two hashes
str = "max"
people[str] = Person.new
statistics[str] = 42
Now your example code does exactly the same
# pseudo-code
sym = :abc
locals[sym] = 1
methods[sym] = Method.new(...)
Internally Ruby represents everything using hashes
there is a hash with all classes
for each class there is a hash with all methods
for each instance there is a hash with all instance variables
for each method activation there is a hash with all local variables
et cetera
…
Symbols are used as keys into those hashes, and as such the same symbol can be used many times to map to many things in many hashes. Just the same way the code in your Rails app may use the same string as key in many different hash objects.
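The same idea can be observed from Ruby code. This is only an analogy for the lookup tables described above, not a view into the interpreter's internals, and Demo is a hypothetical class. One symbol, :abc, is used to ask two different tables two different questions.

```ruby
class Demo
  # A method named abc.
  def abc
    :from_method
  end

  def run
    abc = 1   # a local variable that is also named abc
    {
      local_value:  binding.local_variable_get(:abc),       # looked up among locals
      method_value: public_send(:abc),                       # looked up among methods
      is_local:     binding.local_variable_defined?(:abc),
      is_method:    Demo.method_defined?(:abc)
    }
  end
end

p Demo.new.run
```

Both lookups succeed with the same key because the symbol is just the name; which thing it resolves to depends on the table being consulted.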
Now symbols are somewhat special. There is one and only one instance for :abc and Ruby uses a hash, yet another hash — the so-called symbol table — to map all symbols to an internal magic number. And then these magic numbers are used internally to refer to the symbol. I guess that is why the author of the book wrote "the symbols table is not an object table."
Mapping a string "abc" to these internal numbers is called "interning" and hence symbols are sometimes referred to as interned strings.
Fun fact: you can look up these magic numbers yourself with :symbol.object_id and even infer from the order of the numbers which symbols were created first.
Hope that answers your question :)
I have not read the book, but the first two sentences seem to refer to the same questions you ask:
The symbol table is just that: a symbol table. It’s not an object table.
In other words - symbols can be names for identifiers, but they are not identifiers, nor do they have 1:1 mapping with identifiers directly (without context).
foo is an identifier. :foo is just the name of that identifier.
How is it possible to have two people named John? It's just a name, not an id.
Which person does the name John refer to? Depends on the context. In this case - the variable.
I can go into further details on how the actual resolution happens in the language if this question wasn't conceptual in nature.

Determining type of an object in ruby

I'll use python as an example of what I'm looking for (you can think of it as pseudocode if you don't know Python):
>>> a = 1
>>> type(a)
<type 'int'>
I know in Ruby I can do:
1.9.3p194 :002 > 1.class
=> Fixnum
But is this the proper way to determine the type of the object?
The proper way to determine the "type" of an object, which is a wobbly term in the Ruby world, is to call object.class.
Since classes can inherit from other classes, if you want to determine if an object is "of a particular type" you might call object.is_a?(ClassName) to see if object is of type ClassName or derived from it.
Normally type checking is not done in Ruby, but instead objects are assessed based on their ability to respond to particular methods, commonly called "Duck typing". In other words, if it responds to the methods you want, there's no reason to be particular about the type.
For example, object.is_a?(String) is too rigid since another class might implement methods that convert it into a string, or make it behave identically to how String behaves. object.respond_to?(:to_s) would be a better way to test that the object in question does what you want.
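A small sketch of the two approaches side by side. total_length and WordBag are hypothetical names; the point is that the method only asks whether its argument responds to each, not what class it is.

```ruby
# is_a? follows the inheritance chain and included modules.
puts 1.is_a?(Integer)      # => true
puts 1.is_a?(Numeric)      # => true
puts 1.is_a?(Comparable)   # => true
puts 1.is_a?(String)       # => false

# Duck typing: accept anything that can be iterated with each.
def total_length(collection)
  unless collection.respond_to?(:each)
    raise ArgumentError, 'expected something that responds to each'
  end

  sum = 0
  collection.each { |item| sum += item.to_s.length }
  sum
end

# Not an Array and not Enumerable, but it quacks the right way.
class WordBag
  def initialize(*words)
    @words = words
  end

  def each(&block)
    @words.each(&block)
  end
end

puts total_length(%w[ruby symbol])            # => 10
puts total_length(WordBag.new("duck", :go))   # => 6
```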
You could also try instance_of?:
p 1.instance_of?(Integer) #=> true on Ruby 2.4+ (the class was Fixnum on older versions)
p "1".instance_of?(String) #=> true
p [1,2].instance_of?(Array) #=> true
Oftentimes in Ruby, you don't actually care what the object's class is, per se, you just care that it responds to a certain method. This is known as Duck Typing and you'll see it in all sorts of Ruby codebases.
So in many (if not most) cases, it's best to use Duck Typing using #respond_to?(method):
object.respond_to?(:to_i)
I would say "yes".
Matz said something like this in one of his talks:
"Ruby objects have no types."
That is not all of what he said, but it is the part he was trying to get across to us.
Why else would anyone have said
"Everything is an Object"?
He added, "Data has Types not objects".
RubyConf 2016 - Opening Keynote by Yukihiro 'Matz' Matsumoto
But Ruby doesn't care as much about the type of an object as about its class.
We use classes, not types. All data, then, has a class.
12345.class
'my string'.class
Classes may also have ancestors
Object.ancestors
They also have meta classes but I'll save you the details on that.
Once you know the class then you'll be able to lookup what methods you may use for it. That's where the "data type" is needed.
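For instance, a quick way to go from an object to its class and on to the methods it offers (a sketch only; the variable name is arbitrary):

```ruby
value = "my string"

p value.class                                              # => String
p value.class.ancestors.include?(Comparable)               # => true
p value.class.instance_methods(false).include?(:upcase)    # => true
p value.respond_to?(:upcase)                               # => true

# On Ruby 2.4 and later this prints Integer; older versions print Fixnum.
p 12345.class
```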
If you really want to get into the details, then look up...
"The Ruby Object Model"
This is the term used for how Ruby handles objects. It's all internal so you don't really see much of this but it's nice to know. But that's another topic.
Yes! The class is the data type. Objects have classes and data has types. So if you know about databases, then you know there is only a finite set of types.
text blocks
numbers
variable_name.class
Here the variable name is "a":
a.class
Every object has a method named class. If you print its result, it will tell you what class the object is. So do it like this:
puts a.class

Why is :key.hash != 'key'.hash in Ruby?

I'm learning Ruby right now for the Rhodes mobile application framework and came across this problem: Rhodes' HTTP client parses JSON responses into Ruby data structures, e.g.
puts @params # prints {"body"=>{"results"=>[]}}
Since the key "body" is a string here, my first attempt @params[:body] failed (is nil) and instead it must be @params['body']. I find this most unfortunate.
Can somebody explain the rationale why strings and symbols have different hashes, i.e. :body.hash != 'body'.hash in this case?
Symbols and strings serve two different purposes.
Strings are your good old familiar friends: mutable and garbage-collectable. Every time you use a string literal or #to_s method, a new string is created. You use strings to build HTML markup, output text to screen and whatnot.
Symbols, on the other hand, are different. Each symbol exists only in one instance, and in Ruby versions before 2.2 it exists for the life of the program (i.e., it is never garbage-collected; since 2.2, symbols created dynamically can be collected). Because of that, you should create new symbols carefully (String#to_sym and the :'' literal). These properties make them a good candidate for naming things. For example, it's idiomatic to use symbols in macros like attr_reader :foo.
If you got your hash from an external source (you deserialized a JSON response, for example) and you want to use symbols to access its elements, then you can either use HashWithIndifferentAccess (as others pointed out), or call helper methods from ActiveSupport:
require 'active_support/core_ext'
h = {"body"=>{"results"=>[]}}
h.symbolize_keys # => {:body=>{"results"=>[]}}
h.stringify_keys # => {"body"=>{"results"=>[]}}
Note that it'll only touch top level and will not go into child hashes.
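If ActiveSupport is not available, nested keys can also be handled with the standard library. The sketch below uses JSON.parse with its symbolize_names option, plus deep_symbolize, a hypothetical helper written here for hashes that were already parsed with string keys.

```ruby
require "json"

raw = '{"body": {"results": [{"title": "first"}]}}'

# Standard library option: ask the parser for symbol keys at every level.
parsed = JSON.parse(raw, symbolize_names: true)
p parsed[:body][:results].first[:title]   # => "first"

# Hypothetical helper: recursively convert keys of an existing structure.
def deep_symbolize(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), result|
      new_key = key.respond_to?(:to_sym) ? key.to_sym : key
      result[new_key] = deep_symbolize(inner)
    end
  when Array
    value.map { |item| deep_symbolize(item) }
  else
    value
  end
end

string_keyed = JSON.parse(raw)
p deep_symbolize(string_keyed)
```

Given the caution above about creating symbols, converting keys from untrusted input is something to do deliberately rather than by default.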
Symbols and Strings are never ==:
:foo == 'foo' # => false
That's a (very reasonable) design decision. After all, they have different classes, methods, one is mutable the other isn't, etc...
Because of that, it is mandatory that they are never eql?:
:foo.eql? 'foo' # => false
Two objects that are not eql? typically don't have the same hash, but even if they did, the Hash lookup uses hash and then eql?. So your question really was "why are symbols and strings not eql?".
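A sketch of that two-step lookup. CaseInsensitiveKey is a made-up class that defines exactly the two methods Hash relies on, so two distinct objects can act as the same key.

```ruby
response = { "body" => { "results" => [] } }

p response["body"]       # found: same content, so hash and eql? both match
p response[:body]        # => nil, because :body.eql?("body") is false
p response[:body.to_s]   # converting the key first makes the lookup succeed

class CaseInsensitiveKey
  attr_reader :text

  def initialize(text)
    @text = text.to_s.downcase
  end

  # Step 1: Hash uses this number to narrow down the candidates.
  def hash
    text.hash
  end

  # Step 2: Hash confirms the match with eql?.
  def eql?(other)
    other.is_a?(CaseInsensitiveKey) && text == other.text
  end
end

lookup = { CaseInsensitiveKey.new(:Body) => "found" }
p lookup[CaseInsensitiveKey.new("BODY")]   # => "found"
```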
Rails uses HashWithIndifferentAccess that accesses indifferently with strings or symbols.
In Rails, the params hash is actually a HashWithIndifferentAccess rather than a standard ruby Hash object. This allows you to use either strings like 'action' or symbols like :action to access the contents.
You will get the same results regardless of what you use, but keep in mind this only works on HashWithIndifferentAccess objects.
Copied from : Params hash keys as symbols vs strings

What's the rationale/history for the # convention of identifying methods in Ruby?

For example, I've always seen methods referred to as String#split, but never String.split, which seems slightly more logical. Or maybe even String::split, because you could consider #split to be in the namespace of String. I've even seen the method alone, when the class is assumed/implied (#split).
I understand that this is the way methods are identified in ri. Which came first?
Is this to differentiate, for example, methods from fields?
I've also heard that this helps differentiate instance methods from class methods. But where did this start?
The difference indicates how you access the methods.
Class methods use the :: separator to indicate that message can be sent to the class/module object, while instance methods use the # separator to indicate that the message can be sent to an instance object.
I'm going to pick the Complex class (in Ruby 1.9) to demonstrate the difference. You have both Complex::rect and Complex#rect. These methods have different arity and they serve entirely different purposes. Complex::rect takes a real and an imaginary argument, returning a new instance of Complex, while Complex#rect returns an array of the real and imaginary components of the instance.
ruby-1.9.1-p378 > x = Complex.rect(1,5)
=> (1+5i)
ruby-1.9.1-p378 > x.rect
=> [1, 5]
ruby-1.9.1-p378 > x.rect(2, 4) # what would this even do?
ArgumentError: wrong number of arguments(2 for 0)
from (irb):4:in `rect'
from (irb):4
from /Users/mr/.rvm/rubies/ruby-1.9.1-p378/bin/irb:17:in `<main>'
I think the reason that they don't use . as the separator for everything is that it would be ambiguous whether the method belongs to a class or an instance. Now that I'm used to Ruby doing this, I actually see it as a drawback to other languages' conventions, to be honest.
Also, this is largely unrelated to fields, because everything you can send to an object is a message, properly speaking, even if it looks like a publicly accessible field. The closest things you have to fields are attributes or instance variables, which are always prefixed with @ and are not directly accessible from outside the instance unless you are using inheritance or Object#instance_variable_get/_set.
As to specifically why they chose :: and #? :: makes sense to me because it conventionally separated namespaces, but # was probably just a symbol that wasn't used in other nomenclature and could unambiguously be recognized as an instance-method separator.
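To see the distinction the notation records, here is a short sketch. The Complex calls mirror the ones above; Temperature is a hypothetical class added to show the same split in user code.

```ruby
# Complex.rect (written Complex::rect in the docs) is sent to the class.
x = Complex.rect(1, 5)
p x        # => (1+5i)

# Complex#rect is sent to an instance.
p x.rect   # => [1, 5]

p Complex.respond_to?(:rect)                 # class-level method exists
p Complex.instance_methods.include?(:rect)   # instance-level method exists too

class Temperature
  attr_reader :celsius

  def initialize(celsius)
    @celsius = celsius
  end

  # Documented as Temperature::from_fahrenheit
  def self.from_fahrenheit(degrees)
    new((degrees - 32) * 5.0 / 9.0)
  end

  # Documented as Temperature#to_fahrenheit
  def to_fahrenheit
    celsius * 9.0 / 5.0 + 32
  end
end

p Temperature.from_fahrenheit(212).celsius   # => 100.0
p Temperature.new(100).to_fahrenheit         # => 212.0
```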
I understand that this is the way methods are identified in ri. Which came first?
Yes, this is where it came from. When you use #, it automatically hyperlinks your methods, so references to other methods in documentation began being prefixed by the # sign. See here:
Names of classes, source files, and any method names containing an underscore or preceded by a hash character are automatically hyperlinked from comment text to their description.
You can't actually invoke a method this way, however. But that shouldn't be surprising; after all, <cref ...> is an invalid statement in C# despite being a valid documentation tag.

What does :this mean in Ruby on Rails?

I'm new to the Ruby and Ruby on Rails world. I've read some guides, but I have some trouble with the following syntax.
I think that the :condition syntax is used in Ruby to define a class attribute with some kind of accessor, like:
class Sample
attr_accessor :condition
end
that implicitly declares the getter and setter for the "condition" property.
While I was looking at some Rails sample code, I found the following examples that I don't fully understand.
For example:
@post = Post.find(params[:id])
Why is it accessing the id attribute with this syntax, instead of:
@post = Post.find(params[id])
Or, for example:
@posts = Post.find(:all)
Is :all a constant here? If not, what does this code really mean? If yes, why is the following not used:
@posts = Post.find(ALL)
Thanks
A colon before text indicates a symbol in Ruby. A symbol is kind of like a constant, but it's almost as though a symbol receives a unique value (that you don't care about) as its constant value.
When used as a hash index, symbols are almost (but not exactly) the same as using strings.
Also, you can read "all" from :all by calling to_s on the symbol. If you had a constant variable ALL, there would be no way to determine that it meant "all" other than looking up its value. This is also why you can use symbols as arguments to meta-methods like attr_accessor, attr_reader, and the like.
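As a rough sketch of why a readable name matters to such meta-methods: TinyAccessors and tiny_reader are hypothetical names, and this is a simplified stand-in for attr_reader, not how Ruby implements it internally.

```ruby
module TinyAccessors
  def tiny_reader(name)
    # The symbol's text is recovered by string interpolation, which calls to_s.
    define_method(name) { instance_variable_get("@#{name}") }
  end
end

class Sample
  extend TinyAccessors
  tiny_reader :condition

  def initialize(condition)
    @condition = condition
  end
end

puts Sample.new("sunny").condition   # => sunny
puts :all.to_s                       # => all

# A constant only yields its value, not the name it was assigned to.
ALL = 1
puts ALL.to_s                        # => 1
```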
You might want to read up on Ruby symbols.
:all is a symbol. Symbols are Ruby's version of interned strings. You can think of it like this: There is an invisible global table called symbols which has String keys and Fixnum values. Any string can be converted into a symbol by calling .to_sym, which looks for the string in the table. If the string is already in the table, it returns the Fixnum; otherwise, it enters it into the table and returns the next Fixnum. Because of this, symbols are treated at run-time like Fixnums: comparison time is constant (in C parlance, comparisons of symbols can be done with == instead of strcmp).
You can verify this by looking at the object_id of objects; when two things' object_ids are the same, they're both pointing at the same object.
You can see that you can convert two strings to symbols, and they'll both have the same object id:
"all".to_sym.object_id == "all".to_sym.object_id #=> true
"all".to_sym.object_id == :all.object_id #=> true
But the converse is not true: (each call to Symbol#to_s will produce a brand new string)
:all.to_s.object_id == :all.to_s.object_id #=> false
Don't look at symbols as a way of saving memory. Look at them as indicating that the string ought to be immutable. 13 Ways of Looking at a Ruby Symbol gives a variety of ways of looking at a symbol.
To use a metaphor: symbols are for multiple-choice tests, strings are for essay questions.
This has nothing to do with Rails, it's just Ruby's Symbols. :all is a symbol which is effectively just a basic string.
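To tie this back to the original question, here is a sketch in plain Ruby. The params hash below is an ordinary Hash standing in for the Rails object, and ALL_POSTS is a hypothetical constant name that is deliberately left undefined.

```ruby
# A plain Hash standing in for Rails' params (which is a richer object).
params = { id: 42 }

p params[:id]    # => 42, the symbol :id is the key
p params["id"]   # => nil, a String is a different key in a plain Hash

# Without the colon, id is read as a local variable or method call.
begin
  params[id]
rescue NameError => e
  puts "NameError: #{e.message}"
end

# A symbol needs no prior definition, whereas a constant must be assigned first.
p defined?(ALL_POSTS)   # => nil
p :all.class            # => Symbol
```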

Resources