We caught some code in Ruby that seems odd, and I was wondering if someone could explain it:
$ irb
irb(main):001:0> APPLE = 'aaa'
=> "aaa"
irb(main):002:0> banana = APPLE
=> "aaa"
irb(main):003:0> banana << 'bbb'
=> "aaabbb"
irb(main):004:0> banana
=> "aaabbb"
irb(main):005:0> APPLE
=> "aaabbb"
Catch that? The constant was appended to at the same time the local variable was.
Known behavior? Expected?
Known behaviour. A constant doesn't stop you from modifying the object it refers to; it merely gives you a warning (and only a warning) if you assign it a different object.
In short, ruby constants aren't.
Note: This behaviour is listed in an answer to "What are the Ruby Gotchas a newbie should be warned about?" It's a worthwhile read.
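A minimal sketch of that distinction, assuming a reasonably recent Ruby. FRUIT and alias_name are made-up names for illustration. Mutating the object through either name succeeds silently, and freezing the object is what actually stops further changes:

```ruby
# FRUIT and alias_name are hypothetical names used only for this illustration.
FRUIT = 'aaa'.dup              # dup yields an unfrozen string regardless of literal settings
alias_name = FRUIT             # a second name for the same object, not a copy

alias_name << 'bbb'            # mutates the one shared object
puts FRUIT                     # prints aaabbb
puts FRUIT.equal?(alias_name)  # prints true (same object)

FRUIT.freeze                   # freezes the object itself, whatever name is used to reach it
puts alias_name.frozen?        # prints true
```

After the freeze, appending through either name raises an error instead of quietly changing the constant.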
Catch that? The constant was appended to at the same time the local variable was.
No, it wasn't appended to, and neither was the local variable.
The single object that both the constant and the local variable are referring to was appended to, but neither the constant nor the local variable was changed. You cannot modify or change a variable or constant in Ruby (at least not in the way that your question implies), the only thing you can change is objects.
The only two things you can do with variables or constants are dereferencing them and assigning to them.
The constant is a red herring here, it is completely irrelevant to the example given. The only thing that is relevant is that there is only one single object in the entire example. That single object is accessible under two different names. If the object changes, then the object changes. Period. It does not mysteriously split itself in two. Which name you use to look at the changed object doesn't matter. There is only one object anyway.
This works exactly the same as in any other programming language: if you have multiple references to a mutable object in, say, Python, Java, C#, C++, C, Lisp, Smalltalk, JavaScript, PHP, Perl or whatever, then any change to that object will be visible no matter what reference is used, even if some of those references are final or const or whatever that particular language calls it.
This is simply how shared mutable state works and is just one of the many reasons why shared mutable state is bad.
In Ruby, you can generally call the freeze method on any object to make it immutable. However, again, you are modifying the object here, so anybody else who has a reference to that object will all of a sudden find that the object has become immutable. Therefore, just to be safe, you need to copy the object first, by calling dup. But of course, that's not enough either. Think of an array, for example: if you dup the array, you get a different array, but the objects inside the array are still the same ones as in the original array. And if you freeze the array, then you can no longer modify the array, but the objects in the array may very well still be mutable:
ORIG = ['Hello']
CLONE = ORIG.dup.freeze
CLONE[0] << ', World!'
CLONE # => ['Hello, World!']
That's shared mutable state for you. The only way to escape this madness is either to give up shared state (e.g. Actor Programming: if nobody else can see it, then it doesn't matter how often or when it changes) or mutable state (i.e. Functional Programming: if it never changes, then it doesn't matter how many others see it).
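If you do want a snapshot that nobody can change, the copy and the freeze both have to go all the way down. Here is a rough sketch, assuming Ruby 2.4 or later (where dup also works on numbers and symbols). deep_copy_and_freeze is a hypothetical helper, not something Ruby ships, and it only knows about nested arrays:

```ruby
# deep_copy_and_freeze is a hypothetical helper, not part of Ruby's core library.
# It recurses into arrays only; any other object is treated as a leaf.
def deep_copy_and_freeze(obj)
  copy =
    if obj.is_a?(Array)
      obj.map { |element| deep_copy_and_freeze(element) }
    else
      obj.dup
    end
  copy.freeze
end

original = ['Hello']
snapshot = deep_copy_and_freeze(original)

original[0] << ', World!'   # the original is still mutable
puts snapshot[0]            # prints Hello
puts snapshot[0].frozen?    # prints true
```

Hashes, structs and your own classes would need their own branches, which is part of why avoiding shared mutable state is usually less work than defending against it.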
The fact that one of the two variables in the original example is actually a constant is completely irrelevant to the problem. There are two main differences between a variable and a constant in Ruby: they have different lookup rules, and constants generate a warning if they are assigned to more than once. But in this example, the lookup rules are irrelevant and the constant is assigned to only once, so there really is no difference between a variable and a constant in this case.
You can freeze constants if you want them to be unchangeable:
>> APPLE = 'aaa'
=> "aaa"
>> banana = APPLE
=> "aaa"
>> APPLE.freeze
=> "aaa"
>> banana.frozen?
=> true
>> banana << 'bbb'
TypeError: can't modify frozen string
from (irb):5:in `<<'
from (irb):5
Constants in Ruby aren't "constants". You might as well use any other name; putting them in all caps doesn't change anything, interpreter-wise, about the object. It only matters when you try to point the constant at a different object.
If you look at it that way, the behavior is obvious and necessary: APPLE is a pointer to a string object, and so is banana. You then edit the object that banana is pointing to. APPLE is pointing to that same object, so the change is reflected there too.
Related
I'm reading Eloquent Ruby, and am on Chapter 6 on Symbols. Some excerpts:
"There can only ever be one instance of any given symbol. If I mention :all twice in my code, it is always the same :all."
a = :all
b = :all
puts a.object_id, b.object_id # same objects
"Another aspect of symbols that makes them so well suited to their chosen career is that symbols are immutable - once you create that :all symbol, it will be :all until the end of time (or at least until your Ruby interpreter exits)"
What is the difference between being immutable and the fact that there can only be one instance of you?
By the way, I would like to write the previous sentence more accurately: "What is the difference between a class being immutable and the fact that there can only be one instance of the class?" Is class the right word to insert there?
How would you even go about trying to mutate a symbol, they don't seem to hold values like other variables?
Immutable means that an object cannot be changed. In Ruby, symbols are immutable. To make a symbol mutable, you have to perform type conversion to a string, which is mutable.
a = :mystring
a = a.to_s
=> "mystring"
For proof that a symbol is immutable, you can call the frozen? method on it.
:mystring.frozen?
=> true
Note that a frozen object cannot be unfrozen: Ruby has no unfreeze method, for symbols or for strings. To get a mutable version of a frozen string, make a copy with dup.
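A quick sketch of what freezing looks like from the string side, assuming plain Ruby with no extra libraries loaded. mutable_copy is a hypothetical helper name. Core String has no unfreeze method, so the way back to something mutable is a copy:

```ruby
# mutable_copy is a hypothetical helper name used for illustration.
def mutable_copy(frozen_string)
  frozen_string.dup   # dup returns an unfrozen copy; the original stays frozen
end

greeting = 'hello'.freeze
copy = mutable_copy(greeting)
copy << ' world'

puts greeting.frozen?                    # prints true
puts copy                                # prints hello world
puts String.method_defined?(:unfreeze)   # prints false
puts :mystring.frozen?                   # prints true
```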
For object ids
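In older versions of Ruby (before 2.7), the object_id of an object is derived from the VALUE that represents the object on the C level. For most objects, that VALUE points to a location in memory where the object data is stored, so the id varies from run to run depending on where the system decided to allocate its memory. Newer versions hand out ids for ordinary objects differently, but the id is still unique to each live object.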
Symbols have the same object id because they are meant to represent a SINGLE value.
To check this out, let's type to the console the same symbol multiple times.
:z.object_id
=> 636328
:z.object_id
=> 636328
:z.object_id
=> 636328
Now, let's try the same thing only with strings
"z".object_id
=> 21237740
"z".object_id
=> 24355380
As you can see, here we have two string literals "z", and each one creates a different object. Thus, they have different object_ids.
This also means that symbols can save quite a bit of memory, especially if we are dealing with big data. Because symbols are the same object, it's faster to compare them than it is to compare strings. Strings require comparing the values instead of the object ids.
Your sentence is fine; you're not sure of the common phrase used to describe a class with only one instance. I'll explain that as I go along.
An object that is immutable cannot change through any operations done on it. This means that any operation that would change a symbol would generate a new one instead.
:foo.object_id # 1520028
:foo.upcase.object_id # 70209716662240
:foo.capitalize.object_id # 70209719120060
You can certainly write objects that are immutable, or make them immutable (with some caveats) via freeze, but you can always create a new instance of them.
f = "foo"
f.freeze
f1 = "foo"
puts f.object_id == f1.object_id # false
An object that only ever has one instance of itself is considered to be a singleton.
If there's only one instance of it, then you only store it in memory once.
If you attempt to create it, you only get the previously existing object back.
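One way to see the "you only get the existing object back" part is to build the same symbol from a string and compare identities. This is only a sketch, and same_symbol? is a made-up helper name:

```ruby
# same_symbol? is a hypothetical helper for this illustration.
def same_symbol?(text, symbol)
  text.to_sym.equal?(symbol)   # equal? compares object identity, not just contents
end

puts same_symbol?('all', :all)      # prints true
puts 'all'.dup.equal?('all'.dup)    # prints false (two distinct string objects)
puts 'all'.dup == 'all'.dup         # prints true (same contents)
```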
I want to be able to write number.incr, like so:
num = 1; num.incr; num
#=> 2
The error I'm seeing states:
Can't change the value of self
If that's true, how do bang! methods work?
You cannot change the value of self
An object is a class pointer and a set of instance variables (note that this link is an old version of Ruby, because it's dramatically simpler, and thus better for explanatory purposes).
"Pointing" at an object means you have a variable which stores the object's location in memory. Then to do anything with the object, you first go to the location in memory (we might say "follow the pointer") to get the object, and then do the thing (e.g. invoke a method, set an ivar).
All Ruby code everywhere is executing in the context of some object. This is where your instance variables get saved, it's where Ruby looks for methods that don't have a receiver (e.g. $stdout is the receiver in $stdout.puts "hi", and the current object is the receiver in puts "hi"). Sometimes you need to do something with the current object. The way to work with objects is through variables, but what variable points at the current object? There isn't one. To fill this need, the keyword self is provided.
self acts like a variable in that it points at the location of the current object. But it is not like a variable, because you can't assign it a new value. If you could, the code after that point would suddenly be operating on a different object, which is confusing and has no benefits over just using a variable.
Also remember that the object is tracked by variables which store memory addresses. What is self = 2 supposed to mean? Does it only mean that the current code operates as if it were invoked on 2? Or does it mean that all variables pointing at the old object now have their values updated to point at the new one? It isn't really clear, but the former unnecessarily introduces an identity crisis, and the latter is prohibitively expensive and introduces situations where it's unclear what is correct (I'll go into that a bit more below).
You cannot mutate Fixnums
Some objects are special at the C level in Ruby (false, true, nil, fixnums, and symbols).
Variables pointing at them don't actually store a memory location. Instead, the address itself stores the type and identity of the object. Wherever it matters, Ruby checks to see if it's a special object (e.g. when looking up an instance variable), and then extracts the value from it.
So there isn't a spot in memory where the object 123 is stored. Which means self contains the idea of Fixnum 123 rather than a memory address like usual. As with variables, it will get checked for and handled specially when necessary.
Because of this, you cannot mutate the object itself (though it appears they keep a special global variable to allow you to set instance variables on things like Symbols).
Why are they doing all of this? To improve performance, I assume. A number stored in a register is just a series of bits (typically 32 or 64), which means there are hardware instructions for things like addition and multiplication. That is to say, the ALU is wired to perform these operations in a single clock cycle, rather than running algorithms written in software, which would take many orders of magnitude longer. By storing them like this, they avoid the cost of storing and looking up the object in memory, and they gain the advantage that they can directly add the two pointers using hardware. Note, however, that there are still some additional costs in Ruby that you don't have in C (e.g. checking for overflow and converting the result to Bignum).
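On current Ruby versions you can observe this from the outside: integers report themselves as frozen and refuse instance variables. The sketch below assumes Ruby 2.5 or later for the exact error class, and try_to_tag is a hypothetical helper:

```ruby
# try_to_tag is a hypothetical helper: it attempts to set an instance variable
# on whatever it is given and reports whether Ruby allowed it.
def try_to_tag(obj)
  obj.instance_variable_set(:@tag, 'x')
  true
rescue StandardError   # FrozenError on Ruby 2.5+, a plain RuntimeError before that
  false
end

puts 123.frozen?              # prints true
puts try_to_tag(123)          # prints false
puts try_to_tag('text'.dup)   # prints true

num = 1
num += 1    # rebinds the name num to another Integer; the object 1 is untouched
puts num    # prints 2
```

So the closest thing to num.incr is reassignment: the variable changes which object it names, and the number itself never changes.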
Bang methods
You can put a bang at the end of any method. It doesn't require the object to change, it's just that people usually try to warn you when you're doing something that could have unexpected side-effects.
class C
  def initialize(val)
    @val = val # => 12
  end # => :initialize
  def bang_method!
    "My val is: #{@val}" # => "My val is: 12"
  end # => :bang_method!
end # => :bang_method!
c = C.new 12 # => #<C:0x007fdac48a7428 @val=12>
c.bang_method! # => "My val is: 12"
c # => #<C:0x007fdac48a7428 @val=12>
Also, there are no bang methods on integers; it wouldn't fit with the paradigm:
Fixnum.instance_methods.grep(/!$/) # => [:!]
# Okay, there's one, but it's actually a boolean negation
1.! # => false
# And it's not a Fixnum method, it's an inherited boolean operator
1.method(:!).owner # => BasicObject
# In reality, you call it this way, and the interpreter translates it
!1 # => false
Alternatives
Make a wrapper object: I'm not going to advocate this one, but it's the closest to what you're trying to do. Basically, create your own class, which is mutable, and then make it look like an integer. There's a great blog post walking through this at http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html which will get you 95% of the way there.
Don't depend directly on the value of a Fixnum: I can't give better advice than this without knowing what you're trying to do / why you feel this is a need.
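For the wrapper idea, something like the following would be a bare-bones starting point. Counter is a hypothetical class written for this answer. It is not the code from the linked blog post, and it makes no attempt to behave like a full numeric type:

```ruby
# Counter is a hypothetical wrapper class used only for illustration.
class Counter
  attr_reader :value

  def initialize(value = 0)
    @value = value
  end

  # Rebinds @value to the next Integer; the Integer objects themselves never change.
  def incr
    @value += 1
    self
  end

  def to_i
    @value
  end
end

num = Counter.new(1)
num.incr
puts num.value   # prints 2
```

The mutation happens to the wrapper, which is an ordinary object, so every variable that refers to the same Counter sees the new value.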
Also, you should show your code when you ask questions like this. I misunderstood how you were approaching it for a long time.
It's simply impossible to change self to another object. self is the receiver of the message send. There can be only one.
If that's true, how do bang! methods work?
The bang (!) is simply part of the method name. It has absolutely no special meaning whatsoever. It is a convention among Ruby programmers to name surprising variants of less surprising methods with a bang, but that's just that: a convention.
I heard that everything in ruby is object. I replied in an interview that a variable is an object, and the interviewer said NO. Anybody know the truth?
"In ruby, everything is an object" is basically true.
But more accurately, I would say that any value that can be assigned to a variable or returned from a method is an object. Is a variable an object? Not really. A variable is simply a name for an object (also known as a "pointer") that allows you to locate it in memory and do stuff with it.
shajin = Person.new()
In this snippet, we have a variable shajin, which points to an object (an instance of the Person class). The variable is simply the identifier for an object, but is not the object itself.
I think it was a trick question. Ultimately, object orientation is a feature for humans to understand complex programs, but computers are not object oriented themselves. Drill down enough layers and objects cease to exist in any language.
So perhaps it's more fair to say: "In ruby, everything important is an object".
Why not go directly to the source? The Ruby Language Specification couldn't be more clear and obvious (emphasis added by me):
6.2 Variables
6.2.1 General description
A variable is denoted by a name, and refers to an object, which is called the value of the variable.
A variable itself is not an object.
http://www.techotopia.com/index.php/Understanding_Ruby_Variables
"A variable in Ruby is just a label for a container.
A variable could contain almost anything - a string, an array, a hash.
A variable name may only contain lowercase letters, numbers, and underscores.
A variable name should ideally make sense in the context of your program."
"We'll begin with the fact that Ruby is a completely object-orientated language. Every value is an object (...)." (The Ruby Programming Language, Flanagan & Matsumoto, page 2).
Note this book, co-authored by the language creator, does not state "everything is an object".
a = 1
1 is an object, 'a' is a reference to the 1 object. If 'a' was an object on its own, it would have an object_id of its own. But:
1.object_id #=> 3
a.object_id #=> 3
Also, methods are not really objects (but you can turn them into objects if needed).
@Alex Wayne's and @Jörg W Mittag's answers are correct, but I would like to add that not "everything important" is an object. Methods and blocks, for example, are not objects, but they can be converted to objects with the method method and proc, respectively.
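A small sketch of those two conversions, using only core Ruby. capture is a hypothetical helper name:

```ruby
# capture is a hypothetical helper that hands back the block it receives as a Proc.
def capture(&block)
  block
end

plus = 5.method(:+)       # Object#method wraps an existing method in a Method object
puts plus.class           # prints Method
puts plus.call(3)         # prints 8

doubler = capture { |n| n * 2 }
puts doubler.class        # prints Proc
puts doubler.call(21)     # prints 42
```

In both cases the object is created on request; the method or block itself was not an object you could hold in a variable beforehand.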
Is this a ruby bug?
target_url_to_edit = target_url
if target_url_to_edit.include?("http://")
target_url_to_edit["http://"] = ""
end
logger.debug "target url is now #{target_url}"
This returns target_url without http://
You need to duplicate the in-memory object because variable names are just references to in-memory objects:
target_url_to_edit = target_url.dup
Now target_url_to_edit gets assigned a new copy of the original object.
For your case this code probably does the same in just one line (no dup, no if):
target_url_to_edit = target_url.sub(%r{^http://}, "")
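As a self-contained sketch of the non-mutating approach (strip_scheme is a hypothetical helper name, and the URL is made up), note that sub always hands back a new string, so the original is left alone:

```ruby
# strip_scheme is a hypothetical helper name.
def strip_scheme(url)
  url.sub(%r{\Ahttp://}, '')   # sub returns a new string; url itself is not modified
end

target_url = 'http://example.com/page'
target_url_to_edit = strip_scheme(target_url)

puts target_url_to_edit   # prints example.com/page
puts target_url           # prints http://example.com/page
```

This version anchors with \A instead of ^ so that only the very start of the string matches, which matters if the string could ever contain newlines.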
No, this is not a bug in Ruby, this is just how shared mutable state works, not just in Ruby but in any programming language.
Think about it this way: my mom calls me "son", my friends call me "Jörg". If I cut my hair, then it doesn't matter which name you use to refer to me: I am the same person, regardless of whether you call me "son" or "Jörg" or "Mr. Mittag" or "hey, douchebag", therefore my hair will always be short. It doesn't magically grow back if you call me by a different name.
The same thing happens in your code: you refer to the string by two different names, but it doesn't matter which name you use; if the string changes, then it changes.
The solution is, of course, to not share mutable state and to not mutate shared state, like in @hurikhan77's answer.
That is not a bug. It is the intended behavior because target_url_to_edit points to the same object in memory as target_url since Ruby uses references for object assignment. If you know C, it is similar to pointers.
Here is how to work on a copy instead, so that the original is left alone (note the dup call):
target_url_to_edit = target_url.to_s.dup
if target_url_to_edit.include?("http://")
target_url_to_edit["http://"] = ""
end
logger.debug "target url is now #{target_url}"
And just like many things in ruby, hard to find where it's documented...
In Ruby, some methods end with a question mark (?) and ask a question: include?, for example, asks whether the object in question is included, and returns true or false.
But why do some methods have exclamation marks (!) where others don't?
What does it mean?
In general, methods that end in ! indicate that the method will modify the object it's called on. Ruby calls these "dangerous methods" because they change state that someone else might have a reference to. Here's a simple example for strings:
foo = "A STRING" # a string called foo
foo.downcase! # modifies foo itself
puts foo # prints modified foo
This will output:
a string
In the standard libraries, there are a lot of places you'll see pairs of similarly named methods, one with the ! and one without. The ones without are called "safe methods", and they return a copy of the original with changes applied to the copy, with the receiver unchanged. Here's the same example without the !:
foo = "A STRING" # a string called foo
bar = foo.downcase # doesn't modify foo; returns a modified string
puts foo # prints unchanged foo
puts bar # prints newly created bar
This outputs:
A STRING
a string
Keep in mind this is just a convention, but a lot of Ruby classes follow it. It also helps you keep track of what's getting modified in your code.
The exclamation point means many things, and sometimes you can't tell a lot from it other than "this is dangerous, be careful".
As others have said, in standard methods it's often used to indicate a method that causes an object to mutate itself, but not always. Note that many standard methods change their receiver and don't have an exclamation point (pop, shift, clear), and some methods with exclamation points don't change their receiver (exit!). See this article for example.
Other libraries may use it differently. In Rails an exclamation point often means that the method will throw an exception on failure rather than failing silently.
It's a naming convention, but many people use it in subtly different ways. In your own code, a good rule of thumb is to use it whenever a method is doing something "dangerous", especially when two methods with the same name exist and one of them is more "dangerous" than the other. "Dangerous" can mean nearly anything though.
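If you want to follow the convention in your own classes, the usual shape is a safe method plus a bang twin. This is only a sketch of one way to do it. Title, normalize and normalize! are hypothetical names:

```ruby
# Title is a hypothetical class showing a safe method paired with a
# "dangerous" in-place variant.
class Title
  attr_reader :text

  def initialize(text)
    @text = text
  end

  # Safe variant: returns a new Title and leaves the receiver alone.
  def normalize
    Title.new(@text.strip.downcase)
  end

  # Bang variant: changes the receiver in place and returns it.
  def normalize!
    @text = @text.strip.downcase
    self
  end
end

title = Title.new('  Hello World ')
puts title.normalize.text   # prints hello world
puts title.text.inspect     # prints "  Hello World "
title.normalize!
puts title.text             # prints hello world
```

Nothing in the language enforces the difference; the ! only documents your intent to whoever calls the method.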
This naming convention is lifted from Scheme.
1.3.5 Naming conventions
By convention, the names of procedures
that always return a boolean value
usually end in ``?''. Such procedures
are called predicates.
By convention, the names of procedures
that store values into previously
allocated locations (see section 3.4)
usually end in ``!''. Such procedures
are called mutation procedures. By
convention, the value returned by a
mutation procedure is unspecified.
! typically means that the method acts upon the object instead of returning a result. From the book Programming Ruby:
Methods that are "dangerous," or modify the receiver, might be named with a trailing "!".
It is most accurate to say that methods with a Bang! are the more dangerous or surprising version. There are many methods that mutate without a Bang, such as .destroy, and in general methods only have bangs where a safer alternative exists in the core lib.
For instance, on Array we have .compact and .compact!. .compact returns a new array with the nils removed and leaves the receiver untouched, while .compact! removes them in place and returns nil instead of self if there were no nils in the array, which is more surprising than just returning self.
The only non-mutating method I've found with a bang is Kernel's .exit! which is more surprising than .exit because you cannot catch SystemExit while the process is closing.
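To check the compact pair for yourself, here is a small sketch. describe_compact is a hypothetical helper that runs each variant on its own copy of the input and records what came back and whether that copy was changed:

```ruby
# describe_compact is a hypothetical helper for this illustration.
def describe_compact(array)
  plain_input = array.dup
  bang_input  = array.dup
  {
    compact_result: plain_input.compact,     # new array, receiver untouched
    compact_mutated: plain_input != array,
    bang_result: bang_input.compact!,        # receiver changed in place, or nil if no nils
    bang_mutated: bang_input != array
  }
end

p describe_compact([1, nil, 2])
p describe_compact([1, 2])

stack = [1, 2, 3]
stack.pop     # no bang, yet the receiver is modified
p stack       # prints [1, 2]
```

For [1, nil, 2] only the bang variant reports a mutated receiver, and for [1, 2] the bang variant's result is nil.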
Rails and ActiveRecord continue this trend in that they use the bang for more 'surprising' effects, like .create!, which raises errors on failure.
From themomorohoax.com:
A bang can be used in the below ways, in order of my personal preference.
An active record method raises an error if the method does not do
what it says it will.
An active record method saves the record or a method saves an
object (e.g. strip!)
A method does something “extra”, like posts to someplace, or does
some action.
The point is: only use a bang when you’ve really thought about whether
it’s necessary, to save other developers the annoyance of having to
check why you are using a bang.
The bang provides two cues to other developers.
that it’s not necessary to save the object after calling the
method.
when you call the method, the db is going to be changed.
Simple explanation:
foo = "BEST DAY EVER" #assign a string to variable foo.
=> foo.downcase #call method downcase, this is without any exclamation.
"best day ever" #returns the result in downcase, but no change in value of foo.
=> foo #call the variable foo now.
"BEST DAY EVER" #variable is unchanged.
=> foo.downcase! #call destructive version.
=> foo #call the variable foo now.
"best day ever" #variable has been mutated in place.
Once downcase! has been called, as in the last step above, foo is lowercase for good: downcase! does not return a new string object but changes the existing string in place.
I suggest you don't use downcase! unless it is totally necessary.
!
I like to think of this as an explosive change that destroys all that has gone before it. Bang or exclamation mark means that you are making a permanent saved change in your code.
If you use, for example, Ruby's method for global substitution, gsub!, the substitution you make is permanent.
Another way you can imagine it, is opening a text file and doing find and replace, followed by saving. ! does the same in your code.
Another useful reminder, if you come from the bash world: sed -i has a similar effect of making a permanent, saved change.
Bottom line: ! methods just change the value of the object they are called upon, whereas a method without ! returns a manipulated value without writing over the object the method was called upon.
Only use ! if you do not plan on needing the original value stored at the variable you called the method on.
I prefer to do something like:
foo = "word"
bar = foo.capitalize
puts bar
OR
foo = "word"
puts foo.capitalize
Instead of
foo = "word"
foo.capitalize!
puts foo
Just in case I would like to access the original value again.
These are called "destructive methods". They change the original object you call them on.
numbers = [1, 0, 10, 5, 8]
numbers.collect { |n| puts n * 2 } # prints each number doubled; returns [nil, nil, nil, nil, nil] because puts returns nil
numbers # still the original [1, 0, 10, 5, 8]
numbers.collect! { |n| puts n * 2 } # prints each number doubled and overwrites every element with nil, the return value of puts
numbers # now [nil, nil, nil, nil, nil]
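Without the puts inside the block, the difference between the two is easier to see. A minimal sketch, where doubled and double! are hypothetical helper names:

```ruby
# doubled and double! are hypothetical helper names.
def doubled(numbers)
  numbers.collect { |n| n * 2 }    # returns a new array
end

def double!(numbers)
  numbers.collect! { |n| n * 2 }   # replaces each element in place and returns the receiver
end

numbers = [1, 0, 10, 5, 8]
p doubled(numbers)   # prints [2, 0, 20, 10, 16]
p numbers            # prints [1, 0, 10, 5, 8]
double!(numbers)
p numbers            # prints [2, 0, 20, 10, 16]
```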
My answer explains the significance of Ruby methods with exclamation marks (bangs) in the context of Ruby on Rails (RoR) model validations.
Essentially, whenever developers define Model validations (explained here), their ultimate goal is to decline a database record change & raise/throw the relevant exception(s) in case invalid data has been submitted to update the record in question.
RoR ActiveRecord gem defines various model manipulation methods (Ruby on Rails guides.). Among the methods, the valid? method is the only one that triggers validation without database action/modification. The rest of the methods attempt to change the database.
These methods trigger callbacks whenever they run. Some of the methods in the list feature a sister method with a bang. What is the difference between the two? It has to do with how a failed record validation is reported.
Methods without the bang merely return false in the event of record validation failure, while the methods with a bang raise an exception, which can then be handled appropriately in code.
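The pattern can be imitated in plain Ruby to make the difference concrete. To be clear, this is a sketch and not ActiveRecord: Record, RecordInvalid and the blank-name rule below are hypothetical stand-ins:

```ruby
# Plain-Ruby imitation of the pattern. Record and RecordInvalid are hypothetical
# stand-ins and do not use or reproduce ActiveRecord itself.
class RecordInvalid < StandardError; end

class Record
  attr_reader :name

  def initialize(name)
    @name = name
    @saved = false
  end

  def valid?
    !@name.to_s.empty?
  end

  # Non-bang variant: reports failure through its return value.
  def save
    return false unless valid?

    @saved = true
  end

  # Bang variant: reports failure by raising.
  def save!
    save || raise(RecordInvalid, 'Validation failed: name is blank')
  end
end

puts Record.new('').save       # prints false
begin
  Record.new('').save!
rescue RecordInvalid => e
  puts e.message               # prints Validation failed: name is blank
end
```

With the non-bang variant the caller has to remember to check the return value; with the bang variant a failure cannot be ignored by accident.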
Just as a heads-up, since I experienced this myself.
In Ruby, many ! methods in the core library mutate the object and return it when something actually changed. Otherwise they return nil.
So, if you are doing some kind of operations on an array, for example, and call the method .compact! when there is nothing to compact, it will return nil.
Example:
arr = [1, 2, 3, nil]
arr.compact!
=> [1, 2, 3]
Run again arr.compact!
=> nil
It is better to refer to the array arr explicitly again if you need to use it down the line; otherwise you will get the nil value.
Example:
arr = [1, 2, 3]
arr.compact! # => nil
arr # to get the value of the array
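Two ways to sidestep the nil, sketched with hypothetical helper names (without_nils and strip_nils!): either use the non-bang method when you need the result, or call the bang method for its side effect and then return the array yourself.

```ruby
# without_nils and strip_nils! are hypothetical helper names.
def without_nils(array)
  array.compact      # always returns an array, never nil
end

def strip_nils!(array)
  array.compact!     # may return nil when nothing was removed
  array              # so hand back the receiver explicitly
end

arr = [1, 2, 3]
p arr.compact!        # prints nil
p without_nils(arr)   # prints [1, 2, 3]
p strip_nils!(arr)    # prints [1, 2, 3]
```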