Ruby array changes by changing a 'copy' of one of its elements - ruby

I'm trying to confirm whether my understanding is correct of these six lines of code:
string="this is a sentence"
words=string.split
first_word=words[0]
first_word[0]=first_word[0].upcase
out=words.join(" ")
puts(out)
which prints "This is a sentence" (with the first letter capitalized).
It would appear that changing the "first_word" string, which is defined as the first element of the "words" array, also changes the original "words" array. Is this indeed Ruby's default behavior? Does it not make it more difficult to track where in the code changes to the array take place?

You just need need to distinguish between a variable and an object. Your string is an object. first_word is a variable.
Look for example
a = "hello"
b = a
c = b
now all variables contain the same object, a string with the value "hello". We say they reference the object. No copy is made.
a[0] = 'H'
This changes the first character of the object, a string which now has the value "Hello". Both b and c contain the same, now changed object.
a = "different"
This assigns a new object to the variable a. b and c still hold the original object.
Is this Rubys default behaviour? yes. And it also works like this in many other programming languages.
Does it make it difficult to track changes? Sometimes.
If you takes an element from an array (like your first_word), you need to know:
If you change the object itself, no matter how you access it,
all variables will still hold your object, which just happened to be changed.
But if you replace the object in the array, like words[0] = "That", then all your other variables will still hold the original object.

This behavior is caused by how ruby does pass-by-value and pass-by-reference.
This is probably one of the more confusing parts of Ruby. It is well accepted that Ruby is a pass-by-value, high level programming language. Unfortunately, this is slightly incorrect, and you have found yourself a perfect example. Ruby does pass-by-value, however, most values in ruby are references. When Ruby does an assignment of a simple datatypes, integers, floats, strings, it will create a new object. However, when assigning objects such as arrays and hashes, you are creating references.
original_hash = {name: "schylar"}
reference_hash = original_hash
reference_hash[:name] = "!schylar"
original_hash #=> "!schylar"
original_array = [1,2]
reference_array = original_array
reference_array[0] = 3
reference_array #=> [3,2]
original_fixnum = 1
new_object_fixnum = original_fixnum
new_object_fixnum = 2
original_fixnum #=> 1
original_string = "Schylar"
new_object_string = original_string
new_object_string = "!Schylar"
original_string #=> "Schylar'
If you find yourself needing to copy by value, you may re-think the design. A common way to pass-by-value complex datatypes is using the Marshal methods.
a = {name: "Schylar"}
b = Marshal.load(Marshal.dump(a))
b[:name] = "!!!Schylar"
a #=> {:name => "Schylar"}

Related

Ruby map gives weird results

Consider the following code
a="123456789"
t=[[1,4],[3,4],[4,5],[1,2]]
p t.map{|x,y|
a[x],a[y]=a[y],a[x]
#p a
a
}
I know ruby map method collects the last expression of the given block but when using the above code to swap the chars in a using the indexes in t won't succeeds.My intention was to collect the state of a after each swap in the index of t.But map always gives the array of a which is in the last state ie)["135264789", "135264789", "135264789", "135264789"].
The results shows that the map method have collected the final result of a after completing each indexes in t.But when printing the a after each swap prints correct value of a at each state.
Is this the correct behavior or am i missing something?
This is because the String#[]= method mutates the string.
Quick fix would be something like this:
a="123456789"
t=[[1,4],[3,4],[4,5],[1,2]]
p t.map{|x,y]
b = "#{a}" # IMPORTANT - this builds a new string
b[x],b[y]=b[y],b[x] # this mutates the new string
#p b
b
}
An alternative to "#{a}" would be to say a.clone, it does the same thing in this case.
The reason this works, is because instead of directly modifying a with a[x],a[y]=a[y],a[x], you're making a temporary copy of a and modifying that instead
edit - I misread the question - if you want to show the result of chaining each operation on the previous result, use dup/clone after the modification as Stefan said in his answer
Is my understanding correct?
Yes, I believe it is. I second what Max says, and I'll also elaborate a bit in case it helps.
Each b is a newly created object because it gets created inside the block, so it gets recreated with every new iteration. The a is created outside the block, so the same object (a) keeps getting referenced inside the block for each iteration.
You can better understand how this works by experimenting with #object_id. Try running this code:
a="123456789"
t=[[1,4],[3,4],[4,5],[1,2]]
p t.map { |x,y|
b = "#{a}" # IMPORTANT - this builds a new string
b[x],b[y]=b[y],b[x]
p "a.object_id = #{a.object_id}"
p "b.object_id = #{b.object_id}"
b
}
You will notice that a is the same object for each iteration of the #map method, while b is a new one.
This is an example of the concept of a closure. A closure is some sort of enclosed code structure that retains access to whatever state is available in the context in which it was created, while that context doesn't have access to its, the enclosed code's, state. Sort of like a "one way mirror": the enclosed code can see outside, but the outside can't see into the enclosed code.
In Ruby, closures are implemented as blocks: blocks are closures. So, everything that is visible to whatever context a block is created in (in this case, main) is also visible to that block, although the reverse is not true — for example, you can't reference b from outside the block. (Methods are not closures: if your block were a method, it wouldn't be able to see a unless you passed it in as an argument to your method.)
So, as Max says, when you make changes to a inside your block, you are actually changing (mutating) the same a that you defined up top each time.
Side topic
Now, if you are referencing individual characters in strings it's important to understand that the underlying structure of strings differs from that of arrays. Also, arrays behave differently when you mutate their elements from strings when you mutate their characters.
I'm mentioning this because I have this vague feeling that you are thinking of string character references as pretty much analogous to array element references. This is pretty much only true with respect to syntax.
You may find the results of running this code interesting:
a = '123456789'
p a.object_id
p a[0].object_id
p a[1].object_id
a[0] = '7'
p a.object_id
p a[0].object_id
p a[1].object_id
puts
a = '123456789'.chars
p a.object_id
p a[0].object_id
p a[1].object_id
a[0] = '7'
p a.object_id
p a[0].object_id
p a[1].object_id
In particular, a comparison of the four outputs of a[1].object_id should be instructive, because it shows where strings and arrays differ. If you reassign an element in an array, that element and only that element gets a new object id. If you reassign a character in a string, the string object itself remains the same, but every character in the string gets recreated.
Since you are returning a from map, the result will contain a four times. Those a's all refer to the same object.
You probably want to return a copy of a to preserve its current state:
a = '123456789'
t = [[1, 4], [3, 4], [4, 5], [1, 2]]
r = t.map { |x, y|
a[x], a[y] = a[y], a[x]
a.dup
}
r #=> ["153426789", "153246789", "153264789", "135264789"]

Testing order using a hashmap

I'm a ruby beginner and reading that a Hash does not have order. I tried to play with that concept but found I could still order things like so:
Travel_Plans = Hash.new
Travel_Plans[4] = "Colorado Springs"
Travel_Plans[1] = "Santa Fe"
Travel_Plans[2] = "Raton"
Travel_Plans[5] = "Denver"
Travel_Plans[3] = "Pueblo"
puts Travel_Plans.sort
Could someone explain what is meant by "Hash does not have order"?
If you could provide a simple example that would be great.
Ruby's Hash class represents a "hash map" or "key-value dictionary" in conventional terms. These are intended to be structures that allow quick random-access to individual elements, but the elements themselves have no intrinsic ordering.
Internally Ruby's Hash organizes elements into their various locations in memory using the hash method each object must provide to be used as a key. Ruby's Hash is unusually, if not ludicrously flexible, in that an object, any object, can be used as a key, and it's preserved exactly as-is. Contrast with JavaScript where keys must be strings and strings only.
That means you can do this:
{ 1 => 'Number One', '1' => 'String One', :one => 'Symbol One', 1.0 => 'Float One }
Where that has four completely different keys.
This is in contrast to Array where the ordering is an important part of how an array works. You don't want to have a queue where things go in one order and come out in another.
Now Ruby's Hash class used to have no intrinsic order, but due to popular demand now it stores order in terms of insertion. That is, the first items inserted are "first". Normally you don't depend on this behaviour explicitly, but it does show up if you're paying attention:
a = { x: '1', y: '2' }
# => {:x=>"1, :y=>"2"}
b = { }
b[:y] = '2'
b[:x] = '1'
b
# => {:y=>"2", :x=>"1"}
Note that the order of the keys in b are reversed due to inserting them in reverse order. They're still equivalent though:
a == b
# => true
When you call sort on a Hash you actually end up converting it to an array of key/value pairs, then sorting each of those:
b.sort
# => [[:x, "1"], [:y, "2"]]
Which you could convert back into a Hash if you want:
b.sort.to_h
# => {:x=>"1", :y=>"2"}
So now it's "ordered" properly. In practice this rarely matters, though, as you'll be accessing the keys individually as necessary. b[:x] doesn't care where the :x key is, it always returns the right value regardless.
Some things to note about Ruby:
Don't use Hash.new, instead just use { } to represent an empty Hash structure.
Don't use capital letters for variables, they have significant meaning in Ruby. Travel_Plans is a constant, not a variable, because it starts with a capital letter. Those are reserved for ClassName and CONSTANT_NAME type usage. This should be travel_plans.
First, the statement "[h]ash does not have order" is wrong as of today. It used to be true for really old and outdated versions of Ruby. You seem to have picked outdated information, which would be unreliable as of today.
Second, the code you provided:
puts Travel_Plans.sort
is irrelevant to showing your point that the hash Travel_Plans has, i.e. preserves, the order. What you should have done to check whether the order is preserved, is to simply do:
p Travel_Plans
which would always result in showing the keys in the order 4, 1, 2, 5, 3, which matches the order in which you assigned the key-values to the hash, thus indeed shows that hash preserves the order.

Dynamically Modify Ruby Instance Variable in Array

I have an array ["moniker", #moniker] where the moniker can be any one of around 100 instance variables and its string representation. I want to change what the instance variable located at index 1 is referencing (not that data itself, which may very well be immutable). Just doing array[1] = newData doesn't work because it just changes whats in the array. I know this would be simple in C, but I'm struggling to find a way to do this in Ruby.
Your struggle is because you are thinking like a C programmer, where you have access to the underlying pointers, and where everything is mutable. In C, The array would store a pointer to a mutable integer, and you could change the integer whenever you want. In Ruby, every variable is a reference to an object, and numbers are immutable objects. So, #moniker is a reference to an object, the integer 4. When you create the array, you copy that reference into the array, so now the integer 4 has two references: One from #moniker, and one from the array. As you have found, changing the reference in the array does not change the reference named #moniker--it still refers to the object 4.
"Box" a reference in an array
This is not really a Ruby way of doing things. I'm showing it because it might help to illustrate how Ruby works with references.
You can box a reference in an array:
#moniker = [4]
a = ["moniker", #moniker]
This requires you to deference the array when you want access to the underlying object:
#moniker.first
a[1].first
But now you can change the underlying integer in #moniker and the array will see the change:
#moniker[0] = 42
p a[1].first # => 42
Encapsulate the number in a mutable object.
Being an object oriented language, you might encapsulate that number in a mutable object.
class Moniker
attr_accessor :value
def initialize(value)
#value = value
end
end
(attr_accessor :value builds reader and writer methods for the instance variable #value).
#moniker = Moniker.new(4)
a = ["monikier", #moniker]
#moniker.value = 42
p a[1].value # => 42
You would obviously chose a better name than "value." I couldn't because I don't know what the value represents.
Why these two solutions work
This was a comment by Jörg W Mittag, but it deserves to be part of the answer:
It may seem obvious, but I wanted to mention it explicitly: the two solutions are the same solution. The first uses an already existing class with generic semantics, the the second defines a new class with precise semantics for the specific encapsulated value. But in both cases, it's about wrapping the immutable value in a mutable value and mutating the "outer" value.
#moniker never got into the array but its value did.
In IRB:
#moniker = 4
a = ["moniker", #moniker]
=> ["moniker", 4]
You're just working with the value in the array anyway so just change it and you're good to go:
a[1] = 5
a
=> ["moniker", 5]
You might want to consider a hash:
h = {:moniker => #moniker}
=> {:moniker=>4}
h[:moniker] = 5
h
=> {:moniker=>5}

Creating Copies In Ruby

I have the following code as an example.
a = [2]
b = a
puts a == b
a.each do |num|
a[0] = num-1
end
puts a == b
I want b to refer to a's value, and the value of b not to change when a is changed.(The second puts should return false).
Thank you in advance.
Edited-
The answer posted by user2864740 seems to work for the example I gave. However, I'm working on a sudoku solving program, and it doesn't seem to work there.
#gridbylines = [[1,0,0,9,2,0,0,0,0],
[5,2,4,0,1,0,0,0,0],
[0,0,0,0,0,0,0,7,0],
[0,5,0,0,0,8,1,0,2],
[0,0,0,0,0,0,0,0,0],
[4,0,2,7,0,0,0,9,0],
[0,6,0,0,0,0,0,0,0],
[0,0,0,0,3,0,9,4,5],
[0,0,0,0,7,1,0,0,6]]
save = Array.new(#gridbylines) #or #gridbylines.dup
puts save == #gridbylines #returns true
puts save.equal?(#gridbylines) #returns false
#gridbylines[0][0] = 'foo'
puts save.equal?(#gridbylines) #returns false
puts save == #gridbylines #returns true, but I want "save" not to change when I change "#gridbylines"
Does this have something to do with the fact that I'm using a global variable, or the version of Ruby I'm using, or even because it's a multidimentional array unlike the previous example?
Variables name or "refer to" objects1. In the code above the same object (which has two names, a and b) is being changed.
A simple solution in this case is to make a (shallow) copy of the original Array object, such as b = a.dup or b = Array.new(a). (With a shallow copy, elements in the array are also shared and will exhibit the similar phoneme as the original question unless they to are [recursively] duplicated, etc.2)
a = [2]
b = Array.new(a) # create NEW array object, a shallow-copy of `a`
puts a == b # true (same content)
puts a.equal?(b) # false (different objects)
a.each do |num|
a[0] = num-1 # now changing the object named by `a` does not
# affect the object named by `b` as they are different
end
puts a == b # false (different content)
And an isolated example of this "naming" phenomena (see the different equality forms):
a = []
b = a # assignment does NOT make a copy of the object
a.equals?(b) # true (same object)
c = a.dup # like Array.new, create a new shallow-copy object
a.equals?(c) # false (different object)
1 I find it most uniform to talk about variables being names, as such a concept can be applied across many languages - the key here is that any object can have one or more names, just as a person can have many nicknames. If an object has zero names then it is no longer strongly reachable and, just like a person, is forgotten.
However, another way to view variables (naming objects) is that they hold reference values, where the reference value identifies an object. This leads to phrasing such as
"variable a contains a reference [value] to object x" - or,
"variable a refers to / references object x" - or, as I prefer,
"variable a is a name for object x".
For the case or immutable "primitive" or "immediate" values the underlying mechanics are slightly different but, being immutable the object values cannot be changed and such a lack-of-shared object nature will not manifest itself.
See also:
String assignment by reference/copy? (examines object relationship after assignment)
Strange Feature? of Ruby Arrays (examines same-object mutation)
Is Ruby pass by reference or by value? (assignment works in the same way as argument passing)
2 As per the updated question with nested arrays, this is explained by the previous rules - the variables (well, really expressions) still name shared objects. In any case, one way to "clone an array of arrays" (to two levels, although not recursively) is to use:
b = a.map {|r| r.dup}
This is because Array#map returns a new array with the mapped values which are, in this case, duplicates (shallow clones) of the corresponding nested arrays.
See How to create a deep copy of an object in Ruby? for other "deep[er] copy" approaches - especially if the arrays (or affect mutable objects) were nested to N-levels.

Strange Feature? of Ruby Arrays

I came around this strange feature(?) of arrays in Ruby and it would be very helpful if someone could explain to me why they work the way they do.
First lets give an example of how things usually work.
a = "Hello" #=> "Hello"
b = a #=> "Hello"
b += " Goodbye" #=> "Hello Goodbye"
b #=> "Hello Goodbye"
a #=> "Hello"
Ok cool, when you use = it creats a copy of the object (this time a string).
But when you use arrays this happens:
a = [1,2,3] #=> [1,2,3]
b = a #=> [1,2,3]
b[1] = 5 #=> [1,5,3]
b #=> [1,5,3]
a #=> [1,5,3]
Now thats just strange. Its the only object I've found that doesn't get copied when using = but instead just creates a refrance to the original object.
Can someone also explain (there must be a method) for copying an array without having it point back to the original object?
Actually, you should re-examine your premise.
The string assignment is really b = b + " Goodbye". The b + " Goodbye" operation returns an entirely new string, so the variable b is pointing to a new object after the assignment.
But when you assign to an individual array element, you are not creating an entirely new array, so a and b continue to point to the same object, which you just changed.
If you are looking for a rationale for the mutating vs functional behavior of arrays, it's simple. There is nothing to be gained by modifying the string. It is most likely necessary to allocate new memory anyway, so an entirely new string is created.
But an array can be arbitrarily large. Creating a new array in order to change just one element could be hugely expensive. And in any case, an array is like any other composite object. Changing an individual attribute does not necessarily affect any other attributes.
And to answer your question, you can always do:
b = a.dup
What happends there is that ruby is treating the Array object by reference and not by value.
So you can see it as this:
b= [1,2,3]
a= b
--'b' Points to---> [1,2,3] <--'a' points t---
So as you can see both point to the same reference, that means that if you change anything in a it will be reflected on b.
As for your question on the copying the object you could use the Object#clone method to do so.
Try your Array case with a String:
a = "Hello" #=> "Hello"
b = a #=> "Hello"
b[1] = "x" #=> "x"
b #=> "Hxllo"
a #=> "Hxllo"
Strings and Arrays work the same way in this regard.
The key difference in the two cases, as you wrote them, is this:
b += " Goodbye"
This is syntactic shorthand for
b = b + " Goodbye"
which is creating a new string from b + " Goodbye" and then assigning that to b. The way to modify an existing string, rather than creating a new one, is
b << " Goodbye"
And if you plug that into your sequence, you'll see that it modifies both a and b, since both variables refer to the same string object.
As for deep copying, there's a decent piece about it here:
http://ruby.about.com/od/advancedruby/a/deepcopy.htm

Resources