change contents of shallow copied array of strings in ruby - ruby

Suppose I create the following arrays in ruby:
a = ["apple", "cherry"]
b = a.dup
Here, b is a shallow copy of a. So if I do:
a.each{|fruit| fruit << " pie"}
I get both a and b equal to ["apple pie", "cherry pie"]. No problem there. But suppose I change one element of b:
b[1] = "blueberry"
and issue the same "each" command. Now b is ["apple pie", "blueberry"], because a[0] and b[0] are the same strings, but a[1] and b[1] are different strings. I could run the command on both a and b, but then b is ["apple pie pie", "blueberry pie"], because I have run the append operation on the same string twice.
Is there a way to modify in place all the strings of a and b, without duplicates. In this simple example, I could test for the substring " pie", but this wouldn't work for other types of changes (such as deleting the first character).
I tried creating a set containing all the strings, so that each would be unique; but it seems the set creation copies the strings, so they cannot be modified in place. Is there a way to test if two strings are the same in memory? I have googled for that, but found nothing.
The application of this is that I have big arrays of strings, which I "dup" to create a history of them. Now I want to apply a change to the entire history, without double applying (or triple, etc) the changes.

I don't understand your use case; I suspect you're making things more complicated than they need to be.
Is there a way to test if two strings are the same in memory?
Object#object_id is what you're looking for.
Is there a way to modify in place all the strings of a and b, without duplicates?
You could keep a set of not all the object_ids, similar to what you were already trying. You can retrieve the string with ObjectSpace#_id2ref. Something like this:
require 'set'
set = Set.new
a = ["apple", "cherry"]
b = a.dup
b[1] = "blueberry"
# Collect unique string objects
a.each{|s| set << s.object_id}
b.each{|s| set << s.object_id}
# Make pie with each unique string object
set.each{|id| ObjectSpace._id2ref(id) << " pie"}
a
# => ["apple pie", "cherry pie"]
b
# => ["apple pie", "blueberry pie"]
That seems a bit crazy to me, though. Again, I think there's probably a better way to do what you're trying to do, but it's hard to tell based on the information provided.

Related

Ruby map gives weird results

Consider the following code
a="123456789"
t=[[1,4],[3,4],[4,5],[1,2]]
p t.map{|x,y|
a[x],a[y]=a[y],a[x]
#p a
a
}
I know ruby map method collects the last expression of the given block but when using the above code to swap the chars in a using the indexes in t won't succeeds.My intention was to collect the state of a after each swap in the index of t.But map always gives the array of a which is in the last state ie)["135264789", "135264789", "135264789", "135264789"].
The results shows that the map method have collected the final result of a after completing each indexes in t.But when printing the a after each swap prints correct value of a at each state.
Is this the correct behavior or am i missing something?
This is because the String#[]= method mutates the string.
Quick fix would be something like this:
a="123456789"
t=[[1,4],[3,4],[4,5],[1,2]]
p t.map{|x,y]
b = "#{a}" # IMPORTANT - this builds a new string
b[x],b[y]=b[y],b[x] # this mutates the new string
#p b
b
}
An alternative to "#{a}" would be to say a.clone, it does the same thing in this case.
The reason this works, is because instead of directly modifying a with a[x],a[y]=a[y],a[x], you're making a temporary copy of a and modifying that instead
edit - I misread the question - if you want to show the result of chaining each operation on the previous result, use dup/clone after the modification as Stefan said in his answer
Is my understanding correct?
Yes, I believe it is. I second what Max says, and I'll also elaborate a bit in case it helps.
Each b is a newly created object because it gets created inside the block, so it gets recreated with every new iteration. The a is created outside the block, so the same object (a) keeps getting referenced inside the block for each iteration.
You can better understand how this works by experimenting with #object_id. Try running this code:
a="123456789"
t=[[1,4],[3,4],[4,5],[1,2]]
p t.map { |x,y|
b = "#{a}" # IMPORTANT - this builds a new string
b[x],b[y]=b[y],b[x]
p "a.object_id = #{a.object_id}"
p "b.object_id = #{b.object_id}"
b
}
You will notice that a is the same object for each iteration of the #map method, while b is a new one.
This is an example of the concept of a closure. A closure is some sort of enclosed code structure that retains access to whatever state is available in the context in which it was created, while that context doesn't have access to its, the enclosed code's, state. Sort of like a "one way mirror": the enclosed code can see outside, but the outside can't see into the enclosed code.
In Ruby, closures are implemented as blocks: blocks are closures. So, everything that is visible to whatever context a block is created in (in this case, main) is also visible to that block, although the reverse is not true — for example, you can't reference b from outside the block. (Methods are not closures: if your block were a method, it wouldn't be able to see a unless you passed it in as an argument to your method.)
So, as Max says, when you make changes to a inside your block, you are actually changing (mutating) the same a that you defined up top each time.
Side topic
Now, if you are referencing individual characters in strings it's important to understand that the underlying structure of strings differs from that of arrays. Also, arrays behave differently when you mutate their elements from strings when you mutate their characters.
I'm mentioning this because I have this vague feeling that you are thinking of string character references as pretty much analogous to array element references. This is pretty much only true with respect to syntax.
You may find the results of running this code interesting:
a = '123456789'
p a.object_id
p a[0].object_id
p a[1].object_id
a[0] = '7'
p a.object_id
p a[0].object_id
p a[1].object_id
puts
a = '123456789'.chars
p a.object_id
p a[0].object_id
p a[1].object_id
a[0] = '7'
p a.object_id
p a[0].object_id
p a[1].object_id
In particular, a comparison of the four outputs of a[1].object_id should be instructive, because it shows where strings and arrays differ. If you reassign an element in an array, that element and only that element gets a new object id. If you reassign a character in a string, the string object itself remains the same, but every character in the string gets recreated.
Since you are returning a from map, the result will contain a four times. Those a's all refer to the same object.
You probably want to return a copy of a to preserve its current state:
a = '123456789'
t = [[1, 4], [3, 4], [4, 5], [1, 2]]
r = t.map { |x, y|
a[x], a[y] = a[y], a[x]
a.dup
}
r #=> ["153426789", "153246789", "153264789", "135264789"]

Unexpected FrozenError when appending elements via <<

I have a hash containing names and categories:
hash = {
'Dog' => 'Fauna',
'Rose' => 'Flora',
'Cat' => 'Fauna'
}
and I want to reorganize it so that the names are grouped by their corresponding category:
{
'Fauna' => ['Dog', 'Cat'],
'Flora' => ['Rose']
}
I am adding each names via <<:
new_hash = Hash.new
hash.each do |name , category|
if new_hash.key?(category)
new_file[category] << name
else
new_hash[category] = name
end
end
But I am being told that this operation is being performed on a frozen element:
`<<' : Can’t modify frozen string (FrozenError)
I suppose this is because each yields frozen objects. How can I restructure this code so the '.each' doesn't provide frozen variables?
I needed to add the first name to an array and then that array to the hash.
new_hash = Hash.new
hash.each do |name , category|
if new_hash.key?(category)
new_file[category] << name
else
new_hash[category] = [name] # <- must be an array
end
end
How can I restructure this code so the '.each' doesn't provide frozen variables?
Short answer: you can't.
Hash#each doesn't "provide frozen variables".
First off, there is no such thing as a "frozen variable". Variables aren't frozen. Objects are. The distinction between variables and objects is fundamental, not just in Ruby but in any programming language (and in fact pretty much everywhere else, too). If I have a sticker with the name "Seamus" on it, then this sticker is not you. It is simply a label that refers to you.
Secondly, Hash#each doesn't provide "variables". In fact, it doesn't provide anything that is not in the hash already. It simply yields the objects that are already in the hash.
Note that, in order to avoid confusion and bugs, strings are automatically frozen when used as keys. So, you can't modify string keys. You can either make sure they are correct from the beginning, or you can create a new hash with new string keys. (You can also add the new keys to the existing hash and delete the old keys, but that is a lot of complexity for little gain.)

Testing order using a hashmap

I'm a ruby beginner and reading that a Hash does not have order. I tried to play with that concept but found I could still order things like so:
Travel_Plans = Hash.new
Travel_Plans[4] = "Colorado Springs"
Travel_Plans[1] = "Santa Fe"
Travel_Plans[2] = "Raton"
Travel_Plans[5] = "Denver"
Travel_Plans[3] = "Pueblo"
puts Travel_Plans.sort
Could someone explain what is meant by "Hash does not have order"?
If you could provide a simple example that would be great.
Ruby's Hash class represents a "hash map" or "key-value dictionary" in conventional terms. These are intended to be structures that allow quick random-access to individual elements, but the elements themselves have no intrinsic ordering.
Internally Ruby's Hash organizes elements into their various locations in memory using the hash method each object must provide to be used as a key. Ruby's Hash is unusually, if not ludicrously flexible, in that an object, any object, can be used as a key, and it's preserved exactly as-is. Contrast with JavaScript where keys must be strings and strings only.
That means you can do this:
{ 1 => 'Number One', '1' => 'String One', :one => 'Symbol One', 1.0 => 'Float One }
Where that has four completely different keys.
This is in contrast to Array where the ordering is an important part of how an array works. You don't want to have a queue where things go in one order and come out in another.
Now Ruby's Hash class used to have no intrinsic order, but due to popular demand now it stores order in terms of insertion. That is, the first items inserted are "first". Normally you don't depend on this behaviour explicitly, but it does show up if you're paying attention:
a = { x: '1', y: '2' }
# => {:x=>"1, :y=>"2"}
b = { }
b[:y] = '2'
b[:x] = '1'
b
# => {:y=>"2", :x=>"1"}
Note that the order of the keys in b are reversed due to inserting them in reverse order. They're still equivalent though:
a == b
# => true
When you call sort on a Hash you actually end up converting it to an array of key/value pairs, then sorting each of those:
b.sort
# => [[:x, "1"], [:y, "2"]]
Which you could convert back into a Hash if you want:
b.sort.to_h
# => {:x=>"1", :y=>"2"}
So now it's "ordered" properly. In practice this rarely matters, though, as you'll be accessing the keys individually as necessary. b[:x] doesn't care where the :x key is, it always returns the right value regardless.
Some things to note about Ruby:
Don't use Hash.new, instead just use { } to represent an empty Hash structure.
Don't use capital letters for variables, they have significant meaning in Ruby. Travel_Plans is a constant, not a variable, because it starts with a capital letter. Those are reserved for ClassName and CONSTANT_NAME type usage. This should be travel_plans.
First, the statement "[h]ash does not have order" is wrong as of today. It used to be true for really old and outdated versions of Ruby. You seem to have picked outdated information, which would be unreliable as of today.
Second, the code you provided:
puts Travel_Plans.sort
is irrelevant to showing your point that the hash Travel_Plans has, i.e. preserves, the order. What you should have done to check whether the order is preserved, is to simply do:
p Travel_Plans
which would always result in showing the keys in the order 4, 1, 2, 5, 3, which matches the order in which you assigned the key-values to the hash, thus indeed shows that hash preserves the order.

Ruby array changes by changing a 'copy' of one of its elements

I'm trying to confirm whether my understanding is correct of these six lines of code:
string="this is a sentence"
words=string.split
first_word=words[0]
first_word[0]=first_word[0].upcase
out=words.join(" ")
puts(out)
which prints "This is a sentence" (with the first letter capitalized).
It would appear that changing the "first_word" string, which is defined as the first element of the "words" array, also changes the original "words" array. Is this indeed Ruby's default behavior? Does it not make it more difficult to track where in the code changes to the array take place?
You just need need to distinguish between a variable and an object. Your string is an object. first_word is a variable.
Look for example
a = "hello"
b = a
c = b
now all variables contain the same object, a string with the value "hello". We say they reference the object. No copy is made.
a[0] = 'H'
This changes the first character of the object, a string which now has the value "Hello". Both b and c contain the same, now changed object.
a = "different"
This assigns a new object to the variable a. b and c still hold the original object.
Is this Rubys default behaviour? yes. And it also works like this in many other programming languages.
Does it make it difficult to track changes? Sometimes.
If you takes an element from an array (like your first_word), you need to know:
If you change the object itself, no matter how you access it,
all variables will still hold your object, which just happened to be changed.
But if you replace the object in the array, like words[0] = "That", then all your other variables will still hold the original object.
This behavior is caused by how ruby does pass-by-value and pass-by-reference.
This is probably one of the more confusing parts of Ruby. It is well accepted that Ruby is a pass-by-value, high level programming language. Unfortunately, this is slightly incorrect, and you have found yourself a perfect example. Ruby does pass-by-value, however, most values in ruby are references. When Ruby does an assignment of a simple datatypes, integers, floats, strings, it will create a new object. However, when assigning objects such as arrays and hashes, you are creating references.
original_hash = {name: "schylar"}
reference_hash = original_hash
reference_hash[:name] = "!schylar"
original_hash #=> "!schylar"
original_array = [1,2]
reference_array = original_array
reference_array[0] = 3
reference_array #=> [3,2]
original_fixnum = 1
new_object_fixnum = original_fixnum
new_object_fixnum = 2
original_fixnum #=> 1
original_string = "Schylar"
new_object_string = original_string
new_object_string = "!Schylar"
original_string #=> "Schylar'
If you find yourself needing to copy by value, you may re-think the design. A common way to pass-by-value complex datatypes is using the Marshal methods.
a = {name: "Schylar"}
b = Marshal.load(Marshal.dump(a))
b[:name] = "!!!Schylar"
a #=> {:name => "Schylar"}

Strange Feature? of Ruby Arrays

I came around this strange feature(?) of arrays in Ruby and it would be very helpful if someone could explain to me why they work the way they do.
First lets give an example of how things usually work.
a = "Hello" #=> "Hello"
b = a #=> "Hello"
b += " Goodbye" #=> "Hello Goodbye"
b #=> "Hello Goodbye"
a #=> "Hello"
Ok cool, when you use = it creats a copy of the object (this time a string).
But when you use arrays this happens:
a = [1,2,3] #=> [1,2,3]
b = a #=> [1,2,3]
b[1] = 5 #=> [1,5,3]
b #=> [1,5,3]
a #=> [1,5,3]
Now thats just strange. Its the only object I've found that doesn't get copied when using = but instead just creates a refrance to the original object.
Can someone also explain (there must be a method) for copying an array without having it point back to the original object?
Actually, you should re-examine your premise.
The string assignment is really b = b + " Goodbye". The b + " Goodbye" operation returns an entirely new string, so the variable b is pointing to a new object after the assignment.
But when you assign to an individual array element, you are not creating an entirely new array, so a and b continue to point to the same object, which you just changed.
If you are looking for a rationale for the mutating vs functional behavior of arrays, it's simple. There is nothing to be gained by modifying the string. It is most likely necessary to allocate new memory anyway, so an entirely new string is created.
But an array can be arbitrarily large. Creating a new array in order to change just one element could be hugely expensive. And in any case, an array is like any other composite object. Changing an individual attribute does not necessarily affect any other attributes.
And to answer your question, you can always do:
b = a.dup
What happends there is that ruby is treating the Array object by reference and not by value.
So you can see it as this:
b= [1,2,3]
a= b
--'b' Points to---> [1,2,3] <--'a' points t---
So as you can see both point to the same reference, that means that if you change anything in a it will be reflected on b.
As for your question on the copying the object you could use the Object#clone method to do so.
Try your Array case with a String:
a = "Hello" #=> "Hello"
b = a #=> "Hello"
b[1] = "x" #=> "x"
b #=> "Hxllo"
a #=> "Hxllo"
Strings and Arrays work the same way in this regard.
The key difference in the two cases, as you wrote them, is this:
b += " Goodbye"
This is syntactic shorthand for
b = b + " Goodbye"
which is creating a new string from b + " Goodbye" and then assigning that to b. The way to modify an existing string, rather than creating a new one, is
b << " Goodbye"
And if you plug that into your sequence, you'll see that it modifies both a and b, since both variables refer to the same string object.
As for deep copying, there's a decent piece about it here:
http://ruby.about.com/od/advancedruby/a/deepcopy.htm

Resources