Ruby object references vs collection references - ruby

I was going through The Well Grounded Rubyist and got confused by the following example.
Suppose we have an array of strings:
numbers = ["one", "two", "three"]
If I freeze this array, I can't do the following:
numbers[2] = "four"
That statement raises a runtime error, but this:
numbers[2].replace("four")
is not.
The book explains that in the first of the last two statements, we are trying to access the array. That's what I found confusing because I thought we are trying to access the third element of the array, which is a string object. And how is that different from the last statement?

It's different because in the statement that works you are calling String#replace. As you might expect, a call to Array#replace will fail.
numbers.replace [1,2,3]
TypeError: can't modify frozen array
The object reference at any given array index might be arbitrarily complicated and it's not the job of the frozen array to keep those objects from changing ... it just wants to keep the array from changing. You can see this:
ree-1.8.7> numbers[2].object_id
=> 2149301040
ree-1.8.7> numbers[2].replace "four"
=> "four"
ree-1.8.7> numbers[2].object_id
=> 2149301040
numbers[2] has the same object_id after String#replace runs; the Array did not actually change.

An array is a list of object_id's. String#replace is special - it changes the string but it keeps the object_id. So the list of object_id's does not change and the Array does not detect any change.
You can freeze every string of the array. String#replace would then result in an error.
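A minimal sketch of the difference between freezing the array and freezing its elements. The exception class raised for frozen objects has varied between Ruby versions, so this rescues StandardError rather than naming one; the strings are dup'ed only so the demo behaves the same if frozen string literals happen to be enabled.

```ruby
# Build mutable strings explicitly so the demo does not depend on
# whether string literals are frozen in the running interpreter.
numbers = ["one", "two", "three"].map(&:dup)
numbers.freeze

# Reassigning a slot modifies the array itself, so it is rejected.
slot_assignment_failed =
  begin
    numbers[2] = "four"
    false
  rescue StandardError
    true
  end

# String#replace mutates the string in place; the array still holds
# the very same object, so the frozen array is not touched.
id_before = numbers[2].object_id
numbers[2].replace("four")
same_object = numbers[2].object_id == id_before

# Freezing each element as well makes String#replace fail too.
numbers.each(&:freeze)
element_replace_failed =
  begin
    numbers[2].replace("five")
    false
  rescue StandardError
    true
  end
```

After this runs, numbers[2] is still "four": the first replace succeeded and the second one was rejected.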

Related

Iterate through hashes to find values predefined in an array

I have an array with hashes:
test = [
{"type"=>1337, "age"=>12, "name"=>"Eric Johnson"},
{"type"=>1338, "age"=>18, "name"=>"John Doe"},
{"type"=>1339, "age"=>22, "name"=>"Carl Adley"},
{"type"=>1340, "age"=>25, "name"=>"Anna Brent"}
]
I am interested in getting all the hashes where the name key equals to a value that can be found in an array:
get_hash_by_name = ["John Doe","Anna Brent"]
Which would end up in the following:
# test_sorted = would be:
# {"type"=>1338, "age"=>18, "name"=>"John Doe"}
# {"type"=>1340, "age"=>25, "name"=>"Anna Brent"}
I probably have to iterate with test.each somehow, but I am still trying to get a grasp of Ruby. Happy for all help!
Here's something to meditate on:
Iterating over an array to find something is slow, even if it's a sorted array. Computer languages have various structures we can use to improve the speed of lookups, and in Ruby a Hash is usually a good starting point. Where an Array is like reading from a sequential file, a Hash is like reading from a random-access file: we can jump right to the record we need.
Starting with your test array-of-hashes:
test = [
{'type'=>1337, 'age'=>12, 'name'=>'Eric Johnson'},
{'type'=>1338, 'age'=>18, 'name'=>'John Doe'},
{'type'=>1339, 'age'=>22, 'name'=>'Carl Adley'},
{'type'=>1340, 'age'=>25, 'name'=>'Anna Brent'},
{'type'=>1341, 'age'=>13, 'name'=>'Eric Johnson'},
]
Notice that I added an additional "Eric Johnson" record. I'll get to that later.
I'd create a hash that mapped the array of hashes to a regular hash where the key of each pair is a unique value. The 'type' key/value pair appears to fit that need well:
test_by_types = test.map { |h| [h['type'], h] }.to_h
# => {1337=>{"type"=>1337, "age"=>12, "name"=>"Eric Johnson"},
# 1338=>{"type"=>1338, "age"=>18, "name"=>"John Doe"},
# 1339=>{"type"=>1339, "age"=>22, "name"=>"Carl Adley"},
# 1340=>{"type"=>1340, "age"=>25, "name"=>"Anna Brent"},
# 1341=>{"type"=>1341, "age"=>13, "name"=>"Eric Johnson"}}
Now test_by_types is a hash using the type value to point to the original hash.
If I create a similar hash based on names, where each name, unique or not, points to the type values, I can do fast lookups:
test_by_names = test.each_with_object(
Hash.new { |h, k| h[k] = [] }
) { |e, h|
h[e['name']] << e['type']
}.to_h
# => {"Eric Johnson"=>[1337, 1341],
# "John Doe"=>[1338],
# "Carl Adley"=>[1339],
# "Anna Brent"=>[1340]}
Notice that "Eric Johnson" points to two records.
Now, here's how we look up things:
get_hash_by_name = ['John Doe', 'Anna Brent']
test_by_names.values_at(*get_hash_by_name).flatten
# => [1338, 1340]
In one quick lookup Ruby returned the matching types by looking up the names.
We can take that output and grab the original hashes:
test_by_types.values_at(*test_by_names.values_at(*get_hash_by_name).flatten)
# => [{"type"=>1338, "age"=>18, "name"=>"John Doe"},
# {"type"=>1340, "age"=>25, "name"=>"Anna Brent"}]
Because this is running against hashes, it's fast. The hashes can be BIG and it'll still run very fast.
Back to "Eric Johnson"...
When dealing with the names of people it's likely to get collisions of the names, which is why test_by_names allows multiple type values, so with one lookup all the matching records can be retrieved:
test_by_names.values_at('Eric Johnson').flatten
# => [1337, 1341]
test_by_types.values_at(*test_by_names.values_at('Eric Johnson').flatten)
# => [{"type"=>1337, "age"=>12, "name"=>"Eric Johnson"},
# {"type"=>1341, "age"=>13, "name"=>"Eric Johnson"}]
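As a variation (not what the answer above does), a single index from name to the full records can be built with Enumerable#group_by, which avoids the second lookup through the type values. This is only a sketch; the names by_name, wanted and matches are made up for illustration.

```ruby
test = [
  { 'type' => 1337, 'age' => 12, 'name' => 'Eric Johnson' },
  { 'type' => 1338, 'age' => 18, 'name' => 'John Doe' },
  { 'type' => 1339, 'age' => 22, 'name' => 'Carl Adley' },
  { 'type' => 1340, 'age' => 25, 'name' => 'Anna Brent' },
  { 'type' => 1341, 'age' => 13, 'name' => 'Eric Johnson' }
]

# Each name maps to an array of every record that carries that name.
by_name = test.group_by { |h| h['name'] }

wanted = ['John Doe', 'Anna Brent']

# values_at returns nil for names that are not in the index, so compact
# drops those before the per-name arrays are merged into one list.
matches = by_name.values_at(*wanted).compact.flatten(1)

duplicates = by_name['Eric Johnson']
```

Here matches holds the John Doe and Anna Brent records, and duplicates holds both Eric Johnson records.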
This will be a lot to chew on if you're new to Ruby, but the Ruby documentation covers it all, so dig through the Hash, Array and Enumerable class documentation.
Also, *, AKA "splat", explodes the array elements from the enclosing array into separate parameters suitable for passing into a method. I can't remember where that's documented.
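A small sketch of what the splat does in a values_at call; the lookup hash and keys array here are made up for illustration.

```ruby
lookup = { 'a' => 1, 'b' => 2, 'c' => 3 }
keys = ['a', 'c']

# With the splat, the array is expanded into separate arguments:
# this is the same call as lookup.values_at('a', 'c').
with_splat = lookup.values_at(*keys)

# Without it, the whole array is passed as one argument, so the hash
# is asked for the single key ['a', 'c'], which does not exist.
without_splat = lookup.values_at(keys)
```

with_splat is [1, 3], while without_splat is [nil].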
If you're familiar with database design this will look very familiar, because it's similar to how we do database lookups.
The point of all of this is that it's really important to consider how you're going to store your data when you first ingest it into your program. Do it wrong and you'll jump through major hoops trying to do useful things with it. Do it right and the code and data will flow through very easily, and you'll be able to massage/extract/combine the data easily.
Said differently, Arrays are containers useful for holding things you want to access sequentially, such as jobs you want to print, sites you need to access in order, files you want to delete in a specific order, but they're lousy when you want to lookup and work with a record randomly.
Knowing which container is appropriate is important, and for this particular task, it appears that an array of hashes isn't appropriate, since there's no fast way of accessing specific ones.
And that's why I made my comment above asking what you were trying to accomplish in the first place. See "What is the XY problem?" and "XyProblem" for more about that particular question.
You can use select and include? so
test.select {|object| get_hash_by_name.include? object['name'] }
…should do the job.
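Put together with the data from the question, the select approach can be sketched like this:

```ruby
test = [
  { "type" => 1337, "age" => 12, "name" => "Eric Johnson" },
  { "type" => 1338, "age" => 18, "name" => "John Doe" },
  { "type" => 1339, "age" => 22, "name" => "Carl Adley" },
  { "type" => 1340, "age" => 25, "name" => "Anna Brent" }
]
get_hash_by_name = ["John Doe", "Anna Brent"]

# Keep only the hashes whose "name" value appears in the list of names.
# The result preserves the order of the original array.
test_sorted = test.select { |person| get_hash_by_name.include?(person["name"]) }
```

test_sorted then contains the John Doe and Anna Brent hashes, in that order.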

How to check if nested hash attributes are empty

I have a Hash
person_params = {"firstname"=>"",
"lastname"=>"tom123",
"addresses_attributes"=>
{"0"=>
{"address_type"=>"main",
"catalog_delivery"=>"0",
"street"=>"tomstr",
"city"=>"tomcity"
}
}
}
With person_params[:addresses_attributes], I get:
# => {"0"=>{"address_type"=>"main", "catalog_delivery"=>"0", "street"=>"tomstr", "zip"=>"", "lockbox"=>"", "city"=>"tomcity", "country"=>""}}
1) How can I get a new hash without the leading 0?
desired_hash = {"address_type"=>"main", "catalog_delivery"=>"0", "street"=>"tomstr", "zip"=>"", "lockbox"=>"", "city"=>"tomcity", "country"=>""}
2) How can I check whether the attributes in the new hash are empty?
Answer 1:
person_params[:addresses_attributes]['0']
Answer 2:
hash = person_params[:addresses_attributes]['0']
hash.empty?
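Two caveats, sketched below under some assumptions. First, with a plain Ruby Hash built from string keys, the lookup has to use string keys; the symbol form only works on a Rails params object, which allows either. Second, Hash#empty? is true only when the hash has no keys at all, so if "empty" is meant as "the values are blank strings", the values have to be inspected. The "zip" entry is taken from the output shown in the question; blank_keys and all_blank are illustrative names.

```ruby
person_params = {
  "firstname" => "",
  "lastname" => "tom123",
  "addresses_attributes" => {
    "0" => {
      "address_type" => "main",
      "catalog_delivery" => "0",
      "street" => "tomstr",
      "zip" => "",
      "city" => "tomcity"
    }
  }
}

# Plain Hash: use the same key type the hash was built with.
address = person_params["addresses_attributes"]["0"]

# False here, because the hash does have keys.
has_no_keys = address.empty?

# Keys whose values are blank strings.
blank_keys = address.select { |_key, value| value.empty? }.keys

# True only if every single value is blank.
all_blank = address.values.all?(&:empty?)
```

In this data blank_keys is ["zip"] and all_blank is false.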
This looks just like a params hash from Rails =D. Anyway, it seems that your addresses_attributes contains some nested attributes. This means that what you have in practice is more of an array of hashes than a single hash, and that's what you see, right? Instead of it being an actual Ruby Array, it is a hash with the index as a string.
So how do you get the address attributes? Well if you only want to get the first address, here are some ways to do that:
person_params[:addresses_attributes].values.first
# OR
person_params[:addresses_attributes]["0"]
In the first case, we will just take the values from the addresses_attributes hash, which gives us an Array from which we can take the first item. If there are no values in addresses_attributes, then we will get nil.
In the second case, we will just ask for the hash value with the key "0". If there are no values in addresses_attributes, we will get nil with this method also. (You might want to avoid using the second case, if you are not confident that the addresses_attributes hash will always be indexed from "0" and incremented by "1")
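A short sketch of how the two approaches behave on plain hashes; the sample hashes below are made up to cover the empty case and an index that does not start at "0".

```ruby
with_address   = { "0" => { "street" => "tomstr" } }
non_zero_index = { "3" => { "street" => "otherstr" } }
no_addresses   = {}

# values.first takes whatever entry comes first, regardless of its key.
first_a = with_address.values.first
first_b = non_zero_index.values.first
first_c = no_addresses.values.first   # nil: there are no values

# ["0"] only finds an entry stored under exactly that key.
zero_a = with_address["0"]
zero_b = non_zero_index["0"]          # nil: the only key is "3"
zero_c = no_addresses["0"]            # nil
```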

Understanding hashes

An exercise says:
Create three hashes called person1, person2, and person3, with first
and last names under the keys :first and :last. Then create a params
hash so that params[:father] is person1, params[:mother] is person2,
and params[:child] is person3. Verify that, for example,
params[:father][:first] has the right value.
I did
person1 = {first: "Thom", last: "Bekker"}
person2 = {first: "Kathy", last: "Bekker"}
person3 = {first: "Anessa", last: "Bekker"}
then the params hash in Rails
params = {}
params[:father] = person1
params[:mother] = person2
params[:child] = person3
Now I can ask for father, mother or child's first or last name like so
params[:father][:first] gives me "Thom".
What makes params[:father][:first][:last] return an error? Is there a way to make that return "Thom Bekker"?
I have no way to check if the way I came up with is correct, is there a better way to do the exercise?
Is there a reason why symbol: is better than symbol =>?
Your Return Value is a Single Hash Object
You're misunderstanding the type of object you're getting back. params[:father] returns a single Hash object, not an Array or an Array of hashes. For example:
params[:father]
#=> {:first=>"Thom", :last=>"Bekker"}
params[:father].class
#=> Hash
So, you can't access the missing third element (e.g. :last) because there's no such element within the value of params[:father][:first].
Instead, you could deconstruct the Hash:
first, last = params[:father].values
#=> ["Thom", "Bekker"]
or do something more esoteric like:
p params[:father].values.join " "
#=> "Thom Bekker"
The point is that you have to access the values of the Hash, or convert it to an Array first, rather than treating it directly like an Array and trying to index into multiple values at once.
In Ruby, using square brackets on a Hash (or on an object of another class) is actually a call to the [] method that the class defines. When you chain them as in your example, each [] call runs and returns its result before the next one is called. So, as you've defined it:
Calling [:father] on params returns the hash represented by person1
[:first] is then called on {first: "Thom", last: "Bekker"}, returning the corresponding value in the hash, "Thom"
[:last] is called on "Thom", which results in an error. Calling square brackets on a string with an integer between them can access the character in a string at that index (person1[:first][0] returns "T"), but "Thom" doesn't have a way of handling the :last symbol inside the square brackets.
There are a number of ways you could get the names printed as you wanted, one of the simplest being combining the string values in person1:
params[:father][:first] + " " + params[:father][:last]
returns "Thom Bekker"
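The chain can be spelled out one call at a time. This is only a sketch: the explicit .[] calls are written that way to show that the brackets are ordinary method calls.

```ruby
person1 = { first: "Thom", last: "Bekker" }
params = { father: person1 }

# params[:father][:first] is two method calls, evaluated left to right.
father = params.[](:father)       # the person1 hash
first_name = father.[](:first)    # the String "Thom"

# The third call is String#[], which does not accept a Symbol.
error_class =
  begin
    first_name[:last]
    nil
  rescue TypeError => e
    e.class
  end

# Asking the hash for both values and joining them gives the full name.
full_name = params[:father].values_at(:first, :last).join(" ")
```

error_class ends up as TypeError and full_name as "Thom Bekker".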
So here's my question, what exactly makes params[:father][:first][:last] return an error? Is there a way to make that return "Thom Bekker"?
Both of these would work
params[:father][:first] + ' ' + params[:father][:last]
params[:father].values.join(' ')
But maybe it would be better to think about them like the nested structures that they are:
father = params[:father]
name = father[:first] + ' ' + father[:last]
puts name
To answer your last question, pretend that there's no difference between a hashrocket => and symbol:. This is one where you don't need to care for a long time. Maybe around year 2 start asking this again, but for learning treat them as if they were equivalent.
(Full disclosure, there are differences, but this is a holy war that you really don't want to see played out)
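For what it's worth, a quick sketch of the practical side: for symbol keys the two spellings build equal hashes, and the hashrocket is the form that also accepts keys that are not symbols. The variable names are illustrative.

```ruby
colon_style  = { first: "Thom", last: "Bekker" }
rocket_style = { :first => "Thom", :last => "Bekker" }

# Same keys, same values: the two literals are equal.
same = colon_style == rocket_style

# The hashrocket works with any object as a key, e.g. strings or integers.
other_keys = { "first" => "Thom", 1 => "one" }

# A string key and a symbol key are different keys.
string_hit = other_keys["first"]
symbol_miss = other_keys[:first]
```

same is true, string_hit is "Thom" and symbol_miss is nil.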

How to replace an object in Ruby?

Say I have some deeply nested array structure and a reference to an object inside:
strings = ["1", "2", " 3"]
nested = [[strings] * 10] * 10
reference = nested[0][0][0]
How do I replace the object that reference points to with, e.g., "4"? I need something generic that works with arbitrary objects, not String#gsub! and friends. Something like Object#replace(other_obj).
You can't, we don't have (explicit) pointers in Ruby, we have (implicit) references but you can't dereference them to mess with what they contain. Instead, you need to do something like:
inner = nested[0][0]
inner[0] = '4'
so that you can work with a reference to the element you want to replace rather than the element itself.
Of course, with the structure in your question, that inner[0] = '4' will replace the first element of strings (and thus every element of nested, since it is just a pile of references to the same array that strings references).
Sorry about how overloaded the term reference is here. It is a horrible abuse of English but English itself is an abuse of English :)
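A runnable sketch of the above, using the structure from the question, that shows both the replacement through the container and the aliasing it exposes:

```ruby
strings = ["1", "2", " 3"]
nested = [[strings] * 10] * 10
reference = nested[0][0][0]

# Work with the container that holds the element, not the element itself.
inner = nested[0][0]
inner[0] = "4"

# inner is the very same array object as strings...
inner_is_strings = inner.equal?(strings)

# ...so the change is visible through every path into the structure.
via_strings = strings[0]
via_far_corner = nested[9][9][0]

# The old local variable still points at the original "1" object;
# assigning into the array did not change what reference refers to.
old_reference = reference
```

via_strings and via_far_corner are both "4", while old_reference is still "1".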

Changing one array in an array of arrays changes them all; why?

a = Array.new(3,[])
a[1][0] = 5
a # => [[5], [5], [5]]
This doesn't make sense to me!
Shouldn't it be a # => [[], [5], []]?
Or is this some sort of Ruby feature?
Use this instead:
a = Array.new(3){ [] }
With your code the same object is used for the value of each entry; once you mutate one of the references you see all others change. With the above you instead invoke the block each time a new value is needed, which returns a new array each time.
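The two forms side by side, as a small sketch (shared and fresh are illustrative names):

```ruby
# One array object, referenced from all three slots.
shared = Array.new(3, [])
shared[1][0] = 5
all_same_object = shared.all? { |slot| slot.equal?(shared[0]) }

# The block runs once per slot, so each slot gets its own array.
fresh = Array.new(3) { [] }
fresh[1][0] = 5
distinct_objects = !fresh[0].equal?(fresh[1])
```

shared ends up as [[5], [5], [5]], while fresh is [[], [5], []].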
This is similar in nature to the new user question about why the following does not work as expected:
str.gsub(/<([a-z]+)>/, "-->#{$1}<--")
In the above, string interpolation occurs before the gsub method is ever called, so it cannot use the then-current value of $1 in your string. Similarly, in your question you create an object and pass it to Array.new before Ruby starts creating array slots. Yes, the runtime could call dup on the item by default…but that would be potentially disastrous and slow. Hence you get the block form to determine on your own how to create the initial values.
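For completeness, two common ways to get the intended result from gsub, sketched with an assumed sample string and a simple tag pattern: the block form, where the block runs once per match so $1 is current, and a single-quoted replacement with a \1 backreference, which gsub expands itself.

```ruby
str = "<em> and <b>"

# Block form: evaluated for every match, after $1 has been set.
block_form = str.gsub(/<([a-z]+)>/) { "-->#{$1}<--" }

# Backreference form: single quotes keep \1 literal for gsub to expand.
backref_form = str.gsub(/<([a-z]+)>/, '-->\1<--')
```

Both produce "-->em<-- and -->b<--".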
