Which value for a duplicate key is ignored in a Ruby hash? - ruby

If a hash has more than one occurrence of the same key pointing to different values, how does Ruby determine which value is assigned to that key?
In other words,
hash = {keyone: 'value1', keytwo: 'value2', keyone: 'value3'}
results in
warning: duplicated key at line 1 ignored: :keyone
but how do I know which value is assigned to :keyone?

The last one overwrites the previous values. In this case, "value3" becomes the value for :keyone. This works the same way with merge: when you merge two hashes that have the same keys, the value in the latter hash (the argument, not the receiver) overwrites the other value.
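A minimal sketch of the merge behaviour described above; the hash names defaults and overrides are made up for illustration. The block form in the second call is how merge lets you resolve a conflict differently when the last value should not win.

```ruby
defaults  = { keyone: 'value1', keytwo: 'value2' }
overrides = { keyone: 'value3' }

# Without a block, the argument's value wins for keys present in both hashes.
merged = defaults.merge(overrides)

# With a block, Ruby calls it for each conflicting key and stores its result.
kept_first = defaults.merge(overrides) { |_key, old_value, _new_value| old_value }

p merged[:keyone]      # "value3"
p kept_first[:keyone]  # "value1"
```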

Line numbers on duplicate key warnings can be misleading. As the other answers here confirm, every value of a duplicated key is ignored except for the last value defined for that key.
Using the example in the question across multiple lines:
1 hash1 = {key1: 'value1',
2 key2: 'value2',
3 key1: 'value3'}
4 puts hash1.to_s
keydup.rb:1: warning: duplicated key at line 3 ignored: :key1
{:key1=>"value3", :key2=>"value2"}
The message says "line 3 ignored" but in fact it is the value of the key defined at line 1 that is ignored; the value at line 3 is used, because it is the last value assigned to that key.

IRB is your friend. Try the following in the command line:
irb
hash = {keyone: 'value1', keytwo: 'value2', keyone: 'value3'}
hash[:keyone]
What did you get? Should be "value3".
Best way to check these things is simply to try it out. It's one of the great things about Ruby.

This is spelled out clearly in section 11.5.5.2 Hash constructor of the ISO Ruby Language Specification:
11.5.5.2 Hash constructor
Semantics
[...]
b) 2) For each association Ai, in the order it appears in the program text, take the following steps:
i) Evaluate the operator-expression of the association-key of Ai. Let Ki be the resulting value.
ii) Evaluate the operator-expression of the association-value. Let Vi be the resulting value.
iii) Store a pair of Ki and Vi in H by invoking the method []= on H with Ki and Vi as the arguments.
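The steps quoted above amount to storing each pair, in order, with []=. A rough sketch of that reading, using an array of pairs as an illustrative stand-in for the associations in the literal:

```ruby
associations = [[:key1, 'value1'], [:key2, 'value2'], [:key1, 'value3']]

h = {}
associations.each do |key, value|
  h[key] = value  # same as h.[]=(key, value); a later store replaces the earlier value
end

p h[:key1]  # "value3"
p h.keys    # [:key1, :key2]
```

The key keeps the position of its first insertion while its value is the last one stored, which matches the {:key1=>"value3", :key2=>"value2"} output shown earlier.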

Related

In TI-BASIC, how do I display the Variable Name given only the variable?

I'm creating a function that displays a lot of variables with the format Variable + Variable Name.
Define LibPub out(list)=
Func
Local X
for x,1,dim(list)
list[x]->name // How can I get the variable name here?
Disp name+list[x]
EndFor
Return 1
EndFunc
Given a list value, there is no way to find its name.
Consider this example:
a:={1,2,3,4}
b:=a ; this stores {1,2,3,4} in b
out(b)
Line 1: First the value {1,2,3,4} is created. Then a variable with the name a is created and its value is set to {1,2,3,4}.
Line 2: The expression a is evaluated; the result is {1,2,3,4}. A new variable with the name b is created and its value is set to {1,2,3,4}.
Line 3: The expression b is evaluated. The variable reference looks up what value is stored in b. The result is {1,2,3,4}. This value is then passed to the function out.
The function out receives the value {1,2,3,4}. Given the value, there is no way of knowing whether the value happened to be stored in a variable. Here the value is stored in both a and b.
However we can also look at out({1,1,1,1}+{0,2,3,4}).
The system will evaluate {1,1,1,1}+{0,2,3,4} and get {1,2,3,4}. Then out is called. The value out receives is the result of an expression, although an equal value happens to be stored in a and b. This means that the value doesn't have a name.
In general: Variables have a name and a value. Values don't have names.
If you need to print a name, then look into strings.
This will be memory intensive, but you could keep a string of variable names, pad each name with spaces to the same fixed length, and get a substring based on the index of the variable in the list. For instance, counting from index zero, the name you want is the substring that starts at (index of variable * length) and is length characters long.
Say you had the variables foo, bas, random, impetus.
With each name padded to 8 characters, the string would be stored like so: "foo     bas     random  impetus "
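The indexing arithmetic is easier to see in code. The sketch below uses Ruby rather than TI-BASIC, purely to illustrate the idea with zero-based indexes; packed, width and name_at are hypothetical names, and on the calculator the same lookup would go through its own substring function.

```ruby
names = %w[foo bas random impetus]

# Pad every name to the length of the longest one so each slot has the same width.
width  = names.map(&:length).max
packed = names.map { |name| name.ljust(width) }.join

# The name at a zero-based index starts at index * width and spans width characters.
def name_at(packed, width, index)
  packed[index * width, width].rstrip
end

p name_at(packed, width, 0)
p name_at(packed, width, 2)
```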

Ruby: Optimizing storage for holding a huge number of strings, some of them duplicates

I have a text file with two columns. The values in the first column ("key") are all different; the values in the second column, which are strings with a length between 10 and approximately 200, contain some duplicates. The number of duplicates varies. Some strings, especially the longer ones, don't have any duplicates, while others might have 20 duplicate occurrences.
key1 valueX
key2 valueY
key3 valueX
key4 valueZ
I would like to represent this data as a hash. Because of the large number of keys and the existence of duplicate values, I am wondering, whether some method of sharing common strings would be helpful.
The data in the file is more or less constant, i.e. I can put effort (in time or space) into preprocessing it in a suitable way, as long as it can be accessed efficiently once it has been loaded into my application.
I will now outline an algorithm that I believe would solve the problem. My question is whether the algorithm is sound and whether it could be improved. Also, I would like to know whether using freeze on the strings would provide an additional optimization:
In a separate preprocessing step, I find out which string values are indeed duplicates, and I annotate the data accordingly (i.e. create a third column in the file), so that all occurrences of a repeated string except the first have a pointer to the first occurrence:
key1 valueX
key2 valueY
key3 valueX key1
key4 valueZ
When my application reads the data into memory (line by line), I use this annotation to create a pointer to the original string instead of allocating a new one:
if columns.size == 2
  myHash[columns[0]] = columns[1] # First occurrence of the string
else
  # Subsequent occurrences: reuse the stored object (calling dup here would allocate a copy)
  myHash[columns[0]] = myHash[columns[2]]
end
Will this achieve my goal? Can it be done any better?
One way you could do this is using symbols.
["a", "b", "c", "a", "d", "c"].each do |c|
puts c.intern.object_id
end
417768 #a
313128 #b
312328 #c
417768 #a
433128 #d
312328 #c
Note how the two "a" strings got the same id, as did the two "c" strings.
You can turn a string into a symbol with the intern method. If you intern an equal string you get the same symbol back, like the flyweight pattern.
If you save the symbol in your hash you'll just have each string a single time. When it's time to use the symbol just call .to_s on the symbol and you'll get the string back. (Not sure how to_s works; it may allocate a new string on each call.) Another idea would be to cache the strings yourself, i.e. have an integer-to-string cache hash and just put the integer key in your data structures. When you need the string you can look it up.
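As a variation on the caching idea, here is a hedged sketch that shares one frozen String object per distinct value while loading. The sample lines and the names pool and data are assumptions for illustration, and the input is taken to be two whitespace-separated columns; this approach needs no preprocessing pass or third column.

```ruby
lines = ['key1 valueX', 'key2 valueY', 'key3 valueX', 'key4 valueZ']

pool = {}  # string content => the one shared String object for that content
data = {}

lines.each do |line|
  key, value = line.split(' ', 2)
  # Reuse the pooled object if this content was seen before; otherwise freeze and pool it.
  data[key] = (pool[value] ||= value.freeze)
end

p data['key1'].equal?(data['key3'])  # true: both keys point at the same object
p pool.size                          # 3
```

Freezing does not save memory by itself, but it makes the sharing safe, since no holder of a shared string can mutate it for the others.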

Syntax error with specific argument order

With this code:
Sponsorship.includes(patient: :vaccinations, :backer)
I get syntax error, unexpected ')', expecting =>. But when I change the order of the arguments like so:
Sponsorship.includes(:backer, patient: :vaccinations)
The errors go away. Why is the syntax error dependent on the order of the arguments?
Because a hash parameter needs to be the last parameter if you're relying on Ruby syntax sugar to avoid writing the {} yourself.
You have two valid alternatives here:
@sponsorship = Sponsorship.includes({ patient: :vaccinations }, :backer)
  .find_by_slug(params[:id])
@sponsorship = Sponsorship.includes(:backer, patient: :vaccinations)
  .find_by_slug(params[:id])
In the first example you have two arguments: the first is a hash and the second is a symbol.
In the second example you still have two arguments: the first is the symbol and the second is an implicit hash. In Ruby you can omit the braces when the hash is the last argument passed to a method.
What you did here:
@sponsorship = Sponsorship.includes(patient: :vaccinations, :backer)
  .find_by_slug(params[:id])
Is interpreted as:
@sponsorship = Sponsorship.includes({ patient: :vaccinations, :backer })
  .find_by_slug(params[:id])
Which is clearly wrong, as a hash needs the { key0: value0, keyN: valueN } syntax: every key must have a value, and :backer has none.
That's not valid Ruby syntax. You probably meant:
@sponsorship = Sponsorship.includes(patient: [:vaccinations, :backer]).find_by_slug(params[:id])
Note that patient: is the same as :patient =>, which is the key portion of a hash. So Ruby is expecting the next thing to be the value half of the hash, not a list of things. I changed it to an array (not sure if that's what you meant).
@sponsorship = Sponsorship.includes(:backer, patient: [:vaccinations]).find_by_slug(params[:id])
The includes method expects to find the hash as the last argument. You must pass the hash as such. Otherwise you must put {} around the hash.
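To see how Ruby groups the arguments, here is a small sketch with a hypothetical show_args method (a stand-in, not the real ActiveRecord includes) that simply returns what it received:

```ruby
# Hypothetical helper: collects all positional arguments into an array.
def show_args(*args)
  args
end

# Trailing key: value pairs without braces are gathered into one hash argument.
implicit = show_args(:backer, patient: :vaccinations)

# A hash that is not the last argument must be written with explicit braces.
explicit = show_args({ patient: :vaccinations }, :backer)

p implicit.length  # 2
p implicit.last    # the implicit hash
p explicit.first   # the explicit hash
```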

What is the difference between %{} and %s?

In exercise 9 in the Learn Ruby the Hard Way book, I am asked to write the following:
formatter = "%{first} %{second} %{third} %{fourth}"
puts formatter % {first: 1, second: 2, third: 3, fourth: 4}
which just evaluates to
1 2 3 4
When searching around, I noticed that many people have written this instead:
formatter = "%s %s %s %s"
puts formatter % [1, 2, 3, 4]
I believe the latter example is from an older version of the book. Can someone explain to me what the differences are between the two examples?
The quick answer to that is that the %{} syntax allows you to pass in named arguments to be substituted into your string whereas the %s syntax only substitutes items in the order they are given.
You can do more things with the %{} version, for example if you have the same string you need to substitute in multiple times you could write it out like this:
string = "Hello %{name}, nice to meet you %{name}. Now I have said %{name} three times, I remember your name."
string % { :name => "Magnus" }
With the %s syntax, you would have had to write:
string = "Hello %s, nice to meet you %s. Now I have said %s three times, I remember your name."
string % ["Magnus", "Magnus", "Magnus"]
There are many other formats and ways to write substitutions for strings in Ruby. The full explanation can be found in the Ruby documentation for Kernel#sprintf.
formatter = "%{first} %{second} %{third} %{fourth}"
and
formatter = "%s %s %s %s"
are essentially the same in that the formatting method will take values and substitute them into a string, however the first format string uses named placeholders vs. unnamed placeholders in the second.
This affects how you pass the values being substituted. The first accepts a hash with symbol keys, and uses the keys of that hash to identify which of the fields gets the associated value. In the second, you pass an array, and the values are picked positionally from the array when being substituted into the format string.
The second is more common and has been around for years, so you'll see it more often. The first, because it's newer, isn't going to run on old Rubies, but it has the advantage of resulting in somewhat more readable format strings, which is good for maintenance. Also, as @griffin says in his comment, the first allows you to repeat a value more easily if necessary. You can do it when passing an array into the old-style format, but a hash would be more efficient memory-wise, especially for cases when you've got a lot of variables you're reusing.
Note: You'll see %{foo}f and %<foo>f. There is a difference in how the variable passed in is formatted, so be careful which you use:
'%<foo>f' % {:foo => 1} # => "1.000000"
'%{foo}f' % {:foo => 1} # => "1f"
I think the difference is too subtle and will cause problems for unaware developers. For more information, see the Kernel#sprintf documentation.
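A short side-by-side sketch of the placeholder styles discussed above, using only String#% from core Ruby; the variable names and sample values are illustrative.

```ruby
named      = '%{first} %{second}' % { first: 1, second: 2 }
positional = '%s %s' % [1, 2]

# A named placeholder can be repeated without repeating the value.
greeting = 'Hello %{name}, nice to meet you %{name}.' % { name: 'Magnus' }

# Positional arguments can also be reused through an absolute index such as %1$s.
reused = '%1$s and %1$s again' % ['Magnus']

# %<name> applies the format specifier that follows it; %{name} inserts the value as-is.
price = '%<amount>.2f' % { amount: 3.14159 }

puts named, positional, greeting, reused, price
```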

What is the difference between : and "" in Ruby hashes?

I see some people using hash(es) like this:
end_points = { "dev" => "http://example.com"}
and in other places using this:
end_points = { :dev => "http://example.com"}
What is the difference between these two approaches?
"" declares a String. : declares a Symbol. If you're using a hash, and you don't need to alter the key's value or keep it around for anything, use a symbol.
Check this out for a more elaborate explanation.
:dev is a Symbol, 'dev' is a String.
Most of the time symbols are used, but both are correct. Some reading on the subject:
What are symbols and how do we use them?
Why use symbols as hash keys in Ruby?
In the first case you use a string; in the second you use a symbol. Symbols are a specific type in Ruby. In the whole program there is only one instance of a given symbol, but there can be many instances of an equal string, e.g.
> :sym.__id__
=> 321608
> :sym.__id__
=> 321608
> "sym".__id__
=> 17029680
> "sym".__id__
=> 17130280
As you can see, a symbol always has the same ID, which means that it is always the same object, whereas each string is a new object in a new place in memory. That is why symbols are more common as hash keys: it's simply faster.
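The same point can be checked with equal?, which compares object identity. A minimal sketch; dup is used here only to guarantee two separate String objects regardless of any frozen-string-literal setting.

```ruby
a = 'dev'.dup
b = 'dev'.dup

p a == b             # true: same content
p a.equal?(b)        # false: two different String objects
p :dev.equal?(:dev)  # true: always the same Symbol object

# String and symbol keys are not interchangeable when looking values up.
end_points = { 'dev' => 'http://example.com' }
p end_points['dev']  # "http://example.com"
p end_points[:dev]   # nil
```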
