Ruby: How come the same strings have different hashcodes? - ruby

test = 'a'
test2 = '#a'.slice(0)
test3 = '#a'[1]
puts test.hash
puts test2.hash
puts test3.hash
Output:
100
64
97
Is this a bug or am I misunderstanding how the hash method works? Is there a way to fix this?

The results of these expressions are not all the same data. Ruby 1.8 integers contain character numbers for single character indexing. This has been changed in Ruby 1.9, but slice(0) returns the first character of the string '#', not 'a'.
In Ruby 1.8 (using irb):
irb(main):001:0> test = 'a'
=> "a"
irb(main):002:0> test2 = '#a'.slice(0)
=> 64
irb(main):003:0> test3 = '#a'[1]
=> 97
irb(main):004:0> test.hash
=> 100
irb(main):005:0> test2.hash
=> 129
irb(main):006:0> test3.hash
=> 195
In Ruby 1.9.1:
irb(main):001:0> test = 'a'
=> "a"
irb(main):002:0> test2 = '#a'.slice(0)
=> "#"
irb(main):003:0> test3 = '#a'[1]
=> "a"
irb(main):004:0> test.hash
=> 1365935838
irb(main):005:0> test2.hash
=> 347394336
irb(main):006:0> test3.hash
=> 1365935838

The reason is that each variable refers to different a object with its own unique hash code! The variable test is the string "a", test2 is the integer 64 (the character number of '#'), and test3 is the integer 97 ('a'). The surprise is probably that in Ruby, the elements of strings are integers, not strings or characters.

As maerics points out, unless you've defined your own hash method for the class you're using, the hash might simply be on the object itself, not its contents. That said, you can (and should) define your own hash method for any class where you define an equals method.
In Ruby, the String class already does this for you:
irb(main):001:0> test="a"
=> "a"
irb(main):002:0> test2="a"
=> "a"
irb(main):003:0> test.hash
=> 100
irb(main):004:0> test2.hash
=> 100
irb(main):005:0> test2[0]=test.slice(0)
=> 97
irb(main):006:0> test2
=> "a"
irb(main):007:0> test2.hash
=> 100
I haven't found an equivalent text for Ruby, but this page on Java gives an excellent algorithm for generating your own hash code that's not hard to copy for Ruby: http://www.javapractices.com/topic/TopicAction.do?Id=28

Related

IRB (apparently) not inspecting hashes correctly

I'm seeing some odd behavior in IRB 1.8.7 with printing hashes. If I initialize my hash with a Hash.new, it appears that my hash is "evaluating" to an empty hash:
irb(main):024:0> h = Hash.new([])
=> {}
irb(main):025:0> h["test"]
=> []
irb(main):026:0> h["test"] << "blah"
=> ["blah"]
irb(main):027:0> h
=> {}
irb(main):028:0> puts h.inspect
{}
=> nil
irb(main):031:0> require 'pp'
=> true
irb(main):032:0> pp h
{}
=> nil
irb(main):033:0> h["test"]
=> ["blah"]
As you can see, the data is actually present in the hash, but trying to print or display it seems to fail. Initialization with a hash literal seems to fix this problem:
irb(main):050:0> hash = { 'test' => ['testval'] }
=> {"test"=>["testval"]}
irb(main):051:0> hash
=> {"test"=>["testval"]}
irb(main):053:0> hash['othertest'] = ['secondval']
=> ["secondval"]
irb(main):054:0> hash
=> {"othertest"=>["secondval"], "test"=>["testval"]}
The issue here is that invoking h["test"] doesn't actually insert a new key into the hash - it just returns the default value, which is the array that you passed to Hash.new.
1.8.7 :010 > a = []
=> []
1.8.7 :011 > a.object_id
=> 70338238506580
1.8.7 :012 > h = Hash.new(a)
=> {}
1.8.7 :013 > h["test"].object_id
=> 70338238506580
1.8.7 :014 > h["test"] << "blah"
=> ["blah"]
1.8.7 :015 > h.keys
=> []
1.8.7 :016 > h["bogus"]
=> ["blah"]
1.8.7 :017 > h["bogus"].object_id
=> 70338238506580
1.8.7 :019 > a
=> ["blah"]
The hash itself is still empty - you haven't assigned anything to it. The data isn't present in the hash - it's present in the array that is returned for missing keys in the hash.
It looks like you're trying to create a hash of arrays. To do so, I recommend you initialize like so:
h = Hash.new { |h,k| h[k] = [] }
Your version isn't working correctly for me, either. The reason why is a little complicated to understand. From the docs:
If obj is specified, this single object will be used for all default values.
I've added the bolding. The rest of the emphasis is as-is.
You're specifying that obj is [], and it's only a default value. It doesn't actually set the contents of the hash to that default value. So when you do h["blah"] << "test", you're really just asking it to return a copy of the default value and then adding "test" to that copy. It never goes into the hash at all. (I need to give Chris Heald credit for explaining this below.)
If instead you give it a block, it calls that block EVERY TIME you do a lookup on a non-existent entry of the hash. So you're not just creating one Array anymore. You're creating one for each entry of the hash.

The confusing Ruby method returns value

I have Ruby code:
def test_111(hash)
n = nil
3.times do |c|
if n
n[c] = c
else
n = hash
end
end
end
a = {}
test_111(a)
p a
Why it print {1=>1, 2=>2}, not the {} ??
In the test_111 method, the hash and the a use the same memory?
How can the a value be changed in the test_111 method?
I can't understand
Hashes are passed by reference. So, when you change a method parameter (which is a Hash), you change the original hash.
To avoid this, you should clone the hash.
test_111(a.dup)
This will create a shallow copy (that is, it will not clone child hashes that you may have).
A little illustration of what shallow copy is:
def mutate hash
hash[:new] = 1
hash[:existing][:value] = 2
hash
end
h = {existing: {value: 1}}
mutate h # => {:existing=>{:value=>2}, :new=>1}
# new member added, existing member changed
h # => {:existing=>{:value=>2}, :new=>1}
h = {existing: {value: 1}}
mutate h.dup # => {:existing=>{:value=>2}, :new=>1}
# existing member changed, no new members
h # => {:existing=>{:value=>2}}
In ruby, just about every object is passed by reference. This means when you do something as simple as
a = b
unless a was one of the simple types, after this assignment a and b will point to the same thing.
This means if you alter the second variable, the first is affected the same way:
irb(main):001:0> x = "a string"
=> "a string"
irb(main):002:0> y = x
=> "a string"
irb(main):003:0> x[1,0] = "nother"
=> "nother"
irb(main):004:0> x
=> "another string"
irb(main):005:0> y
=> "another string"
irb(main):006:0>
and of course the same applies for hashes:
irb(main):006:0> a = { :a => 1 }
=> {:a=>1}
irb(main):007:0> b = a
=> {:a=>1}
irb(main):008:0> a[:b] = 2
=> 2
irb(main):009:0> a
=> {:a=>1, :b=>2}
irb(main):010:0> b
=> {:a=>1, :b=>2}
irb(main):011:0>
If you don't want this to happen, use .dup or .clone:
irb(main):001:0> a = "a string"
=> "a string"
irb(main):002:0> b = a.dup
=> "a string"
irb(main):003:0> a[1,0] = "nother"
=> "nother"
irb(main):004:0> a
=> "another string"
irb(main):005:0> b
=> "a string"
irb(main):006:0>
For most people dup and clone have the same effect.
So if you write a function that modifies one of its parameters, unless you specifically want those changes to be seen by the code that calls the function, you should first dup the parameter being modified:
def test_111(hash)
hash = hash.dup
# etc
end
The behavior of your code is called a side effect - a change to the program's state that isn't a core part of the function. Side effects are generally to be avoided.

Finding out variable class and difference between Hash and Array

Is it possible to clearly identify a class of variable?
something like:
#users.who_r_u? #=>Class (some information)
#packs.who_r_u? #=> Array (some information)
etc.
Can someone provide clear short explanation of difference between Class, Hash, Array, Associated Array, etc. ?
You can use:
#users.class
Test it in irb:
1.9.3p0 :001 > 1.class
=> Fixnum
1.9.3p0 :002 > "1".class
=> String
1.9.3p0 :003 > [1].class
=> Array
1.9.3p0 :004 > {:a => 1}.class
=> Hash
1.9.3p0 :005 > (1..10).class
=> Range
Or:
1.9.3p0 :010 > class User
1.9.3p0 :011?> end
=> nil
1.9.3p0 :012 > #user = User.new
=> #<User:0x0000010111bfc8>
1.9.3p0 :013 > #user.class
=> User
These were only quick irb examples, hope it's enough to see the use of .class in ruby.
You could also use kind_of? to test wheter its receiver is a class, an array or anything else.
#users.kind_of?(Array) # => true
You can find these methods in Ruby document http://ruby-doc.org/core-1.9.3/Object.html
#user.class => User
#user.is_a?(User) => true
#user.kind_of?(User) => true
found helpful: <%= debug #users %>
A difference between Class and Hash? They are too different to even provide normal answer. Hash is basically an array with unique keys, where each key has its associated value. That's why it's also called associative array.
Here is some explanation:
array = [1,2,3,4]
array[0] # => 1
array[-1] # => 4
array[0..2] # => [1,2,3]
array.size # => 4
Check out more Array methods here: http://ruby-doc.org/core-1.9.3/Array.html
hash = {:foo => 1, :bar => 34, :baz => 22}
hash[:foo] # => 1
hash[:bar] # => 34
hash.keys # => [:baz,:foo,:bar]
hash.values # => [34,22,1]
hash.merge :foo => 3921
hash # => {:bar => 34,:foo => 3921,:baz => 22 }
Hash never keeps order of the elments you added to it, it just preserves uniqueness of keys, so you can easily retreive values.
However, if you do this:
hash.merge "foo" => 12
you will get
hash # => {:bar => 34, baz => 22, "foo" => 12, :foo => 2}
It created new key-value pair since :foo.eql? "foo" returns false.
For more Hash methods check out this: http://www.ruby-doc.org/core-1.9.3/Hash.html
Class object is a bit too complex to explain in short, but if you want to learn more about it, reffer to some online tutorials.
And remember, API is your friend.

What are values mapped to?

I don't understand how Ruby hashes work.
I expect these:
a = 'a'
{a => 1}[a] # => 1
{a: 1}[:a] # => 1
{2 => 1}[2] # => 1
How does this work?
{'a' => 1}['a'] # => 1
The first string 'a' is not the same object as the second string 'a'.
Ruby doesn't use object equality (equal?) for comparing hash keys. It wouldn't be very useful if it did after all.
Instead it uses eql?, which for strings is the same as ==
As a footnote to other answers, you can let a hash behave like you expected:
h = {'a'=> 1}
p h['a'] #=> 1
h.compare_by_identity
p h['a'] #=> nil ; not the same object
some_hash[k] = v
Basically, when you do this, what is stored is not a direct association k => v. Instead of that, k is asked for a hash code, which is then used to map to v.
Equal values yield equal hash codes. That's why your last example works the way it does.
A couple of examples:
1.9.3p0 :001 > s = 'string'
=> "string"
1.9.3p0 :002 > 'string'.hash
=> -895223107629439507
1.9.3p0 :003 > 'string'.hash == s.hash
=> true
1.9.3p0 :004 > 2.hash
=> 2271355725836199018
1.9.3p0 :005 > nil.hash
=> 2199521878082658865

Ruby equivalent of Perl Data::Dumper

I am learning Ruby & Perl has this very convenient module called Data::Dumper, which allows you to recursively analyze a data structure (like hash) & allow you to print it. This is very useful while debugging. Is there some thing similar for Ruby?
Look into pp
example:
require 'pp'
x = { :a => [1,2,3, {:foo => bar}]}
pp x
there is also the inspect method which also works quite nicely
x = { :a => [1,2,3, {:foo => bar}]}
puts x.inspect
I normally use a YAML dump if I need to quickly check something.
In irb the syntax is simply y obj_to_inspect. In a normal Ruby app, you may need to add a require 'YAML' to the file, not sure.
Here is an example in irb:
>> my_hash = {:array => [0,2,5,6], :sub_hash => {:a => 1, :b => 2}, :visible => true}
=> {:sub_hash=>{:b=>2, :a=>1}, :visible=>true, :array=>[0, 2, 5, 6]}
>> y my_hash # <----- THE IMPORTANT LINE
---
:sub_hash:
:b: 2
:a: 1
:visible: true
:array:
- 0
- 2
- 5
- 6
=> nil
>>
The final => nil just means the method didn't return anything. It has nothing to do with your data structure.
you can use Marshal, amarshal, YAML

Resources