Ruby object vs. hash - ruby

The code snippet below returns an object.
class Person
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
end
x = Person.new("Dan", "M")
=> #<Person:0x007f6f96600560 @name="Dan", @gender="M">
What is the difference between an object < ... > and a hash { ... }? Why wouldn't a Ruby class just return hashes?
What is the 0x007f6f96600560 in the object? I am pretty sure it's not object_id.

Object → Hash
From the excellent book "Ruby Under a Microscope" by Pat Shaughnessy :
Every Ruby object is the combination of a class pointer and an array
of instance variables.
Here's a somewhat longer description :
A user-defined Ruby object is represented by a structure called an
RObject, and is referred to by a pointer called VALUE.
Inside RObject, there is another structure called RBasic, which all
Ruby values will have.
Aside from the RBasic structure, RObject also contains numiv, a count
of how many instance variables the object has, ivptr, a pointer to an
array of values of the instance variables, and iv_index_tbl, which is
a pointer to a hash table stored in the object’s associated RClass
structure that maps the name/identity of each instance variable to its
position in the ivptr array.
From any Ruby object, it's possible to extract a hash of instance variables :
class Object
  def instance_variables_hash
    Hash[instance_variables.map { |name| [name, instance_variable_get(name)] }]
  end
end
With your example :
x.instance_variables_hash
=> {:@name=>"Dan", :@gender=>"M"}
Hash → Object ?
But you couldn't possibly create x back from this hash, because you're missing a crucial piece of information : what class is x an instance of?
So for example, you wouldn't know the methods that you can send to x :
class Dog
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
  def bark
    puts "WOOF"
  end
end
person = Person.new("Dan", "M")
dog = Dog.new("Dan", "M")
p person.instance_variables_hash
# {:@name=>"Dan", :@gender=>"M"}
p dog.instance_variables_hash == person.instance_variables_hash
# true
person.bark
# undefined method `bark' for #<Person:0x007fb3b20ed658 @name="Dan", @gender="M">
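The point about the missing class can be made concrete: once the class name travels with the hash, the object can be rebuilt. The following is a minimal sketch, not part of the original answer; `snapshot` and `restore` are hypothetical helper names, and `allocate` is used so that `initialize` is not called again.

```ruby
class Person
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
end

# Hypothetical helper: capture the class name together with the instance variables.
def snapshot(obj)
  ivars = obj.instance_variables.map { |name| [name, obj.instance_variable_get(name)] }.to_h
  { class_name: obj.class.name, ivars: ivars }
end

# Hypothetical helper: rebuild an object of the recorded class without calling initialize.
def restore(snap)
  obj = Object.const_get(snap[:class_name]).allocate
  snap[:ivars].each { |name, value| obj.instance_variable_set(name, value) }
  obj
end

person = Person.new("Dan", "M")
copy = restore(snapshot(person))
p copy.class                          # Person
p copy.instance_variable_get(:@name)  # "Dan"
p copy.equal?(person)                 # false: same state, different object
```

Without the `class_name` entry, `restore` would have no way to choose between Person and Dog.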
object_id
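To get the object_id out of the inspect string (this relation holds on older MRI versions; since Ruby 2.7, object_id is no longer derived from the memory address) :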
"0x007f6f96600560".sub('0x','').to_i(16)/2
#=> 70058620486320
And back :
"0x" + (70058620486320 * 2).to_s(16).rjust(14,'0')
#=> "0x007f6f96600560"

Of course, sometimes you can use objects and hashes for the same thing, such as storing key-value pairs:
[3] pry(main)> class Person
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
end
[3] pry(main)* => :initialize
[4] pry(main)> x = Person.new("Dan", "M")
=> #<Person:0x00000003708098 @gender="M", @name="Dan">
[13] pry(main)> y = Person.new("Peter", "M")
=> #<Person:0x0000000391fca0 @gender="M", @name="Peter">
[22] pry(main)> z = {name: "Maria", gender: "F"}
=> {:name=>"Maria", :gender=>"F"}
But these objects don't really use the power of an object-oriented programming language. Compare the definitions of a class/object and of a hash:
Ruby is a perfect Object Oriented Programming Language. The features
of the object-oriented programming language include:
Data Encapsulation:
Data Abstraction:
Polymorphism:
Inheritance:
These features have been discussed in Object Oriented Ruby.
An object-oriented program involves classes and objects. A class is
the blueprint from which individual objects are created. In
object-oriented terms, we say that your bicycle is an instance of the
class of objects known as bicycles.
Take the example of any vehicle. It comprises wheels, horsepower, and
fuel or gas tank capacity. These characteristics form the data members
of the class Vehicle. You can differentiate one vehicle from the other
with the help of these characteristics.
A vehicle can also have certain functions, such as halting, driving,
and speeding. Even these functions form the data members of the class
Vehicle. You can, therefore, define a class as a combination of
characteristics and functions.
and a hash:
A Hash is a collection of key-value pairs like this: "employee" =>
"salary". It is similar to an Array, except that indexing is done via
arbitrary keys of any object type, not an integer index.
So for storing plain data, I recommend a Hash.
On the other hand, as shown in a comment, the number that appears in the object representation is the object id with a couple of operations applied:
1) bitwise left shift:
5 << 1 # gives 10
2) conversion to hexadecimal:
(10).to_s(16)
=> "a"
pry(main)> x = Person.new("Dan", "M")
=> #<Person:0x00000003708098 @gender="M", @name="Dan">
[5] pry(main)> x.object_id
=> 28852300
[8] pry(main)> (x.object_id << 1 ).to_s(16)
=> "3708098"
Finally, in Ruby you can get the hash representation of an object like this:
hash = {}
x.instance_variables.each { |var| hash[var.to_s.delete("@")] = x.instance_variable_get(var) }

Related

Selfschizofrenia in Ruby

I'm looking at a piece of code that suffers from self schizophrenia. One object wraps another object; this is hidden from the programmer, and the code expects the identity of the wrapper and the wrapped object to be the same. This only concerns object_id, not method calls, including comparisons. I know that the VM would have problems if the wrapper gave off the same object_id as the wrapped object, but are there any Kernel, Class, or Module methods (or other commonly used methods) that rely on object_id to behave correctly?
For example, I might have code like
class HashSet
  def initialize
    @objects = {}
  end
  def add(x)
    if @objects.has_key? x.object_id
      false
    else
      @objects[x.object_id] = x
    end
  end
end
If I expect the call to add to return false, I will be surprised that I can actually add the same object twice (I'm unaware of the wrapper).
To restate the question:
Are there any Kernel, Class, or Module methods (or other commonly used methods) that rely on object_id to behave correctly?
Hash instances have a compare_by_identity mode:
a1 = "a"
a2 = "a"
p a1.object_id == a2.object_id #=>false
h = {}
h.compare_by_identity
h[a1] = 0
h[a2] = 1
p h # => {"a"=>0, "a"=>1}
p h["a"] # => nil
p h[a2] # => 1
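To connect this back to the wrapper scenario from the question, here is a small sketch. The `Wrapper` class is hypothetical (it is not from the question's code base) and simply forwards `==` and unknown methods to its target. It suggests that anything identity-based, such as `equal?` or a compare_by_identity hash, will tell the wrapper and the wrapped object apart.

```ruby
# Hypothetical wrapper that forwards unknown calls and value equality to its target.
class Wrapper
  def initialize(target)
    @target = target
  end

  def ==(other)
    @target == other
  end

  def method_missing(name, *args, &block)
    @target.respond_to?(name) ? @target.public_send(name, *args, &block) : super
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

target  = "a"
wrapper = Wrapper.new(target)

by_identity = {}.compare_by_identity
by_identity[target] = 1

p wrapper == target       # true: value equality is forwarded
p wrapper.equal?(target)  # false: equal? always compares identity
p by_identity[target]     # 1
p by_identity[wrapper]    # nil: the wrapper is a different object
p wrapper.upcase          # "A"
```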

How do I access the elements in a hash which is itself a value in a hash?

I have this hash $chicken_parts, which consists of symbol/hash pairs (many more than shown here):
$chicken_parts = { :beak => {"name"=>"Beak", "color"=>"Yellowish orange", "function"=>"Pecking"}, :claws => {"name"=>"Claws", "color"=>"Dirty", "function"=>"Scratching"} }
Then I have a class Embryo which has two class-specific hashes:
class Embryo
  @parts_grown = Hash.new
  @currently_developing = Hash.new
Over time, new pairs from $chicken_parts will be .merge!ed into @parts_grown. At various times, @currently_developing will be set to one of the symbol/hash pairs from @parts_grown.
I'm creating Embryo class functions and I want to be able to access the "name", "color", and "function" values in @currently_developing, but I don't seem to be able to do it.
  def grow_part(part)
    @parts_grown.merge!($chicken_parts[part])
  end
  def develop_part(part)
    @currently_developing = @parts_grown[part]
  end
seems to populate the hashes as expected, but
puts @currently_developing["name"]
does not work. Is this whole scheme a bad idea? Should I just make the Embryo hashes into arrays of symbols from $chicken_parts, and refer to it whenever needed? That seemed like cheating to me for some reason...
There's a little bit of confusion here. When you merge! in grow_part, you aren't adding a :beak => {etc...} pair to @parts_grown. Rather, you are merging the hash that is pointed to by the part name, adding all of the fields of that hash directly to @parts_grown. So after one grow_part, @parts_grown might look like this:
{"name"=>"Beak", "color"=>"Yellowish orange", "function"=>"Pecking"}
I don't think that's what you want. Instead, try this for grow_part:
def grow_part(part)
  @parts_grown[part] = $chicken_parts[part]
end
Note also that an instance variable assigned in the class body belongs to the class object itself, not to its instances:
class Embryo
  @parts_grown = {a: 1, b: 2}
  def show
    p @parts_grown
  end
  def self.show
    p @parts_grown
  end
end
embryo = Embryo.new
embryo.show
Embryo.show
--output:--
nil
{:a=>1, :b=>2}
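Putting both answers together, a working version might look like the sketch below. It assumes the per-instance hashes are meant to be set up in `initialize`; the `CHICKEN_PARTS` constant and the `attr_reader` lines are illustrative choices standing in for the question's `$chicken_parts` global, not something the original poster wrote.

```ruby
# CHICKEN_PARTS stands in for the question's $chicken_parts global (an assumption).
CHICKEN_PARTS = {
  beak:  { "name" => "Beak",  "color" => "Yellowish orange", "function" => "Pecking" },
  claws: { "name" => "Claws", "color" => "Dirty",            "function" => "Scratching" }
}

class Embryo
  attr_reader :parts_grown, :currently_developing

  def initialize
    # Per-instance state is set here, not in the class body.
    @parts_grown = {}
    @currently_developing = nil
  end

  def grow_part(part)
    # Store the nested hash under its symbol key instead of merging its fields.
    @parts_grown[part] = CHICKEN_PARTS.fetch(part)
  end

  def develop_part(part)
    @currently_developing = @parts_grown.fetch(part)
  end
end

embryo = Embryo.new
embryo.grow_part(:beak)
embryo.develop_part(:beak)
puts embryo.currently_developing["name"]    # Beak
puts embryo.parts_grown[:beak]["function"]  # Pecking
```

Using `fetch` instead of `[]` makes a misspelled or not-yet-grown part raise a KeyError rather than silently storing nil.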

Is there any way to convert human readable representation of Ruby's object back to this object

Just imagine a situation where the only information we know about a Ruby object is its human-readable format:
#<User::Class::Element:0x2fef43 @field1 = 1, @field2 = two, @field3 = [1,2,3]>
The task is to write a method which could convert this representation to an object of the class named in this representation (of course with access to all appropriate namespaces, modules, classes and methods). For example:
obj = humanReadableFormat2Obj("#<User::Class::Element:0x2fef43 @field1 = 1, @field2 = \"two\", @field3 = [1,2,3]>")
puts obj.field1 #=> "1"
puts obj.field2 #=> "two"
p obj.field3 #=> [1, 2, 3]
puts obj.class.to_s #=> User::Class::Element
P.S. This task originates from the problem of synchronizing several large databases. Instead of transferring objects from one database to another in a binary format (hundreds of MB), you can transfer only a script (several KB) and execute it on the other database to create the appropriate objects.
The Ox and Oj gems (XML and JSON respectively) can serialize Ruby objects into a relatively human-readable form. This would probably be a better solution, since the inspect method doesn't always return all of the attributes of a Ruby object, as Sigurd mentioned in the comments.
Example from the Ox docs:
require 'ox'
class Sample
  attr_accessor :a, :b, :c
  def initialize(a, b, c)
    @a = a
    @b = b
    @c = c
  end
end
# Create Object
obj = Sample.new(1, "bee", ['x', :y, 7.0])
# Now dump the Object to an XML String.
xml = Ox.dump(obj)
# Convert the object back into a Sample Object.
obj2 = Ox.parse_obj(xml)
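If adding a gem is not an option, the same round trip can be sketched with the standard library's json instead of Ox. This is a swapped-in technique, not what the answer above uses. The `Element` class and its `from_json` method are hypothetical stand-ins for User::Class::Element from the question, and the sketch assumes the fields are plain JSON-compatible values.

```ruby
require 'json'

# Hypothetical class standing in for User::Class::Element from the question.
class Element
  attr_reader :field1, :field2, :field3

  def initialize(field1, field2, field3)
    @field1 = field1
    @field2 = field2
    @field3 = field3
  end

  # Serialize the class name together with the state.
  def to_json(*args)
    { "class" => self.class.name, "fields" => [field1, field2, field3] }.to_json(*args)
  end

  # Look up the recorded class and rebuild the object through its constructor.
  def self.from_json(text)
    data = JSON.parse(text)
    Object.const_get(data.fetch("class")).new(*data.fetch("fields"))
  end
end

text = Element.new(1, "two", [1, 2, 3]).to_json
obj  = Element.from_json(text)
p obj.field2  # "two"
p obj.field3  # [1, 2, 3]
p obj.class   # Element
```

Unlike parsing the inspect string, this records exactly the state the class chooses to expose, so nothing depends on how inspect happens to format values.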

Uniq of ruby array fails to work

I have an array of my Country objects, which have the attributes "code" and "name".
The array could contain a country more than once, so I want to remove the duplicates.
This is my Country class:
class Country
  include Mongoid::Fields::Serializable
  attr_accessor :name, :code
  FILTERS = ["Afghanistan","Brunei","Iran", "Kuwait", "Libya", "Saudi Arabia", "Sudan", "Yemen", "Britain (UK)", "Antarctica", "Bonaire Sint Eustatius & Saba", "British Indian Ocean Territory", "Cocos (Keeling) Islands", "St Barthelemy", "St Martin (French part)", "Svalbard & Jan Mayen","Vatican City"]
  EXTRAS = {
    'eng' => 'England',
    'wal' => 'Wales',
    'sco' => 'Scotland',
    'nlr' => 'Northern Ireland'
  }
  def initialize(name, code)
    @name = name
    @code = code
  end
  def deserialize(object)
    return nil unless object
    Country.new(object['name'], object['code'])
  end
  def serialize(country)
    {:name => country.name, :code => country.code}
  end
  def self.all
    add_extras(filter(TZInfo::Country.all.map{|country| to_country country})).sort! {|c1, c2| c1.name <=> c2.name}
  end
  def self.get(code)
    begin
      to_country TZInfo::Country.get(code)
    rescue TZInfo::InvalidCountryCode => e
      'InvalidCountryCode' unless EXTRAS.has_key? code
      Country.new EXTRAS[code], code
    end
  end
  def self.get_by_name(name)
    all.select {|country| country.name.downcase == name.downcase}.first
  end
  def self.filter(countries)
    countries.reject {|country| FILTERS.include?(country.name)}
  end
  def self.add_extras(countries)
    countries + EXTRAS.map{|k,v| Country.new v, k}
  end
  private
  def self.to_country(country)
    Country.new country.name, country.code
  end
end
and my request for the array which is called from another class
def countries_ive_drunk
(had_drinks.map {|drink| drink.beer.country }).uniq
end
If I print the array I can see the structure is:
[
#<Country:0x5e3b4c8 @name="Belarus", @code="BY">,
#<Country:0x5e396e0 @name="Britain (UK)", @code="GB">,
#<Country:0x5e3f350 @name="Czech Republic", @code="CZ">,
#<Country:0x5e3d730 @name="Germany", @code="DE">,
#<Country:0x5e43778 @name="United States", @code="US">,
#<Country:0x5e42398 @name="England", @code="eng">,
#<Country:0x5e40f70 @name="Aaland Islands", @code="AX">,
#<Country:0x5e47978 @name="England", @code="eng">,
#<Country:0x5e46358 @name="Portugal", @code="PT">,
#<Country:0x5e44d38 @name="Georgia", @code="GE">,
#<Country:0x5e4b668 @name="Germany", @code="DE">,
#<Country:0x5e4a2a0 @name="Anguilla", @code="AI">,
#<Country:0x5e48c98 @name="Anguilla", @code="AI">
]
This is the same whether or not I call .uniq, and you can see there are two "Anguilla" entries.
As pointed out by others, the problem is that uniq uses hash to distinguish between countries and that by default, Object#hash is different for all objects. It will also use eql? in case two objects return the same hash value, to be sure if they are eql or not.
The best solution is to make your class correct in the first place!
class Country
# ... your previous code, plus:
include Comparable
def <=>(other)
return nil unless other.is_a?(Country)
(code <=> other.code).nonzero? || (name <=> other.name)
# or less fancy:
# [code, name] <=> [other.code, other.name]
end
def hash
[name, code].hash
end
alias eql? ==
end
Country.new("Canada", "CA").eql?(Country.new("Canada", "CA")) # => true
Now you can sort arrays of Countries, use countries as key for hashes, compare them, etc...
I've included the above code to show how it's done in general, but in your case, you get all this for free if you subclass Struct.new(:name, :code)...
class Country < Struct.new(:name, :code)
  # ... the rest of your code, without the `attr_accessor` nor the `initialize`
  # as Struct provides these and `hash`, `eql?`, `==`, ...
end
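As a quick check of the Struct claim, the sketch below uses a simplified stand-in for Country (no Mongoid or TZInfo, which are assumed to be irrelevant to the equality behaviour) and shows that `uniq` removes the duplicate once `hash` and `eql?` are value-based.

```ruby
# Simplified stand-in for the Country class above, without Mongoid or TZInfo.
Country = Struct.new(:name, :code)

countries = [
  Country.new("Anguilla", "AI"),
  Country.new("Germany", "DE"),
  Country.new("Anguilla", "AI")
]

p Country.new("Anguilla", "AI") == Country.new("Anguilla", "AI")     # true
p Country.new("Anguilla", "AI").eql?(Country.new("Anguilla", "AI"))  # true
p countries.uniq.map(&:code)                                         # ["AI", "DE"]
```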
Objects in an array are considered duplicates by Array#uniq if their #hash values are equal and they are eql?, which is not the case in this code. You need a different approach to do what you intended, like this:
def countries_ive_drunk
had_drinks.map {|drink| drink.beer.country.code }
.uniq
.map { |code| Country.get code}
end
This boils down to what does equality mean? When is an object a duplicate of another? The default implementations of ==, eql? just compare the ruby object_id which is why you don't get the results you want.
You could implement ==, eql? and hash in a way that makes sense for your class, for example by comparing the countries' codes.
An alternative is to use uniq_by. This is an active support addition to Array, but mongoid depends on active support anyway, so you wouldn't be adding a dependency.
some_list_of_countries.uniq_by {|c| c.code}
Would use countries' codes to uniq them. You can shorten that to
some_list_of_countries.uniq_by(&:code)
Each element in the array is a separate class instance.
#<Country:0x5e4a2a0 @name="Anguilla", @code="AI">
#<Country:0x5e48c98 @name="Anguilla", @code="AI">
The ids are unique.
#<Country:0x5e4a2a0 @name="Anguilla", @code="AI">,
#<Country:0x5e48c98 @name="Anguilla", @code="AI">
Array#uniq thinks these are different objects (different instances of the Country class), because the objects' ids are different.
Obviously you need to change your strategy.
At least as early as 1.9.3, Array#uniq would take a block just like uniq_by. uniq_by is now deprecated.
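A minimal sketch of the block form follows. `PlainCountry` is a hypothetical class that, like the original Country, defines no custom `==`, `eql?` or `hash`.

```ruby
# Plain class with no custom ==, eql? or hash, like the original Country.
class PlainCountry
  attr_reader :name, :code

  def initialize(name, code)
    @name = name
    @code = code
  end
end

list = [
  PlainCountry.new("Anguilla", "AI"),
  PlainCountry.new("Germany", "DE"),
  PlainCountry.new("Anguilla", "AI")
]

p list.uniq.size                        # 3: default hash/eql? are identity-based
p list.uniq { |c| c.code }.map(&:name)  # ["Anguilla", "Germany"]
p list.uniq(&:code).size                # 2
```

The block form keeps the first element seen for each key, so the class itself does not need to change.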

When to use Struct instead of Hash in Ruby?

I don't have much programming experience. But, to me, Struct seems somewhat similar to Hash.
What can Struct do well?
Is there anything Struct can do, that Hash cannot do?
After googling, the concept of Struct is important in C, but I don't know much about C.
Structs differ from using hashmaps in the following ways (in addition to how the code looks):
A struct has a fixed set of attributes, while you can add new keys to a hash at any time.
Calling an attribute that does not exist on an instance of a struct will cause a NoMethodError, while getting the value for a non-existing key from a hash will just return nil.
Two instances of different structs will never be equal even if the structs have the same attributes and the instances have the same values (i.e. Struct.new(:x).new(42) == Struct.new(:x).new(42) is false, whereas Foo = Struct.new(:x); Foo.new(42)==Foo.new(42) is true).
The to_a method for structs returns an array of values, while to_a on a hash gets you an array of key-value-pairs (where "pair" means "two-element array")
If Foo = Struct.new(:x, :y, :z) you can do Foo.new(1,2,3) to create an instance of Foo without having to spell out the attribute names.
So to answer the question: When you want to model objects with a known set of attributes, use structs. When you want to model arbitrary mappings, use hashmaps (e.g. counting how often each word occurs in a string or mapping nicknames to full names etc. are definitely not jobs for a struct, while modeling a person with a name, an age and an address would be a perfect fit for Person = Struct.new(:name, :age, :address)).
As a sidenote: C structs have little to nothing to do with ruby structs, so don't let yourself get confused by that.
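The differences listed in this answer can be checked in a few lines. The sketch below uses hypothetical `Point` and `Other` structs purely for illustration.

```ruby
Point = Struct.new(:x, :y)
point = Point.new(1, 2)
hash  = { x: 1, y: 2 }

# Unknown attribute: the struct raises, the hash returns nil.
begin
  point.z
rescue NoMethodError => e
  puts "Struct: #{e.class}"  # Struct: NoMethodError
end
p hash[:z]                   # nil

# to_a: values only vs. key-value pairs.
p point.to_a                 # [1, 2]
p hash.to_a                  # [[:x, 1], [:y, 2]]

# Equality depends on the struct class, not just on the values.
Other = Struct.new(:x, :y)
p Point.new(1, 2) == Point.new(1, 2)  # true
p Point.new(1, 2) == Other.new(1, 2)  # false
```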
I know this question was almost well-answered, but surprisingly nobody has talked about one of the biggest differences and the real benefits of Struct. And I guess that's why somebody is still asking.
I understand the differences, but what's the real advantage to using a Struct over a Hash, when a Hash can do the same thing, and is simpler to deal with? Seems like Structs are kind of superfluous.
Struct is faster.
require 'benchmark'
Benchmark.bm 10 do |bench|
bench.report "Hash: " do
50_000_000.times do { name: "John Smith", age: 45 } end
end
bench.report "Struct: " do
klass = Struct.new(:name, :age)
50_000_000.times do klass.new("John Smith", 45) end
end
end
# ruby 2.2.2p95 (2015-04-13 revision 50295) [x64-mingw32].
# user system total real
# Hash: 22.340000 0.016000 22.356000 ( 24.260674)
# Struct: 12.979000 0.000000 12.979000 ( 14.095455)
# ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin11.0]
#
# user system total real
# Hash: 31.980000 0.060000 32.040000 ( 32.039914)
# Struct: 16.880000 0.010000 16.890000 ( 16.886061)
One more main difference is you can add behavior methods to a Struct.
Customer = Struct.new(:name, :address) do
def greeting; "Hello #{name}!" ; end
end
Customer.new("Dave", "123 Main").greeting # => "Hello Dave!"
From the Struct documentation:
A Struct is a convenient way to bundle a number of attributes together, using accessor methods, without having to write an explicit class.
On the other hand, a Hash:
A Hash is a collection of key-value pairs. It is similar to an Array, except that indexing is done via arbitrary keys of any object type, not an integer index. The order in which you traverse a hash by either key or value may seem arbitrary, and will generally not be in the insertion order.
The main difference is how you access your data.
ruby-1.9.1-p378 > Point = Struct.new(:x, :y)
=> Point
ruby-1.9.1-p378 > p = Point.new(4,5)
=> #<struct Point x=4, y=5>
ruby-1.9.1-p378 > p.x
=> 4
ruby-1.9.1-p378 > p.y
=> 5
ruby-1.9.1-p378 > p = {:x => 4, :y => 5}
=> {:x=>4, :y=>5}
ruby-1.9.1-p378 > p.x
NoMethodError: undefined method `x' for {:x=>4, :y=>5}:Hash
from (irb):7
from /Users/mr/.rvm/rubies/ruby-1.9.1-p378/bin/irb:17:in `<main>'
ruby-1.9.1-p378 > p[:x]
=> 4
ruby-1.9.1-p378 > p[:y]
=> 5
In short, you would make a new Struct when you want a class that's a "plain old data" structure (optionally with the intent of extending it with more methods), and you would use a Hash when you don't need a formal type at all.
If you're just going to encapsulate the data, then a Hash (or an Array of Hashes) are fine. If you're planning to have the data manipulate or interact with other data, then a Struct can open some interesting possibilities:
Point = Struct.new(:x, :y)
point_a = Point.new(0,0)
point_b = Point.new(2,3)
class Point
def distance_to another_point
Math.sqrt((self.x - another_point.x)**2 + (self.y - another_point.y)**2)
end
end
puts point_a.distance_to point_b

Resources