When to use Struct instead of Hash in Ruby? - ruby

I don't have much programming experience. But, to me, Struct seems somewhat similar to Hash.
What can Struct do well?
Is there anything Struct can do, that Hash cannot do?
After googling, the concept of Struct is important in C, but I don't know much about C.

Structs differ from using hashmaps in the following ways (in addition to how the code looks):
A struct has a fixed set of attributes, while you can add new keys to a hash at any time.
Calling an attribute that does not exist on an instance of a struct will cause a NoMethodError, while getting the value for a non-existing key from a hash will just return nil.
Two instances of different structs will never be equal even if the structs have the same attributes and the instances have the same values (i.e. Struct.new(:x).new(42) == Struct.new(:x).new(42) is false, whereas Foo = Struct.new(:x); Foo.new(42)==Foo.new(42) is true).
The to_a method for structs returns an array of values, while to_a on a hash gets you an array of key-value-pairs (where "pair" means "two-element array")
If Foo = Struct.new(:x, :y, :z) you can do Foo.new(1,2,3) to create an instance of Foo without having to spell out the attribute names.
So to answer the question: When you want to model objects with a known set of attributes, use structs. When you want to model arbitrary key-value mappings, use hashmaps (e.g. counting how often each word occurs in a string or mapping nicknames to full names are definitely not jobs for a struct, while modeling a person with a name, an age and an address would be a perfect fit for Person = Struct.new(:name, :age, :address)).
As a sidenote: C structs have little to nothing to do with ruby structs, so don't let yourself get confused by that.
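To make the list above concrete, here is a minimal sketch of those differences side by side. The names Coord and OtherCoord are made up for illustration; nothing here is specific to any library.

```ruby
# Hypothetical struct classes with identical attributes.
Coord = Struct.new(:x, :y)
OtherCoord = Struct.new(:x, :y)

coord = Coord.new(1, 2)          # positional arguments, no attribute names needed
hash  = { x: 1, y: 2 }

# An unknown attribute raises, an unknown key quietly returns nil.
unknown_attribute =
  begin
    coord.z
  rescue NoMethodError => e
    e.class                      # NoMethodError
  end
unknown_key = hash[:z]           # nil

# Equality depends on the struct class as well as the values.
Coord.new(1, 2) == Coord.new(1, 2)       # true
Coord.new(1, 2) == OtherCoord.new(1, 2)  # false

# to_a gives values for a struct, key-value pairs for a hash.
coord.to_a                       # [1, 2]
hash.to_a                        # [[:x, 1], [:y, 2]]
```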

I know this question was almost well-answered, but surprisingly nobody has talked about one of the biggest differences and the real benefits of Struct. And I guess that's why somebody is still asking.
I understand the differences, but what's the real advantage to using a Struct over a Hash, when a Hash can do the same thing, and is simpler to deal with? Seems like Structs are kind of superfluous.
Struct is faster.
require 'benchmark'
Benchmark.bm 10 do |bench|
  bench.report "Hash: " do
    50_000_000.times do { name: "John Smith", age: 45 } end
  end
  bench.report "Struct: " do
    klass = Struct.new(:name, :age)
    50_000_000.times do klass.new("John Smith", 45) end
  end
end
# ruby 2.2.2p95 (2015-04-13 revision 50295) [x64-mingw32].
# user system total real
# Hash: 22.340000 0.016000 22.356000 ( 24.260674)
# Struct: 12.979000 0.000000 12.979000 ( 14.095455)
# ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin11.0]
#
# user system total real
# Hash: 31.980000 0.060000 32.040000 ( 32.039914)
# Struct: 16.880000 0.010000 16.890000 ( 16.886061)

One more main difference is you can add behavior methods to a Struct.
Customer = Struct.new(:name, :address) do
  def greeting
    "Hello #{name}!"
  end
end
Customer.new("Dave", "123 Main").greeting # => "Hello Dave!"

From the Struct documentation:
A Struct is a convenient way to bundle a number of attributes together, using accessor methods, without having to write an explicit class.
On the other hand, a Hash:
A Hash is a collection of key-value pairs. It is similar to an Array, except that indexing is done via arbitrary keys of any object type, not an integer index. The order in which you traverse a hash by either key or value may seem arbitrary, and will generally not be in the insertion order.
The main difference is how you access your data.
ruby-1.9.1-p378 > Point = Struct.new(:x, :y)
=> Point
ruby-1.9.1-p378 > p = Point.new(4,5)
=> #<struct Point x=4, y=5>
ruby-1.9.1-p378 > p.x
=> 4
ruby-1.9.1-p378 > p.y
=> 5
ruby-1.9.1-p378 > p = {:x => 4, :y => 5}
=> {:x=>4, :y=>5}
ruby-1.9.1-p378 > p.x
NoMethodError: undefined method `x' for {:x=>4, :y=>5}:Hash
from (irb):7
from /Users/mr/.rvm/rubies/ruby-1.9.1-p378/bin/irb:17:in `<main>'
ruby-1.9.1-p378 > p[:x]
=> 4
ruby-1.9.1-p378 > p[:y]
=> 5
In short, you would make a new Struct when you want a class that's a "plain old data" structure (optionally with the intent of extending it with more methods), and you would use a Hash when you don't need a formal type at all.
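As a small addition to the access comparison above: a Struct instance also accepts hash-style and index-style access, and can be converted to a Hash when needed. This is a sketch using only core Struct methods; the Pixel name is made up for the example.

```ruby
# Hypothetical struct used only for this illustration.
Pixel = Struct.new(:x, :y)
pixel = Pixel.new(4, 5)

pixel.x        # 4, accessor method
pixel[:x]      # 4, symbol key
pixel["y"]     # 5, string key
pixel[0]       # 4, positional index
pixel.to_h     # a Hash with the keys :x and :y
Pixel.members  # [:x, :y]
```

So moving from a Hash to a Struct does not necessarily mean rewriting every p[:x] call at once.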

If you're just going to encapsulate the data, then a Hash (or an Array of Hashes) is fine. If you're planning to have the data manipulate or interact with other data, then a Struct can open some interesting possibilities:
Point = Struct.new(:x, :y)
point_a = Point.new(0,0)
point_b = Point.new(2,3)
class Point
  def distance_to another_point
    Math.sqrt((self.x - another_point.x)**2 + (self.y - another_point.y)**2)
  end
end
puts point_a.distance_to point_b

Related

Ruby object vs. hash

Code snippet below returns an object.
class Person
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
end
x = Person.new("Dan", "M")
=> #<Person:0x007f6f96600560 @name="Dan", @gender="M">
What is the difference between an object < ... > and a hash { ... }? Why wouldn't a Ruby class just return hashes?
What is the 0x007f6f96600560 in the object? I am pretty sure it's not object_id.
Object → Hash
From the excellent book "Ruby under the microscope" by Pat Shaughnessy :
Every Ruby object is the combination of a class pointer and an array
of instance variables.
Here's a somewhat longer description :
A user-defined Ruby object is represented by a structure called an
RObject, and is referred to by a pointer called VALUE.
Inside RObject, there is another structure called RBasic, which all
Ruby values will have.
Aside from the RBasic structure, RObject also contains numiv, a count
of how many instance variables the object has, ivptr, a pointer to an
array of values of the instance variables, and iv_index_tbl, which is
a pointer to a hash table stored in the object’s associated RClass
structure that maps the name/identity of each instance variable to its
position in the ivptr array.
From any Ruby object, it's possible to extract a hash of instance variables :
class Object
  def instance_variables_hash
    Hash[instance_variables.map { |name| [name, instance_variable_get(name)] } ]
  end
end
With your example :
x.instance_variables_hash
=> {:@name=>"Dan", :@gender=>"M"}
Hash → Object ?
But you couldn't possibly create x back from this hash, because you're missing a crucial piece of information : what class is x an instance of?
So for example, you wouldn't know the methods that you can send to x :
class Dog
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
  def bark
    puts "WOOF"
  end
end
person = Person.new("Dan", "M")
dog = Dog.new("Dan", "M")
p person.instance_variables_hash
# {:@name=>"Dan", :@gender=>"M"}
p dog.instance_variables_hash == person.instance_variables_hash
# true
person.bark
# undefined method `bark' for #<Person:0x007fb3b20ed658 @name="Dan", @gender="M">
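To illustrate the point about the missing class: if you keep the class alongside the instance variables, rebuilding the object becomes possible. The following is only a sketch; dump_object and restore_object are hypothetical helper names, and restore_object bypasses initialize by using allocate, which may not be appropriate for classes that do real work in their constructor.

```ruby
class Person
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
end

# Hypothetical helper: capture the class together with the instance variables.
def dump_object(obj)
  ivars = obj.instance_variables.map { |name| [name, obj.instance_variable_get(name)] }.to_h
  { klass: obj.class, ivars: ivars }
end

# Hypothetical helper: rebuild an object from such a dump without calling initialize.
def restore_object(dump)
  obj = dump[:klass].allocate
  dump[:ivars].each { |name, value| obj.instance_variable_set(name, value) }
  obj
end

snapshot = dump_object(Person.new("Dan", "M"))
copy = restore_object(snapshot)
copy.class                          # Person
copy.instance_variable_get(:@name)  # "Dan"
```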
object_id
To get the object_id out of the inspect string :
"0x007f6f96600560".sub('0x','').to_i(16)/2
#=> 70058620486320
And back :
"0x" + (70058620486320 * 2).to_s(16).rjust(14,'0')
#=> "0x007f6f96600560"
Of course, sometimes you can use objects and hashes for the same thing, such as storing key-value pairs, like this:
[3] pry(main)> class Person
  def initialize(name, gender)
    @name = name
    @gender = gender
  end
end
[3] pry(main)* => :initialize
[4] pry(main)> x = Person.new("Dan", "M")
=> #<Person:0x00000003708098 @gender="M", @name="Dan">
[13] pry(main)> y = Person.new("Peter", "M")
=> #<Person:0x0000000391fca0 @gender="M", @name="Peter">
[22] pry(main)> z = {name: "Maria", gender: "F"}
=> {:name=>"Maria", :gender=>"F"}
But objects used this way don't really get the full power of an object-oriented programming language. Compare the definitions of a class/object and of a hash:
Ruby is a perfect Object Oriented Programming Language. The features
of the object-oriented programming language include:
Data Encapsulation:
Data Abstraction:
Polymorphism:
Inheritance:
These features have been discussed in Object Oriented Ruby.
An object-oriented program involves classes and objects. A class is
the blueprint from which individual objects are created. In
object-oriented terms, we say that your bicycle is an instance of the
class of objects known as bicycles.
Take the example of any vehicle. It comprises wheels, horsepower, and
fuel or gas tank capacity. These characteristics form the data members
of the class Vehicle. You can differentiate one vehicle from the other
with the help of these characteristics.
A vehicle can also have certain functions, such as halting, driving,
and speeding. Even these functions form the data members of the class
Vehicle. You can, therefore, define a class as a combination of
characteristics and functions.
and a hash:
A Hash is a collection of key-value pairs like this: "employee" =>
"salary". It is similar to an Array, except that indexing is done via
arbitrary keys of any object type, not an integer index.
So for just storing data, I recommend a Hash.
On the other hand, as shown in a comment, the number that appears in the object representation is the object id with a couple of operations applied:
1) bitwise left shift:
5 << 1 # gives 10
2) conversion to hexadecimal:
(10).to_s(16)
"a"
pry(main)> x = Person.new("Dan", "M")
=> #<Person:0x00000003708098 @gender="M", @name="Dan">
[5] pry(main)> x.object_id
=> 28852300
[8] pry(main)> (x.object_id << 1 ).to_s(16)
=> "3708098"
Finally, in Ruby you can get the hash representation of an object like this:
hash = {}
x.instance_variables.each {|var| hash[var.to_s.delete("@")] = x.instance_variable_get(var) }

Ruby - defining class level variable dynamically

I am new to ruby and wanted to know if this is possible. Suppose I have a file with different blocks like this
fruits[tomato=1,orange=2]
greens[peas=2,potato=3]
I have parsed this file and stored it into a hash like this
{"fruits"=>{"tomato"=>"1", "orange"=>"2"}, "greens"=>{"potato"=>"3", "peas"=>"2"}}
And I also know how to access the different parts of the hash. But suppose I want to make it something like this
fruits.tomato # 1
fruits.orange # 2
(Like an object with tomato and orange being its variables)
The catch here is that I don't know in advance whether the file is going to contain fruits and greens; it could contain a different group called meat. I know this dynamic problem can be solved if I insert everything into a hash with the group name as the key and another hash as the value. But can this be done in the style of fruits.tomato or fruits.orange shown above? (Probably by declaring it in a class, but I am not sure how to dynamically add class vars in Ruby or if that is even possible, as I am new to the language.)
I spent quite a bit of time making a program just like this in order to help speed up development with API's. I ended up writing a gem to objectify raw JSON (shameless plug: ClassyJSON).
That said, I think your use case is a good one for OpenStruct. I limited my code to just your example and your desired result but here's what it might look like:
require 'ostruct'
hash = {"fruits"=>{"tomato"=>"1", "orange"=>"2"}, "greens"=>{"potato"=>"3", "peas"=>"2"}}
structs = []
hash.each do |k, v|
  # Only nested hashes become structs; other values are skipped
  if v.is_a? Hash
    structs << OpenStruct.new({k => OpenStruct.new(v)})
  end
end
Here we built up a number of OpenStruct objects and can access their values as you outlined:
[1] pry(main)> structs
=> [#<OpenStruct fruits=#<OpenStruct tomato="1", orange="2">>, #<OpenStruct greens=#<OpenStruct potato="3", peas="2">>]
[2] pry(main)> structs.first
=> #<OpenStruct fruits=#<OpenStruct tomato="1", orange="2">>
[3] pry(main)> structs.first.fruits
=> #<OpenStruct tomato="1", orange="2">
[4] pry(main)> structs.first.fruits.tomato
=> "1"
def add_accessor_method(name, ref)
  define_singleton_method name do
    return ref
  end
end
I found this solution, which creates an accessor method for me while parsing the file itself. So I will not have to use OpenStruct later on to convert my hash to an object with different accessor methods. (I assume OpenStruct is doing something similar under the hood.)
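For what it's worth, here is how the define_singleton_method idea could be applied to the whole parsed hash. This is a sketch under my own assumptions: objectify and PARSED are hypothetical names, and keys that collide with existing Object methods (such as "class" or "send") would shadow them, so real code would want to validate key names.

```ruby
# Hypothetical helper: turn a (possibly nested) hash into an object whose
# keys can be read as methods.
def objectify(hash)
  obj = Object.new
  hash.each do |key, value|
    value = objectify(value) if value.is_a?(Hash)
    # Each key becomes a reader method on this one object only.
    obj.define_singleton_method(key) { value }
  end
  obj
end

# The parsed file content from the question.
PARSED = {
  "fruits" => { "tomato" => "1", "orange" => "2" },
  "greens" => { "potato" => "3", "peas" => "2" }
}

groups = objectify(PARSED)
groups.fruits.tomato  # "1"
groups.greens.peas    # "2"
```

Because the methods are generated from whatever keys are present, a file containing a meat group would work the same way without code changes.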

Why don't numbers support .dup?

>> a = 5
=> 5
>> b = "hello, world!"
=> "hello, world!"
>> b.dup
=> "hello, world!"
>> a.dup
TypeError: can't dup Fixnum
from (irb):4:in `dup'
from (irb):4
I understand that Ruby will make a copy every time you assign an integer to a new variable, but why does Numeric#dup raise an error?
Wouldn't this break abstraction, since all objects should be expected to respond to .dup properly?
Rewriting the dup method will fix the problem, as far as I can tell:
>> class Numeric
>> def dup()
>> self
>> end
>> end
Does this have a downside I'm not seeing? Why isn't this built into Ruby?
Most objects in Ruby are passed by reference and can be dupped. Eg:
s = "Hello"
t = s # s & t refer to the same string
t.upcase! # modifying either one will affect the other
s # ==> "HELLO"
A few objects in Ruby are immediate, though. They are passed by value, there can only be one of this value and it therefore cannot be duped. These are any (small) integers, true, false, symbols and nil. Many floats are also immediates in Ruby 2.0 on 64 bit systems.
In this (preposterous) example, any "42" will hold the same instance variable.
class Fixnum
attr_accessor :name
alias_method :original_to_s, :to_s
def to_s
name || original_to_s
end
end
42.name = "The Answer"
puts *41..43 # => 41, The Answer, 43
Since you would normally expect something.dup.name = "new name" to not affect any other object than the copy obtained with dup, Ruby chooses not to define dup on immediates.
Your question is more complex than it appears. There was some discussion on ruby-core as to how this can be made easier. Also, other types of Numeric objects (floats, bignums, rationals and complex numbers) can not be duped although they are not immediates either.
Note that ActiveSupport (part of Rails) provides the method duplicable? on all objects.
The problem with the dup() function that you defined is that it doesn't return a copy of the object, but rather returns the object itself. This is not what a duplicate procedure is supposed to do.
I don't know Ruby, but a possible reason I can think of for dup not being defined for numbers is that a number is a basic type and thus, doing something like:
>> a = 5
>> b = a
would automatically assign the value 5 into the variable b, as opposed to making b and a point to the same value in memory.
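The distinction between ordinary objects and immediates discussed in these answers can be checked with equal?, which compares object identity. This is just a sketch; mutate_through_alias is a made-up name. As far as I know, newer Rubies (2.4 and later) no longer raise here and simply return the receiver from dup for these values, which is consistent with there being only one such object.

```ruby
# Hypothetical helper showing aliasing versus dup for an ordinary object.
def mutate_through_alias
  s = String.new("Hello")
  t = s          # t and s name the same String object
  copy = s.dup   # an independent copy
  t.upcase!      # mutates the shared object, not the copy
  [s, copy]
end

mutate_through_alias                     # ["HELLO", "Hello"]

# Immediates: every occurrence is the very same object.
42.equal?(42)                            # true
:name.equal?(:name)                      # true
nil.equal?(nil)                          # true

# Ordinary objects: equal content, different identity.
String.new("a").equal?(String.new("a"))  # false
```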

hash['key'] to hash.key in Ruby

I have a hash
foo = {'bar'=>'baz'}
I would like to call foo.bar #=> 'baz'
My motivation is rewriting an activerecord query into a raw sql query (using Model#find_by_sql). This returns a hash with the SELECT clause values as keys. However, my existing code relies on object.method dot notation. I'd like to do minimal code rewrite. Thanks.
Edit: it appears Lua has this feature:
point = { x = 10, y = 20 } -- Create new table
print(point["x"]) -- Prints 10
print(point.x) -- Has exactly the same meaning as line above
>> require 'ostruct'
=> []
>> foo = {'bar'=>'baz'}
=> {"bar"=>"baz"}
>> foo_obj = OpenStruct.new foo
=> #<OpenStruct bar="baz">
>> foo_obj.bar
=> "baz"
>>
What you're looking for is called OpenStruct. It's part of the standard library.
A good solution:
class Hash
  def method_missing(method, *opts)
    m = method.to_s
    if self.has_key?(m)
      return self[m]
    elsif self.has_key?(m.to_sym)
      return self[m.to_sym]
    end
    super
  end
end
Note: this implementation has only one known bug:
x = { 'test' => 'aValue', :test => 'bar'}
x.test # => 'aValue'
If you prefer symbol lookups rather than string lookups, then swap the two 'if' conditions.
Rather than copy all the stuff out of the hash, you can just add some behaviour to Hash to do lookups.
If you add this definition, you extend Hash to handle all unknown methods as hash lookups:
class Hash
  def method_missing(n)
    self[n.to_s]
  end
end
Bear in mind that this means that you won't ever see errors if you call the wrong method on hash - you'll just get whatever the corresponding hash lookup would return.
You can vastly reduce the debugging problems this can cause by only putting the method onto a specific hash - or as many hashes as you need:
a={'foo'=>5, 'goo'=>6}
def a.method_missing(n)
  self[n.to_s]
end
The other observation is that when method_missing gets called by the system, it gives you a Symbol argument. My code converted it into a String. If your hash keys aren't strings this code will never return those values - if you key by symbols instead of strings, simply substitute n for n.to_s above.
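If patching Hash itself feels too invasive, the same method_missing technique can live in a small wrapper instead. This is only a sketch: DotHash is a hypothetical class name, it tries string keys first and then symbol keys, and it falls through to super so that unknown names still raise NoMethodError. Defining respond_to_missing? alongside method_missing keeps respond_to? honest.

```ruby
# Hypothetical wrapper: dot access for one hash without reopening Hash.
class DotHash
  def initialize(hash)
    @hash = hash
  end

  def method_missing(name, *args)
    if args.empty? && @hash.key?(name.to_s)
      @hash[name.to_s]        # string key wins, as in the answer above
    elsif args.empty? && @hash.key?(name)
      @hash[name]             # then the symbol key
    else
      super                   # unknown names still raise NoMethodError
    end
  end

  def respond_to_missing?(name, include_private = false)
    @hash.key?(name.to_s) || @hash.key?(name) || super
  end
end

foo = DotHash.new({ "bar" => "baz" })
foo.bar                 # "baz"
foo.respond_to?(:bar)   # true
```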
There are a few gems for this. There's my recent gem, hash_dot, and a few other gems with similar names I discovered as I released mine on RubyGems, including dot_hash.
HashDot allows dot notation syntax, while still addressing the concerns about NoMethodErrors raised by @avdi. It is faster and more traversable than an object created with OpenStruct.
require 'hash_dot'
a = {b: {c: {d: 1}}}.to_dot
a.b.c.d # => 1
require 'ostruct'
os = OpenStruct.new(a)
os.b # => {c: {d: 1}}
os.b.c.d # => NoMethodError
It also maintains expected behavior when non-methods are called.
a.non_method # => NoMethodError
Please feel free to submit improvements or bugs to HashDot.

Access variables programmatically by name in Ruby

I'm not entirely sure if this is possible in Ruby, but hopefully there's an easy way to do this. I want to declare a variable and later find out the name of the variable. That is, for this simple snippet:
foo = ["goo", "baz"]
How can I get the name of the array (here, "foo") back? If it is indeed possible, does this work on any variable (e.g., scalars, hashes, etc.)?
Edit: Here's what I'm basically trying to do. I'm writing a SOAP server that wraps around a class with three important variables, and the validation code is essentially this:
[foo, goo, bar].each { |param|
  if param.class != Array
    puts "param_name wasn't an Array. It was a/an #{param.class}"
    return "Error: param_name wasn't an Array"
  end
}
My question is then: Can I replace the instances of 'param_name' with foo, goo, or bar? These objects are all Arrays, so the answers I've received so far don't seem to work (with the exception of re-engineering the whole thing ala dbr's answer)
What if you turn your problem around? Instead of trying to get names from variables, get the variables from the names:
["foo", "goo", "bar"].each { |param_name|
  param = eval(param_name)
  if param.class != Array
    puts "#{param_name} wasn't an Array. It was a/an #{param.class}"
    return "Error: #{param_name} wasn't an Array"
  end
}
If there were a chance of one of the variables not being defined at all (as opposed to not being an array), you would want to add "rescue nil" to the end of the "param = ..." line to keep the eval from throwing an exception...
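The same names-first idea can be written without eval by asking a Binding for the variable, using Binding#local_variable_get (available in Ruby 2.1 and later). A sketch, with non_array_params and check_example_params as made-up names:

```ruby
# Hypothetical helper: look up each named local through the caller's binding
# and collect a message for every one that is not an Array.
def non_array_params(caller_binding, names)
  names.each_with_object([]) do |name, errors|
    value = caller_binding.local_variable_get(name)
    unless value.is_a?(Array)
      errors << "#{name} wasn't an Array. It was a/an #{value.class}"
    end
  end
end

# Hypothetical caller with the three variables from the question.
def check_example_params
  foo = ["goo", "baz"]
  goo = "oops"
  bar = []
  non_array_params(binding, [:foo, :goo, :bar])
end

check_example_params  # ["goo wasn't an Array. It was a/an String"]
```

Note that local_variable_get raises NameError for a name that is not defined in that binding, so the caveat above about undefined variables still applies.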
You need to re-architect your solution. Even if you could do it (you can't), the question simply doesn't have a reasonable answer.
Imagine a get_name method.
a = 1
get_name(a)
Everyone could probably agree this should return 'a'
b = a
get_name(b)
Should it return 'b', or 'a', or an array containing both?
[b,a].each do |arg|
  get_name(arg)
end
Should it return 'arg', 'b', or 'a' ?
def do_stuff( arg )
  get_name(arg)
end
do_stuff(b)
Should it return 'arg', 'b', or 'a', or maybe the array of all of them? Even if it did return an array, what would the order be and how would I know how to interpret the results?
The answer to all of the questions above is "It depends on the particular thing I want at the time." I'm not sure how you could solve that problem for Ruby.
It seems you are trying to solve a problem that has a far easier solution..
Why not just store the data in a hash? If you do..
data_container = {'foo' => ['goo', 'baz']}
..it is then utterly trivial to get the 'foo' name.
That said, you've not given any context to the problem, so there may be a reason you can't do this..
[edit] After clarification, I see the issue, but I don't think this is the problem. Writing [foo, bar, bla] is equivalent to writing ['content 1', 'content 2', 'etc']. The actual variable names are (or rather, should be) utterly irrelevant. If the name of the variable is important, that is exactly why hashes exist.
The problem isn't with iterating over [foo, bar] etc., it's the fundamental problem with how the SOAP server is returning the data, and/or how you're trying to use it.
The solution, I would say, is to either make the SOAP server return hashes, or, since you know there is always going to be three elements, can you not do something like..
{"foo" => foo, "goo" => goo, "bar"=>bar}.each do |param_name, param|
  if param.class != Array
    puts "#{param_name} wasn't an Array. It was a/an #{param.class}"
    puts "Error: #{param_name} wasn't an Array"
  end
end
OK, it DOES work in instance methods, too, and, based on your specific requirement (the one you put in the comment), you could do this:
local_variables.each do |var|
  # local_variables returns symbols on Ruby 1.9+, so convert before eval
  puts var if (eval(var.to_s).class != Integer)
end
Just replace Integer with your specific type checking.
I do not know of any way to get a local variable name. But, you can use the instance_variables method, this will return an array of all the instance variable names in the object.
Simple call:
object.instance_variables
or
self.instance_variables
to get an array of all instance variable names.
Building on joshmsmoore, something like this would probably do it:
# Returns the first instance variable whose value == x
# Returns nil if no name maps to the given value
def instance_variable_name_for(x)
  self.instance_variables.find do |var|
    x == self.instance_variable_get(var)
  end
end
There's Kernel::local_variables, but I'm not sure that this will work for a method's local vars, and I don't know that you can manipulate it in such a way as to do what you wish to achieve.
Great question. I fully understand your motivation. Let me start by noting that there are certain kinds of special objects that, under certain circumstances, know the variable to which they have been assigned. These special objects are e.g. Module instances, Class instances and Struct instances:
Dog = Class.new
Dog.name #=> "Dog"
The catch is, that this works only when the variable, to which the assignment is performed, is a constant. (We all know that Ruby constants are nothing more than emotionally sensitive variables.) Thus:
x = Module.new # creating an anonymous module
x.name #=> nil # the module does not know that it has been assigned to x
Animal = x # but will notice once we assign it to a constant
x.name #=> "Animal"
This behavior of objects being aware to which variables they have been assigned, is commonly called constant magic (because it is limited to constants). But this highly desirable constant magic only works for certain objects:
Rover = Dog.new
Rover.name #=> raises NoMethodError
Fortunately, I have written a gem y_support/name_magic, that takes care of this for you:
# first, gem install y_support
require 'y_support/name_magic'
class Cat
  include NameMagic
end
The fact, that this only works with constants (ie. variables starting with a capital letter) is not such a big limitation. In fact, it gives you freedom to name or not to name your objects at will:
tmp = Cat.new # nameless kitty
tmp.name #=> nil
Josie = tmp # by assigning to a constant, we name the kitty Josie
tmp.name #=> :Josie
Unfortunately, this will not work with array literals, because they are internally constructed without using #new method, on which NameMagic relies. Therefore, to achieve what you want to, you will have to subclass Array:
require 'y_support/name_magic'
class MyArr < Array
  include NameMagic
end
foo = MyArr.new ["goo", "baz"] # not named yet
foo.name #=> nil
Foo = foo # but assignment to a constant is noticed
foo.name #=> :Foo
# You can even list the instances
MyArr.instances #=> [["goo", "baz"]]
MyArr.instance_names #=> [:Foo]
# Get an instance by name:
MyArr.instance "Foo" #=> ["goo", "baz"]
MyArr.instance :Foo #=> ["goo", "baz"]
# Rename it:
Foo.name = "Quux"
Foo.name #=> :Quux
# Or forget the name again:
MyArr.forget :Quux
Foo.name #=> nil
# In addition, you can name the object upon creation even without assignment
u = MyArr.new [1, 2], name: :Pair
u.name #=> :Pair
v = MyArr.new [1, 2, 3], ɴ: :Trinity
v.name #=> :Trinity
I achieved the constant magic-imitating behavior by searching all the constants in all the namespaces of the current Ruby object space. This wastes a fraction of a second, but since the search is performed only once, there is no performance penalty once the object figures out its name. The Ruby core team has promised a const_assigned hook for the future.
You can't, you need to go back to the drawing board and re-engineer your solution.
Foo is only a location to hold a pointer to the data. The data has no knowledge of what points at it. In Smalltalk systems you could ask the VM for all pointers to an object, but that would only get you the object that contained the foo variable, not foo itself. There is no real way to reference a variable in Ruby. As mentioned by one answer, you can still place a tag in the data that references where it came from or such, but generally that is not a good approach to most problems. You can use a hash to receive the values in the first place, or use a hash to pass to your loop so you know the argument name for validation purposes, as in DBR's answer.
The closest thing to a real answer to your question is to use the Enumerable method each_with_index instead of each, thusly:
my_array = [foo, baz, bar]
my_array.each_with_index do |item, index|
  if item.class != Array
    puts "#{my_array[index]} wasn't an Array. It was a/an #{item.class}"
  end
end
I removed the return statement from the block you were passing to each/each_with_index because it didn't do/mean anything. Each and each_with_index both return the array on which they were operating.
There's also something about scope in blocks worth noting here: if you've defined a variable outside of the block, it will be available within it. In other words, you could refer to foo, bar, and baz directly inside the block. The converse is not true: variables that you create for the first time inside the block will not be available outside of it.
Finally, the do/end syntax is preferred for multi-line blocks, but that's simply a matter of style, though it is universal in ruby code of any recent vintage.
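The scoping rule mentioned above (outer variables are visible inside a block, but variables first created inside the block are not visible outside it) can be seen in a few lines. A sketch; block_scope_demo is a made-up name:

```ruby
# Hypothetical demo of block scoping.
def block_scope_demo
  total = 0                 # defined outside the block
  [1, 2, 3].each do |n|
    total += n              # the outer variable is visible and writable here
    doubled = n * 2         # first created inside the block
  end
  # doubled does not exist out here, so defined? returns nil
  [total, defined?(doubled)]
end

block_scope_demo  # [6, nil]
```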

Resources