Ruby YAML parser bypassing constructor - ruby

I am working on an application that takes input from a YAML file, parses it into objects, and lets them do their thing. The only problem I'm having now is that the YAML parser seems to ignore the object's "initialize" method. I was counting on the constructor to fill in any instance variables the YAML file was lacking with defaults, as well as to store some things in class variables. Here is an example:
class Test
  @@counter = 0
  def initialize(a,b)
    @a = a
    @b = b
    @a = 29 if @b == 3
    @@counter += 1
  end
  def self.how_many
    p @@counter
  end
  attr_accessor :a,:b
end
require 'yaml'
a = Test.new(2,3)
s = a.to_yaml
puts s
b = YAML::load(s)
puts b.a
puts b.b
Test.how_many
puts ""
c = Test.new(4,4)
c.b = 3
t = c.to_yaml
puts t
d = YAML::load(t)
puts d.a
puts d.b
Test.how_many
I would have expected the above to output:
--- !ruby/object:Test
a: 29
b: 3
29
3
2
--- !ruby/object:Test
a: 4
b: 3
29
3
4
Instead I got:
--- !ruby/object:Test
a: 29
b: 3
29
3
1
--- !ruby/object:Test
a: 4
b: 3
4
3
2
I don't understand how it makes these objects without using their defined initialize method. I'm also wondering if there is any way to force the parser to use the initialize method.

Deserializing an object from Yaml doesn’t use the initialize method because in general there is no correspondence between the object’s instance variables (which is what the default Yaml serialization stores) and the parameters to initialize.
As an example, consider an object with an initialize that looks like this (with no other instance variables):
def initialize(param_one, param_two)
  @a_variable = some_calculation(param_one, param_two)
end
Now when an instance of this is deserialized, the Yaml processor has a value for @a_variable, but the initialize method requires two parameters, so it can’t call it. Even if the number of instance variables matches the number of parameters to initialize, it is not necessarily the case that they correspond, and even if they did, the processor doesn’t know the order in which they should be passed to initialize.
The default process for serializing and deserializing a Ruby object to Yaml is to write out all instance variables (with their names) during serialization, then when deserializing allocate a new instance of the class and simply set the same instance variables on this new instance.
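The default load procedure described above can be imitated by hand. The following is only a sketch and not Psych's actual code: the Widget class and the ivars hash are made up for illustration, and the only assumption is that the loader allocates an instance and then sets instance variables on it.

```ruby
class Widget
  attr_reader :a, :b, :initialized

  def initialize(a, b)
    @a = a
    @b = b
    @initialized = true
  end
end

# Roughly what a loader has after parsing: a class and a map of
# instance variable names to values (hypothetical data).
ivars = { 'a' => 4, 'b' => 3 }

# allocate creates an instance without running initialize.
obj = Widget.allocate
ivars.each { |name, value| obj.instance_variable_set("@#{name}", value) }

p obj.a           # 4
p obj.initialized # nil, because initialize never ran
```

This is why defaults or counters that live in initialize are skipped when an object comes back from Yaml.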
Of course sometimes you need more control of this process. If you are using the Psych Yaml processor (which is the default in Ruby 1.9.3) then you should implement the encode_with (for serialization) or init_with (for deserialization) methods as appropriate.
For serialization, Psych will call the encode_with method of an object if it is present, passing a coder object. This object allows you to specify how the object should be represented in Yaml – normally you just treat it like a hash.
For deserialization, Psych will call the init_with method if it is present on your object instead of using the default procedure described above, again passing a coder object. This time the coder will contain the information about the object’s representation in Yaml.
Note you don’t need to provide both methods, you can just provide either one if you want. If you do provide both, the coder object you get passed in init_with will essentially be the same as the one passed to encode_with after that method has run.
As an example, consider an object that has some instance variables that are calculated from others (perhaps as an optimisation to avoid a large calculation), but shouldn’t be serialized to the Yaml.
class Foo
  def initialize(first, second)
    @first = first
    @second = second
    @calculated = expensive_calculation(@first, @second)
  end
  def encode_with(coder)
    # @calculated shouldn’t be serialized, so we just add the other two.
    # We could provide different names to use in the Yaml here if we
    # wanted (as long as the same names are used in init_with).
    coder['first'] = @first
    coder['second'] = @second
  end
  def init_with(coder)
    # The Yaml only contains values for @first and @second, we need to
    # recalculate @calculated so the object is valid.
    @first = coder['first']
    @second = coder['second']
    @calculated = expensive_calculation(@first, @second)
  end
  # The expensive calculation
  def expensive_calculation(a, b)
    ...
  end
end
When you dump an instance of this class to Yaml, it will look something like this, without the calculated value:
--- !ruby/object:Foo
first: 1
second: 2
When you load this Yaml back into Ruby, the created object will have the @calculated instance variable set.
If you wanted you could call initialize from within init_with, but I think it would be better to keep a clear separation between initializing a new instance of the class and deserializing an existing instance from Yaml. I would recommend extracting the common logic into methods that can be called from both instead.
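A minimal sketch of that last recommendation follows. The Point class and the compute_derived helper are hypothetical names chosen for illustration. The load line also assumes that newer Psych versions may refuse to build arbitrary classes through YAML.load, so it uses unsafe_load where that method exists.

```ruby
require 'yaml'

class Point
  attr_reader :x, :y, :distance

  def initialize(x, y)
    @x = x
    @y = y
    compute_derived
  end

  def encode_with(coder)
    # Only the source values go into the Yaml.
    coder['x'] = @x
    coder['y'] = @y
  end

  def init_with(coder)
    @x = coder['x']
    @y = coder['y']
    compute_derived
  end

  private

  # Hypothetical shared step: rebuilds state that is not stored in the Yaml.
  def compute_derived
    @distance = Math.sqrt(@x**2 + @y**2)
  end
end

yaml = YAML.dump(Point.new(3, 4))
# Fall back to plain load on older versions that lack unsafe_load.
loaded = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
p loaded.distance # 5.0
```

Both construction paths end in the same helper, so initialize keeps its two-argument signature and init_with never has to guess at it.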

If you only want this behavior with pure Ruby classes that use @-style instance variables (not those from compiled extensions and not Struct-style), the following should work. YAML seems to call the allocate class method when loading an instance of that class, even if the instance is nested as a member of another object. So we can redefine allocate. Example:
class Foo
  attr_accessor :yaml_flag
  def self.allocate
    super.tap {|o| o.instance_variables.include?(:@yaml_flag) or o.yaml_flag = true }
  end
end
class Bar
  attr_accessor :foo, :yaml_flag
  def self.allocate
    super.tap {|o| o.instance_variables.include?(:@yaml_flag) or o.yaml_flag = true }
  end
end
>> bar = Bar.new
=> #<Bar:0x007fa40ccda9f8>
>> bar.foo = Foo.new
=> #<Foo:0x007fa40ccdf9f8>
>> [bar.yaml_flag, bar.foo.yaml_flag]
=> [nil, nil]
>> bar_reloaded = YAML.load YAML.dump bar
=> #<Bar:0x007fa40cc7dd48 @foo=#<Foo:0x007fa40cc7db90 @yaml_flag=true>, @yaml_flag=true>
>> [bar_reloaded.yaml_flag, bar_reloaded.foo.yaml_flag]
=> [true, true]
# won't overwrite false
>> bar.foo.yaml_flag = false
=> false
>> bar_reloaded = YAML.load YAML.dump bar
=> #<Bar:0x007fa40ccf3098 @foo=#<Foo:0x007fa40ccf2f08 @yaml_flag=false>, @yaml_flag=true>
>> [bar_reloaded.yaml_flag, bar_reloaded.foo.yaml_flag]
=> [true, false]
# won't overwrite nil
>> bar.foo.yaml_flag = nil
=> nil
>> bar_reloaded = YAML.load YAML.dump bar
=> #<Bar:0x007fa40cd73518 @foo=#<Foo:0x007fa40cd73360 @yaml_flag=nil>, @yaml_flag=true>
>> [bar_reloaded.yaml_flag, bar_reloaded.foo.yaml_flag]
=> [true, nil]
I intentionally avoided an o.nil? check in the tap blocks because nil may actually be a meaningful value that you don't want to overwrite.
One last caveat: allocate may be used by third party libraries (or by your own code), and you may not want to set the members in those cases. If you want to restrict this to just yaml loading, you'll have to do something more fragile and complex, like checking the caller stack in the allocate method to see if yaml is calling it.
I'm on ruby 1.9.3 (with psych) and the top of the stack looks like this (path prefix removed):
psych/visitors/to_ruby.rb:274:in `revive'
psych/visitors/to_ruby.rb:219:in `visit_Psych_Nodes_Mapping'
psych/visitors/visitor.rb:15:in `visit'
psych/visitors/visitor.rb:5:in `accept'
psych/visitors/to_ruby.rb:20:in `accept'
psych/visitors/to_ruby.rb:231:in `visit_Psych_Nodes_Document'
psych/visitors/visitor.rb:15:in `visit'
psych/visitors/visitor.rb:5:in `accept'
psych/visitors/to_ruby.rb:20:in `accept'
psych/nodes/node.rb:35:in `to_ruby'
psych.rb:128:in `load'
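A rough sketch of that caller check, for what it's worth. It assumes Psych's frames contain the path fragment psych/visitors/to_ruby, as in the trace above, which may not hold for every version or install layout. The Flagged class is a made-up example.

```ruby
require 'yaml'

class Flagged
  attr_accessor :yaml_flag

  def self.allocate
    obj = super
    # Fragile assumption: the loader's stack frames include this path fragment.
    from_yaml = caller.any? { |frame| frame.include?('psych/visitors/to_ruby') }
    obj.yaml_flag = true if from_yaml
    obj
  end
end

# Called directly, so the flag stays unset.
direct = Flagged.allocate

yaml = YAML.dump(Flagged.new)
# Newer Psych versions need unsafe_load to build arbitrary classes.
loaded = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)

p direct.yaml_flag
p loaded.yaml_flag
```

Matching on file paths ties the code to Psych's internals, which is what makes this approach brittle.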

from_yaml(input)
Special loader for YAML files. When a Specification object is loaded from a YAML file, it bypasses the normal Ruby object initialization routine (initialize). This method makes up for that and deals with gems of different ages.
input can be anything that YAML.load() accepts: String or IO.
This is the reason that the initialize method was not being run when you executed YAML::load.

Related

Plus equals with ruby send message

I'm getting familiar with ruby send method, but for some reason, I can't do something like this
a = 4
a.send(:+=, 1)
For some reason this doesn't work. Then I tried something like
a.send(:=, a.send(:+, 1))
But this doesn't work either. What is the proper way to fire plus equals through 'send'?
I think the basic option is only:
a = a.send(:+, 1)
That is because send is for messages to objects. Assignment modifies a variable, not an object.
It is possible to assign directly to variables with some meta-programming, but the code is convoluted. So far the best I can find is:
a = 1
var_name = :a
eval "#{var_name} = #{var_name}.send(:+, 1)"
puts a # 2
Or using instance variables:
@a = 2
var_name = :@a
instance_variable_set( var_name, instance_variable_get( var_name ).send(:+, 1) )
puts @a # 3
See below:
p 4.respond_to?(:"+=") # false
p 4.respond_to?(:"=") # false
p 4.respond_to?(:"+") # true
a += 1 is syntactic sugar for a = a + 1; there is no method named +=. Likewise, = is an assignment operator, not a method. Object#send, on the other hand, takes a method name as its argument, so your code cannot work the way you want.
It is because Ruby doesn't have an = method. In Ruby, = doesn't work as it does in C/C++: it assigns a new object reference to the variable rather than copying a new value into it.
You can't call a method on a, because a is not an object, it's a variable, and variables aren't objects in Ruby. You are calling a method on 4, but 4 is not the thing you want to modify, a is. It's just not possible.
Note: it is certainly possible to define a method named = or += and call it, but of course those methods will only exist on objects, not variables.
class Fixnum
define_method(:'+=') do |n| self + n end
end
a = 4
a.send(:'+=', 1)
# => 5
a
# => 4
This might miss the mark a bit, but I was trying to do this where a is actually a method dynamically called on an object. For example, with attributes like added_count and updated_count for Importer I wrote the following
class Importer
attr_accessor :added_count, :updated_count
def increment(method)
send("#{method}=", (send(method) + 1))
end
end
So I could use importer.increment(:added_count) or importer.increment(:updated_count)
Now this may seem silly if you only have these 2 different counters but in some cases we have a half dozen or more counters and different conditions on which attr to increment so it can be handy.
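For completeness, here is a self-contained variant of the class above with a usage example. The initialize method that starts the counters at zero is an addition for illustration (the original snippet does not show how the counters are set up), and public_send is used so that only public accessors can be reached.

```ruby
class Importer
  attr_accessor :added_count, :updated_count

  def initialize
    # Start at zero so that the first increment does not hit nil + 1.
    @added_count = 0
    @updated_count = 0
  end

  def increment(name)
    # Call the reader, add one, then call the matching writer, e.g. "added_count=".
    public_send("#{name}=", public_send(name) + 1)
  end
end

importer = Importer.new
importer.increment(:added_count)
importer.increment(:added_count)
importer.increment(:updated_count)
p [importer.added_count, importer.updated_count] # [2, 1]
```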

ruby optional parameter, if changed, effects in caller?

Consider the following code in ruby, assume I called prestart from somewhere:
def tester(process_name, *host_list)
hosts = []
hosts = host_list[0]
hosts[0] = nil
end
def prestart(process_name, *host)
host_list = ['192.168.1.1', '192.168.1.2']
puts host_list.inspect # -> ['192.168.1.1', '192.168.1.2']
tester(process_name, host_list)
puts host_list.inspect # -> [nil, '192.168.1.2']
abort
end
How did it become nil? Is this how Ruby works? If yes, how do I make sure it doesn't affect the caller?
Arrays are objects, so the method receives a reference to the same array rather than a copy. If you want to change it without affecting the original, you need to duplicate it by calling .dup on it. You can do it either in the caller or in the called method.
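A small sketch of the dup approach, applied inside the called method. It reuses the names from the question, and 'some_process' is a placeholder value.

```ruby
def tester(process_name, *host_list)
  # dup makes a shallow copy, so replacing elements here
  # leaves the caller's array alone.
  hosts = host_list[0].dup
  hosts[0] = nil
  hosts
end

host_list = ['192.168.1.1', '192.168.1.2']
changed = tester('some_process', host_list)

p host_list # ["192.168.1.1", "192.168.1.2"]
p changed   # [nil, "192.168.1.2"]
```

Note that the copy is shallow: the strings inside are still shared, so mutating one of them in place (for example with upcase!) would still be visible to the caller.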

Setting variable A with name stored in variable B

I have the following two variables:
a = 1;
b = 'a';
I want to be able to do
SOMETYPEOFEVALUATION(b) = 2;
so that the value of variable a is now set to 2.
a # => 2
Is this possible?
Specifically, I am working with the Facebook API. Each object has a variety of different connections (friends, likes, movies, etc). I have a parser class that stores the state of the last call to the Facebook API for all of these connections. These states are all named corresponding to the GET you have to call in order to update them.
For example, to update the Music connection, you use https://graph.facebook.com/me/music?access_token=... I store the result in a variable called updated_music. For books, it's updated_books. If I created a list of all these connection type names, I ideally want to do something like this.
def update_all
connection_list.each do |connection_name|
updated_SomeTypeOfEvalAndConcatenation(connection_name) = CallToAPI("https://graph.facebook.com/me/#{connection_name}?access_token=...")
end
end
Very new to both Rails and StackOverflow so please let me know if there is a better way to follow any conventions.
Tried the below.
class FacebookParser
attr_accessor :last_albums_json,
def update_parser_vars(service)
handler = FacebookAPIHandler.new
connections_type_list = ['albums']
connections_type_list.each do |connection_name|
eval "self.last_#{connection_name}_json = handler.access_api_by_content_type(service, #{connection_name})['data']"
end
#self.last_albums_json = handler.access_api_by_content_type(service, 'albums')['data']
end
end
And I get this error
undefined local variable or method `albums' for #<FacebookParser:0xaa7d12c>
Works fine when I use line that is commented out.
Changing an unrelated variable like that is a bit of a code smell; most programmers don't like it when a variable magically changes value, at least not without being inside an enclosing class.
In that simple example, it's much more common to say:
a=something(b)
Or if a is a more complex thing, make it a class:
class Foo
  attr_accessor :a
  def initialize(value)
    @a = value
  end
  def transform(value)
    @a = "new value: #{value}"
  end
end
baz = "something"
bar = Foo.new(2)
bar.a
=> 2
bar.transform(baz)
bar.a
=> "new value: something"
The second example changes an internal variable without going through the accessor, but at least it is part of an encapsulated object with a limited API.
Update: Ah, I think the question is how to do something like PHP's variable variables. As mu suggests, if you want to do this, you are probably doing the wrong thing... it's a concept that should never have been thought of. Use classes or hashes or something.
how about
eval "#{b}=2"
and with instance variables you can also do instance_variable_set("@name", value)
EDIT:
you can also use send method if you have a setter defined(and you have), try this:
class FacebookParser
  attr_accessor :last_albums_json
  def update_parser_vars(service)
    handler = FacebookAPIHandler.new
    connections_type_list = ['albums']
    connections_type_list.each do |connection_name|
      send("last_#{connection_name}_json=",
           handler.access_api_by_content_type(
             service, connection_name)['data'])
    end
  end
end
problem with your original code is that
eval ".... handler.access_api_by_content_type(service, #{connection_name})"
would execute
... handler.access_api_by_content_type(service, albums)
# instead of
... handler.access_api_by_content_type(service, 'albums')
so you had to write
eval ".... handler.access_api_by_content_type(service, '#{connection_name}')" <- the quotes!
This is why people usually avoid eval: it's easy to make this kind of mistake.
This sort of thing is not usually done with local variables and their names in Ruby. A more usual approach uses hashes and symbols:
data = Hash.new
data[:a] = 1 # a = 1
b = :a # b = 'a'
and then, later
data[b] = 2 # SOMETYPEOFEVALUATION(b) = 2
data[:a] # => 2
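Applied to the update_all case from the question, the same idea might look like the sketch below. fetch_connection is a hypothetical stand-in for the real API call, and the connection names are only examples.

```ruby
# Hypothetical stand-in for the real HTTP request.
def fetch_connection(name)
  { 'data' => ["#{name} item"] }
end

connection_list = %w[music books]
updated = {}

# One hash entry per connection instead of one variable per connection.
connection_list.each do |connection_name|
  updated[connection_name] = fetch_connection(connection_name)['data']
end

p updated['music'] # ["music item"]
p updated.keys     # ["music", "books"]
```

Adding a new connection type then only means adding a name to the list, with no eval and no new variables.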

Dynamically set local variables in Ruby [duplicate]

I'm interested in dynamically setting local variables in Ruby. Not creating methods, constants, or instance variables.
So something like:
args[:a] = 1
args.each_pair do |k,v|
Object.make_instance_var k,v
end
puts a
> 1
I want local variables specifically because the method in question lives in a model and I don't want to pollute the global or object space.
As additional information for future readers: starting from Ruby 2.1.0 you can use binding.local_variable_get and binding.local_variable_set:
def foo
a = 1
b = binding
b.local_variable_set(:a, 2) # set existing local variable `a'
b.local_variable_set(:c, 3) # create new local variable `c'
# `c' exists only in binding.
b.local_variable_get(:a) #=> 2
b.local_variable_get(:c) #=> 3
p a #=> 2
p c #=> NameError
end
As stated in the doc, it is a similar behavior to
binding.eval("#{symbol} = #{obj}")
binding.eval("#{symbol}")
The problem here is that the block inside each_pair has a different scope. Any local variables assigned therein will only be accessible therein. For instance, this:
args = {}
args[:a] = 1
args[:b] = 2
args.each_pair do |k,v|
key = k.to_s
eval('key = v')
eval('puts key')
end
puts a
Produces this:
1
2
undefined local variable or method `a' for main:Object (NameError)
In order to get around this, you could create a local hash, assign keys to this hash, and access them there, like so:
args = {}
args[:a] = 1
args[:b] = 2
localHash = {}
args.each_pair do |k,v|
key = k.to_s
localHash[key] = v
end
puts localHash['a']
puts localHash['b']
Of course, in this example, it's merely copying the original hash with strings for keys. I'm assuming that the actual use-case, though, is more complex.
Interesting: you can change an existing local variable through eval, but you cannot create a new one:
def test
x=3
eval('x=7;')
puts x
end
test =>
7
def test
eval('x=7;')
puts x
end
test =>
NameError: undefined local variable or method `x' for main:Object
This is the only reason why Dorkus Prime's code works.
I suggest you use the hash (but keep reading for other alternatives).
Why?
Allowing arbitrary named arguments makes for extremely unstable code.
Let's say you have a method foo that you want to accept these theoretical named arguments.
Scenarios:
The called method (foo) needs to call a private method (let's call it bar) that takes no arguments. If you pass an argument to foo that you wanted to be stored in local variable bar, it will mask the bar method. The workaround is to have explicit parentheses when calling bar.
Let's say foo's code assigns a local variable. But then the caller decides to pass in an arg with the same name as that local variable. The assign will clobber the argument.
Basically, a method's caller must never be able to alter the logic of the method.
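The first scenario can be reproduced in a few lines. This is an illustrative sketch with made-up names (Example, foo, bar), using an ordinary local variable to stand in for a dynamically injected one.

```ruby
class Example
  def foo
    bar = :from_local # a local variable named like the private method
    # The bare name reads the local; parentheses force the method call.
    [bar, bar()]
  end

  private

  def bar
    :from_method
  end
end

p Example.new.foo # [:from_local, :from_method]
```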
Alternatives
An alternate middle ground involves OpenStruct. It's less typing than using a hash.
require 'ostruct'
os = OpenStruct.new(:a => 1, :b => 2)
os.a # => 1
os.a = 2 # settable
os.foo # => nil
Note that OpenStruct allows you to access non-existent members; it'll return nil. If you want a stricter version, use Struct instead.
This creates an anonymous class, then instantiates the class.
h = {:a=>1, :b=>2}
obj = Struct.new(* h.keys).new(* h.values)
obj.a # => 1
obj.a = 2 # settable
obj.foo # NoMethodError
since you don't want constants
args = {}
args[:a] = 1
args[:b] = 2
args.each_pair{|k,v|eval "@#{k}=#{v};"}
puts @b
2
you might find this approach interesting ( evaluate the variables in the right context)
fn="b*b"
vars=""
args.each_pair{|k,v|vars+="#{k}=#{v};"}
eval vars + fn
4

Why don't numbers support .dup?

>> a = 5
=> 5
>> b = "hello, world!"
=> "hello, world!"
>> b.dup
=> "hello, world!"
>> a.dup
TypeError: can't dup Fixnum
from (irb):4:in `dup'
from (irb):4
I understand that Ruby will make a copy every time you assign an integer to a new variable, but why does Numeric#dup raise an error?
Wouldn't this break abstraction, since all objects should be expected to respond to .dup properly?
Rewriting the dup method will fix the problem, as far as I can tell:
>> class Numeric
>> def dup()
>> self
>> end
>> end
Does this have a downside I'm not seeing? Why isn't this built into Ruby?
Most objects in Ruby are passed by reference and can be duped. E.g.:
s = "Hello"
t = s # s & t refer to the same string
t.upcase! # modifying either one will affect the other
s # ==> "HELLO"
A few objects in Ruby are immediate, though. They are passed by value; there can only be one of each value, so it cannot be duped. These are any (small) integers, true, false, symbols and nil. Many floats are also immediates in Ruby 2.0 on 64-bit systems.
In this (preposterous) example, any "42" will hold the same instance variable.
class Fixnum
attr_accessor :name
alias_method :original_to_s, :to_s
def to_s
name || original_to_s
end
end
42.name = "The Answer"
puts *41..43 # => 41, The Answer, 43
Since you would normally expect something.dup.name = "new name" to not affect any other object than the copy obtained with dup, Ruby chooses not to define dup on immediates.
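The identity point can be checked directly with equal?, which compares object identity rather than value. A minimal sketch:

```ruby
s = "Hello"
copy = s.dup
same_value  = (copy == s)       # equal contents
same_object = copy.equal?(s)    # but a distinct object

a = 42
b = 42
int_identical = a.equal?(b)     # every 42 is the same object
sym_identical = :name.equal?(:name)
nil_identical = nil.equal?(nil)

p [same_value, same_object]                     # [true, false]
p [int_identical, sym_identical, nil_identical] # [true, true, true]
```

A dup of a string gives a second object that can change independently. For 42 there is no second object to hand out.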
Your question is more complex than it appears. There was some discussion on ruby-core as to how this can be made easier. Also, other types of Numeric objects (floats, bignums, rationals and complex numbers) cannot be duped although they are not immediates either.
Note that ActiveSupport (part of Rails) provides the method duplicable? on all objects.
The problem with the dup() function that you defined is that it doesn't return a copy of the object, but rather returns the object itself. This is not what a duplicate procedure is supposed to do.
I don't know Ruby, but a possible reason I can think of for dup not being defined for numbers is that a number is a basic type and thus, doing something like:
>> a = 5
>> b = a
would automatically assign the value 5 into the variable b, as opposed to making b and a point to the same value in memory.
