Extended hash wants to load itself from YAML - ruby

I'm making a class that's intended to be an intelligent Hash that knows how to load its own values if given a YAML filename and then perform various operations on them. Except that first step is stumping me. Given this code:
class Agent < Hash
  def initialize
    super
  end
  def load_from_file(filename)
    if (File.file?(filename))
      self = YAML.load_file(filename)
    end
  end
end
...the error message is "Can't change the value of self".
How would you make a hash that loads itself from a file?

You're very close. Rather than the self assignment, you just want to use Hash#replace:
class Agent < Hash
  def initialize
    super
  end
  def load_from_file(filename)
    if (File.file?(filename))
      replace YAML.load_file(filename)
    end
  end
end
#replace replaces the keys and values of the calling hash with the keys and values from the passed hash, which is exactly what you want in this case. However, be sure to validate that the YAML data is indeed a Hash before calling #replace.
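The validation mentioned above could look roughly like the following. This is a minimal sketch, not the answerer's code: the TypeError, the decision to return self, and the temporary file contents are assumptions made for illustration.

```ruby
require 'yaml'
require 'tempfile'

class Agent < Hash
  # Loads keys and values from a YAML file into this hash.
  # Returns self in every case; raises if the file's top-level
  # value is not a mapping (an assumed policy, not from the answer).
  def load_from_file(filename)
    return self unless File.file?(filename)

    data = YAML.load_file(filename)
    unless data.is_a?(Hash)
      raise TypeError, "expected a YAML mapping in #{filename}, got #{data.class}"
    end

    # Hash#replace swaps in the new contents and returns self.
    replace(data)
  end
end

# Usage with a temporary file (contents are made up for illustration).
file = Tempfile.new(['agent', '.yml'])
file.write("name: Smith\nlevel: 3\n")
file.close

agent = Agent.new.load_from_file(file.path)
puts agent['name']
puts agent.class
```

Because #replace mutates the receiver, the object stays an Agent rather than becoming a plain Hash.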

Related

Saving key string in a hash using a method

I'm using YAML to create a configuration file to automate creating machine setup files. I have some basic Ruby scripting experience, but I'm looking to start using classes more to make things cleaner and get better at programming.
My YAML file, named config.yaml:
machine_configurations:
  MACHINE_NAME_1:
    Settings:
  MACHINE_NAME_2:
    Settings:
I have a class machine_builder.rb
require 'yaml'
class MachineBuilder
  def initialize
    @config = YAML.load_file("config.yaml")
  end
  def machine_list
    @config['machine_configurations'].each do |k,v|
      k
    end
  end
end
What I'm trying to figure out is how to store an array of the machine configuration names.
I've tried testing with:
test = MachineBuilder.new
machine_list = []
machine_list << test.machine_list
what I'm trying to get for a result is
machine_list = ['MACHINE_NAME_1','MACHINE_NAME_2']
but I keep getting the entire hash key and values stored in the array.
machine_list = ['MACHINE_NAME_1 => Settings: ...',' MACHINE_NAME_2 => Settings...']
I've tried changing the method using the following but I guess I'm missing something.
def machine_list
  @config['machine_configurations'].each do |k,v|
    return k
  end
end
This attempt only returns one value, and I'm assuming that this is because return exits the loop once the one value is found.
def machine_list
  @config['machine_configurations'].each do |k,v|
    puts k
  end
end
In the end, I'm also trying to figure out the best practice for iterating and returning values from a method, or to better understand how methods return values in general.
The each method returns the original enumerable object it was called on, which is why you keep getting the entire hash when you call the machine_list method.
You can use the following code to get an array of the keys of the machine_configurations hash stored in @config:
def machine_list
  @config['machine_configurations'].keys
end
and then:
test = MachineBuilder.new
machine_list = test.machine_list
this way the result will be:
machine_list = ['MACHINE_NAME_1','MACHINE_NAME_2']
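To make the difference concrete, here is a small sketch comparing what each, keys, and map return. The hash literal stands in for the parsed config.yaml and is an assumption for illustration.

```ruby
# Stand-in for @config['machine_configurations'] (assumed shape).
machines = {
  'MACHINE_NAME_1' => { 'Settings' => nil },
  'MACHINE_NAME_2' => { 'Settings' => nil }
}

# each discards the block's value and returns the hash it was called on.
from_each = machines.each { |name, _settings| name }

# keys returns only the key names, as an array.
from_keys = machines.keys

# map collects whatever the block returns, which helps when the
# names need transforming on the way out.
from_map = machines.map { |name, _settings| name.downcase }

p from_each.equal?(machines)
p from_keys
p from_map
```

As a rule of thumb, use each for side effects and map (or a purpose-built method such as keys) when the method should return a collection.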

Return containing object instance in ruby

I'm trying to implement a funky version of method chaining. Returning the instance of the class after each function call is easy, you just do
def chainable_method
  some_code()
  self
end
My idea is that the methods you can call depend on the previous method call. I'm trying to achieve this by returning an object belonging to the containing object. The contained object will have a few special methods, and then implement method_missing to return the containing object's instance.
Edit: The child object has some state associated with it that should be in itself, and not the parent. It might not have been clear previously as to why I need a whole instance for just method calls.
super is irrelevant in this case because the contained object doesn't inherit from the containing object, and I wouldn't want to call the containing object's methods on the contained object anyway - I want to call the containing object's methods on the containing object itself. I want the containing object, not the containing object class.
Not sure if this is possible.
Edit: reworded everything to use "containing/contained object" instead of the completely incorrect parent/child object.
Also, I'm using 1.9.3, if that matters. Version isn't important, I can change if needed.
My explanation was probably unclear. Here's the code:
class AliasableString
  def initialize(string)
    @string = string
  end
  def as(aka)
    @aka = aka
  end
  def has_aka?
    !@aka.nil?
  end
  # alias is a reserved word
  def aka
    @aka
  end
  def to_s
    @string + (self.has_aka? ? (" as " + @aka) : "")
  end
end
class Query
  def initialize
    @select_statements = Array.new
  end
  def select(statement)
    select_statement = AliasableString.new(statement)
    @select_statements.push(select_statement)
    select_statement
  end
  def print
    if @select_statements.size != 0
      puts "select"
      @select_statements.each_with_index {| select, i|
        puts select
      }
    end
  end
end
# Example usage
q0 = Query.new
q0.select("This is a select statement")
.select("Here's another one")
.as("But this one has an alias")
.select("This should be passed on to the parent!")
q0.print
I haven't yet fully implemented print. AliasableString needs to have @string and @aka separate so I can pull them apart later.
First of all, it doesn't matter what class of object is contained within a Query instance. All of the syntax shown in your 'example usage' section can be defined on Query itself. The only requirement of the objects contained within a Query instance is that they respond to as (or some similar method). What you have here is something like a state machine, but the only state that really matters is that some object occupies the last position in the select_statements array. Here's how I would build this (again, based mostly on your example at the end; I'm afraid I can't quite follow your initial explanation):
class Query
  # ... initialize, etc. (assumes attr_reader :select_statements)
  def select(statement, statement_class = AliasableString)
    select_statements << statement_class.new(statement)
    self
  end
  def as(aka)
    # this will only ever be used on the most recent statement added
    statement_to_alias = select_statements.last
    # throw an error if select_statements is empty (i.e., :last returns nil)
    raise 'You must add a statement first' unless statement_to_alias
    # forward the message on to the statement
    statement_to_alias.as(aka)
    # return the query object again to permit further chaining
    self
  end
end
AliasableString doesn't need to know a thing about Query; all it needs to do is respond appropriately to as.
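Putting the pieces together, a self-contained version of this approach might look like the sketch below. The attr_reader, the to_s methods, and the sample statements are assumptions added so that the example runs on its own; they are not part of the original answer.

```ruby
class AliasableString
  attr_reader :aka

  def initialize(string)
    @string = string
    @aka = nil
  end

  # Records the alias; knows nothing about Query.
  def as(aka)
    @aka = aka
  end

  def to_s
    @aka ? "#{@string} as #{@aka}" : @string
  end
end

class Query
  attr_reader :select_statements

  def initialize
    @select_statements = []
  end

  # Adds a statement and returns the query so calls can be chained.
  def select(statement, statement_class = AliasableString)
    select_statements << statement_class.new(statement)
    self
  end

  # Forwards the alias to the most recently added statement.
  def as(aka)
    statement = select_statements.last
    raise 'You must add a statement first' unless statement

    statement.as(aka)
    self
  end

  def to_s
    (['select'] + select_statements.map(&:to_s)).join("\n")
  end
end

query = Query.new
             .select('first_name')
             .select('last_name').as('surname')
             .select('age')
puts query
```

Every chained call returns the Query, so the per-statement state (the alias) stays inside AliasableString while the chaining logic stays in one place.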

Read and write YAML files without destroying anchors and aliases

This question has been asked before: Read and write YAML files without destroying anchors and aliases?
I was wondering how to solve that problem with many anchors and aliases?
thanks
The problem here is that anchors and aliases in YAML are a serialization detail, so they aren’t part of the data after it has been parsed, and the original anchor name isn’t known when writing the data back out to YAML. To keep the anchor names when round-tripping, you need to store them somewhere when parsing so that they are available later when serializing. In Ruby any object can have instance variables associated with it, so an easy way to achieve this is to store the anchor name in an instance variable of the object in question.
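The loss of anchor names can be observed with a short experiment. This is a sketch using only the standard yaml library; the document contents and variable names are made up for illustration.

```ruby
require 'yaml'

source = <<~YAML
  base: &defaults
    retries: 3
  copy: *defaults
YAML

# Psych 4 disables aliases in YAML.load, so prefer unsafe_load when
# it is available and fall back to load on older versions.
data = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(source) : YAML.load(source)

# The alias resolves to the very same Ruby object as the anchor...
same_object = data['base'].equal?(data['copy'])

# ...but the name "defaults" is not stored anywhere in the parsed data,
# so dumping has to invent its own anchor name.
dumped = YAML.dump(data)
name_survived = dumped.include?('&defaults')

puts same_object
puts name_survived
puts dumped
```

The dumped text still contains an anchor and an alias, because the emitter notices the shared object, but the anchor carries a generated name rather than the original one.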
Continuing from the example in the earlier question, for hashes we can change our redefined revive_hash method so that, if the hash is an anchor, then as well as recording the anchor name in the @st variable so that later aliases can be recognised, we also add it as an instance variable on the hash.
class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end
    o.children.each_slice(2) { |k,v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end
Note that this only affects YAML mappings that are anchors. If you want other types to keep their anchor name, you’ll need to look at psych/visitors/to_ruby.rb and make sure the name is added in all cases. Most types can be included by overriding register, but there are a couple of others; search for @st.
Now that the hash has the desired anchor name associated with it, you need to make Psych use it instead of the object id when serializing it. This can be done by subclassing YAMLTree. When YAMLTree processes an object, it first checks to see if that object has been seen already, and emits an alias for it if it has. For any new objects, it records that it has seen the object in case it needs to create an alias later. The object_id is used as the key in this, so you need to override those two methods to check for the instance variable, and use that instead if it exists:
class MyYAMLTree < Psych::Visitors::YAMLTree
  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? anchor_name
        oid = anchor_name
        node = @st[oid]
        anchor = oid.to_s
        node.anchor = anchor
        return @emitter.alias anchor
      end
    end
    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end
  # record object for future, using '@_yaml_anchor_name' rather
  # than object_id if it exists
  def register target, yaml_obj
    anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
    @st[anchor_name] = yaml_obj
    yaml_obj
  end
end
Now you can use it like this (unlike the previous question, you don’t need to create a custom emitter in this case):
builder = MyYAMLTree.new
builder << data
tree = builder.tree
puts tree.yaml # returns a string
# alternatively write directly to a file
# ('w' creates the file if needed; 'r+' would fail on a missing file):
File.open('a_file.yml', 'w') do |f|
  tree.yaml f
end
Here's a slightly modified version for newer versions of the psych gem. Before, it gave me the following error:
NoMethodError - undefined method `[]=' for #<Psych::Visitors::YAMLTree::Registrar:0x007fa0db6ba4d0>
The register method has moved into a class nested inside YAMLTree (Registrar), so the following works now; everything else matt says in his answer still applies:
class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end
    o.children.each_slice(2) { |k,v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end
class MyYAMLTree < Psych::Visitors::YAMLTree
  class Registrar
    # record object for future, using '@_yaml_anchor_name' rather
    # than object_id if it exists
    def register target, node
      anchor_name = target.instance_variable_get('@_yaml_anchor_name') || target.object_id
      @obj_to_node[anchor_name] = node
    end
  end
  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? anchor_name
        oid = anchor_name
        node = @st[oid]
        anchor = oid.to_s
        node.anchor = anchor
        return @emitter.alias anchor
      end
    end
    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end
end
I had to further modify the code that @markus posted to work with Psych v2.0.17.
Here's what I ended up with. I hope it helps someone else save quite a bit of time. :-)
class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    if o.anchor
      @st[o.anchor] = hash
      hash.instance_variable_set "@_yaml_anchor_name", o.anchor
    end
    o.children.each_slice(2) do |k,v|
      key = accept(k)
      hash[key] = accept(v)
    end
    hash
  end
end
class Psych::Visitors::YAMLTree::Registrar
  # record object for future, using '@_yaml_anchor_name' rather
  # than object_id if it exists
  def register target, node
    @targets << target
    @obj_to_node[_anchor_name(target)] = node
  end
  def key? target
    @obj_to_node.key? _anchor_name(target)
  rescue NoMethodError
    false
  end
  def node_for target
    @obj_to_node[_anchor_name(target)]
  end
  private
  def _anchor_name(target)
    target.instance_variable_get('@_yaml_anchor_name') || target.object_id
  end
end
class MyYAMLTree < Psych::Visitors::YAMLTree
  # check to see if this object has been seen before
  def accept target
    if anchor_name = target.instance_variable_get('@_yaml_anchor_name')
      if @st.key? target
        node = @st.node_for target
        node.anchor = anchor_name
        return @emitter.alias anchor_name
      end
    end
    # accept is a pretty big method, call super to avoid copying
    # it all here. super will handle the cases when it's an object
    # that's been seen but doesn't have '@_yaml_anchor_name' set
    super
  end
  def visit_String o
    if o == '<<'
      style = Psych::Nodes::Scalar::PLAIN
      tag = 'tag:yaml.org,2002:str'
      plain = true
      quote = false
      return @emitter.scalar o, nil, tag, plain, quote, style
    end
    # visit_String is a pretty big method, call super to avoid copying it all
    # here. super will handle the cases when it's a string other than '<<'
    super
  end
end

Trying to re-define += in an Array subclass doesn't seem to do anything?

In an Array subclass (just an array that does some coercing of input values) I've defined #concat to ensure values are coerced. Since nobody ever uses #concat and is more likely to use #+= I tried to alias #+= to #concat, but it never seems to get invoked. Any ideas?
Note that the coercing is actually always to objects of a particular superclass (which accepts input via the constructor), in case this code seems not to do what I describe. It's part of an internal, private API.
class CoercedArray < Array
  def initialize(type)
    super()
    @type = type
  end
  def push(object)
    object = @type.new(object) unless object.kind_of?(@type)
    super
  end
  def <<(object)
    push(object)
  end
  def concat(other)
    raise ArgumentError, "Cannot append #{other.class} to #{self.class}<#{@type}>" unless other.kind_of?(Array)
    super(other.inject(CoercedArray.new(@type)) { |ary, v| ary.push(v) })
  end
  alias :"+=" :concat
end
#concat is working correctly, but #+= seems to be completely by-passed.
Since a += b is syntactic sugar for a = a + b, += is not a method of its own and can't be defined or aliased. I'd try to override the + method instead.
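A sketch of that suggestion follows. It is a simplified variant of the question's class, not the asker's actual code: the Label struct is a hypothetical element type and the private coerce helper is an assumption introduced to keep the example short.

```ruby
# Hypothetical element type that accepts raw input via its constructor.
Label = Struct.new(:text)

class CoercedArray < Array
  def initialize(type)
    super()
    @type = type
  end

  def push(object)
    super(coerce(object))
  end

  def <<(object)
    push(object)
  end

  def concat(other)
    unless other.kind_of?(Array)
      raise ArgumentError, "Cannot append #{other.class} to #{self.class}<#{@type}>"
    end
    super(other.map { |value| coerce(value) })
  end

  # a += b is evaluated as a = a + b, so + is the method to define.
  # It returns a new CoercedArray and leaves the receiver untouched.
  def +(other)
    dup.concat(other)
  end

  private

  def coerce(object)
    object.kind_of?(@type) ? object : @type.new(object)
  end
end

list = CoercedArray.new(Label)
list << 'a'
original = list
list += ['b', Label.new('c')]

p list.map(&:text)
p original.size
```

Note that + builds a new array, so after list += ... the variable points at a different object than before, which matches how += behaves on a plain Array.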

Ruby core extensions with modules

Basically I have two modules: CoreExtensions::CamelcasedJsonString and …::CamelcasedJsonSymbol. The latter one overrides the Symbol#to_s, so that the method returns a String which is extended with the first module. I don't want every string to be a CamelcasedJsonString. This is the reason why I try to apply the extension instance specific.
My problem is, that Symbol#to_s seems to be overridden again after I included my module (the last spec fails):
require 'rubygems' if RUBY_VERSION < '1.9'
require 'spec'
module CoreExtensions
  module CamelcasedJsonString; end
  module CamelcasedJsonSymbol
    alias to_s_before_core_extension to_s
    def to_s(*args)
      to_s_before_core_extension(*args).extend(CamelcasedJsonString)
    end
  end
  ::Symbol.send :include, CamelcasedJsonSymbol
end
describe Symbol do
  subject { :chunky_bacon }
  it "should be a CamelcasedJsonSymbol" do
    subject.should be_a(CoreExtensions::CamelcasedJsonSymbol)
  end
  it "should respond to #to_s_before_core_extension" do
    subject.should respond_to(:to_s_before_core_extension)
  end
  specify "#to_s should return a CamelcasedJsonString" do
    subject.to_s.should be_a(CoreExtensions::CamelcasedJsonString)
  end
end
However the following example works:
require 'rubygems' if RUBY_VERSION < '1.9'
require 'spec'
module CoreExtensions
  module CamelcasedJsonString; end
end
class Symbol
  alias to_s_before_core_extension to_s
  def to_s(*args)
    to_s_before_core_extension(*args).extend(CoreExtensions::CamelcasedJsonString)
  end
end
describe Symbol do
  subject { :chunky_bacon }
  it "should respond to #to_s_before_core_extension" do
    subject.should respond_to(:to_s_before_core_extension)
  end
  specify "#to_s should return a CamelcasedJsonString" do
    subject.to_s.should be_a(CoreExtensions::CamelcasedJsonString)
  end
end
Update: Jan 24, 2010
The background of my problem is that I try to convert a huge nested hash
structure into a JSON string. Each key in this hash is a Ruby Symbol in the
typical underscore notation. The JavaScript library which consumes the JSON
data expects the keys to be strings in camelcase notation. I thought that
overriding the Symbol#to_json method might be the easiest way. But that
didn't work out since Hash#to_json calls first #to_s and afterwards
#to_json on each key. Therefore I thought it might be a solution to extend
all Strings returned by Symbol#to_s with a module which overrides the
#to_json method of this specific string instance to return a string that has
a #to_json method which returns itself in camelcase notation.
I'm not sure if there is an easy way to monkey patch Hash#to_json.
If someone wants to take a look into the JSON implementation I'm using, here is the link: http://github.com/flori/json/blob/master/lib/json/pure/generator.rb (lines 239 and following are of interest)
Your second monkeypatch works since you are re-opening the Symbol class.
The first one doesn't, because all include does is add the module to the list of included modules, which sit after the class itself in the method lookup order. Their methods get called only if the class itself doesn't define that method, or if the class's method calls super. So your code never gets called.
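The lookup order can be checked with a toy example. This is a sketch with hypothetical class and module names; it also contrasts include with Module#prepend, which exists in Ruby 2.0 and later and therefore was not an option when this thread was written.

```ruby
module Shout
  def greet
    super.upcase
  end
end

class Included
  def greet
    'hello'
  end
  include Shout # Shout is placed after Included in the lookup chain
end

class Prepended
  def greet
    'hello'
  end
  prepend Shout # Shout is placed before Prepended in the lookup chain
end

p Included.ancestors.first(2)
p Prepended.ancestors.first(2)
p Included.new.greet
p Prepended.new.greet
```

With include, the class's own greet is found first and the module's version is never reached; with prepend, the module's version runs first and can call super to reach the original.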
If you want to use a module, you must use the included callback:
module CamelcasedJsonSymbol
  def self.included(base)
    base.class_eval do
      alias_method_chain :to_s, :camelcase_json
    end
  end
  def to_s_with_camelcase_json(*args)
    to_s_without_camelcase_json(*args).extend(CamelcasedJsonString)
  end
end
I've used ActiveSupport's alias_method_chain, which you should always do when monkey patching. It encourages you to use the right names and thus avoid collisions, among other things.
That was the technical answer.
On a more pragmatic approach, you should rethink this. Repeatedly extending strings like this is not nice, will be a huge performance drain on most implementations (it clears the whole method cache on MRI, for instance) and is a big code smell.
I don't know enough about the problem to be sure, or to suggest other solutions (maybe a Delegate class could be the right thing to return?), but I have a feeling this is not the right way to arrive at your goals.
Since you want to convert the keys of a hash, you could pass an option to #to_json and monkeypatch that instead of #to_s, like:
{ :chunky_bacon => "good" }.to_json(:camelize => true)
My first idea was to monkeypatch Symbol#to_json, but that won't work, as you point out, because Hash will force the keys to strings before calling to_json, since JSON object keys must be strings. So you can monkeypatch Hash instead:
module CamelizeKeys
  def self.included(base)
    base.class_eval do
      alias_method_chain :to_json, :camelize_option
    end
  end
  def to_json_with_camelize_option(*args)
    if args.empty? || !args.first[:camelize]
      to_json_without_camelize_option(*args)
    else
      pairs = map do |key, value|
        "#{key.to_s.camelize.to_json(*args)}: #{value.to_json(*args)}"
      end
      "{" << pairs.join(",\n") << "}"
    end
  end
end
That looks kind of complicated. I probably don't understand what it is you're trying to achieve, but what about something like this?
#!/usr/bin/ruby1.8
class Symbol
  alias_method :old_to_s, :to_s
  def to_s(*args)
    if args == [:upcase]
      old_to_s.upcase
    else
      old_to_s(*args)
    end
  end
end
puts :foo # => foo
puts :foo.to_s(:upcase) # => FOO
and a partial spec:
describe :Symbol do
  it "should return the symbol as a string when to_s is called" do
    :foo.to_s.should eql 'foo'
  end
  it "should delegate to the original Symbol.to_s method when to_s is called with unknown arguments" do
    # Yeah, wish I knew how to test that
  end
  it "should return the symbol name as uppercase when to_s(:upcase) is called" do
    :foo.to_s(:upcase).should eql "FOO"
  end
end
