Ruby: marshal and unmarshal a variable, not an instance - ruby

OK, Ruby gurus, this is a hard one to describe in the title, so bear with me for this explanation:
I'm looking to pass a string that represents a variable: not an instance, not the collection of properties that make up an object, but the actual variable: the handle to the object.
The reason for this is that I am dealing with resources that can be located on the filesystem, on the network, or in-memory. I want to create URI handler that can handle each of these in a consistent manner, so I can have schemes like eg.
file://
http://
ftp://
inmemory://
you get the idea. It's the last one that I'm trying to figure out: is there some way to get a string representation of a reference to an object in Ruby, and then use that string to create a new reference? I'm truly interested in marshalling the reference, not the object. Ideally there would be something like taking Object#object_id, which is easy enough to get, and using it to create a new variable elsewhere that refers to the same object. I'm aware that this could be really fragile and so is an unusual use case: it only works within one Ruby process for as long as there is an existing variable to keep the object from being garbage collected, but those are both true for the inmemory scheme I'm developing.
The only alternatives I can think of are:
marshal the whole object and cram it into the URI, but that won't work because the data in the object is an image buffer - very large
Create a global or singleton purgatory area to store a variable for retrieval later using e.g. a hash of object_id:variable pairs. This is a bit smelly, but would work.
Any other thoughts, StackOverflowers?

There's ObjectSpace._id2ref :
f = Foo.new #=> #<Foo:0x10036c9b8>
f.object_id #=> 2149278940
ObjectSpace._id2ref(2149278940) #=> #<Foo:0x10036c9b8>
In addition to the caveats about garbage collection ObjectSpace carries a large performance penalty in jruby (so much so that it's disabled by default)

Variables aren't objects in Ruby. You not only cannot marshal/unmarshal them, you can't do anything with them. You can only do something with objects, which variables aren't.
(It would be really nice if they were objects, though!)

You could look into MagLev which is an alternative Ruby implementation built on top of VMware's Gemstone. It has a distributes object model wiht might suit your use-case.
Objects are saved in the central Gemstne instance (with some nifty caching) and can be accessed by any number of remote worker instances. That way, any of the workers act on the same object space and can access the very same objects simultaneously. That way, you can even do things like having the global Garbage Collector running on a single Ruby instance or seamlessly moving execution at any point to different nodes (while preserving all the stack frames) using Continuations.

Related

Near-sdk-as Contracts - Singleton style vs Bag of functions

It seems like there are two styles for writing NEAR smart contracts in assembly script
Bag of functions like Meme Museum
Singleton style like Lottery.
I was wondering under what circumstance one style is recommended over the other.
When should you use one over the other? What are the advantages/disadvantages of each style?
The big differences is in initialization of the contract.
Bag of Functions (BoF)
For the first style, usually a persistent collection is declared at the top level. All top level code in each file is placed into a function and one start function calls each. The Wasm binary format allows to specify the start function so that anytime the binary is instantiated (e.i. loaded by a runtime) that function is called before any exported functions can be called.
For persistent collections this means allocating the objects, but no data is required from storage, until something is read, including the length or size of the collection.
Singleton
A top level instance of the contract is declared. Then storage is checked for the "STATE" key, which contains the state of the instance. If it is present storage is read and the instance is deserialized from storage. Then each "method" of the contract is an exported function that uses the global instance, e.g. instance.method(...), passing the arguments to the method. If the method is decorated with #mutateState then the instance is written back to storage after the method call.
Which To Use
Singleton's provide a nice interface for the contract and is generally easier to understand. However, since you are reading from storage at every method call, it can be more expensive.
The other advantage of the BoF is it's easier to extend as you can export more functions from a dependency.
However, it's also possible to use a combination of both as well.
So really it's whatever makes sense to you.

Chef Ruby object and collections

I've dipped my toe in chef and am having a few difficulties with what should be simple concepts.
I'm obtains data from a node by running a search; my plan is to iterate over the results and create an object of type X setting its variables as I go.
I'd like to store these objects in a collection so that I can access them later in the recipe to carry out other tasks and so on.
My GoogleFu has so far come up short and I'm worried that I'm tackling this in the wrong way. My search is fine and returning the values, my separate class is also fine but the storing of these objects into a collection and then persisting that is proving more difficult. Many posts frown against using arrays for my purpose(if it's possible) and I've not found anything similar to an ArrayList or Map. Additionally, if I use a ruby collection, does it need to be maintained inside a ruby block?
Thanks for any help / advice.
With Chef, you have several ways of storing persistent data:
1) set node attributes
2) chef data bags
3) chef vault
4) environments
5) environment recipes
6) roles
IMHO, you should decide where this data should reside by determining which of the items I listed it belongs to.
What does it apply to? What does it describe?
You got to be more specific in where you are actually facing the issue but as far as I have understood why not
just define a ruby class and initialize all the variables you are supposed to get. In the recipe instantiate the object and keep settings its properties from the result. There should be no issue in this approach.
But more importantly what is your use-case here, because you could just define an array attribute and then keep syncing the result in that attribute . As in ruby Objects are referenced , thus any change you make to resource attributes which takes that object, changes are persistent to in that object.

Deleting an instance of a class via a method of that class

I have a class for pieces on a board. I want to be able to delete an instance of Piece so that anything else in the program that points to that piece will just point to nil.
Here's the very basic code version of what I want to do:
piece = Piece.new
variable = piece
variable #=> <Piece:0x0000000xxxxxxxx>
piece.delete
variable #=> nil
This seems like a very basic task so I feel like I'm missing something obvious. I've tried creating a delete method for the class with "self = nil", but this returns an error ("Can't change the value of self").
So far I have just worked around this by updating the other things that point to the object in my 'delete' method, but it seems like there should be a better way.
This is not possible.
Firstly, Ruby is an object-oriented language, which means that all manipulation is done via messages to objects, and all that is manipulated are objects. Variables are not objects, therefore you cannot manipulate them. (The only things you can do with variables are assign a value to them and dereference them.)
And even if you could manipulate variables, you would still need to hunt down every single reference to the object in question and remove it, in order for the object to be eligible for "deletion" (i.e. garbage collection).

In Ruby, what are the use cases for adding methods to an instance's singleton class?

Thanks to some other posts and reading, I understand singleton/meta classes. And I understand why we'd want to use them on a class. But I still don't understand why we'd want to use them on instance objects. And I've yet to see it in practice.
I'm referring to something like this:
class Vehicle
def odometer_reading
# some code
end
end
my_car = Vehicle.new
def my_car.open_door
# some code
end
At first thought, this seems like a bad idea as it would lead to difficulties in understanding the code and debugging.
Why would we want to do this? What are some examples of when this is a good idea?
One example is using it for testing purposes: creating mock and double objects, stubbing methods. Debugging is somewhere nearby: re-defining the logging method for a specific object that you suspect is mis-behaving, so that the log info is printed directly to console (or more info is printed) during the debug session.
Another example is dealing with special cases - instead of inheritance you can do just that. Starting from a classical example if you use two types of Employees, say, Engineers and SalesPersons, for which the rules of compensation calculation are different, you can put the common logic into the Employee class, then inherit the other two classes from it and implement their own calculate_salary methods there. Now, if there is an outlier - a star salesman that you have agreed to a different compensation scheme with, a CEO with a very special scheme, etc - instead of creating a whole sub-class for this special employee, you can just define this method for a specific object representing that employee.
The third example is dealing with an object lifecycle and performance considerations. Instead of having a long case of various states in some processing method. E.g. for a file-reading class that transparently caches the entire file in the background (I know a too-simplistic-for-real-life approach, but just as a model) all read requests while the file is not entirely read should check if the requested data is already in the cache or should be read from disk. Once the file is fully read they always go from the cache. Instead of having the if (case if there are more states) to deal with this you could simply re-define the read method at the object-level once the file is fully read to the cache. For this simple example it doesn't lead to any sizable performance benefit (if any benefit at all), but for more complex cases that may be worth it.
You wouldn't add them using def, that's a rather rigid way of doing it, but instead by using something like define_method or extend.
Although this is not the sort of thing you'd do on a routine basis, it does mean you can do some rather unusual things. ActiveRecord in Rails produces results in the form of an Array with additional methods added on to perform other operations.
An Object-Relationship Mapper would be a case where you'd probably want to do this. Sometimes, depending on how you fetch a record, the methods available differ significantly. Being able to add those dynamically means each fetched object can be completely customized even if they have the same class and general-purpose methods.
Another example: You have an array of hashes and you want each hash to have a method-call getter and setter. Something like:
user = HashOnSteroids.new(name: 'John')
user[:name] # => 'John'
user[:name] = 'Joe'
user.name # => 'Joe'
user.name = 'John'
user.set(name: 'Jim', age: 5)
This means you cannot write standard method definitions in the class as each hash will have a different set of keys (method names). This means you have to resort to defining singleton methods so each object has its own set of methods (not a pack of shared methods).
Warning: Using singleton methods for this use case is highly inefficient. A sneaky method_missing is faster and uses way less memory as it doesn't have to allocate a billion of proc objects.

How can I update an already instantiated Ruby object with YAML?

Basically, I have an instance of a Ruby object already but want to update whatever instance variables I can from yaml. There is a to_yaml function that will dump my object to yaml. I'm looking for something in the reverse. For example, my_obj.from_yaml(yaml_stuff) and have it update instance variables from the yaml passed in.
Would I need to, in my from_yaml function, use YAML::load and copy each instance variable? Is there a function I can use to quickly copy those variables without much typing if that is the case?
Does Ruby's yaml library have something already where I can pass it the object and the yaml and it'll just do what I want it to do?
Editing for clarity
This is a simple object that will store and load very simple yaml compatible types such as strings and integers.
What I ended up doing
Although I answered this question I wanted to add what I ended up doing, my Object monkey patch
class Object
def from_yaml(yml)
if (yml.nil?)
return
end
yml.instance_variables.each do |iv|
if (self.instance_variable_defined?(iv))
self.instance_variable_set(iv, yml.instance_variable_get(iv))
end
end
end
end
Your question is not clear enough. Which class are you talking about? What kind of YAML documents? You can't have everything serialized to and from YAML.
Let's assume that your object just has a set of instance variables of simple, YAML-compatible types, such as strings, numbers and symbols.
In that case, you can generally, write from_yaml method, which would load YAML file into a hash of key->value pairs, iterate through it and update every instance variable named key with value. Does that seem useful, and if it does, do you need help writing such method?
Edit:
There is no need for you to keep your object state in a hash - you can still use ivars and attr_accessors - just open up a new module (say YamlUpdateable), implement a from_yaml method which would update your ivars from a hash deserialized from YAML, and include the module in whichever class you want to deserialize from YAML.
As far as I know, there's nothing like that included with the YAML library itself; it's mostly meant for dumping and reading data, not keeping it up-to-date in memory and on disk. If you're planning to keep data in memory and on disk synced with each other with minimal hassle, have you considered a data persistence library like ActiveRecord or Stone?
If you're still keen on using the YAML library, and assuming you don't have many different classes to persist, it might make sense to simply write a small "updater" method that updates an object of that class given a similar object. Or you could rework your application to make sure you can simply reload all the objects from the YAML without having to update them (i.e., dump the old objects and create new ones).
The other option is to use metaprogramming to read into an object's properties and update them accordingly, but that seems error-prone and dangerous.
What you are looking for is the merge command.
// fetch yaml file
yml = YAML.load_file("path/to/file.yml")
// merge variables
my_obj.merge(yml)

Resources