Why isn't there a deep copy method in Ruby? - ruby

I am working on a solution for technical drawings (svg/ruby). I want to manipulate rectangles, and have an add! method in this class:
class Rect
def add!(delta)
#x1+=delta
... # and so on
self
end
end
I also need an add method returning a Rect, but not manipulating self:
def add(delta)
r=self.dup/clone/"copy" # <-- not realy the 3 and no quotes, just in text here
r.add! delta
end
dup and clone don't do my thing but:
def copy; Marshal.load(Marshal.dump(self)); end
does.
Why does such a basic functionality not exist in plain Ruby? Please just don't tell me that I could reverse add and add!, letting add do the job, and add! calling it.

I'm not sure why there's no deep copy method in Ruby, but I'll try to make an educated guess based on the information I could find (see links and quotes below the line).
Judging from this information, I could only infer that the reason Ruby does not have a deep copy method is because it's very rarely necessary and, in the few cases where it truly is necessary, there are other, relatively simple ways to accomplish the same task:
As you already know, using Marshal.dump and Marshal.load is currently the recommended way to do this. This is also the approach recommended by Programming Ruby (see excerpts below).
Alternatively, there are at least 3 available implementations found in these gems: deep_cloneable, deep_clone and ruby_deep_clone; the first being the most popular.
Related Information
Here's a discussion over at comp.lang.ruby which might shed some light on this. There's another answer here with some associated discussions, but it all comes back to using Marshal.
There weren't any mentions of deep copying in Programming Ruby, but there were a few mentions in The Ruby Programming Language. Here are a few related excerpts:
[…]
Another use for Marshal.dump and Marshal.load is to create deep copies
of objects:
def deepcopy(o)
Marshal.load(Marshal.dump(o))
end
[…]
… the binary format used by Marshal.dump and Marshal.load is
version-dependent, and newer versions of Ruby are not guaranteed to be
able to read marshalled objects written by older versions of Ruby.
[…]
Note that files and I/O streams, as well as Method and Binding
objects, are too dynamic to be marshalled; there would be no reliable
way to restore their state.
[…]
Instead of making a defensive deep copy of the array, just call
to_enum on it, and pass the resulting enumerator instead of the array
itself. In effect, you’re creating an enumerable but immutable proxy
object for your array.

Forget marshalling. The deep_dive gem will solve your problems.
https://rubygems.org/gems/deep_dive

Why can't you use something like this:
new_item = Item.new(old_item.attributes)
new_item.save!
This would copy all the attributes from existing item to new one, without issues. If you have other objects, you can just copy them individually.
I think it's the quickest way to copy an object

Related

Is there a gem that provides support to detect changes to native ruby type instances?

Although I agree that extending native types and objects is a bad practice, inheriting from them should not be.
In a supposedly supporting gem (that I could not find), the way that the native types were to be used would be as follows:
require 'cool-unkown-light-gem'
class MyTypedArray < CoolArray # would love to directly < Array
def initialize(*args)
super(*args)
# some inits for DataArray
#caches_init = false
end
def name?(name)
init_caches unless !#caches_init
!!#cache_by_name[name]
end
def element(name)
init_caches unless !#caches_init
#cache_by_name[name]
end
private
# overrides the CoolArray method:
# CoolArray methods that modify self will call this method
def on_change
#caches_init = false
super
end
def init_caches
return #cache_by_name if #caches_init
#caches_init = true
#cache_by_name = self.map do |elem|
[elem.unique_name, elem]
end.to_h
end
end
Any method of the parent class not overridden by the child class that modifies self would call, let's say (in this case), the on_change function. Which would allow to do not have to re-define every single one of those methods to avoid losing track on changes.
Let's say the MyTypedArray would array Foo objects:
class Foo
attr_reader :unique_name
def initialize(name)
#unique_name = name
end
end
a short example of the expected behaviour of its usage:
my_array = MyTypedArray.new
my_array.push( Foo.new("bar") ).push( Foo.new("baz") )
my_array.element("bar").unique_name
# => "bar"
my_array.shift # a method that removes the first element from self
my_array.element("bar").unique_name
# => undefined method `unique_name' for nil:NilClass (NoMethodError)
my_array.name?("bar")
# => false
I understand that we should search for immutable classes, yet those native types support changes on the same object and we want a proper way to do an inheritance that is as brief and easy as possible.
Any thoughts, approaches, or recommendations are more than welcome, of course. I do not think I am the only one that have thought on this.
The reason why I am searching for a maintained gem is because different ruby versions may offer different supported methods or options for native types / classes.
[Edit]
The aim of the above is to figure out a pattern that works. I could just follow the rules and suggestions of other posts, yet would not get things work the way I am intended and when I see it proper (a coding language is made by and for humans, and not humans made for coding languages). I know everyone is proud of their achievements in learning, developing and making things shaped in a pattern that is well known in the community.
The target of the above is because all the methods of Array are more than welcome. I do not care if in the version 20 of Ruby they remove some methods of Array. By then my application will be obsolete or someone will achieve the same result in far less code.
Why Array?
Because the order matters.
Why an internal Hash?
Because for the usage I want to make of it, in overall, the cost of building the hash compensates the optimization it offers.
Why not just include Enumerable?
Because we just reduce the number of methods that change the object, but we do not actually have a pattern that allows to change #caches_init to false, so the Hash is rebuilt on next usage (so same problem as with Array)
Why not just whitelist and include target Array methods?
Because that does not get me where I want to be. What if I want anyone to still use pop, or shift but I do not want to redefine them, or even having to bother to manage my mixins and constantly having to use responds_to?? (perhaps that exercise is good to improve your skills in coding and read code from other people, but that is not what it should be)
Where I want to be?
I want to be in a position that I can re-use / inherit any, I repeat, any class (no matter if it is native or not). That is basic for an OOP language. And if we are not talking about an OOP language (but just some sugar at the top of it to make it appear as OOP), then let's keep ourselves open to analyse patterns that should work well (no matter if they are odd - for me is more odd that there are no intermediate levels; which is symptom of many conventional patterns, which in turn is symptom of poor support for certain features that are more widely required than what is accepted).
Why should a gem offer the above?
Well, let's humble it. The above is a very simple case (and even though not covered). You may gain in flexibility at some point by using what some people want to call the Ruby way. But at a cost when you move to bigger architectures. What if I want to create intermediate classes to inherit from? Enriched native classes that boost simple code, yet keeping it aligned with the language. It is easier to say this is not the Ruby way than trying to make the language closer to something that escalates well from the bottom.
I am not surprised that Rails and Ruby are almost "indistinctly" used by many. Because at some point, without some Rails support, what you have with Ruby is a lot of trouble. As, consequently, I am not surprised that Rails is so maintained.
Why should I redefine a pop, or a last, or first methods? For what? They are already implemented.
Why should I whitelist methods and create mixins? is that a object or method oriented programming?
Anyway... I do not expect anyone to share my view on this. I do see other patterns, and I will keep allowing my mind to find them. If anyone is open enough, please, feel free to share. Someone may criticize the approach and be right, but if you got there is because it worked.
To answer your question as it is written, no, there is no gem for this. This is not a possibility of the language, either in pure Ruby or in C which is used internally.
There is no mechanism in detect when self is changed, nor any way to detect if a method is pure (does not change self) or impure (does change self). It seems you want a way to "automatically" be able to know when a method is one or the other, and that, to put simply, is just not possible, nor is it in any language that I am aware of.
Internally (using your example) an Array is backed by a RArray structure in C. A struct is simple storage space: a way to look at an arbitrary block of memory. C does not care how you choose to look at memory, I could just as easily cast the pointer of this struct and say it is a now a pointer to an array of integers and change it that way, it will happily manipulate the memory as I tell it to, and there is nothing that can detect that I did so. Now add in the fact that anyone, any script, or any gem can do this and you have no control over it, and it just shows that this solution is fundamentally and objectively flawed.
This is why most (all?) languages that need to be notified when an object is changed use an observer pattern. You create a function that "notifies" when something changes, and you invoke that function manually when needed. If someone decides to subclass your class, they need only continue the pattern to raise that function if it changes the object state.
There is no such thing as an automatic way of doing this. As already explained, this is an "opt-in" or "whitelist" solution. If you want to subclass an existing object instead of using your own from scratch, then you need to modify its behavior accordingly.
That said, adding the functionality is not as daunting as you may think if you use some clever aliasing and meta-programming with module_eval, class_eval or the like.
# This is 100% untested and not even checked for syntax, just rough idea
def on_changed
# Do whatever you need here when object is changed
end
# Unpure methods like []=, <<, push, map!, etc, etc
unpure_methods.each do |name|
class_eval <<-EOS
alias #{name}_orig #{name}
def #{name}(*args, &block)
#{name}_orig(*args, &block)
on_changed
end
EOS
end

Immutable Nokogiri documents

Once I parse a complex HTML document into a Nokogiri::HTML object, I want to pass it around to various classes to extract information from it. Recently I discovered some code that accidentally mutated the original object:
body = Nokogiri::HTML(html_text_from_the_internets)
tables = body.css('table')
tables.each do {|table|
table.css('br').each do |line_break|
line_break.replace("\n")
end
end
All <br> tags are now removed from all tables in body
Is there are a way to freeze body so it can not be mutated? That way, during testing, hopefully, I can catch other side-effect mutations before they break things. Calling body.freeze doesn't work (probably because it only freezes the top-level object).
Yes, you can "deep freeze" a nested object.
A deep freeze is essentially a traversal that freezes all sub-objects.
You can write it yourself if you want, or use a gem such as ice_nine
Another way to test, that doesn't involve freezing, is to compare the overall object before and after. An easy way to do this is Marshall dump the object before, and after, and compare them. If the dumps are different, they will show you exactly what has changed. (Similarly, you could do a deep clone a.k.a. deep dup before, and use that for the comparison. Similarly, you could use an object hash.)

Is it possible to change Ruby's frozen object handling behaviour?

I am submitting solutions to Ruby puzzles on codewars.com and experimenting with how locked into the testing enviroment I am for one of the challenges.
I can redefine the classes used to test my solution but they are defined by the system after I submit my code. If I freeze these objects, the system cannot write over them but a RunTime error is raised when it tries to.
I'm fairly new to Ruby, so I'm not sure which parts (other than falsiness and truthiness) are impossible to override. Can I use Ruby code to force modification of frozen objects to silently fail instead of terminate the program or is that bound up in untouchable things like the assignment operator or similar?
The real answer here is that if you might want to modify an object later, you shouldn't freeze it. That's inherent in the whole concept of "freezing" an object. But since you asked, note that you can test whether an object is frozen with:
obj.frozen?
So if those pesky RuntimeErrors are getting you down, one solution is to use a guard clause like:
obj.do_something! if !obj.frozen?
If you want to make the guard clauses implicit, you can redefine the "problem" methods using a monkey patch:
class Array
# there are a couple other ways to do this
# read up on Ruby metaprogramming if you want to know
alias :__pop__ :pop
def pop
frozen? ? nil : __pop__
end
end
If you want your code to work seamlessly with any and all Ruby libraries/gems, adding behavior to built-in methods like this is probably a bad idea. In this case, I doubt it will cause any problems, but whenever you choose to start hacking on Ruby's core classes, you have to be ready for the possible consequences.

Is there a way to check what's inside method's code in Ruby or even modify it?

I'm currently creating a my own plugin for Redmine. I found the following method in its core (not exact code, but the idea is preserved):
def method(foo, bar, array)
# Do some complex stuff with foo and bar
#array = array
#array.uniq!
#array = #array[0:3]
# Do some complex weird stuff with #array
end
I have to change this '3' to '6', because three elements in array is not enough for my plugin's purposes. I can change it manually and nothing crashes, but I don't want to patch Redmine's core. So, I'm writing a plugin which replaces this method with my own implementation, which does the same thing, but three is changed to six.
Here's the problem: if this file updates, outdated method will be used. Is there any way to check what's written inside method in runtime (for example, when server starts)?
By the way, is there any method to directly modify this constant without overriding the whole method?
If you are able to access the file, then you can use the pry gem to check the source code. Or without such gem, you can manually check the location of the method by doing puts method(:foo).source_location, and read that part.
The easiest for you to change the behaviour is to override the entire method.
No, there is no way to get a method's source code at runtime in Ruby. On some Ruby implementations there may be a way to get the source code which works some of the time, but that will be neither reliable nor portable. After all, there isn't even a guarantee that the source code will even be Ruby, since most implementations allow you to write methods in languages other than Ruby (e.g. C in YARV, Java in JRuby etc.)

Adding a "source" attribute to ruby objects using Rubinius

I'm attempting to (for fun and profit) add the ability to inspect objects in ruby and discover their source code. Not the generated bytecode, and not some decompiled version of the internal representation, but the actual source that was parsed to create that object.
I was up quite late learning about Rubinius, and while I don't have my head around it yet fully, I think I've made some progress.
I'm having trouble figuring out how to do this, though. My first approach was to simply add another instance attribute to the AST structures (for, say, a ClosedScope object). Then, somehow pull that attribute out again when the bytecode is interpreted at runtime.
Does this seem like a sound approach?
As Mr Samuel says, you can just use pry and do show-source foo. But perhaps you'd like to know how it works under the hood.
Ruby provides two things that are useful: firstly you can get a list of all methods on an object. Just call foo.methods. Secondly it provides a file_name and line_number attribute for each method.
To find the entire source code for an object, we scan through all the methods and group them by where they are defined. We then scan up the file back until we see class or module or a few other ways rubyists use to define methods. We then scan forward in each file until we have identified the entire class/module definition.
As dgitized points out we often end up with multiple such definitions, if people have monkey patched core objects. By default pry only shows the module definition which contains most methods; but you can request the others with show-source -a.
Have you looked into Pry? It is a Ruby interpreter/debugger that claims to be able to do just what you've asked.
have you tried set_trace_func? It's not rubinius specific, but does what you ask and isn't based on pry or some other gem.
see http://www.ruby-doc.org/core-1.9.3/Kernel.html#method-i-set_trace_func

Resources