Is it theoretically safe to redefine any Ruby method that doesn't begin with underscores? - ruby

For example, is it theoretically safe to modify Object#object_id since there's always Object#__id__ if you really need to know what an object's id is?
Background: Curiosity piqued by What's another name for object_id?

In an ideal system where everything is perfectly documented and all people working with the code are aware of what's been re-defined and patched - then yes, maybe.
But as we all know, such situations rarely exist. Personally, I feel that patching anything already defined in Kernel, Class, Module or Object is a no-no (unless you're doing it at a framework level.)
Ultimately, I believe that Principle of Least Surprise should permeate coding decisions at all levels.

is it theoretically safe to modify Object#object_id
Well, I think we are probably more concerned with reality than theory here. The fact is, people aren't going to use the __X__ version until they realize that you have overriden and completely jacked up the default behavior. With power comes responsibility; use monkey-patching carefully and never introduce unexpected behavior. Better just to add a new method to the class instead.

Related

Modifying core classes in a gem

I'm developing a Gem which I think will be useful for more people than just me. One issue I'm facing is that I need to merge nested hashes. I found this useful Gist to accomplish that, but now I'm wondering if it's okay to modify Hash# like this in a Gem?
I'm sure there's a community "standard" or best practices that either accepts or declines this sort of code, so I'm turning to SO for guidance.
Thank you.
When in doubt, just subclass Hash and include the Module in your subclass. You should particularly do this, when you override methods or change the behavior dramatically.
But i don't see why you shouldn't just modify the Hash class. Rails, for example, extends Core classes rigorously and i've never heard someone complain. You might take a look at how Rails' activesupport extends core classes:
https://github.com/rails/rails/tree/master/activesupport/lib/active_support/core_ext
Just make sure not to break existing behavior, so the users of your gem will not experience unwanted side-effects.
You should always have a good reason for extending core class. I'd say that deep merging is good enough reason. To add new Hash method for deep merging like Hash#deep_merge - why not.
But I don't believe it justifies modifying existing method - overriding Hash#merge to support deep merge. People expect this method to behave certain way and if it suddenly behaves differently after requiring your gem, they won't like it. If you absolutely must do it, describe it at the prominent place in your documentation.
UPDATE: just read the comments under the other answer. Yeah, best solution in this case is simply to include activesupport/lib/active_support/core_ext/hash/deep_merge :)

Do ruby gems ever conflict?

As a ruby newbie, I was wondering, will gems ever conflict with eachother? For example, if 2 gems overrode the << method on array, which would win, or is there something to stop this?
Thanks
I assume you are talking about redefining methods, not overriding them, right? If two libraries overrode the same method in two different subclasses, there wouldn't be any problem.
If two or more libraries redefine the same method, then whichever one happens to be loaded last wins. In fact, this is actually no different than just one library redefining a method: the Ruby interpreter provides an implementation of Array#<< for you, and if you redefine it, your definition wins, simply because it came later.
The best way to stop this is simple: don't run around screwing with existing methods. And don't use libraries that do. The -w commandline flag to enable warnings is very helpful there, since at least in Ruby 1.9.2 it prints a warning if methods get redefined.
In Ruby 2.0, there will probably be some kind of mechanism to isolate method (re-)definitions into some kind of namespace. I wouldn't hold my breath, though: these so-called selector namespaces have been talked about in the Ruby community for almost 10 years now, and in the Smalltalk community even longer than that, and AFAIK nobody has ever produced a working implementation or even a working design for Ruby. A newer idea is the idea of Classboxes.
As far as I can tell, you're talking about monkeypatching (also known as duck punching in the ruby community).
This article has another example of monkeypatching (and other practices) gone bad.
In practice, no, though you could probably construct a situation like that if you really tried. Here's an interesting article (though old) that explains how this could happen.
If two gems "overrode the << method on array" they would need to be subclassing Array, and those classes would have different names or be in different modules.

What are the things you would like improved in the Ruby language?

What are the things you wish Ruby (and more generally the Ruby community) would improve?
I read somewhere that Ruby is the love-child of Smalltalk and LISP, with Miss Perl as the Nanny.
I have a lot of respect for Ruby's parents, but I'm not sure I like the influence Miss Perl had on the child. Specifically, I don't like the predefined variables: I need a cheat sheet to know what they mean. You could say "just don't use them". Well, I don't... but other people do. And when I download a plugin on the Web, I have no choice but to fetch my cheat-sheet if I ever need to go and touch the source code. I just wish they would remove those from the language itself.
Also, I think that Ruby is too much of a moving target. My code breaks on every new Ruby upgrade, even on minor releases. This is true also of Ruby on Rails and most Rails plugins I have worked with: they just change all the time, and nobody seems to care whether the changes break everything or not. IMHO, although I love a lot of things in Ruby, this lack of stability is almost a show-stopper.
I wish people would consider backward compatibility between minor releases as an unbreakable rule when releasing a new language (or library or framework) version.
I wish that some of the lesser used modules of the standard library were documented.
Make require-ing files less painful. Don't ask me how, but maybe have one file dedicated to knowing the paths involved and just get rid of the relative path crud from everything else.
Getting rid of the artificial distinction between Modules and Classes would be nice.
Both Modules and Classes are Namespaces. Modules are also Mixins, while Classes aren't. Classes can also be instantiated while Modules can't. This distinction is unnecessary. Just get rid of Modules and allow Classes to be used as Mixins.
An example of a language where this works is Newspeak.
I'd appreciate being able to install ruby 1.9 as an RPM rather than having to use the source.
Make Ruby completely Message Sending based, get rid of everything that is not a message send: local variables, global variables, instance variables, class hierarchy variables, constants, magic globals, magic constants, builtin operators, builtin keywords, even literals. See Self, Ioke or Newspeak for the incredible power and elegance this gains.
I wish they would get rid of the predefined variables: $!, $&, $+, etc.
I would like to have support for static compile-time metaprogramming. The Converge Programming Language might be a good starting point.
Replace the Mixin system with a Traits system.
Replace Exceptions with a Common Lisp style Conditions system.

Monkey-patching Vs. S.O.L.I.D. principles?

I'm slowly moving from PHP5 to Python on some personal projects, and I'm currently loving the experience. Before choosing to go down the Python route I looked at Ruby. What I did notice from the ruby community was that monkey-patching was both common and highly-regarded. I also came across a lot of horror stories regarding the trials of debugging ruby s/w because someone included a relatively harmless library to do a little job but which patched some heavily used core object without telling anyone.
I chose Python for (among other reasons) its cleaner syntax and the fact that it could do everything Ruby can. Python is making OO click much better than PHP ever has, and I'm reading more and more on OO principles to enhance this better understanding.
This evening I've been reading about Robert Martin's SOLID principles:
Single responsibility principle,
Open/closed principle,
Liskov substitution principle,
Interface segregation principle, and
Dependency inversion principle
I'm currently up to O: SOFTWARE ENTITIES (CLASSES, MODULES, FUNCTIONS, ETC.) SHOULD BE OPEN FOR EXTENSION, BUT CLOSED FOR MODIFICATION.
My head's in a spin over the conflict between ensuring consistency in OO design and the whole monkey-patching thing. I understand that its possible to do monkey-patching in Python. I also understand that being "pythonic" is to follow common, well-tested, oop best-practices & principles.
What I'd like to know is the community's opinion on the two opposing subjects; how they interoperate, when its best to use one over the other, whether the monkey-patching should be done at all... hopefully you can provide a resolution to the matter for me.
There's a difference between monkey-patching (overwriting or modifying pre-existing methods) and simple addition of new methods. I think the latter is perfectly fine, and the former should be looked at suspiciously, but I'm still in favour of keeping it.
I've encountered quite a few those problems where a third party extension monkeypatches the core libraries and breaks things, and they really do suck. Unfortunately, they all invariably seem stem from the the third party extension developers taking the path of least resistance, rather than thinking about how to actually build their solutions properly.
This sucks, but it's no more the fault of monkey patching than it's the fault of knife makers that people sometimes cut themselves.
The only times I've ever seen legitimate need for monkey patching is to work around bugs in third party or core libraries. For this alone, it's priceless, and I really would be disappointed if they removed the ability to do it.
Timeline of a bug in a C# program we had:
Read strange bug reports and trace problem to a minor bug in a CLR library.
Invest days coming up with a workaround involving catching exceptions in strange places and lots of hacks which compromises the code a lot
Spend days extricating hacky workaround when Microsoft release a service pack
Timeline of a bug in a rails program we had:
Read strange bug reports and trace problem to a minor bug in a ruby standard library
Spend 15 minutes performing minor monkey-patch to remove bug from ruby library, and place guards around it to trip if it's run on the wrong version of ruby.
Carry on with normal coding.
Simply delete monkeypatch later when next version of ruby is released.
The bugfixing process looks similar, except with monkeypatching, it's a 15 minute solution, and a 5-second 'extraction' whereas without it, pain and suffering ensues.
PS: The following example is "technically" monkeypatching, but is it "morally" monkeypatching? I'm not changing any behaviour - this is more or less just doing AOP in ruby...
class SomeClass
alias original_dostuff dostuff
def dostuff
# extra stuff, eg logging, opening a transaction, etc
original_dostuff
end
end
In my view, monkeypatching is useful to have but something that can be abused. People tend to discover it and feel like it should be used in every situation, where perhaps a mixin or other construct may be more appropriate.
I don't think it's something that you should outlaw, it's just something that the Ruby guys like to use. You can do similar things with Python but the community has taken the stance that things should be simpler and more obvious.
Monkey patching is not ruby-explicit, its done all over javascript too, with negative (IMO) effects.
My personal opinion is monkey patching should only be done to
a) Add functionality to an old version of a language which is available in the new version of the language which you need.
b) When there is no other "logical" place for it.
There are many many easy ways to make monkey patching really awful, such as the ability to change how basic functions such as ADDITION work.
My stance is, if you can avoid it, do so.
If you can avoid it in a nice way, kudos to you.
If you can't avoid it, get the opinion of 200 people because you probably just have not thought about it hard enough.
My pet hate is mootools extending the function object. Yes, you can do this. Instead of people just learning how javascript works:
setTimeout(function(){
foo(args);
}, 5000 );
There was added a new method to every function object, ( yes, im not joking ) so that functions now have their own functions.
foo.delay( 5000 , args );
Which had the additional effect of this sort of crap being valid:
foo.delay.delay( 500, [ 500, args ] );
And on like that ad infinitum.
The result? You no longer have a library, and a language, your langauge bows to the library and if the library happens to be in scope, you no longer have a language, and you cant just do things the way that they were done when you learn the language, and instead have to learn a new subset of commands just to not have it fall flat on its face ( at the cost of excessive slowdowns! )
may i note that foo.delay also returned an object, with its own methods, so you could do
x = foo.delay( 500, args );
x.clear();
and even
x.clear.delay(10);
which may sound overly useful, ... but you have to take into consideration the massive overhead used to make this viable.
clearTimeout(x);
SO HARD!
(Disclaimer: its been a while since I used moo, and have tried to forget it, and function names/structure may be incorrect. This is not an API reference. Please check their site for details ( sorry, their API reference sucks! ))
Mokeypatching is generally wrong. Create a proper subclass and add the methods.
I've used monkeypatching once in production code.
The issue is that REST uses GET, POST, PUT and DELETE. But the Django test client only offers GET and POST. I've monkeypatched methods for PUT (like POST) and DELETE (like GET).
Because of the tight binding between Django client and the Django test driver, it seemed easiest to monkeypatch it to support full REST testing.
You may find enlightening this discussion about Ruby's open classes and the Open-Closed Principle.
Even though I like Ruby, I feel monkey-patching is a tool of last resort to get things done. All things being equal, I prefer using traditional OO techniques with a sprinkle of functional programming goodness.
In my eyes, monkey-patching is one form of AOP. The article Aspect-Oriented Design Principles: Lessons from Object-Oriented Design (PDF) gives some ideas of how SOLID and other OOP principles can be applied to AOP.
My first thought is that monkey-patching violates OCP, since clients of a class should be able to expect that class to work consistently.
Monkey-patching is just plain wrong, IMHO. I've not come across the open/closed principle you mention before, but it's a principle I've long held myself, I agree with it 100%. I think of monkey-patching as a code-smell on a larger scale, a coding-philosophy-smell, as it were.

How can I program defensively in Ruby?

Here's a perfect example of the problem: Classifier gem breaks Rails.
** Original question: **
One thing that concerns me as a security professional is that Ruby doesn't have a parallel of Java's package-privacy. That is, this isn't valid Ruby:
public module Foo
public module Bar
# factory method for new Bar implementations
def self.new(...)
SimpleBarImplementation.new(...)
end
def baz
raise NotImplementedError.new('Implementing Classes MUST redefine #baz')
end
end
private class SimpleBarImplementation
include Bar
def baz
...
end
end
end
It'd be really nice to be able to prevent monkey-patching of Foo::BarImpl. That way, people who rely on the library know that nobody has messed with it. Imagine if somebody changed the implementation of MD5 or SHA1 on you! I can call freeze on these classes, but I have to do it on a class-by-class basis, and other scripts might modify them before I finish securing my application if I'm not very careful about load order.
Java provides lots of other tools for defensive programming, many of which are not possible in Ruby. (See Josh Bloch's book for a good list.) Is this really a concern? Should I just stop complaining and use Ruby for lightweight things and not hope for "enterprise-ready" solutions?
(And no, core classes are not frozen by default in Ruby. See below:)
require 'md5'
# => true
MD5.frozen?
# => false
I don't think this is a concern.
Yes, the mythical "somebody" can replace the implementation of MD5 with something insecure. But in order to do that, the mythical somebody must actually be able to get his code into the Ruby process. And if he can do that, then he presumably could also inject his code into a Java process and e.g. rewrite the bytecode for the MD5 operation. Or just intercept the keypresses and not actually bother with fiddling with the cryptography code at all.
One of the typical concerns is: I'm writing this awesome library, which is supposed to be used like so:
require 'awesome'
# Do something awesome.
But what if someone uses it like so:
require 'evil_cracker_lib_from_russian_pr0n_site'
# Overrides crypto functions and sends all data to mafia
require 'awesome'
# Now everything is insecure because awesome lib uses
# cracker lib instead of builtin
And the simple solution is: don't do that! Educate your users that they shouldn't run untrusted code they downloaded from obscure sources in their security critical applications. And if they do, they probably deserve it.
To come back to your Java example: it's true that in Java you can make your crypto code private and final and what not. However, someone can still replace your crypto implementation! In fact, someone actually did: many open-source Java implementations use OpenSSL to implement their cryptographic routines. And, as you probably know, Debian shipped with a broken, insecure version of OpenSSL for years. So, all Java programs running on Debian for the past couple of years actually did run with insecure crypto!
Java provides lots of other tools for defensive programming
Initially I thought you were talking about normal defensive programming,
wherein the idea is to defend the program (or your subset of it, or your single function) from invalid data input.
That's a great thing, and I encourage everyone to go read that article.
However it seems you are actually talking about "defending your code from other programmers."
In my opinion, this is a completely pointless goal, as no matter what you do, a malicious programmer can always run your program under a debugger, or use dll injection or any number of other techniques.
If you are merely seeking to protect your code from incompetent co-workers, this is ridiculous. Educate your co-workers, or get better co-workers.
At any rate, if such things are of great concern to you, ruby is not the programming language for you. Monkeypatching is in there by design, and to disallow it goes against the whole point of the feature.
Check out Immutable by Garry Dolley.
You can prevent redefinition of individual methods.
I guess Ruby has that a feature - valued more over it being a security issue. Ducktyping too.
E.g. I can add my own methods to the Ruby String class rather than extending or wrapping it.
"Educate your co-workers, or get better co-workers" works great for a small software startup, and it works great for the big guns like Google and Amazon. It's ridiculous to think that every lowly developer contracted in for some small medical charts application in a doctor's office in a minor city.
I'm not saying we should build for the lowest common denominator, but we have to be realistic that there are lots of mediocre programmers out there who will pull in any library that gets the job done, paying no attention to security. How could they pay attention to security? Maybe the took an algorithms and data structures class. Maybe they took a compilers class. They almost certainly didn't take an encryption protocols class. They definitely haven't all read Schneier or any of the others out there who practically have to beg and plead with even very good programmers to consider security when building software.
I'm not worried about this:
require 'evil_cracker_lib_from_russian_pr0n_site'
require 'awesome'
I'm worried about awesome requiring foobar and fazbot, and foobar requiring has_gumption, and ... eventually two of these conflict in some obscure way that undoes an important security aspect.
One important security principle is "defense in depth" -- adding these extra layers of security help you from accidentally shooting yourself in the foot. They can't completely prevent it; nothing can. But they help.
If monkey patching is your concen, you can use the Immutable module (or one of similar function).
Immutable
You could take a look at Why the Lucky Stiff's "Sandbox"project, which you can use if you worry about potentially running unsafe code.
http://code.whytheluckystiff.net/sandbox/
An example (online TicTacToe):
http://www.elctech.com/blog/safely-exposing-your-app-to-a-ruby-sandbox
Raganwald has a recent post about this. In the end, he builds the following:
class Module
def anonymous_module(&block)
self.send :include, Module.new(&block)
end
end
class Acronym
anonymous_module do
fu = lambda { 'fu' }
bar = lambda { 'bar' }
define_method :fubar do
fu.call + bar.call
end
end
end
That exposes fubar as a public method on Acronyms, but keeps the internal guts (fu and bar) private and hides helper module from outside view.
If someone monkeypatched an object or a module, then you need to look at 2 cases: He added a new method. If he is the only one adding this meyhod (which is very likely), then no problems arise. If he is not the only one, you need to see if both methods do the same and tell the library developer about this severe problem.
If they change a method, you should start to research why the method was changed. Did they change it due to some edge case behaviour or did they actually fix a bug? especially in the latter case, the monkeypatch is a god thing, because it fixes a bug in many places.
Besides that, you are using a very dynamic language with the assumption that programmers use this freedom in a sane way. The only way to remove this assumption is not to use a dynamic language.

Resources