Related
The following code is valid:
(1..5).to_a
(1..5) is a Range. The method to_a appears to convert a range to an array.
However, the documentation for Range does not document. Since this documentation is presumably auto-generated from the source with Yard, I doubt it could not be in the list of methods. Is there auto conversion going on?
How is the above legal Ruby?
The following code is valid ruby:
b = (1..5).to_a
(1..5) is a Range object, and b is an Array object. The official(?) documentation for the Class Range does not document the method to_a, which appears to convert a range to an array.
So, how is the above legal Ruby?
Ruby has something called "inheritance". Inheritance is a method for differential code-reuse that actually does not only exist in Ruby, but is in fact quite popular in many languages such as Java, C♯, C++, Python, PHP, Scala, Kotlin, Ceylon, and so on and so forth.
Inheritance allows you to define methods in one place, and then inherit them in another place, overriding and defining only the methods whose behavior differs. Hence, "differential code re-use".
In this particular case, the method you are looking at is Enumerable#to_a.
Note: Ruby actually has two forms of inheritance, mixin inheritance and class inheritance. Mixin inheritance is like class inheritance where the mixin doesn't know its superclass. (The definitive resource about mixin inheritance is Gilad Bracha's PhD Thesis The Programming Language Jigsaw – Mixins, Modularity, and Multiple Inheritance.)
The official(?) documentation
Actually, ruby-doc is a third-party site. There is no official documentation site. (However, the documentation on ruby-doc is generated from documentation comments in YARV, one of the major Ruby implementations, and one that Yukihiro Matsumoto actively contributes to.)
Since this documentation is presumably autogenerated from the source with Yard,
It is actually autogenerated from the YARV source with RDoc, not YARD.
I'm confused how it could not be in the list of methods,
Note that the explanation I gave is the correct one, but there could be many other explanations. The most simple one: the reason why the method is not in the documentation is that it is not documented. Another reason could be a bug in the documentation generator that prevents the documentation for that method being displayed, a typo in the source code that prevents the documentation generator from recognizing the documentation, and many, many others.
so is there auto conversion going on?
No. Ruby does not have automatic conversions in the language. There are some methods which for reasons of efficiency are implemented in a non-object-oriented manner, and require an object to be an instance of a specific class (as opposed to the object-oriented manner, which would only require the object to conform to a specific protocol). Because it is a severe restriction to require an object to be an instance of a specific class, those methods will usually offer an "escape hatch" to the programmer, by sending a specific message and allowing the object to respond with an object of the required class.
For example, all methods printing something to the console, require an instance of String, but they will try sending to_s first, before rejecting the argument.
These are sometimes called "implicit conversions", but they have nothing to do with implicit conversions as in Scala or implicit casts as in C♯. They are in fact not even part of the language, they are just a convention for library writers.
As a ruby newbie, I was wondering, will gems ever conflict with eachother? For example, if 2 gems overrode the << method on array, which would win, or is there something to stop this?
Thanks
I assume you are talking about redefining methods, not overriding them, right? If two libraries overrode the same method in two different subclasses, there wouldn't be any problem.
If two or more libraries redefine the same method, then whichever one happens to be loaded last wins. In fact, this is actually no different than just one library redefining a method: the Ruby interpreter provides an implementation of Array#<< for you, and if you redefine it, your definition wins, simply because it came later.
The best way to stop this is simple: don't run around screwing with existing methods. And don't use libraries that do. The -w commandline flag to enable warnings is very helpful there, since at least in Ruby 1.9.2 it prints a warning if methods get redefined.
In Ruby 2.0, there will probably be some kind of mechanism to isolate method (re-)definitions into some kind of namespace. I wouldn't hold my breath, though: these so-called selector namespaces have been talked about in the Ruby community for almost 10 years now, and in the Smalltalk community even longer than that, and AFAIK nobody has ever produced a working implementation or even a working design for Ruby. A newer idea is the idea of Classboxes.
As far as I can tell, you're talking about monkeypatching (also known as duck punching in the ruby community).
This article has another example of monkeypatching (and other practices) gone bad.
In practice, no, though you could probably construct a situation like that if you really tried. Here's an interesting article (though old) that explains how this could happen.
If two gems "overrode the << method on array" they would need to be subclassing Array, and those classes would have different names or be in different modules.
I'm expanding my Ruby understanding by coding an equivalent of Kent Beck's xUnit in Ruby. Python (which Kent writes in) has an assert() method in the language which is used extensively. Ruby does not. I think it should be easy to add this but is Kernel the right place to put it?
BTW, I know of the existence of the various Unit frameworks in Ruby - this is an exercise to learn the Ruby idioms, rather than to "get something done".
No it's not a best practice. The best analogy to assert() in Ruby is just raising
raise "This is wrong" unless expr
and you can implement your own exceptions if you want to provide for more specific exception handling
I think it is totally valid to use asserts in Ruby. But you are mentioning two different things:
xUnit frameworks use assert methods for checking your tests expectations. They are intended to be used in your test code, not in your application code.
Some languages like C, Java or Python, include an assert construction intended to be used inside the code of your programs, to check assumptions you make about their integrity. These checks are built inside the code itself. They are not a test-time utility, but a development-time one.
I recently wrote solid_assert: a little Ruby library implementing a Ruby assertion utility and also a post in my blog explaining its motivation. It lets you write expressions in the form:
assert some_string != "some value"
assert clients.empty?, "Isn't the clients list empty?"
invariant "Lists with different sizes?" do
one_variable = calculate_some_value
other_variable = calculate_some_other_value
one_variable > other_variable
end
And they can be deactivated, so assert and invariant get evaluated as empty statements. This let you avoid performance problems in production. But note that The Pragmatic Programmer: from journeyman to master recommends against deactivating them. You should only deactivate them if they really affect the performance.
Regarding the answer saying that the idiomatic Ruby way is using a normal raise statement, I think it lacks expressivity. One of the golden rules of assertive programming is not using assertions for normal exception handling. They are two completely different things. If you use the same syntax for the two of them, I think your code will be more obscure. And of course you lose the capability of deactivating them.
Some widely-regarded books that dedicate whole sections to assertions and recommend their use:
The Pragmatic Programmer: from Journeyman to Master by Andrew Hunt and David Thomas
Code Complete: A Practical Handbook of Software Construction by Steve McConnell
Writing Solid Code by Steve Maguire
Programming with
assertions
is an article that illustrates well what assertive programming is about and
when to use it (it is based in Java, but the concepts apply to any
language).
What's your reason for adding the assert method to the Kernel module? Why not just use another module called Assertions or something?
Like this:
module Assertions
def assert(param)
# do something with param
end
# define more assertions here
end
If you really need your assertions to be available everywhere do something like this:
class Object
include Assertions
end
Disclaimer: I didn't test the code but in principle I would do it like this.
It's not especially idiomatic, but I think it's a good idea. Especially if done like this:
def assert(msg=nil)
if DEBUG
raise msg || "Assertion failed!" unless yield
end
end
That way there's no impact if you decide not to run with DEBUG (or some other convenient switch, I've used Kernel.do_assert in the past) set.
My understanding is that you're writing your own testing suite as a way of becoming more familiar with Ruby. So while Test::Unit might be useful as a guide, it's probably not what you're looking for (because it's already done the job).
That said, python's assert is (to me, at least), more analogous to C's assert(3). It's not specifically designed for unit-tests, rather to catch cases where "this should never happen".
How Ruby's built-in unit tests tend to view the problem, then, is that each individual test case class is a subclass of TestCase, and that includes an "assert" statement which checks the validity of what was passed to it and records it for reporting.
I'm slowly moving from PHP5 to Python on some personal projects, and I'm currently loving the experience. Before choosing to go down the Python route I looked at Ruby. What I did notice from the ruby community was that monkey-patching was both common and highly-regarded. I also came across a lot of horror stories regarding the trials of debugging ruby s/w because someone included a relatively harmless library to do a little job but which patched some heavily used core object without telling anyone.
I chose Python for (among other reasons) its cleaner syntax and the fact that it could do everything Ruby can. Python is making OO click much better than PHP ever has, and I'm reading more and more on OO principles to enhance this better understanding.
This evening I've been reading about Robert Martin's SOLID principles:
Single responsibility principle,
Open/closed principle,
Liskov substitution principle,
Interface segregation principle, and
Dependency inversion principle
I'm currently up to O: SOFTWARE ENTITIES (CLASSES, MODULES, FUNCTIONS, ETC.) SHOULD BE OPEN FOR EXTENSION, BUT CLOSED FOR MODIFICATION.
My head's in a spin over the conflict between ensuring consistency in OO design and the whole monkey-patching thing. I understand that its possible to do monkey-patching in Python. I also understand that being "pythonic" is to follow common, well-tested, oop best-practices & principles.
What I'd like to know is the community's opinion on the two opposing subjects; how they interoperate, when its best to use one over the other, whether the monkey-patching should be done at all... hopefully you can provide a resolution to the matter for me.
There's a difference between monkey-patching (overwriting or modifying pre-existing methods) and simple addition of new methods. I think the latter is perfectly fine, and the former should be looked at suspiciously, but I'm still in favour of keeping it.
I've encountered quite a few those problems where a third party extension monkeypatches the core libraries and breaks things, and they really do suck. Unfortunately, they all invariably seem stem from the the third party extension developers taking the path of least resistance, rather than thinking about how to actually build their solutions properly.
This sucks, but it's no more the fault of monkey patching than it's the fault of knife makers that people sometimes cut themselves.
The only times I've ever seen legitimate need for monkey patching is to work around bugs in third party or core libraries. For this alone, it's priceless, and I really would be disappointed if they removed the ability to do it.
Timeline of a bug in a C# program we had:
Read strange bug reports and trace problem to a minor bug in a CLR library.
Invest days coming up with a workaround involving catching exceptions in strange places and lots of hacks which compromises the code a lot
Spend days extricating hacky workaround when Microsoft release a service pack
Timeline of a bug in a rails program we had:
Read strange bug reports and trace problem to a minor bug in a ruby standard library
Spend 15 minutes performing minor monkey-patch to remove bug from ruby library, and place guards around it to trip if it's run on the wrong version of ruby.
Carry on with normal coding.
Simply delete monkeypatch later when next version of ruby is released.
The bugfixing process looks similar, except with monkeypatching, it's a 15 minute solution, and a 5-second 'extraction' whereas without it, pain and suffering ensues.
PS: The following example is "technically" monkeypatching, but is it "morally" monkeypatching? I'm not changing any behaviour - this is more or less just doing AOP in ruby...
class SomeClass
alias original_dostuff dostuff
def dostuff
# extra stuff, eg logging, opening a transaction, etc
original_dostuff
end
end
In my view, monkeypatching is useful to have but something that can be abused. People tend to discover it and feel like it should be used in every situation, where perhaps a mixin or other construct may be more appropriate.
I don't think it's something that you should outlaw, it's just something that the Ruby guys like to use. You can do similar things with Python but the community has taken the stance that things should be simpler and more obvious.
Monkey patching is not ruby-explicit, its done all over javascript too, with negative (IMO) effects.
My personal opinion is monkey patching should only be done to
a) Add functionality to an old version of a language which is available in the new version of the language which you need.
b) When there is no other "logical" place for it.
There are many many easy ways to make monkey patching really awful, such as the ability to change how basic functions such as ADDITION work.
My stance is, if you can avoid it, do so.
If you can avoid it in a nice way, kudos to you.
If you can't avoid it, get the opinion of 200 people because you probably just have not thought about it hard enough.
My pet hate is mootools extending the function object. Yes, you can do this. Instead of people just learning how javascript works:
setTimeout(function(){
foo(args);
}, 5000 );
There was added a new method to every function object, ( yes, im not joking ) so that functions now have their own functions.
foo.delay( 5000 , args );
Which had the additional effect of this sort of crap being valid:
foo.delay.delay( 500, [ 500, args ] );
And on like that ad infinitum.
The result? You no longer have a library, and a language, your langauge bows to the library and if the library happens to be in scope, you no longer have a language, and you cant just do things the way that they were done when you learn the language, and instead have to learn a new subset of commands just to not have it fall flat on its face ( at the cost of excessive slowdowns! )
may i note that foo.delay also returned an object, with its own methods, so you could do
x = foo.delay( 500, args );
x.clear();
and even
x.clear.delay(10);
which may sound overly useful, ... but you have to take into consideration the massive overhead used to make this viable.
clearTimeout(x);
SO HARD!
(Disclaimer: its been a while since I used moo, and have tried to forget it, and function names/structure may be incorrect. This is not an API reference. Please check their site for details ( sorry, their API reference sucks! ))
Mokeypatching is generally wrong. Create a proper subclass and add the methods.
I've used monkeypatching once in production code.
The issue is that REST uses GET, POST, PUT and DELETE. But the Django test client only offers GET and POST. I've monkeypatched methods for PUT (like POST) and DELETE (like GET).
Because of the tight binding between Django client and the Django test driver, it seemed easiest to monkeypatch it to support full REST testing.
You may find enlightening this discussion about Ruby's open classes and the Open-Closed Principle.
Even though I like Ruby, I feel monkey-patching is a tool of last resort to get things done. All things being equal, I prefer using traditional OO techniques with a sprinkle of functional programming goodness.
In my eyes, monkey-patching is one form of AOP. The article Aspect-Oriented Design Principles: Lessons from Object-Oriented Design (PDF) gives some ideas of how SOLID and other OOP principles can be applied to AOP.
My first thought is that monkey-patching violates OCP, since clients of a class should be able to expect that class to work consistently.
Monkey-patching is just plain wrong, IMHO. I've not come across the open/closed principle you mention before, but it's a principle I've long held myself, I agree with it 100%. I think of monkey-patching as a code-smell on a larger scale, a coding-philosophy-smell, as it were.
Here's a perfect example of the problem: Classifier gem breaks Rails.
** Original question: **
One thing that concerns me as a security professional is that Ruby doesn't have a parallel of Java's package-privacy. That is, this isn't valid Ruby:
public module Foo
public module Bar
# factory method for new Bar implementations
def self.new(...)
SimpleBarImplementation.new(...)
end
def baz
raise NotImplementedError.new('Implementing Classes MUST redefine #baz')
end
end
private class SimpleBarImplementation
include Bar
def baz
...
end
end
end
It'd be really nice to be able to prevent monkey-patching of Foo::BarImpl. That way, people who rely on the library know that nobody has messed with it. Imagine if somebody changed the implementation of MD5 or SHA1 on you! I can call freeze on these classes, but I have to do it on a class-by-class basis, and other scripts might modify them before I finish securing my application if I'm not very careful about load order.
Java provides lots of other tools for defensive programming, many of which are not possible in Ruby. (See Josh Bloch's book for a good list.) Is this really a concern? Should I just stop complaining and use Ruby for lightweight things and not hope for "enterprise-ready" solutions?
(And no, core classes are not frozen by default in Ruby. See below:)
require 'md5'
# => true
MD5.frozen?
# => false
I don't think this is a concern.
Yes, the mythical "somebody" can replace the implementation of MD5 with something insecure. But in order to do that, the mythical somebody must actually be able to get his code into the Ruby process. And if he can do that, then he presumably could also inject his code into a Java process and e.g. rewrite the bytecode for the MD5 operation. Or just intercept the keypresses and not actually bother with fiddling with the cryptography code at all.
One of the typical concerns is: I'm writing this awesome library, which is supposed to be used like so:
require 'awesome'
# Do something awesome.
But what if someone uses it like so:
require 'evil_cracker_lib_from_russian_pr0n_site'
# Overrides crypto functions and sends all data to mafia
require 'awesome'
# Now everything is insecure because awesome lib uses
# cracker lib instead of builtin
And the simple solution is: don't do that! Educate your users that they shouldn't run untrusted code they downloaded from obscure sources in their security critical applications. And if they do, they probably deserve it.
To come back to your Java example: it's true that in Java you can make your crypto code private and final and what not. However, someone can still replace your crypto implementation! In fact, someone actually did: many open-source Java implementations use OpenSSL to implement their cryptographic routines. And, as you probably know, Debian shipped with a broken, insecure version of OpenSSL for years. So, all Java programs running on Debian for the past couple of years actually did run with insecure crypto!
Java provides lots of other tools for defensive programming
Initially I thought you were talking about normal defensive programming,
wherein the idea is to defend the program (or your subset of it, or your single function) from invalid data input.
That's a great thing, and I encourage everyone to go read that article.
However it seems you are actually talking about "defending your code from other programmers."
In my opinion, this is a completely pointless goal, as no matter what you do, a malicious programmer can always run your program under a debugger, or use dll injection or any number of other techniques.
If you are merely seeking to protect your code from incompetent co-workers, this is ridiculous. Educate your co-workers, or get better co-workers.
At any rate, if such things are of great concern to you, ruby is not the programming language for you. Monkeypatching is in there by design, and to disallow it goes against the whole point of the feature.
Check out Immutable by Garry Dolley.
You can prevent redefinition of individual methods.
I guess Ruby has that a feature - valued more over it being a security issue. Ducktyping too.
E.g. I can add my own methods to the Ruby String class rather than extending or wrapping it.
"Educate your co-workers, or get better co-workers" works great for a small software startup, and it works great for the big guns like Google and Amazon. It's ridiculous to think that every lowly developer contracted in for some small medical charts application in a doctor's office in a minor city.
I'm not saying we should build for the lowest common denominator, but we have to be realistic that there are lots of mediocre programmers out there who will pull in any library that gets the job done, paying no attention to security. How could they pay attention to security? Maybe the took an algorithms and data structures class. Maybe they took a compilers class. They almost certainly didn't take an encryption protocols class. They definitely haven't all read Schneier or any of the others out there who practically have to beg and plead with even very good programmers to consider security when building software.
I'm not worried about this:
require 'evil_cracker_lib_from_russian_pr0n_site'
require 'awesome'
I'm worried about awesome requiring foobar and fazbot, and foobar requiring has_gumption, and ... eventually two of these conflict in some obscure way that undoes an important security aspect.
One important security principle is "defense in depth" -- adding these extra layers of security help you from accidentally shooting yourself in the foot. They can't completely prevent it; nothing can. But they help.
If monkey patching is your concen, you can use the Immutable module (or one of similar function).
Immutable
You could take a look at Why the Lucky Stiff's "Sandbox"project, which you can use if you worry about potentially running unsafe code.
http://code.whytheluckystiff.net/sandbox/
An example (online TicTacToe):
http://www.elctech.com/blog/safely-exposing-your-app-to-a-ruby-sandbox
Raganwald has a recent post about this. In the end, he builds the following:
class Module
def anonymous_module(&block)
self.send :include, Module.new(&block)
end
end
class Acronym
anonymous_module do
fu = lambda { 'fu' }
bar = lambda { 'bar' }
define_method :fubar do
fu.call + bar.call
end
end
end
That exposes fubar as a public method on Acronyms, but keeps the internal guts (fu and bar) private and hides helper module from outside view.
If someone monkeypatched an object or a module, then you need to look at 2 cases: He added a new method. If he is the only one adding this meyhod (which is very likely), then no problems arise. If he is not the only one, you need to see if both methods do the same and tell the library developer about this severe problem.
If they change a method, you should start to research why the method was changed. Did they change it due to some edge case behaviour or did they actually fix a bug? especially in the latter case, the monkeypatch is a god thing, because it fixes a bug in many places.
Besides that, you are using a very dynamic language with the assumption that programmers use this freedom in a sane way. The only way to remove this assumption is not to use a dynamic language.