What are the things to be careful about while redefining methods in Ruby? Is redefining core library methods okay?
the problems IMHO are
You'll forget about the change.
You'll copy paste a snippet from the internet, which will trigger an error the altered behavior and you'll scratch your head until you get hairless patches.
Another developer will come after your and fight a bug for 3 months, until he finds it's in one of the monkey patches. He'll go to HR, get your address and and show you why not to do monkey patches.
Now, sometimes you need to monkey patch a class (even in the core library, why not). My suggestion is
Put all of your monkey-patches in ONE source folder.
The second thing you say to a new developer after "Hello my name is ..." is the location of that folder and a detailed explanation of what each monkey patch does.
I don't do much monkeypatching myself, but I've heard that rather than doing
class String
def improved_method
# teh codes
end
end
It's better to put the new method into a module, and then include the module
module ImprovedString
def improved_method
# teh codes
end
end
class String
include ImprovedString
end
it makes it easier to find where a method has been defined, and the old version still exists without having to do alias chaining.
I like the other answers. Though, I have to add that:
Sometimes you may only want to redefine methods only for certain instances. You can do this, and it makes it somehow more controlled than changing the functionality for all objects of a certain class - as long as a proper debugger is used for debugging:
class << object_instance
def method_redefinition
return "method_redefinition"
end
end
object_instance.method_redefinition => "method redefinition"
The metioned set of functionalities can also be encapsulated in a mix-in in order to avoid too much nesting and messy "code definition inside code execution":
module M
def method_redefinition
"method_redefinition"
end
end
object_instance.extend M
object_instance.method_redefinition => "method_redefinition"
You're talking about monkey patching and it's dangerous for the following reasons according to wikipedia
Carelessly written or poorly
documented monkey patches can lead to
problems:
They can lead to upgrade problems when the patch makes assumptions about
the patched object that are no longer
true; if the product you have changed
changes with a new release it may very
well break your patch. For this reason
monkey patches are often made
conditional, and only applied if
appropriate.
If two modules attempt to monkey-patch the same method, one of
them (whichever one runs last) "wins"
and the other patch has no effect,
unless monkeypatches are written with
pattern like alias_method_chain
They create a discrepancy between the original source code on disk and
the observed behaviour that can be
very confusing to anyone unaware of
the patches' existence.
Even if monkey patching isn't used,
some see a problem with the
availability of the feature, since the
ability to use monkey patching in a
programming language is incompatible
with enforcing strong encapsulation,
as required by the object-capability
model, between objects.
There's talk of a safer way of monkey
patching coming in ruby 2 called
refinements
This tiny gem might be useful if you're finding yourself running into monkey-patching issues: (https://github.com/gnovos/ctx). I originally wrote it to create more expressive DSLs by allowing alterations to the base objects without doing too much damage elsewhere, but it can probably be put to any number of uses. It gets around some of the monkey-patching issues by scoping method re-definition into arbitrary contexts that can be swapped out as needed.
If I want to redefine a method in some core class (for example, in String, etc), I use ctx_define instead of "def" and then later wrap the section of code that should use the new definition in a ctx block, like so:
class ::String
ctx_define :dsl, :+ do |other|
"#{self[0].upcase}#{self[1..-1]}#{other.capitalize}"
end
end
puts "this" + "is" + "normal" + "text" + "concatination"
# => thisisnormaltextconcatination
ctx(:dsl) { puts "this" + "is" + "special" + "text" + "concatination" }
# => ThisIsSpecialTextConcatination
I only threw it together in a few minutes, so I can't make any guarantees about how robust it is in any number of complicated situations, but it seems to work fine for simple needs. Give it a look if you're interested and see if it is of any help. :)
Related
Although I agree that extending native types and objects is a bad practice, inheriting from them should not be.
In a supposedly supporting gem (that I could not find), the way that the native types were to be used would be as follows:
require 'cool-unkown-light-gem'
class MyTypedArray < CoolArray # would love to directly < Array
def initialize(*args)
super(*args)
# some inits for DataArray
#caches_init = false
end
def name?(name)
init_caches unless !#caches_init
!!#cache_by_name[name]
end
def element(name)
init_caches unless !#caches_init
#cache_by_name[name]
end
private
# overrides the CoolArray method:
# CoolArray methods that modify self will call this method
def on_change
#caches_init = false
super
end
def init_caches
return #cache_by_name if #caches_init
#caches_init = true
#cache_by_name = self.map do |elem|
[elem.unique_name, elem]
end.to_h
end
end
Any method of the parent class not overridden by the child class that modifies self would call, let's say (in this case), the on_change function. Which would allow to do not have to re-define every single one of those methods to avoid losing track on changes.
Let's say the MyTypedArray would array Foo objects:
class Foo
attr_reader :unique_name
def initialize(name)
#unique_name = name
end
end
a short example of the expected behaviour of its usage:
my_array = MyTypedArray.new
my_array.push( Foo.new("bar") ).push( Foo.new("baz") )
my_array.element("bar").unique_name
# => "bar"
my_array.shift # a method that removes the first element from self
my_array.element("bar").unique_name
# => undefined method `unique_name' for nil:NilClass (NoMethodError)
my_array.name?("bar")
# => false
I understand that we should search for immutable classes, yet those native types support changes on the same object and we want a proper way to do an inheritance that is as brief and easy as possible.
Any thoughts, approaches, or recommendations are more than welcome, of course. I do not think I am the only one that have thought on this.
The reason why I am searching for a maintained gem is because different ruby versions may offer different supported methods or options for native types / classes.
[Edit]
The aim of the above is to figure out a pattern that works. I could just follow the rules and suggestions of other posts, yet would not get things work the way I am intended and when I see it proper (a coding language is made by and for humans, and not humans made for coding languages). I know everyone is proud of their achievements in learning, developing and making things shaped in a pattern that is well known in the community.
The target of the above is because all the methods of Array are more than welcome. I do not care if in the version 20 of Ruby they remove some methods of Array. By then my application will be obsolete or someone will achieve the same result in far less code.
Why Array?
Because the order matters.
Why an internal Hash?
Because for the usage I want to make of it, in overall, the cost of building the hash compensates the optimization it offers.
Why not just include Enumerable?
Because we just reduce the number of methods that change the object, but we do not actually have a pattern that allows to change #caches_init to false, so the Hash is rebuilt on next usage (so same problem as with Array)
Why not just whitelist and include target Array methods?
Because that does not get me where I want to be. What if I want anyone to still use pop, or shift but I do not want to redefine them, or even having to bother to manage my mixins and constantly having to use responds_to?? (perhaps that exercise is good to improve your skills in coding and read code from other people, but that is not what it should be)
Where I want to be?
I want to be in a position that I can re-use / inherit any, I repeat, any class (no matter if it is native or not). That is basic for an OOP language. And if we are not talking about an OOP language (but just some sugar at the top of it to make it appear as OOP), then let's keep ourselves open to analyse patterns that should work well (no matter if they are odd - for me is more odd that there are no intermediate levels; which is symptom of many conventional patterns, which in turn is symptom of poor support for certain features that are more widely required than what is accepted).
Why should a gem offer the above?
Well, let's humble it. The above is a very simple case (and even though not covered). You may gain in flexibility at some point by using what some people want to call the Ruby way. But at a cost when you move to bigger architectures. What if I want to create intermediate classes to inherit from? Enriched native classes that boost simple code, yet keeping it aligned with the language. It is easier to say this is not the Ruby way than trying to make the language closer to something that escalates well from the bottom.
I am not surprised that Rails and Ruby are almost "indistinctly" used by many. Because at some point, without some Rails support, what you have with Ruby is a lot of trouble. As, consequently, I am not surprised that Rails is so maintained.
Why should I redefine a pop, or a last, or first methods? For what? They are already implemented.
Why should I whitelist methods and create mixins? is that a object or method oriented programming?
Anyway... I do not expect anyone to share my view on this. I do see other patterns, and I will keep allowing my mind to find them. If anyone is open enough, please, feel free to share. Someone may criticize the approach and be right, but if you got there is because it worked.
To answer your question as it is written, no, there is no gem for this. This is not a possibility of the language, either in pure Ruby or in C which is used internally.
There is no mechanism in detect when self is changed, nor any way to detect if a method is pure (does not change self) or impure (does change self). It seems you want a way to "automatically" be able to know when a method is one or the other, and that, to put simply, is just not possible, nor is it in any language that I am aware of.
Internally (using your example) an Array is backed by a RArray structure in C. A struct is simple storage space: a way to look at an arbitrary block of memory. C does not care how you choose to look at memory, I could just as easily cast the pointer of this struct and say it is a now a pointer to an array of integers and change it that way, it will happily manipulate the memory as I tell it to, and there is nothing that can detect that I did so. Now add in the fact that anyone, any script, or any gem can do this and you have no control over it, and it just shows that this solution is fundamentally and objectively flawed.
This is why most (all?) languages that need to be notified when an object is changed use an observer pattern. You create a function that "notifies" when something changes, and you invoke that function manually when needed. If someone decides to subclass your class, they need only continue the pattern to raise that function if it changes the object state.
There is no such thing as an automatic way of doing this. As already explained, this is an "opt-in" or "whitelist" solution. If you want to subclass an existing object instead of using your own from scratch, then you need to modify its behavior accordingly.
That said, adding the functionality is not as daunting as you may think if you use some clever aliasing and meta-programming with module_eval, class_eval or the like.
# This is 100% untested and not even checked for syntax, just rough idea
def on_changed
# Do whatever you need here when object is changed
end
# Unpure methods like []=, <<, push, map!, etc, etc
unpure_methods.each do |name|
class_eval <<-EOS
alias #{name}_orig #{name}
def #{name}(*args, &block)
#{name}_orig(*args, &block)
on_changed
end
EOS
end
I'm working on a ruby project in which we are planning to do some operations with ruby strings. Some operations are simple (like counting the number of words) and others more complex (like checking if a given string is in the correct language).
A possible way to implement this is by patching the String class with extra methods, without modifying any existing methods, and adding behaviors like "some string".word_count and "some string".cjk?.
Another approach, based on FileUtils is to create a class or module full of methods and always use string as parameters, like OddClassName.word_count("some string") and OddClassName.cjk?("some string"). We like the first better because of readability.
I understand that monkey patching a basic class as described in the first alternative can have name clashes. However, if this is the main application, not a library, should I worry with it at all?
So, the questions are:
Is adding methods to ruby base classes a bad practice? If yes, is that in all cases or only in some cases?
What is the best approach to accomplish this?
What could be the name of 'OddClassName'?
Please suggest any alternatives.
Monkey patching isn't considered to be a bad practice unless you are writing odd methods that do not have PatchedClass-related behavior (for example, String.monkeyPatchForMakingJpegFromString is rather bad, but Jpeg.fromString is good enough.)
But if your project is rather large, the libraries that you use in it may happen to have colliding patches, so you may have one more problem with all these patching stuffs. In Ruby 2.0, refinements come to an aid. They work as follows: you define a module, refine your (even core) class in it, and then use that module where it's necessary. So, in your code it works as:
YourClass.new.refinedMethodFromCoreClass #=> some result
But
CoreClass.refinedMethodFromCoreClass
produces undefined method exception.
That's all monkey patching stuff: monkey patching is useful and convenient, but refinements add some features, that make your code more secure, maintainable and neat.
I'd use a new class, call it Doc or something because getting the word count and checking languages sounds like operations for documents.
Let it take a string as a constructor parameter and have modifications chain to return a new Doc. Also give it a to_s method that returns the string.
class Doc
def initialize(str)
#str = str
end
def to_s
#str
end
define word_count, cjk?, etc.
end
Doc.new("Some document").word_count
# => 2
I tend to use before blocks to set instance variables. I then use those variables across my examples. I recently came upon let(). According to RSpec docs, it is used to
... to define a memoized helper method. The value will be cached across multiple calls in the same example but not across examples.
How is this different from using instance variables in before blocks? And also when should you use let() vs before()?
I always prefer let to an instance variable for a couple of reasons:
Instance variables spring into existence when referenced. This means that if you fat finger the spelling of the instance variable, a new one will be created and initialized to nil, which can lead to subtle bugs and false positives. Since let creates a method, you'll get a NameError when you misspell it, which I find preferable. It makes it easier to refactor specs, too.
A before(:each) hook will run before each example, even if the example doesn't use any of the instance variables defined in the hook. This isn't usually a big deal, but if the setup of the instance variable takes a long time, then you're wasting cycles. For the method defined by let, the initialization code only runs if the example calls it.
You can refactor from a local variable in an example directly into a let without changing the
referencing syntax in the example. If you refactor to an instance variable, you have to change
how you reference the object in the example (e.g. add an #).
This is a bit subjective, but as Mike Lewis pointed out, I think it makes the spec easier to read. I like the organization of defining all my dependent objects with let and keeping my it block nice and short.
A related link can be found here: http://www.betterspecs.org/#let
The difference between using instances variables and let() is that let() is lazy-evaluated. This means that let() is not evaluated until the method that it defines is run for the first time.
The difference between before and let is that let() gives you a nice way of defining a group of variables in a 'cascading' style. By doing this, the spec looks a little better by simplifying the code.
I have completely replaced all uses of instance variables in my rspec tests to use let(). I've written a quickie example for a friend who used it to teach a small Rspec class: http://ruby-lambda.blogspot.com/2011/02/agile-rspec-with-let.html
As some of the other answers here says, let() is lazy evaluated so it will only load the ones that require loading. It DRYs up the spec and make it more readable. I've in fact ported the Rspec let() code to use in my controllers, in the style of inherited_resource gem. http://ruby-lambda.blogspot.com/2010/06/stealing-let-from-rspec.html
Along with lazy evaluation, the other advantage is that, combined with ActiveSupport::Concern, and the load-everything-in spec/support/ behavior, you can create your very own spec mini-DSL specific to your application. I've written ones for testing against Rack and RESTful resources.
The strategy I use is Factory-everything (via Machinist+Forgery/Faker). However, it is possible to use it in combination with before(:each) blocks to preload factories for an entire set of example groups, allowing the specs to run faster: http://makandra.com/notes/770-taking-advantage-of-rspec-s-let-in-before-blocks
It is important to keep in mind that let is lazy evaluated and not putting side-effect methods in it otherwise you would not be able to change from let to before(:each) easily.
You can use let! instead of let so that it is evaluated before each scenario.
In general, let() is a nicer syntax, and it saves you typing #name symbols all over the place. But, caveat emptor! I have found let() also introduces subtle bugs (or at least head scratching) because the variable doesn't really exist until you try to use it... Tell tale sign: if adding a puts after the let() to see that the variable is correct allows a spec to pass, but without the puts the spec fails -- you have found this subtlety.
I have also found that let() doesn't seem to cache in all circumstances! I wrote it up in my blog: http://technicaldebt.com/?p=1242
Maybe it is just me?
Dissenting voice here: after 5 years of rspec I don't like let very much.
1. Lazy evaluation often makes test setup confusing
It becomes difficult to reason about setup when some things that have been declared in setup are not actually affecting state, while others are.
Eventually, out of frustration someone just changes let to let! (same thing without lazy evaluation) in order to get their spec working. If this works out for them, a new habit is born: when a new spec is added to an older suite and it doesn't work, the first thing the writer tries is to add bangs to random let calls.
Pretty soon all the performance benefits are gone.
2. Special syntax is unusual to non-rspec users
I would rather teach Ruby to my team than the tricks of rspec. Instance variables or method calls are useful everywhere in this project and others, let syntax will only be useful in rspec.
3. The "benefits" allow us to easily ignore good design changes
let() is good for expensive dependencies that we don't want to create over and over.
It also pairs well with subject, allowing you to dry up repeated calls to multi-argument methods
Expensive dependencies repeated in many times, and methods with big signatures are both points where we could make the code better:
maybe I can introduce a new abstraction that isolates a dependency from the rest of my code (which would mean fewer tests need it)
maybe the code under test is doing too much
maybe I need to inject smarter objects instead of a long list of primitives
maybe I have a violation of tell-don't-ask
maybe the expensive code can be made faster (rarer - beware of premature optimisation here)
In all these cases, I can address the symptom of difficult tests with a soothing balm of rspec magic, or I can try address the cause. I feel like I spent way too much of the last few years on the former and now I want some better code.
To answer the original question: I would prefer not to, but I do still use let. I mostly use it to fit in with the style of the rest of the team (it seems like most Rails programmers in the world are now deep into their rspec magic so that is very often). Sometimes I use it when I'm adding a test to some code that I don't have control of, or don't have time to refactor to a better abstraction: i.e. when the only option is the painkiller.
let is functional as its essentially a Proc. Also its cached.
One gotcha I found right away with let... In a Spec block that is evaluating a change.
let(:object) {FactoryGirl.create :object}
expect {
post :destroy, id: review.id
}.to change(Object, :count).by(-1)
You'll need to be sure to call let outside of your expect block. i.e. you're calling FactoryGirl.create in your let block. I usually do this by verifying the object is persisted.
object.persisted?.should eq true
Otherwise when the let block is called the first time a change in the database will actually happen due to the lazy instantiation.
Update
Just adding a note. Be careful playing code golf or in this case rspec golf with this answer.
In this case, I just have to call some method to which the object responds. So I invoke the _.persisted?_ method on the object as its truthy. All I'm trying to do is instantiate the object. You could call empty? or nil? too. The point isn't the test but bringing the object ot life by calling it.
So you can't refactor
object.persisted?.should eq true
to be
object.should be_persisted
as the object hasn't been instantiated... its lazy. :)
Update 2
leverage the let! syntax for instant object creation, which should avoid this issue altogether. Note though it will defeat a lot of the purpose of the laziness of the non banged let.
Also in some instances you might actually want to leverage the subject syntax instead of let as it may give you additional options.
subject(:object) {FactoryGirl.create :object}
"before" by default implies before(:each). Ref The Rspec Book, copyright 2010, page 228.
before(scope = :each, options={}, &block)
I use before(:each) to seed some data for each example group without having to call the let method to create the data in the "it" block. Less code in the "it" block in this case.
I use let if I want some data in some examples but not others.
Both before and let are great for DRYing up the "it" blocks.
To avoid any confusion, "let" is not the same as before(:all). "Let" re-evaluates its method and value for each example ("it"), but caches the value across multiple calls in the same example. You can read more about it here: https://www.relishapp.com/rspec/rspec-core/v/2-6/docs/helper-methods/let-and-let
Note to Joseph -- if you are creating database objects in a before(:all) they won't be captured in a transaction and you're much more likely to leave cruft in your test database. Use before(:each) instead.
The other reason to use let and its lazy evaluation is so you can take a complicated object and test individual pieces by overriding lets in contexts, as in this very contrived example:
context "foo" do
let(:params) do
{ :foo => foo, :bar => "bar" }
end
let(:foo) { "foo" }
it "is set to foo" do
params[:foo].should eq("foo")
end
context "when foo is bar" do
let(:foo) { "bar" }
# NOTE we didn't have to redefine params entirely!
it "is set to bar" do
params[:foo].should eq("bar")
end
end
end
I use let to test my HTTP 404 responses in my API specs using contexts.
To create the resource, I use let!. But to store the resource identifier, I use let. Take a look how it looks like:
let!(:country) { create(:country) }
let(:country_id) { country.id }
before { get "api/countries/#{country_id}" }
it 'responds with HTTP 200' { should respond_with(200) }
context 'when the country does not exist' do
let(:country_id) { -1 }
it 'responds with HTTP 404' { should respond_with(404) }
end
That keeps the specs clean and readable.
According to Wikipedia, a monkey patch is:
a way to extend or modify the runtime
code of dynamic languages [...]
without altering the original source
code.
The following statement from the same entry confused me:
In Ruby, the term monkey patch was
misunderstood to mean any dynamic
modification to a class and is often
used as a synonym for dynamically
modifying any class at runtime.
I would like to know the exact meaning of monkey patching in Ruby. Is it doing something like the following, or is it something else?
class String
def foo
"foo"
end
end
The best explanation I heard for Monkey patching/Duck-punching is by Patrick Ewing in RailsConf 2007
...if it walks like a duck and talks like a duck, it’s a duck, right? So
if this duck is not giving you the noise that you want, you’ve got to
just punch that duck until it returns what you expect.
The short answer is that there is no "exact" meaning, because it's a novel term, and different folks use it differently. That much at least can be discerned from the Wikipedia article. There are some who insist that it only applies to "runtime" code (built-in classes, I suppose) while some would use it to refer to the run-time modification of any class.
Personally, I prefer the more inclusive definition. After all, if we were to use the term for modification of built-in classes only, how would we refer to the run-time modification of all the other classes? The important thing to me is that there's a difference between the source code and the actual running class.
In Ruby, the term monkey patch was
misunderstood to mean any dynamic
modification to a class and is often
used as a synonym for dynamically
modifying any class at runtime.
The above statement asserts that the Ruby usage is incorrect - but terms evolve, and that's not always a bad thing.
Monkey patching is when you replace methods of a class at runtime (not adding new methods as others have described).
In addition to being a very un-obvious and difficult to debug way to change code, it doesn't scale; as more and more modules start monkey patching methods, the likelihood of the changes stomping each other grow.
You are correct; it's when you modify or extend an existing class rather than subclass it.
This is monkey patching:
class Float
def self.times(&block)
self.to_i.times { |i| yield(i) }
remainder = self - self.to_i
yield(remainder) if remainder > 0.0
end
end
Now I imagine this might be useful sometimes, but imagine if you saw routine.
def my_method(my_special_number)
sum = 0
my_special_number.times { |num| sum << some_val ** num }
sum
end
And it breaks only occasionally when it gets called. To those paying attention you already know why, but imagine that you didn't know about the float type having a .times class-method and you automatically assumed that my_special_number is an integer. Every time the parameter is a whole number, integer or float, it would work fine (whole ints are passed back except when there is a floating-point remainder). But pass a number with anything in the decimal area in and it'll break for sure!
Just imagine how often this might happen with your gems, Rails plugins, and even by your own co-workers in your projects. If there's one or two little methods in there like this and it could take some time to find and correct.
If you wonder why it breaks, note that sum is an integer and a floating-point remainder could be passed back; in addition, the exponential sign only works when types are the same. So you might think it's fixed, because you converted bother numbers to floats ... only to find that the sum can't take the floating-point result.
In Python monkeypatching is referred to a lot as a sign of embarrassment: "I had to monkeypatch this class because..." (I encountered it first when dealing with Zope, which the article mentions). It's used to say that it was necessary to take hold of an upstream class and fix it at runtime instead of lobbying to have the unwanted behaviors fixed in the actual class or fixing them in a subclass. In my experience Ruby people don't talk about monkeypatching that much, because it's not considered especially bad or even noteworthy (hence "duck punching"). Obviously you have to be careful about changing the return values of a method that will be used in other dependencies, but adding methods to a class the way that active_support and facets do is perfectly safe.
Update 10 years later: I would amend the last sentence to say "is relatively safe". Extending a core library class with new methods can lead to problems if somebody else gets the same idea and adds the same method with a different implementation or method signature, or if people confuse extended methods for core language functionality. Both cases often happen in Ruby (especially regarding active_support methods).
Explanation of the concept without code:
It means you can "dynamically" modify code. Wanna add a method "dynamically" to a particular class known only at "runtime"? No problem. It's powerful, yes: but can be misused. The concept "dynamically" might be a little too esoteric to understand, so I have prepared an example below (no code, I promise):
How to monkey patch a car:
Normal Car Operations
How do you normally start a car? It’s simple: you turn the ignition, the car starts!
Great, but how can we "monkey patch" the car class?
This is what Fabrizzio did to poor Michael Corleone. Normally, if you want to change how a car operates, you would have to make those changes in the car manufacturing plant (i.e. at "compile" time, within the Car class ^**). Fabrizzio ain't got no time for that: he monkey patches cars by getting under the bonnet to surreptitiously and sneakily rewire things. In other words, he re-opens the Car class, makes the changes he wants, and he's done: he's just monkey patched a car. he done this "dynamically".
You have to really know what you are doing when you monkey patch otherwise the results could be quite explosive.
“Fabrizzio, where are you going?”
Boom!
Like Confucius Say:
"Keep your source code close, but your monkey patches closer."
It can be dangerous.
^** yes i know, dynamic languages.
Usually it is meant about ad-hoc changes, using Ruby open classes, frequently with low quality code.
Here's a good follow-up on the subject.
Here's a perfect example of the problem: Classifier gem breaks Rails.
** Original question: **
One thing that concerns me as a security professional is that Ruby doesn't have a parallel of Java's package-privacy. That is, this isn't valid Ruby:
public module Foo
public module Bar
# factory method for new Bar implementations
def self.new(...)
SimpleBarImplementation.new(...)
end
def baz
raise NotImplementedError.new('Implementing Classes MUST redefine #baz')
end
end
private class SimpleBarImplementation
include Bar
def baz
...
end
end
end
It'd be really nice to be able to prevent monkey-patching of Foo::BarImpl. That way, people who rely on the library know that nobody has messed with it. Imagine if somebody changed the implementation of MD5 or SHA1 on you! I can call freeze on these classes, but I have to do it on a class-by-class basis, and other scripts might modify them before I finish securing my application if I'm not very careful about load order.
Java provides lots of other tools for defensive programming, many of which are not possible in Ruby. (See Josh Bloch's book for a good list.) Is this really a concern? Should I just stop complaining and use Ruby for lightweight things and not hope for "enterprise-ready" solutions?
(And no, core classes are not frozen by default in Ruby. See below:)
require 'md5'
# => true
MD5.frozen?
# => false
I don't think this is a concern.
Yes, the mythical "somebody" can replace the implementation of MD5 with something insecure. But in order to do that, the mythical somebody must actually be able to get his code into the Ruby process. And if he can do that, then he presumably could also inject his code into a Java process and e.g. rewrite the bytecode for the MD5 operation. Or just intercept the keypresses and not actually bother with fiddling with the cryptography code at all.
One of the typical concerns is: I'm writing this awesome library, which is supposed to be used like so:
require 'awesome'
# Do something awesome.
But what if someone uses it like so:
require 'evil_cracker_lib_from_russian_pr0n_site'
# Overrides crypto functions and sends all data to mafia
require 'awesome'
# Now everything is insecure because awesome lib uses
# cracker lib instead of builtin
And the simple solution is: don't do that! Educate your users that they shouldn't run untrusted code they downloaded from obscure sources in their security critical applications. And if they do, they probably deserve it.
To come back to your Java example: it's true that in Java you can make your crypto code private and final and what not. However, someone can still replace your crypto implementation! In fact, someone actually did: many open-source Java implementations use OpenSSL to implement their cryptographic routines. And, as you probably know, Debian shipped with a broken, insecure version of OpenSSL for years. So, all Java programs running on Debian for the past couple of years actually did run with insecure crypto!
Java provides lots of other tools for defensive programming
Initially I thought you were talking about normal defensive programming,
wherein the idea is to defend the program (or your subset of it, or your single function) from invalid data input.
That's a great thing, and I encourage everyone to go read that article.
However it seems you are actually talking about "defending your code from other programmers."
In my opinion, this is a completely pointless goal, as no matter what you do, a malicious programmer can always run your program under a debugger, or use dll injection or any number of other techniques.
If you are merely seeking to protect your code from incompetent co-workers, this is ridiculous. Educate your co-workers, or get better co-workers.
At any rate, if such things are of great concern to you, ruby is not the programming language for you. Monkeypatching is in there by design, and to disallow it goes against the whole point of the feature.
Check out Immutable by Garry Dolley.
You can prevent redefinition of individual methods.
I guess Ruby has that a feature - valued more over it being a security issue. Ducktyping too.
E.g. I can add my own methods to the Ruby String class rather than extending or wrapping it.
"Educate your co-workers, or get better co-workers" works great for a small software startup, and it works great for the big guns like Google and Amazon. It's ridiculous to think that every lowly developer contracted in for some small medical charts application in a doctor's office in a minor city.
I'm not saying we should build for the lowest common denominator, but we have to be realistic that there are lots of mediocre programmers out there who will pull in any library that gets the job done, paying no attention to security. How could they pay attention to security? Maybe the took an algorithms and data structures class. Maybe they took a compilers class. They almost certainly didn't take an encryption protocols class. They definitely haven't all read Schneier or any of the others out there who practically have to beg and plead with even very good programmers to consider security when building software.
I'm not worried about this:
require 'evil_cracker_lib_from_russian_pr0n_site'
require 'awesome'
I'm worried about awesome requiring foobar and fazbot, and foobar requiring has_gumption, and ... eventually two of these conflict in some obscure way that undoes an important security aspect.
One important security principle is "defense in depth" -- adding these extra layers of security help you from accidentally shooting yourself in the foot. They can't completely prevent it; nothing can. But they help.
If monkey patching is your concen, you can use the Immutable module (or one of similar function).
Immutable
You could take a look at Why the Lucky Stiff's "Sandbox"project, which you can use if you worry about potentially running unsafe code.
http://code.whytheluckystiff.net/sandbox/
An example (online TicTacToe):
http://www.elctech.com/blog/safely-exposing-your-app-to-a-ruby-sandbox
Raganwald has a recent post about this. In the end, he builds the following:
class Module
def anonymous_module(&block)
self.send :include, Module.new(&block)
end
end
class Acronym
anonymous_module do
fu = lambda { 'fu' }
bar = lambda { 'bar' }
define_method :fubar do
fu.call + bar.call
end
end
end
That exposes fubar as a public method on Acronyms, but keeps the internal guts (fu and bar) private and hides helper module from outside view.
If someone monkeypatched an object or a module, then you need to look at 2 cases: He added a new method. If he is the only one adding this meyhod (which is very likely), then no problems arise. If he is not the only one, you need to see if both methods do the same and tell the library developer about this severe problem.
If they change a method, you should start to research why the method was changed. Did they change it due to some edge case behaviour or did they actually fix a bug? especially in the latter case, the monkeypatch is a god thing, because it fixes a bug in many places.
Besides that, you are using a very dynamic language with the assumption that programmers use this freedom in a sane way. The only way to remove this assumption is not to use a dynamic language.