What are the negative impacts of extending classes in ActionScript 3? - performance

In my game engine I use Box2D for physics. Box2D's naming conventions and poor commenting ruin the consistent and well documented remainder of my engine which is a little frustrating and presents poorly when you're using it.
I've considered making a set of wrapper classes for Box2D. That is, classes which extend each of the common Box2D objects and have their functions rewritten to follow the naming conventions of the rest of my engine, and to have them more clearly and consistently commented. I have even considered building on top of some of the classes and adding some bits and pieces (like getters for pixel-based measurements in the b2Vec2 class).
This is fine but I am not 100% sure what the negative impacts of this would be and the degree to which those would affect my applications and games. I'm not sure if the compiler alleviates some of my concerns to a degree or whether I do need to be considerate when adding somewhat unnecessary classes for the sake of readability and consistency.
I have some suspicions:
More memory consumption to accommodate the extra level of class structure.
Performance impact when creating new objects due to initializing an extra level of members?
I am asking specifically about runtime impacts.

This is a pretty common problem when it comes to integrating third party libraries, especially libraries that are ports (as Box2DAS3 is), where they keep the coding and naming conventions of the parent language rather than fully integrating with the destination language (case in point: Box2DAS3 using getFoo() and setFoo() instead of a .foo getter/setter).
To answer your question quickly, no, there will be no significant performance impact with making wrapper classes; no more than you'll see in the class hierarchy in your own project. Sure, if you time a loop of 5 million iterations, you might see a millisecond or two of difference, but in normal usage, you won't notice it.
"More memory consumption to accommodate the extra level of class structure."
Like any language that has class inheritance, a vtable will be used behind the scenes, so you will have a small increase in memory/perf, but it's negligible.
"Performance impact when creating new objects due to initializing an extra level of members?"
No more than normal instantiation, so not something to worry about unless you're creating a huge amount of objects.
Performance-wise, you should generally have no problem (favour readability and usability over performance unless you actually have a problem with it). I'd look at it more as an architectural problem. With that in mind, the negative impacts of extending or modifying the classes of an external library depend on which of 3 approaches you take:
Modify the library
Extend the classes
Composition with your own classes
Modify the library
As Box2DAS3 is open source, there's nothing stopping you jumping in and refactoring all the class/function names to your heart's content. I've seriously considered doing this at times.
Pros:
You can modify what you want - functions, classes, you name it
You can add any missing pieces that you need (e.g. pixel-meters conversion helpers)
You can fix any potential performance issues (I've noticed a few things that could be done better and faster if they were done in an "AS3" way)
Cons:
If you plan to keep your version up to date, you'll need to manually merge/convert any updates and changes. For popular libraries, or those that change a lot, this can be a huge pain
It's very time-consuming - aside from modifications, you'll need a good understanding of what's going on so you can make any changes without breaking functionality
If there are multiple people working with it, they can't rely as much on external documentation/examples, as the internals might have changed
Extend the classes
Here, you simply make your own wrapper classes, which extend the base Box2D classes. You can add properties and functions as you want, including implementing your own naming scheme which translates to the base class (e.g. MyBox2DClass.foo() could simply be a wrapper for Box2DClass.bar())
Pros:
You implement just the classes you need, and make just the changes necessary
Your wrapper classes can still be used in the base Box2D engine - i.e. you can pass a MyBox2DClass object to an internal method that takes a Box2DClass and you know it'll work
It's the least amount of work, out of all three methods
Cons:
Again, if you plan to keep your version up to date, you'll need to check that any changes don't break your classes. Normally not much of a problem, though
Can introduce confusion into the class, if you create your own functions that call their Box2D equivalent (e.g. "Should I use setPos() or SetPosition()?"). Even if you're working on your own, when you come back to your class in 6 months, you'll have forgotten
Your classes will lack coherence and consistency (e.g. some functions using your naming methodology (setPos()) while others use that of Box2D (SetPosition()))
You're stuck with Box2D; you can't change physics engines without a lot of dev, depending on how your classes are used throughout the project. This might not be such a big deal if you don't plan on switching
Composition with your own classes
You make your own classes, which internally hold a Box2D property (e.g. MyPhysicalClass will have a property b2Body). You're free to implement your own interface as you wish, and only what's necessary.
Pros:
Your classes are cleaner and fit in nicely with your engine. Only functions that you're interested in are exposed
You're not tied to the Box2D engine; if you want to switch to Nape, for example, you only need to modify your custom classes; the rest of your engine and games are oblivious. Other developers also don't need to learn the Box2D engine to be able to use it
While you're there, you can even implement multiple engines, and switch between them using a toggle or interfaces. Again, the rest of your engine and games are oblivious
Works nicely with component based engines - e.g. you can have a Box2DComponent that holds a b2Body property
Cons:
More work than just extending the classes, as you're essentially creating an intermediary layer between your engine and Box2D. Ideally, outside of your custom classes, there shouldn't be a reference to Box2D. The amount of work depends on what you need in your class
Extra level of indirection; normally it shouldn't be a problem, as Box2D will use your Box2D properties directly, but if your engine is calling your functions a lot, it's an extra step along the way, performance wise
Out of the three, I prefer to go with composition, as it gives the most flexibility and keeps the modular nature of your engine intact, i.e. you have your core engine classes, and you extend functionality with external libraries. The fact that you can switch out libraries with minimal effort is a huge plus as well. This is the technique that I've employed in my own engine, and I've also extended it to other types of libraries - e.g. Ads - I have my engine Ad class, that can integrate with Mochi, Kongregate, etc as needed - the rest of my game doesn't care what I'm using, which lets me keep my coding style and consistency throughout the engine, whilst still being flexible and modular.
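As a rough illustration of the composition option, here is a minimal sketch. It is written in Python purely as a neutral, runnable notation; it is not AS3 and does not use the real Box2D or Nape APIs. Every class name here (FakeBox2DBody, FakeNapeBody, Box2DAdapter, NapeAdapter, PhysicsBody) is a hypothetical stand-in.

```python
class FakeBox2DBody:
    """Hypothetical stand-in for a library class with library-style naming."""
    def __init__(self):
        self._pos = (0.0, 0.0)

    def SetPosition(self, x, y):
        self._pos = (x, y)

    def GetPosition(self):
        return self._pos


class FakeNapeBody:
    """Hypothetical second library that exposes position as a mutable list."""
    def __init__(self):
        self.position = [0.0, 0.0]


class Box2DAdapter:
    """Translates the engine's naming to the first library's naming."""
    def __init__(self):
        self._body = FakeBox2DBody()

    def get_position(self):
        return self._body.GetPosition()

    def set_position(self, x, y):
        self._body.SetPosition(x, y)


class NapeAdapter:
    """Same engine-facing interface, different library underneath."""
    def __init__(self):
        self._body = FakeNapeBody()

    def get_position(self):
        return tuple(self._body.position)

    def set_position(self, x, y):
        self._body.position[0] = x
        self._body.position[1] = y


class PhysicsBody:
    """Engine-facing class: it holds a backend object instead of extending it."""
    def __init__(self, backend):
        self._backend = backend

    @property
    def position(self):
        return self._backend.get_position()

    @position.setter
    def position(self, value):
        self._backend.set_position(*value)
```

Game code written against PhysicsBody never mentions either library, so swapping the backend is a one-line change where the adapter is constructed. The price is the extra call per access noted in the cons above.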
----- Update 20/9/2013 -----
Big update time! So I went back to do some testing on size and speed. The class I used is too big to paste here, so you can download it at http://divillysausages.com/files/TestExtendClass.as
In it, I test a number of classes:
An Empty instance; a Class that just extends Object and implements an empty getPosition() function. This will be our benchmark
A b2Body instance
A Box2DExtends instance; a Class that extends b2Body and implements a function getPosition() that just returns GetPosition() (the b2Body function)
A Box2DExtendsOverrides instance; a Class that extends b2Body and overrides the GetPosition() function (it simply returns super.GetPosition())
A Box2DComposition instance; a Class that has a b2Body property and a getPosition() function that returns the b2Body's GetPosition()
A Box2DExtendsProperty instance; a Class that extends b2Body and adds a new Point property
A Box2DCompositionProperty instance; a Class that has both a b2Body property and a Point property
All tests were done in the standalone player, FP v11.7.700.224, Windows 7, on a not-great laptop.
Test1: Size
AS3 is a bit annoying in that if you call getSize(), it'll give you the size of the object itself, but any internal properties that are also Objects will just result in a 4 byte increase as they're only counting the pointer. I can see why they do this, it just makes it a bit awkward to get the right size.
Thus I turned to the flash.sampler package. If we sample the creation of our objects, and add up all the sizes in the NewObjectSample objects, we'll get the full size of our object (NOTE: if you want to see what's created and the size, uncomment the log calls in the test file).
Empty's size is 56 // extends Object
b2Body's size is 568
Box2DExtends's size is 568 // extends b2Body
Box2DExtendsOverrides's size is 568 // extends b2Body
Box2DComposition's size is 588 // has b2Body property
Box2DExtendsProperty's size is 604 // extends b2Body and adds Point property
Box2DCompositionProperty's size is 624 // has b2Body and Point properties
These sizes are all in bytes. Some points worth noting:
The base Object size is 40 bytes, so just the class and nothing else is 16 bytes.
Adding methods doesn't increase the size of the object (they're implemented on a class basis anyway), while properties obviously do
Just extending the class didn't add anything to it
The extra 20 bytes for Box2DComposition come from 16 for the class and 4 for the pointer to the b2Body property
For Box2DExtendsProperty etc, you have 16 for the Point class itself, 4 for the pointer to the Point property, and 8 for each of the x and y property Numbers = 36 bytes difference between that and Box2DExtends
So obviously the difference in size depends on the properties that you add, but all in all, pretty negligible.
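The size breakdown above can be checked with a few lines of arithmetic. This is only a sanity check of the reported figures (Python is used as a calculator here); the constant names are mine, the numbers are the ones listed above.

```python
# Figures reported above, in bytes.
BASE_OBJECT = 40   # base Object size
EMPTY = 56         # Empty class instance
B2BODY = 568       # b2Body instance
POINTER = 4        # reference to another object
NUMBER = 8         # one Number property (x or y)

# Overhead of "just the class and nothing else".
class_overhead = EMPTY - BASE_OBJECT

# Composition: a new class plus one pointer to the b2Body.
composition_extra = class_overhead + POINTER

# Adding a Point: the Point class, a pointer to it, and its x and y Numbers.
point_extra = class_overhead + POINTER + 2 * NUMBER

box2d_composition = B2BODY + composition_extra
box2d_extends_property = B2BODY + point_extra
box2d_composition_property = B2BODY + composition_extra + point_extra

print(class_overhead, composition_extra, point_extra)  # 16 20 36
```

The three derived totals come out to 588, 604 and 624 bytes, matching the measured sizes of Box2DComposition, Box2DExtendsProperty and Box2DCompositionProperty.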
Test 2: Creation Speed
For this, I simply used getTimer(), with a loop of 10000, itself looped 10 (so 100k) times to get the average. System.gc() was called between each set to minimise time due to garbage collection.
Empty's time for creation is 3.9ms (av.)
b2Body's time for creation is 65.5ms (av.)
Box2DExtends's time for creation is 69.9ms (av.)
Box2DExtendsOverrides's time for creation is 68.8ms (av.)
Box2DComposition's time for creation is 72.6ms (av.)
Box2DExtendsProperty's time for creation is 76.5ms (av.)
Box2DCompositionProperty's time for creation is 77.2ms (av.)
There's not a whole pile to note here. The extending/composition classes take slightly longer, but the difference is roughly 0.0001ms per object at most (the times above are for the creation of 100,000 objects), so it's not really worth considering.
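For anyone who wants to repeat this kind of measurement, the shape of the test is roughly the following. This is a sketch in Python, not the AS3 test file: time.perf_counter() plays the role of getTimer(), gc.collect() the role of System.gc(), and Base, Extended and Composed are hypothetical stand-ins for b2Body, a subclass, and a composing class. The absolute numbers it produces say nothing about Flash Player.

```python
import gc
import time


def average_creation_ms(factory, per_run=10_000, runs=10):
    """Average time in ms to create `per_run` objects, averaged over `runs` runs."""
    total_ms = 0.0
    for _ in range(runs):
        gc.collect()  # collect between sets so earlier garbage is not billed to this run
        start = time.perf_counter()
        for _ in range(per_run):
            factory()
        total_ms += (time.perf_counter() - start) * 1000.0
    return total_ms / runs


class Base:
    """Hypothetical stand-in for the library class."""
    def __init__(self):
        self.x = 0.0
        self.y = 0.0


class Extended(Base):
    """Extends the base class without adding anything."""


class Composed:
    """Holds a Base instance instead of extending it."""
    def __init__(self):
        self.body = Base()


if __name__ == "__main__":
    for cls in (Base, Extended, Composed):
        print(cls.__name__, round(average_creation_ms(cls), 3), "ms (av.)")
```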
Test 3: Call Speed
For this, I used getTimer() again, with a loop of 1000000, itself looped 10 (so 10m) times to get the average. System.gc() was called between each set to minimise time due to garbage collection. All the objects had their getPosition()/GetPosition() functions called, to see the difference between overriding and redirecting.
Empty's time for getPosition() is 83.4ms (av.) // empty
b2Body's time for GetPosition() is 88.3ms (av.) // normal
Box2DExtends's time for getPosition() is 158.7ms (av.) // getPosition() calls GetPosition()
Box2DExtendsOverrides's time for GetPosition() is 161ms (av.) // override calls super.GetPosition()
Box2DComposition's time for getPosition() is 160.3ms (av.) // calls this.body.GetPosition()
Box2DExtendsProperty's time for GetPosition() is 89ms (av.) // implicit super (i.e. not overridden)
Box2DCompositionProperty's time for getPosition() is 155.2ms (av.) // calls this.body.GetPosition()
This one surprised me a bit, with the difference between the times being ~2x (though that's still 0.000007ms per call). The delay seems entirely down to the class inheritance - e.g. Box2DExtendsOverrides simply calls super.GetPosition(), yet is twice as slow as Box2DExtendsProperty, which inherits GetPosition() from its base class.
I guess it has to do with the overhead of function lookups and calling, though I took a look at the generated bytecode using swfdump in the FlexSDK, and they're identical, so either it's lying to me (or doesn't include it), or there's something I'm missing :) While the steps might be the same, the time between them probably isn't (e.g. in memory, it's jumping to your class vtable, then jumping to the base class vtable, etc)
The bytecode for var v:b2Vec2 = b2Body.GetPosition() is simply:
getlocal 4
callproperty :GetPosition (0)
coerce Box2D.Common.Math:b2Vec2
setlocal3
whilst var v:b2Vec2 = Box2DExtends.getPosition() (getPosition() returns GetPosition()) is:
getlocal 5
callproperty :getPosition (0)
coerce Box2D.Common.Math:b2Vec2
setlocal3
For the second example, it doesn't show the call to GetPosition(), so I'm not sure how they're resolving that. The test file is available for download if someone wants to take a crack at explaining it.
Some points to keep in mind:
GetPosition() doesn't really do anything; it's essentially a getter disguised as a function, which is one reason why the "extra class step penalty" appears so big
This was on a loop of 10m, which you're unlikely to be doing in your game. The per-call penalty isn't really worth worrying about
Even if you do worry about the penalty, remember that this is the interface between your code and Box2D; the Box2D internals will be unaffected by this, only the calls to your interface
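To put the per-call penalty in perspective, here is the arithmetic as a short Python calculation. It assumes the reported times cover all 10m calls (if they are per 1m-call loop instead, multiply the per-call figure by 10), and the 60 fps frame budget is my own assumption, not something from the test.

```python
DIRECT_MS = 88.3       # b2Body.GetPosition()
WRAPPED_MS = 158.7     # Box2DExtends.getPosition(), which calls GetPosition()
CALLS = 10_000_000     # assumption: the figures are totals for all 10m calls

# Extra cost of going through the wrapper, per call, in milliseconds.
extra_per_call_ms = (WRAPPED_MS - DIRECT_MS) / CALLS

# How many wrapped calls it takes to add up to one extra millisecond.
calls_per_extra_ms = 1 / extra_per_call_ms


def overhead_share(calls_per_frame, fps=60):
    """Fraction of one frame's time budget spent on wrapper overhead."""
    frame_budget_ms = 1000 / fps
    return calls_per_frame * extra_per_call_ms / frame_budget_ms


print(f"{extra_per_call_ms:.8f} ms extra per call")
print(f"{calls_per_extra_ms:,.0f} wrapped calls per extra millisecond")
print(f"{overhead_share(1000):.4%} of a 60 fps frame for 1,000 wrapped calls")
```

Under those assumptions it takes on the order of 140,000 wrapped calls to cost a single extra millisecond.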
All-in-all, I'd expect the same results from extending one of my own classes, so I wouldn't really worry about it. Implement the architecture that works the best for your solution.

I know this answer will not qualify for the bounty as I am way too lazy to write benchmarks. But having worked on the Flash code base I can maybe give some hints:
The AVM2 executes a dynamic language, so the compiler will not optimize anything in this case.
Wrapping a call as a sub class call will have a cost. However that cost will be constant time and small.
Object creation cost will also at most be affected by a constant amount of time and memory. Also the time and amount will probably be insignificant compared to the base cost.
But, as with many things the devil is in the details. I never used box2d, but if it does any kind of object pooling things might not work well anymore. In general games should try to run without object allocations at play time. So be very careful not to add functions that allocate objects just to be prettier.
function addvectors(a:vec,b:vec,dest:vec):void
Might be ugly but is much faster than
function addvectors(a:vec,b:vec):vec
(I hope I got my AS3 syntax right...). Even more useful and more ugly might be
function addvectors(a:Vector.<vec>, b:Vector.<vec>, dest:Vector.<vec>, offset:int, count:int):void
So my answer is, if you are only wrapping for readability, go for it. It's a small, but constant cost. But be very, very careful about changing how functions work.
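The three signatures above differ in who owns the result object. A minimal sketch of the idea follows, in Python rather than AS3; Vec and the function names are hypothetical, and Python's allocation costs are not comparable to the AVM2's, so this shows the pattern only, not the speed-up.

```python
class Vec:
    """Tiny 2D vector used only for this illustration."""
    __slots__ = ("x", "y")

    def __init__(self, x=0.0, y=0.0):
        self.x = x
        self.y = y


def add_vectors_new(a, b):
    """Prettier form: allocates a fresh Vec on every call."""
    return Vec(a.x + b.x, a.y + b.y)


def add_vectors_into(a, b, dest):
    """Allocation-free form: writes the result into a caller-owned Vec."""
    dest.x = a.x + b.x
    dest.y = a.y + b.y


def add_vectors_batch(a, b, dest, offset, count):
    """Batch form: processes `count` entries starting at `offset`, still without allocating."""
    for i in range(offset, offset + count):
        add_vectors_into(a[i], b[i], dest[i])


# One scratch vector is created up front and reused for every call in the game loop.
scratch = Vec()
add_vectors_into(Vec(1.0, 2.0), Vec(3.0, 4.0), scratch)
```

A wrapper that quietly turns the second form into the first reintroduces a per-call allocation, which is exactly the kind of behaviour change to watch for.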

I don't know if there is a big impact on instantiation time, but I will answer your question differently: what are your other options? Do they seem like they'll do better?
There is a beautiful benchmark made by Jackson Dunstan about function performance: http://jacksondunstan.com/articles/1820
To sum it up:
closures are expensive
static is slow: http://jacksondunstan.com/articles/1713
overriding, calling a function inside a subClass does not seem to have a big impact
So, if you want to not use inheritance, maybe you'll have to replace it with static calls, and it is bad for performance.
Personally, I'd extend those classes and add an eager instantiation of all the objects I'll need at runtime: if it is big, make a beautiful loading screen...
Also, take a look at post bytecode optimizations such as apparat: http://code.google.com/p/apparat/

I don't think that extending will impact performance much. Yes, there is some cost, but it's not high as long as you don't use composition, i.e. where, instead of extending the Box2D classes directly, you create an instance of those classes and work with it inside your class. For example, this
public class Child extends b2Body {
    public function Child() {
        // do some stuff here
    }
}
instead of this
public class Child {
    private var _body:b2Body;
    public function Child() {
        ...
        _body = _world.CreateBody(...);
        ...
    }
}
I guess you know that the fewer objects you create, the better. As long as you keep the number of created instances the same, you will have the same performance.
From another point of view:
a) adding one more layer of abstraction may change Box2D a lot. If you work in a team this may be an issue, because the other developers will have to learn your naming
b) be careful about Middle Man code smell. Usually when you start wrapping already existing functionality you end up with classes which are just delegators.

Some great answers here but I'm going to throw my two cents in.
There are two different concepts you have to recognize: when you extend a class and when you implement a class.
Here is an example of extending MovieClip
public class TrickedOutClip extends MovieClip {
    private var rims:String = 'extra large';
    public function TrickedOutClip() {
        super();
    }
}
Here is an example of implementing MovieClip
public class pimpMyClip {
    private var rims:String = 'extra large';
    private var pimpedMovieClip:MovieClip;
    public function pimpMyClip() {
        pimpedMovieClip = new MovieClip();
        // MovieClip is a dynamic class, so a new property can be attached at runtime
        pimpedMovieClip.bling = rims;
    }
    public function getPimpedClip():MovieClip {
        return pimpedMovieClip;
    }
}
I think you probably do not want to extend these box2D classes but implement them. Here's a rough outline:
public class MyBox2DHelper {
    private var box2d = new Box2D(...);
    public function MyBox2DHelper(stage) {
    }
    public function makeBox2DDoSomeTrickyThing(varA:String, varB:Number) {
        // write your custom code here
    }
    public function makeBox2DDoSomethingElse(varC:MovieClip) {
        // write your custom code here
    }
}
Good luck.

Related

Writing less and keeping a good performance, is it possible?

These past few days I've been thinking of a way to avoid writing a lot of code while still keeping good performance for an AIR desktop game I'm developing as a hobby.
The game is a sort of vertical shooter that consists of several entities moving and checking collisions. There are plenty of different kinds of units. Each frame I have something like:
entity.execute();
The simpler approach is to have all the different entities inherit from the Entity class, and manually customize them all. This is slow and cumbersome to write, and hard to maintain. But it's fast, performance-wise.
The other approach is to have only one Entity class, and use some sort of composition to simply add "behaviors". So for example I have a master class with things like types of movements, attacks, etc, and the different entities use them.
The problem with that approach is that calling a function is slow. According to my tests, it is ~3 times slower than just having the code right there (inside execute()).
I'm in a dilemma, I can't find a way to reuse chunks of code to decorate generic Entity instances, and keep a good performance. Seems like I have to use one or the other.
I tried using [Inline], but I've read it's not a stable feature, and I didn't see any noticeable performance improvement, I didn't test it much though.
Any insight is appreciated.
Abstraction through inheritance is a good object oriented pattern, I'd argue that it is not slow or cumbersome to maintain. Separation of concerns would add clarity to classes that inherit your base Entity class; as well, reduce copied code. Interfaces would further abstract concrete types.
ActionScript does not support powerful object oriented language features that you might find in a language like C# - no abstract base class, no partial classes, limited template / generics, limited polymorphism. Composition and decorator patterns would likely force using dynamic classes, which would also slow down the runtime due to type checking.
Perhaps the greater issue is too much business logic in the Entity class. I would think some world container or controller would be responsible for collision detection.
Something you could consider is a physics engine like Box 2D.
There are ports of Box2D built with CrossBridge (formerly Alchemy, FlasCC), which is a C++ compiler for the AVM2, able to run Flash up to 10x faster through lean optimized bytecode that features high performance memory-access opcodes for Flash (known as Domain Memory).
This is how games like Angry Bots or Neverball are made.
Check out Jesse Sternberg's Box2d Flash Alchemy Port + World Construction Kit if using an AS3 physics engine sounds interesting.
There are some common approaches to speeding up Flash in game development. One of them is to avoid using display objects, in favour of simple bitmaps. In this case you have a stage as a bitmap, and keep all your game state in lightweight objects, and then just draw a snapshot of the game state into that stage bitmap data (with copyPixels) periodically (on enter frame, or on a timer)
schematically: say you have a game with units
class PseudoSprite {
    public var x:Number = 0;
    public var y:Number = 0;
    public var currentAnimFrame:uint;
    protected var snapshotCreator:AbstractSnapshotCreator;
    public function makeSnapshot():BitmapData {
        return snapshotCreator.createSnapshot(currentAnimFrame);
    }
    // overridden by subclasses to advance the object by one step
    public function doStep():void {
    }
    ....
}
class Unit extends PseudoSprite {
    public var directionAngle:Number = 0; // assumed to be in radians
    public var speed:uint = 0;
    function Unit() {
        snapshotCreator = UnitSnapshotCreator.instance;
    }
    override public function doStep():void {
        // move by speed along the current direction
        x += speed * Math.cos(directionAngle);
        y += speed * Math.sin(directionAngle);
        currentAnimFrame++;
    }
}
class Game {
    public var stage:Bitmap;
    private var objects:Vector.<PseudoSprite> = new <PseudoSprite>[
        new Unit(), new Unit()];
    public function step():void {
        for each (var unit:PseudoSprite in objects) {
            unit.doStep();
            // draw unit.makeSnapshot() into the stage bitmap data
        }
    }
}
so, you can see: you can build whole units (or all game objects) hierarchy using normal OOP, and get some suitable performance..
After some tests, I've found out that I can just do something like:
public var foo:Function;
and then when I create the entity I can:
entity.foo = myCustomFoo;
And then in the main loop I can:
entity.foo();
This is as performant as calling a native member function inside the Entity instance. Warning, don't create a getter to access your function, it becomes a lot slower.
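The pattern described here (one Entity class whose behaviour is a function stored in a field) looks roughly like this. It is a sketch in Python, so the performance claim above is not reproduced, only the structure; Entity, move_down and zigzag are hypothetical names. Unlike an AS3 method, a function stored on a Python instance is not bound to it, so the entity is passed explicitly.

```python
class Entity:
    """Single generic entity; its per-frame behaviour is a plain field holding a function."""
    def __init__(self, behavior):
        self.x = 0.0
        self.y = 0.0
        self.execute = behavior  # comparable to `public var foo:Function`


def move_down(entity):
    """Reusable behaviour: drift straight down."""
    entity.y += 1.0


def zigzag(entity):
    """Reusable behaviour: move diagonally."""
    entity.x += 1.0
    entity.y += 0.5


entities = [Entity(move_down), Entity(zigzag)]


def main_loop_step(all_entities):
    """One frame: every entity runs whatever behaviour it was given at creation."""
    for entity in all_entities:
        entity.execute(entity)


main_loop_step(entities)
```

Behaviours are written once and assigned at creation time, so no subclass per unit type is needed.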

Is There Any Overhead Difference Between Protobuf Runtime and Static Tag Methods?

We are using Protobuf on a .NET CF target. Everything works well. I started out using the static [ProtoContract], [ProtoMember(1)], etc. My colleague was concerned about adding potential overhead to the class object, so I switched to a runtime model with .add(# , " "), which seemed more "disconnected" from the class in question. I actually prefer the static tags in the class, since names are inherently updated if variable names are refactored later. Since I do not know how or what protobuf does under the hood, is there any advantage or disadvantage to using the static tags vs. the runtime model in terms of overhead, speed, etc.?
Thanks!
I haven't profiled this aspect extensively - mainly because any overhead from reflection on the attributes is done once and once only. There might be a slight difference in cold-start performance, but: if the ultimate in startup performance is your aim you should ideally try to use the precompiler available in the google-code download. This only works with the attribute model but has the advantage that when using a precompiled model no reflection whatsoever occurs at runtime. It will also generate pure IL, whereas CF is usually very restricted so IIRC the runtime usage is forced to use some reflection even for things like member access. Finally, it means you can use the "CoreOnly" rather than "Full" build, which is smaller and less complex.
http://marcgravell.blogspot.co.uk/2012/07/introducing-protobuf-net-precompiler.html

Magento, magic getters v getData

I have been using Magento for a while now and can never decide between using the magic getter and getData()
Can someone explain the main difference, apart from the slight performance overhead (and it must be very slight).
I am thinking in terms of:
Future code proof (I think Magento 2 will not be using magic getters)
Stylistically
Performance
Stability
Any other reasons to use 1 over the other
There is no clear way to go based on the core code as it uses a mixture of both
There's no one answer to fit all situations and it's best to decide based on the model you are using and the particular use case.
Performance is quite poor for magic methods, which also carry the extra overhead of converting from CamelCase to under_score on each access.
The magic methods are basically a wrapper for getData() anyway, with extra overhead.
There is one advantage of using magic methods though, for example:
if you use getAttributeName() rather than getData('attribute_name')
at some point in the future, the model may be updated to include a real, concrete getAttributeName() method, in which case your code will still work fine. However if you have used getData(), you access the attribute directly, and bypass the new method, which could include some important calculations which you are bypassing.
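The trade-off described above can be shown with a small analogue. This is a Python sketch, not Magento code: DataObject and Product are hypothetical classes, `__getattr__` stands in for PHP's `__call`, and the `price` rule is invented for the example.

```python
import re


def underscore(name):
    """Convert CamelCase to under_score, e.g. 'AttributeName' -> 'attribute_name'."""
    return re.sub(r"(.)([A-Z])", r"\1_\2", name).lower()


class DataObject:
    """Hypothetical model: raw data access plus magic getX() accessors."""
    def __init__(self, data=None):
        self._data = dict(data or {})

    def get_data(self, key):
        """Direct lookup: no name conversion, no dynamic dispatch."""
        return self._data.get(key)

    def __getattr__(self, name):
        # Only reached when no real attribute or method called `name` exists.
        if name.startswith("get") and len(name) > 3:
            key = underscore(name[3:])  # the conversion is paid on every magic call
            return lambda: self.get_data(key)
        raise AttributeError(name)


class Product(DataObject):
    def getPrice(self):
        """A concrete getter added later; it adds logic that raw data access skips."""
        return self.get_data("price") + 20


product = Product({"price": 100, "attribute_name": "blue"})

magic_value = product.getAttributeName()  # resolved by the magic fallback
concrete_price = product.getPrice()       # picks up the concrete method
raw_price = product.get_data("price")     # bypasses the concrete method
```

Callers that used the getter-style name pick up the new logic automatically, while callers that read the raw data keep getting the stored value.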
In my opinion, the safest way is to always use getData($key). The magic getter uses the same method as you already pointed out.
The advantage is that you can find all references to getData in your code and change it appropriately in case the getData() method is refactored. Compare that with having to find out all magic method calls where they are always named differently.
The second thing is that the magic getter can screw you up easily when you have a method which is named the same way (I think getName() got me once and it took quite some time to debug).
So my vote is definitely for using getData().
As stated before, it's best to use getData over the magic methods. Just wanted to add 2 quick points:
1) The performance overhead is not that slight, especially because of the implementation of _underscore in Varien_Object (as mentioned by Andrew).
2) The implementation of getData has some logic that helps "prettify" code. Although that form is a little slower than a plain getData call, it is still much faster than the magic methods.
If you have nested Varien_Object's so that you need to perform a call like:
$firstObject->getData('second_object')->getData('third_object')->getData('some_string');
you can also perform that call like this:
$firstObject->getData('second_object/third_object/some_string');
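The slash-separated lookup can be pictured as a loop that walks one level per path segment. Below is a minimal Python analogue under that assumption; DataObject is a hypothetical class and this does not reproduce Varien_Object's actual implementation.

```python
class DataObject:
    """Hypothetical container whose get_data() accepts 'a/b/c' style paths."""
    def __init__(self, data=None):
        self._data = dict(data or {})

    def get_data(self, key):
        """Look up `key`; each '/'-separated part descends one nested level."""
        current = self
        for part in key.split("/"):
            if isinstance(current, DataObject):
                current = current._data.get(part)
            elif isinstance(current, dict):
                current = current.get(part)
            else:
                return None  # the path runs through something that cannot be descended into
        return current


third = DataObject({"some_string": "hello"})
second = DataObject({"third_object": third})
first = DataObject({"second_object": second})

chained = first.get_data("second_object").get_data("third_object").get_data("some_string")
by_path = first.get_data("second_object/third_object/some_string")
```

Both forms return the same value; the path form also returns None instead of failing when an intermediate key is missing.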

Is the DI pattern limiting wrt expensive object creation coupled with infrequent dependency usage?

I'm having a hard time getting my head around what seems like an obvious pattern problem/limitation when it comes to typical constructor dependency injection. For example purposes, let's say I have an ASP.NET MVC3 controller that looks like:
Public Class MyController
    Inherits Controller

    Private ReadOnly mServiceA As IServiceA
    Private ReadOnly mServiceB As IServiceB
    Private ReadOnly mServiceC As IServiceC

    Public Sub New(serviceA As IServiceA, serviceB As IServiceB, serviceC As IServiceC)
        Me.mServiceA = serviceA
        Me.mServiceB = serviceB
        Me.mServiceC = serviceC
    End Sub

    Public Function ActionA() As ActionResult
        ' Do something with Me.mServiceA and Me.mServiceB
    End Function

    Public Function ActionB() As ActionResult
        ' Do something with Me.mServiceB and Me.mServiceC
    End Function
End Class
The thing I'm having a hard time getting over is the fact that the DI container was asked to instantiate all three dependencies when at any given time only a subset of the dependencies may be required by the action methods on this controller.
It seems to be assumed that object construction is dirt-cheap and has no side effects, OR that all dependencies are consistently utilized. What if object construction wasn't cheap or there were side effects? For example, if constructing IServiceA involved opening a connection or allocating other significant resources, then that would be completely wasted time/resources when ActionB is called.
If these action methods used a service location pattern (or other similar pattern), then there would never be the chance to unnecessarily construct an object instance that will go unused, of course using this pattern has other issues attached making it unattractive.
Does using the canonical constructor injection + interfaces pattern of DI basically lock the developer into a "limitation" of sorts, namely that implementations of the dependency must be cheap to instantiate or the instance must be significantly utilized? I know all patterns have their pros and cons; is this just one of DI's cons? I've never seen it mentioned before, which I find curious.
If you have a lot of fields that aren't being used by every member this means that the class' cohesion is low. This is a general programming concept - Constructor Injection just makes it more visible. It's usually a pretty good indicator that the Single Responsibility Principle is being violated.
If that's the case then refactor (e.g. to Facade Services).
You don't have to worry about performance when creating object graphs.
When it comes to side effects, (DI) constructors should be simple and not have side effects.
Generally speaking, there should be no major costs or side effects of object construction. This is a general statement that I believe applies to most (not all) objects, but is especially true for services that you would inject via DI. In other words, if constructing a service class automatically makes a database/service call, or changes the state of your system in a way that would have side effects, that is (at least) a code smell.
Regarding instances that go unused: it's hard to create a system that has perfect utilization of instances within dependent classes, regardless of whether you use DI or not. I'm not sure achieving this is very important, as long as you are adhering to the Single Responsibility Principle. If you find that your class has too many services injected, or that utilization is really uneven, it might be a sign that your class is doing too much and needs to be split into two or more smaller classes with more focused responsibilities.
No, you are not tied to the limitations you have listed. As of .NET 4 you have Lazy(Of T) at your disposal, which will allow you to defer instantiation of your dependencies until required.
It is not assumed that object construction is dirt-cheap and consequently some DI containers support Lazy(Of T) out of the box. Whilst Unity 2.0 supports lazy initialization out of the box through automatic factories, there is a good article here on an extension supporting Lazy(Of T) the author has on MSDN.
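The idea behind injecting a lazy wrapper instead of the service itself can be sketched as follows. This is Python, not .NET: the Lazy class here is a hypothetical, simplified imitation of the concept (it is not thread-safe and is not the framework type), and the service and controller names are invented for the example.

```python
class Lazy:
    """Defers calling `factory` until `.value` is first read, then caches the result."""
    _UNSET = object()

    def __init__(self, factory):
        self._factory = factory
        self._value = Lazy._UNSET

    @property
    def value(self):
        if self._value is Lazy._UNSET:
            self._value = self._factory()
        return self._value


class ExpensiveServiceA:
    """Stand-in for a dependency that is costly to construct."""
    created = 0  # counts constructions so the deferral is observable

    def __init__(self):
        ExpensiveServiceA.created += 1

    def run(self):
        return "A"


class ServiceB:
    def run(self):
        return "B"


class MyController:
    def __init__(self, service_a, service_b):
        self._service_a = service_a  # a Lazy wrapper, not the service itself
        self._service_b = service_b

    def action_a(self):
        return self._service_a.value.run() + self._service_b.run()

    def action_b(self):
        # Never touches service A, so A is never constructed on this path.
        return self._service_b.run()


controller = MyController(Lazy(ExpensiveServiceA), ServiceB())
```

The constructor signature still declares every dependency, but the expensive one is only built if an action actually reads it.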
Isn't your controller a singleton though? That is the normal way to do it in Java. There is only one instance created. Also, you could split the controller into multiple controllers if the roles of the actions are so distinct.

TDD and DI: dependency injections becoming cumbersome

C#, NUnit, and Rhino Mocks, if that turns out to be applicable.
My quest with TDD continues as I attempt to wrap tests around a complicated function. Let's say I'm coding a form that, when saved, has to also save dependent objects within the form...answers to form questions, attachments if available, and "log" entries (such as "blahblah updated the form." or "blahblah attached a file."). This save function also fires off emails to various people depending on how the state of the form changed during the save function.
This means in order to fully test out the form's save function with all of its dependencies, I have to inject five or six data providers to test out this one function and make sure everything fired off in the right way and order. This is cumbersome when writing the multiple chained constructors for the form object to insert the mocked providers. I think I'm missing something, either in the way of refactoring or simply a better way to set the mocked data providers.
Should I further study refactoring methods to see how this function can be simplified? How's the observer pattern sound, so that the dependent objects detect when the parent form is saved and handle themselves? I know that people say to split out the function so it can be tested...meaning I test out the individual save functions of each dependent object, but not the save function of the form itself, which dictates how each should save themselves in the first place?
First, if you are following TDD, then you don't wrap tests around a complicated function. You wrap the function around your tests. Actually, even that's not right. You interweave your tests and functions, writing both at almost exactly the same time, with the tests just a little ahead of the functions. See The Three Laws of TDD.
When you follow these three laws, and are diligent about refactoring, then you never wind up with "a complicated function". Rather you wind up with many, tested, simple functions.
Now, on to your point. If you already have "a complicated function" and you want to wrap tests around it then you should:
Add your mocks explicitly, instead of through DI. (e.g. something horrible like a 'test' flag and an 'if' statement that selects the mocks instead of the real objects).
Write a few tests in order to cover the basic operation of the component.
Refactor mercilessly, breaking up the complicated function into many little simple functions, while running your cobbled together tests as often as possible.
Push the 'test' flag as high as possible. As you refactor, pass your data sources down to the small simple functions. Don't let the 'test' flag infect any but the topmost function.
Rewrite tests. As you refactor, rewrite as many tests as possible to call the simple little functions instead of the big top-level function. You can pass your mocks into the simple functions from your tests.
Get rid of the 'test' flag and determine how much DI you really need. Since you have tests written at the lower levels that can insert mocks through arguments, you probably don't need to mock out many data sources at the top level anymore.
If, after all this, the DI is still cumbersome, then think about injecting a single object that holds references to all your data sources. It's always easier to inject one thing rather than many.
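The last step above, injecting one object that holds all the data sources, might look roughly like the sketch below. Every name here (FormDataSources, IAnswerStore, ILogStore, Form and the recording fakes) is hypothetical and made up for illustration. The fakes are hand-written so the example does not depend on any mocking library.

```csharp
using System.Collections.Generic;

// Hypothetical data-source interfaces standing in for the real providers.
public interface IAnswerStore { void Save(string answer); }
public interface ILogStore { void Add(string entry); }

// The single injected object: it only holds references to the data sources.
public class FormDataSources
{
    public IAnswerStore Answers { get; set; }
    public ILogStore Log { get; set; }
}

public class Form
{
    private readonly FormDataSources _sources;

    // One constructor parameter instead of one per provider.
    public Form(FormDataSources sources) { _sources = sources; }

    public void Save(string answer)
    {
        _sources.Answers.Save(answer);
        _sources.Log.Add("saved: " + answer);
    }
}

// Hand-written fakes that record what they were asked to do.
public class RecordingAnswerStore : IAnswerStore
{
    public readonly List<string> Saved = new List<string>();
    public void Save(string answer) { Saved.Add(answer); }
}

public class RecordingLogStore : ILogStore
{
    public readonly List<string> Entries = new List<string>();
    public void Add(string entry) { Entries.Add(entry); }
}
```

A test would fill a FormDataSources with the two fakes, pass it to new Form(...), call Save, and then inspect Saved and Entries. Adding a new provider later changes FormDataSources but not the Form constructor signature.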
Use an AutoMocking container. There is one written for RhinoMocks.
Imagine you have a class with a lot of dependencies injected via constructor injection. Here's what it looks like to set it up with RhinoMocks, no AutoMocking container:
    private MockRepository _mocks;
    private BroadcastListViewPresenter _presenter;
    private IBroadcastListView _view;
    private IAddNewBroadcastEventBroker _addNewBroadcastEventBroker;
    private IBroadcastService _broadcastService;
    private IChannelService _channelService;
    private IDeviceService _deviceService;
    private IDialogFactory _dialogFactory;
    private IMessageBoxService _messageBoxService;
    private ITouchScreenService _touchScreenService;
    private IDeviceBroadcastFactory _deviceBroadcastFactory;
    private IFileBroadcastFactory _fileBroadcastFactory;
    private IBroadcastServiceCallback _broadcastServiceCallback;
    private IChannelServiceCallback _channelServiceCallback;

    [SetUp]
    public void SetUp()
    {
        _mocks = new MockRepository();
        _view = _mocks.DynamicMock<IBroadcastListView>();
        _addNewBroadcastEventBroker = _mocks.DynamicMock<IAddNewBroadcastEventBroker>();
        _broadcastService = _mocks.DynamicMock<IBroadcastService>();
        _channelService = _mocks.DynamicMock<IChannelService>();
        _deviceService = _mocks.DynamicMock<IDeviceService>();
        _dialogFactory = _mocks.DynamicMock<IDialogFactory>();
        _messageBoxService = _mocks.DynamicMock<IMessageBoxService>();
        _touchScreenService = _mocks.DynamicMock<ITouchScreenService>();
        _deviceBroadcastFactory = _mocks.DynamicMock<IDeviceBroadcastFactory>();
        _fileBroadcastFactory = _mocks.DynamicMock<IFileBroadcastFactory>();
        _broadcastServiceCallback = _mocks.DynamicMock<IBroadcastServiceCallback>();
        _channelServiceCallback = _mocks.DynamicMock<IChannelServiceCallback>();
        _presenter = new BroadcastListViewPresenter(
            _addNewBroadcastEventBroker,
            _broadcastService,
            _channelService,
            _deviceService,
            _dialogFactory,
            _messageBoxService,
            _touchScreenService,
            _deviceBroadcastFactory,
            _fileBroadcastFactory,
            _broadcastServiceCallback,
            _channelServiceCallback);
        _presenter.View = _view;
    }
Now, here's the same thing with an AutoMocking container:
    private MockRepository _mocks;
    private AutoMockingContainer _container;
    private BroadcastListViewPresenter _presenter;
    private IBroadcastListView _view;

    [SetUp]
    public void SetUp()
    {
        _mocks = new MockRepository();
        _container = new AutoMockingContainer(_mocks);
        _container.Initialize();
        _view = _mocks.DynamicMock<IBroadcastListView>();
        _presenter = _container.Create<BroadcastListViewPresenter>();
        _presenter.View = _view;
    }
Easier, yes?
The AutoMocking container automatically creates mocks for every dependency in the constructor, and you can access them for testing like so:
    using (_mocks.Record())
    {
        _container.Get<IChannelService>().Expect(cs => cs.ChannelIsBroadcasting(channel)).Return(false);
        _container.Get<IBroadcastService>().Expect(bs => bs.Start(8));
    }
Hope that helps. I know my testing life has been made a whole lot easier with the advent of the AutoMocking container.
You're right that it can be cumbersome.
Proponents of the mocking methodology would point out that the code is written improperly to begin with. That is, you shouldn't be constructing dependent objects inside this method. Rather, the injection APIs should have functions that create the appropriate objects.
As for mocking up 6 different objects, that's true. However, if you also were unit-testing those systems, those objects should already have mocking infrastructure you can use.
Finally, use a mocking framework that does some of the work for you.
I don't have your code, but my first reaction is that your test is trying to tell you that your object has too many collaborators. In cases like this, I always find that there's a missing construct in there that should be packaged up into a higher level structure. Using an automocking container is just muzzling the feedback you're getting from your tests. See http://www.mockobjects.com/2007/04/test-smell-bloated-constructor.html for a longer discussion.
In this context, I usually find statements along the lines of "this indicates that your object has too many dependencies" or "your object has too many collaborators" to be a fairly specious claim. Of course an MVC controller or a form is going to be calling lots of different services and objects to fulfill its duties; it is, after all, sitting at the top layer of the application. You can smoosh some of these dependencies together into higher-level objects (say, a ShippingMethodRepository and a TransitTimeCalculator get combined into a ShippingRateFinder), but this only goes so far, especially for these top-level, presentation-oriented objects. That's one less object to mock, but you've just obfuscated the actual dependencies via one layer of indirection, not actually removed them.
One blasphemous piece of advice is to say that if you are dependency injecting an object and creating an interface for it that is quite unlikely to ever change (Are you really going to drop in a new MessageBoxService while changing your code? Really?), then don't bother. That dependency is part of the expected behavior of the object and you should just test them together since the integration test is where the real business value lies.
The other blasphemous piece of advice is that I usually see little utility in unit testing MVC controllers or Windows Forms. Every time I see someone mocking the HttpContext and testing to see if a cookie was set, I want to scream. Who cares if the AccountController set a cookie? I don't. The cookie has nothing to do with treating the controller as a black box; an integration test is what is needed to test its functionality (hmm, a call to PrivilegedArea() failed after Login() in the integration test). This way, you avoid invalidating a million useless unit tests if the format of the login cookie ever changes.
Save the unit tests for the object model, save the integration tests for the presentation layer, and avoid mock objects when possible. If mocking a particular dependency is hard, it's time to be pragmatic: just don't do the unit test and write an integration test instead and stop wasting your time.
The simple answer is that the code you are trying to test is doing too much. I think sticking to the Single Responsibility Principle might help.
The Save button method should only contain top-level calls that delegate work to other objects. These objects can then be abstracted through interfaces. Then, when you test the Save button method, you only test its interaction with the mocked objects.
The next step is to write tests for these lower-level classes, but things should get easier since you only test them in isolation. If you need complex test setup code, that is a good indicator of a bad design (or a bad testing approach).
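To make the delegation idea concrete, here is a small sketch under assumed names: SaveFormCommand, IFormRepository and INotifier are hypothetical and do not come from the question. The fakes write to a shared list so that a test can check both which calls happened and in what order.

```csharp
using System.Collections.Generic;

// Hypothetical collaborators, each hidden behind an interface.
public interface IFormRepository { void Save(string formId); }
public interface INotifier { void NotifySaved(string formId); }

// The top-level save does nothing except delegate, in a fixed order.
public class SaveFormCommand
{
    private readonly IFormRepository _repository;
    private readonly INotifier _notifier;

    public SaveFormCommand(IFormRepository repository, INotifier notifier)
    {
        _repository = repository;
        _notifier = notifier;
    }

    public void Execute(string formId)
    {
        _repository.Save(formId);        // persist first
        _notifier.NotifySaved(formId);   // then tell interested parties
    }
}

// Hand-written fakes that append to a shared call log.
public class FakeRepository : IFormRepository
{
    private readonly List<string> _calls;
    public FakeRepository(List<string> calls) { _calls = calls; }
    public void Save(string formId) { _calls.Add("save:" + formId); }
}

public class FakeNotifier : INotifier
{
    private readonly List<string> _calls;
    public FakeNotifier(List<string> calls) { _calls = calls; }
    public void NotifySaved(string formId) { _calls.Add("notify:" + formId); }
}
```

A test for Execute then only asserts on the call log. How the repository persists data or how the notifier sends email is covered by separate tests of the real implementations.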
Recommended reading:
Clean Code: A Handbook of Agile Software Craftsmanship
Google's guide to writing testable code
Constructor DI isn't the only way to do DI. Since you're using C#, if your constructor does no significant work you could use Property DI. That simplifies your object's constructors considerably, at the expense of added complexity in your function: before it begins work, it must check each dependent property for null and throw an InvalidOperationException if one is missing.
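A minimal sketch of property injection with the guard described above might look like this. FormSaver, IMailer and CountingMailer are hypothetical names, and the recipient address is a placeholder.

```csharp
using System;

// Hypothetical dependency.
public interface IMailer { void Send(string to); }

public class FormSaver
{
    // Injected through a property rather than the constructor.
    public IMailer Mailer { get; set; }

    public void Save()
    {
        // Guard: fail fast if the dependency was never supplied.
        if (Mailer == null)
            throw new InvalidOperationException("Mailer must be set before calling Save.");

        Mailer.Send("owner@example.com");
    }
}

// Hand-written fake used by a test to observe the call.
public class CountingMailer : IMailer
{
    public int Sent;
    public void Send(string to) { Sent++; }
}
```

A test sets only the properties it cares about, for example new FormSaver { Mailer = new CountingMailer() }. The trade-off is that a forgotten dependency surfaces at call time rather than at construction time.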
When something is hard to test, it is usually a symptom of poor code quality: the code is not testable (mentioned in this podcast, IIRC). The recommendation is to refactor the code so that it becomes easy to test. Some heuristics for deciding how to split the code into classes are the SRP and OCP. For more specific instructions, it would be necessary to see the code in question.

Resources