All the projects I work interface to a piece of hardware and this is often the main purpose of the software. Are there any effective ways I can apply TDD to the code that works with the hardware?
Update: Sorry for not being clearer with my question.
The hardware I use is a frame grabber that capture images from a camera. I then process these images, display them and save them to disk. I can simulate all the processing that takes place after the images are captured by using previously captured images that are stored on disk.
But it's the actual interaction with the hardware that I want to test. For instance does my software cope correctly when there isn't a camera attached, does it properly start and stop grabbing etc. But this is so tied into the hardware I don't know how to test it when the hardware isn't present or if I should even be trying to do this?
2nd Update: I'm also looking for some concrete examples of exactly how people have dealt this situation.
Create a thin layer for controlling the hardware, and use system tests (manual or automatic) with the full hardware to make sure that the control layer works as expected. Then create a fake/mock implementation of the control layer, that behaves externally like the interface to the real hardware, and use it when doing TDD for the rest of the program.
Years ago, I was writing software for taking measurements with a SQUID magnetometer. The hardware was big, unmovable and expensive (video), so it was not possible to always have access to the hardware. We had documentation about the communication protocol with the devices (through serial ports), but the documentation was not 100% accurate.
What helped us very much was creating a software which listens to the data coming from one serial port, logs it and redirects it to another serial port. Then we were able to find out how the old program (which we were replacing) communicated with the hardware, and reverse engineer the protocol. They were chained like this: Old Program <-> Virtual Loopback Serial Port <-> Our Data Logger <-> Real Serial Port <-> Hardware.
Back then we did not use TDD. We did consider writing an emulator for the hardware, so that we could test the program in isolation, but since we did not know exactly how the hardware was supposed to work, it was hard to write an accurate emulator so in the end we did not do it. If we had known the hardware better, we could have created an emulator for it, and it would have made developing the program much easier. Testing with the real hardware was most valuable, and in hindsight we should have spent even more time testing with the hardware.
Split your test suite into two parts:
The first part runs tests against the real hardware. This part is used to build the mockups. By writing automatic tests for this, you can run them again if you have any doubts whether your mockups work correctly.
The second part run against the mockups. This part runs automatically.
Part #1 gets run manually after you made sure the hardware is wired up correctly, etc. A good idea is to create a suite of tests which run against something returned by the factory and run these tests twice: Once with a factory that returns the real "driver" and once against the factory of your mock objects. This way, you can be sure that your mocks work exactly as the real thing:
class YourTests extends TestCase {
public IDriver getDriver() { return new MockDriver (); }
public boolean shouldRun () { return true; }
public void testSomeMethod() throws Exception {
if (!shouldRun()) return; // Allows to disable all tests
assertEquals ("1", getDriver().someMethod());
}
}
In my code, I usually use a system property (-Dmanual=yes) to toggle the manual tests:
class HardwareTests extends YourTests {
public IDriver getDriver() { return new HardwareDriver (); }
public boolean shouldRun () { return "yes".equals (System.getProperty("manual")); }
}
If you are writing software to manipulate data comming out of a specialized piece of hardware, then you could reasonably create stand-ins for the hardware to test the software.
If the hardware interface is something simple like a serial port, you could easily use a loop-back cable to have your program talk to the mock hardware. I used this approach some years ago when writing software to talk to a credit processor. My test app was given to think that that my simulator was a modem and a back-end processor.
If you are writing PCI device drivers or equivalent level software, then you probably can't create a software stand-in.
The only good way to apply TDD to such issues is if you are able to spoof the hardware's i/o with another program. For instance, I work with credit card handling for gas stations. On my current project we have a simulator that is the pump electronics hooked to some switches such that operation of a pump (lift handle, squeeze trigger, fuel flow) can be simulated. It's quite conceivable that we could have a simulator built that was controllable by software.
Alternately, depending on the device, you might be able to use standard test equipment (signal generators, etc) to feed it 'known inputs'.
Note that this has the problem that you are testing both the hardware and the device drivers together. Unfortunately, that's really the only good choice you have at this stage - Any simulated hardware is likely to be different enough from the real equipment that it's going to be useless to test with.
It's probably not a good idea to include tests that access the hardware in your test suite. One of the problems with this approach is that the tests will only be able to run on a machine that is connected to this special piece of hardware, which makes it difficult to run the tests say as part of a (nightly) automatic build process.
One solution could be to write some software modules that behave like the missing hardware modules, at least from the interface point of view. When running your test suite, access these software modules instead of the real hardware.
I also like the idea of splitting the test suite into two parts:
one that accesses the real hardware, which you run manually
one that accesses the software modules, which runs as part of the automatic testing
In my experience, tests that involve real hardware almost always require some amount of manual interaction (e.g. plug something in and out to see if it's correctly detected), which makes it very hard to automate. The benefits are often just not worth the trouble.
When I was working on set-top boxes we had a tool that would generate mocks from any C API with doxygen comments.
We'd then prime the mocks with what we wanted the hardware to return in order to unit-test our components.
So in the example above you'd set the result of FrameGrabber_API_IsDeviceAttached to be false, and when your code calls that function it returns false and you can test it.
How easy it will be to test depends on what your code is currently structured like.
The tool we used to generate the mocks was in house, so I can't help you with that. But there are some hopeful google hit:
(disclaimer - I've used any of these, but hopefully they can be of help to you)
http://code.google.com/p/googlemock/
http://www.opensourcetesting.org/unit_c.php
http://spin.atomicobject.com/2006/12/29/cmock-ruby-based-mock-tools-for-c
Just checking - do you have something like direct ioctl calls in your code or something? Those are always hard to mock up. We had a OS wrapper layer that we could easily write mocks for so it was pretty easy for us.
If you have a simulator, you could write tests against the simulator and run these tests against the hardware.
It's hard to answer the questions with so little detail :-)
Related
I would say that all three are the same, but I wonder if there is small differences between them. In the end, what I think is that you are testing user scenarios on all of them.
UI testing: user interface testing. In other words, you have to make sure that all buttons, fields, labels and other elements on the screen work as assumed in a specification.
GUI testing: graphical user interface. You have to make sure that all elements on the screen work as mentioned in a specification and also color, font, element size and other similar stuff match design.
Functional testing: the process of quality assurance of a product that assumes the testing of the functions/functionalities of component or system in general, according to specification requirements.
E2E testing: it needs for identifying system dependencies and ensuring that the right information is passed through multiple components and systems.
Please make yourself familiar with Hermetic Testing.
You have two ways to access systems in your test:
You have a local service. For example an in memory database instead of the real database
You mock the system.
For me UI-tests work like in above picture: All tests use local resources. They are hermetic.
But End-to-end Tests involve other systems. Example: Your SUT (system under test) creates an email. You want to be sure that this email gets send to a server and later arrives in an Inbox. For me this contradicts with "separation of concerns". This mixes two distinct topics. First: Your application creates an email and sends it to an server. This could be handled with a mocked mail server. But end-to-end tests mix it with a second concern: You want the mail server to be alive and receive and forward mails correctly. This is not software testing, this is monitoring.
My advice: Do hermetic UI-Testing of code and do check/monitor your production system. But don't mix both concepts. I think for small environments end-to-end-tests are not needed.
I don't think that functional testing is the same as UI/GUI testing at all. consider that we talk about a mechanical domain or another which is not software; for me the functional testing, test the function;e.g. if you click on the hard button of your microwave, it should start working. Now if instead of the buttons, your microwave has a touch screen and an OS to manage the screen, and you click on the soft button,this soft button should drive the hard button in order that he microwave functions. So for me, functional testing means testing the microwave using the hard button, but UI testing means testing the Microwave using the soft button and since soft button drives the hard button, by testing the UI, you ALSO do functional testing.
Does it make sense to oy?
End to end testing means exercising an application from the outer boundaries to verify its behavior. This far I've only done written tests for a single executable artifact. How should I test systems made up of multiple artifacts that is deployed on different hosts?
I see two alternatives.
The tests set up the whole system and exercise it from the very outer edges.
Each artifact is end to end tested in isolation, relying on the test content to enforce the protocol between them.
Is there a clear case for only adhering to one of these, or are one of them preferred, or are they interchangeable? If interchangeable, then what are some advantages and disadvantages between them?
Even though I think it depends on the context, I prefer the first alternative. Here are my random thoughts:
I like my tests to be as closely mapped to use cases as possible (BDD style) (with the disclaimer that I misuse the term use case). These use cases may span several applications and sub-systems.
Example: A back office administrator can view a transaction made by a user from the public interface.
Here, the back office admin interface and the public interface are different applications, but they are included in the same use case.
Mapping these thoughts to your problem where you have sub-systems deployed on different hosts, I would say it depends on how it is used, from the user/actor perspective. Do the use cases span several sub-systems?
Also, perhaps the fact that the system is deployed on several hosts isn't important to the tests. You could replace the inter-process communication with method calls in your tests and have the whole system within the same process during tests, reducing the complexity. Supplement this with tests that only verify the inter-process communication.
Edit:
I realise that I forgot to include why I prefer to test the whole system.
Your asset is features, that is, behaviour, while the code is a liability. Therefore you'd like to test the behaviour, not the code (BDD style).
If you are testing each sub-system separately you are testing the code, not the features. Why? When you divided your system into sub-systems you did so based on some technical reasons. When you learn more you might discover that the chosen seam is sub-optimal and would like to move some responsibility from one sub-system to another. And you would have to modify test and production code at the same time, leaving you without a safety net. That's a typical symptom of testing implementation details.
That said, these kind of tests are too blunt to test everything. So you need to have complementary tests for details as well, where necessary.
Testing each artifact end-to-end separately would be highly desiderable in any case. This will ensure that every artifact is sound.
In addition, you might want to test a composition of artifacts. That would catch problems in the interactions between artifacts. I don't know about your situation, but one thing that is important to have is a test environment that is a copy of production. Testing the system in the test environment is a very good idea. You might also want to test the system in the production environment; this might be feasible or not. For instance, if your system processes credit card payments, you may want to avoid test payments on the production system.
In any case, testing each system separately is imho more important than testing the composition. Once you know that your artifacts are sound in isolation, catching interaction tests will be much easier. If you only have the end-to-end test of the whole system, it's much more difficult to understand where is the error when the tests fail.
Anyone can point out the difference between virtual user and real user?
In the context of web load testing, there are a lot of differences. A virtual user is a simulation of human using a browser to perform some actions on a website. One company offers what they call "real browser users", but they, too, are simulations - just at a different layer (browser vs HTTP). I'm going to assume you are using "real users" to refer to humans.
Using humans to conduct a load test has a few advantages, but is fraught with difficulties. The primary advantage is that there are real humans using real browsers - which means that, if they are following the scripts precisely, there is virtually no difference between a simulation and real traffic. The list of difficulties, however, is long: First, it is expensive. The process does not scale well beyond a few dozen users in a limited number of locations. Humans may not follow the script precisely...and you may not be able to tell if they did. The test is likely not perfectly repeatable. It is difficult to collect, integrate and analyze metrics from real browsers. I could go on...
Testing tools which use virtual users to simulate real users do not have any of those disadvantages - as they are engineered for this task. However, depending on the tool, they may not perform a perfect simulation. Most load testing tools work at the HTTP layer - simulating the HTTP messages passed between the browser and server. If the simulation of these messages is perfect, then the server cannot tell the difference between real and simulated users...and thus the test results are more valid. The more complex the application is, particularly in the use of javascript/AJAX, the harder it is to make a perfect simulation. The capabilities of tools in this regard varies widely.
There is a small group of testing tools that actually run real browsers and simulate the user by pushing simulated mouse and keyboard events to the browser. These tools are more likely to simulate the HTTP messages perfectly, but they have their own set of problems. Most are limited to working with only a single browser (i.e. Firefox). It can be hard to get good metrics out of real browsers. This approach is far more scalable better than using humans, but not nearly as scalable as HTTP-layer simulation. For sites that need to test <10k users, though, the web-based solutions using this approach can provide the required capacity.
There is a difference.
Depends on your jmeter testing, if you are doing from a single box, your IO is limited. You cant imitate lets say 10K users with jmeter in single box. You can do small tests with one box. If you use multiple jmeter boxes that s another story.
Also, how about the cookies, do you store cookies while load testing your app? that does make a difference
A virtual user is an automatic emulation of a real users browser and http requests.
Thus the virtual users is designed to simulate a real user. It is also possible to configure virtual users to run through what we think a real users would do, but without all the delay between getting a page and submitting a new one.
This allows us to simulate a much higher load on our server.
The real key differences between virtual user simulations and real users is are the network between the server and thier device as well as the actual actions a real user performs on the website.
the following tries to get an idea of which technologies would be suitable for a specific (as outlined) distributed/RPC problem. If something is not clear, am am very happy to add more details, but please request these in a comment and not in an "answer". Thanks.
First I will describe the current situation, and then follows what we want to achieve and the actual question. Despite this being a rather long post to get some context, the question itself is rather short (see at the end).
The Application Split challenge
Application description:
The app allows the user to configure a number of hardware devices(*)
and then communicate with these to control and collect measurement
channels of a physical experiment.
(*) Hardware devices include temperature sensors, pressure sensors,
motors, ... Communication ranges from serial port communication,
TCP/UDP communication to interfacing with the drivers of 3rd party
plugin cards.
Control involves sending commands to the various hardware devices
to configure them according to the protocols they support.
Measuring involves getting the data from (some of) these devices.
We are hard pressed to keep the whole thing running as customers
demand more and more channels at higher sample rates and we have to
keep up with writing the data+timestamps we get from all devices to
disk, display a subset of the data and still keep the system
responding properly.
Current situation:
[ DisplayAndControl.exe ]
|| /\
|| DLL Interface ||
|| || Window Messages (SendMessage, PostMessage)
|| ||
\/ ||
[ ChannelManager.dll ]
ChannelManager.dll (Native C++ DLL on Windows)
Manages n data channels (physical measurement variables)
Each channel holds a shifting arbitrary number of samples with
high-precision timestamps
Allows to group channels and write their ongoing updates or
historical values ("measurement") to disk
Calculations with channels (arithmetic, integration, mean
values, etc.)
Interfaces with (realtime) hardware devices to get the timestamps
and values of channels
Get value+timestamp from hardware and save in internal
ring buffer for channel
DisplayAndControl.exe (Native C++ MFC App on Windows)
Control the functions of ChannelManager.dll (configure channels
and HW devices)
Live display current values/timestamps/changes of all channels
Graph values of (groups of) channels in diagrams
print diagrams and tables of channel values
Summary of current situation:
The application as it is at the moment is already somewhat modular
in that the (main) executable does the display+interaction and the
(one of several) DLL does the data management (saving of live data
to disk, communication with devices, etc.)
From a performance POV, communication btw. the display module and
the data management module is optimally performant at the moment.
New situation:
[ DisplayAndControl.exe ]
|| /\
|| ? RPC/Messaging ||
|| || ? RPC/Messaging
|| ||
\/ ||
[ ChannelManager.exe (same PC or another) ]
Summary of the envisioned new situation:
For usability, performance and safety reasons, we wish to split up
this Windows app into two separate applications, so that the
performance (and safety) sensitive ChannelManager module can run as
a separate process possibly on a separate Windows PC.
Additionally, since we're already going to split this, we will
allow for multiple DisplayAndControl.exe apps connected to one
single ChannelManager.exe.
One QUESTION now is what technology we should use to facilitate the
communication btw. the now two (or, rather, 1 : small_n) applications.
Performance is important, because a lot of data travels btw. the
two applications and latency should be kept to a minimum. It "only"
needs to work on Windows, but it should be usable from native C++
only which makes all purely .NET based technologies unattractive.
(Note: Porting parts of DisplayAndControl.exe to .NET/WPF is
planned, but ChannelManager.exe should stay pure native, as we
don't want any .NET stuff running inside this process.)
Regarding latency: It is important that we achieve some level of
soft-realtime in the sense that small latency is acceptable, but
large and especially varying latency is not acceptable for usability
and safety reasons. Therefore any protocol that would help in
getting some sort of (soft) realtime behavior would be preferred.
RPC technologies we've looked at:
WCF (or .NET remoting) - Is dotnet only, therefore not
attractive. Performance figures are also not very good.
(D)COM - COM is great for Windows RPC communication, but it
breaks down once you have to have inter-PC comm because it is
horrible to get the security settings working in a corporate IT
network.
CORBA - We have had good experience with CORBA communications in
the past. The communication is easy to get working; there's not
much infrastructure overhead; it works well from C++; writing
a .NET wrapper is pretty trivial. The problem with CORBA is that
it's somewhat complicated to use correctly in C++ (people will use
a lot of time on chasing memory leaks, esp. inexperienced C++
devs). It also will be a learning curve for every developer and
every new developer, as no one expects people to "know" CORBA
nowadays. Also, it might not perform as well as we'd like it to and
as far as I know there's no readily available realtime support.
Thrift - still looks half-baked to use in our scenario.
ICE (from ZeroC) - I would prefer ICE over CORBA anytime, after all
it promises to be a "better CORBA" and I think it does deliver on
that. However, their licensing policy is very suboptimal as they do not sell development licenses but only license per
installation. (Well that's what they told us last time we asked end of 2009.) Their licensing policy also suggests that any 3rd party possibly interested in interfacing with our modules would first have to negotiate a license contract with ZeroC too.
Open MPI - Message Passing interface seems to be targeted at
scenarios with lots of clients "heavily" distributed. Doesn't seem
to fit our problem.
Writing our own communication layer using TCP/UDP - Oh my. I'd
rather not :-)
Google Protocol Buffers - Is not an RPC technology.
Distributed Shared Memory - Well. This got thrown in by a few
devs and I for one am neither sure if there's a working
implementation nor if it fit's our problem.
So again the QUESTION - what "RPC"-like technology would you prefer
in this situation and why?
I can elaborate on Johnny's answer. CORBA provides a robust infrastructure with services that go far beyond simple RPC. As your distributed application grows, you can use CORBA features to manage the mapping between interface and implementation, to provide secure connections, etc. As an RPC, CORBA provides the means for easy synchronous or asynchronous invocations.
The learning curve isn't that steep either. While some of the terms are a little arcane, the concepts such as managed (counted) references should be familiar to today's C++ programmers. And when the C++0x mapping is available, it will be even easier. Training is available to help make this transition even easier.
You mentioned not knowing about realtime support. In fact, CORBA for C++ has rich RT support. There is a RT CORBA specification and several C++ ORBs that implement it. TAO, which is open source and commercially supported, has extensive RT support, including the RT_ORB, RT_POA, an TAO-Specific RT Event service. With these tools you are able to designate priority levels for threads in the ORB, and have separate communication channels for different priority levels.
I'd suggest taking a look at Thrift. While it looks half-baked, I believe it's only the documentation that's lacking - the implementation is quite solid.
CORBA should perform well and there are people with experience. We realize that the IDL to C++ mapping is hard to use, there is a RFP from the OMG asking for a new IDL to C++0x mapping, that should make it much easier to use
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I have a good grasp of unit testing, DI, mocks, and all the design principal goodness required to have as close to full code coverage as humanly possible (single responsibility principal, think 'how will i test this' as I code, etc...).
My most recent app, I did not code doing true TDD. I kept unit-testing in mind as I coded, and wrote my tests after writing the code, refactoring, etc.. I did TDD when it was 'easy' to do... however I did not have as good of a grasp as I do now... That was the first project I made full use of DI, mocking frameworks, etc, and the first which had full code coverage - and I learned a lot from it as I went along. I'm itching to get assigned to my next project so I can code it completely doing TDD from scratch.
I know this is a broad question, and I've already ordered TDD by example and XP Unleashed, but I'm hoping for a brief overview of how you all design / write a large application doing TDD.
Do you write the entire application, using nothing but stubbed out code? (e.g., write all the function signatures, interfaces, structures, and write the entire application but without writing any actual implementation)? I could picture it working on small-mid sized, but is this even possible on large applications?
If not, how the heck would you write your first unit test for the highest level function in your system? Lets say for example - on a web service where you have a function called DoSomethingComplicated(param1,...,param6) exposed to the world. Obviously, writing the test first for a simple function like AddNumbers() is trivial - but when the function is at the top of the call stack such as this?
Do you still do design up-front? Obviously you still want to do 'architecture' design - e.g., a flow chart showing IE talking to IIS which talks to a windows service via WCF which talks to the SQL Database... an ERD which shows all your SQL tables and their fields, etc... but what about class design? Interactions between the classes, etc? Do you design this up-front, or just keep writing stub code, refactoring the interactions as you go along, until the whole thing connects and looks like it will work?
Any advice is much appreciated
Do you do design up front?
Of course you do. You've got a big application in front of you. You've got to have some idea of the structure it will have before you start writing tests and code. You don't have to have it all worked out in detail, but you should have some basic idea of the layers, components, and interfaces. For example, if you are working on a web services system, you ought to know what the top level services are, and have a good first approximation of their signatures.
Do you write the entire application using nothing but stubbed out code?
No. You stub things out only if they are really difficult to control in a test. For example, I like to stub out the database, and the UI. I will also stub out third party interfaces. Sometimes I will stub out one of my own components if it vastly increases the test time, or it forces me to create test data that is too complicated. But most of the time I let my tests work on a pretty well integrated system.
I have to say I really dislike the style of testing that relies heavily on mocks and stubs. Don't get me wrong, I think mocks and stubs are very useful for decoupling from things that are hard to test. But I don't like writing things that are hard to test, and so I don't use a lot of mocks and stubs.
How do you write your first unit test for a high level function?
Most high level functions have degenerate behavior. For example, login is a pretty high level function and can be very complicated. But if you try to log in with no user name and no password, the response from the system is going to be pretty simple. Writing that tests will also be very simple. So you start with the degenerate cases. Once you have exhausted them, you move on to the next level of complexity. For example, what if a user tries to log in with a username but no password? Bit by bit you climb the ladder of complexity, never tackling the more complex aspects until the less complex aspects are all passing.
It is remarkable how well this strategy works. You might think that you'd just be climbing around the edges all the time and never getting to the meat; but that's not what happens. Instead you find yourself designing the internal structure of the code based on all the degenerate and exceptional cases. When you finally get around to the primary flow, you find that the structure of the code you are working on has a nice hole of just the right shape to plug the main flow in.
Please don't create your UI first.
UIs are misleading things. They make you focus on the wrong aspects of the system. Instead, imagine that your system must have many different UIs. Some will be web, some will be thick client, some will be pure text. Design your system to work properly irrespective of the UI. Get all the business rules working first, with all tests passing. Then plug the UI in later. I know this flies in the face of a lot of conventional wisdom, but I wouldn't do it any other way.
Please don't design the database first.
Databases are details. Save the details for later. Rather, design your system as though you had no idea what kind of database you were using, Keep any notion of schema, tables, rows, and columns out of the core of the system. Implement your business rules as though all the data were kept in memory all the time. Then add the database later, once you've gotten all the business rules working. Again, I know this flies in the face of some conventional wisdom, but coupling systems to databases too early is a source of a lot of badly warped designs.
Do I write the entire application, using nothing but stubbed out code?
No, not in the slightest sense - that sounds like a very wasteful approach. We must always keep in mind that the underlying reason for doing TDD is rapid feedback. An automated test suite can tell us if we broke anything much faster than a manual test can. If we wait wiring things together until the last moment, we don't get rapid feedback - while we may get rapid feedback from our unit tests, we wouldn't know if the application works as a whole. Unit tests are only one form of test we need to perform to verify the application.
A better approach is to start with the most important feature and work your way in from there, using an outside-in approach. This often means starting with some UI.
The way I do it is by creating the desired UI. Since we normally can't develop UI with TDD, I simply create the View with the technology of choice. No tests there, but I wire up the UI to some API (preferrably using declarative databinding), and that's when the testing begins.
In the beginning, I would then TDD my ViewModels/Presentation Models and corresponding Controllers, possibly hard-coding some responses to see that the UI works. As soon as I have something that doesn't explode when you run it, I check in the code (remember, many small incremental check-ins).
I subsequently work my way vertically down that feature and ensure that this particular piece of UI can go all the way to the data source (or whatever), ignoring all other features.
When the feature is done, I can start on the next feature. The way I picture this process is that I fill out the application by doing one vertical slice at a time until all features are done.
Kick-starting a greenfield app this way always takes extra long time for the first feature since this is where you have to wire up everything, so pick something simple (like the initial View of the app) to keep things as simple as possible. Once the first feature is done, the next ones become much easier because the foundations are now in place.
Do I still design up-front?
Not much, no. I normally have an overall design in mind before I start, and when I work in a team, we sketch this overall architecture on a whiteboard or a slide deck before we start.
This is more or less limited to
The number and names of layers (UI, Presentation Logic, Domain Model, Data Access, etc).
The technologies used (WPF, ASP.NET MVC, SQL Server, .NET 3.5 or whatnot)
How we structure production code and test code, and which test technologies we use
Quality requirements for the code (pair programming, static code analysis, coding standards, etc.)
The rest we figure out as we go, but we use many ad-hoc design sessions at the whiteboard as we go along.
+1 Good question
I truly don't know the answer, but I would start with building blocks of classes that I could test then build into the application, not with the top-level stuff. And yes I would have a rough up-front design of the interfaces, otherwise I think you would find those interfaces changing so often as you refactor that it would be a real hinderance.
TDD By Example won't help I don't think. IIRC it goes through a simple example. I am reading Roy Osherove's The Art of Unit Testing and while it seems to comprehensively cover tools and techniques like mocks and stubs, the example so far seem also pretty simple and I don't see that it tells you how to approach a large project.
Do you write the entire application, using nothing but stubbed out code?
To test our systems we mainly do unit, integration and remote services testing. In unit tests we stub out all long running, time consuming, and external services, i.e. database operations, web services connection or any connection to external services. This is to make sure that our tests are fast, independent and not relying on the response of any external service to provide us quick feedback. We have learnt this the hard way because we do have some tests that do database operations which makes it really slow that goes against the principle "Unit tests must be fast to run"
In integration tests, we test the database operations but still not the web services and external services because that can make the test brittle depending on their availability and we use autotest to run the tests in the background all the while we are coding.
However, to test any kind of remote services, we have tests that connect to the external services, do the operation on them and get the response. What matters to the test is their response and their end state if it is important for the test. The important thing here is, we keep these kind of tests in another directory called remote (that's a convention we created and follow) and these remote tests are only run by our CI (continuous integration) server when we merge any code to the master/trunk branch and push/commit it to the repo so that we know quickly if there has been any changes in those external services that can affect our application.
Do I still design up-front?
Yes but we don't do big design up front basically what uncle Bob (Robert C. Martin) said.
In addition, we get to the whiteboard before immersing ourself into coding and create some Class Collaboration Diagrams just to make it clear and sure that everyone in the team is on the same page and this also helps us to divide the work amongst the team members.