Should class members be sorted? - coding-style

On a new project with a new team, should we enforce automatic sorting of class members in a specific order (e.g. by modifier and then alphabetically) prior to check-in?
The alternative is to let each developer group the members as they see fit. And since everyone has a different opinion of what is related and how the grouping should be done, this pretty much comes down to random order.
So what are the pros and cons of having them sorted automatically? Is this bound to a specific IDE/development-process/build-process/language? What else do we have to consider?
Edit to foster more answers:
I was once on a project where we had to maintain several branches. Because the RCS (SVN at the time) could not support this appropriately, we had to manually move classes and methods from one branch to another and then merge back again (most RCS can maintain a subset-superset relation in only one direction). Because the methods could appear anywhere in the class in any order, merging was a nightmare. Enforcing automatic sorting of members right from the beginning would have avoided much of the pain.
On the other hand, on a long-existing project without automatic sort order, it can be a bad idea to enforce this. Moving all the members around is basically the same as throwing away the version history up to this point, because comparing files with older versions via diff will be no good anymore, for the same reason that merging in the other project was a pain.
Same goes if refactoring is due. When methods are renamed they will also be moved, making a diff of two versions practically pointless. With different names AND different places, it is difficult to recognize methods again.

Given that your IDE can sort your members the way you prefer, I'd personally avoid a global company policy on the matter.
I think rules-for-rules-sake are an important factor in de-motivating a team. As programmers we have a certain mindset, a certain way of seeing the world. Practicality and pragmatism are often valued higher by many programmers than policy.
If it's a quick click of a couple of menu items to have the code look the way you want it to when it's your turn to look at it, I'd stick with those few clicks (and bind them to a keyboard shortcut for your convenience).

I like to have a consistent code layout, but I have learned the hard way that anything which only touches the topic of "coding style" always leads to endless discussions and can waste a lot of time. It is not worth it.
Far more important is to make decisions on other topics (architecture and design, tests, how to communicate).
Usually I tend to assume that related members will be grouped together over time. I see no advantage in using an alphabetical sort order, because that is what the IDE can do for me.
Renaming, moving code, deleting green code, and adding comments are not things I like to see mixed with other changes. That is why I usually split the work into two changes: one that updates the code layout/style, and another that changes the behaviour of the program.

In my case, I find it useful to order by access level. I follow the StyleCop rules (.NET, but valid in any other language):
Public
Internal
Protected Internal
Protected
Private
static
non-static
Inside these groups I allow some randomness, but I always put things like IDs or unique identifiers first.
I'm not saying this is the best practice in the world, but at least people know where to look for things.
Depending on the language and the IDE you choose, you may be lucky and find a tool that rearranges the code for you based on your own preferences. (ReSharper, in my case, is a good help.)

I consider sorting of class members useful if it results in better readability of code. A sorting scheme should not be too strict but strict enough to add to better code readability. I prefer this sorting scheme:
static fields
instance fields
constructor
methods
When a method calls another method (mostly a private one), the called method should be placed below the calling method.
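To make the scheme above concrete, here is a minimal sketch in Java. The class and member names (ShoppingCart, MAX_ITEMS and so on) are made up for illustration only; the point is just the order: static fields, instance fields, constructor, then methods with each callee directly below its caller.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only to illustrate the member order.
class ShoppingCart {
    // 1. static fields
    private static final int MAX_ITEMS = 100;

    // 2. instance fields
    private final List<Integer> pricesInCents = new ArrayList<>();

    // 3. constructor
    ShoppingCart() {
    }

    // 4. methods: each caller sits directly above the method it calls
    void add(int priceInCents) {
        ensureCapacity();
        pricesInCents.add(priceInCents);
    }

    private void ensureCapacity() {
        if (pricesInCents.size() >= MAX_ITEMS) {
            throw new IllegalStateException("cart is full");
        }
    }

    int totalInCents() {
        return sum();
    }

    private int sum() {
        int total = 0;
        for (int price : pricesInCents) {
            total += price;
        }
        return total;
    }
}
```

Reading top to bottom, you meet add() before ensureCapacity() and totalInCents() before sum(), so the file reads like a story from the public entry points down to the details.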
As pointed out above, the only reason to order class members should be better readability, because you write code once but read it a hundred times, so having an ordering system accepted by the team can boost productivity.
Ordering code to work around the limitations of an RCS will not per se lead to better readability and thus will not boost productivity. In most cases such an ordering method will fail. I doubt that alphabetical ordering could lead to better readability.

Related

Is this backwards naming convention a bad idea (ie. contrary to industry standards)?

I've always reversed names so that they naturally group in intellisense. I am wondering if this is a bad idea.
For example, I run a pet store and I have invoicing pages (add, edit, delete) and store pages (display, preview, edit). To get the URLs for these, I would call the following methods (in a suitable class like GlobalUrls.cs):
InvoicingAddUrl()
InvoicingEditUrl()
InvoicingDeleteUrl()
StoreDisplayUrl()
StorePreviewUrl()
StoreEditUrl()
This groups them nicely in intellisense. More logical naming would be:
AddInvoiceUrl()
EditInvoiceUrl()
DeleteInvoiceUrl()
DisplayStoreUrl()
PreviewStoreUrl()
EditStoreUrl()
Is it better (better being, more of an industry standard way) to group them for intellisense, or logically?
Grouping in Intellisense is just one factor in creating a naming scheme, but logically grouping by category rather than function is a common practice as well.
Most naming "conventions" dictate usage of characters, casing, underscores, etc. I think it is a matter of personal preference (company, team or otherwise) as to whether you use NounVerb or VerbNoun formatting for your method names.
Here are some resources:
Microsoft - General Naming Conventions
Wikibooks C# Programming/Naming
Akadia .NET Naming Conventions
Related questions:
Naming Conventions - Guidelines for Verbs, Nouns and English Grammar Usage
Do vs. Run vs. Execute vs. Perform verbs
Events - naming convention and style
Check out how the military names things. For example, MREs are Meals, Ready to Eat. They do this because of sort order, efficiency and not making mistakes. They are ready to ignore the standard naming conventions of the language (i.e., English) used outside of their organization because they are not impressed with the quality of operations outside of their organization. In the military, the quality of operations is literally a matter of life and death. Also, by doing things their own way they have a way of identifying who is inside and who is outside of the organization. Anyone unable or unwilling to learn the military way, which is different but not impossibly difficult, is not their first choice for recruitment or promotion.
So, if you are impressed with the standard quality of software out there, then by all means keep doing what everyone else is doing. But, if you wish to do better than you have in the past, or better than your competitor, then I suggest looking at other fields for lessons learned the hard way, such as the military. Then make some choices for your organization that are not impossible but are right for you and your competitiveness. You can choose little-endian names (most significant information comes last) or the military-style big-endian names (most significant information comes first), or you can use the dominant style your competitors probably use, which is doing whatever you feel like whenever you feel like it.
Personally, I prefer big-endian Hungarian (Apps) naming, which was widely seen as superior when it first came out, but then lost favor because Hungarian (Sys) naming destroyed the advantage due to a mistranslation of the basic idea, and because of rampant abbreviations. The original intent was to start a name with what kind of a thing it is, then become increasingly specific until you end with a unique qualification. This is also the order that most array dimensions and object qualifiers are in, so in most languages big-endian naming flows into the larger scheme of the language.
You are on to something. Forward, march.
It's not intrinsically bad. It has the upside of being easier to identify the type while scanning, and groups the options together in Intellisense like you said. As long as you and everyone else on your team picks a way of doing things and stays consistent about it there shouldn't be any big problems.
Based on the methods listed, you might be able to refactor Invoicing and Store out into their own classes, which would be closer to the mythical "industry standard" way.
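Roughly what I mean, sketched in Java rather than C# (same idea either way). The class names and URL paths below are placeholders I made up, not anything from the question. Once the noun is the class, the method is just the verb, and you still get the grouping in autocomplete for free.

```java
// Hypothetical replacement for the Invoicing* methods of GlobalUrls.
final class InvoicingUrls {
    private InvoicingUrls() {
    }

    static String add() {
        return "/invoicing/add"; // placeholder path
    }

    static String edit() {
        return "/invoicing/edit"; // placeholder path
    }

    static String delete() {
        return "/invoicing/delete"; // placeholder path
    }
}

// Hypothetical replacement for the Store* methods of GlobalUrls.
final class StoreUrls {
    private StoreUrls() {
    }

    static String display() {
        return "/store/display"; // placeholder path
    }

    static String preview() {
        return "/store/preview"; // placeholder path
    }

    static String edit() {
        return "/store/edit"; // placeholder path
    }
}
```

A call site then reads InvoicingUrls.add() instead of InvoicingAddUrl() or AddInvoiceUrl(), which sidesteps the NounVerb vs. VerbNoun question entirely.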
That said, whatever your programming team can agree on for naming convention should be fine. The important thing is to be consistent within the project.
I don't think it's a good idea to develop a coding standard around a tool (at least not as the first consideration). Even though most IDEs will have Intellisense these days, and most people will be using said IDEs, I think that first and foremost a coding standard should be about making the code legible and navigable on its own merits.
I would opt for most logical naming, personally. When I write code and I have some object I'm about to call a member function on, I'm usually thinking about what member function to call based on the action I'm about to do, because I already know the object I'm manipulating. So my first impulse would be to start typing "Add" if I wanted to add something, and see what Intellisense showed me. This is, of course, subjective.
I have never actually seen anybody using your alphabetical, Intellisense-driven grouping anywhere, at least not outside of code that was so horrid in other ways that it is not worth using as a basis for comparison.
That said, if it's your standard, do what you want -- consistency is the important part.

"refactor refactor refactor your code." What does this mean exactly and why do it?

I often read on professionals' blogs something like "refactor your code whenever you get the chance". What is it exactly? Rewriting your code in simpler and fewer lines? What is the purpose of doing this?
Refactoring code is a process of cleaning up your code, reducing the clutter and improving the readability without causing any side effects or changes to features.
Basically, you refactor by applying a series of code change rules that improve code readability and re-usability, without affecting the logic.
Always unit test before and after refactoring to ensure your logic isn't affected.
This Wikipedia article will give you an idea of the types of things included in the general concept of Refactoring.
The idea is to adapt and evolve your code as you go. Simple changes may be to rename variables or method parameters; others may be to pass an additional parameter, to drop one, or to change its type. The data model may evolve as well.
Refactoring often works hand-in-hand with unit testing, whereby the risk of "breaking something" is offset by the fact that such an issue is likely to be discovered by the automated tests (provided there is good coverage and relevant test cases).
In a nutshell, the ability to refactor (and by the way, most IDEs, or add-ons to them, offer various tools that make refactoring easier and less error-prone) allows one to write more quickly without stressing about some decisions ("should this object include an array or a list?", etc.), letting the programmer change some of these decisions as time goes on, with the added insight offered by having a workable, if not perfect, solution. See a related concept: agile development.
Beware, refactoring doesn't give you license to start coding without putting any thought in design, in the object model, the APIs etc., however it lessens the stiffness of some of these decisions.
Martin Fowler has probably done the most to popularize refactoring, but I think good developers have always done these sorts of restructurings. Check out Fowler's refactoring web site, and his 1999 book Refactoring, which is an excellent introduction and catalog of specific refactorings using Java.
And I see he's a co-author of the brand new Refactoring, Ruby Edition, which should be a great resource.
I find that regularly cleaning up your code like this makes it a lot clearer and more maintainable.
To take one example, I wrote a small (Java 1.6) client library for accessing remote web services (using the REST architectural style). The bulk of this library is in one source file, and about half of that deals with the web services, while the other half is a simple in-memory cache of the responses (for performance). Over time both halves have grown in functionality, to the point where the source file was getting too complex. So today I used Fowler's "Extract Class" refactoring to move the cache logic into a new class. Before that I had to do some "Extract Methods" to isolate the caching logic. Along the way I did a few "Rename Methods" and an "Introduce Explaining Variable".
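For a feel of what the end state of an "Extract Class" like this can look like, here is a minimal sketch. It is not my actual library: the names ResponseCache and ServiceClient and the transport function are invented for illustration, and it uses Java 8 lambdas for brevity, which a Java 1.6 code base would not have.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Extracted class: owns only the caching responsibility.
class ResponseCache {
    private final Map<String, String> responses = new HashMap<>();

    // Returns the cached response, or fetches and stores it on a miss.
    String getOrFetch(String url, Function<String, String> fetcher) {
        return responses.computeIfAbsent(url, fetcher);
    }
}

// The client keeps the web-service logic and delegates caching.
class ServiceClient {
    private final ResponseCache cache = new ResponseCache();
    private final Function<String, String> transport; // hypothetical HTTP layer
    private int remoteCalls = 0;

    ServiceClient(Function<String, String> transport) {
        this.transport = transport;
    }

    String get(String url) {
        return cache.getOrFetch(url, this::fetchRemote);
    }

    private String fetchRemote(String url) {
        remoteCalls++;
        return transport.apply(url);
    }

    int remoteCalls() {
        return remoteCalls;
    }
}
```

After the extraction, each class can be read and tested on its own; asking the client for the same URL twice should hit the transport only once.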
As other folks have noted, it's very important to have a good set of unit tests to apply after you make each change. They help ensure that you're not introducing new bugs, among other good things.
In a nutshell, refactoring means improving the design and/or implementation of software, usually without changing its behavior. This is normally done to make the code easier to understand and work with going forward, thereby making future development faster and less bug-prone.
Refactoring is a long-term investment in your code - since it doesn't affect the outward "appearance" of the software, there is very often pressure (from management, etc.) to "just get it working and move on to the next thing." While this may sometimes be the right decision, depending on business drivers, a codebase that undergoes change but never gets refactored will decay into a difficult, buggy mess (See also Technical Debt).
Specifically, the top reasons to refactor are usually the following:
Getting rid of duplicated code
Breaking up a long method into smaller pieces by extracting new methods from sections of the longer method
Breaking up a class that has too many responsibilities into smaller, more targeted classes or subclasses
Moving methods from one class to another. Often this is done so the methods reside in the same class as the data they operate on.
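As a small illustration of the second item (breaking up a long method), here is a sketch of how a report method might look after a few rounds of "Extract Method". InvoiceReport and its helpers are hypothetical names; imagine all of the helper bodies originally inlined in render().

```java
import java.util.List;

// Hypothetical example: render() used to contain all of this inline.
class InvoiceReport {
    // After extraction, the top-level method reads like a table of contents.
    static String render(String customer, List<Integer> amountsInCents) {
        StringBuilder out = new StringBuilder();
        out.append(header(customer));
        for (int amount : amountsInCents) {
            out.append(line(amount));
        }
        out.append(footer(total(amountsInCents)));
        return out.toString();
    }

    private static String header(String customer) {
        return "Invoice for " + customer + "\n";
    }

    private static String line(int amountInCents) {
        return "  " + formatCents(amountInCents) + "\n";
    }

    private static String footer(int totalInCents) {
        return "Total: " + formatCents(totalInCents) + "\n";
    }

    private static int total(List<Integer> amountsInCents) {
        int sum = 0;
        for (int amount : amountsInCents) {
            sum += amount;
        }
        return sum;
    }

    // Extracting this also removed the duplicated formatting code
    // (assumes non-negative amounts).
    private static String formatCents(int cents) {
        int remainder = cents % 100;
        return (cents / 100) + "." + (remainder < 10 ? "0" : "") + remainder;
    }
}
```

Note that formatCents() also covers the first item in the list: the formatting logic now exists in exactly one place.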
In the simplest terms, refactoring code is optimizing code. The criteria for what counts as "better" code are open to much interpretation, as there are various coding styles and patterns out there. A central idea in refactoring is the question, "Could this code be made better?" A few examples of such criteria are scalability, maintainability, readability, performance, size of executable, or memory used in executing the code.
"Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure." -- Martin Fowler, in Refactoring: Improving the Design of Existing Code
See WhatIsRefactoring for more explanation.
Refactoring code generally means taking code that has been patched multiple times and re-writing it so that the needs of the later patches are taken into account.

How do you decide which parts of the code shall be consolidated/refactored next?

Do you use any metrics to make a decision which parts of the code (classes, modules, libraries) shall be consolidated or refactored next?
I don't use any metrics which can be calculated automatically.
I use code smells and similar heuristics to detect bad code, and then I fix it as soon as I notice it. I don't have any checklist for finding problems; mostly it's a gut feeling that "this code looks messy", followed by reasoning about why it is messy and figuring out a solution. Simple refactorings like giving a more descriptive name to a variable or extracting a method take only a few seconds. More intensive refactorings, such as extracting a class, might take up to an hour or two (in which case I might leave a TODO comment and refactor it later).
One important heuristic that I use is Single Responsibility Principle. It makes the classes nicely cohesive. In some cases I use the size of the class in lines of code as a heuristic for looking more carefully, whether a class has multiple responsibilities. In my current project I've noticed that when writing Java, most of the classes will be less than 100 lines long, and often when the size approaches 200 lines, the class does many unrelated things and it is possible to split it up, so as to get more focused cohesive classes.
Each time I need to add new functionality I search for already existing code that does something similar. Once I find such code I think of refactoring it to solve both the original task and the new one. Surely I don't decide to refactor each time - most often I reuse the code as it is.
I generally only refactor "on-demand", i.e. if I see a concrete, immediate problem with the code.
Often when I need to implement a new feature or fix a bug, I find that the current structure of the code makes this difficult, such as:
too many places to change because of copy&paste
unsuitable data structures
things hardcoded that need to change
methods/classes too big to understand
Then I will refactor.
I sometimes see code that seems problematic and which I'd like to change, but I resist the urge if the area is not currently being worked on.
I see refactoring as a balance between future-proofing the code, and doing things which do not really generate any immediate value. Therefore I would not normally refactor unless I see a concrete need.
I'd like to hear about experiences from people who refactor as a matter of routine. How do you stop yourself from polishing so much you lose time for important features?
We use cyclomatic complexity to identify the code that needs to be refactored next.
I use Source Monitor and routinely refactor methods when the complexity metric goes above around 8.0.

Dealing with god objects

I work in a medium sized team and I run into these painfully large class files on a regular basis. My first tendency is to go at them with a knife, but that usually just makes matters worse and puts me into a bad state of mind.
For example, imagine you were just given a windows service to work on. Now there is a bug in this service and you need to figure out what the service does before you can have any hope of fixing it. You open the service up and see that someone decided to just use one file for everything. Start method is in there, Stop method, Timers, all the handling and functionality. I am talking thousands of lines of code. Methods under a hundred lines of code are rare.
Now assuming you cannot rewrite the entire class and these god classes are just going to keep popping up, what is the best way to deal with them? Where do you start? What do you try to accomplish first? How do you deal with this kind of thing and not just want to get all stabby?
If you have some strategy just to keep your temper in check, that is welcome as well.
Tips Thus Far:
Establish test coverage
Code folding
Reorganize existing methods
Document behavior as discovered
Aim for incremental improvement
Edit:
Charles Conway recommended a podcast which turned out to be very helpful.
Michael Feathers (the guy in the podcast) begins with the premise that we are too afraid to simply take a project out of source control, play with it directly, and then throw away the changes. I can say that I am guilty of this.
He essentially said to take the item you want to learn more about and just start pulling it apart. Discover its dependencies and then break them. Follow it through everywhere it goes.
Great Tip
Take the large class that is used elsewhere and have it implement an empty interface. Then take the code using the class and have it refer to the class through the interface type instead. Every member access that no longer compiles points at a dependency, so this will give you a complete list of all the dependencies on that large class in your code.
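A sketch of how that can play out in Java. All names here (ReportingService, GodService, Scheduler) are invented for illustration. The interface starts out completely empty; the one method shown is what it looks like after the first compile error has been resolved by pulling start() up into it.

```java
// Step 1: this interface starts out EMPTY. Members are added only when a
// compile error in client code shows that they are really needed.
interface ReportingService {
    void start(); // added after Scheduler.kickOff() failed to compile
}

// The god class (hypothetical): implements the interface, otherwise untouched.
class GodService implements ReportingService {
    boolean running;

    public void start() {
        running = true;
    }

    public void stop() {
        running = false;
    }

    int pendingJobs() {
        return 0;
    }
}

// Step 2: client code now declares the interface type, not GodService.
class Scheduler {
    private final ReportingService service;

    Scheduler(ReportingService service) {
        this.service = service;
    }

    void kickOff() {
        service.start();
        // service.pendingJobs(); // would not compile: not (yet) on the interface
    }
}
```

When you are done, the interface is an exact inventory of what the rest of the code base uses, which is usually far less than what the god class offers.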
Ouch! Sounds like the place I used to work.
Take a look at Working Effectively with Legacy Code. It has some gems on how to deal with atrocious code.
DotNetRocks recently did a show on working with legacy code. There is no magic pill that is going to make it work.
The best advice I've heard is start incrementally wrapping the code in tests.
That reminds me of my current job and when I first joined. They didn't let me re-write anything because I had the same argument, "These classes are so big and poorly written! no one could possibly understand them let alone add new functionality to them."
So the first thing I would do is make sure there is comprehensive testing behind the areas that you're looking to change. At least then you will have a chance of changing the code without having (too many) arguments (hopefully). And by tests, I mean testing the components functionally with integration or acceptance tests and making sure the code is 100% covered. If the tests are good, then you should be able to confidently change the code by splitting up the big class into smaller ones, getting rid of duplication, etc.
Even if you cannot refactor the file, try to reorganize it. Move methods/functions so that they are at least organized within the file logically. Then put in lots of comments explaining each section. No, you haven't rewritten the program, but at least now you can read it properly, and the next time you have to work on the file, you'll have lots of comments, written by you (which hopefully means that you will be able to understand them) which will help you deal with the program.
Code Folding can help.
If you can move stuff around within the giant class and organize it in a somewhat logical way, then you can put folds around various blocks.
Hide everything, and you're back to a C paradigm, except with folds rather than separate files.
I've come across this situation as well.
Personally I print out (yeah, it can be a lot of pages) the code first. Then I draw a box around sections of code that are not part of any "main-loop" or are just helper functions and make sure I understand these things first. The reason is they are probably referred to many times within the main body of the class and it's good to know what they do.
Second, I identify the main algorithm(s) and decompose them into their parts using a numbering system that alternates between numbers and letters (it's ugly but works well for me). For example you could be looking at part of an algorithm 4 "levels" deep and the numbering would be 1.b.3.e or some other god awful thing. Note that when I say levels, I am not referring directly to control blocks or scope necessarily, but where I have identified steps and sub-steps of an algorithm.
Then it's a matter of just reading and re-reading the algorithm. When you start out it sounds like a lot of time, but I find that doing this develops a natural ability to comprehend a great deal of logic all at once. Also, if you discover an error attributed to this code, having visually broken it down on paper ahead of time helps you "navigate" the code later, since you have a sort of map of it in your head already.
If your bosses don't think you understand something until you have some form of UML describing it, a UML sequence diagram could help here if you pretend the sub-step levels are different "classes" represented horizontally, and start-to-finish is represented vertically from top-to-bottom.
I feel your pain. I tackled something like this once for a hobby project involving processing digital TV data on my computer. A fellow on a hardware forum had written an amazing tool for recording shows, seeing everything that was on, and more. Plus, he had done incredibly vital work of working around bugs in real broadcast signals that were in violation of the standard. He'd done amazing work with thread scheduling to be sure that no matter what, you wouldn't lose those real-time packets: on an old Pentium, he could record four streams simultaneously while also playing Doom and never lose a packet. In short, this code incorporated a ton of great knowledge. I was hoping to take some pieces and incorporate them into my own project.
I got the source code. One file, 22,000 lines of C, no abstraction. I spent hours reading it; there was all this great work, but it was all done badly. I was not able to reuse a single line or even a single idea.
I'm not sure what the moral of the story is, but if I had been forced to use this stuff at work, I would have begged permission to chip pieces off it one at a time, build unit tests for each piece, and eventually grow a new, sensible thing out of the pieces. This approach is a bit different than trying to refactor and maintain a large brick in place, but I would rather have left the legacy code untouched and tried to bring up a new system in parallel.
The first thing I would do is write some unit tests to box the current behavior, assuming that there are none already. Then I'd start in the area where I need to make the change and try to get that method cleaned up -- i.e. refactor working code before introducing changes. Use common refactoring techniques to extract and reuse methods from existing long methods to make them more understandable. When you extract a method, look for other places in the code where similar code exists, box that area, and reuse the method you've just extracted.
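By "boxing" the behavior I mean tests that pin down what the code does today, right or wrong. Here is a minimal sketch without any test framework. LegacyPricing and its numbers are invented stand-ins for whatever method you are about to touch; the expected values are taken from running the existing code, not from a spec.

```java
// Hypothetical legacy method we are about to refactor.
class LegacyPricing {
    static int shippingInCents(int weightInGrams) {
        if (weightInGrams <= 0) {
            return 0;
        }
        if (weightInGrams < 1000) {
            return 499;
        }
        return 499 + (weightInGrams / 1000) * 150;
    }
}

// Characterization checks: they record the CURRENT behavior, quirks included.
class LegacyPricingCharacterization {
    static void check(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    static void pinCurrentBehavior() {
        check(0, LegacyPricing.shippingInCents(-5));
        check(0, LegacyPricing.shippingInCents(0));
        check(499, LegacyPricing.shippingInCents(999));
        check(649, LegacyPricing.shippingInCents(1000));
        check(649, LegacyPricing.shippingInCents(1999)); // integer division quirk
        check(799, LegacyPricing.shippingInCents(2000));
    }
}
```

Run pinCurrentBehavior() before and after every extraction. If it starts failing, you changed behavior rather than structure.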
Look for groups of methods that "hang together" that can be broken out into their own classes. Write some tests for how those classes should work, build the classes using the existing code as a template if need be, then substitute the new classes into the existing code, removing the methods that they replace. Again, using your tests to make sure that you're not breaking anything.
Make enough improvement to the existing code so that you feel you can implement your new feature/fix in a clean way. Then write the tests for the new feature/fix and implement to pass the tests. Don't feel that you have to fix everything the first time. Aim for gradual improvement, but always leave the code better than you found it.

Improving really bad systems

How would you begin improving on a really bad system?
Let me explain what I mean before you recommend creating unit tests and refactoring. I could use those techniques but that would be pointless in this case.
Actually the system is so broken it doesn't do what it needs to do.
For example the system should count how many messages it sends. It mostly works but in some cases it "forgets" to increase the value of the message counter. The problem is that so many other modules with their own workarounds build upon this counter that if I correct the counter the system as a whole would become worse than it is currently. The solution could be to modify all the modules and remove their own corrections, but with 150+ modules that would require so much coordination that I can not afford it.
Even worse, there are some problems that have workarounds not in the system itself, but in people's heads. For example, the system cannot represent more than four related messages in one message group. Some services would require five messages grouped together. The accounting department knows about this limitation, and every time they count the messages for these services, they count the message groups and multiply the result by 5/4 to get the correct number of messages. There is absolutely no documentation about these deviations and nobody knows how many such things are present in the system now.
So how would you begin working on improving this system? What strategy would you follow?
A few additional things: I'm a one-man army working on this, so hiring enough people to redesign/refactor the system is not an acceptable answer. And in a few weeks or months I really should show some visible progress, so doing the refactoring myself over a couple of years is not an option either.
Some technical details: the system is written in Java and PHP, but I don't think that really matters. There are two databases behind it, an Oracle and a PostgreSQL one. Besides the flaws mentioned before, the code itself smells too; it is really badly written and documented.
Additional info:
The counter issue is not a synchronization problem. The counter++ statements are added to some modules, and are not added to some other modules. A quick and dirty fix is to add them where they are missing. The long solution is to make it kind of an aspect for the modules that need it, making it impossible to forget it later. I have no problems with fixing things like this, but if I made this change I would break over 10 other modules.
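To illustrate what I mean by the "long solution": one plain-Java approximation of the aspect idea is a template method, sketched below. This is not the actual system; SentCounter, CountingModule and SmsModule are hypothetical names, and a real AOP framework would be another way to get the same effect.

```java
// Hypothetical counter shared by the modules.
class SentCounter {
    private long sent;

    void increment() {
        sent++;
    }

    long sent() {
        return sent;
    }
}

// Base class for every module that must be counted.
abstract class CountingModule {
    private final SentCounter counter;

    protected CountingModule(SentCounter counter) {
        this.counter = counter;
    }

    // final: a subclass cannot override this and skip the counting step.
    final void send(String message) {
        deliver(message);
        counter.increment();
    }

    // Subclasses implement only the delivery; they never touch the counter.
    protected abstract void deliver(String message);
}

class SmsModule extends CountingModule {
    SmsModule(SentCounter counter) {
        super(counter);
    }

    @Override
    protected void deliver(String message) {
        // hypothetical: hand the message over to the SMS gateway here
    }
}
```

A new module written this way cannot forget the counter++, because the increment no longer lives in module code at all.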
Update:
I accepted Greg D's answer. Even if I like Adam Bellaire's more, it wouldn't help me to know what would be ideal to know. Thanks all for the answers.
Put out the fires. If there are any issues of critical priority, whatever they are, you've got to handle them first. Hack it in if you must, with a smelly codebase it's ok. You know you'll improve it going forward. This is your sales technique targeted at whomever you're reporting to.
Pick some low-hanging fruit. I assume you're relatively new to this particular software and that you were re-tasked to deal with it. Find some apparently easy problems in a related subsystem of the code that shouldn't take more than a day or two to resolve apiece, and fix them. This may involve refactoring, or it may not. The goal is to familiarize yourself with the system and with the style of the original author. You may not get really lucky (One of the two incompetents who worked on my system before me always post-fixed his comments with four punctuation marks instead of one, which made it very easy to distinguish who wrote the particular segment of code.), but you'll develop insight into the author's weaknesses so you know what to look out for. Extensive, tight coupling with global state vs poor understanding of language tools, for example.
Set a big goal. If your experience parallels mine, you'll find yourself in a particular bit of spaghetti code more and more often as you perform the prior step. This is the first knot you need to untangle. With the experience you've gained understanding the component and knowledge about what the original author likely did wrong (and thus, what you need to watch out for), you can start envisioning a better model for this subset of the system. Don't worry if you still have to maintain some messy interfaces to maintain functionality, just take it one step at a time.
Lather, rinse, repeat! :)
Given time, consider adding unit tests for your new model one level underneath your interfaces with the rest of the system. Don't engrave the bad interfaces in code via tests that use them, you'll be changing them in a future iteration.
Addressing the particular issues you mention:
When you run into a situation that users are working around manually, talk with the users about changing it. Verify that they'll accept the change if you provide it before sinking the time into it. If they don't want the change, your job is to maintain the broken behavior.
When you run into a buggy component that multiple other components have worked around, I espouse a parallel component technique. Create a counter that works how the existing one should work. Provide a similar (or, if practical, identical) interface and slide the new component into the codebase. When you touch external components that work around the broken one, try to replace the old component with the new one. Similar interfaces ease porting of the code, and the old component is still around if the new one fails. Don't remove the old component until you can.
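A rough sketch of the parallel component technique in Java. Everything here is hypothetical: the MessageCounter interface, the module names, and in particular the "counts a batch as one message" quirk, which is just a stand-in for whatever the real counter gets wrong.

```java
// Shared interface so old and new counters are interchangeable.
interface MessageCounter {
    void recordSent(int messages);

    long total();
}

// Stand-in for the existing component. Its behavior is left untouched,
// because other modules have workarounds built on top of it.
class LegacyCounter implements MessageCounter {
    private long total;

    public void recordSent(int messages) {
        total += 1; // invented quirk: a whole batch is counted as one message
    }

    public long total() {
        return total;
    }
}

// The parallel component: same interface, correct behavior.
class AccurateCounter implements MessageCounter {
    private long total;

    public void recordSent(int messages) {
        total += messages;
    }

    public long total() {
        return total;
    }
}

// The sender feeds both counters while the migration is in progress.
class Dispatcher {
    private final MessageCounter legacy;
    private final MessageCounter accurate;

    Dispatcher(MessageCounter legacy, MessageCounter accurate) {
        this.legacy = legacy;
        this.accurate = accurate;
    }

    void send(int messages) {
        legacy.recordSent(messages);
        accurate.recordSent(messages);
    }
}

// A consumer is migrated by handing it the new counter (and deleting its
// private workaround); unmigrated consumers keep receiving the legacy one.
class BillingModule {
    private final MessageCounter counter;

    BillingModule(MessageCounter counter) {
        this.counter = counter;
    }

    long billableMessages() {
        return counter.total();
    }
}
```

Because both counters sit behind the same interface, moving one consumer over is a one-line change at the place where it is constructed, and rolling it back is just as cheap.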
What is being asked of you right now? Are you being asked to implement functionality, or fix bugs? Do they even know what they want you to do?
If you don't have the manpower, time, or resources to "fix" the system as a whole, then all you can do is bail water. You're saying you should be able to make some "visible progress" in a few months' time. Well, with the system being as bad as you described, you may actually make the system worse. Under pressure to do something noticeable, you'll simply add code, and make the system even more convoluted.
You need to refactor, eventually. There is no way around it. If you can find a way to refactor that is visible to your end users, that would be ideal, even if it takes 6-9 months or a year instead of "a few months." But if you can't, then you have a choice to make:
Refactor, and risk being viewed as "not accomplishing anything" despite your efforts
Don't refactor, accomplish "visible" goals, and make the system more convoluted and more difficult to refactor one day. (Maybe after you find a better job, and hope the next developer to come along can never find out where you live.)
Which one is most beneficial to you personally depends on your company's culture. Will they one day decide to hire more developers, or replace this system completely with some other product?
Conversely, if your efforts to "fix things" actually break other things, will they be understanding about the monstrosity you're being asked to tackle single-handedly?
No easy answers here, sorry. You have to evaluate based on your unique, individual situation.
Working Effectively with Legacy Code is a whole book that basically says "unit test and refactor", but with more practical advice on how to do it:
http://www.amazon.com/Working-Effectively-Legacy-Robert-Martin/dp/0131177052
You open the directory that contains this system with Windows Explorer. Then, press Ctrl-A, and then Shift-Delete. That sounds like an improvement in your case.
Seriously though: that counter sounds like it's got thread-safety issues. I'd put a lock around the increasing functions.
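Roughly what I mean by a lock around the increasing functions, sketched in Python. SafeCounter, hammer and run_demo are made-up names for illustration; the real counter's interface is not shown in the question, so treat this as an assumption about its shape:

```python
import threading


class SafeCounter:
    """Counter whose read-modify-write step is guarded by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, amount=1):
        # Only one thread at a time may read, add, and store.
        with self._lock:
            self._value += amount
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, times):
    # Worker: increments the shared counter repeatedly.
    for _ in range(times):
        counter.increment()


def run_demo(threads=8, times=10_000):
    # Start several workers on one counter and return the final value.
    counter = SafeCounter()
    workers = [
        threading.Thread(target=hammer, args=(counter, times))
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value
```

With the lock in place, run_demo() should always come back with threads * times; without it, updates from different threads can be lost.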
And regarding the rest of the system, you can't do the impossible so try to do the possible. You need to attack your system from two fronts. Take care of the more visibly problematic issues first, so you can show progress. At the same time, you should deal with the more infrastructural problems, so that you have a chance at actually fixing this thing some day.
Good luck, and may the source be with you.
Pick one area that would be of medium difficulty to refactor. Create a skeleton of the original code with only the method signatures of the existing ones; maybe use an Interface even. Then start hacking away. You can even point the "new" methods to the old ones until you get to them.
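Something like this, as a rough Python sketch. The invoice functions and class names are invented for the example and only stand in for whatever area you pick:

```python
from abc import ABC, abstractmethod


# --- Hypothetical legacy code, left untouched ---
def legacy_calc_total(items):
    total = 0
    for i in range(len(items)):
        total = total + items[i][1] * items[i][2]
    return total


def legacy_format_total(total):
    return "TOTAL: " + str(round(total, 2))


# --- Skeleton: only the signatures of the existing methods ---
class InvoiceService(ABC):
    @abstractmethod
    def total(self, items):
        """Return the sum of quantity * price over (name, quantity, price) items."""

    @abstractmethod
    def format_total(self, total):
        """Return the display string for a total."""


class NewInvoiceService(InvoiceService):
    def total(self, items):
        # Already rewritten.
        return sum(quantity * price for _name, quantity, price in items)

    def format_total(self, total):
        # Not rewritten yet: point the "new" method at the old code.
        return legacy_format_total(total)
```

Callers only ever see InvoiceService, so each method can move from "forwards to legacy" to "rewritten" without anyone else noticing.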
Then, testing, testing, testing. Since there aren't any unit tests, maybe just use good old fashioned Voice-Activated-Unit Tests (people)? Or write your own tests as you go.
Document your progress as you go in some kind of repository, including frustrations and questions, so that the next poor schmuck who gets this project won't be where you are :).
Once you get the first part done, move on to the next. The key is to build on top of incremental progress. That's why you shouldn't start with the hardest part; it'll be too easy to get demoralized.
Joel has a couple of articles on rewriting/refactoring:
http://www.joelonsoftware.com/articles/fog0000000069.html
http://www.joelonsoftware.com/articles/fog0000000348.html
I've been working with a legacy system with the same characteristics for almost three years now, and there are no shortcuts that I'm aware of.
What bothers me most with our legacy system is that I'm not allowed to fix some bugs, since many other functions could break if I fixed them. This calls for ugly workarounds or creating new versions of old functions. Calls to the old functions can then be replaced with calls to the new ones, one at a time (while testing).
I'm not sure what the goal of your task is, but I strongly advise you to touch as little of the code as possible. Only do what you need to do.
You may want to get as much as possible documented by interviewing people. This is a huge task, since you don't know which questions to ask, and people will have forgotten a lot of details.
Other than that: make sure you're getting paid and enough moral support. There will be weeping and gnashing of teeth...
Well, you need to start somewhere, and it sounds like there are bugs that need fixing. I would work through those bugs, making quick-win refactorings and writing any unit tests possible along the way. I would also use a tool like SourceMonitor to identify some of the most 'complex' parts of code in the system and see if I could simplify their design in any way. Ultimately, you just have to accept that it will be a slow process, and make small steps towards a better system.
I would try to pick a part of the system that could be extracted and rewritten in isolation fairly quickly. Even if it doesn't do much, you could show progress pretty quickly, and you don't have the problem of interfacing with the legacy code directly.
Hopefully, if you could pick off a few such tasks, they will see you making visible progress, and you could put forward an argument for hiring more people to rewrite the bigger modules. When parts of the system rely on broken behaviour, you don't have much choice but to separate before you fix anything.
Hopefully, you could gradually build a team capable of rewriting the whole lot.
All of this would have to go hand in hand with some decent training, otherwise people's old habits will stick, and your work will get the blame when things don't work as expected.
Good luck!
Deprecate everything that currently exists that has problems, and write new ones that work correctly. Document as much as you can about what will change and put big red flashing signs all over the place pointing to this documentation.
By doing it that way, you can keep your existing bugs (the ones that are being compensated for somewhere else) around without slowing down your progress towards getting an actual working system.
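As a rough illustration in Python: the deprecated decorator below is a hand-rolled helper rather than a library API, and get_next_id / next_id with their skip-a-value bug are hypothetical. It uses only the standard warnings module to supply the "big red flashing sign":

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated and name its replacement."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated("next_id")
def get_next_id(current):
    # Hypothetical legacy bug: skips a value. Kept as-is because
    # existing callers compensate for it.
    return current + 2


def next_id(current):
    # New version with the intended behavior.
    return current + 1
```

Old callers keep getting the buggy result they already compensate for, plus a warning that tells them where to go; new code calls next_id directly.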

Resources