Note to readers: Bear with me. I promise there's a question.
I have a problem to solve and think to myself "Oh, I'll do it in Ruby."
$ bundle gem problemsolver
create problemsolver/Gemfile
create problemsolver/Rakefile
create problemsolver/.gitignore
create problemsolver/problemsolver.gemspec
create problemsolver/lib/problemsolver.rb
create problemsolver/lib/problemsolver/version.rb
Initializing git repo in /tmp/harang/problemsolver
Remove the comment from the s.add_development_dependency "rspec" line in problemsolver/problemsolver.gemspec, and then
$ bundle exec rspec --init
The --configure option no longer needs any arguments, so true was ignored.
create spec/spec_helper.rb
create .rspec
New tests go into spec/ and must be in files whose names end in _spec.rb. For instance, spec/version_spec.rb:
describe 'Problemsolver' do
it 'should be at version 0.0.1' do
Problemsolver::VERSION.should == '0.0.1'
end
end
Running the specs--ignoring code-change runners like guard--is trivial:
$ bundle exec rspec
.
Finished in 0.00021 seconds
1 example, 0 failures
You can't see it, but the message is nicely color-coded for quick "Did I screw up?" scanning. The things that are very good about this:
Setup was rapid, almost brainless (though figuring out which commands to invoke is not trivial).
Standardized layout of the source tree reduces the familiarization period with a new code-base, making collaboration simpler and reducing the lull time when picking up a project you've left for a bit.
A heavy reliance on tooling distributes best-practices through the community, roughly at the speed of new project creation.
Adding coverage tools, code watchers, linters, behavior test tools and others is no more difficult.
This stands in unfavorable contrast to the situation when one instead thinks, "Oh, I'll do it in Haskell."
$ mkdir problemsolver
$ cd problemsolver/
$ cabal init
Package name [default "problemsolver"]?
Package version [default "0.1"]? 0.0.1
Please choose a license:
1) GPL
2) GPL-2
3) GPL-3
4) LGPL
5) LGPL-2.1
6) LGPL-3
* 7) BSD3
8) MIT
9) PublicDomain
10) AllRightsReserved
11) OtherLicense
12) Other (specify)
Your choice [default "BSD3"]?
Author name? Brian L. Troutwine
Maintainer email [default "brian@troutwine.us"]?
Project homepage/repo URL?
Project synopsis? Solves a problem.
Project category:
1) Codec
2) Concurrency
3) Control
4) Data
5) Database
6) Development
7) Distribution
8) Game
9) Graphics
10) Language
11) Math
12) Network
13) Sound
14) System
15) Testing
16) Text
17) Web
18) Other (specify)
Your choice? ProblemSolver
ProblemSolver is not a valid choice.
Your choice? 18
Please specify? ProblemSolver
What does the package build:
1) Library
2) Executable
Your choice? 2
Generating LICENSE...
Generating Setup.hs...
Generating problemsolver.cabal...
You may want to edit the .cabal file and add a Description field.
"Great," you think, "I was so pestered I bet all the latest Haskell best-practices in software development are just waiting on my disk."
$ ls
LICENSE problemsolver.cabal Setup.hs
Allow me to summarize my feelings: :(
The generated cabal file doesn't even have a Main specified, much less instructions for setting up a rudimentary project. Still, okay. If you fart around for a bit trying to find the right search keywords, you'll land on How to write a Haskell program, which is okay except:
All of Haq's source code gets thrown into the root directory.
The test code for Haq lives only in Test.hs, uses only QuickCheck, and has no facility for continuing the project with split-file tests.
All of this has to be manually written or copied for each new project.
Checking Real World Haskell's Chapter 11, you'll find it doesn't even mention cabal and skirts the issue of project layout entirely. None of the resources that Don Stewart kindly answers with here are addressed in either of the aforementioned, and, I'll note, Mr. Stewart doesn't explain how to use any of the tools referenced.
Note that the accepted answer in Haskell testing workflow references a project that's since moved on sufficiently so as not to be a good answer, but it does say
As cabal test doesn't yet exist -- we have a student working on it for this year's summer of code! -- the best mechanism we have is to use cabal's user hook mechanism.
Hey, okay, the cabal documentation! The appropriate section does have examples, but they're awfully contrived and don't fail to give the impression that everyone is on their own and good luck to you.
Of course, there's always test-framework, which seems to be nice, but its example code doesn't offer anything beyond what's seen in the wiki and is non-scalable in the sense that as soon as my program grows in complexity I'm on the hook to develop ways of dividing up tests into manageable modules. I'm not even sure what's going on with HTF, and I agree with Mr. Volkov's assessment.
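For concreteness, here is roughly the sort of wiring you end up writing by hand (a minimal sketch; the module name Foo.Tests and the property are invented, and it assumes the test-framework and test-framework-quickcheck2 packages):

-- test/Main.hs: aggregate per-module test groups by hand
import Test.Framework (defaultMain)
import qualified Foo.Tests

main :: IO ()
main = defaultMain Foo.Tests.tests

-- test/Foo/Tests.hs: one such module
module Foo.Tests (tests) where

import Test.Framework (Test, testGroup)
import Test.Framework.Providers.QuickCheck2 (testProperty)

tests :: [Test]
tests = [ testGroup "Foo"
            [ testProperty "reverse is an involution" prop_reverse ] ]

prop_reverse :: [Int] -> Bool
prop_reverse xs = reverse (reverse xs) == xs

Every new test module means another import and another entry in Main, maintained by hand.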
Mr. Jelvis' comment on the linked HTF question was of particular interest to me: the Haskell tool-chain suffers, very badly, from a tyranny of small decisions. I can't actually get down to the task at hand--solving my problem in Haskell--because I'm on the hook for getting my environment just right. Why this is bad:
It's wasted effort. Unless I'm writing a test tool, I will very, very rarely care about how my tests are slurped up, only from where.
It's difficult to learn. There seems to be no singular resource for setting up a project with testing baked in, and the various sources that do exist are sufficiently diverse as to be unhelpful.
It's difficult to reproduce. With so many moving pieces to arrange I'm bound to do it differently each time.
As a corollary, it's idiosyncratic. That means it's difficult to collaborate and to pick up dormant projects.
This just plain stinks.
Maybe I'm wrong, though. Does there exist some poorly advertised tool, or set of closely developed tools, that does something similar to Bundler+Rspec in the Haskell space? If not, is there a poorly advertised canonical example of modern Haskell testing with all of Mr. Stewart's referenced goodies baked right in? The project created or demonstrated:
should by convention and tooling keep test code separate from application code in a well-defined manner (in Ruby-land, Rspec tests go in spec/, Cucumber features in features/),
should not require end-users to compile and install testing dependencies
should be easily reproducible, ideally in no more than 10 minutes, and
should be standardized or have the hope of standardization.
Am I wrong in believing that there's nothing at all like this in Haskell-land?
Edit0: Please note, the Ruby language's community isn't the only applicable comparison. Paul R. is correct in identifying the strong current of configuration over convention. Other languages solve the problem of getting a scalable project structure off the ground in other ways:
C :: This language is venerable and so well-documented that you'll have trouble figuring out which well-documented approach to take. No tooling as such.
Java :: Configuration over convention: you're bound into it at the compiler level. Many tools and very well documented.
Scala :: Strong tool support.
Erlang :: Venerable and loosely documented if you know what you're looking for. Arguably configuration over convention if you're using rebar or are otherwise targeting the OTP.
Paul R.'s solution of using a custom template works great if, like C, there's sufficient documentation to compile such a thing. This still runs into issues that I attempted to identify explicitly in the post, but it's workable. Haskell's best offering--that I'm aware of--is "How to write a Haskell program", but it falls short of being more than the equivalent of dumping a lone Cub Scout off in the woods with a flashlight and a flask of water.
Also, yes, Static Types are great and do solve many problems that would otherwise need explicit testing. If they were an end-all solution, or mostly sufficient, even, the snap-framework would not be so thoroughly tested. (Arguably "Copy snap-core." is an answer to my question.)
There's currently no one single way to set up a test suite. Hopefully, people will standardize on cabal test, which works out of the box. In fact, both HUnit and QuickCheck are provided with the Haskell Platform, so setting up tests doesn't require downloading any extra dependencies.
You're correct that an old accepted answer doesn't provide information on cabal test. I edited it, and now it does! You're also probably correct that the linked page on the Haskell wiki (also written before cabal test became available) doesn't provide information on current testing best practices. It's a wiki, and I encourage folks to edit it! Note that the page does, however, provide a link to another page that describes how one might structure a more complex Haskell project.
tl;dr: Use cabal test. I'm fond of test-framework, which you can integrate with cabal test should you so desire. Sorry that cabal test is fairly new and not all the resources we have (generally community-editable) have been updated to point to it and describe how to use it. Updating lots of resources and creating tutorials is the job of a community. We should probably do a better job promoting the many awesome tools introduced to the Haskell ecosystem in the last few years.
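For reference, a minimal sketch of the sort of stanza cabal test picks up (the suite name, file name, and directory here are placeholders, not mandated):

test-suite tests
  type:            exitcode-stdio-1.0
  main-is:         Spec.hs
  hs-source-dirs:  test
  build-depends:   base, QuickCheck

An exitcode-stdio-1.0 suite passes when its executable exits with status zero. After adding the stanza, run cabal configure --enable-tests, then cabal build, then cabal test.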
There are many points here. First, there is a comparison of convention-over-configuration with explicit configuration. In Ruby land, the former is often preferred. In my experience, although it works great for a do-a-{blog|social-thing|gem|library}-in-5-minutes screencast and quick experiments, it has much less value in your real projects (more than 5 minutes), as init time gets quickly amortized. Also, there is a reason why tools provide configuration facilities: there are many different needs and usages. So my advice for your cabal-init problem is: make your own template file. Put in stubs for everything you need, with great comments, and use it whenever you need it.
Regarding tests, the landscape is quite different between Ruby and Haskell. In Ruby, one can write foo do { oh dear I am typing nonsense here } and there is no other way to catch this nonsense than actually running the code. So automated tests are absolutely required. In Haskell land, however, there is great static analysis of your code coupled with a very sane paradigm (purely functional, non-strict), and after years of using it, I'm still surprised how hard it is to write nonsense without being immediately caught by the compiler. I do Ruby at work as well, and really, 90% of my tests are poor-man's manual "static checks".
Still, there is room for wrong design or corner-case errors; that's why QuickCheck exists. It will automatically (yes, really automatically) find corner-case errors and help you a lot in finding design errors. You can still write unit tests with one of the existing packages if you need manual checks.
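To illustrate (a minimal sketch; the property is invented for this example): the round trip through unwords and words looks like an identity but isn't, and QuickCheck finds the corner case by itself:

import Test.QuickCheck

-- Plausible-looking, but false: unwords [""] is "", and words "" is [].
prop_roundTrip :: [String] -> Bool
prop_roundTrip ws = words (unwords ws) == ws

main :: IO ()
main = quickCheck prop_roundTrip  -- reports a counterexample such as [""]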
So my conclusion here is: don't be surprised to find shadows everywhere if you shine a Ruby light on the Haskell land. Things are very different over here, and need to be experienced to grasp their power. That doesn't mean that everything is perfect; actually, improving the toolchain is a commonly expressed wish. It's just that the points you raised are not really problematic, and really don't deserve some of the vocabulary you picked. Try first, judge after :)
I have a complex project without comments. The project is programmed in Java but has more than one main class, uses several .txt files as templates, and uses several .bat files. I don't know where to start and how to start discovering the project, because I need to make some changes in it.
As others have said, this is a slow process.
However having done this in the past many times, this is my methodology:
Identify as many requirements as you can that the code fulfils. This may give you some reasons as to why certain things are the way they are when you look deeper. A common way of finding these is to look for any tests that may be available. The automated ones are best, but usually they're as missing as the comments.
Find the entry points to the code. These will give you places where you can poke the code to see how different inputs affect the flow. Common entry points are code-loading 'main'-type functions, service interfaces, web page postbacks, etc.
Diagram the code. Look for tools that can build black/white-box pictures of the code. For me this is invaluable. I have on occasion printed out large listings and attacked them with markers and rulers. Your aim is to create your own flow chart (mental or otherwise) of the code flow.
Using the above (iteratively), build a set of outputs from the code that you think should occur, and add to these the outputs you may already know about, such as logs, data files, database writes, etc.
Finally, if you have time, create some manual tests, though preferably in automated test harnesses, to verify the above. This is where I start to involve the debugger to see detail in the code.
This methodology usually gives me confidence to make changes.
Note this is an iterative process and can be done with portions of the code or overall, as you see fit. I usually favour a top-down approach to start with, and then as I gain some insight I drill down till details become overwhelming, and then I repeat. However, this is just because my mind works in this way - you may be different. Good luck.
Find the main Main class. The starting point.
Start drawing a picture of the classes and the objects they own and the external entities they reference.
Follow all the branches until you can find a logical ending.
I've used UML reverse engineering tools in the past and while a visual picture is good, stepping through the code has always been the hardest and yet best methodology for me.
And, as you step through each piece, you can add in your own comments.
I usually start with doxygen, turning on every extraction option (especially EXTRACT_ALL and EXTRACT_PRIVATE), and enable the SOURCE_BROWSER, HAVE_DOT, CALL_GRAPH and CALLER_GRAPH options (you also need to have dot installed). This gives a good view of the software. For every function, the calls are displayed and linked in a graph, and the sources are linked from there.
While doxygen is intended for C and C++, it also works with Java sources (set the OPTIMIZE_OUTPUT_JAVA option).
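Put together, the relevant Doxyfile lines look like this (all options mentioned above; dot comes with Graphviz):

EXTRACT_ALL          = YES
EXTRACT_PRIVATE      = YES
SOURCE_BROWSER       = YES
HAVE_DOT             = YES
CALL_GRAPH           = YES
CALLER_GRAPH         = YES
OPTIMIZE_OUTPUT_JAVA = YES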
Ouch. I'm afraid there is no speedy way to do this. Comment out a line (or two) -> test -> see what breaks. You could also put break statements here and there and run the debugger. That should give you some indication of how you got there (i.e. what the hierarchy between the classes is).
Hopefully the original developers used some patterns that you can recognize and make notes of. Make lots of notes of everything. Start by trying to understand the high level structure and work down from there.
Be prepared to spend endless hours not understanding what the thing is doing.
Speak to the client and try to understand what the project is for, and what are all the things that it does. Someone somewhere had to put in some requirements for the stuff that's in there, if only in an email.
I would try to find the first entry point in the code closest to where you suspect you'll need to start making your changes, set a breakpoint, and start debugging. Check out the contents of local variables and work your way deeper as you become familiar with what's going on. Then, when you have a basic understanding of the area of code you're going to be working with, start fiddling with some small changes. Test your understanding of it. Try diagramming what you see happening. If you can do that confidently, you'll be able to decide if you need to go back and continue learning more about the code, or if you know enough to get done what you need to get done.
A start is to use an automated UML modeling tool (if you use Eclipse you can use a plug-in) and start creating UML diagrams of the various classes to see how they are related at a high level and to visualize the code. This has helped me many times.
If there are log files being generated, have a look at them to understand the flow from the starting point (the main class). Otherwise, put in debug statements to understand the flow.
Ya, that sounds like a pretty bad spot to be in.
I would say that the best way is to just walk through the program line by line. Try to grasp the big picture in the code, and write a lot of notes, both on paper and in comments in the code.
I would say a good approach would be to generate documentation using javadoc or doxygen's class diagram feature, then, as you run the code, traverse the class diagrams generated using doxygen and see who calls what. This works wonderfully for me every time I am working on such a project.
I completely agree with most of the answers posted.
I can add this: use a development tool that reverse-engineers the code and creates a class diagram, to get an overall picture of what is involved.
Then you need patience. But you will be a stronger and smarter developer when you get through...
Good luck!
One of the best and first things to do is to try to build and run the code. It might sound a bit simplistic, but the problem when you take over undocumented code is often that you can't even build it and run it, and you have no clue where to start.
OK. Our product works. Beta testers are actually getting their stuff done. Time for the next iteration. But how to ensure quality? We need a tester!
How do I get someone fresh off the street started in testing? I have no clue on how to do it myself (I'm a developer, not a tester)!
We are a tiny team:
2 architects (as in buildings, not software; they are the domain experts here) figuring out what to build
me building it
and a new guy to do some testing before we push releases out
None of us has a clue on how to do this professionally. So far we have:
a bunch of virtual machines spanning the configurations we would like to test
various versions of windows
german and english, the two languages likely to be in use by our customers
the host software we are writing for (Autodesk Revit Architecture 2010, we are building a plugin for energy calculations)
a text document describing some tests I did (installed release xyz, did this, did that, etc.)
a bug tracking system where the tester can add all the bugs he finds
I expect we will need a test script. But how? Who? What? When?
Why are you looking for "someone off the street"? To me, it sounds kind of like asking "I want to hire a new programmer, how do I get someone off the street and get him up to speed programming my software?". Why would you want to do that, over hiring someone who is a programmer already?
In your situation, which is that you don't know much about testing, I'd definitely think about hiring someone with experience in the field.
Specifically, I'd probably look for:
Someone with some experience performing tests under his belt (since you're going to want him actually doing tests).
Someone with some experience writing test plans/etc.
Someone with some experience running a QA team.
The last point is optional, but hopefully your team will be growing as your software grows, so it might make sense to get someone who can grow in the role as well (not to mention having the experience to help you decide when and how to grow the QA team).
Well, are you looking to expand your team with a tester? Have you considered just hiring a test specialist from a consultancy firm?
Before you get somebody to test, make sure you meet the requirements for testing. At a minimum you need:
A specification: Some authoritative source on what the application is supposed to do. This could be an expert who can answer any and all questions on exactly what the app is supposed to do, but the more that is written down and the more formally defined it is, the better.
Time: Testing takes time. You can't hand off an application to the tester 30 minutes before it's supposed to go live and expect any worthwhile results. If you're doing waterfall development, testing will require a lot of time at the end. Lots of other development models let testing run in parallel with development, which saves a lot of time, but regardless of the model you use, testing will require more time than not testing.
If you don't have these two things, quality assurance is just a pipe dream.
Now if you do have those met, and you're trying to train somebody to test, here's my crash course on testing.
Fundamentally, testing an application means that you are attempting to ensure two things:
The program does what it is supposed to do.
The program does not do what it is not supposed to do.
That's the core mindset that I use. Building from that I approach things in terms of actions and attempt to verify:
An expected action with expected preconditions produces an expected effect.
An expected action with unexpected preconditions produces no effect or is handled appropriately.
An unexpected action produces no effect or is handled appropriately.
No unexpected effects occur.
Item 1 comes directly from the spec: You make sure that the program does what it is supposed to do.
Items 2 and 3 are where the art of testing comes in. What unexpected actions and preconditions can I perform? I could try to enter the wrong password. I could try to directly type in the URL of a supposedly secured page. I could try to paste odd unicode characters into a text field. I could try to put SQL or javascript code into a text field.
Item 4 is the infinite no-man's land of testing, the part that makes complete testing impossible. (2 and 3 are also infinite, but not as depressing to think about.) That doesn't mean you ignore it. You always keep an eye out for anything unusual. Also, sometimes inspiration strikes and you think of a possible way to cause an unexpected effect: "What happens if I log in between 11:59:59PM and 12:00:00AM on the third Tuesday of the month? Oh look, it made me an administrator." Technical knowledge and a peek inside the black box help with coming up with scenarios like that.
There is a whole lot more to say about testing, but that's the bare minimum I can think of: The technical requirements and the approach to the problem.
Ideally, you'll need to give the tester:
training to make sure he knows the product to be tested.
documentation on what the expected results are.
test plans - what needs to be tested and how
a test tracking system to track what is being tested, what passed the tests, what needs to be fixed, etc. That system does not have to be too sophisticated, depending on the size of the project, an Excel spreadsheet may suffice.
In their podcast #64, Jeff and Joel discuss (among other things) what skills a good tester should possess. A transcript is also available (about halfway down the page).
I would like to know your experience when you need to take over somebody else's software project - more so when the original software developer has already resigned.
The most success that we've had with that is to "wiki" everything. During the notice period, ask the leaving developer to help you document everything in the team/company wiki, and see if you can do code reviews with him/her and add comments to the code while doing the reviews that explain sections. It's best for the "taking over" developer to write the comments in the code under the supervision of the leaver.
Cases where the original devs left before handing over the project are always the most interesting: you're stuck with a codebase in an unknown state. What I always find intriguing is how the new devs often do their utmost to comment on how badly designed the code is: they forget about the constraints the old devs might have been under and the shortcuts they might have been forced to make. The saying is always: old dev == bad dev. What do you people think?
I would even call this out as an official bad practice: bad-mouthing the ones who have been before us.
I try to take as pragmatic an approach as possible: learn the codebase, wander around a bit. Try to understand the relation between requirements and code, even if there is no clear initial relationship at all. There will always be the "aha moment" when you realise why something was done this way or that. If you're still convinced something is implemented the wrong way, do your refactorings if possible. And isolate the pieces of code you cannot change: unit test them by using a mocking framework.
Hail to the maintenance developer.
I once joined a team which had been handed a steaming pile of crap from outsourcing. The original project - a multimedia content manager based on Java, Struts, Hibernate|Oracle - was well structured (it seems like it was the work of a couple of people: pair programming, wise use of design patterns, some unit testing). Then someone else inherited the project and endlessly copy-pasted features, loosened the business rules, patched, and branched until it became a huge spaghetti monster with finely crafted pieces of code like:
List<Stuff> stuff = null;
if (LOG.isDebugEnabled())
{
stuff = findStuff();
LOG.debug("Yeah, I'm a smart guy!");
for (Stuff stu : stuff)
{
LOG.debug("I've got this stuff: " + stu);
}
}
methodThatUsesStuff(stuff);
hidden amongst the other brilliant ingenuity. (Note the bug, incidentally: findStuff() only runs when debug logging is enabled, so methodThatUsesStuff() receives null otherwise.)
I tamed the beast via patient refactoring (extracting methods and classes most of the time), commenting the code from time to time, and reorganizing everything till the codebase shrank by 30%, getting more and more manageable over time.
I have had to take over someone else's code of varying degrees of quality on several occasions. Hence the tips:
Make effort to take structured notes of any piece of significant information from minute one: names of stakeholders, business rules, code and document locations etc. It is best to dedicate a fresh spiral notebook, so you could tear pages out if you had to.
Make use of one of the better free indexing and desktop search tools available on the market (Google Desktop Search, MS Windows Search will do). Add all document, e-mail, code locations to it.
Before developing anything, do document analysis: find everything you can get your hands on electronically on the network and in printed-out docs, and make an effort to simply read it. There is an amazing amount of useful information even within unfinished drafts.
Mind-map the code, architecture, etc. as you go.
With lesser-documented and poorly maintained systems you will inevitably have moments of despair that are likely to push you into procrastination mode, especially during your first days or week, when the amount of new information your mind has to digest is overwhelming. At these times it is nice to have someone remind you (or just do it yourself) to take it easy, concentrate on important things first, and revert to making smaller steps in trying to gain understanding instead of trying to leap forward.
Keep taking notes, making diagrams, drawing rich pictures, mind mapping. It really helps to digest the copious amounts of new, mostly disorganised information.
Hey, good luck!
We actually have a specified set of "Deliverables" that has to be present for us to take over a project.
If we have the chance, we try to push one of our folks into the group developing the project at first. That way we get some firsthand knowledge before our group takes over the code (along the lines of what #Guy wrote).
That being said, the most important part for me would be:
Some kind of high-level overview (a drawing?) of what the code does.
Easy access to ask questions of the people who actually wrote the code.
This, for me, is the alpha and omega when taking over code and projects.
As you work in a legacy codebase what will have the greatest impact over time that will improve the quality of the codebase?
Remove unused code
Remove duplicated code
Add unit tests to improve test coverage where coverage is low
Create consistent formatting across files
Update 3rd party software
Reduce warnings generated by static analysis tools (e.g. FindBugs)
The codebase has been written by many developers with varying levels of expertise over many years, with a lot of areas untested and some untestable without spending a significant time on writing tests.
Read Michael Feathers' book "Working Effectively with Legacy Code"
This is a GREAT book.
If you don't like that answer, then the best advice I can give would be:
First, stop making new legacy code[1]
[1]: Legacy code = code without unit tests and therefore an unknown
Changing legacy code without an automated test suite in place is dangerous and irresponsible. Without good unit test coverage, you can't possibly know what effect those changes will have. Feathers recommends a "stranglehold" approach where you isolate areas of code you need to change, write some basic tests to verify basic assumptions, make small changes backed by unit tests, and work out from there.
NOTE: I'm not saying you need to stop everything and spend weeks writing tests for everything. Quite the contrary, just test around the areas you need to test and work out from there.
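For illustration, a minimal sketch of such a pinning ("characterization") test, written in Haskell with HUnit simply to match the earlier examples in this document. legacyParse is a hypothetical stand-in for whatever function you're about to change, and the expected values are just whatever the code does today:

import Test.HUnit

-- Hypothetical stand-in for the legacy function under change.
legacyParse :: String -> [String]
legacyParse = words

-- Pin down current behavior; a change that alters it fails loudly.
tests :: Test
tests = TestList
  [ "two fields" ~: legacyParse "a b" ~?= ["a", "b"]
  , "empty"      ~: legacyParse ""    ~?= []
  ]

main :: IO ()
main = do
  _ <- runTestTT tests
  return ()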
Jimmy Bogard and Ray Houston did an interesting screencast on a subject very similar to this:
http://www.lostechies.com/blogs/jimmy_bogard/archive/2008/05/06/pablotv-eliminating-static-dependencies-screencast.aspx
I work with a legacy 1M LOC application written and modified by about 50 programmers.
* Remove unused code
Almost useless... just ignore it. You won't get a big Return On Investment (ROI) from that one.
* Remove duplicated code
Actually, when I fix something I always search for duplicates. If I find some, I extract a generic function, or comment all occurrences of the duplicated code (sometimes the effort of putting in a generic function isn't worth it). The main idea is that I hate doing the same action more than once. Another reason is that there's always someone (could be me) who forgets to check for the other occurrences...
* Add unit tests to improve test coverage where coverage is low
Automated unit tests are wonderful... but if you have a big backlog, the task itself is hard to promote unless you have stability issues. Go with the part you are working on and hope that in a few years you have decent coverage.
* Create consistent formatting across files
IMO the difference in formatting is part of the legacy. It gives you a hint about who wrote the code, or when. This can give you some clue about how to behave in that part of the code. Doing the job of reformatting isn't fun, and it doesn't give any value to your customer.
* Update 3rd party software
Do it only if there are really nice new features, or if the version you have is not supported by the new operating system.
* Reduce warnings generated by static analysis tools
It can be worth it. Sometimes a warning can hide a potential bug.
I'd say 'remove duplicated code' pretty much means you have to pull code out and abstract it so it can be used in multiple places - this, in theory, makes bugs easier to fix because you only have to fix one piece of code, as opposed to many pieces of code, to fix a bug in it.
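As a toy illustration (a sketch, with invented names, in Haskell to match the earlier examples): the shared check lives in one helper, so a bug fix lands in exactly one place instead of many:

-- Before, the same "non-blank" check was pasted into both validators.
-- After extraction, both call sites share one definition.
nonBlank :: String -> Bool
nonBlank = not . null . filter (/= ' ')

validateName :: String -> Either String String
validateName s = if nonBlank s then Right s else Left "empty name"

validateCity :: String -> Either String String
validateCity s = if nonBlank s then Right s else Left "empty city"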
Add unit tests to improve test coverage. Having good test coverage will allow you to refactor and improve functionality without fear.
There is a good book on this written by the author of CppUnit, Working Effectively with Legacy Code.
Adding tests to legacy code is certainly more challenging than creating them from scratch. The most useful concept I've taken away from the book is the notion of "seams", which Feathers defines as
"a place where you can alter behavior in your program without editing in that place."
Sometimes it's worth refactoring to create seams that will make future testing easier (or possible in the first place). The Google testing blog has several interesting posts on the subject, mostly revolving around the process of Dependency Injection.
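To make "seams" concrete, here is a minimal sketch (in Haskell, matching the earlier examples; the names are invented). The clock is passed in as a parameter, and that parameter is the seam: a test can alter behavior without editing the function.

import Data.Time (getCurrentTime)

-- The clock is a parameter rather than a hard-wired call;
-- that parameter is the seam.
timestampWith :: IO String -> String -> IO String
timestampWith clock msg = do
  t <- clock
  return ("[" ++ t ++ "] " ++ msg)

main :: IO ()
main = do
  real <- timestampWith (fmap show getCurrentTime) "started"   -- production clock
  putStrLn real
  fake <- timestampWith (return "2009-01-01 00:00:00") "started"  -- canned clock for tests
  putStrLn fake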
I can relate to this question, as I currently have in my lap one of 'those' old-school codebases. It's not really legacy, but it certainly hasn't followed the trends of the years.
I'll tell you the things I would love to fix in it as they bug me every day:
Document the input and output variables
Refactor the variable names so they actually mean something other than some Hungarian notation prefix followed by a three-letter acronym with some obscure meaning. CamelCase is the way to go.
I'm scared to death of changing any code as it will affect hundreds of clients that use the software and someone WILL notice even the most obscure side effect. Any repeatable regression tests would be a blessing since there are zero now.
The rest is really peanuts. These are the main problems with a legacy codebase; they really eat up tons of time.
I'd say it largely depends on what you want to do with the legacy code...
If it will indefinitely remain in maintenance mode and it's working fine, doing nothing at all is your best bet. "If it ain't broke, don't fix it."
If it's not working fine, removing the unused code and refactoring the duplicate code will make debugging a lot easier. However, I would only make these changes on the erring code.
If you plan on a version 2.0, add unit tests and clean up the code you will bring forward.
Good documentation. As someone who has to maintain and extend legacy code, that is the number one problem. It's difficult, if not downright dangerous, to change code you don't understand. Even if you're lucky enough to be handed documented code, how sure are you that the documentation is right? That it covers all of the implicit knowledge of the original author? That it speaks to all of the "tricks" and edge cases?
Good documentation is what allows those other than the original author to understand, fix, and extend even bad code. I'll take hacked yet well-documented code that I can understand over perfect yet inscrutable code any day of the week.
The single biggest thing that I've done to the legacy code that I have to work with is to build a real API around it. It's a 1970's style COBOL API that I've built a .NET object model around, so that all the unsafe code is in one place, all of the translation between the API's native data types and .NET data types is in one place, the primary methods return and accept DataSets, and so on.
This was immensely difficult to do right, and there are still some defects in it that I know about. It's not terrifically efficient either, with all the marshalling that goes on. But on the other hand, I can build a DataGridView that round-trips data to a 15-year-old application which persists its data in Btrieve (!) in about half an hour, and it works. When customers come to me with projects, my estimates are in days and weeks rather than months and years.
As a parallel to what Josh Segall said, I would say comment the hell out of it. I've worked on several very large legacy systems that got dumped in my lap, and I found the biggest problem was keeping track of what I already learned about a particular section of code. Once I started placing notes as I go, including "To Do" notes, I stopped re-figuring out what I already figured out. Then I could focus on how those code segments flow and interact.
I would say just leave it alone for the most part. If it's not broken then don't fix it. If it is broken then go ahead and fix and improve the portion of the code that is broken and its immediately surrounding code. You can use the pain of the bug or sorely missing feature to justify the effort and expense of improving that part.
I would not recommend any wholesale kind of rewrite, refactor, reformat, or putting in of unit tests that is not guided by actual business or end-user need.
If you do get the opportunity to fix something, then do it right (the chance of doing it right the first time might have already passed, but since you are touching that part again you might as well do it right this time around), and this includes all the items you mentioned.
So in summary, there's no single or just a few things that you should do. You should do it all but in small portions and in an opportunistic manner.
Late to the party, but the following may be worth doing where a function/method is used or referenced often:
Local variables often tend to be poorly named in legacy code (often owing to their scope expanding when a method is modified, and not being updated to reflect this). Renaming these in line with their actual purpose can help clarify legacy code.
Even just laying out the method slightly differently can work wonders - for instance, putting all the clauses of an if on one line.
There might be stale/confusing code comments there already. Remove them if they're not needed, or amend them if you absolutely have to. (Of course, I'm not advocating removal of useful comments, just those that are a hindrance.)
These might not have the massive headline impact you're looking for, but they are low risk, particularly if the code can't be unit tested.