Revision histories and documenting changes - coding-style

I work on legacy systems and I used to see revision history of files or functions being modified every release in the source code, for example:
//
// Rev. No Date Author Description
// -------------------------------------------------------
// 1.0 2009/12/01 johnc <Some description>
// 1.1 2009/12/24 daveb <Some description>
// -------------------------------------------------------
void Logger::initialize()
{
// a = b; // Old code, just commented and not deleted
a = b + c; // New code
}
I'm just wondering if this way of documenting history is still being practiced by many today? If yes, how do you apply modifications on the source code - do you comment it or delete it completely?
If not, what's the best way to document these revisions? If you use version control systems, does it follow that your source files contain pure source codes, except for comments when necessary (no revision history for each function, etc.)?

Just rely on your version control system. Yes, just pure source code. If code is commented out I delete it. If I'm not sure I leave it there with a TODO comment.
I don't insert comments to reference tickets in the source but in the commit message. You don't need to document in the code what it used to look like.

Manual revision histories are hard to maintain, hence almost always out-of-date.
I trust the revision control system to give me this information. In addition to be correct, it can be much more precise, for example with pre-line annotations (who has changed that line last).
I do insert comments to reference tickets in the bug tracking system directly in the source when I feel that it is necessary, even though this info is also available in the commit message.
It does make sense to have a changes/release notes file per project (not per source file) that is manually maintained and updated for every release.

I used to work for a company that mandated this type of comment on all SPs that went to the database. I found it incredibly tedious and completely redundant over the notes we were required to enter into our source control system.
The main use of the inline comments was to validate that the deployment to a new environment was successful (like production). This also was a tedious process and was only used because there was no other method.
I have never found a use for inline comments like this anywhere else and found they only caused tedious headache inducing work.
I would advocate for a system where source control manages the comments and revision history and not the code. That is one of the purposes of the system and that's the best place for it. IMO.

Nowadays the best way to document these revisions is to let your version control do it, but that also means that you have to enforce the rule that developers write meaningful commit comments, or at least some bug tracking numbers, so it would be easy to figure out what they committed what for.

I've never seen revision history in code comments. At most I've seen the Javadoc #version tags to document the version.
The oldest version control method is like this:
logo2007-1.png, logo2007-2.png
Version control should be the best solution.
see also: http://betterexplained.com/articles/a-visual-guide-to-version-control/

Related

Cross version line matching

I'm considering how to do automatic bug tracking and as part of that I'm wondering what is available to match source code line numbers (or more accurate numbers mapped from instruction pointers via something like addr2line) in one version of a program to the same line in another. (Assume everything is in some kind of source control and is available to my code)
The simplest approach would be to use a diff tool/lib on the files and do some math on the line number spans, however this has some limitations:
It doesn't handle cross file motion.
It might not play well with lines that get changed
It doesn't look at the information available in the intermediate versions.
It provides no way to manually patch up lines when the diff tool gets things wrong.
It's kinda clunky
Before I start diving into developing something better:
What already exists to do this?
What features do similar system have that I've not thought of?
Why do you need to do this? If you use decent source version control, you should have access to old versions of the code, you can simply provide a link to that so people can see the bug in its original place. In fact the main problem I see with this system is that the bug may have already been fixed, but your automatic line tracking code will point to a line and say there's a bug there. Seems this system would be a pain to build, and not provide a whole lot of help in practice.
My suggestion is: instead of trying to track line numbers, which as you observed can quickly get out of sync as software changes, you should decorate each assertion (or other line of interest) with a unique identifier.
Assuming you're using C, in the case of assertions, this could be as simple as changing something like assert(x == 42); to assert(("check_x", x == 42)); -- this is functionally identical, due to the semantics of the comma operator in C and the fact that a string literal will always evaluate to true.
Of course this means that you need to identify a priori those items that you wish to track. But given that there's no generally reliable way to match up source line numbers across versions (by which I mean that for any mechanism you could propose, I believe I could propose a situation in which that mechanism does the wrong thing) I would argue that this is the best you can do.
Another idea: If you're using C++, you can make use of RAII to track dynamic scopes very elegantly. Basically, you have a Track class whose constructor takes a string describing the scope and adds this to a global stack of currently active scopes. The Track destructor pops the top element off the stack. The final ingredient is a static function Track::getState(), which simply returns a list of all currently active scopes -- this can be called from an exception handler or other error-handling mechanism.

Is Code with author's name absolute necessary?

Is there any need for code with author's name added in every function or even the files?
Yes code will be in source control and many programmers involved
According to Code Complete, comments are used to illustrate the purpose of the code. Using it for other purposes may result in 'comment decay'.
That said, keeping track of code ownership, change-log and who last modify the file, IMHO, is the job of a source control repo like SVN and such, and should not be inside the comments. Unless it is a license of some sorts. Or use an IDE's bookmark system to keep track of who authored a function, and who is the person responsible for it.
All this are just my 2 cent worth, though.
If the code is under source control, no.
That kind of data should be stored in the source code repository.
If the code is meant to be widely distributed (in other environment without any kind of repository), then yes, that could be useful. (some java sources do include that kind of information, as well as the release number from which some evolution of the code are available)
Obviously that kind of information (for widely deployed code base) is at the file level only.
Consider for example the source code of java.lang.Boolean in Java:
/**
* [...]
* In addition, this class provides many methods for
* converting a {#code boolean} to a {#code String} and a
* {#code String} to a {#code boolean}, as well as other
* constants and methods useful when dealing with a
* {#code boolean}.
*
* #author Arthur van Hoff
* #version 1.60, 05/05/07
* #since JDK1.0
*/
public final class Boolean implements java.io.Serializable,
Comparable<Boolean> {
[...]
You do not have all the authors from the beginning of time, only the most recent one, with the associated last major version for the most recent modification, and the version for the original introduction of the class.
That can help for API tooling for example, when you want to maintain good APIs.
But informations about the author is still limited to the file, not the functions: he represents the coordinator or aggregation manager for all the functions present in the class, even though there may have been more than one contributor over time.
As such, this is a public information, worthy of being put explicitly in the file, as opposed to a private meta-data (who writes what), stored as all other meta-data (date, version, branch, merge information, ...) in a source code repository.
Yes code will be in source control
Then no. Source control takes care of tracking this (blame).
Especially for team OpenSource projects, it might be useful or necessary to indicate the author of a particular piece of code. But commenting on every single function seems really excessive, especially since most of a class will be written by the same author (n’est-ce pas?). I like the Java library convention of specifying the author(s) for each class. Somehow, this seems to be the right trade-off.
On the flipside, if you co-authored a class, you’re to blame if someone else writes bad code in it. I actually think that this is a good thing. A class (in OOP, at least) is one entity so the quality is determined by its overall quality. If one function is bad, so is the whole class.
In some projects, author names can be used to give due credit to someone who put more effort into development than expected of him. Such recognition can improve motivation.
No. You or the company you work for implicitly has copyright. But that being said for tracking purposes it can be useful to ask that person what this code did later.
yes it is needed. Also if possible give the date alongwith the name. It is used for the tracking purpose as well as gives priviledge to others to know the owner of that function.

Best Practice for comments in Java source files?

This doesn't have to be Java, but it's what I'm dealing with. Also, not so much concerned with the methods and details of those, I'm wondering about the overall class file.
What are some of the things I really need to have in my comments for a given class file? At my corporation, the only things I really can come up with:
Copyright/License
A description of what the class does
A last modified date?
Is there anything else which should be provided?
One logical thing I've heard is to keep authors out of the header because it's redundant with the information already being provided via source control.
Update:
JavaDoc can be assumed here, but I'm really more concerned about the details of what's good to include content-wise, whether it's definitive meta-data that can be mined, or the more loose, WHY etc...
One logical thing I've heard is to keep authors out of the header because it's redundant
with the information already being provided via source control.
also last modified date is redundant
I use a small set of documentation patterns:
always documenting about thread-safety
always documenting immutability
javadoc with examples
#Deprecation with WHY and HOW to replace the annotated element
keeping comments at minimum
No to the "last modified date" - that belongs in source control too.
The other two are fine. Basically concentrate on the useful text - what the class does, any caveats around thread safety, expected usage etc.
Implementation comments should usually be about why you're doing something non-obvious - and should therefore be rare. (For instance, it could be because some API behaves in an unusual way, or because there's a useful shortcut you can use but which isn't immediately obvious.)
For the sanity of yourself and future developers, you really ought to be writing Javadocs.
When you feel the need to write comments to explain what some code does, improve the readability of the code, so that comments are not needed. You can do that by renaming methods/fields/classes to have more meaningful names, and by splitting larger methods into smaller methods using the composed method pattern.
If even after all your efforts the code is not self-explanatory, for example the reason why some unobvious code had to be written is not clear from the code, then apologize by writing comments. (Sometimes you can document the reasons by writing a test which will fail, if somebody changes the unobvious-but-correct code to do the obvious-but-wrong thing. But having a comment in addition to that is also useful. I prefix such comments often with "// HACK:" or "// XXX:".)
An overall description of the purpose of the class, a description for each field and a contract for each method. Javadoc format works well.
If you assign ownership of components to particular developers or teams, owners should be recorded in the component source or VCS metadata.

Configuration Management - History in Code Comments

Let me pose a bit of background information before asking my question:
I recently joined a new software development group that uses Rational tools for configuration management, including a source control and change management system.
In addition to these tools, the team has a standard practice of noting any code changes as a comment in the code, such as:
///<history>
[mt] 3/15/2009 Made abc changes to fix xyz
///</history>
Their official purpose for the commenting standard is that "the comments provide traceability from requirement to code modification".
I am preparing to pose an argument that this practice is unnecessary and redundant; that the team should get rid of this standard immediately.
To wit - the change management system is the place to build traceability from requirement to code modification, and source control can provide detailed history of changes by performing a Diff between versions. When source code is checked in, the corresponding change management ticket is noted. When a CM ticket is resolved, we note which source code files were modified. I believe this provides a sufficient cross-reference for the desired traceability.
I would like to know if anyone disagrees with my argument. Am I missing some benefit of commented source code history that change management and source control systems cannot provide?
For myself, I have always found such comments to be more trouble than they're worth: they can cause merge conflicts, can appear as 'false positives' when you're trying to isolate the diffs between two versions, and may reference code changes that have since been obsoleted by later changes.
It's often (not always, but often) possible to change version-control systems without losing metadata. If you were to move your code to a system that doesn't support this, it would not be hard to write a script to convert the change history into comments before the cutover.
A comment allows you to find all the changes and their reasons in the code right where they are relevant without having to dig into diffs and version control system intricacies. Furthermore, should you decide to change of version control system, the comments will stay.
I worked on a large project with similar practice that had changed of source control system twice. There wasn't a day when I wasn't glad to have these comments.
Is it redundant? Yes.
Is it unnecessary? No.
I've always thought that code should be, of course, under version control, and that the current source code (the one that you can open and read today) should be valid only in present tense.
It doesn't matter if a report could have up to 3 axis in the past and last month you updated it to support up to 6 axis. It doesn't matter if you expanded some function or fixed some bug, as long as the current version can be easily understood. When you fix a bug, just leave the fixed code.
There's an exception, though. If (and only if) the fixed code looks less intuitive to you than the previous, incorrect one; if you feel that someone might come tomorrow and, just by reading the code, be tempted to change it back to what "seems more correct", then it's good to add a comment: "This is done this way to avoid... blah blah blah." Also, if the problem behind is an infamous war story inside the team's culture, or if for some reason the bug report database contains very interesting information about this part of the code, I wouldn't find it incorrect to add "(see Bug Id 10005)" to the explaining comment.
The one that jumps to mind to me is vendor lockin. If you ever moved away from Rational, you'd need to make sure that the full change history was maintained during the migration - not just the version of the artifacts.
When you're in the code you need to know why it's structured like that, hence in code commenting. Tools that sit outside the code, good though they may be, require far too much of a context shift in your brain to be useful. As well as that, trying to reverse engineer the code intent from documentation and a diff is pretty damn hard, I'd much rather read a line of comment any day.
There was a phase in the code I work on, back in the 1994-96 time frame, where there was a tendency to insert change history comments at the top of the file. Those comments are now meaningless and useless, and one of the many standard cleanups I perform when editing files containing such comments is to remove them.
In contrast, there are also some comments with a bug number at the location where the change is made, typically explaining why the ridiculous code is as it is. These can be very helpful. The bug number gives you somewhere else to look for information, and fingers the culprit (or victim - it varies).
On the other hand, items like this one - genuine; cleaned up last week - make me grit my teeth.
if (ctab->tarray && ctab->tarray[i])
#ifndef NT
prt_theargs(*(ctab->tarray[i]));
#else
/* Correct the parameter type mismatch in the line above */
prt_theargs(ctab->tarray[i]);
#endif /* NT */
The NT team got the call correct; why they thought it was a platform-specific fix is beyond me. Of course, if the code had used prototypes instead of just parameterless declarations before now, then the Unix team would have had to fix the code too. The comment was a help - assuring me that the bug was genuine - but exasperating.

Commenting code that is removed [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Is it a good practice to comment code that is removed? For example:
// Code to do {task} was removed by Ajahn on 10/10/08 because {reason}.
Someone in my developer group during a peer review made a note that we should comment the lines of code to be removed. I thought this was a terrible suggestion, since it clutters the code with useless comments. Which one of us is right?
Generally, code that is removed should not be commented, precisely because it clutters the codebase (and, why would one comment on something that doesn't exist?).
Your defect tracking system or source control management tools are where such comments belong.
There are some (rare) situations when commenting code out (instead of deleting) is a good idea. Here's one.
I had a line of code that seemed good and necessary. Later I realized that it is unnecessary and harmful. Instead of deleting the line, I commented it out, adding another comment: "The line below is wrong for such and such reason". Why?
Because I am sure next reader of the code will first think that not having this line is an error and will try to add it back. (Even if the reader is me two years from now.) I don't expect him to consult source control first. I need to add comment to warn him of this tricky situation; and having wrong line and the reason why it is wrong happened to be the best way to do so.
I agree that it is not a good idea to leave code removed in comments.
Code history should be viewed through a version control system, which is where old code can be found, as well as the reason it was removed.
You should delete the code always.
As for being able to see old/removed code, that's what revision control is.
Depends on the reason for removal.
I think of comments as hints for people maintaining the code in the future, if the information that the code was there but was removed can be helpful to someone maintaining the code (maybe as a "don't do that" sign) then it should be there.
Otherwise adding detailed comments with names and dates on every code change just make the whole thing unreadable.
I think it's pretty useless and make the code less readable. Just think what it will be like after some monthes....
// removed because of this and that
/*
removed this stuff because my left leg...
*/
doSomething();
// this piece of has been removed, we don't need it...
You'll spend half an hour to find out what's going on
The question is, why do you remove code?
Is it useless? Was it a mistake to put it there in the first place?
No comments needed from my point of view.
It's useful when debugging, but there's no reason to check in code that way. The whole point of source control is being able to recover old versions without cluttering up the code with commented-out code.
I would suggest that, yes it's good practice to comment on code that has been removed but not in the code itself.
To further clarify this position, you should be using a source code control system (SCCS) that allows some form of check-in comment. That is where you should place the comments about why code was removed. The SCCS will provide the full contextual history of what has happened to the code, including what has been removed. By adding check-in comments you further clarify that history.
Adding comments in the code directly simply leads to clutter.
The recent consensus (from other discussions on here) is that the code should just be removed.
I personally will comment out code and tag it with a date or a reason. If it's old/stale and I'm passing through the file, then I strip it out. Version control makes going back easy, but not as easy as uncommenting...
It sounds like you are trying to get around versioning your code. In theory, it sounds like a great idea, but in practice it can get very confusing very quickly.
I highly recommend commenting code out for debugging or running other tests, but after the final decision has been made remove it from the file completely!
Get a good versioning system in place and I think you'll find that the practice of commenting out changes is messy.
Nobody here has written much about why you shouldn't leave commented-out code, other than that it looks messy. I think the biggest reason is that the code is likely to stop working. Nobody's compiling it. Nobody's running it through unit tests. When people refactor the rest of the code, they're not refactoring it. So pretty soon, it's going to become useless. Or worse than useless -- someone might uncomment it, blindly trusting that it works.
There are times when I'll comment out code, if we're still doing heavy design/development on a project. At this stage, I'm usually trying out several different designs, looking for the right approach. And sometimes the right approach is one I had already attempted earlier. So it's nice if that code isn't lost in the depths of source control. But once the design has been settled, I'll get rid of the old code.
In general I tend to comment very sparsely. I believe good code should be easy to read without much commenting.
I also version my code. I suppose I could do diffs over the last twenty checkins to see if a particular line has changed for a particular reason. But that would be a huge waste of my time for most changes.
So I try comment my code smartly. If some code is being deleted for a fairly obvious reason, I won't bother to comment the deletion. But if a piece of code is being deleted for a subtle reason (for example it performed a function that is now being handled by a different thread) I will comment-out or delete the code and add a banner comment why:
// this is now handled by the heartbeat thread
// m_data.resort(m_ascending);
Or:
// don't re-sort here, as it is now handled by the heartbeat thread
Just last month, I encountered a piece of code that I had changed a year ago to fix a particular issue, but didn't add a comment explaining why. Here is the original code:
cutoff = m_previous_cutofftime;
And here is the code as it was initially fixed to use a correct cutoff time when resuming an interrupted state:
cutoff = (!ok_during) ? m_previous_cutofftime : 0;
Of course another unrelated issue came up, which happened to touch the same line of code, in this case reverting it back to its original state. So the new issue was now fixed, but the old issue suddenly became rebroken. D'oh!
So now the checked-in code looks like this:
// this works for overlong events but not resuming
// cutoff = m_previous_cutofftime;
// this works for resuming but not overlong events
// cutoff = (!ok_during) ? m_previous_cutofftime : 0;
// this works for both
cutoff = (!resuming || !ok_during) ? m_previous_cutofftime : 0;
Of course, YMMV.
As the lone dissenting voice, I will say that there is a place for commenting out code in special circumstances. Sometimes, you'll have data that continues to exist that was run through that old code and the clearest thing to do is to leave that old code in with source. In such a case I'd probably leave little note indicating why the old code was simply commented out. Any programmers coming along after would be able to understand the still extant data, without having to psychically detect the need to check old versions.
Usually though, I find commented out code completely odious and I often delete it when I come across it.
If you are removing code. You should not comment it that you removed it. This is the entire purpose of source control (You are using source control? Right?), and as you state the comment just clutters up the code.
I agree that it's a terrible suggestion. That's why you have Source Control that has revisions. If you need to go back and see what was changed between two revisions, diff the two revisions.
I hate seeing code that's cluttered with commented out code. Delete the code and write a commit message that says why it was removed. You do use source control, don't you?
Don't litter active code with dead code.
I'll add my voice to the consensus: put the comments on why code was deleted in the source control repository, not in the code.
This is one of those "broken" windows thinkgs like compiler hints/warnings left unaddressed. it will hurt you one day and it promotes sloppiness in the team.
The check in comment in version control can track what/why this code was removed - if the developer didnt leave a note, track them down and throttle them.
A little anecdote, for fun: I was in a company, some years ago, knowing nothing of source code version control (they got such tool later...).
So they had a rule, in our C sources: "when you make a change, disable the old code with preprocessor macros":
#ifdef OLD /* PL - 11/10/1989 */
void Buggy()
{
// ...
}
#else
void Good()
{
// ...
}
#end
No need to say, our sources quickly became unreadable! It was a nightmare to maintain...
That's why I added to SciTE the capacity to jump between nested #ifdef / #else / #end and such... It can be still useful in more regular cases.
Later, I wrote a Visual Studio macro to happily get rid of old code, once we got our VCS!
Now, like buti-oxa, sometime I felt the need to indicate why I removed some code. For the same reason, or because I remove old code which I feel is no longer needed, but I am not too sure (legacy, legacy...). Obviously not in all cases!
I don't leave such comment, actually, but I can understand the need.
At worse, I would comment out in one version, and remove everything in the next version...
At my current work, for important local changes, we leave the old code but can reactivate it by properties, in case of emergency. After testing it some time in production, we eventually remove the old code.
Of course, VCS comments are the best option, but when the change is a few lines in a big file with other changes, referencing the little removal can be hard...
If you are in the middle of major changes, and need to make a fix to existing functionality, commenting out the future code is a reasonable thing to do, provided you remark that this is future functionality, at least until we have futures friendly source control systems.
I comment unnused code because you never know when will you have to fallback on the ancient code, and maybe the old code will help other people to understand it, if it was simpler back then.
I agree with you Andrew; IMO this is why you use version control. With good checkin/commit comments and a diff tool you can always find out why lines were removed.
If you are using any form of Source Control then this approach is somewhat redundant (as long as descriptive log messages are used)
I also think it's a terrible suggestion :)
You should use source control and if you remove some code you can add a comment when you commit. So you still have the code history if you want...
There's a general "clean code" practice that says that one should never keep removed code around as commented out since it clutters and since your CVS/SVN would archive it anyway.
While I do agree with the principle I do not think that it is an acceptable approach for all development situations. In my experience very few people keep track of all the changes in the code and every check-in. as a result, if there is no commented out code, they may never be aware that it has ever existed.
Commenting code out like that could be a way of offering a general warning that it is about to be removed, but of course, there are no guarantees that interested parties would ever see that warning (though if they frequently work with that file, they will see it).
I personally believe that the correct approach is to factor that code out to another private method, and then contact relevant stakeholders and notify them of the pending removal before actually getting rid of the function.
Where I am at we comment out old code for one release cycle and then remove the comments after that. (It gives us quick fix ability if some of the new code is problematic and needs to be replaced with the old code.)
In almost all cases old code should of course be removed and tracked in your RCS.
Like all things though, I think that making the statement 'All deleted code will ALWAYS be removed' is an incorrect approach.
The old code might want to be left in for a miriad of reasons. The prime reason to leave the code in is when you want any developer who is working in that section of code in the future to see the old code.
Relying on source tracking obviously does not give this.
So, I believe the correct answer is:
-Delete old code unless leaving it in provides crucial information that the next developer in the code would require. Ie, remove it 99% of the time but don't make a draconian rule that would remove your ability to provide much needed documentation to the next developer when circumstances warrant it.

Resources