Best Practice for comments in Java source files? - comments

This doesn't have to be Java, but it's what I'm dealing with. Also, not so much concerned with the methods and details of those, I'm wondering about the overall class file.
What are some of the things I really need to have in my comments for a given class file? At my corporation, the only things I really can come up with:
Copyright/License
A description of what the class does
A last modified date?
Is there anything else which should be provided?
One logical thing I've heard is to keep authors out of the header because it's redundant with the information already being provided via source control.
Update:
JavaDoc can be assumed here, but I'm really more concerned about the details of what's good to include content-wise, whether it's definitive meta-data that can be mined, or the more loose, WHY etc...

One logical thing I've heard is to keep authors out of the header because it's redundant
with the information already being provided via source control.
also last modified date is redundant
I use a small set of documentation patterns:
always documenting about thread-safety
always documenting immutability
javadoc with examples
#Deprecation with WHY and HOW to replace the annotated element
keeping comments at minimum

No to the "last modified date" - that belongs in source control too.
The other two are fine. Basically concentrate on the useful text - what the class does, any caveats around thread safety, expected usage etc.
Implementation comments should usually be about why you're doing something non-obvious - and should therefore be rare. (For instance, it could be because some API behaves in an unusual way, or because there's a useful shortcut you can use but which isn't immediately obvious.)

For the sanity of yourself and future developers, you really ought to be writing Javadocs.

When you feel the need to write comments to explain what some code does, improve the readability of the code, so that comments are not needed. You can do that by renaming methods/fields/classes to have more meaningful names, and by splitting larger methods into smaller methods using the composed method pattern.
If even after all your efforts the code is not self-explanatory, for example the reason why some unobvious code had to be written is not clear from the code, then apologize by writing comments. (Sometimes you can document the reasons by writing a test which will fail, if somebody changes the unobvious-but-correct code to do the obvious-but-wrong thing. But having a comment in addition to that is also useful. I prefix such comments often with "// HACK:" or "// XXX:".)

An overall description of the purpose of the class, a description for each field and a contract for each method. Javadoc format works well.

If you assign ownership of components to particular developers or teams, owners should be recorded in the component source or VCS metadata.

Related

Handling comments in code when doing i18n

I'm in the process of translating a Open Source project from Chinese to English, and I've used i18n (in this case babel) to separate the code from both English and Chinese translations.
Everything's done, except for a rather large number of inline comments in the code.
Obviously, babel can't translate comments inline (and it would be rather obnoxious if it did, anyway. Since code would not be unique across languages and therefore less easily verifiable.)
The way I see it, there are a number of options:
Leave comments in -
Pro: Helps original author, etc.
Con: Makes it distracting for ongoing translation and anyone who doesn't speak the language
Strip out all the comments -
Pro: Code is native-language-agnostic, so it makes sense. Who needs comments anyway? Use the source, Luke!
Con: Goes against SE principles. Could lose something important in understanding how the code works - maybe something's been done to avoid a security risk, etc.
Place English comments near Chinese comments
(Possibly moved to lines above and prefixed with "EN" and "ZH", for example).
Pro: Best of both worlds, comments kept close to code
Con: Not conducive to dictionary-style translation. Can get bulky with more languages.
Create a comment dictionary / notes
Pro: Keeps the comments in a separate file for easy translation.
Con: Difficult to keep synced with code. Not intuitive to remember to update comments related to code when changing coe.
Use a different preprocessor for i18n before/after each development cycle.
Pro: Comments et al would be in your language. Could link this to git pull/push so you only ever see the code in your language.
Con: Bulky, non-obvious process. Could result in code-verification or even compilation errors.
None of these seem like really great solutions.
If you do alot of this, and the code is shared publicly between developers who don't share a native tongue, is there a recommended way to handle translating (or not) comments in the code itself?
I am not sure I understand... You say you separated the code from the languages part. So now you should have code (with comments) + English resources + Chinese resources (i used resources for whatever your programming language use to store localizable content)
Translators only see the resources, not the code, nor the comments. The comments stay untranslated, for the developers.
Short Answer
It seems to be a mixture of:
Strip out all the comments, and
Place English comments near Chinese comments.
Inline comments are almost always trivial - Strip them
Functional comments are not as intrusive - Translate them (possibly with a i18n prefix e.g. "[cn]:" or "[en]:").
Explanation
My meagre amount of research tends to suggest that larger projects make strong attempts to reduce comments and let the code explain itself, instead focusing on code quality which reduces the need for comments.
e.g. From the Linux Kernel Coding guidelines:
NEVER try to explain HOW your code works in a comment: it's much
better to write the code so that the working is obvious, and it's
a waste of time to explain badly written code.
...and from the MySQL coding standards:
Comment your code when you do something that someone else may think is
not trivial.
Both of these standards (and others) recommend minimal function descriptions also, so that's not as obtrusive to understanding the code, and, since function descriptions are generally multi-lined and above the code itself, multiple languages can be included as necessary.
Maybe someone, somewhere has built an interface that can isolate comments into the readers language, but I couldn't (yet) find any that do so.
I always think that API comments exported in the project and private comments in open source projects should be internationalized, which is very convenient for developers in other countries.
On Github, there are actually many developers who use their own national language to comment on some well-known open source projects and some of their own annotations. Most of the reason is that if they do not translate, the efficiency of developers reading comments very low.
Similar to .d.ts in TypeScript, I think function annotation translation can also take a similar form, which is more convenient for the community to feedback translation content, because in fact many developers are willing to do so.

Does identifier casing really matter?

FxCop thought me (basically, from memory) that functions, classes and properties should be written in MajorCamelCase, while private variables should be in minorCamelCase.
I was talking about a reasonably popular project on IRC and quoted some code. One other guy, a fairly notorious troll who was also a half-op (gasp!) didn't seem to agree. Everything oughta be in the same casing, and he quite fervently favored MajorCamelCase, or even underscore_separation.
Ofcourse, he was just a troll so I reckoned I'd just keep doing it the way I already did. Before I learned the above guidelines, I hardly even had a coherent naming style.
He got me thinking, though -- does stuff like this really matter?
You need to make sure that your code is readable in the future. Please remember that you might want to pass the development of your application to someone else and this person will need to read and understand it. You could stop actively working on a project and return to it after a year - and be suprised that you have to read code carefully to understand how it works.
I believe it was Steve McConnell who said that specific naming style does not really matter (you could use anything you want as long as you are consistent) but this only applies when everyone working on the project agree with you.
In general it is better to adopt community-accepted coding styles where possible to facilitate code reuse and shorten learning curves.
If you don't care about long-term maintanability of your project (or consistency or readability) then no, casing (and coding conventions in general) don't really matter. Otherwise, they do matter. See this.
Your specific coding style doesn't matter (much), so long as it is consistent throughout the project.
This improves readability and understanding, as if an identifier is named in a particular way, the reader can (hopefully) be confident as to what that naming style implies.
As regards CamelCase v underscores, etc: again, it's down to your coding convention. One approach which uses both is to apply a prefix with underscore to indicate the module in which the function, or file-scope/global variable, is used, e.g. Config_Update(), Status_Get().

What is your personal approach/take on commenting?

Duplicate
What are your hard rules about commenting?
A Developer I work with had some things to say about commenting that were interesting to me (see below). What is your personal approach/take on commenting?
"I don't add comments to code unless
its a simple heading or there's a
platform-bug or a necessary
work-around that isn't obvious. Code
can change and comments may become
misleading. Code should be
self-documenting in its use of
descriptive names and its logical
organization - and its solutions
should be the cleanest/simplest way
to perform a given task. If a
programmer can't tell what a program
does by only reading the code, then
he's not ready to alter it.
Commenting tends to be a crutch for
writing something complex or
non-obvious - my goal is to always
write clean and simple code."
"I think there a few camps when it
comes to commenting, the
enterprisey-type who think they're
writing an API and some grand
code-library that will be used for
generations to come, the
craftsman-like programmer that thinks
code says what it does clearer than a
comment could, and novices that write
verbose/unclear code so as to need to
leave notes to themselves as to why
they did something."
There's a tragic flaw with the "self-documenting code" theory. Yes, reading the code will tell you exactly what it is doing. However, the code is incapable of telling you what it's supposed to be doing.
I think it's safe to say that all bugs are caused when code is not doing what it's supposed to be doing :). So if we add some key comments to provide maintainers with enough information to know what a piece of code is supposed to be doing, then we have given them the ability to fix a whole lot of bugs.
That leaves us with the question of how many comments to put in. If you put in too many comments, things become tedious to maintain and the comments will inevitably be out of date with the code. If you put in too few, then they're not particularly useful.
I've found regular comments to be most useful in the following places:
1) A brief description at the top of a .h or .cpp file for a class explaining the purpose of the class. This helps give maintainers a quick overview without having to sift through all of the code.
2) A comment block before the implementation of a non-trivial function explaining the purpose of it and detailing its expected inputs, potential outputs, and any oddities to expect when calling the function. This saves future maintainers from having to decipher entire functions to figure these things out.
Other than that, I tend to comment anything that might appear confusing or odd to someone. For example: "This array is 1 based instead of 0 based because of blah blah".
Well written, well placed comments are invaluable. Bad comments are often worse than no comments. To me, lack of any comments at all indicates laziness and/or arrogance on the part of the author of the code. No matter how obvious it is to you what the code is doing or how fantastic your code is, it's a challenging task to come into a body of code cold and figure out what the heck is going on. Well done comments can make a world of difference getting someone up to speed on existing code.
I've always liked Refactoring's take on commenting:
The reason we mention comments here is that comments often are used as a deodorant. It's surprising how often you look at thickly commented code and notice that the comments are there because the code is bad.
Comments lead us to bad code that has all the rotten whiffs we've discussed in the rest of this chapter. Our first action is to remove the bad smells by refactoring. When we're finished, we often find that the comments are superfluous.
As controversial as that is, it's rings true for the code I've read. To be fair, Fowler isn't saying to never comment, but to think about the state of your code before you do.
You need documentation (in some form; not always comments) for a local understanding of the code. Code by itself tells you what it does, if you read all of it and can keep it all in mind. (More on this below.) Comments are best for informal or semiformal documentation.
Many people say comments are a code smell, replaceable by refactoring, better naming, and tests. While this is true of bad comments (which are legion), it's easy to jump to concluding it's always so, and hallelujah, no more comments. This puts all the burden of local documentation -- too much of it, I think -- on naming and tests.
Document the contract of each function and, for each type of object, what it represents and any constraints on a valid representation (technically, the abstraction function and representation invariant). Use executable, testable documentation where practical (doctests, unit tests, assertions), but also write short comments giving the gist where helpful. (Where tests take the form of examples, they're incomplete; where they're complete, precise contracts, they can be as much work to grok as the code itself.) Write top-level comments for each module and each project; these can explain conventions that keep all your other comments (and code) short. (This supports naming-as-documentation: with conventions established, and a place we can expect to find subtleties noted, we can be confident more often that the names tell all we need to know.) Longer, stylized, irritatingly redundant Javadocs have their uses, but helped generate the backlash.
(For instance, this:
Perform an n-fold frobulation.
#param n the number of times to frobulate
#param x the x-coordinate of the center of frobulation
#param y the y-coordinate of the center of frobulation
#param z the z-coordinate of the center of frobulation
could be like "Frobulate n times around the center (x,y,z)." Comments don't have to be a chore to read and write.)
I don't always do as I say here; it depends on how much I value the code and who I expect to read it. But learning how to write this way made me a better programmer even when cutting corners.
Back on the claim that we document for the sake of local understanding: what does this function do?
def is_even(n): return is_odd(n-1)
Tests if an integer is even? If is_odd() tests if an integer is odd, then yes, that works. Suppose we had this:
def is_odd(n): return is_even(n-1)
The same reasoning says this is_odd() tests if an integer is odd. Put them together, of course, and neither works, even though each works if the other does. Change it a bit and we'd have code that does work, but only for natural numbers, while still locally looking like it works for integers. In microcosm that's what understanding a codebase is like: tracing dependencies around in circles to try to reverse-engineer assumptions the author could have explained in a line or two if they'd bothered. I hate the expense of spirit thoughtless coders have put me to this way over the past couple of decades: oh, this method looks like it has the side effect of farbuttling the warpcore... always? Well, if odd crobuncles desaturate, at least; do they? Better check all the crobuncle-handling code... which will pose its own challenges to understanding. Good documentation cuts this O(n) pointer-chasing down to O(1): e.g. knowing a function's contract and the contracts of the things it explicitly uses, the function's code should make sense with no further knowledge of the system. (Here, contracts saying is_even() and is_odd() work on natural numbers would tell us that both functions need to test for n==0.)
My only real rule is that comments should explain why code is there, not what it is doing or how it is doing it. Those things can change, and if they do the comments have to be maintained. The purpose the code exists in the first place shouldn't change.
the purpose of comments is to explain the context - the reason for the code; this, the programmer cannot know from mere code inspection. For example:
strangeSingleton.MoveLeft(1.06);
badlyNamedGroup.Ignite();
who knows what the heck this is for? but with a simple comment, all is revealed:
//when under attack, sidestep and retaliate with rocket bundles
strangeSingleton.MoveLeft(1.06);
badlyNamedGroup.Ignite();
seriously, comments are for the why, not the how, unless the how is unintuitive.
While I agree that code should be self-readable, I still see a lot of value in adding extensive comment blocks for explaining design decisions. For example "I did xyz instead of the common practice of abc because of this caveot ..." with a URL to a bug report or something.
I try to look at it as: If I'm dead and gone and someone straight out of college has to fix a bug here, what are they going to need to know?
In general I see comments used to explain poorly written code. Most code can be written in a way that would make comments redundant. Having said that I find myself leaving comments in code where the semantics aren't intuitive, such as calling into an API that has strange or unexpected behavior etc...
I also generally subscribe to the self-documenting code idea, so I think your developer friend gives good advice, and I won't repeat that, but there are definitely many situations where comments are necessary.
A lot of times I think it boils down to how close the implementation is to the types of ordinary or easy abstractions that code-readers in the future are going to be comfortable with or more generally to what degree the code tells the entire story. This will result in more or fewer comments depending on the type of programming language and project.
So, for example if you were using some kind of C-style pointer arithmetic in an unsafe C# code block, you shouldn't expect C# programmers to easily switch from C# code reading (which is probably typically more declarative or at least less about lower-level pointer manipulation) to be able to understand what your unsafe code is doing.
Another example is when you need to do some work deriving or researching an algorithm or equation or something that is not going to end up in your code but will be necessary to understand if anyone needs to modify your code significantly. You should document this somewhere and having at least a reference directly in the relevant code section will help a lot.
I don't think it matters how many or how few comments your code contains. If your code contains comments, they have to maintained, just like the rest of your code.
EDIT: That sounded a bit pompous, but I think that too many people forget that even the names of the variables, or the structures we use in the code, are all simply "tags" - they only have meaning to us, because our brains see a string of characters such as customerNumber and understand that it is a customer number. And while it's true that comments lack any "enforcement" by the compiler, they aren't so far removed. They are meant to convey meaning to another person, a human programmer that is reading the text of the program.
If the code is not clear without comments, first make the code a clearer statement of intent, then only add comments as needed.
Comments have their place, but primarily for cases where the code is unavoidably subtle or complex (inherent complexity is due to the nature of the problem being solved, not due to laziness or muddled thinking on the part of the programmer).
Requiring comments and "measuring productivity" in lines-of-code can lead to junk such as:
/*****
*
* Increase the value of variable i,
* but only up to the value of variable j.
*
*****/
if (i < j) {
++i;
} else {
i = j;
}
rather than the succinct (and clear to the appropriately-skilled programmer):
i = Math.min(j, i + 1);
YMMV
The vast majority of my commnets are at the class-level and method-level, and I like to describe the higher-level view instead of just args/return value. I'm especially careful to describe any "non-linearities" in the function (limits, corner cases, etc) that could trip up the unwary.
Typically I don't comment inside a method, except to mark "FIXME" items, or very occasionally some sort of "here be monsters" gotcha that I just can't seem to clean up, but I work very hard to avoid those. As Fowler says in Refactoring, comments tend to indicate smally code.
Comments are part of code, just like functions, variables and everything else - and if changing the related functionality the comment must also be updated (just like function calls need changing if function arguments change).
In general, when programming you should do things once in one place only.
Therefore, if what code does is explained by clear naming, no comment is needed - and this is of course always the goal - it's the cleanest and simplest way.
However, if further explanation is needed, I will add a comment, prefixed with INFO, NOTE, and similar...
An INFO: comment is for general information if someone is unfamiliar with this area.
A NOTE: comment is to alert of a potential oddity, such as a strange business rule / implementation.
If I specifically don't want people touching code, I might add a WARNING: or similar prefix.
What I don't use, and am specifically opposed to, are changelog-style comments - whether inline or at the head of the file - these comments belong in the version control software, not the sourcecode!
I prefer to use "Hansel and Gretel" type comments; little notes in the code as to why I'm doing it this way, or why some other way isn't appropriate. The next person to visit this code will probably need this info, and more often than not, that person will be me.
As a contractor I know that some people maintaining my code will be unfamiliar with the advanced features of ADO.Net I am using. Where appropriate, I add a brief comment about the intent of my code and a URL to an MSDN page that explains in more detail.
I remember learning C# and reading other people's code I was often frustrated by questions like, "which of the 9 meanings of the colon character does this one mean?" If you don't know the name of the feature, how do you look it up?! (Side note: This would be a good IDE feature: I select an operator or other token in the code, right click then shows me it's language part and feature name. C# needs this, VB less so.)
As for the "I don't comment my code because it is so clear and clean" crowd, I find sometimes they overestimate how clear their very clever code is. The thought that a complex algorithm is self-explanatory to someone other than the author is wishful thinking.
And I like #17 of 26's comment (empahsis added):
... reading the code will tell you exactly
what it is doing. However, the code is
incapable of telling you what it's
supposed to be doing.
I very very rarely comment. MY theory is if you have to comment it's because you're not doing things the best way possible. Like a "work around" is the only thing I would comment. Because they often don't make sense but there is a reason you are doing it so you need to explain.
Comments are a symptom of sub-par code IMO. I'm a firm believer in self documenting code. Most of my work can be easily translated, even by a layman, because of descriptive variable names, simple form, and accurate and many methods (IOW not having methods that do 5 different things).
Comments are part of a programmers toolbox and can be used and abused alike. It's not up to you, that other programmer, or anyone really to tell you that one tool is bad overall. There are places and times for everything, including comments.
I agree with most of what's been said here though, that code should be written so clear that it is self-descriptive and thus comments aren't needed, but sometimes that conflicts with the best/optimal implementation, although that could probably be solved with an appropriately named method.
I agree with the self-documenting code theory, if I can't tell what a peice of code is doing simply by reading it then it probably needs refactoring, however there are some exceptions to this, I'll add a comment if:
I'm doing something that you don't
normally see
There are major side effects or implementation details that aren't obvious, or won't be next year
I need to remember to implement
something although I prefer an
exception in these cases.
If I'm forced to go do something else and I'm having good ideas, or a difficult time with the code, then I'll add sufficient comments to tmporarily preserve my mental state
Most of the time I find that the best comment is the function or method name I am currently coding in. All other comments (except for the reasons your friend mentioned - I agree with them) feel superfluous.
So in this instance commenting feels like overkill:
/*
* this function adds two integers
*/
int add(int x, int y)
{
// add x to y and return it
return x + y;
}
because the code is self-describing. There is no need to comment this kind of thing as the name of the function clearly indicates what it does and the return statement is pretty clear as well. You would be surprised how clear your code becomes when you break it down into tiny functions like this.
When programming in C, I'll use multi-line comments in header files to describe the API, eg parameters and return value of functions, configuration macros etc...
In source files, I'll stick to single-line comments which explain the purpose of non-self-evident pieces of code or to sub-section a function which can't be refactored to smaller ones in a sane way. Here's an example of my style of commenting in source files.
If you ever need more than a few lines of comments to explain what a given piece of code does, you should seriously consider if what you're doing can't be done in a better way...
I write comments that describe the purpose of a function or method and the results it returns in adequate detail. I don't write many inline code comments because I believe my function and variable naming to be adequate to understand what is going on.
I develop on a lot of legacy PHP systems that are absolutely terribly written. I wish the original developer would have left some type of comments in the code to describe what was going on in those systems. If you're going to write indecipherable or bad code that someone else will read eventually, you should comment it.
Also, if I am doing something a particular way that doesn't look right at first glance, but I know it is because the code in question is a workaround for a platform or something like that, then I'll comment with a WARNING comment.
Sometimes code does exactly what it needs to do, but is kind of complicated and wouldn't be immediately obvious the first time someone else looked at it. In this case, I'll add a short inline comment describing what the code is intended to do.
I also try to give methods and classes documentation headers, which is good for intellisense and auto-generated documentation. I actually have a bad habit of leaving 90% of my methods and classes undocumented. You don't have time to document things when you're in the middle of coding and everything is changing constantly. Then when you're done you don't feel like going back and finding all the new stuff and documenting it. It's probably good to go back every month or so and just write a bunch of documentation.
Here's my view (based on several years of doctoral research):
There's a huge difference between commenting functions (sort of a black box use, like JavaDocs), and commenting actual code for someone who will read the code ("internal commenting").
Most "well written" code shouldn't require much "internal commenting" because if it performs a lot then it should be broken into enough function calls. The functionality for each of these calls is then captured in the function name and in the function comments.
Now, function comments are indeed the problem, and in some ways your friend is right, that for most code there is no economical incentive for complete specifications the way that popular APIs are documented. The important thing here is to identify what are the "directives": directives are those information pieces that directly affect clients, and require some direct action (and are often unexpected). For example, X must be invoked before Y, don't call this from outside a UI thread, be aware that this has a certain side effect, etc. These are the things that are really important to capture.
Since most people never read full function documentations, and skim what they do read, you can actually increase the chances of awareness by capturing only the directives rather than the whole description.
I comment as much as needed - then, as much as I will need it a year later.
We add comments which provide the API reference documentation for all public classes / methods / properties / etc... This is well worth the effort because XML Documentation in C# has the nice effect of providing IntelliSense to users of these public APIs. .NET 4.0's code contracts will enable us to improve further on this practice.
As a general rule, we do not document internal implementations as we write code unless we are doing something non-obvious. The theory is that while we are writing new implementations, things are changing and comments are more likely than not to be wrong when the dust settles.
When we go back in to work on an existing piece of code, we add comments when we realize that it's taking some thought to figure out what in the heck is going on. This way, we wind up with comments where they are more likely to be correct (because the code is more stable) and where they are more likely to be useful (if I'm coming back to a piece of code today, it seems more likely that I might come back to it again tomorrow).
My approach:
Comments bridge the gap between context / real world and code. Therefore, each and every single line is commented, in correct English language.
I DO reject code that doesn't observe this rule in the strictest possible sense.
Usage of well formatted XML - comments is self-evident.
Sloppy commenting means sloppy code!
Here's how I wrote code:
if (hotel.isFull()) {
print("We're fully booked");
} else {
Guest guest = promptGuest();
hotel.checkIn(guest);
}
here's a few comments that I might write for that code:
// if hotel is full, refuse checkin, otherwise
// prompt the user for the guest info, and check in the guest.
If your code reads like a prose, there is no sense in writing comments that simply repeats what the code reads since the mental processing needed for reading the code and the comments would be almost equal; and if you read the comments first, you will still need to read the code as well.
On the other hand, there are situations where it is impossible or extremely difficult to make the code looks like a prose; that's where comment could patch in.

What are your "hard rules" about commenting your code? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 months ago.
Improve this question
I have seen the other questions but I am still not satisfied with the way this subject is covered.
I would like to extract a distiled list of things to check on comments at a code inspection.
I am sure people will say things that will just cancel each other. But hey, maybe we can build a list for each camp. For those who don't comment at all the list will just be very short :)
I have one simple rule about commenting: Your code should tell the story of what you are doing; your comments should tell the story of why you are doing it.
This way, I make sure that whoever inherits my code will be able to understand the intent behind the code.
I comment public or protected functions with meta-comments, and usually hit the private functions if I remember.
I comment why any sufficiently complex code block exists (judgment call). The why is the important part.
I comment if I write code that I think is not optimal but I leave it in because I cannot figure out a smarter way or I know I will be refactoring later.
I comment to remind myself or others of missing functionality or upcoming requirements code not present in the code (TODO, etc).
I comment to explain complex business rules related to a class or chunk of code. I have been known to write several paragraphs to make sure the next guy/gal knows why I wrote a hundred line class.
If a comment is out of date (does not match the code), delete it or update it. Never leave an inaccurate comment in place.
Documentation is like sex; when it's
good, it's very, very good, and when
it's bad, it's better than nothing
Write readable code that is self-explanatory as much as possible. Add comments whenever you have to write code that is too complex to understand at a glance. Also add comments to describe the business purpose behind code that you write, to make it easier to maintain/refactor it in the future.
The comments you write can be revealing about the quality of your code. Countless times I've removed comments in my code to replace them with better, clearer code. For this I follow a couple of anti-commenting rules:
If your comment merely explains a line of code, you should either let that line of code speak for itself or split it up into simpler components.
If your comment explains a block of code within a function, you should probably be explaining a new function instead.
Those are really the same rule repeated for two different contexts.
The other, more normal rules I follow are:
When using a dynamically-typed language, document the expectations that important functions make about their arguments, as well as the expectations callers can make about the return values. Important functions are those that will ever have non-local callers.
When your logic is dictated by the behavior of another component, it's good to document what your understanding and expectations of that component are.
When implementing an RFC or other protocol specification, comment state machines / event handlers / etc with the section of the spec they correspond to. Make sure to list the version or date of the spec, in case it is revised later.
I usually comment a method before I write it. I'll write a line or two of comments for each step I need to take within the function, and then I write the code between the comments. When I'm done, the code is already commented.
The great part about that is that it's commented before I write the code, so there are not unreasonable assumptions about previous knowledge in the comments; I, myself, knew nothing about my code when I wrote them. This means that they tend to be easy to understand, as they should be.
There are no hard rules - hard rules lead to dogma and people generally follow dogma when they're not smart enough to think for themselves.
The guidelines I follow:
1/ Comments tell what is being done, code tells how it's being done - don't duplicate your effort.
2/ Comments should refer to blocks of code, not each line. That includes comments that explain whole files, whole functions or just a complicated snippet of code.
3/ If I think I'd come back in a year and not understand the code/comment combination then my comments aren't good enough yet.
A great rule for comments: if you're reading through code trying to figure something out, and a comment somewhere would have given you the answer, put it there when you know the answer.
Only spend that time investigating once.
Eventually you will know as you write the places that you need to leave guidance, and the places that are sufficiently obvious to stand alone. Until then, you'll spend time trawling through your code trying to figure out why you did something :)
I document every class, every function, every variable within a class. Simple DocBlocks are the way forward.
I'll generally write these docblocks more for automated API documentation than anything else...
For example, the first section of one of my PHP classes
/**
* Class to clean variables
*
* #package Majyk
* #author Martin Meredith <martin#sourceguru.net>
* #licence GPL (v2 or later)
* #copyright Copyright (c) 2008 Martin Meredith <martin#sourceguru.net>
* #version 0.1
*/
class Majyk_Filter
{
/**
* Class Constants for Cleaning Types
*/
const Integer = 1;
const PositiveInteger = 2;
const String = 3;
const NoHTML = 4;
const DBEscapeString = 5;
const NotNegativeInteger = 6;
/**
* Do the cleaning
*
* #param integer Type of Cleaning (as defined by constants)
* #param mixed Value to be cleaned
*
* #return mixed Cleaned Variable
*
*/
But then, I'll also sometimes document significant code (from my init.php
// Register the Auto-Loader
spl_autoload_register("majyk_autoload");
// Add an Exception Handler.
set_exception_handler(array('Majyk_ExceptionHandler', 'handle_exception'));
// Turn Errors into Exceptions
set_error_handler(array('Majyk_ExceptionHandler', 'error_to_exception'), E_ALL);
// Add the generic Auto-Loader to the auto-loader stack
spl_autoload_register("spl_autoload");
And, if it's not self explanatory why something does something in a certain way, I'll comment that
The only guaranteed place I leave comments: TODO sections. The best place to keep track of things that need reworking is right there in the code.
I create a comment block at the beginning of my code, listing the purpose of the program, the date it was created, any license/copyright info (like GPL), and the version history.
I often comment my imports if it's not obvious why they are being imported, especially if the overall program doesn't appear to need the imports.
I add a docstring to each class, method, or function, describing what the purpose of that block is and any additional information I think is necessary.
I usually have a demarcation line for sections that are related, e.g. widget creation, variables, etc. Since I use SPE for my programming environment, it automatically highlights these sections, making navigation easier.
I add TODO comments as reminders while I'm coding. It's a good way to remind myself to refactor the code once it's verified to work correctly.
Finally, I comment individual lines that may need some clarification or otherwise need some metadata for myself in the future or other programmers.
Personally, I hate looking at code and trying to figure out what it's supposed to do. If someone could just write a simple sentence to explain it, life is easier. Self-documenting code is a misnomer, in my book.
I focus on the why. Because the what is often easy readable.
TODO's are also great, they save a lot of time.
And i document interfaces (for example file formats).
A really important thing to check for when you are checking header documentation (or whatever you call the block preceding the method declaration) is that directives and caveats are easy to spot.
Directives are any "do" or "don't do" instructions that affect the client: don't call from the UI thread, don't use in performance critical code, call X before Y, release return value after use, etc.
Caveats are anything that could be a nasty surprise: remaining action items, known assumptions and limitations, etc.
When you focus on a method that you are writing and inspecting, you'll see everything. When a programmer is using your method and thirty others in an hour, you can't count on a thorough read. I can send you research data on that if you're interested.
Pre-ambles only; state a class's Single Responsibility, any notes or comments, and change log. As for methods, if any method needs substantial commenting, it is time to refactor.
When you're writing comments, stop, reflect and ask yourself if you can change the code so that the comments aren't needed. Could you change some variable, class or method names to make things clearer? Would some asserts or other error checks codify your intentions or expectations? Could you split some long sections of code into clearly named methods or functions? Comments are often a reflection of our inability to write (a-hem, code) clearly. It's not always easy to write clearly with computer languages but take some time to try... because code never lies.
P.S. The fact that you use quotes around "hard rules" is telling. Rules that aren't enforced aren't "hard rules" and the only rules that are enforced are in code.
I add 1 comment to a block of code that summarizes what I am doing. This helps people who are looking for specific functionality or section of code.
I comment any complex algorithm, or process, that can't be figured out at first glance.
I sign my code.
In my opinion, TODO/TBD/FIXME etc. are ok to have in code which is currently being worked on, but when you see code which hasn't been touched in 5 years and is full of them, you realize that it's a pretty lousy way of making sure that things get fixed. In short, TODO notes in comments tend to stay there. Better to use a bugtracker if you have things which need to be fixed at some point.
Hudson (CI server) has a great plugin which scans for TODOs and notes how many there are in your code. You can even set thresholds causing the build to be classified as unstable if there are too many of them.
My favorite rule-of-thumb regarding comments is: if the code and the comments disagree, then both are likely incorrect
We wrote an article on comments (actually, I've done several) here:
http://agileinaflash.blogspot.com/2009/04/rules-for-commenting.html
It's really simple: Comments are written to tell you what the code cannot.
This results in a simple process:
- Write any comment you want at first.
- Improve the code so that the comment becomes redundant
- Delete the now-redundant comment.
- Only commit code that has no redundant comments
I'm writing a Medium article in which I will present this rule: when you commit changes to a repository, each comment must be one of these three types:
A license header at the top
A documentation comment (e.g., Javadoc), or
A TODO comment.
The last type should not be permanent. Either the thing gets done and the TODO comment is deleted, or we decide the task is not necessary and the TODO comment gets deleted.

Standards Document

I am writing a coding standards document for a team of about 15 developers with a project load of between 10 and 15 projects a year. Amongst other sections (which I may post here as I get to them) I am writing a section on code formatting. So to start with, I think it is wise that, for whatever reason, we establish some basic, consistent code formatting/naming standards.
I've looked at roughly 10 projects written over the last 3 years from this team and I'm, obviously, finding a pretty wide range of styles. Contractors come in and out and at times, and sometimes even double the team size.
I am looking for a few suggestions for code formatting and naming standards that have really paid off ... but that can also really be justified. I think consistency and shared-patterns go a long way to making the code more maintainable ... but, are there other things I ought to consider when defining said standards?
How do you lineup parenthesis? Do you follow the same parenthesis guidelines when dealing with classes, methods, try catch blocks, switch statements, if else blocks, etc.
Do you line up fields on a column? Do you notate/prefix private variables with an underscore? Do you follow any naming conventions to make it easier to find particulars in a file? How do you order the members of your class?
What about suggestions for namespaces, packaging or source code folder/organization standards? I tend to start with something like:
<com|org|...>.<company>.<app>.<layer>.<function>.ClassName
I'm curious to see if there are other, more accepted, practices than what I am accustomed to -- before I venture off dictating these standards. Links to standards already published online would be great too -- even though I've done a bit of that already.
First find a automated code-formatter that works with your language. Reason: Whatever the document says, people will inevitably break the rules. It's much easier to run code through a formatter than to nit-pick in a code review.
If you're using a language with an existing standard (e.g. Java, C#), it's easiest to use it, or at least start with it as a first draft. Sun put a lot of thought into their formatting rules; you might as well take advantage of it.
In any case, remember that much research has shown that varying things like brace position and whitespace use has no measurable effect on productivity or understandability or prevalence of bugs. Just having any standard is the key.
Coming from the automotive industry, here's a few style standards used for concrete reasons:
Always used braces in control structures, and place them on separate lines. This eliminates problems with people adding code and including it or not including it mistakenly inside a control structure.
if(...)
{
}
All switches/selects have a default case. The default case logs an error if it's not a valid path.
For the same reason as above, any if...elseif... control structures MUST end with a default else that also logs an error if it's not a valid path. A single if statement does not require this.
In the occasional case where a loop or control structure is intentionally empty, a semicolon is always placed within to indicate that this is intentional.
while(stillwaiting())
{
;
}
Naming standards have very different styles for typedefs, defined constants, module global variables, etc. Variable names include type. You can look at the name and have a good idea of what module it pertains to, its scope, and type. This makes it easy to detect errors related to types, etc.
There are others, but these are the top off my head.
-Adam
I'm going to second Jason's suggestion.
I just completed a standards document for a team of 10-12 that work mostly in perl. The document says to use "perltidy-like indentation for complex data structures." We also provided everyone with example perltidy settings that would clean up their code to meet this standard. It was very clear and very much industry-standard for the language so we had great buyoff on it by the team.
When setting out to write this document, I asked around for some examples of great code in our repository and googled a bit to find other standards documents that smarter architects than I to construct a template. It was tough being concise and pragmatic without crossing into micro-manager territory but very much worth it; having any standard is indeed key.
Hope it works out!
It obviously varies depending on languages and technologies. By the look of your example name space I am going to guess java, in which case http://java.sun.com/docs/codeconv/ is a really good place to start. You might also want to look at something like maven's standard directory structure which will make all your projects look similar.

Resources