Expressions which always return true (and the compiler doesn't know) - debugging

When experimenting I often use if (true) {..} or if (false) {..} to section off chunks of code I'm playing with.
The problem is that compilers these days sometimes issue a warning about unreachable code. I then have to promote my code to something like if ((10 % 2) == 0), but then some smarter compilers catch this as unreachable too.. (And so a small arms race begins..)
How do I solve this problem?

Try moving it out of the code, using pre-processor flags:
#if false
.... code which won't get run ....
#endif
Some IDEs will even collapse such un-runnable blocks for you to keep them out of the way.

Why not just use some kind of "environment variable" to control this kind of thing? Make it a real conditional, then you're not fighting the compiler.
if ( environment.MODE_EXPERIMENT) {
experimental code here
}
Personally I try pretty hard to avoid doing this kind of thing; we have SCM systems that allow us to keep old versions of code, so we really shouldn't need to do this too much.
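A minimal Java sketch of that idea, assuming the flag comes from a JVM system property. The property name experiment.mode and the class name are made up for illustration. Because the value is only known at run time, the compiler has nothing to flag as unreachable:

```java
final class ExperimentFlag {
    // Hypothetical property name chosen for this sketch.
    static final String PROPERTY = "experiment.mode";

    // Read at run time, so the compiler cannot fold it into a constant.
    static boolean enabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    static String run() {
        if (enabled()) {
            return "experimental path";
        }
        return "stable path";
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Starting the JVM with -Dexperiment.mode=true would switch to the experimental branch without touching the source.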

This should do:
if (atoi("1")) {
}
This can also be useful to generate any kind of constant.

Almost every single language has a version of if(false) { } that a compiler/IDE won't complain about. The most common (afaik) look like the following:
/*
int i = 0;
String s = "I will never be run!";
*/
Yes, comments. Multi-line comments, to be exact. Stopping blocks of code from being run is pretty much their second main purpose (after documentation). if(true), of course, is then handled by not using them.
You're trying to re-invent the wheel here, and I have no idea why.

Related

Is it a good practice to add extra checks in disabled GUI options?

I'm referring to the following:
void setup_gui()
{
if (some_condition)
some_button.disable();
...
}
void some_button_click()
{
// Is this a good practice?
if (some_condition)
return;
...
}
Adding that check ensures that the program won't run the operation, but it can also be seen as hiding a bug (some_button_click() must not have run at all).
So, what do you think about it? Is it a safe coding practice or hiding a bug?
Defensive programming is as reasonable as defensive driving.
It may be helpful to think of this in terms of separate concerns. One concern is the presentation. Another may be a set of business rules. It is reasonable to make the same check in both places.
You want to make the check in the presentation layer to communicate to the user.
You may also want to make the check below the presentation layer:
To defend against present and future mistakes in the presentation layer.
In case the code underneath the presentation layer is re-used elsewhere.
[From mvds's comment] In case the condition may change since the control was enabled or disabled.
Edit: David Heffernan's DRY concern below can be addressed trivially by defining the condition exactly once, and accessing it elsewhere.
void setup_gui()
{
some_button.setEnabled( context.isThisActionAvailable() );
...
}
void some_button_click()
{
if ( !context.isThisActionAvailable() )
return;
...
}
I don't take this belt and braces approach. The problem is that you have violated the DRY principle with the double use of some_condition. It's all too easy to change this in one place and not the other.
Of course some_condition is quite simple in this made-up example but in reality it's often much more complex.
If you can't trust your GUI framework to block actions when you request them to be blocked, then you need to fix the framework.
It can be considered safe coding, it can also be considered hiding a bug. This is the time when perhaps you should re-evaluate your program logic. Code redundancy is something that is always best to avoid if possible. It would be better if your program could be structured in such a way that it would not need to check twice, by making sure the actual logic of the program works properly and as intended.
Sure, if you're in a rush this is a quick fix and it works, but it reeks of poor design and/or logic elsewhere in the program.
The silent return in some_button_click() potentially hides a bug. I would either not do the check at all, or crash loudly and violently.
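A rough Java sketch of the "crash loudly" option, assuming the condition lives in one place. ActionContext and SomeButtonHandler are hypothetical names for illustration, not part of any real GUI framework:

```java
// Hypothetical stand-in for whatever object owns the condition.
final class ActionContext {
    private final boolean actionAvailable;

    ActionContext(boolean actionAvailable) {
        this.actionAvailable = actionAvailable;
    }

    boolean isThisActionAvailable() {
        return actionAvailable;
    }
}

final class SomeButtonHandler {
    private final ActionContext context;
    private int operationsRun = 0;

    SomeButtonHandler(ActionContext context) {
        this.context = context;
    }

    // Fails loudly instead of returning silently when a click gets through
    // even though the action should have been disabled.
    void onClick() {
        if (!context.isThisActionAvailable()) {
            throw new IllegalStateException("button clicked while its action is unavailable");
        }
        operationsRun++; // the real operation would go here
    }

    int operationsRun() {
        return operationsRun;
    }
}
```

The exception makes the presentation-layer bug visible the first time it happens, while the operation itself still never runs in the forbidden state.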

Is coding style mostly for readability or are there other factors?

I remember reading in Douglas Crockford's "JavaScript: The Good Parts" book that there is potential for error with blockless statements because of automatic semicolon insertion.
if (condition)
foo = true;
vs
if (condition)
{
foo = true;
}
In the second example it will work consistently; in the first example a semicolon will be automatically inserted by the interpreter and can lead to ambiguity in the code. As Douglas points out, this is potentially bad and hard to debug, and I agree. But it got me thinking: are there examples where coding "style" actually has syntax implications? In other words, examples where failing to follow a certain indentation or apparent style actually results in a bug or error. I suppose Python with its significant whitespace is an example; YAML with its requirement for no tabs is another.
Feel free to respond in a wide variety of languages and idioms. I am curious to hear about the paradigm cases. In your answer I would like to know the WHAT and the WHY of the coding style or syntax behavior. I don't want to start any coding style flame wars, just matter of fact scenarios where the uninitiated would get tripped up.
JavaScript treats these two cases separately. You have to use the first:
return {
// code
}
return
{
// code
}
If you do not, the interpreter adds semicolons in the wrong places. I think it puts one after the return. So the second would be read wrongly as:
return;
{
// code
}
Which is not invalid syntax, so no error is reported and the function silently returns undefined.
No one has mentioned it before, but one point of style is to write
if (NULL==p) // ...
instead of
if (p==NULL) // ...
The two are functionally equivalent so it's a matter of style. Many prefer the style at the top, because if you type "=" by mistake instead of "==", the first won't compile whereas the second will, and creates a bug that is hard to find (though some compilers now give you a warning for if (p=NULL)).
A coding style is not only for readability. There are some other factors like
Avoid common language pitfalls (silent fails, exceptions while casting etc.)
Ease of maintenance
Increase the clarity of code (e.g. private functions always starting lowerCase and public being PascalCase)
Enforcing conventions (e.g. not using multiple inheritance or always inherit public in C++)
An example for ease of maintenance is below:
if(x) return true;
vs.
if(x)
{
return true;
}
It is clear the second is easier to maintain, because I can simply go ahead and add a new line with a call to bla() without having to add braces first.
In Python, the whitespace indentation, rather than curly braces or keywords, delimits the statement blocks. An increase in indentation comes after certain statements; a decrease in indentation signifies the end of the current block.
Whitespace is not significant in any of the C family of languages, except to separate certain language tokens. The layout of the source code has no effect on the executable produced by the compiler.
In C++, there's a difference between
vector<pair<int,int>>
and
vector<pair<int,int> >
because >> is treated as a single token by the parser. (This applies before C++11; since C++11 both forms compile.)
Every time I open a parenthesis, brace, single or double quote I always close it and then go back to write the statement, condition, etc... This saves me from quite some possibly big mistakes!
It depends on the language. In the curly family C/C++/Java/C#, whitespace is not the restriction as long as your braces are opened and closed properly.
In languages like VB, where paired keywords delimit functions, you can't have
Private Sub someFunction() End Sub
But in C, C++ and other curly languages, you can have
void someFunction() { }
In Python, I guess it's a completely different matter.
It all depends on the particular language. As per your specific example, I don't think there is a syntactical or semantic difference between the two.
It is language dependent.
For instance in Java
void someFunction() { }
and
void someFunction() {
}
and
void someFunction()
{
}
will have no implication whatsoever. However, Sun has published a list of coding conventions; if you are interested, you could read them here.
These are mainly for maintainability and readability of code but need not be strictly followed.
The title wasn't specific to conditional blocks, so this came to mind:
I don't see it mentioned yet, but one thing to consider is how do you find things with simple tools (like grep and its inferior implementations from windowsland).
For example consider this (admittedly a bit contrived) example
class Foo
// vs.
class
Foo
You can find the former with the regex "class\s+Foo", but for the latter you'll need a specialized tool that can parse C++/C#/Java.
This applies also in C for function prototypes, some weird people prefer
void
bar (void)
// to
void bar(void)
While you can generally assume that if there are some A-Z characters in front of the function name, then it's very likely a definition/declaration.
Actually, when it comes to if blocks, the placement of the brace has a very big impact on debugging code in Visual Studio 200x. When it steps into a function/block, it places the text/execution line cursor on the opening bracket. So if the bracket happens to be waaaaay on the right side, the code window has to scroll there and buggeritall stays there. Extremely annoying.

What is the most obfuscated code you've had to fix?

Most programmers will have had the experience of debugging/fixing someone else's code. Sometimes that "someone else's code" is so obfuscated it's bad enough trying to understand what it's doing.
What's the worst (most obfuscated) code you've had to debug/fix?
If you didn't throw it away and recode it from scratch, well why didn't you?
PHP OSCommerce is enough to say, it is obfuscated code...
a Java class
only static methods that manipulates DOM
8000 LOCs
long chain of methods that return null on "error": a.b().c().d().e()
very long methods (400/500 LOC each)
nested if, while, like:
if (...) {
for (...) {
if (...) {
if (...) {
while (...) {
if (...) {
cut-and-paste oriented programming
no exceptions, all exceptions are caught and "handled" using printStackTrace()
no unit tests
no documentation
I was tempted to throw away and recode... but, after 3 days of hard debugging,
I've added the magic if :-)
Spaghetti code PHP CMS system.
by default, programmers think someone else's code is obfuscated.
The worst I probably had to do was interpreting what variables i1, i2, j, k, t were in a simple method, and they were not counters in 'for' loops.
In all other circumstances I guess the problem area was difficult which made the code look difficult.
I found this line in our codebase today and thought it was a nice example of sneaky obfuscation:
if (MULTICLICK_ENABLED.equals(propService.getProperty(PropertyNames.MULTICLICK_ENABLED))) {} else {
return false;
}
Just making sure I read the whole line. NO SKIMREADING.
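For comparison, a hedged Java sketch of the same check with the negation stated up front. The constant's value and the trailing return true are placeholders, since the post doesn't show them:

```java
final class MulticlickCheck {
    // Assumed value; the original constant's value is not shown in the post.
    static final String MULTICLICK_ENABLED = "true";

    // Same shape as the line above: empty "then" branch, logic hidden in the "else".
    static boolean original(String propertyValue) {
        if (MULTICLICK_ENABLED.equals(propertyValue)) {} else {
            return false;
        }
        return true; // placeholder for whatever followed in the real method
    }

    // Equivalent guard with the negation visible at the start of the condition.
    static boolean rewritten(String propertyValue) {
        if (!MULTICLICK_ENABLED.equals(propertyValue)) {
            return false;
        }
        return true; // same placeholder
    }
}
```

Both versions behave the same for every input, including null, so the rewrite only changes how easy the line is to skim.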
When working on a GWT project, I would reach parts of GWT-compiled obfuscated JS code which wasn't mine.
Now good luck debugging real obfuscated code.
I can't remember the full code, but a single part of it remains burned into my memory as something I spent hours trying to understand:
do{
$tmp = shift unless shift;
$tmp;
}while($tmp);
I couldn't understand it at first; it looks so useless. Then I printed out @_ for a list of arguments: a series of alternating booleans and function names. The code was used in conjunction with a library detection module that changed behaviour if a function was broken, but it was so badly documented, and so full of things like that which made no sense without a complete understanding of the full code, that I gave up and rewrote the whole thing.
UPDATE from DVK:
And, lest someone claims this was because Perl is unreadable as opposed to coder being a golf master instead of good software developer, here's the same code in a slightly less obfuscated form (the really correct code wouldn't even HAVE alternating sub names and booleans in the first place :)
# This subroutine takes a list of alternating true/false flags
# and subroutine names, and executes the named subroutines for which the flag is true.
# I am also weird, otherwise I'd simply have passed a list of subroutines to execute :)
my @flags_and_sub_names_list = @_;
while ( @flags_and_sub_names_list ) {
my $flag = shift @flags_and_sub_names_list;
my $subName = shift @flags_and_sub_names_list;
next unless $flag && $subName;
&{ $subName }; # Call the named subroutine
}
I've had a case of a 300-line function performing input sanitization which missed a certain corner case. It was parsing certain situations manually using IndexOf and Substring plus a lot of inlined variables and constants (looks like the original coder didn't know anything about good practices), and no comment was provided. Throwing it away wasn't feasible due to time constraints and the fact that I didn't have the required specification, so rewriting it would've meant understanding the original anyway; and once I understood it, fixing it was just quicker. I also added lots of comments, so whoever comes after me won't feel the same pain taking a look at it...
The Perl statement:
select((select(s),$|=1)[0])
which, at the suggestion of the original author (Randal Schwartz himself, who said he disliked it but nothing else was available at the time), was replaced with something a little more understandable:
IO::Handle->autoflush
Beyond that one-liner, some of the Java JDBC libraries from IBM are obfuscated and all variables and functions are either combinations of the letter 'l' and '1' or single/double characters - very hard to track anything down until you get them all renamed. Needed to do this to track down why they worked fine in IBM's JRE but not Sun's.
If you're talking about HLL code, I was once updating a project written by a Chinese developer and all the comments were in Chinese (stored in ANSI), and it was a horror to understand some code fragments. If you're talking about low-level code, there were MANY of them (obfuscated, mutated, vm-ed...).
I once had to reverse engineer a Java 1.1 framework that:
Extended event-driven SAX parser classes for every class, even those that didn't parse XML (the overridden methods were simply invoked ad hoc by other code)
Custom runtime exceptions were thrown in lieu of method invocations wherever possible. As a result, most of the business logic landed in a nested series of catch blocks.
If I had to guess, it was probably someone's "smart" idea that method invocations were expensive in Java 1.1, so throwing exceptions for non-exceptional flow control was somehow considered an optimization.
Went through about three bottles of eye drops.
I once found a time bomb that had been intentionally obfuscated.
When I had finally decoded what it was doing I mentioned it to the manager who said they knew about the time bomb but had left it in place because it was so ineffective and was interwoven with other code.
The time bomb was (presumably) supposed to go off after a certain date.
Instead, it had a bug in it so it only activated if someone was working after lunchtime on Dec 31st.
It had taken three years for that circumstance to occur since the guy who wrote the time bomb left the company.

Is it possible to write good and understandable code without any comments?

Can any one suggest what is the best way to write good code that is understandable without a single line of comments?
I once had a professor when I was in college tell me that any good code should never need any comments.
Her approach was a combination of very precise logic split out into small functions with very descriptive method/property/variable names. The majority of what she presented was, in fact, extremely readable with no comments. I try to do the same with everything I write...
Read Code Complete, 2nd Edition cover to cover. Perhaps twice.
To give some specifics:
Making code readable
Eliminating code repetition
Doing design/architecture before you write code
I like to 'humanise' code, so instead of:
if (starColour.red > 200 && starColour.blue > 200 && starColour.green > 200){
doSomething();
}
I'll do this:
bool starIsBright;
starIsBright = (starColour.red > 200 && starColour.blue > 200 && starColour.green > 200);
if(starIsBright){
doSomething();
}
In some cases - yes, but in many cases no. The Yes part is already answered by others - keep it simple, write it nicely, give it readable names, etc. The No part goes to when the problem you solve in code is not a code problem at all but rather domain specific problem or business logic problem. I've got no problem reading lousy code even if it doesn't have comments. It's annoying, but doable. But it's practically impossible to read some code without understanding why is it like this and what is it trying to solve. So things like :
if (starColour.red > 200 && starColour.blue > 200 && starColour.green > 200){
doSomething();
}
look nice, but could be quite meaningless in the context of what the program is actually doing. I'd rather have it like this:
// we do this according to the requirement #xxxx blah-blah..
if (starColour.red > 200 && starColour.blue > 200 && starColour.green > 200){
doSomething();
}
Well written code might eliminate the need for comments to explain what you're doing, but you'll still want comments to explain the why.
If you really want to then you would need to be very detailed in your variable names and methods names.
But in my opinion, there is no good way to do this. Comments serve a serious purpose in coding, even if you are the only one coding you still sometimes need to be reminded what part of the code you're looking at.
Yes, you can write code that doesn't need comments to describe what it does, but that may not be enough.
Just because a function is very clear in explaining what it does, does not, by itself, tell you why it is doing what it does.
As in everything, moderation is a good idea. Write code that is explanatory, and write comments that explain why it is there or what assumptions are being made.
I think that the concept of Fluent Interfaces is really a good example of this.
var bob = DB.GetCustomers().FromCountry("USA").WithName("Bob")
Clean Code by Robert C. Martin contains everything you need to write clean, understandable code.
Use descriptive variable names and descriptive method names. Use whitespace.
Make your code read like normal conversation.
Contrast the use of Matchers in JUnit:
assertThat(x, is(3));
assertThat(x, is(not(4)));
assertThat(responseString, either(containsString("color")).or(containsString("colour")));
assertThat(myList, hasItem("3"));
with the traditional style of assertEquals:
assertEquals(3, x);
When I look at the assertEquals statement, it is not clear which parameter is "expected" and which is "actual".
When I look at assertThat(x, is(3)) I can read that in English as "Assert that x is 3" which is very clear to me.
Another key to writing self-documenting code is to wrap any bit of logic that is not clear in a method call with a clear name.
if( (x < 3 || x > 17) && (y < 8 || y > 15) )
becomes
if( xAndYAreValid( x, y ) ) // or similar...
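A small Java sketch of what the extracted method might look like, assuming the bounds from the condition above. The class name and the intermediate variable names are invented for illustration:

```java
final class RangeCheck {
    // Wraps the raw comparison from the if statement in a named method.
    static boolean xAndYAreValid(int x, int y) {
        boolean xOutsideBand = x < 3 || x > 17;
        boolean yOutsideBand = y < 8 || y > 15;
        return xOutsideBand && yOutsideBand;
    }
}
```

Naming the two halves separately also gives each magic-number pair a label, which a single long condition cannot do.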
I'm not sure writing code that is so expressive that you don't need comments is necessarily a great goal. Seems to me like another form of overoptimization. If I were on your team, I'd be pleased to see clear, concise code with just enough comments.
In most cases, yes, you can write code that is clear enough that comments become unnecessary noise.
The biggest problem with comments is there is no way to check their accuracy. I tend to agree with Uncle Bob Martin in chapter 4 of his book, Clean Code:
The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures. We must have them because we cannot always figure out how to express ourselves without them, but their use is not a cause for celebration.
So when you find yourself in a position where you need to write a comment, think it through and see whether there isn’t some way to turn the tables and express yourself in code. Every time you express yourself in code, you should pat yourself on the back. Every time you write a comment, you should grimace and feel the failure of your ability of expression.
Most comments are either needless redundancy, outright fallacy or a crutch used to explain poorly written code. I say most because there are certain scenarios where the lack of expressiveness lies with the language rather than the programmer.
For instance the copyright and license information typically found at the beginning of a source file. As far as I'm aware no known construct exists for this in any of the popular languages. Since a simple one or two line comment suffices, it's unlikely that such a construct will be added.
The original need for most comments has been replaced over time by better technology or practices. Using a change journal or commenting out code has been supplanted with source control systems. Explanatory comments in long functions can be mitigated by simply writing shorter functions. etc.
You usually can turn your comment into a function name something like:
if (starColourIsGreaterThanThreshold()) {
doSomething();
}
....
private boolean starColourIsGreaterThanThreshold() {
return starColour.red > THRESHOLD &&
starColour.blue > THRESHOLD &&
starColour.green > THRESHOLD;
}
I think comments should express the why, perhaps the what, but as much as possible the code should define the how (the behavior).
Someone should be able to read the code and understand what it does (the how) from the code. What may not be obvious is why you would want such behavior and what this behavior contributes to the overall requirements.
The need to comment should give you pause, though. Maybe how you are doing it is too complicated and the need to write a comment shows that.
There is a third alternative to documenting code - logging. A method that is well peppered with logging statements can do a lot to explain the why, can touch on the what and may give you a more useful artifact than well named methods and variables regarding the behavior.
If you want to code entirely without comments and still have your code be followable, then you'll have to write a larger number of shorter methods. Methods will have to have descriptive names. Variables will also have to have descriptive names. One common method of doing this is to give variables the name of nouns and to give methods the names of verbal phrases. For example:
account.updateBalance();
child.givePacifier();
int count = question.getAnswerCount();
Use enums liberally. With an enum, you can replace most booleans and integral constants. For example:
public void dumpStackPretty(boolean allThreads) {
....
}
public void someMethod() {
dumpStackPretty(true);
}
vs
public enum WhichThreads { All, NonDaemon, None; }
public void dumpStackPretty(WhichThreads whichThreads) {
....
}
public void someMethod() {
dumpStackPretty(WhichThreads.All);
}
Descriptive names is your obvious first bet.
Secondly make sure each method does one thing and only one thing. If you have a public method that needs to do many things, split it up into several private methods and call those from the public method, in a way that makes the logic obvious.
Some time ago I had to create a method that calculated the correlation of two time series.
To calculate the correlation you also need the mean and standard deviation. So I had two private methods (well actually in this case they were public as they could be used for other purposes (but assuming they couldn't then they would be private)) for calculating A) the mean, B) the standard deviation.
This sort of splitting up of function into the smallest part that makes sense is probably the most important thing to make a code readable.
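As a rough illustration of that split, here is a Java sketch with the correlation built on two small helpers. It assumes a Pearson correlation with population standard deviation; the original post doesn't say which variant was used, and the class and method names are made up:

```java
final class TimeSeriesStats {
    // A) the mean
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // B) the (population) standard deviation
    static double standardDeviation(double[] values) {
        double m = mean(values);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumOfSquares / values.length);
    }

    // The method that was actually asked for, reading as a formula over the helpers.
    static double correlation(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("series must be non-empty and of equal length");
        }
        double meanA = mean(a);
        double meanB = mean(b);
        double covariance = 0.0;
        for (int i = 0; i < a.length; i++) {
            covariance += (a[i] - meanA) * (b[i] - meanB);
        }
        covariance /= a.length;
        return covariance / (standardDeviation(a) * standardDeviation(b));
    }
}
```

Each method name is obvious from what it computes, which is the sizing test described in the next paragraph. A constant series has zero standard deviation, so a real implementation would need to decide how to handle that case.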
How do you decide where to break up methods? My way is: if the name is obvious, e.g. getAddressFromPage, it is the right size. If you have several contenders you are probably trying to do too much; if you can't think of a name that makes sense, your method may not "do" enough, although the latter is much less likely.
I don't really think comments are a good idea in most cases. Comments don't get checked by the compiler, so they are often misleading or wrong as the code changes over time. Instead, I prefer self-documenting, concise methods that don't need comments. It can be done, and I have been doing it this way for years.
Writing code without comments takes practice and discipline, but I find that the discipline pays off as the code evolves.
It may not be comments, but to help someone better understand what is going on you may need some diagrams explaining how the program should work; if a person knows the big picture, it is easier to understand the code.
But, if you are doing something complex then you may need some comments, for example, in a very math intensive program.
The other place I find comments useful and important, is to ensure that someone doesn't replace code with something that looks like it should work, but won't. In that case I leave the bad code in, and comment it out, with an explanation as to why it shouldn't be used.
So, it is possible to write code without comments, but only if you are limited in what types of applications you are writing, unless you can explain why a decision was made, somewhere, and not call it a comment.
For example, a random generator can be written many ways. If you pick a particular implementation it may be necessary to explain why you picked that particular generator, as the period may be sufficiently long for current requirements, but later the requirements may change and your generator may not be sufficient.
I believe it's possible, if you consider the fact that not everybody likes the same style. So in order to minimize comments, knowing your "readers" is the most important thing.
In "information systems" kind-of software, try using declarative sentences, try to make each line of code approximate a line in English, and avoid "mathematical programming" (with the i, j and k for index, and the one-liners-to-do-a-lot) at all costs.
I think code can be self-documenting to a large degree, and I think it's crucial, but reading even well-written code can be like looking at cells of the human body with a microscope. It sometimes takes comments to really explain the big picture of how pieces of the system fit together, especially if it solves a really complex and difficult problem.
Think about special data structures. If all that computer scientists had ever published about data structures were well-written code, few would really understand the relative benefit of one data structure over another -- because Big-O runtime of any given operation is sometimes just not obvious from reading the code. That's where the math and amortized analysis presented in articles come in.

What types of coding anti-patterns do you always refactor when you cross them?

I just refactored some code that was in a different section of the class I was working on because it was a series of nested conditional operators (?:) that was made a ton clearer by a fairly simple switch statement (C#).
When will you touch code that isn't directly what you are working on to make it more clear?
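To make the kind of change concrete, here is a hedged sketch in Java rather than C#. The status codes and labels are invented; the original code isn't shown:

```java
final class StatusLabel {
    // Nested conditional operators: correct, but the reader has to untangle the chain.
    static String nested(int code) {
        return code == 0 ? "ok"
             : code == 1 ? "warning"
             : code == 2 ? "error"
             : "unknown";
    }

    // The same mapping as a switch: one case per line, with an explicit default.
    static String withSwitch(int code) {
        switch (code) {
            case 0:  return "ok";
            case 1:  return "warning";
            case 2:  return "error";
            default: return "unknown";
        }
    }
}
```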
I once was refactoring and came across something like this code:
string strMyString;
try
{
strMyString = Session["MySessionVar"].ToString();
}
catch
{
strMyString = "";
}
Resharper pointed out that the .ToString() was redundant, so I took it out. Unfortunately, that ended up breaking the code. Whenever MySessionVar was null, it wasn't causing the NullReferenceException that the code relied on to bump it down to the catch block. I know, this was some sad code. But I did learn a good lesson from it. Don't rapidly go through old code relying on a tool to help you do the refactoring - think it through yourself.
I did end up refactoring it as follows:
string strMyString = (Session["MySessionVar"] ?? "").ToString();
Update: Since this post is being upvoted and technically doesn't contain an answer to the question, I figured I should actually answer the question. (Ok, it was bothering me to the point that I was actually dreaming about it.)
Personally I ask myself a few questions before refactoring.
1) Is the system under source control? If so, go ahead and refactor because you can always roll back if something breaks.
2) Do unit tests exist for the functionality I am altering? If so, great! Refactor. The danger here is that the existence of unit tests doesn't indicate the accuracy and scope of said unit tests. Good unit tests should pick up any breaking changes.
3) Do I thoroughly understand the code I am refactoring? If there's no source control and no tests and I don't really understand the code I am changing, that's a red flag. I'd need to get more comfortable with the code before refactoring.
In case #3 I would probably spend the time to actually track all of the code that is currently using the method I am refactoring. Depending on the scope of the code this could be easy or impossible (i.e. if it's a public API). If it comes down to being a public API then you really need to understand the original intent of the code from a business perspective.
I only refactor it if tests are already in place. If not, it's usually not worth my time to write tests for and refactor presumably working code.
This is a small, minor antipattern but it so irritates me that whenever I find it, I expunge it immediately. In C (or C++ or Java)
if (p)
return true;
else
return false;
becomes
return p;
In Scheme,
(if p #t #f)
becomes
p
and in ML
if p then true else false
becomes
p
I see this antipattern almost exclusively in code written by undergraduate students. I am definitely not making this up!!
Whenever I come across it and I don't think changing it will cause problems (i.e. I understand it well enough to know what it does, and the level of voodoo is low).
I only bother to change it if there is some other reason I'm modifying the code.
How far I'm willing to take it depends on how confident I am that I won't break anything and how extensive my own changes to the code are going to be.
This is a great situation to show off the benefits of unit tests.
If unit tests are in place, developers can bravely and aggressively refactor oddly written code they might come across. If it passes the unit tests and you've increased readability, then you've done your good deed for the day and can move on.
Without unit tests, simplifying complex code that's filled with voodoo presents a great risk of breaking the code and not even knowing you've introduced a new bug! So most developers will take the cautious route and move on.
For simple refactoring I try to clean up deeply nested control structures and really long functions (more than one screen worth of text). However, it's not a great idea to refactor code without a good reason (especially in a big team of developers). In general, unless the refactoring will make a big improvement in the code or fix an egregious sin I try to leave well enough alone.
Not refactoring per se, but just as a matter of general housekeeping I generally do this stuff when I start work on a module:
Remove stupid comments
  Comments that say nothing more than the function signature already says
  Comments that are pure idiocy like "just what it looks like"
  Changelogs at the top of the file (we have version control for a reason)
  Any API docs that are clearly out-of-sync with the code
Remove commented-out chunks of code
Add version control tags like $Id$ if they are missing
Fix whitespace issues (this can be annoying to others though because your name shows up for a lot of lines in a diff even if all you did was change whitespace)
  Remove whitespace at the end of the lines
  Change tabs->spaces (for that is our convention where I work)
I refactor if it makes the code much easier to read. The most common case for me is duplicate code, e.g. an if/else whose branches differ only in their first or last commands:
if($something) {
load_data($something);
} else {
load_data($something);
echo "Loaded";
do_something_else();
}
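As a rough sketch of what that cleanup could look like (written in Python rather than the PHP above, with load_data and do_something_else as stand-in stubs that only record that they were called), the shared call is hoisted out of both branches and only the part that really differs stays conditional:

```python
calls = []  # records which helpers ran, so both versions can be compared


def load_data(source):
    # Hypothetical stand-in for the real loader.
    calls.append(("load_data", source))


def do_something_else():
    # Hypothetical stand-in for the extra work done in the else branch.
    calls.append(("do_something_else",))


def before(something):
    """Original shape: load_data is duplicated in both branches."""
    if something:
        load_data(something)
    else:
        load_data(something)
        print("Loaded")
        do_something_else()


def after(something):
    """Refactored shape: the common call runs once, unconditionally."""
    load_data(something)
    if not something:
        print("Loaded")
        do_something_else()
```

Both functions make the same calls in the same order for truthy and falsy input, which is the property worth checking before trusting a change like this.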
More than (arguably) three or four lines of duplicate code always makes me think about refactoring. Also, I tend to move code around a lot, extracting the code I predict will be used more frequently into a separate place: a class with its own well-defined purpose and responsibilities, or a static method of a static class (usually placed in my Utils.* namespace).
But, to answer your question, yes, there are a lot of cases where making the code shorter does not necessarily make it well structured and readable. Using the ?? operator in C# is another example. You also have to think about the new features in your language of choice, e.g. LINQ can do some things very elegantly but can also make a very simple thing unreadable and overly complex. You need to weigh these two things very carefully. In the end it all boils down to your personal taste and, mostly, experience.
Well, this is another "it depends" answer, but I am afraid it has to be.
I almost always break more than two similar conditionals into a switch, most often with regard to enums. I will also return early rather than wrap a long block in a conditional.
For example:
if (condition) {
//lots of code
//returns value
} else {
return null;
}
becomes:
if (!condition)
return null;
//lots of code..
//return value
Breaking out early reduces extra indentation and shortens long stretches of code. Also, as a general rule, I don't like methods with more than 10-15 lines of code. I like methods to have a single purpose, even if that means creating more private methods internally.
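The first point, collapsing a chain of similar conditionals over an enum, might look roughly like this. This is only a sketch in Python, which has no classic switch statement, so a lookup table keyed by the enum stands in for it; OrderState and both describe functions are hypothetical names made up for the illustration:

```python
from enum import Enum


class OrderState(Enum):  # hypothetical enum, for illustration only
    NEW = 1
    SHIPPED = 2
    CANCELLED = 3


def describe_chained(state):
    """Original shape: one conditional per enum member."""
    if state == OrderState.NEW:
        return "waiting to ship"
    elif state == OrderState.SHIPPED:
        return "on its way"
    elif state == OrderState.CANCELLED:
        return "cancelled"
    return None


# Table-driven replacement; it plays the role a switch would in C# or Java.
_DESCRIPTIONS = {
    OrderState.NEW: "waiting to ship",
    OrderState.SHIPPED: "on its way",
    OrderState.CANCELLED: "cancelled",
}


def describe(state):
    """Same results as describe_chained, with one lookup instead of a chain."""
    return _DESCRIPTIONS.get(state)
```

Adding a new enum member then means adding one table entry instead of another branch.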
Usually I don't refactor the code if I'm just browsing it, not actively working on it.
But sometimes ReSharper points out some stuff and I just can't resist hitting Alt+Enter. :)
I tend to refactor very long functions and methods if I understand the set of inputs and outputs to a block.
This helps readability no end.
I would only refactor the code that I come across and am not actively working on after going through the following steps:
Speak with the author of the code (not always possible) to figure out what that piece of code does. Even if it is obvious to me as to what the piece of code is doing, it always helps to understand the rationale behind why the author may have decided to do something in a certain way. Spending a couple of minutes talking about it would not only help the original author understand your point of view, it also builds trust within the team.
Know or find out what that piece of code is doing in order to test it after re-factoring (A build process with unit tests is very helpful here. It makes the whole thing quick and easy). Run the unit tests before and after the change and ensure nothing is breaking due to your changes.
Send out a heads-up to the team (if working with others) and let them know of the upcoming change, so nobody is surprised when the change actually occurs.
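The "run the unit tests before and after the change" step can be sketched as a test that pins the refactored code to the behaviour of the old code. Everything here is hypothetical (the total_price functions and the sample cases are invented for the illustration); the idea is only that the old implementation stays around as the reference until the new one has been shown to agree with it:

```python
import unittest


def total_price_legacy(items):
    """The oddly written original: items are (unit_price, quantity) pairs."""
    total = 0
    i = 0
    while i < len(items):
        if items[i][1] > 0:
            total = total + items[i][0] * items[i][1]
        i = i + 1
    return total


def total_price(items):
    """The refactored version that is meant to behave identically."""
    return sum(price * qty for price, qty in items if qty > 0)


class TotalPriceTest(unittest.TestCase):
    # Inputs chosen to cover the empty case and the non-positive quantity branch.
    CASES = [
        [],
        [(5, 1)],
        [(5, 2), (3, 0)],
        [(5, 2), (3, -1), (7, 4)],
    ]

    def test_refactor_preserves_behaviour(self):
        for items in self.CASES:
            with self.subTest(items=items):
                self.assertEqual(total_price(items), total_price_legacy(items))
```

Saved in a module, this can be run with python -m unittest; once the team trusts the new version, the legacy function and the comparison can be replaced by ordinary expected values.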
Refactoring for the sake of it is one of the roots of all evil. Please don't ever do it. (Unit tests do somewhat mitigate this).
Are vast swaths of commented-out code an antipattern? While not code specifically (they're comments), I see a lot of this kind of behaviour from people who don't understand how to use source control, and who want to keep the old code around so they can more easily go back if a bug is introduced. Whenever I see vast swaths of code that are commented out, I almost always remove them in their entirety.
I try to refactor based on following factors:
Do I understand enough to know what's happening?
Can I easily revert if this change breaks the code?
Will I have enough time to revert the change if it breaks the build?
And sometimes, if I have enough time, I refactor to learn. As in, I know my change may break the code, but I don't know where and how. So I change it anyway to find out where it's breaking and why. That way I learn more about the code.
My domain has been mobile software (cell phones), where most of the code resides on my PC and won't impact others. Since I also maintain the CI build system for my company, I can run a complete product build (for all phones) on the refactored code to ensure it doesn't break anything else. But that's my personal luxury, which you may not have.
I tend to refactor global constants and enumerations quite a bit if they can be deemed a low refactor risk. Maybe it's just me, but something like a ClientAccountStatus enum should be in or close to the ClientAccount class rather than in a GlobalConstAndEnum class.
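A minimal sketch of that layout, in Python and with invented members (the answer names only ClientAccount and ClientAccountStatus; the nested name Status, the fields and the method are assumptions for the illustration):

```python
from enum import Enum


class ClientAccount:
    class Status(Enum):
        # Lives next to the only class that uses it, instead of in a
        # catch-all GlobalConstAndEnum class.
        ACTIVE = "active"
        SUSPENDED = "suspended"
        CLOSED = "closed"

    def __init__(self, name, status):
        self.name = name
        self.status = status

    def can_trade(self):
        # Hypothetical rule, shown only to give the enum a use.
        return self.status is ClientAccount.Status.ACTIVE
```

Callers then write ClientAccount.Status.ACTIVE, which makes it obvious at the call site what the status belongs to.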
Deletion/updating of comments which are clearly wrong or clearly pointless.
Removing them is:
safe
version control means you can find them again
improves the quality of the code to others and yourself
It is about the only 100% risk-free refactoring I know.
Note that doing this to structured comments like javadoc comments is a different matter. Changing these is not risk-free: fixing or removing them is very likely an improvement, but it is not the dead-certain win that fixing an ordinary incorrect code comment would be.

Resources