This isn't related to a specific language, although I am writing in Java, but I was wondering how an if/else statement that was designed to catch a bad input should be structured, i.e. whether I should put the break statement first or the code first, or if it is largely semantics and doesn't matter. Here is an example:
Should it be:
if (faultyInput):
    return null;
else:
    doActions();
or
if (!faultyInput):
    doActions();
else:
    return null;
I think you must write:
if (faultyInput) return null;
doActions();
No need to include the else statement, because there is already the return keyword.
It will be more clear.
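As a rough sketch of the same early-return shape in Python (the names `process` and `do_actions` are made up for illustration, and treating `None` as the faulty input is an assumption):

```python
def do_actions(value):
    # Placeholder for the real work; here it just doubles the input.
    return value * 2


def process(value):
    # Guard clause: reject bad input first and leave immediately.
    if value is None:
        return None
    # No else needed; this line only runs for valid input.
    return do_actions(value)


print(process(21))    # 42
print(process(None))  # None
```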
In your example there is no single correct way to write the code. Both situations will give you exact same result. You should just do whichever you feel is what you want and will help you understand your code better.
Is there a consensus of preference between these two programming approaches? Could you please explain why, on a pros/cons scale, you chose your preferred paradigm?
(i) A program has three functions that need to be applied to some input. It runs the first, gets a returned variable, runs the second with that variable, and then does the same for the third, finally printing the third's returned variable.
func1(){ return f1 }
func2(){ return f2 }
func3(){ return f3 }
main(){
    fin = # of inputs
    i = 0
    while i < fin
        first = func1(in[i])
        sec = func2(first)
        third = func3(sec)
        print(third)
        i++
}
(ii) A program steps through a series of instructions, initially pushing the first domino from the main function.
func1(){ func2(newfrom1) }
func2(){ func3(newfrom2) }
func3(){ print(newfrom3) }
main(){
    fin = # of inputs
    i = 0
    while i < fin
        func1(in[i])
        i++
}
The only difference I see is that version 2 uses variables to store intermediate results.
So from a performance point of view, there should not be any difference, since a compiler would store these intermediate results in both versions in registers. But this can be checked by profiling.
But to me version 1 is more readable, and thus better.
The first approach is more reusable - what if you want to do whatever it is that func1 does to something else later on, but you don't then want to do func2 and func3 on it? If func1 was written to call those for the first scenario then you have to go and change everything.
My preference is to try to identify 'operations' that make sense for a single function to do, write a function to do that, then for more complex things write another function which calls several of the smaller ones to achieve its ends. One then often finds some of those smaller functions find use elsewhere at a later date.
Yes this leaves me with more function calls, and possibly more temporary storage being used, but I let the compiler worry about that - if it proves to be a performance issue I'll deal with it then. Usually performance is hurt by other things though.
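A small Python sketch of that preference, under the assumption that each step is a simple string operation (`func1`, `func2`, and `func3` echo the question's names; their bodies and the `pipeline` function are invented for illustration):

```python
def func1(text):
    # Hypothetical first operation: trim surrounding whitespace.
    return text.strip()


def func2(text):
    # Hypothetical second operation: upper-case the text.
    return text.upper()


def func3(text):
    # Hypothetical third operation: add a suffix.
    return text + "!"


def pipeline(text):
    # The composite function calls the small ones; each stays reusable alone.
    first = func1(text)
    sec = func2(first)
    return func3(sec)


for item in ["  a ", "b"]:
    print(pipeline(item))
```

Because `func1` does not call `func2` itself, it can be reused later in a context where only trimming is wanted.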
Is it a bad practice to use break statement inside a for loop?
Say I am searching for a value in an array. I compare inside a for loop and, when the value is found, break; to exit the for loop.
Is this a bad practice? I have seen the alternative used: define a variable vFound and set it to true when the value is found and check vFound in the for statement condition. But is it necessary to create a new variable just for this purpose?
I am asking in the context of a normal C or C++ for loop.
P.S: The MISRA coding guidelines advise against using break.
No, break is the correct solution.
Adding a boolean variable makes the code harder to read and adds a potential source of errors.
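To make the comparison concrete, here is a hedged Python sketch of both styles (the function names are invented; the question is about C or C++, so this only illustrates the shape of the two loops):

```python
def index_with_break(values, target):
    found_at = -1
    for i, value in enumerate(values):
        if value == target:
            found_at = i
            break  # stop scanning as soon as the value is found
    return found_at


def index_with_flag(values, target):
    found_at = -1
    found = False  # extra state that the loop condition must check
    i = 0
    while i < len(values) and not found:
        if values[i] == target:
            found_at = i
            found = True
        i += 1
    return found_at
```

Both return the index of the first match or -1, but the flag version carries one more variable and one more condition to keep in sync.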
Lots of answers here, but I haven't seen this mentioned yet:
Most of the "dangers" associated with using break or continue in a for loop are negated if you write tidy, easily-readable loops. If the body of your loop spans several screen lengths and has multiple nested sub-blocks, yes, you could easily forget that some code won't be executed after the break. If, however, the loop is short and to the point, the purpose of the break statement should be obvious.
If a loop is getting too big, use one or more well-named function calls within the loop instead. The only real reason to avoid doing so is for processing bottlenecks.
You can find all sorts of professional code with 'break' statements in it. It makes perfect sense to use this whenever necessary. In your case this option is better than creating a separate variable just for the purpose of coming out of the loop.
Using break as well as continue in a for loop is perfectly fine.
It simplifies the code and improves its readability.
Far from bad practice, Python (and other languages?) extended the for loop structure so part of it will only be executed if the loop doesn't break.
for n in range(5):
    for m in range(3):
        if m >= n:
            print('stop!')
            break
        print(m, end=' ')
    else:
        print('finished.')
Output:
stop!
0 stop!
0 1 stop!
0 1 2 finished.
0 1 2 finished.
Equivalent code without break and that handy else:
for n in range(5):
    aborted = False
    for m in range(3):
        if not aborted:
            if m >= n:
                print('stop!')
                aborted = True
            else:
                print(m, end=' ')
    if not aborted:
        print('finished.')
General rule: If following a rule requires you to do something more awkward and difficult to read than breaking the rule, then break the rule.
In the case of looping until you find something, you run into the problem of distinguishing found versus not found when you get out. That is:
for (int x=0;x<fooCount;++x)
{
    Foo foo=getFooSomehow(x);
    if (foo.bar==42)
        break;
}
// So when we get here, did we find one, or did we fall out the bottom?
So okay, you can set a flag, or initialize a "found" value to null.
That's why in general I prefer to push my searches into functions:
Foo findFoo(int wantBar)
{
    for (int x=0;x<fooCount;++x)
    {
        Foo foo=getFooSomehow(x);
        if (foo.bar==wantBar)
            return foo;
    }
    // Not found
    return null;
}
This also helps to unclutter the code. In the main line, "find" becomes a single statement, and when the conditions are complex, they're only written once.
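The same idea in Python might look like the sketch below. The `Foo` dataclass and `find_foo` are stand-ins for the answer's `Foo` and `findFoo`, and passing the collection in as a list is an assumption:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Foo:
    bar: int


def find_foo(foos: List[Foo], want_bar: int) -> Optional[Foo]:
    for foo in foos:
        if foo.bar == want_bar:
            return foo  # found: leave the loop and the function at once
    # Not found
    return None


match = find_foo([Foo(1), Foo(42)], 42)
if match is None:
    print("not found")
else:
    print("found", match.bar)
```

The caller distinguishes found from not found by the return value, so no flag has to survive the loop.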
There is nothing inherently wrong with using a break statement, but nested loops can get confusing. Some languages (Java, for one) support breaking to labels, which can greatly improve readability.
int[] iArray = new int[]{0,1,2,3,4,5,6,7,8,9};
int[] jArray = new int[]{0,1,2,3,4,5,6,7,8,9};
// label for i loop
iLoop: for (int i = 0; i < iArray.length; i++) {
    // label for j loop
    jLoop: for (int j = 0; j < jArray.length; j++) {
        if(iArray[i] < jArray[j]){
            // break i and j loops
            break iLoop;
        } else if (iArray[i] > jArray[j]){
            // breaks only j loop
            break jLoop;
        } else {
            // unclear which loop is ending
            // (breaks only the j loop)
            break;
        }
    }
}
I will say that break (and return) statements often increase cyclomatic complexity which makes it harder to prove code is doing the correct thing in all cases.
If you're considering using a break while iterating over a sequence for some particular item, you might want to reconsider the data structure used to hold your data. Using something like a Set or Map may provide better results.
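For instance, a membership test or keyed lookup can replace the search loop entirely. A minimal Python sketch, with made-up sample data:

```python
names = ["alice", "bob", "carol"]

# A set answers "is it present?" without an explicit loop or break.
name_set = set(names)
print("bob" in name_set)  # True

# A dict (map) answers "where is it?" by key.
index_by_name = {name: i for i, name in enumerate(names)}
print(index_by_name.get("carol"))  # 2
print(index_by_name.get("dave"))   # None
```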
break is a completely acceptable statement to use (so is continue, btw). It's all about code readability -- as long as you don't have overcomplicated loops and such, it's fine.
It's not like they are in the same league as goto. :)
It depends on the language. While you can possibly check a boolean variable here:
for (int i = 0; i < 100 && stayInLoop; i++) { ... }
it is not possible to do so when iterating over an array:
for element in bigList: ...
Anyway, break would make both pieces of code more readable.
I agree with others who recommend using break. The obvious consequential question is why would anyone recommend otherwise? Well... when you use break, you skip the rest of the code in the block, and the remaining iterations. Sometimes this causes bugs, for example:
a resource acquired at the top of the block may be released at the bottom (this is true even for blocks inside for loops), but that release step may be accidentally skipped when a "premature" exit is caused by a break statement (in "modern" C++, "RAII" is used to handle this in a reliable and exception-safe way: basically, object destructors free resources reliably no matter how a scope is exited)
someone may change the conditional test in the for statement without noticing that there are other delocalised exit conditions
ndim's answer observes that some people may avoid breaks to maintain a relatively consistent loop run-time, but you were comparing break against use of a boolean early-exit control variable where that doesn't hold
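The first point, a release step skipped by an early exit, can be illustrated with a loose Python analogue of RAII: a context manager whose cleanup runs however the block is left. The `Resource` class and `scan` function are hypothetical:

```python
class Resource:
    """Records acquire/release events so the order can be inspected."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("acquire")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit, on break, on return, and on exceptions.
        self.log.append("release")
        return False


def scan(values, target):
    log = []
    for value in values:
        with Resource(log):
            if value == target:
                break  # the with block still releases the resource
    return log


print(scan([1, 2, 3], 2))
```

Every "acquire" in the log is paired with a "release", including the iteration that ends with break.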
Every now and then people observing such bugs realise they can be prevented/mitigated by this "no breaks" rule... indeed, there's a whole related strategy for "safer" programming called "structured programming", where each function is supposed to have a single entry and exit point too (i.e. no goto, no early return). It may eliminate some bugs, but it doubtless introduces others. Why do they do it?
they have a development framework that encourages a particular style of programming / code, and they've statistical evidence that this produces a net benefit in that limited framework, or
they've been influenced by programming guidelines or experience within such a framework, or
they're just dictatorial idiots, or
any of the above + historical inertia (relevant in that the justifications are more applicable to C than modern C++).
In your example you do not know the number of iterations for the for loop. Why not use while loop instead, which allows the number of iterations to be indeterminate at the beginning?
It is hence not necessary to use a break statement in general, as the loop can be better stated as a while loop.
I did some analysis on the codebase I'm currently working on (40,000 lines of JavaScript).
I found only 22 break statements, of those:
19 were used inside switch statements (we only have 3 switch statements in total!).
2 were used inside for loops - a code that I immediately classified as to be refactored into separate functions and replaced with return statement.
As for the final break inside while loop... I ran git blame to see who wrote this crap!
So according to my statistics: If break is used outside of switch, it is a code smell.
I also searched for continue statements. Found none.
It's perfectly valid to use break - as others have pointed out, it's nowhere in the same league as goto.
Although you might want to use the vFound variable when you want to check outside the loop whether the value was found in the array. Also from a maintainability point of view, having a common flag signalling the exit criteria might be useful.
I don't see any reason why it would be a bad practice PROVIDED that you want to completely STOP processing at that point.
In the embedded world, there is a lot of code out there that uses the following construct:
while(1)
{
    if (RCIF)
        gx();
    if (command_received == command_we_are_waiting_on)
        break;
    else if ((num_attempts > MAX_ATTEMPTS) || (TickGet() - BaseTick > MAX_TIMEOUT))
        return ERROR;
    num_attempts++;
}
if (call_some_bool_returning_function())
    return TRUE;
else
    return FALSE;
This is a very generic example, lots of things are happening behind the curtain, interrupts in particular. Don't use this as boilerplate code, I'm just trying to illustrate an example.
My personal opinion is that there is nothing wrong with writing a loop in this manner as long as appropriate care is taken to prevent remaining in the loop indefinitely.
Depends on your use case. There are applications where the runtime of a for loop needs to be constant (e.g. to satisfy some timing constraints, or to hide your data internals from timing based attacks).
In those cases it will even make sense to set a flag and only check the flag value AFTER all the for loop iterations have actually run. Of course, all the for loop iterations need to run code that still takes about the same time.
If you do not care about the run time... use break; and continue; to make the code easier to read.
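A sketch of the flag-style loop for the timing-sensitive case, in Python. The function name is invented, and this is an illustration of the loop shape only: Python gives no real timing guarantees, the early length check reveals the length, and production code would more likely call the standard library's `hmac.compare_digest`.

```python
def equal_without_early_exit(a: bytes, b: bytes) -> bool:
    # Assumption: leaking the length is acceptable in this sketch.
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        # No break: every byte pair is visited, even after a mismatch.
        diff |= x ^ y
    # The accumulated flag is only inspected after the loop has finished.
    return diff == 0
```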
Under the MISRA 98 rules, which my company uses for C development, the break statement shall not be used...
Edit: break is allowed in MISRA '04.
Of course, break; is the way to stop a for loop or foreach loop. I have used it in PHP in both foreach and for loops, and it works.
I think it can make sense to have your checks at the top of your for loop like so
for(int i = 0; i < myCollection.Length && myCollection[i].SomeValue != "Break Condition"; i++)
{
    //loop body
}
or if you need to process the row first
for(int i = 0; i < myCollection.Length && (i == 0 ? true : myCollection[i-1].SomeValue != "Break Condition"); i++)
{
    //loop body
}
This way you can have a singular body function without breaks.
for(int i = 0; i < myCollection.Length && (i == 0 ? true : myCollection[i-1].SomeValue != "Break Condition"); i++)
{
    PerformLogic(myCollection[i]);
}
It can also be modified to move Break into its own function as well.
for(int i = 0; ShouldContinueLooping(i, myCollection); i++)
{
    PerformLogic(myCollection[i]);
}
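The answer does not show what `ShouldContinueLooping` contains. One possible reading, sketched in Python with a plain list of strings standing in for `myCollection[i].SomeValue` (both function bodies are assumptions):

```python
def should_continue_looping(i, collection):
    # Stop at the end of the collection.
    if i >= len(collection):
        return False
    # "Process the row first": stop only after the break row was handled.
    return i == 0 or collection[i - 1] != "Break Condition"


def perform_all(collection):
    processed = []
    i = 0
    while should_continue_looping(i, collection):
        processed.append(collection[i])  # stands in for PerformLogic
        i += 1
    return processed


print(perform_all(["a", "Break Condition", "b"]))
```

With this reading the row holding "Break Condition" is still processed, and "b" is not.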
I have the following VB.NET code (but for each loops are in most languages, thus the language-agnostic tag):
Public Function VerifyServiceName(ByRef sMachineName As String, ByRef sServiceName As String) As Boolean
    Dim asServices As System.ServiceProcess.ServiceController() = System.ServiceProcess.ServiceController.GetServices(sMachineName)
    Dim bVerified As Boolean = False
    For Each sService In asServices
        If sService.DisplayName = sServiceName Then bVerified = True
    Next
    Return bVerified
End Function
Suppose I have X services to loop through and my service name is #3. Is it better to have multiple return statements or an Exit For? Or is there a more efficient way of writing this function?
I know that the time difference between looping X times and looping through 3 times could be marginal for what I am doing, but I always have performance on the brain.
I personally believe having one return at the bottom is far more readable and easier to debug than if you have return statements everywhere, as you can never tell when the function is going to exit so you end up putting breakpoints on every return statement instead of just once at the end, for example.
I think it's all down to preference though, as there are valid arguments for both ways.
Found more discussion here about the subject. I guess I am not that proficient at searching.
I would never use a goto as the target of another goto, so if there's some additional processing at the end of the function, use "break / Exit For", otherwise just return early. Otherwise you end up with lines that mean "return" but say "break"... that doesn't help maintainability.
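For comparison, here is how the early-return version of the question's function might look in Python. A plain list of display names stands in for the `ServiceController.GetServices` result, which is an assumption made to keep the sketch self-contained:

```python
def verify_service_name(display_names, service_name):
    for name in display_names:
        if name == service_name:
            return True  # nothing left to do, so return instead of breaking
    return False


def verify_service_name_any(display_names, service_name):
    # any() also stops at the first match, with a single return statement.
    return any(name == service_name for name in display_names)


services = ["Print Spooler", "Windows Update", "Task Scheduler"]
print(verify_service_name(services, "Windows Update"))
print(verify_service_name_any(services, "Missing Service"))
```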
This is a minor style question, but every bit of readability you add to your code counts.
So if you've got:
if (condition) then
{
    // do stuff
}
else
{
    // do other stuff
}
How do you decide if it's better like that, or like this:
if (!condition) then
{
    // do other stuff
}
else
{
    // do stuff
}
My heuristics are:
Keep the condition positive (less mental calculation when reading it)
Put the most common path into the first block
I prefer to put the most common path first, and I am a strong believer in nesting reduction so I will break, continue, or return instead of elsing whenever possible. I generally prefer to test against positive conditions, or invert [and name] negative conditions as a positive.
if (condition)
return;
DoSomething();
I have found that by drastically reducing the usage of else my code is more readable and maintainable, and when I do have to use else it's almost always an excellent candidate for a more structured switch statement.
Two (contradictory) textbook quotes:
Put the shortest clause of an if/else on top
--Allen Holub, "Enough Rope to Shoot Yourself in the Foot", p52
Put the normal case after the if rather than after the else
--Steve McConnell, "Code Complete, 2nd ed.", p356
I prefer the first one. The condition should be as simple as possible and it should be fairly obvious which is simpler out of condition and !condition
It depends on your flow. For many functions, I'll use preconditions:
bool MyFunc(variable) {
    if (variable != something_i_want)
        return false;
    // a large block of code
    // ...
    return true;
}
If I need to do something in each case, I'll use an if (positive_clause) {} else {} format.
If the code is to check for an error condition, I prefer to put that code first, and the "successful" code second; conceptually, this keeps a function call and its error-checking code together, which makes sense to me because they are related. For example:
if (!some_function_that_could_fail())
{
// Error handling code
}
else
{
// Success code
}
I agree with Oli on using a positive if clause when possible.
Just please never do this:
if (somePositiveCondition)
else {
//stuff
}
I used to see this a lot at one place I worked and used to wonder if one of the coders didn't understand how not works...
When I am looking at data validation, I try to make my conditions "white listing" - that is, I test for what I will accept:
if DataIsGood() then
DoMyNormalStuff
else
TakeEvasiveAction
Rather than the other way around, which tends to degenerate into:
if SomeErrorTest then
TakeSomeEvasiveAction
else if SomeOtherErrorCondition then
CorrectMoreStupidUserProblems
else if YetAnotherErrorThatNoOneThoughtOf then
DoMoreErrorHandling
else
DoMyNormalStuff
I know this isn't exactly what you're looking for, but ... A lot of developers use a "guard clause", that is, a negative "if" statement that breaks out of the method as soon as possible. At that point, there is no "else" really.
Example:
if (blah == false)
{
return; // perhaps with a message
}
// do rest of code here...
There are some hard-core c/c++/assembly guys out there that will tell you that you're destroying your CPU!!! (in many cases, processors favor the "true" statement and try to "prefetch" the next thing to do... so theoretically any "false" condition will flush the pipe and will go microseconds slower).
In my opinion, we are at the point where "better" (more understandable) code wins out over microseconds of CPU time.
I think that for a single variable the not operator is simple enough and naming issues start being more relevant.
Never name a variable not_X, if in need use a thesaurus and find an opposite. I've seen plenty of awful code like
if (not_dead) {
} else {
}
instead of the obvious
if (alive) {
} else {
}
Then you can sanely use (very readable, no need to invert the code blocks)
if (!alive) {
} else {
}
If we're talking about more variables I think the best rule is to simplify the condition. After a while projects tend to get conditions like:
if (dead || (!dead && sleeping)) {
} else {
}
Which translates to
if (dead || sleeping) {
} else {
}
Always pay attention to what conditions look like and how to simplify them.
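A simplification like this one is easy to check mechanically by trying every combination of inputs. A small Python sketch (the function name is made up):

```python
from itertools import product


def conditions_match():
    # Compare the original and simplified conditions on all four inputs.
    return all(
        (dead or (not dead and sleeping)) == (dead or sleeping)
        for dead, sleeping in product([False, True], repeat=2)
    )


print(conditions_match())  # True
```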
Software is knowledge capture. You're encoding someone's knowledge of how to do something.
The software should fit what's "natural" for the problem. When in doubt, ask someone else and see what people actually say and do.
What about the situation where the "common" case is to do nothing? What then?
if( common ) {
// pass
}
else {
// great big block of exception-handling folderol
}
Or do you do this?
if( ! common ) {
// great big block of except-handling folderol
}
The "always positive" rule isn't really what you want first. You want to look at rules more like the following.
Always natural -- it should read like English (or whatever the common language in your organization is.)
Where possible, common cases first -- so they appear common.
Where possible use positive logic; negative logic can be used where it's commonly said that way or where the common case is a do-nothing.
If one of the two paths is very short (1 to 10 lines or so) and the other is much longer, I follow the Holub rule mentioned here and put the shorter piece of code in the if. That makes it easier to see the if/else flow on one screen when reviewing the code.
If that is not possible, then I structure to make the condition as simple as possible.
For me it depends on the condition, for example:
if (!PreserveData.Checked)
{ resetfields();}
I tend to talk to myself about what I want the logic to be and code it to the little voice in my head.
You can usually make the condition positive without switching around the if / else blocks.
Change
if (!widget.enabled()) {
// more common
} else {
// less common
}
to
if (widget.disabled()) {
// more common
} else {
// less common
}
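One way to get the positive name is to expose it next to the original. A minimal Python sketch, where the `Widget` class is hypothetical and only mirrors the `enabled()` and `disabled()` calls above:

```python
class Widget:
    def __init__(self, enabled=True):
        self._enabled = enabled

    def enabled(self):
        return self._enabled

    def disabled(self):
        # Positive name for the negated state, so callers avoid "not".
        return not self._enabled


widget = Widget(enabled=False)
if widget.disabled():
    print("more common path")
else:
    print("less common path")
```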
Intel Pentium branch prediction pre-fetches instructions for the "if" case. If it instead follows the "else" branch, it has to flush the instruction pipeline, causing a stall.
If you care a lot about performance: put the most likely outcome in the 'if' clause.
Personally, I write it as
if (expected)
{
//expected path
}
else
{
//fallback other odd case
}
If you have both true and false conditions then I'd opt for a positive conditional - This reduces confusion and in general I believe makes your code easier to read.
On the other hand, if you're using a language such as Perl, and particularly if your false condition is either an error condition or the most common condition, you can use the 'unless' structure, which executes the code block unless the condition is true (i.e. the opposite of if):
unless ($foo) {
$bar;
}
First of all, let's put aside situations when it is better to avoid using "else" in the first place (I hope everyone agrees that such situations do exist and determining such cases probably should be a separate topic).
So, let's assume that there must be an "else" clause.
I think that readability/comprehensibility imposes at least three key requirements or rules, which unfortunately often compete with each other:
1. The shorter the first block (the "if" block), the easier it is to grasp the entire "if-else" construct. When the "if" block is long enough, it becomes way too easy to overlook the existence of the "else" block.
2. When the "if" and "else" paths are logically asymmetric (e.g. "normal processing" vs. "error processing"), in a standalone "if-else" construct it does not really matter much which path is first and which is second. However, when there are multiple "if-else" constructs in proximity to each other (including nesting), and when all those "if-else" constructs have asymmetry of the same kind, that's when it is very important to arrange those asymmetric paths consistently.
Again, it can be "if ... normal path ... else ... abnormal path" for all, or "if ... abnormal path ... else ... normal path" for all, but it should not be a mix of these two variants.
With all other conditions equal, putting the normal path first is probably more natural for most human beings (I think it's more about psychology than aesthetics :-).
3. An expression that starts with a negation is usually less readable/comprehensible than an expression that doesn't.
So, we have these three competing requirements/rules, and the real question is: which of them are more important than others? For Allen Holub, rule #1 is probably the most important one. For Steve McConnell, it is rule #2. But I don't think that you can really choose only one of these rules as a single guideline.
I bet you've already guessed my personal priorities here (from the way I ordered the rules above :-).
My reasons are simple:
The rule #1 is unconditional and impossible to circumvent. If one of the blocks is so long that it runs off the screen - it must become the "else" block. (No, it is not a good idea to create a function/method mechanically just to decrease the number of lines in an "if" or "else" block! I am assuming that each block already has a logically justifiable minimum amount of lines.)
The rule #2 involves a lot of conditions: multiple "if-else" constructs, all having asymmetry of the same kind, etc. So it just does not apply in many cases.
Also, I often observe the following interesting phenomenon: when the rule #2 does apply and when it is used properly, it actually does not conflict with the rule #1! For example, whenever I have a bunch of "if-else" statements with "normal vs. abnormal" asymmetry, all the "abnormal" paths are shorter than "normal" ones (or vice versa). I cannot explain this phenomenon, but I think that it's just a sign of good code organization. In other words, whenever I see a situation when rules #1 and #2 are in conflict, I start looking for "code smells" and more often than not I do find some; and after refactoring - tada! no more painful choosing between rule #1 and rule #2, :-)
Finally, rule #3 has the smallest scope and therefore is the least critical.
Also, as mentioned here by other colleagues, it is often very easy to "cheat" with this rule (for example, to write "if(disabled)..." instead of "if(!enabled)...").
I hope someone can make some sense of this opus...
As a general rule, if one is significantly larger than the other, I make the larger one the if block.
put the common path first
turn negative checks into positive ones (!full == empty)
I always keep the most likely first.
In Perl I have an extra control structure to help with that. The inverse of if.
unless (alive) {
go_to_heaven;
} else {
say "MEDIC";
}
You should always put the most likely case first. Besides being more readable, it is faster. This also applies to switch statements.
I'm horrible when it comes to how I set up if statements. Basically, I set it up based on what exactly I'm looking for, which leads everything to be different.
if (userinput == null) {
    explodeViolently();
} else {
    actually do stuff;
}
or perhaps something like
if (1+1 == 2) {
    do stuff;
} else {
    explodeViolently();
}
Being inconsistent about which section of the if/else statement actually does things is a bad habit of mine.
I generally put the positive result (so the method) at the start so:
if(condition)
{
    doSomething();
}
else
{
    System.out.println("condition not true");
}
But if the condition has to be false for the method to be used, I would do this:
if(!condition)
{
    doSomething();
}
else
{
    System.out.println("condition true");
}
If you must have multiple exit points, put them first and make them clear:
if TerminatingCondition1 then
    Exit
if TerminatingCondition2 then
    Exit
Now we can progress with the usual stuff:
if NormalThing then
    DoNormalThing
else
    DoAbnormalThing