Should one use < or <= in a for loop [closed] - performance

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
If you had to iterate through a loop 7 times, would you use:
for (int i = 0; i < 7; i++)
or:
for (int i = 0; i <= 6; i++)
There are two considerations:
performance
readability
For performance I'm assuming Java or C#. Does it matter if "less than" or "less than or equal to" is used? If you have insight for a different language, please indicate which.
For readability I'm assuming 0-based arrays.
UPD: My mention of 0-based arrays may have confused things. I'm not talking about iterating through array elements. Just a general loop.
There is a good point below about using a constant, which would explain what this magic number is. So if I had "int NUMBER_OF_THINGS = 7" then "i <= NUMBER_OF_THINGS - 1" would look weird, wouldn't it?

The first is more idiomatic. In particular, it indicates (in a 0-based sense) the number of iterations. When using something 1-based (e.g. JDBC, IIRC) I might be tempted to use <=. So:
for (int i=0; i < count; i++) // For 0-based APIs
for (int i=1; i <= count; i++) // For 1-based APIs
I would expect the performance difference to be insignificantly small in real-world code.

Both of those loops iterate 7 times. I'd say the one with a 7 in it is more readable/clearer, unless you have a really good reason for the other.

I remember from my days when we did 8086 Assembly at college it was more performant to do:
for (int i = 6; i > -1; i--)
as there was a JNS operation that means Jump if No Sign. Using this meant that there was no memory lookup after each cycle to get the comparison value and no compare either. These days most compilers optimize register usage so the memory thing is no longer important, but you still get an un-required compare.
By the way putting 7 or 6 in your loop is introducing a "magic number". For better readability you should use a constant with an Intent Revealing Name. Like this:
const int NUMBER_OF_CARS = 7;
for (int i = 0; i < NUMBER_OF_CARS; i++)
EDIT: People aren’t getting the assembly thing so a fuller example is obviously required:
If we do for (i = 0; i <= 10; i++) you need to do this:
mov esi, 0
loopStartLabel:
; Do some stuff
inc esi
; Note cmp command on next line
cmp esi, 10
jle loopStartLabel
exitLoopLabel:
If we do for (int i = 10; i > -1; i--) then you can get away with this:
mov esi, 10
loopStartLabel:
; Do some stuff
dec esi
; Note no cmp command on next line
jns loopStartLabel
exitLoopLabel:
I just checked and Microsoft's C++ compiler does not do this optimization, but it does if you do:
for (int i = 10; i >= 0; i--)
So the moral is: if you are using Microsoft C++† and it makes no difference whether the loop ascends or descends, then to get a quick loop you should use:
for (int i = 10; i >= 0; i--)
rather than either of these:
for (int i = 10; i > -1; i--)
for (int i = 0; i <= 10; i++)
But frankly getting the readability of "for (int i = 0; i <= 10; i++)" is normally far more important than missing one processor command.
† Other compilers may do different things.

I always use < array.length because it's easier to read than <= array.length-1.
Also, with < 7, given that you know the index starts at 0, it should be intuitive that the number is the number of iterations.

Seen from an optimizing viewpoint it doesn't matter.
Seen from a code style viewpoint I prefer <. Reason:
for ( int i = 0; i < array.size(); i++ )
is so much more readable than
for ( int i = 0; i <= array.size() -1; i++ )
Also, < gives you the number of iterations straight away.
Another vote for < is that you might prevent a lot of accidental off-by-one mistakes.

@Chris, your statement about .Length being costly in .NET is actually untrue, and in the case of simple types it is the exact opposite.
int len = somearray.Length;
for(i = 0; i < len; i++)
{
somearray[i].something();
}
is actually slower than
for(i = 0; i < somearray.Length; i++)
{
somearray[i].something();
}
The latter is a case that is optimized by the runtime. Since the runtime can guarantee that i is a valid index into the array, no bounds checks are done. In the former, the runtime can't guarantee that i stays within the bounds of the array, so it forces a bounds check on every index lookup.

It makes no effective difference when it comes to performance. Therefore I would use whichever is easier to understand in the context of the problem you are solving.

I prefer:
for (int i = 0; i < 7; i++)
I think that translates more readily to "iterating through a loop 7 times".
I'm not sure about the performance implications - I suspect any differences would get compiled away.

In C++, I prefer using !=, which is usable with all STL containers. Not all STL container iterators are less-than comparable.

In Java 1.5 you can just do
for (int i: myArray) {
...
}
so for the array case you don't need to worry.

I don't think there is a performance difference. The second form is definitely more readable though, you don't have to mentally subtract one to find the last iteration number.
EDIT: I see others disagree. For me personally, I like to see the actual index numbers in the loop structure. Maybe it's because it's more reminiscent of Perl's 0..6 syntax, which I know is equivalent to (0,1,2,3,4,5,6). If I see a 7, I have to check the operator next to it to see that, in fact, index 7 is never reached.

I'd say use the "< 7" version because that's what the majority of people will read - so if people are skim reading your code, they might interpret it wrongly.
I wouldn't worry about whether "<" is quicker than "<=", just go for readability.
If you do want to go for a speed increase, consider the following:
for (int i = 0; i < this->GetCount(); i++)
{
// Do something
}
To increase performance you can slightly rearrange it to:
const int count = this->GetCount();
for (int i = 0; i < count; ++i)
{
// Do something
}
Notice the removal of GetCount() from the loop condition (because it would otherwise be called on every iteration) and the change of "i++" to "++i".

Edsger Dijkstra wrote an article on this back in 1982 where he argues for lower <= i < upper:
There is a smallest natural number. Exclusion of the lower bound —as in b) and d)— forces for a subsequence starting at the smallest natural number the lower bound as mentioned into the realm of the unnatural numbers. That is ugly, so for the lower bound we prefer the ≤ as in a) and c). Consider now the subsequences starting at the smallest natural number: inclusion of the upper bound would then force the latter to be unnatural by the time the sequence has shrunk to the empty one. That is ugly, so for the upper bound we prefer < as in a) and d). We conclude that convention a) is to be preferred.
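To make the convention concrete, here is a small illustrative C++ sketch (nothing in the idea is language-specific): with lower <= i < upper the iteration count is simply upper - lower, and the empty range falls out naturally when lower == upper.
#include <cstdio>

// Illustrative only: the half-open convention Dijkstra argues for.
// The iteration count is upper - lower, adjacent ranges [a, b) and [b, c)
// join without overlap, and lower == upper is the empty range.
void print_range(int lower, int upper) {
    std::printf("[%d, %d) -> %d iterations:", lower, upper, upper - lower);
    for (int i = lower; i < upper; ++i)
        std::printf(" %d", i);
    std::printf("\n");
}

int main() {
    print_range(0, 7);   // 0 1 2 3 4 5 6  (7 iterations)
    print_range(3, 3);   // empty range, zero iterations, no special case
    return 0;
}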

First, don't use 6 or 7.
Better to use:
int numberOfDays = 7;
for (int day = 0; day < numberOfDays ; day++){
}
In this case it's better than using
for (int day = 0; day <= numberOfDays - 1; day++){
}
Even better (Java / C#):
for(int day = 0; day < dayArray.Length; day++){
}
And even better (C#)
foreach (int day in days) { // in Java: for (int day : days)
}
The reverse loop is indeed faster, but since it's harder to read (if not for you, then for other programmers), it's better to avoid it, especially in C# and Java.

I agree with the crowd saying that the 7 makes sense in this case, but I would add that in the case where the 6 is important, say you want to make clear you're only acting on objects up to the 6th index, then the <= is better since it makes the 6 easier to see.

Way back in college, I remember something about these two operations being similar in compute time on the CPU. Of course, we're talking down at the assembly level.
However, if you're talking C# or Java, I really don't think one is going to be a speed boost over the other. The few nanoseconds you gain are most likely not worth any confusion you introduce.
Personally, I would author the code that makes sense from a business implementation standpoint, and make sure it's easy to read.

This falls directly under the category of "Making Wrong Code Look Wrong".
In zero-based indexing languages, such as Java or C#, people are accustomed to variations on the index < count condition. Thus, leveraging this de facto convention would make off-by-one errors more obvious.
Regarding performance: any good compiler worth its memory footprint should render this a non-issue.

As a slight aside, when looping through an array or other collection in .Net, I find
foreach (string item in myarray)
{
System.Console.WriteLine(item);
}
to be more readable than the numeric for loop. This of course assumes that the actual counter Int itself isn't used in the loop code. I do not know if there is a performance change.

There are many good reasons for writing i<7. Having the number 7 in a loop that iterates 7 times is good. The performance is effectively identical. Almost everybody writes i<7. If you're writing for readability, use the form that everyone will recognise instantly.

I have always preferred:
for ( int count = 7 ; count > 0 ; -- count )

Making a habit of using < will make it consistent for both you and the reader when you are iterating through an array. It will be simpler for everyone to have a standard convention. And if you're using a language with 0-based arrays, then < is the convention.
This almost certainly matters more than any performance difference between < and <=. Aim for functionality and readability first, then optimize.
Another note is that it would be better to be in the habit of writing ++i rather than i++, since fetch-and-increment (i++) requires a temporary and increment-and-fetch (++i) does not. For integers, your compiler will probably optimize the temporary away, but if your iterating type is more complex, it might not be able to.
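For anyone who wants to see where that temporary comes from, here is a minimal C++ sketch; the two overloads below are the canonical idiom for a user-defined type, not taken from any particular library.
// Pre-increment modifies in place and returns *this; post-increment must
// copy the old value so it can return it, which is the extra cost for
// non-trivial types.
struct Counter {
    int value;

    Counter& operator++() {       // ++c: increment and return self
        ++value;
        return *this;
    }
    Counter operator++(int) {     // c++: save a copy, increment, return the copy
        Counter old = *this;      // the temporary
        ++value;
        return old;
    }
};

int main() {
    Counter c = {0};
    ++c;    // no temporary
    c++;    // constructs and discards a temporary Counter
    return 0;
}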

Don't use magic numbers.
Why is it 7 (or 6, for that matter)?
Use a symbolic name for the number you want to use...
In this case I think it is better to use
for ( int i = 0; i < array.size(); i++ )

The '<' and '<=' operators are exactly the same performance cost.
The '<' operator is a standard and easier to read in a zero-based loop.
Using ++i instead of i++ improves performance in C++, but not in C# - I don't know about Java.

As people have observed, there is no difference in either of the two alternatives you mentioned. Just to confirm this, I did some simple benchmarking in JavaScript.
You can see the results here. What is not clear from this is that if I swap the position of the 1st and 2nd tests, the results for those 2 tests swap; this is clearly a memory issue. However, the 3rd test, the one where I reverse the order of the iteration, is clearly faster.

As everybody says, it is customary to use 0-indexed iterators even for things outside of arrays. If everything begins at 0 and ends at n-1, and lower-bounds are always <= and upper-bounds are always <, there's that much less thinking that you have to do when reviewing the code.

Great question. My answer: use type A ('<')
You clearly see how many iterations you have (7).
The difference between the two endpoints is the width of the range.
Fewer characters make it more readable.
You more often have the total number of elements (i < strlen(s)) than the index of the last element, so uniformity is important.
Another problem is with this whole construct: i appears 3 times in it, so it can be mistyped. The for-loop construct says how to do it instead of what to do. I suggest adopting this:
BOOST_FOREACH(i, IntegerInterval(0,7))
This is more clear, compiles to exactly the same asm instructions, etc. Ask me for the code of IntegerInterval if you like.
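For the curious, here is one way such an IntegerInterval could be put together with Boost's counting_iterator; this is a guess at the idea, not the original poster's actual code.
#include <boost/foreach.hpp>
#include <boost/iterator/counting_iterator.hpp>
#include <boost/range/iterator_range.hpp>
#include <cstdio>

// A half-open [lo, hi) range of ints, usable with BOOST_FOREACH.
typedef boost::iterator_range<boost::counting_iterator<int> > IntRange;

IntRange IntegerInterval(int lo, int hi) {
    return boost::make_iterator_range(boost::counting_iterator<int>(lo),
                                      boost::counting_iterator<int>(hi));
}

int main() {
    BOOST_FOREACH(int i, IntegerInterval(0, 7)) {
        std::printf("%d ", i);   // prints 0 1 2 3 4 5 6
    }
    std::printf("\n");
    return 0;
}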

So many answers ... but I believe I have something to add.
My preference is for the literal numbers to clearly show what values "i" will take in the loop. So in the case of iterating through a zero-based array:
for (int i = 0; i <= array.Length - 1; ++i)
And if you're just looping, not iterating through an array, counting from 1 to 7 is pretty intuitive:
for (int i = 1; i <= 7; ++i)
Readability trumps performance until you profile it, as you probably don't know what the compiler or runtime is going to do with your code until then.

You could also use != instead. That way, you'll get an infinite loop if you make an error in initialization, causing the error to be noticed earlier and any problems it causes to be limited to getting stuck in the loop (rather than having a problem much later and not finding it).

I think either are OK, but when you've chosen, stick to one or the other. If you're used to using <=, then try not to use < and vice versa.
I prefer <=, but in situations where you're working with indexes which start at zero, I'd probably try and use <. It's all personal preference though.

Strictly from a logical point of view, you have to think that < count would be more efficient than <= count for the exact reason that <= will be testing for equality as well.

Related

Can 2 small loops be faster than a big one?

I was watching this video "how did we end up here?" by Martin Thompson of mechanical-sympathy.
(http://m.youtube.com/watch?v=oxjT7veKi9c)
He claims that to make use of the L0 cache, sometimes it's better to have 2 small loops rather than a big one, even though we might need to pass through the same list twice.
Is it possible? Is there any way to create a trivial code example with measurements to demonstrate this?
Simple example:
double sum1 = 0, sum2 = 0;
for (i = n; --i >= 0;){
sum1 += a[i];
sum2 += b[i];
}
as against:
double sum1 = 0, sum2 = 0;
for (i = n; --i >= 0;){
sum1 += a[i];
}
for (i = n; --i >= 0;){
sum2 += b[i];
}
In the first example, the compiler has to generate code to "switch context" between indexing a[i] and b[i], and keeping track of where the addition goes.
If a and b are complicated, the compiler may be unable to hold references to both of them in registers.
The result can be that this "context switching", because it has to be done on every iteration, takes more instruction cycles than the cost of the extra loop.
(With unrolling, it is even more true.)
This is still without considering cache issues.
"Sometimes", maybe. If the loop's body may be split into parts without much overhead than the total count of executed instructions, either in two small loops or in one big loop, could be almost the same. And data cache helps anyway when traversing the input twice.
Yet I doubt if this trick could be really useful in general.
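Since the question asks for a trivial example with measurement, here is a rough C++ sketch along the lines of the first answer. The array size, repeat count and checksum are arbitrary choices, and the numbers will vary with compiler, optimization flags and CPU, so treat it as a starting point for your own measurement rather than evidence either way.
#include <chrono>
#include <cstdio>
#include <vector>

// Times the fused loop against the two split loops over the same data.
int main() {
    const int n = 1 << 22;                // large enough that a[] and b[] exceed the caches
    const int repeats = 50;
    std::vector<double> a(n, 1.0), b(n, 2.0);
    double sum1 = 0, sum2 = 0;            // printed below so the work isn't optimized away
    using Clock = std::chrono::steady_clock;

    auto t0 = Clock::now();
    for (int r = 0; r < repeats; ++r) {
        for (int i = n; --i >= 0;) {      // one big loop
            sum1 += a[i];
            sum2 += b[i];
        }
    }
    auto t1 = Clock::now();
    for (int r = 0; r < repeats; ++r) {
        for (int i = n; --i >= 0;)        // two small loops
            sum1 += a[i];
        for (int i = n; --i >= 0;)
            sum2 += b[i];
    }
    auto t2 = Clock::now();

    std::printf("fused: %lld ms, split: %lld ms (checksum %f)\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count(),
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count(),
        sum1 + sum2);
    return 0;
}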

IF,ELSE statement / Loop and variables assignment: code optimization best practices

I have some simple and very basic questions here, but I would like to settle them once and for all, so I decided to ask.
Ok, here comes the code and the question within:
Is something like
for (n=0;n<length;++n) countsc[n]=0;
countsc[x]=1; // x is something
better than something like
for (n=0;n<length;++n) {
if (n != x) countsc[n]=0;
else countsc[n]=1;
}
or also
for (n=0;n<length;++n) countsc[n]=(n != x ? 0 : 1);
in terms of performance and optimization (speed, CPU and memory usage)?
What would be a convenient way to measure this, for example with JavaScript and/or PHP? Would the answer be generally valid for all programming languages, or might it differ?
In a similar way, is something like
a=0;
if (condition == true) a=1;
generally better than
if (condition == true) a=1;
else a=0;
or also
a = (condition == true ? 1 : 0);
when condition is usually false?
Probably not the answer you're looking for, but in general, I don't think there is a general way to figure this out from static analysis of code. This is not only going to vary by language, but also possibly by the architecture you run it on. I suspect any half-decent compiler should optimize these so there is little/no difference, but that may be less likely for interpreted languages.
If it really is a performance critical section of code (and you will only know that by profiling), then the best answer you will get will be by profiling and comparing the two candidate code sections on your target architecture, using the relevant language.
Depending on the quality of the compiler (or interpreter, if appropriate),
for (n=0;n<length;++n) countsc[n]=0;
countsc[x]=1;
is easier to optimize, since there are no branches to (mis)predict. The minor duplication of setting countsc[x] twice is trivial compared to the need to test n on each iteration and the penalty for a mispredicted branch inside the loop of
for (n=0;n<length;++n) {
if (n != x) countsc[n]=0;
else countsc[n]=1;
}
In terms of branch prediction, your third example with the ternary operator is identical to the second.
However, unless this is a tight inner loop that is either executed frequently and/or for very large values of length, it's unlikely to matter which approach you use looking at the overall running time of your program.
for (n=0;n<length;++n) countsc[n]=0;
countsc[x]=1;
is the more performant option, assuming comparison and assignment have roughly the same cost.
Similarly,
if (condition) a=1; // boolean == true is not pretty
else a=0;
should be more performant; you only have one assignment and one jump, while you have up to two assignments and one jump with the alternative version. This version
a = (condition == true ? 1 : 0);
should be just as good, I expect it is compiled to the same code as the if-else version.
I would think
for (n = 0; n < x; ++n)
countsc[n] = 0;
countsc[x] = 1;
for (n = x + 1; n < length; ++n)
countsc[n] = 0;
would be better than using the conditional, and you avoid rewriting countsc[x]. Of course, everything is O(n) and you won't notice any change in your program's speed.
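In the spirit of the "profile it" answer above, here is a rough C++ micro-benchmark sketch comparing the first two forms from the question. The array size, repeat count and checksum are arbitrary; a good optimizer may transform either loop heavily, so the results only tell you about your particular compiler and flags.
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    const int length = 1 << 20;        // arbitrary, just large enough to measure
    const int x = length / 2;          // "x is something"
    const int repeats = 200;
    std::vector<int> countsc(length);
    long long sink = 0;                // consumed below so the work isn't optimized away
    using Clock = std::chrono::steady_clock;

    auto t0 = Clock::now();
    for (int r = 0; r < repeats; ++r) {
        for (int n = 0; n < length; ++n) countsc[n] = 0;   // fill, then overwrite one slot
        countsc[x] = 1;
        sink += countsc[r % length];
    }
    auto t1 = Clock::now();
    for (int r = 0; r < repeats; ++r) {
        for (int n = 0; n < length; ++n) countsc[n] = (n != x ? 0 : 1);  // branch per element
        sink += countsc[r % length];
    }
    auto t2 = Clock::now();

    std::printf("fill+overwrite:     %lld ms\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count());
    std::printf("branch per element: %lld ms\n",
        (long long)std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count());
    std::printf("(checksum %lld)\n", sink);
    return 0;
}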

Sorting algorithm for list of integers

I have a list of about 200 integers whose values are between 1 and 5.
I want to get into learning about sorting algorithms and knowing where to apply each, because at the moment I use bubble sort for everything, which I've been told is a terrible way to do things.
What would be the fastest sorting algorithm for this integer sorting?
EDIT: It turns out that because I know the numbers are 1 to 5, I can use a bucket sort (?) algorithm, which, if I'm not mistaken (and I definitely could be), means that for each integer of value 1 I put it in the 1 group, each value 2 in the 2 group, etc., then concatenate the groups at the end. This seems like a simple and efficient way to do it.
However, since this is (currently) a learning exercise for me, I am going to remove the 1 - 5 limitation and try to implement bubble sort and merge sort, then compare the two to see which is faster.
Thanks for your help!
... which I've been told is a terrible way to do things.
First off, don't accept as gospel anything you hear from random bods on the internet (even me).
Bubble sort is fine under certain conditions, such as when the data is already mostly sorted, or the item count is relatively small (such as 200) (a), or you have no sort functionality built into the language and you're on a tight deadline where lack of performance will annoy the customer but lack of functionality will get you fired :-)
This bias against bubble sort is similar to the "only one exit point from a function" and "no goto" rules. You should understand the reasoning behind them so that you know when the rules can be ignored safely.
Anyway, on to the question proper. An efficient way for your specific case is to just count the items then output them, something like:
dim count[1..5] = {0, 0, 0, 0, 0};
for each item in list:
count[item] = count[item] + 1
for val in 1..5:
for quant in 1..count[val]:
output val
That's an O(n) time and O(1) space solution and you won't find a more efficient big-O for a generalised sort routine - it's only possible in this case because of the extra information you have about the data (limited to the values 1 through 5).
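Here is the same counting idea written out as a small C++ sketch, assuming the values really are limited to 1 through 5:
#include <cstdio>
#include <vector>

// Count how many of each value appear, then emit the values in order.
std::vector<int> countingSort1to5(const std::vector<int>& input) {
    int count[6] = {0};                  // count[0] unused; values are 1..5
    for (int item : input)
        ++count[item];

    std::vector<int> sorted;
    sorted.reserve(input.size());
    for (int val = 1; val <= 5; ++val)
        for (int quant = 0; quant < count[val]; ++quant)
            sorted.push_back(val);
    return sorted;
}

int main() {
    std::vector<int> data = {3, 1, 5, 5, 2, 1, 4};   // stand-in for the 200-item list
    std::vector<int> sorted = countingSort1to5(data);
    for (int v : sorted)
        std::printf("%d ", v);                        // prints 1 1 2 3 4 5 5
    std::printf("\n");
    return 0;
}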
If you wanted to examine all the different sort algorithms, the Wikipedia Sorting Algorithm page is a useful starting point, including the major algorithms and their properties.
(a) As an aside, the following code (using worst case data for bubble sort), when run under CygWin on a not-very-powerful IBM T60 (2GHz dual core) laptop, completes in, on average, 0.157 seconds (5 samples: 0.150, 0.125, 0.192, 0.199, 0.115).
I wouldn't use it for sorting a million items (everyone knows bubble sort scales poorly) but 200 should be fine in most cases:
#include <stdio.h>
#define COUNT 200
int main (void) {
int i, swapped, tmp, item[COUNT];
// Set up worst case (reverse order) data.
for (i = 0; i < COUNT; i++)
item[i] = 200 - i;
// Slightly optimised bubble sort.
swapped = 1;
while (swapped) {
swapped = 0;
for (i = 1; i < COUNT; i++) {
if (item[i-1] > item[i]) {
tmp = item[i-1];
item[i-1] = item[i];
item[i] = tmp;
swapped = 1;
}
}
}
// for (i = 0; i < COUNT; i++)
// printf ("%d ", item[i]);
// putchar ('\n');
return 0;
}
You may not need sorting here, since you only have 5 possible values.
You could use 5 containers (or buckets) and as you scan your list of integers you place the values in the right bucket.
At the end, join the buckets together, in order.
Merge sort is O(n log n); I think it's way better than QuickSort.
You can find some C# code here.

Naming the counter variables of consecutive for-loops

Let's say you have a piece of code where you have a for-loop, followed by another for-loop, and so on. Now, which one is preferable?
Give every counter variable the same name:
for (int i = 0; i < someBound; i++) {
doSomething();
}
for (int i = 0; i < anotherBound; i++) {
doSomethingElse();
}
Give them different names:
for (int i = 0; i < someBound; i++) {
doSomething();
}
for (int j = 0; j < anotherBound; j++) {
doSomethingElse();
}
I think the second one would be somewhat more readable; on the other hand, I'd use j, k and so on to name inner loops... What do you think?
I reuse the variable name in this case. The reason being that i is sort of international programmerese for "loop control variable whose name isn't really important". j is a bit less clear on that score, and once you have to start using k and beyond it gets kind of obscure.
One thing I should add is that when you use nested loops, you do have to go to j, k, and beyond. Of course if you have more than three nested loops, I'd highly suggest a bit of refactoring.
The first one is good for me, since that would allow you to use j and k in your inner loops, and because you are resetting i = 0 in the second loop there won't be any issues with the old value being used.
The way you wrote your loops, the counter is not supposed to be used outside the loop body, so there's nothing wrong with using the same variable names.
As for readability, i, j, and k are commonly used as variable names for counters, so it is even better to use them rather than picking the next letter over and over again.
I find it interesting that so many people have different opinions on this. Personally I prefer the first method, if for no other reason than to keep j and k open. I could see why people would prefer the second one for readability, but I think any coder worth handing a project over to is going to be able to see what you're doing with the first situation.
The variable should be named something related to the operation or the boundary condition.
For example:
'indexOfPeople',
'activeConnections', or
'fileCount'.
If you are going to use 'i', 'j', and 'k', then reserve 'j' and 'k' for nested loops.
void doSomethingInALoop() {
for (int i = 0; i < someBound; i++) {
doSomething();
}
}
void doSomethingElseInALoop() {
for (int i = 0; i < anotherBound; i++) {
doSomethingElse();
}
}
If the loops are doing the same things (the loop control -- not the loop body, i.e. they are looping over the same array or same range), then I'd use the same variable.
If they are doing different things -- a different array, or whatever -- then I'd use different variables.
So, on the one-hundredth loop, you'd name the variable "zzz"?
The question is really irrelevant since the variable is defined local to the for-loop. Some flavors of C, such as on OpenVMS, require using different names. Otherwise, it amounts to programmer's preference, unless the compiler restricts it.

OpenMP - running things in parallel and some in sequence within them

I have a scenario like:
for (i = 0; i < n; i++)
{
for (j = 0; j < m; j++)
{
for (k = 0; k < x; k++)
{
val = 2*i + j + 4*k;
if (val != 0)
{
for(t = 0; t < l; t++)
{
someFunction((i + t) + someFunction(j + t) + k*t);
}
}
}
}
}
Considering this is block A: now I have two more similar blocks in my code. I want to put them in parallel, so I used OpenMP pragmas. However, I am not able to parallelize it, because I am a tad confused about which variables would be shared and which private in this case. If the function call in the inner loop were an operation like sum += x, then I could have added a reduction clause.
In general, how would one approach parallelizing code using OpenMP when there is a nested for loop, and then another inner for loop doing the main operation?
I tried declaring a parallel region and then simply putting pragma fors before the blocks, but I am definitely missing something there!
Thanks,
Sayan
I'm more of a Fortran programmer than C so my knowledge of OpenMP in C-style is poor, and I'll leave the syntax to you.
Your easiest approach here is probably (I'll qualify this later) to simply parallelise the outermost loop. By default OpenMP will regard variable i as private and all the rest as shared. This is probably not what you want; you probably want to make j, k and t private too. I suspect that you want val private also.
I'm a bit puzzled by the statement at the bottom of your nest of loops (ie someFunction...), which doesn't seem to return any value at all. Does it work by side-effects ?
So, you shouldn't need to declare a parallel region enclosing all this code, and you should probably only parallelise the outermost loop. If you were to parallelise the inner loops too you might find your OpenMP installation either ignoring them, spawning more processes than you have processors, or complaining bitterly.
I say that your easiest approach is probably to parallelise the outermost loop because I've made some assumptions about what your program (fragment) is doing. If the assumptions are wrong you might want to parallelise one of the inner loops. Another point to check is that the number of executions of the loop(s) you parallelise is much greater than the number of threads you use. You don't want to have OpenMP run loops with a trip count of, say, 7, on 4 threads, the load balance would be very poor.
You're correct, the innermost statement would rather be someFunction((i + t) + someFunction2(j + t) + k*t).
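A minimal sketch of what that advice might look like in code, parallelising only the outermost loop and making the other indices (and val) private. The function bodies below are hypothetical stand-ins, and the sketch assumes the calls to someFunction are independent between iterations; otherwise a plain parallel for is not safe.
// Compile with OpenMP enabled, e.g. -fopenmp.

// Hypothetical stand-ins for the poster's routines; only the call shapes matter here.
int someFunction2(int v) { return v; }
void someFunction(int v) { (void)v; /* imagine some independent side effect */ }

void blockA(int n, int m, int x, int l)
{
    int i, j, k, t, val;
    // The index i of the parallelised loop is made private automatically;
    // the other loop indices and val must be declared private explicitly.
    #pragma omp parallel for private(j, k, t, val)
    for (i = 0; i < n; i++) {
        for (j = 0; j < m; j++) {
            for (k = 0; k < x; k++) {
                val = 2*i + j + 4*k;
                if (val != 0) {
                    for (t = 0; t < l; t++) {
                        someFunction((i + t) + someFunction2(j + t) + k*t);
                    }
                }
            }
        }
    }
}

int main() {
    blockA(100, 10, 5, 3);
    return 0;
}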
