Does it change performance to use a non-int counter in a loop?

I'm just curious and can't find the answer anywhere. Usually, we use an integer for a counter in a loop, e.g. in C/C++:
for (int i=0; i<100; ++i)
But we can also use a short integer or even a char. My question is: does it change the performance? The memory savings are only a few bytes, so they're negligible. It just intrigues me whether I do any harm by using a char if I know the counter won't exceed 100.

Probably using the "natural" integer size for the platform will provide the best performance. In C++ this is usually int. However, the difference is likely to be small and you are unlikely to find that this is the performance bottleneck.

Depends on the architecture. On the PowerPC, there's usually a massive performance penalty involved in using anything other than int (or whatever the native word size is) -- e.g., don't use short or char. Float is right out, too.
You should time this on your particular architecture because it varies, but in my test cases there was ~20% slowdown from using short instead of int.
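If you want to measure it yourself, here is a minimal timing sketch (my own, not from the original answer; the repetition count and the volatile sink are assumptions to keep the compiler from deleting the loops):
#include <chrono>
#include <cstdio>

volatile long sink;  // volatile so the loops aren't optimized away

template <typename Counter>
long long run(long reps) {
    auto t0 = std::chrono::steady_clock::now();
    for (long r = 0; r < reps; ++r)
        for (Counter i = 0; i < 100; ++i)  // the loop under test
            sink += i;
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
}

int main() {
    const long reps = 10000000;  // adjust for your machine
    std::printf("int:   %lld ms\n", run<int>(reps));
    std::printf("short: %lld ms\n", run<short>(reps));
    std::printf("char:  %lld ms\n", run<char>(reps));
}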

I can't provide a citation, but I've heard that you often do incur a little performance overhead by using a short or char.
The memory savings are nonexistent since it's a temporary stack variable. The memory it lives in will almost certainly already be allocated, and you probably won't save anything by using a shorter type because the next variable will likely want to be aligned to a larger boundary anyway.
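You can see the alignment effect directly (a tiny sketch of my own, not part of the original answer; the result is typical for mainstream platforms):
#include <cstdio>

struct Padded {
    char c;  // 1 byte of data...
    int  i;  // ...but int wants 4-byte alignment
};

int main() {
    // Typically prints 8, not 5: the compiler inserts 3 bytes of
    // padding after 'c' so that 'i' is properly aligned.
    std::printf("%zu\n", sizeof(Padded));
}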

You can use whatever legal type you want in a for; it doesn't have to be integral or even built in. For example, you can use iterators as well:
for( std::vector<std::string>::iterator s = myStrings.begin(); myStrings.end() != s; ++s )
{
...
}
Whether or not it will have an impact on performance comes down to a question of how the operators you use are implemented. So in the above example that means end(), operator!=() and operator++().
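Since C++11 the same traversal can also be written as a range-based for, which desugars to the same begin()/end()/operator++ calls (a sketch, reusing myStrings from above):
for (const std::string& s : myStrings)
{
...
}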

This is not really an answer. I'm just exploring what Crashworks said about the PowerPC. As others have pointed out already, using a type that maps to the native word size should yield the shortest code and the best performance.
$ cat loop.c
extern void bar();
void foo()
{
int i;
for (i = 0; i < 42; ++i)
bar();
}
$ powerpc-eabi-gcc -S -O3 -o - loop.c
.
.
.L5:
bl bar
addic. 31,31,-1
bge+ 0,.L5
It is quite different with short i instead of int i: the counter now needs an explicit sign-extension (extsh) and a separate compare on every iteration, so it looks like it won't perform as well either.
.L5:
bl bar
addi 3,31,1
extsh 31,3
cmpwi 7,31,41
ble+ 7,.L5

No, it really shouldn't impact performance.

It probably would have been quicker to type in a quick program (you did the most complex line already) and profile it than to ask this question here. :-)
FWIW, in languages that use bignums by default (Python, Lisp, etc.), I've never seen a profile where a loop counter was the bottleneck. Checking the type tag is not that expensive -- a couple instructions at most -- but probably bigger than the difference between a (fix)int and a short int.

Probably not as long as you don't do it with a float or a double. Since memory is cheap you would probably be best off just using an int.

An unsigned or size_t should, in theory, give you better results (easy, people, we are trying to optimise here, against those shouting 'premature optimization'; it's the new trend).
However, it does have its drawbacks, primarily the classic screw-up: unsigned wrap-around, e.g. a countdown loop whose i >= 0 condition is always true.
Google's developers seem to avoid unsigned counters too, but it is a pain to fight against std or boost, which use unsigned size types.

If you compile your program with optimization (e.g., gcc -O), it doesn't matter. The compiler will allocate an integer register to the value and never store it in memory or on the stack. If your loop calls a routine, gcc will allocate the value to one of the callee-saved registers r14-r31, which any called routine will save and restore. So use int, because that causes the least surprise to whoever reads your code.
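An easy way to check what your compiler actually does is to diff the generated assembly for the two counter types (my own suggestion; loop_int.c and loop_char.c are hypothetical files containing the int and char variants of the loop):
$ gcc -O2 -S -o int.s loop_int.c
$ gcc -O2 -S -o char.s loop_char.c
$ diff int.s char.s
If the diff is empty apart from labels, the counter type made no difference after optimization.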

Related

ALU, double and int

Sometimes when writing code, situations such as
(double)Number1/(int)Number2 // division of a double variable by an int one
come up (for me, and I think for all of you more or less often), and I never know what really happens if I replace the (int) with a (double):
(double)Number1/(double)Number2
Is the performance the same? And the precision? Does the time taken to perform it change? Does the compiler, in the general case (if it is possible to say such a thing), produce the same binary, i.e. exe file? Does the ALU operation used change?
I believe a formal answer would depend on the machine architecture, the compiler, the language and plenty more. But in these cases, how can I get a notion of what would happen in my code, and which choice would be better (if there is an appreciable difference)?
Thank you all for your replies!
The precision can be different.
For example, if Number2 is originally a double, converting it to an int with (int)Number2 before the division can lose a lot of information both through truncating any bits after the binary point and by truncating any integral bits that don't fit in the int.
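A concrete illustration (the numbers are my own, not from the original answer):
#include <cstdio>

int main() {
    double Number1 = 7.0;
    double Number2 = 2.5;

    // Casting Number2 to int truncates 2.5 down to 2 before dividing.
    double with_int    = (double)Number1 / (int)Number2;     // 7.0 / 2   = 3.5
    double with_double = (double)Number1 / (double)Number2;  // 7.0 / 2.5 = 2.8

    std::printf("%f %f\n", with_int, with_double);  // prints 3.500000 2.800000
}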

What is the difference between std::atoi() and std::stoi?

What is the difference between atoi and stoi?
I know,
std::string my_string = "123456789";
In order to convert that string to an integer, you’d have to do the following:
const char* my_c_string = my_string.c_str();
int my_integer = atoi(my_c_string);
C++11 offers a succinct replacement:
std::string my_string = "123456789";
int my_integer = std::stoi(my_string);
1). Are there any other differences between the two?
2). Efficiency and performance wise which one is better?
3). Which is safer to use?
1). Are there any other differences between the two?
I find std::atoi() a horrible function: It returns zero on error. If you consider zero as a valid input, then you cannot tell whether there was an error during the conversion or the input was zero. That's just bad. See for example How do I tell if the c function atoi failed or if it was a string of zeros?
On the other hand, the corresponding C++ function will throw an exception on error. You can properly distinguish errors from zero as input.
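For example, a minimal sketch of the error handling std::stoi gives you:
#include <iostream>
#include <stdexcept>
#include <string>

int main() {
    for (std::string input : {"123456789", "0", "not a number"}) {
        try {
            int value = std::stoi(input);
            std::cout << input << " -> " << value << '\n';
        } catch (const std::invalid_argument&) {
            std::cout << input << " -> no conversion could be performed\n";
        } catch (const std::out_of_range&) {
            std::cout << input << " -> out of int's range\n";
        }
    }
}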
2). Efficiency and performance wise which one is better?
If you don't care about correctness or you know for sure that you won't have zero as input or you consider that an error anyway, then, perhaps the C functions might be faster (probably due to the lack of exception handling). It depends on your compiler, your standard library implementation, your hardware, your input, etc. The best way is to measure it. However, I suspect that the difference, if any, is negligible.
If you need a fast (but ugly C-style) implementation, the most upvoted answer to the How to parse a string to an int in C++? question seems reasonable. However, I would not go with that implementation unless absolutely necessary (mainly because of having to mess with char* and \0 termination).
3). Which is safer to use?
See the first point.
In addition to that, if you need to work with char* and watch out for \0 termination, you are more likely to make mistakes. std::string is much easier and safer to work with because it takes care of all of this for you.

What's more costly on current CPUs: arithmetic operations or conditionals?

20-30 years ago arithmetic operations like division were one of the most costly operations for CPUs. Saving one division in a piece of repeatedly called code was a significant performance gain. But today CPUs have fast arithmetic operations and since they heavily use instruction pipelining, conditionals can disrupt efficient execution. If I want to optimize code for speed, should I prefer arithmetic operations in favor of conditionals?
Example 1
Suppose we want to implement operations modulo n. What will perform better:
int c = a + b;
result = (c >= n) ? (c - n) : c;
or
result = (a + b) % n;
?
Example 2
Let's say we're converting 24-bit signed numbers to 32-bit. What will perform better:
int32_t x = ...;
result = (x & 0x800000) ? (x | 0xff000000) : x;
or
result = (x << 8) >> 8;
?
All the low-hanging fruit has already been picked and pickled by the authors of compilers and the people who build hardware. If you are the kind of person who needs to ask such a question, you are unlikely to be able to optimize anything by hand.
While 20 years ago it was possible for a relatively competent programmer to make some optimizations by dropping down to assembly, nowadays it is the domain of experts specializing in the target architecture; also, optimization requires not only knowing the program, but knowing the data it will process. Everything comes down to heuristics and tests under different conditions.
Simple performance questions no longer have simple answers.
If you want to optimise for speed, you should just tell your compiler to optimise for speed. Modern compilers will generally outperform you in this area.
I've sometimes been surprised trying to relate assembly code back to the original source for this very reason.
Optimise your source code for readability and let the compiler do what it's best at.
I expect that in example #1, the first will perform better. The compiler will probably apply some bit-twiddling trick to avoid a branch. But you're taking advantage of knowledge that it's extremely unlikely that the compiler can deduce: namely that the sum is always in the range [0:2*n-2] so a single subtraction will suffice.
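For instance, one branchless form the compiler (or you) could use, as a sketch of my own, assuming 0 <= a, b < n so the sum is in [0 : 2*n-2]:
int c = a + b;
// (c >= n) evaluates to 0 or 1, so this subtracts n exactly when
// needed, with no branch:
result = c - n * (c >= n);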
For example #2, the second way is both faster on modern CPUs and simpler to follow. A judicious comment would be appropriate in either version. (I wouldn't be surprised to see the compiler convert the first version into the second.)
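If you want to convince yourself that the two forms of example #2 agree, here is a quick check (my own sketch; note the shift version relies on two's-complement arithmetic shifts, which are implementation-defined before C++20 but near-universal in practice):
#include <cstdint>
#include <cstdio>

int32_t extend_branch(int32_t x) { return (x & 0x800000) ? (x | 0xff000000) : x; }
int32_t extend_shift(int32_t x)  { return (x << 8) >> 8; }

int main() {
    for (int32_t v : {0x000001, 0x7fffff, 0x800000, 0xffffff}) {
        std::printf("%06x -> %08x %08x\n", (unsigned)v,
                    (unsigned)extend_branch(v), (unsigned)extend_shift(v));
    }
}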

Is it quicker to access a local variable than an attribute of an object?

Straightforward, language-agnostic question. I've always done this:
myVar = myObj.myAttribute
when I need to access myAttribute a lot.
I'm wondering if this is just a superstition I've acquired, or if it's generally faster?
Edit: I would also like to know if this
myVar = myObj.myAttribute/100
for (i=0; i<100; i++) {
print myVar*i;
}
is more efficient than putting (myObj.myAttribute/100) in the loop. Will modern compilers and interpreters detect that that part of the equation doesn't vary?
In this particular case what you did is more efficient, since it's one division vs 100.
I assign a property to a variable only if I can optimize the operations done later (as in your case), or when I expect multiple calls to the same property and the object lookup is likely to be expensive. Generally, using a local variable is the more CPU-friendly way, since complex property lookups can be costly, and it also gives you better control of the value and the option of pre-validating it before looping. That said, it may be inefficient if the lookup only occurs once or twice per function call, since it adds overhead and makes the code harder to follow.
I suppose it might depend on the language and/or the compiler; but, generally speaking, the less your code has to do, the faster it'll be.
But the difference shouldn't be that important... and what matters most is that people are able to understand your code easily.
In JavaScript, for instance, it's said to be faster to use a local variable instead of re-computing the object access several times.
i.e. this :
var a = obj.a.b.c;
a.a = 10;
a.b = 20;
a.c = 30;
is faster than that :
obj.a.b.c.a = 10;
obj.a.b.c.b = 20;
obj.a.b.c.c = 30;
As a rule, depending on the language, maybe.
You are unlikely to notice the difference however, unless you are running (for example) a tight loop.
Usually I would say the savings are not worth the extra cognitive load on the programmer.
However if you have a bit of code which you know has a slowness problem, this kind of optimisation is definitely worth considering.

best-practice on for loop's condition

What is considered best practice in this case?
for (i=0; i<array.length(); ++i)
or
for (i=array.length()-1; i>=0; --i)
Assuming I don't care about iterating in a particular direction, but just over the bare length of the array. Also, I don't plan to alter the array's size in the loop body.
So, will array.length() become constant during compilation? If not, then the second approach should be the one to go for.
I would do the first method, as that is much more readable, and I can look and see you are iterating over the loop. The second one took me a second :(.
array.length will remain constant so long as you aren't modifying the array.
In most cases I would expect array.length() to be implemented in such a way that it is O(1), so it would not really impact on the loop's performance. If you are in doubt, or want to make sure it is a constant, just do so explicitly:
// JavaScript
var l = a.length;
for (var i=0; i<l; i++) {
// do something
}
I consider the reversed notation a "clever hack" that falls into the premature optimization category. It's harder to read, more error-prone and does not really provide a benefit over the alternative I suggest.
But since implementations of compilers/interpreters are vastly different and you do not say what language you refer to, it is hard to make an absolute statement about this. I would say unless this is in an absolutely time-critical section of code or otherwise measurably contributing to code running time, and your benchmark tests show that doing it differently provides a real benefit, I would stick to the code that's easier to understand and maintain.
Version 2 was broken and would have iterated from one past the end of the array down to 1. (Now corrected.)
Stick with version 1. It's well recognised and doesn't leave the reader doing a double-take.
Version 1 is far more widely used and simpler to understand. Version 2 may occasionally be very slightly faster if the compiler doesn't optimize array.length() into a constant, but...insert your own premature optimization comment here
EDIT: as to whether array.length() will be optimized out, it will depend on the language. If the language uses arrays as "normal" objects or arrays can be dynamically sized, it will be just a method call and the compiler can't assume will return a consistent return value. But for languages in which arrays are a special case or object (or the compiler's just really smart...) the speed difference will probably be eliminated.
for (i=0; i<array.length(); ++i) is better for me, but for (i=array.length()-1; i>=0; --i) is fastest, because on common processors comparing against zero is faster than comparing two variables.
array.length() is constant as long as you don't add or erase elements from the array during iteration.
Version 1 is quickest: for (i=0; i<array.length(); ++i).
For a for loop in my own code, though it's probably overkill, I got into the habit early on of writing my loops such that the length gets calculated exactly once, but the loop proceeds in the natural way:
int len = array.length();
for (int i=0; i<len; ++i) {
doSomething(array[i]);
}
These days, though, I prefer using "for-each" facilities where they're available and convenient; they make loops easier to read and foolproof. In C++ that would be something like:
std::for_each(array.begin(), array.end(), &doSomething);
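In C++11 and later there is also the range-based for, which reads even more naturally (assuming array is a standard container whose elements doSomething accepts):
for (const auto& element : array) {
    doSomething(element);
}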
