Which is faster? Comparison or assignment? - performance

I'm doing a bit of coding, where I have to write this sort of code:
if (array[i] == false)
    array[i] = true;
I wonder if it should be re-written as simply
array[i] = true;
This raises the question: are comparisons faster than assignments?
What about differences from language to language? (e.g., the contrast between Java and C++)
NOTE: I've heard that "premature optimization is the root of all evil." I don't think that applies here :)

This isn't just premature optimization, this is micro-optimization, which is an irrelevant distraction.
If your array is of boolean type, then the comparison is unnecessary, and that is the only relevant observation.

Well, since you say you're sure this matters, you should just write a test program and measure the difference.
Comparison can be faster if this code is executed on multiple variables allocated at scattered addresses in memory. With a comparison you only read data from memory into the processor cache, and if you don't change the value, the cache line stays clean; when the cache decides to evict that line, it sees that it was not modified and there's no need to write it back to memory. This can speed up execution.
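To see the clean-vs-dirty cache line effect for yourself, here is a minimal Java sketch (class and constant names are mine, and a serious measurement would use a harness like JMH rather than raw System.nanoTime):
import java.util.Arrays;

public class WriteVsCheck {
    static final int N = 1 << 24;   // 16M booleans

    public static void main(String[] args) {
        boolean[] flags = new boolean[N];
        Arrays.fill(flags, false);   // touch the pages first so allocation cost isn't timed

        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) flags[i] = true;   // unconditional store: every line gets dirtied
        long t1 = System.nanoTime();

        Arrays.fill(flags, true);    // now every check below fails, so no stores happen
        long t2 = System.nanoTime();
        for (int i = 0; i < N; i++)
            if (!flags[i]) flags[i] = true;            // compare only: lines stay clean
        long t3 = System.nanoTime();

        System.out.println("always store: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("check only:   " + (t3 - t2) / 1_000_000 + " ms");
    }
}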

Edit: I wrote a script in PHP. I just noticed that there was a glaring error in it, meaning the best-case runtime was being calculated incorrectly (scary that nobody else noticed!).
Best case just beats outright assignment but worst case is a lot worse than plain assignment. Assignment is likely fastest in terms of real-world data.
Output:
assignment in 0.0119960308075 seconds
worst case comparison in 0.0188510417938 seconds
best case comparison in 0.0116770267487 seconds
Code:
<?php
$arr = array();

reset_arr($arr);
$starttime = microtime(true);
for ($i = 0; $i < 10000; $i++)
    $arr[$i] = true;
$totaltime = microtime(true) - $starttime;
echo "assignment in ".$totaltime." seconds<br />";

reset_arr($arr);
$starttime = microtime(true);
for ($i = 0; $i < 10000; $i++)
    if (!$arr[$i])          // always true after the reset: compare, then assign
        $arr[$i] = true;
$totaltime = microtime(true) - $starttime;
echo "worst case comparison in ".$totaltime." seconds<br />";

reset_arr($arr);
$starttime = microtime(true);
for ($i = 0; $i < 10000; $i++)
    if ($arr[$i])           // always false after the reset: compare only
        $arr[$i] = true;
$totaltime = microtime(true) - $starttime;
echo "best case comparison in ".$totaltime." seconds<br />";

// &$arr: pass by reference, otherwise the caller's array is never actually reset
function reset_arr(&$arr) {
    for ($i = 0; $i < 10000; $i++)
        $arr[$i] = false;
}

I believe that if comparison and assignment are both atomic (i.e., one processor instruction each) and the loop executes n times, then comparing-then-assigning requires up to 2n instructions in the worst case (a comparison on every iteration plus an assignment on every iteration), whereas always assigning requires only n. Therefore the second approach (always assigning) is more efficient.

Depends on the language. However, looping through an array can be costly as well. If the array is in consecutive memory, the fastest approach is to write 0xFF bytes (all 1 bits) across the entire array with memset (not memcpy, which copies rather than fills), assuming your language/compiler lets you.
That performs one bulk write and zero reads, with no reading/writing of the loop variable and array element (2 reads/2 writes per iteration) several hundred times over.
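In Java, the closest standard-library analogue to that bulk fill is Arrays.fill; whether it beats a hand-written loop depends on the JIT, so treat this sketch as illustrative only:
import java.util.Arrays;

public class BulkFill {
    public static void main(String[] args) {
        boolean[] flags = new boolean[1_000_000];
        Arrays.fill(flags, true);            // one bulk call instead of an explicit per-element loop
        System.out.println(flags[999_999]);  // prints: true
    }
}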

I really wouldn't expect any noticeable performance difference for something as trivial as this, so it comes down to what gives you clearer, more readable code. In my opinion, that would be always assigning true.

Might give this a try:
if (!array[i])
    array[i] = true;
But really the only way to know for sure is to profile; I'm sure pretty much any compiler would see the comparison to false as unnecessary and optimize it out.

It all depends on the data type. Assigning booleans is faster than first comparing them. But that may not be true for larger value-based datatypes.

As others have noted, this is micro-optimization.
(In politics or journalism, this is known as navel-gazing ;-)
Is the program large enough to have more than a couple of layers of function/method/subroutine calls?
If so, it probably has some avoidable calls, and those can waste hundreds of times as much time as low-level inefficiencies.
Once you have removed those (which few people do), then by all means run it 10^9 times under a stopwatch and see which is faster.

Why would you even write the first version? What's the benefit of checking whether something is false before setting it true? If you are always going to set it true, then always set it true.
When you have a performance bottleneck that you've traced back to setting a single boolean value unnecessarily, come back and talk to us.

I remember one book about assembly language in which the author claimed that if conditions should be avoided when possible.
They are much slower when the condition forces execution to jump to another line, considerably slowing down performance. Also, since programs are executed as machine code, I think 'if' is slower in every (compiled) language, unless its condition is true almost all the time.

If you just want to flip the values, then do:
array[i] = !array[i];
Performance with this is actually worse, though: instead of a single unconditional write, it has to read the current value and then write back its negation.
If you fill a 1,000,000-element array with a true, false, true, false pattern, the flip is slower: (b = !b) touches each element twice (a read and a write) instead of once.
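To make that concrete, a small Java sketch of the two loops being contrasted (names mine); the flip needs a load and a store per element, the plain assignment only a store:
public class FlipVsAssign {
    public static void main(String[] args) {
        boolean[] b = new boolean[1_000_000];
        for (int i = 0; i < b.length; i++)
            b[i] = !b[i];   // read, negate, write: two memory accesses per element
        for (int i = 0; i < b.length; i++)
            b[i] = true;    // write only: one memory access per element
    }
}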

Related

Are boolean operations slower than mathematical operations in loops?

I tried to find something about this kind of operation but couldn't find specific information about my question... It's simple: are boolean operations slower than typical math operations in loops?
For example, this can be seen when working with some kind of sorting: the method will iterate and compare X with Y... But is this slower than a summation or subtraction loop?
Example:
Boolean comparisons
for (int i = 1; i < Vector.Length; i++)
    if (Vector[i-1] < Vector[i]) { /* ... */ }
Versus summation:
Double sum = 0;
for(int i=0; i<Vector.Length; i++) sum += Vector[i];
(Talking about loops with a large number of iterations.)
Which is faster for the processor to complete?
Do booleans require more operations in order to return "true" or "false" ?
Short version
There is no correct answer because your question is not specific enough (the two examples of code you give don't achieve the same purpose).
If your question is:
Is bool isGreater = (a > b); slower or faster than int sum = a + b;?
Then the answer would be: It's about the same unless you're very very very very very concerned about how many cycles you spend, in which case it depends on your processor and you need to read its documentation.
If your question is:
Is the first example I gave going to iterate slower or faster than the second example?
Then the answer is: It's going to depend primarily on the values the array contains, but also on the compiler, the processor, and plenty of other factors.
Longer version
On most processors a boolean operation has no reason to be significantly slower or faster than an addition: both are basic instructions, even though a comparison may take two of them (subtracting, then comparing to zero). The number of cycles it takes to decode an instruction depends on the processor and might differ, but a few cycles won't make a lot of difference unless you're in a critical loop.
In the example you give, though, the if condition could potentially be harmful because of instruction pipelining. Modern processors try very hard to guess what the next bunch of instructions will be so they can pre-fetch them and work on them in parallel. If there is branching, the processor doesn't know whether it will have to execute the then or the else part, so it guesses based on previous outcomes.
If the result of your condition is the same most of the time, the processor will likely guess it right and all goes well. But if the result keeps changing, the processor won't guess correctly. When such a branch misprediction happens, it has to throw away the contents of the pipeline and start over, because it just realized its guess was moot. That. does. hurt.
You can try it yourself: measure the time it takes to run your loop over a million elements when they are of same, increasing, decreasing, alternating, or random value.
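A hedged Java sketch of exactly that experiment (names are mine; a rigorous version would use JMH with proper warm-up). The loop body is identical in both runs; only the predictability of the branch changes:
import java.util.Arrays;
import java.util.Random;

public class BranchBench {
    // Sum the elements >= 128: the branch is unpredictable on random data
    // but almost perfectly predictable once the data is sorted.
    static long sumAbove(int[] data) {
        long sum = 0;
        for (int x : data)
            if (x >= 128) sum += x;
        return sum;
    }

    static double millis(int[] data) {
        long start = System.nanoTime();
        long sum = 0;
        for (int rep = 0; rep < 100; rep++)
            sum += sumAbove(data);
        long elapsed = System.nanoTime() - start;
        if (sum == 42) System.out.println("unlikely");   // keep the JIT from discarding the work
        return elapsed / 1e6;
    }

    public static void main(String[] args) {
        int[] random = new Random(42).ints(1_000_000, 0, 256).toArray();
        int[] sorted = random.clone();
        Arrays.sort(sorted);
        System.out.println("random data: " + millis(random) + " ms");
        System.out.println("sorted data: " + millis(sorted) + " ms");
    }
}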
Which leads me to the conclusion: processors have become seriously complex beasts and there are no golden answers, just rules of thumb, so you need to measure and profile. You can read what other people have measured, though, to get an idea of what you should or should not do.
Have fun experimenting. :)

Unexpected slowdown of function that modifies array in-place

This bug is due to Matlab being too smart for its own good.
I have something like
for k = 1:N
    stats = subfun(E,k,stats);
end
where stats is a 1xN array, N=5000 say, and subfun calculates stats(k) from E and fills it into stats:
function stats = subfun(E,k,stats)
    s = mean(E);
    stats(k) = s;
end
Of course, there is some overhead in passing a large array back and forth only to fill in one of its elements. In my case, however, the overhead is negligible, and I prefer this code instead of
for k = 1:N
    s = subfun(E,k);
    stats(k) = s;
end
I prefer it this way because I actually have a lot more assignments than just stats, and some of them are a good deal more complicated.
As mentioned, the overhead is negligible. But if I do something trivial, like this inconsequential if-statement,
for k = 1:N
    i = k;
    if i >= 1
        stats = subfun(E,i,stats);
    end
end
the assignments that take place inside subfun suddenly take "forever" (the time increases much faster than linearly with N). And it's the assignment, not the calculation, that takes forever. In fact, it is even worse than the following nonsensical subfun,
function stats = subfun(E,k,stats)
    s = calculation_on_E(E);
    clear stats
    stats(k) = s;
end
which requires re-allocation of stats every time.
Does anybody have the faintest idea why this happens?
This might be due to some obscure detail of Matlab's JIT. The JIT of recent versions of Matlab knows not to create a new array, but to do modifications in-place in some limited cases. One of the requirements is that the function is defined as
function x = modify_big_matrix(x, i, j)
    x(i, j) = 123;
and not as
function x_out = modify_big_matrix(x_in, i, j)
    x_out = x_in;
    x_out(i, j) = 123;
Your examples seem to follow this rule, so, as Praetorian mentioned, your if statement might prevent the JIT from recognizing that it is an in-place operation.
If you really need to speed up your algorithm, it is possible to modify arrays in-place using your own mex-functions. I have successfully used this trick to gain a factor of 4 speedup on some medium sized arrays (order 100x100x100 IIRC). This is however not recommended, could segfault Matlab if you are not careful and might stop working in future versions.
As discussed by others, the problem almost certainly lies with JIT and its relatively fragile ability to modify in place.
As mentioned, I really prefer the first form of the function call and assignments, although other workable solutions have been suggested. Without relying on JIT, the only way this can be efficient (as far as I can see) is some form of passing by reference.
Therefore I made a class Stats that inherits from handle and contains the data arrays; it is then effectively passed by reference.
For future reference, this seems to work very well, with good performance, and I'm currently using it as my working solution.

Do speeds of if statements in a repetitive loop affect overall performance?

If I have code that will take a while to execute, printing out results every iteration will slow down the program a lot. To still receive occasional output to check on the progress of the code, I might have:
if (i % 10000 == 0) {
    # print progress here
}
Does the if check on every iteration slow it down at all? Should I just drop the output and wait; would that make it noticeably faster?
Also, is it faster to do: (i % 10000 == 0) or (i == 10000)?
Is checking equality or modulus faster?
In the general case, it won't matter at all.
A slightly longer answer: It won't matter unless the loop is run millions of times and the other statement in it is actually less demanding than an if statement (for example, a simple multiplication etc.). In that case, you might see a slight performance drop.
Regarding (i % 10000 == 0) vs. (i == 10000), the latter is obviously faster, because it only compares, whereas the former does a (fairly costly) modulo operation plus a comparison.
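If the modulus ever did show up in a profile, the usual workaround is a countdown counter; a hedged Java sketch (names mine):
public class ProgressCounter {
    public static void main(String[] args) {
        final int INTERVAL = 10_000;
        int untilPrint = INTERVAL;
        for (long i = 0; i < 1_000_000L; i++) {
            // ... the real work of the loop goes here ...
            if (--untilPrint == 0) {   // one decrement and one compare, no division
                System.out.println("progress: " + i);
                untilPrint = INTERVAL;
            }
        }
    }
}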
That said, neither an if statement nor a modulus check will make any difference if your loop doesn't take up 90% of the program's running time, which is usually the case only at school :). You probably spent more time asking this question than you would have saved by not printing anything. For development and debugging, printing this way is not a bad way to go.
The golden rule for this kind of decisions:
Write the most readable and explicit code you can imagine to do the thing you want it to do. If you have a performance problem, look at wrong data structures and algorithmic choices first. If you have done all those and need a really quick program, profile it to see which part takes the most time. After all that, you're allowed to do this kind of low-level guessing.

Is it quicker to access a local variable than an attribute of an object?

Straightforward language-agnostic question. I've always done this:
myVar = myObj.myAttribute
when I need to access myAttribute a lot.
I'm wondering if this is just a superstition I've acquired, or if it's generally faster?
Edit: I would also like to know if this
myVar = myObj.myAttribute/100
for (i = 0; i < 100; i++) {
    print myVar*i;
}
is more efficient than putting (myObj.myAttribute/100) in the loop. Will modern compilers and interpreters detect that that part of the equation doesn't vary?
In this particular case what you did is more efficient, since it's one division vs 100.
I assign properties to variables only when I can optimize the operations done later, as in your case, or when I expect multiple calls to the same property and the lookup is likely to be expensive. Generally, using a local variable is the more CPU-wise way, since complex property lookups can be costly, and it gives you better control of the value and a chance to pre-validate it before looping. That said, it can be counterproductive when the lookup only occurs once or twice per call: then it just adds overhead and makes the code harder to follow.
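In Java terms, the hoisted version looks like this (a sketch with made-up names; modern JITs can often hoist such loop-invariant expressions themselves, but only when they can prove the value cannot change mid-loop):
public class HoistSketch {
    static class MyObj { double myAttribute = 4200.0; }

    public static void main(String[] args) {
        MyObj myObj = new MyObj();

        // Hoisted: one field load and one division in total.
        double myVar = myObj.myAttribute / 100;
        for (int i = 0; i < 100; i++) {
            System.out.println(myVar * i);
        }
        // Written inside the loop, it would be 100 loads and 100 divisions,
        // unless the JIT proves myObj.myAttribute is loop-invariant.
    }
}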
I suppose it might depend on the language and/or the compiler; but, generally speaking, the less your code has to do, the faster it'll be.
But the difference shouldn't be that important... and what matters most is that people are able to understand your code easily.
In JavaScript, for instance, it's said to be faster to use a local variable than to re-resolve the object access several times.
i.e. this :
var a = obj.a.b.c;
a.a = 10;
a.b = 20;
a.c = 30;
is faster than that :
obj.a.b.c.a = 10;
obj.a.b.c.b = 20;
obj.a.b.c.c = 30;
As a rule, depending on the language, maybe.
You are unlikely to notice the difference however, unless you are running (for example) a tight loop.
Usually I would say the savings are not worth the extra cognitive load on the programmer.
However if you have a bit of code which you know has a slowness problem, this kind of optimisation is definitely worth considering.

Why should recursion be preferred over iteration?

Iteration is more performant than recursion, right? Then why do some people opine that recursion is better (more elegant, in their words) than iteration? I really don't see why some languages like Haskell do not allow iteration and encourage recursion. Isn't it absurd to encourage something that has bad performance (and that too when a more performant option, i.e. iteration, is available)? Please shed some light on this. Thanks.
Iteration is more performant than recursion, right?
Not necessarily.
This conception comes from many C-like languages, where calling a function, recursive or not, had a large overhead and created a new stackframe for every call.
For many languages this is not the case, and recursion is equally or more performant than an iterative version. These days, even some C compilers rewrite some recursive constructs to an iterative version, or reuse the stack frame for a tail recursive call.
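To make that rewrite concrete, here is a hedged Java sketch of the transformation (Java itself does not guarantee tail-call elimination, so the "after" version is written out by hand):
public class TailCall {
    // Tail-recursive form: the recursive call is the very last action.
    static long sumTo(long n, long acc) {
        if (n == 0) return acc;
        return sumTo(n - 1, acc + n);
    }

    // What a tail-call-optimizing compiler effectively turns it into:
    // the call becomes a jump, reusing the same stack frame.
    static long sumToLoop(long n, long acc) {
        while (n != 0) {
            acc += n;
            n -= 1;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(10_000, 0));          // fine at this depth
        System.out.println(sumToLoop(10_000_000, 0));  // the recursive form would overflow the stack here
    }
}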
Try implementing depth-first search recursively and iteratively and tell me which one gave you an easier time of it. Or merge sort. For a lot of problems it comes down to explicitly maintaining your own stack vs. leaving your data on the function stack.
I can't speak to Haskell as I've never used it, but this is to address the more general part of the question posed in your title.
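To give a flavor of that trade-off, a hedged Java sketch of the same depth-first traversal both ways (the tree type and names are mine):
import java.util.ArrayDeque;
import java.util.Deque;

public class DfsBothWays {
    static class Node {
        int value;
        Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value; this.left = left; this.right = right;
        }
    }

    // Recursive: the call stack holds the pending nodes for us.
    static void dfsRecursive(Node n) {
        if (n == null) return;
        System.out.println(n.value);
        dfsRecursive(n.left);
        dfsRecursive(n.right);
    }

    // Iterative: we maintain the stack explicitly.
    static void dfsIterative(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        if (root != null) stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            System.out.println(n.value);
            if (n.right != null) stack.push(n.right);  // push right first so left is visited first
            if (n.left != null) stack.push(n.left);
        }
    }

    public static void main(String[] args) {
        Node tree = new Node(1, new Node(2, null, null), new Node(3, null, null));
        dfsRecursive(tree);   // prints 1 2 3
        dfsIterative(tree);   // prints 1 2 3
    }
}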
Haskell does not allow iteration because iteration involves mutable state (the loop index).
As others have stated, there's nothing intrinsically less performant about recursion. There are some languages where it will be slower, but it's not a universal rule.
That being said, to me recursion is a tool, to be used when it makes sense. There are some algorithms that are better represented as recursion (just as some are better via iteration).
Case in point:
fib 0 = 0
fib 1 = 1
fib n = fib(n-1) + fib(n-2)
I can't imagine an iterative solution that could possibly make the intent clearer than that.
Here is some information on the pros and cons of recursion and iteration in C:
http://www.stanford.edu/~blp/writings/clc/recursion-vs-iteration.html
For me, recursion is sometimes easier to understand than iteration.
Iteration is just a special form of recursion.
Recursion is one of those things that seem elegant or efficient in theory, but in practice it is generally less efficient (unless the compiler or dynamic recompiler is changing what the code does). In general, anything that causes unnecessary subroutine calls is going to be slower, especially when more than one argument is being pushed/popped. Anything you can do to remove processor cycles, i.e. instructions the processor has to chew on, is fair game. Compilers can do a pretty good job of this these days in general, but it's always good to know how to write efficient code by hand.
Several things:
Iteration is not necessarily faster
Root of all evil: encouraging something just because it might be moderately faster is premature; there are other considerations.
Recursion often communicates your intent much more succinctly and clearly
By eschewing mutable state generally, functional programming languages are easier to reason about and debug, and recursion is an example of this.
Recursion takes more memory than iteration, since every nested call adds a stack frame.
I don't think there's anything intrinsically less performant about recursion - at least in the abstract. Recursion is a special form of iteration. If a language is designed to support recursion well, it's possible it could perform just as well as iteration.
In general, recursion makes you explicit about the state you're bringing forward to the next iteration (it's in the parameters). This can make it easier for language processors to parallelize execution. At least, that's a direction language designers are trying to exploit.
At a low level, ITERATION deals with the CX register to count loops, and of course data registers.
RECURSION not only deals with that, it also pushes references onto the stack to keep track of the previous calls and how to get back.
My university teacher told me that whatever you do with recursion can be done with iteration and vice versa, but sometimes it's simpler to do it with recursion (more elegant), though performance-wise it's better to use iteration.
In Java, recursive solutions generally outperform non-recursive ones. In C it tends to be the other way around. I think this holds in general for adaptively compiled languages vs. ahead-of-time compiled languages.
Edit:
By "generally" I mean something like a 60/40 split. It is very dependent on how efficiently the language handles method calls. I think JIT compilation favors recursion because it can choose how to handle inlining and use runtime data in optimization. It's very dependent on the algorithm and compiler in question though. Java in particular continues to get smarter about handling recursion.
Quantitative study results with Java (PDF link). Note that these are mostly sorting algorithms, and are using an older Java Virtual Machine (1.5.x if I read right). They sometimes get a 2:1 or 4:1 performance improvement by using the recursive implementation, and rarely is recursion significantly slower. In my personal experience, the difference isn't often that pronounced, but a 50% improvement is common when I use recursion sensibly.
I find it hard to reason that one is better than the other all the time.
I'm working on a mobile app that needs to do background work on the user's file system. One of the background threads needs to sweep the whole file system from time to time to keep the data shown to the user up to date. In fear of a stack overflow, I had written an iterative algorithm. Today I wrote a recursive one for the same job. To my surprise, the iterative algorithm is faster: recursive -> 37 s, iterative -> 34 s (working over the exact same file structure).
Recursive:
private long recursive(File rootFile, long counter) {
    long duration = 0;
    sendScanUpdateSignal(rootFile.getAbsolutePath());
    if (rootFile.isDirectory()) {
        File[] files = getChildren(rootFile, MUSIC_FILE_FILTER);
        for (int i = 0; i < files.length; i++) {
            duration += recursive(files[i], counter);
        }
        if (duration != 0) {
            dhm.put(rootFile.getAbsolutePath(), duration);
            updateDurationInUI(rootFile.getAbsolutePath(), duration);
        }
    }
    else if (!rootFile.isDirectory() && checkExtension(rootFile.getAbsolutePath())) {
        duration = getDuration(rootFile);
        dhm.put(rootFile.getAbsolutePath(), getDuration(rootFile));
        updateDurationInUI(rootFile.getAbsolutePath(), duration);
    }
    return counter + duration;
}
Iterative: an iterative depth-first search, with recursive backtracking.
private void traversal(File file) {
    int pointer = 0;
    File[] files;
    boolean hadMusic = false;
    long parentTimeCounter = 0;
    while (file != null) {
        sendScanUpdateSignal(file.getAbsolutePath());
        try {
            Thread.sleep(Constants.THREADS_SLEEP_CONSTANTS.TRAVERSAL);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        files = getChildren(file, MUSIC_FILE_FILTER);
        if (!file.isDirectory() && checkExtension(file.getAbsolutePath())) {
            hadMusic = true;
            long duration = getDuration(file);
            parentTimeCounter = parentTimeCounter + duration;
            dhm.put(file.getAbsolutePath(), duration);
            updateDurationInUI(file.getAbsolutePath(), duration);
        }
        if (files != null && pointer < files.length) {
            file = getChildren(file, MUSIC_FILE_FILTER)[pointer];
        }
        else if (files != null && pointer + 1 < files.length) {
            file = files[pointer + 1];
            pointer++;
        }
        else {
            pointer = 0;
            file = getNextSybling(file, hadMusic, parentTimeCounter);
            hadMusic = false;
            parentTimeCounter = 0;
        }
    }
}
private File getNextSybling(File file, boolean hadMusic, long timeCounter) {
    File result = null;
    // if the file is /mnt, stop
    if (file.getAbsolutePath().compareTo(userSDBasePointer.getParentFile().getAbsolutePath()) == 0) {
        return result;
    }
    File parent = file.getParentFile();
    long parentDuration = 0;
    if (hadMusic) {
        if (dhm.containsKey(parent.getAbsolutePath())) {
            long savedValue = dhm.get(parent.getAbsolutePath());
            parentDuration = savedValue + timeCounter;
        }
        else {
            parentDuration = timeCounter;
        }
        dhm.put(parent.getAbsolutePath(), parentDuration);
        updateDurationInUI(parent.getAbsolutePath(), parentDuration);
    }
    // look for the next sibling
    File[] syblings = getChildren(parent, MUSIC_FILE_FILTER);
    for (int i = 0; i < syblings.length; i++) {
        if (syblings[i].getAbsolutePath().compareTo(file.getAbsolutePath()) == 0) {
            if (i + 1 < syblings.length) {
                result = syblings[i + 1];
            }
            break;
        }
    }
    // backtracking: add the parent if it had music children
    if (result == null) {
        result = getNextSybling(parent, hadMusic, parentDuration);
    }
    return result;
}
Sure, the iterative version isn't elegant, and although it's currently implemented in an inefficient way, it is still faster than the recursive one. I also have better control over it, as I don't want it running at full speed, and it lets the garbage collector do its work more frequently.
Anyway, I won't take for granted that one method is better than the other, and I will review other algorithms that are currently recursive. But at least of the two algorithms above, the iterative one will be in the final product.
I think it would help to get some understanding of what performance is really about. This link shows how a perfectly reasonably-coded app actually has a lot of room for optimization - namely a factor of 43! None of this had anything to do with iteration vs. recursion.
When an app has been tuned that far, it gets to the point where the cycles saved by iteration versus recursion might actually make a difference.
Recursion is the typical implementation of iteration. It's just a lower level of abstraction (at least in Python):
class iterator(object):
    def __init__(self, max):
        self.count = 0
        self.max = max

    def __iter__(self):
        return self

    # In Python 2 this method was called next; in Python 3 it is __next__
    def __next__(self):
        if self.count == self.max:
            raise StopIteration
        else:
            self.count += 1
            return self.count - 1

# At this level, iteration is the name of the game, but
# in the implementation, recursion is clearly what's happening.
for i in iterator(50):
    print(i)
I would compare recursion with an explosive: you can achieve a big result in no time, but if you use it without caution the result can be disastrous.
I was very impressed by the proof of complexity for the recursion that calculates Fibonacci numbers here. The recursion in that case takes time at least proportional to (3/2)^n (the exact growth is Θ(φ^n), φ ≈ 1.618), while iteration is just O(n). Calculating n=46 with recursion written in C# takes half a minute! Wow...
IMHO recursion should be used only when the nature of the entities is well suited to recursion (trees, syntax parsing, ...) and never for aesthetics alone. The performance and resource consumption of any "divine" recursive code needs to be scrutinized.
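A hedged Java sketch contrasting the two (exact timings will vary by machine, but the exponential blow-up is easy to reproduce):
public class Fib {
    // Naive recursion: the number of calls grows as Theta(phi^n), phi ~ 1.618.
    static long fibRec(int n) {
        return n < 2 ? n : fibRec(n - 1) + fibRec(n - 2);
    }

    // Iteration: O(n) additions.
    static long fibIter(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long t = a + b;
            a = b;
            b = t;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(fibIter(46));  // instant: 1836311903
        System.out.println(fibRec(46));   // billions of calls: takes tens of seconds
    }
}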
Iteration is more performant than recursion, right?
Yes.
However, when you have a problem which maps perfectly to a recursive data structure, the better solution is always recursive.
If you try to solve the problem with iteration, you'll end up reinventing the stack and creating messier, uglier code than the elegant recursive version.
That said, iteration will always be faster than recursion (on a von Neumann architecture), so if you always use recursion, even where a loop would suffice, you'll pay a performance penalty.
Is recursion ever faster than looping?
"Iteration is more performant than recursion" is really language- and/or compiler-specific. The case that comes to mind is when the compiler does loop-unrolling. If you've implemented a recursive solution in this case, it's going to be quite a bit slower.
This is where it pays to be a scientist (testing hypotheses) and to know your tools...
On NTFS, the UNC max path is 32K characters, so under C:\A\B\X\C... more than 16K nested folders can be created.
But you cannot even count that many folders with any recursive method; sooner or later they all give a stack overflow.
Only good, lightweight iterative code should be used to scan folders professionally.
Believe it or not, most top antivirus products cannot scan UNC folders at their maximum depth.
