if(X){if(Y){ versus if(X&&Y). Question about efficiency - performance

I'm sure this is a very simple question, but I'm not really sure what search parameters to use, so I'm going to ask here.
Let's say I have code I want to execute when both X and Y are true, and I want it to be as efficient as possible. There are two ways I know to go about this;
if(X)
if(Y)
//do stuff
Or there is:
if(X && Y)
//do stuff
What I'm curious to know is how this code is actually read and executed at runtime? Is it more efficient to not check Y at all if X isn't true? Or is it more efficient to execute checks for X and Y at the same time? Obviously the second is more readable for humans, but if the only goal is efficiency, which is better?
Thanks!

The 1st case is more effective than the 2nd one.
The ASM code for the 1st case will be below type.
ax = A
if ax=0 goto address
ax = B
if ax = 0 goto adress
Do something
address:
..........................
but the ASM code for the 2nd case will be below type.
ax = A
bx = B
ax = A & B
if ax=0 goto address
Do something
address:
...........................

In many (most?) languages, the two will be exactly the same.
If you care about performance, I assume you're using a good, optimizing compiler. Any optimizing compiler should realize that when you write
if(X && Y)
then if X is false, it doesn't need to waste time evaluating Y.
This so-called short circuiting behavior is part of the definition of C and C-like languages.
Also, if you care about efficiency and performance, issues like this one are probably the last thing you should be worrying about. It's very important that you pick a good language, and a good compiler for it, and choose good algorithms. Fussing with low-level coding details, like if (X) if(Y) versus if(X && Y), usually won't make much of a difference.
Also, if you really care about efficiency and performance, the only way to really answer questions like these is to perform careful measurements, using your code and your compiler and your computer, today.
And it can actually be quite difficult to perform good measurements. Often, the two options will seem to have identical performance, which is another clue that "microoptimizations" usually don't matter at all.

Related

How to understand the recursive search in Prolog?

Here is a section of Prolog code defining numeral in a recursive way:
numeral(0).
numeral(succ(X)) :- numeral(X).
When given query numeral(X). Prolog will return:
X = 0 ;
X = succ(0) ;
X = succ(succ(0)) ;
X = succ(succ(succ(0))) ;
X = succ(succ(succ(succ(0)))) ;
X = succ(succ(succ(succ(succ(0))))) ;
X = succ(succ(succ(succ(succ(succ(0)))))) ;
X = succ(succ(succ(succ(succ(succ(succ(0))))))) ;
X = succ(succ(succ(succ(succ(succ(succ(succ(0))))))))
yes
Based on what I have learned, when doing the query, prolog will firstly make X into a variable like (_G42), then it will search the facts and rules to find the match.
In this case, it will find 0 (fact) as a right match. Then it will also try to match the rule. That is considering _G42 is not 0, and _G42 is the succ of another number. Thus, another variable is generated(like _G44), _G44 will match 0 and will also go further like _G42. Since _G44 matches 0, then it will go backward to _G42, getting _G42 = succ(_G44) = succ(0).
I am not sure if I am right about the understanding. I made a diagram to show my comprehension on this problem.
If the analysis is correct, I still feel difficult to design the recursive function like this. Since I am new to Prolog, I want to know if this kind of definition always used in application (say building an expert system, verifying protocols) or it is just for beginners to better understanding the basic searching procedure? If it is often used, what is the key point to design this kind of recursive definition?
My personal opinion: Especially as a beginner, you have zero chance to"understand the recursive search in Prolog". Countless beginners are trying to understand Prolog in this way, and they very consistently fail.
The sad part is that this hits hardest workers the hardest: You always think you can somehow understand it, but in the end, you cannot, because there are too many ways to invoke even the simplest predicates, with uninstantiated and (partly) instantiated arguments, and even with aliased variables.
Your graph nicely illustrates that such a procedural reading gets extremely unwieldy very quickly for even the simplest conceivable recursive definitions.
A much more tractable approach for understanding the predicate is to read it declaratively:
0 is a numeral
If X is a numeral (whatever X is!), then succ(X) of X is also a numeral.
Note that :- even means ←, i.e., an implication from right to left.
My recommendation is to focus on a clear declarative description of what ought to hold. To overcome the initial barriers with Prolog, you must let go the idea that you can trace the steps that the CPU performs in the extreme detail in which you are currently trying to follow it. Prolog is too high-level to be amenable to tracing in this low-level way. It is like trying to interpret between French and English by tracing only the neuronal activities of the speakers.
Write a clear definition and then leave the search to Prolog. There are many other and working ways to understand and break down declarative definitions without getting swamped in low-level details. See for example program-slicing and failure-slicing. They work as long as you stay in the so-called pure monotonic subset of Prolog. Focus on this area, and you will be able to make very fast progress.

Is it worth it to rewrite an if statement to avoid branching?

Recently I realized I have been doing too much branching without caring the negative impact on performance it had, therefore I have made up my mind to attempt to learn all about not branching. And here is a more extreme case, in attempt to make the code to have as little branch as possible.
Hence for the code
if(expression)
A = C; //A and C have to be the same type here obviously
expression can be A == B, or Q<=B, it could be anything that resolve to true or false, or i would like to think of it in term of the result being 1 or 0 here
I have come up with this non branching version
A += (expression)*(C-A); //Edited with thanks
So my question would be, is this a good solution that maximize efficiency?
If yes why and if not why?
Depends on the compiler, instruction set, optimizer, etc. When you use a boolean expression as an int value, e.g., (A == B) * C, the compiler has to do the compare, and the set some register to 0 or 1 based on the result. Some instruction sets might not have any way to do that other than branching. Generally speaking, it's better to write simple, straightforward code and let the optimizer figure it out, or find a different algorithm that branches less.
Jeez, no, don't do that!
Anyone who "penalize[s] [you] a lot for branching" would hopefully send you packing for using something that awful.
How is it awful, let me count the ways:
There's no guarantee you can multiply a quantity (e.g., C) by a boolean value (e.g., (A==B) yields true or false). Some languages will, some won't.
Anyone casually reading it is going observe a calculation, not an assignment statement.
You're replacing a comparison, and a conditional branch with two comparisons, two multiplications, a subtraction, and an addition. Seriously non-optimal.
It only works for integral numeric quantities. Try this with a wide variety of floating point numbers, or with an object, and if you're really lucky it will be rejected by the compiler/interpreter/whatever.
You should only ever consider doing this if you had analyzed the runtime properties of the program and determined that there is a frequent branch misprediction here, and that this is causing an actual performance problem. It makes the code much less clear, and its not obvious that it would be any faster in general (this is something you would also have to measure, under the circumstances you are interested in).
After doing research, I came to the conclusion that when there are bottleneck, it would be good to include timed profiler, as these kind of codes are usually not portable and are mainly used for optimization.
An exact example I had after reading the following question below
Why is it faster to process a sorted array than an unsorted array?
I tested my code on C++ using that, that my implementation was actually slower due to the extra arithmetics.
HOWEVER!
For this case below
if(expression) //branched version
A += C;
//OR
A += (expression)*(C); //non-branching version
The timing was as of such.
Branched Sorted list was approximately 2seconds.
Branched unsorted list was aproximately 10 seconds.
My implementation (whether sorted or unsorted) are both 3seconds.
This goes to show that in an unsorted area of bottleneck, when we have a trivial branching that can be simply replaced by a single multiplication.
It is probably more worthwhile to consider the implementation that I have suggested.
** Once again it is mainly for the areas that is deemed as the bottleneck **

What's more costly on current CPUs: arithmetic operations or conditionals?

20-30 years ago arithmetic operations like division were one of the most costly operations for CPUs. Saving one division in a piece of repeatedly called code was a significant performance gain. But today CPUs have fast arithmetic operations and since they heavily use instruction pipelining, conditionals can disrupt efficient execution. If I want to optimize code for speed, should I prefer arithmetic operations in favor of conditionals?
Example 1
Suppose we want to implement operations modulo n. What will perform better:
int c = a + b;
result = (c >= n) ? (c - n) : c;
or
result = (a + b) % n;
?
Example 2
Let's say we're converting 24-bit signed numbers to 32-bit. What will perform better:
int32_t x = ...;
result = (x & 0x800000) ? (x | 0xff000000) : x;
or
result = (x << 8) >> 8;
?
All the low hanging fruits are already picked and pickled by authors of compilers and guys who build hardware. If you are the kind of person who needs to ask such question, you are unlikely to be able to optimize anything by hand.
While 20 years ago it was possible for a relatively competent programmer to make some optimizations by dropping down to assembly, nowadays it is the domain of experts, specializing in the target architecture; also, optimization requires not only knowing the program, but knowing the data it will process. Everything comes down to heuristics, tests under different conditions etc.
Simple performance questions no longer have simple answers.
If you want to optimise for speed, you should just tell your compiler to optimise for speed. Modern compilers will generally outperform you in this area.
I've sometimes been surprised trying to relate assembly code back to the original source for this very reason.
Optimise your source code for readability and let the compiler do what it's best at.
I expect that in example #1, the first will perform better. The compiler will probably apply some bit-twiddling trick to avoid a branch. But you're taking advantage of knowledge that it's extremely unlikely that the compiler can deduce: namely that the sum is always in the range [0:2*n-2] so a single subtraction will suffice.
For example #2, the second way is both faster on modern CPUs and simpler to follow. A judicious comment would be appropriate in either version. (I wouldn't be surprised to see the compiler convert the first version into the second.)

Iterative solving for unknowns in a fluids problem

I am a Mechanical engineer with a computer scientist question. This is an example of what the equations I'm working with are like:
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
The situation is this:
I need r to find x, but I need x to find z. I also need x to find f which is a part of finding z. So I guess a value for x, and then I use that value to find r and f. Then I go back and use the value I found for r and f to find x. I keep doing this until the guess and the calculated are the same.
My question is:
How do I get the computer to do this? I've been using mathcad, but an example in another language like C++ is fine.
The very first thing you should do faced with iterative algorithms is write down on paper the sequence that will result from your idea:
Eg.:
x_0 = ..., f_0 = ..., r_0 = ...
x_1 = ..., f_1 = ..., r_1 = ...
...
x_n = ..., f_n = ..., r_n = ...
Now, you have an idea of what you should implement (even if you don't know how). If you don't manage to find a closed form expression for one of the x_i, r_i or whatever_i, you will need to solve one dimensional equations numerically. This will imply more work.
Now, for the implementation part, if you never wrote a program, you should seriously ask someone live who can help you (or hire an intern and have him write the code). We cannot help you beginning from scratch with, eg. C programming, but we are willing to help you with specific problems which should arise when you write the program.
Please note that your algorithm is not guaranteed to converge, even if you strongly think there is a unique solution. Solving non linear equations is a difficult subject.
It appears that mathcad has many abstractions for iterative algorithms without the need to actually implement them directly using a "lower level" language. Perhaps this question is better suited for the mathcad forums at:
http://communities.ptc.com/index.jspa
If you are using Mathcad, it has the functionality built in. It is called solve block.
Start with the keyword "given"
Given
define the guess values for all unknowns
x:=2
f:=3
r:=2
...
define your constraints
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
calculate the solution
find(x, y, z, r, ...)=
Check Mathcad help or Quicksheets for examples of the exact syntax.
The simple answer to your question is this pseudo-code:
X = startingX;
lastF = Infinity;
F = 0;
tolerance = 1e-10;
while ((lastF - F)^2 > tolerance)
{
lastF = F;
X = ?;
R = ?;
F = FunctionOf(X,R);
}
This may not do what you expect at all. It may give a valid but nonsense answer or it may loop endlessly between alternate wrong answers.
This is standard substitution to convergence. There are more advanced techniques like DIIS but I'm not sure you want to go there. I found this article while figuring out if I want to go there.
In general, it really pays to think about how you can transform your problem into an easier problem.
In my experience it is better to pose your problem as a univariate bounded root-finding problem and use Brent's Method if you can
Next worst option is multivariate minimization with something like BFGS.
Iterative solutions are horrible, but are more easily solved once you think of them as X2 = f(X1) where X is the input vector and you're trying to reduce the difference between X1 and X2.
As the commenters have noted, the mathematical aspects of your question are beyond the scope of the help you can expect here, and are even beyond the help you could be offered based on the detail you posted.
However, I think that even if you understood the mathematics thoroughly there are computer science aspects to your question that should be addressed.
When you write your code, try to make organize it into functions that depend only upon the parameters you are passing in to a subroutine. So write a subroutine that takes in values for y, z, and r and returns you x. Make another that takes in f,L,D,G and returns z. Now you have testable routines that you can check to make sure they are computing correctly. Check the input values to your routines in the routines - for instance in computing x you will get a divide by 0 error if you pass in a 0 for r. Think about how you want to handle this.
If you are going to solve this problem interatively you will need a method that will decide, based on the results of one iteration, what the values for the next iteration will be. This also should be encapsulated within a subroutine. Now if you are using a language that allows only one value to be returned from a subroutine (which is most common computation languages C, C++, Java, C#) you need to package up all your variables into some kind of data structure to return them. You could use an array of reals or doubles, but it would be nicer to choose to make an object and then you can reference the variables by their name and not their position (less chance of error).
Another aspect of iteration is knowing when to stop. Certainly you'll do so when you get a solution that converges. Make this decision into another subroutine. Now when you need to change the convergence criteria there is only one place in the code to go to. But you need to consider other reasons for stopping - what do you do if your solution starts diverging instead of converging? How many iterations will you allow the run to go before giving up?
Another aspect of iteration of a computer is round-off error. Mathematically 10^40/10^38 is 100. Mathematically 10^20 + 1 > 10^20. These statements are not true in most computations. Your calculations may need to take this into account or you will end up with numbers that are garbage. This is an example of a cross-cutting concern that does not lend itself to encapsulation in a subroutine.
I would suggest that you go look at the Python language, and the pythonxy.com extensions. There are people in the associated forums that would be a good resource for helping you learn how to do iterative solving of a system of equations.

How to explain to a developer that adding extra if - else if conditions is not a good way to "improve" readability?

Recently I've bumped into the following C++ code:
if (a)
{
f();
}
else if (b)
{
f();
}
else if (c)
{
f();
}
Where a, b and c are all different conditions, and they are not very short.
I tried to change the code to:
if (a || b || c)
{
f();
}
But the author opposed saying that my change will decrease readability of the code. I had two arguments:
1) You should not increase readability by replacing one branching statement with three (though I really doubt that it's possible to make code more readable by using else if instead of ||).
2) It's not the fastest code, and no compiler will optimize this.
But my arguments did not convince him.
What would you tell a programmer writing such a code?
Do you think complex condition is an excuse for using else if instead of OR?
This code is redundant. It is prone to error.
If you were to replace f(); someday with something else, there is the danger you miss one out.
There may though be a motivation behind that these three condition bodies could one day become different and you sort of prepare for this situation. If there is a strong possibility it will happen, it may be okay to do something of the sort. But I'd advice to follow the YAGNI principle (You Ain't Gonna Need It). Can't say how much bloated code has been written not because of the real need but just in anticipation of it becoming needed tomorrow. Practice shows this does not bring any value during the entire lifetime of an application but heavily increases maintenance overhead.
As to how to approach explaining it to your colleague, it has been discussed numerous times. Look here:
How do you tell someone they’re writing bad code?
How to justify to your colleagues that they produce crappy code?
How do you handle poor quality code from team members?
“Mentor” a senior programmer or colleague without insulting
Replace the three complex conditions with one function, making it obvious why f() should be executed.
bool ShouldExecute; { return a||b||c};
...
if ShouldExecute {f();};
Since the conditions are long, have him do this:
if ( (aaaaaaaaaaaaaaaaaaaaaaaaaaaa)
|| (bbbbbbbbbbbbbbbbbbbbbbbbbbbb)
|| (cccccccccccccccccccccccccccc) )
{
f();
}
A good compiler might turn all of these into the same code anyway, but the above is a common construct for this type of thing. 3 calls to the same function is ugly.
In general I think you are right in that if (a || b || c) { f(); } is easier to read. He could also make good use of whitespace to help separate the three blocks.
That said, I would be interested to see what a, b, c, and f look like. If f is just a single function call and each block is large, I can sort of see his point, although I cringe at violating the DRY principle by calling f three different times.
Performance is not an issue here.
Many people wrap themselves in the flag of "readability" when it's really just a matter of individual taste.
Sometimes I do it one way, sometimes the other. What I'm thinking about is -
"Which way will make it easier to edit the kinds of changes that might have to be made in the future?"
Don't sweat the small stuff.
I think that both of your arguments (as well as Developer Art's point about maintainability) are valid, but apparently your discussion partner is not open for a discussion.
I get the feeling that you are having this discussion with someone who is ranked as more senior. If that's the case, you have a war to fight and this is just one small battle, which is not important for you to win. Instead of spending time arguing about this thing, try to make your results (which will be far better than your discussion partner's if he's writing that kind of kode) speak for themselves. Just make sure that you get credit for your work, not the whole team or someone else.
This is probably not the kind of answer you expected to the question, but I got a feeling that there's something more to it than just this small argument...
I very much doubt there will be any performance gains of this, except at least in a very specific scenario. In this scenario you change a, b, and c, and thus which of the three that triggers the code changes, but the code executes anyhow, then reducing the code to one if-statement might improve, since the CPU might have the code in the branch cache when it gets to it next time. If you triple the code, so that it occupies 3 times the space in the branch cache, there is a higher chance one or more of the paths will be pushed out, and thus you won't have the most performant execution.
This is very low-level, so again, I doubt this will make much of an impact.
As for readability, which one is easier to read:
if something, do this
if something else, do this
if yet another something else, do this
"this" is the same in all three cases
or this:
if something, or something else, or yet another something else, then do this
Place some more code in there, other than just a simple function call, and it starts getting hard to identify that this is actually three identical pieces of code.
Maintainability goes down with the 3 if-statement organization because of this.
You also have duplicated code, almost always a sign of bad structure and design.
Basically, I would tell the programmer that if he has problems reading the 1 if-statement way of writing it, maybe C++ is not what he should be doing.
Now, let's assume that the "a", "b", and "c" parts are really big, so that the OR's in there gets lost in lots of noise with parenthesis or what not.
I would still reorganize the code so that it only called the function (or executed the code in there) in one place, so perhaps this is a compromise?
bool doExecute = false;
if (a) doExecute = true;
if (b) doExecute = true;
if (c) doExecute = true;
if (doExecute)
{
f();
}
or, even better, this way to take advantage of boolean logic short circuiting to avoid evaluating things unnecessary:
bool doExecute = a;
doExecute = doExecute || b;
doExecute = doExecute || c;
if (doExecute)
{
f();
}
Performance shouldn't really even come into question
Maybe later he wont call f() in all 3 conditons
Repeating code doesn't make things clearer, your (a||b||c) is much clearer although maybe one could refactor it even more (since its C++) e.g.
x.f()

Resources