Doing something in an unusual but efficient way - algorithm

I watched a video today, and the guy in it wrote this to check whether a number is even:
number/2*2 == number ? true : false ;
I tried it when I got home and compared it with
number % 2 == 0 ? true : false ;
The second one was faster. Then I changed the first one to:
number>>1<<1 == number ? true : false;
This time, shifting the number once to the right and once to the left was faster :D
The performance difference is not huge, just 0-1 seconds for checking all the numbers
between 1 and 1000000000, but I liked it very much and wanted to hear more tricks like this from you.
So what else? =)
And another idea, from Russell Borogove =)
(number&1) == 0;
Results:
Time Elapsed With And Operation:00:00:07.0504033
Time Elapsed With Shift Operation:00:00:06.4653698
Time Elapsed With Mod Operation:00:00:06.8323908
Surprisingly, shifting twice is faster than a single AND operation on my computer.
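For reference, here is a minimal Python sketch of the four parity checks discussed above, mainly to show that they agree with each other. The function names are made up for illustration, and Python's `//` is floor division, which is assumed to stand in for the integer division in the original expression. Timings in Python would say little about the compiled code measured in the question.

```python
def is_even_mod(n):
    # number % 2 == 0
    return n % 2 == 0

def is_even_div(n):
    # number/2*2 == number, using integer (floor) division
    return n // 2 * 2 == n

def is_even_shift(n):
    # number>>1<<1 == number: drop the lowest bit, then shift back
    return (n >> 1 << 1) == n

def is_even_and(n):
    # (number&1) == 0: test the lowest bit directly
    return (n & 1) == 0

CHECKS = (is_even_mod, is_even_div, is_even_shift, is_even_and)

def all_agree(numbers):
    """Return True if every check gives the same answer for every number."""
    return all(len({check(n) for check in CHECKS}) == 1 for n in numbers)

if __name__ == "__main__":
    print(all_agree(range(-1000, 1001)))
```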

MIT actually keeps a list of such things, HAKMEM, which can be found at http://www.inwap.com/pdp10/hbaker/hakmem/hakmem.html. Most of the programming-related ones are written in assembly language, but I understand that some of them have been translated to C at http://graphics.stanford.edu/~seander/bithacks.html.
Now for a lecture: These dirty tricks might be faster, but they take far too long to comprehend.
Most computing isn't so performance-critical that tricks like this are necessary. In the odd-even case, number % 2 == 0 is much clearer and more readable than number/2*2 == number or number>>1<<1 == number. So in normal applications you should always use the simpler and more standard option, because it will make your code easier to understand and maintain.
However, there are use cases for tricks like this. Especially in large-scale mathematical or scientific computing and computer graphics, tricks like these can save your life. An excellent example of this is John Carmack's "magic inverse square root" in Quake 3.
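To give a feel for what such a trick looks like, here is a rough Python transcription of the widely circulated form of that inverse square root routine. The constant 0x5f3759df and the single Newton step come from the commonly published version, not from this thread, and the function name is made up here. The original works on 32-bit floats in C, so `struct` is used to reinterpret the bits.

```python
import struct

def fast_inverse_sqrt(x):
    """Approximate 1/sqrt(x) for a positive x using the bit-level trick."""
    # Reinterpret the 32-bit float as an unsigned 32-bit integer.
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    # The "magic" step: a cheap first guess from integer arithmetic on the bits.
    i = 0x5f3759df - (i >> 1)
    # Reinterpret the integer bits as a float again.
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    # One Newton-Raphson iteration to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)

if __name__ == "__main__":
    for value in (0.25, 1.0, 4.0, 100.0):
        print(value, fast_inverse_sqrt(value), value ** -0.5)
```

The result is only an approximation (within about a percent of the exact value), which is exactly the kind of trade-off that makes sense in graphics code and rarely anywhere else.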

The book Hacker's Delight is 300 pages of nothing but stuff like this. It's not cheap but it's a bit-twiddler's bible.

Related

Random number generation from 1 to 7

I was going through Google interview questions. One of them is to implement random number generation from 1 to 7.
I wrote some simple code. If this question were asked in an interview and I wrote the code below, would it be acceptable or not?
import time
def generate_rand():
    ret = str(time.time()) # time in second like, 12345.1234
    ret = int(ret[-1])
    if ret == 0 or ret == 1:
        return 1
    elif ret > 7:
        ret = ret - 7
        return ret
    return ret
while 1:
    print(generate_rand())
    time.sleep(1) # Just to see the output in the STDOUT
(Since the question seems to ask for an analysis of issues in the code and not a solution, I am not providing one.)
The answer is unacceptable because:
1. You need to wait a second for each random number. Many applications need a few hundred at a time. (If the sleep is just for convenience, note that even microsecond granularity will not yield truly random numbers, because the last digit of the microsecond count increases monotonically until 10us have passed. You may well get several calls done within a span of 10us, and those calls will return a monotonically increasing run of pseudo-random numbers.)
2. Random numbers should have a uniform distribution: in theory, each value should have the same probability. In this case the mapping is skewed: 1 comes up three times as often as any of the values 3-7 (the digits 0, 1 and 8 all map to it), and 2 comes up twice as often (the digits 2 and 9).
Typically, answers to this sort of question try to get a large range of numbers and divide that range evenly among 1-7. For example, the above method would have worked fine if you had wanted randomness from 1-5, as 10 is evenly divisible by 5. Note that this only solves (2) above.
For (1), there are other sources of randomness, such as /dev/random on Linux.
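A quick way to see the skew is to push every possible last digit through the same branches as the question's generate_rand() and count the results. This is only a sketch: the helper name map_digit is made up, and it assumes the last digit is uniformly distributed over 0-9, which a real clock reading may not be.

```python
from collections import Counter

def map_digit(digit):
    """Apply the question's branches to a single digit 0-9."""
    if digit == 0 or digit == 1:
        return 1
    elif digit > 7:
        return digit - 7
    return digit

# How many of the ten digits land on each result 1-7.
counts = Counter(map_digit(d) for d in range(10))

if __name__ == "__main__":
    print(sorted(counts.items()))
    # [(1, 3), (2, 2), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1)]
```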
You haven't really specified the constraints of the problem you're trying to solve, but if it's from a collection of interview questions it seems likely that it might be something like this.
In any case, the answer shown would not be acceptable for the following reasons:
The distribution of the results is not uniform, even if the samples you read from time.time() are uniform.
The results from time.time() will probably not be uniform. The result depends on the time at which you make the call, and if your calls are not uniformly distributed in time then the results will probably not be uniformly distributed either. In the worst case, if you're trying to randomise an array on a very fast processor then you might complete the entire operation before the time changes, so the whole array would be filled with the same value. Or at least large chunks of it would be.
The changes to the random value are highly predictable and can be inferred from the speed at which your program runs. In the very-fast-computer case you'll get a bunch of x followed by a bunch of x+1, but even if the computer is much slower or the clock is more precise, you're likely to get aliasing patterns which behave in a similarly predictable way.
Since you take the time value in decimal, it's likely that the least significant digit doesn't visit all possible values uniformly. It's most likely a conversion from binary to some arbitrary number of decimal digits, and the distribution of the least significant digit can be quite uneven when that happens.
The code should be much simpler. It's a complicated solution with many special cases, which reflects a piecemeal approach to the problem rather than an understanding of the relevant principles. An ideal solution would make the behaviour self-evident without having to consider each case individually.
The last one would probably end the interview, I'm afraid. Perhaps not if you could tell a good story about how you got there.
You need to understand the pigeonhole principle to begin to develop a solution. It looks like you're reducing the time to its least significant decimal digit for possible values 0 to 9. Legal results are 1 to 7. If you have seven pigeonholes and ten pigeons then you can start by putting your first seven pigeons into one hole each, but then you have three pigeons left. There's nowhere that you can put the remaining three pigeons (provided you only use whole pigeons) such that every hole has the same number of pigeons.
The problem is that if you pick a pigeon at random and ask what hole it's in, the answer is more likely to be a hole with two pigeons than a hole with one. This is what's called "non-uniform", and it causes all sorts of problems, depending on what you need your random numbers for.
You would either need to figure out how to ensure that all holes are filled equally, or you would have to come up with an explanation for why it doesn't matter.
Typically the "doesn't matter" answer is that each hole has either a million or a million and one pigeons in it, and for the scale of problem you're working with the bias would be undetectable.
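One common way to fill all holes equally is to throw away the surplus pigeons: draw again whenever the digit falls outside the first seven values (rejection sampling). The sketch below assumes some uniform source of digits 0-9 is available; digit_source is a made-up stand-in that uses Python's random module, not the clock-based approach from the question.

```python
import random

def digit_source():
    """Stand-in for any uniform source of digits 0-9 (an assumption)."""
    return random.randrange(10)

def rand_1_to_7(next_digit=digit_source):
    """Return a value in 1-7, each equally likely if next_digit is uniform."""
    while True:
        digit = next_digit()
        if digit < 7:          # keep 0-6, one pigeon per hole
            return digit + 1   # shift to 1-7
        # digits 7, 8 and 9 are the leftover pigeons: discard and draw again

if __name__ == "__main__":
    print([rand_1_to_7() for _ in range(10)])
```

The cost is that the number of draws is no longer fixed: on average three in ten digits are discarded.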
Using the same general architecture you've created, I would do something like this:
import time
def generate_rand():
    ret = int(time.time()) # whole seconds; % on the str() of the time would raise a TypeError
    ret = ret % 8 # will return pseudorandom numbers 0-7
    if ret == 0:
        # note: returning 1 here makes 1 twice as likely as the other values
        return 1 # or you could also return the result of another call to generate_rand()
    return ret
while 1:
    print(generate_rand())
    time.sleep(1)

Is it worth it to rewrite an if statement to avoid branching?

Recently I realized I have been doing too much branching without caring about its negative impact on performance, so I have made up my mind to learn all about avoiding branches. Here is a more extreme case, in an attempt to make the code branch as little as possible.
Take the code
if(expression)
    A = C; //A and C have to be the same type here obviously
expression can be A == B, or Q<=B; it could be anything that resolves to true or false, or, as I would like to think of it here, to 1 or 0.
I have come up with this non-branching version
A += (expression)*(C-A); //Edited with thanks
So my question is: is this a good solution that maximizes efficiency?
If yes, why, and if not, why not?
Depends on the compiler, instruction set, optimizer, etc. When you use a boolean expression as an int value, e.g., (A == B) * C, the compiler has to do the compare, and then set some register to 0 or 1 based on the result. Some instruction sets might not have any way to do that other than branching. Generally speaking, it's better to write simple, straightforward code and let the optimizer figure it out, or find a different algorithm that branches less.
Jeez, no, don't do that!
Anyone who "penalize[s] [you] a lot for branching" would hopefully send you packing for using something that awful.
How is it awful, let me count the ways:
There's no guarantee you can multiply a quantity (e.g., C) by a boolean value (e.g., (A==B) yields true or false). Some languages will, some won't.
Anyone casually reading it is going to observe a calculation, not an assignment statement.
You're replacing a comparison and a conditional branch with two comparisons, two multiplications, a subtraction, and an addition. Seriously non-optimal.
It only works for integral numeric quantities. Try this with a wide variety of floating point numbers, or with an object, and if you're really lucky it will be rejected by the compiler/interpreter/whatever.
You should only ever consider doing this if you have analyzed the runtime properties of the program and determined that there is a frequent branch misprediction here, and that this is causing an actual performance problem. It makes the code much less clear, and it's not obvious that it would be any faster in general (this is something you would also have to measure, under the circumstances you are interested in).
After doing some research, I came to the conclusion that when there is a bottleneck, it is worth using a timing profiler, as this kind of code is usually not portable and is mainly used for optimization.
I ran a concrete test after reading the following question:
Why is it faster to process a sorted array than an unsorted array?
I tested my code in C++ using that example, and my implementation was actually slower due to the extra arithmetic.
HOWEVER!
For the case below
if(expression) //branched version
A += C;
//OR
A += (expression)*(C); //non-branching version
The timings were as follows.
Branched, sorted list: approximately 2 seconds.
Branched, unsorted list: approximately 10 seconds.
My implementation (whether sorted or unsorted): 3 seconds in both cases.
This goes to show that in a bottleneck operating on unsorted data, where a trivial branch can be replaced by a single multiplication,
it is probably worthwhile to consider the implementation that I have suggested.
** Once again, this is mainly for areas that have been identified as the bottleneck **
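For readers who want to check that the two forms compute the same thing, here is a small Python sketch of both the conditional add and the conditional assignment from this thread. The function names and the threshold test are made up for illustration. It only demonstrates equivalence: Python's interpreter overhead hides branch-prediction effects, so the timings above would have to be reproduced in C++ or similar.

```python
def sum_branched(values, threshold):
    """if (expression) A += C;"""
    total = 0
    for v in values:
        if v >= threshold:
            total += v
    return total

def sum_branchless(values, threshold):
    """A += (expression) * C; the comparison yields True/False, i.e. 1/0."""
    total = 0
    for v in values:
        total += (v >= threshold) * v
    return total

def assign_branchless(a, c, expression):
    """A += (expression) * (C - A); same effect as: if (expression) A = C;"""
    a += expression * (c - a)
    return a

if __name__ == "__main__":
    data = [5, 200, 17, 130, 255, 0, 128]
    print(sum_branched(data, 128), sum_branchless(data, 128))
```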

Do speeds of if statements in a repetitive loop affect overall performance?

If I have code that will take a while to execute, printing out results every iteration will slow down the program a lot. To still receive occasional output to check on the progress of the code, I might have:
if (i % 10000 == 0) {
# print progress here
}
Does checking the if statement every time slow it down at all? Should I just skip the output and wait, and would that make it noticeably faster?
Also, is it faster to do: (i % 10000 == 0) or (i == 10000)?
Is checking equality or modulus faster?
In the general case, it won't matter at all.
A slightly longer answer: It won't matter unless the loop is run millions of times and the other statement in it is actually less demanding than an if statement (for example, a simple multiplication etc.). In that case, you might see a slight performance drop.
Regarding (i % 10000 == 0) vs. (i == 10000), the latter is obviously faster, because it only compares, whereas the former does a (fairly costly) modulus and a comparison. Note that the two are not equivalent, though: i == 10000 is true only once, while the modulus check is true every 10000 iterations.
That said, neither an if statement nor a modulus will make any difference unless your loop takes up 90% of the program's running time, which is usually the case only at school :). You probably spent a lot more time asking this question than you would have saved by not printing anything. For development and debugging, this is not a bad way to go.
The golden rule for these kinds of decisions:
Write the most readable and explicit code you can imagine to do the
thing you want it to do. If you have a performance problem, look at
wrong data structures and algorithmic choices first. If you have done
all those and need a really quick program, profile it to see which
part takes most time. After all those, you're allowed to make these kinds
of low-level guesses.
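As a concrete version of the progress check from the question, here is a minimal Python sketch. The function name, the sum-of-squares workload and the report callback are all made up for illustration. Whether the check is measurable depends on how cheap the loop body is, so it is worth timing (for example with the standard timeit module) before removing it.

```python
def sum_of_squares(total, report_every=10000, report=print):
    """Do some cheap work per iteration and report progress now and then."""
    result = 0
    for i in range(total):
        result += i * i              # the "real" work of the loop
        if i % report_every == 0:    # the check the question asks about
            report(i)
    return result

if __name__ == "__main__":
    sum_of_squares(50000)
```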

Algorithm Efficiency - Is partially unrolling a loop effective if it requires more comparisons?

How can I judge whether putting two extra assignments in an iteration is more expensive than adding an if condition to test something else? Let me elaborate. The question is to generate and PRINT the first n terms of the Fibonacci sequence, where n>=1. My implementation in C was:
#include <stdio.h>
int main(void)
{
    int x=0,y=1,output=0,l,n;
    printf("Enter the number of terms you need of Fibonacci Sequence ? ");
    scanf("%d",&n);
    printf("\n");
    for (l=1;l<=n;l++)
    {
        output=output+x;
        x=y;
        y=output;
        printf("%d ",output);
    }
    return 0;
}
But the author of the book "How to Solve It by Computer" says it is inefficient, since it uses two extra assignments for each Fibonacci number generated. He suggested:
a=0
b=1
loop:
    print a,b
    a=a+b
    b=a+b
I agree this is more efficient, since it keeps a and b relevant all the time and one assignment generates one number. BUT it prints or supplies two Fibonacci numbers at a time. Suppose the question is to generate an odd number of terms; what would we do? The author suggested adding a test condition to check whether n is odd. Wouldn't we lose the gains from reducing the number of assignments by adding an if test in every iteration?
I consider it very bad advice from the author to even bring this up in a book targeted at beginning programmers. (Edit: In all fairness, the book was originally published in 1982, a time when programming was generally much more low-level than it is now.)
99.9% of code does not need to be optimized. Especially in code like this that mixes extremely cheap operations (arithmetic on integers) with very expensive operations (I/O), it's a complete waste of time to optimize the cheap part.
Micro-optimizations like this should only be considered in time-critical code when it is necessary to squeeze every bit of performance out of your hardware.
When you do need it, the only way to know which of several options performs best is to measure. Even then, the results may change with different processors, platforms, memory configurations...
Without commenting on your actual code: As you are learning to program, keep in mind that minor efficiency improvements that make code harder to read are not worth it. At least, they aren't until profiling of a production application reveals that it would be worth it.
Write code that can be read by humans; it will make your life much easier and keep maintenance programmers from cursing the name of you and your offspring.
My first piece of advice echoes the others: Strive first for clean, clear code, then optimize where you know there is a performance issue. (It's hard to imagine a time-critical Fibonacci sequencer...)
However, speaking as someone who does work on systems where microseconds matter, there is a simple solution to the question you ask: Do the "if odd" test only once, not inside the loop.
The general pattern for loop unrolling is:
1. Create X repetitions of the loop logic.
2. Divide N by X.
3. Execute the loop N/X times.
4. Handle the N%X remaining items.
For your specific case:
a=0;
b=1;
nLoops = n/2;
while (nLoops-- > 0) {
    print a,b;
    a=a+b;
    b=a+b;
}
if (isOdd(n)) {
    print a;
}
(Note also that N/2 and isOdd are trivially implemented and extremely fast on a binary computer.)
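The same idea as a runnable Python sketch, collecting the terms in a list instead of printing them so the result is easy to check. The function name is made up for illustration.

```python
def fibonacci_terms(n):
    """Return the first n Fibonacci terms, producing two per loop iteration."""
    terms = []
    a, b = 0, 1
    for _ in range(n // 2):      # N/X full iterations, with X = 2
        terms.extend((a, b))
        a = a + b
        b = a + b
    if n % 2 == 1:               # the "if odd" test, done once after the loop
        terms.append(a)
    return terms

if __name__ == "__main__":
    print(fibonacci_terms(5))    # [0, 1, 1, 2, 3]
```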

Is it a bad practice writing long one-liner code?

I find myself writing pretty long one-liners (influenced by shell pipes), like this:
from functools import reduce  # reduce is not a builtin on Python 3

def parseranges(ranges, n):
    """
    Translate ":2,4:6,9:" to "0 1 3 4 5 8 9...n-1"
               == === ==      === ===== =========
    """
    def torange(x, n):
        if len(x)==1:
            (x0, ) = x
            s = 1 if x0=='' else int(x0)
            e = n if x0=='' else s
        elif len(x)==2:
            (x0, x1) = x
            s = 1 if x0=='' else int(x0)
            e = n if x1=='' else int(x1)
        else:
            raise ValueError
        return range(s-1, e)
    return sorted(reduce(lambda x, y:x.union(set(y)), map(lambda x:torange(x, n), map(lambda x:x.split(':'), ranges.split(','))), set()))
I felt OK when I wrote it.
I thought long one-liners were a functional-programming style.
But several hours later, I felt bad about it.
I'm afraid I will be criticized by the people who have to maintain it.
Sadly, I've gotten used to writing this kind of one-liner.
I really want to know others' opinions.
Please give me some advice. Thanks
I would say that it is bad practice if you're sacrificing readability.
It is common wisdom that source code is written once but read many times by different people. Therefore it is wise to optimize source code for the common case: being read, trying to understand.
My advice: Act according to this principle. Ask yourself: Could any piece of my code be made easier for somebody to understand? If the answer is not a 100% "No, I can't even think of a better way to express the problem/solution", then follow your gut feeling and reformat or recode that part.
Unless performance is a major consideration, readability of the code should be given high priority. It's really important for maintainability.
A relevant quote from the book Structure and Interpretation of Computer Programs.
"Programs should be written for people to read, and only incidentally for machines to execute."
(Update 2022-03-25: My answer refers to a previous revision of the question.)
The first and third examples are acceptable to me. They are close enough to the application domain so that I can easily see the intention of the code.
The second example is much too clever. I don't even have an idea about its purpose. Can you rewrite it in maybe five lines, giving the variables longer names?
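To illustrate the readability advice on the parseranges one-liner shown in the question, here is one possible way to unpack it into named steps. This is only a sketch: the names to_range and parse_ranges and the longer variable names are made up here, and the behaviour is meant to match the original (1-based inclusive ranges translated to sorted 0-based indices).

```python
def to_range(bounds, n):
    """Turn ['4', '6'] into range(3, 6); an empty bound means 'from 1' or 'up to n'."""
    if len(bounds) == 1:
        (only,) = bounds
        start = 1 if only == '' else int(only)
        end = n if only == '' else start
    elif len(bounds) == 2:
        first, last = bounds
        start = 1 if first == '' else int(first)
        end = n if last == '' else int(last)
    else:
        raise ValueError(bounds)
    return range(start - 1, end)

def parse_ranges(ranges, n):
    """Translate ":2,4:6,9:" to the sorted indices 0 1 3 4 5 8 9 ... n-1."""
    indices = set()
    for part in ranges.split(','):
        indices.update(to_range(part.split(':'), n))
    return sorted(indices)

if __name__ == "__main__":
    print(parse_ranges(":2,4:6,9:", 12))
    # [0, 1, 3, 4, 5, 8, 9, 10, 11]
```

A plain loop with a set replaces the nested map/reduce/lambda chain, so each step can be read, named and tested on its own.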

Resources