Why are numbers rounded in standard computer algorithms? [closed] - performance

Closed. This question is off-topic. It is not currently accepting answers. Closed 11 years ago.
I understand that rounding makes algorithms faster and reduces storage requirements, and that these would have been critical features for software running on the hardware of previous decades, but is this still an important feature? If the calculations were done with exact rational arithmetic then there would be no rounding errors at all, which would simplify many algorithms: you would no longer have to worry about catastrophic cancellation or anything like that.

Floating point is much faster than arbitrary-precision and symbolic packages, and 12-16 significant figures is usually plenty for demanding science/engineering applications where non-integral computations are relevant.

The programming language ABC used rational numbers (x / y where x and y were integers) wherever possible.
Sometimes calculations would become very slow because the numerator and denominator had become very big.
So it turns out that it's a bad idea if you don't put some kind of limit on the numerator and denominator.

In the vast majority of computations, the size of the numbers required to compute answers exactly would quickly grow beyond the point where the computation would be worth the effort, and in many calculations it would grow beyond the point where exact calculation would even be possible. Consider that even running something like a simple third-order IIR filter for a dozen iterations would require a fraction with thousands of bits in the denominator; running the algorithm for a few thousand iterations (hardly an unusual operation) could require more bits in the denominator than there exist atoms in the universe.
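As a rough illustration of that growth (a minimal Python sketch, not taken from any of the answers; the recurrence and its coefficients are made up), exact rational arithmetic on even a toy first-order filter produces denominators with hundreds of bits after only a hundred iterations:

from fractions import Fraction

# Toy linear recurrence y[n] = a*y[n-1] + b*x[n], evaluated exactly over the rationals.
# The coefficients are deliberately "ugly" fractions, as real filter coefficients tend to be.
a = Fraction(355, 452)
b = Fraction(113, 997)

y = Fraction(0)
for n in range(1, 101):
    x = Fraction(1, n)   # some arbitrary exact input signal
    y = a * y + b * x
    if n % 25 == 0:
        print(f"iteration {n}: denominator has {y.denominator.bit_length()} bits")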

Many numerical algorithms still require fixed-precision numbers in order to perform well enough. Such calculations can be implemented in hardware because the numbers fit entirely in registers, whereas arbitrary-precision calculations must be implemented in software, and there is a massive performance difference between the two. Ask anybody who crunches numbers for a living whether they'd be OK with things running some large factor slower, and they'll probably say "no, that's completely unworkable."
Also, I think you'll find that having arbitrary precision is often impractical and sometimes even impossible. For example, the number of decimal places can grow fast enough that you'll want to drop some, and then you're back to square one: rounded-number problems!
Finally, sometimes the digits beyond a certain precision do not matter anyway. For example, generally the number of significant digits should reflect the level of experimental uncertainty.
So, which algorithms do you have in mind?

Traditionally integer arithmetic is easier and cheaper to implement in hardware (uses less space on the die so you can fit more units on there). Especially when you go into the DSP segment this can make a lot of difference.

Related

Unexpectedly low error for a numerical integrator with certain equations of motion

I have an RKF7(8) integrator whose output I've verified with several simple test functions. However, when I use it on the equation of motion I'm interested in, the local truncation errors are suddenly very small. For a timestep of around 1e-1, my errors are all around 1e-18 or 1e-19. For the simple test functions (so far, sine and an exponential), the errors are always reasonable, i.e. 1e-7 or so for the same timestep.
The only difference between the simple test functions and the problem function is that it's a huge chunk of code, like maybe 1000 different terms, some with relatively large exponents (like 9 or 10). Could that possibly affect the precision? Should I change my code to use long doubles?
Very interesting question. The problem you are facing might be related to issues (or limitations) of the floating point arithmetic. Since your function contains coefficients in a wide numerical interval, it is likely that you have some loss of precision in your calculations. In general, these problems can come in the form of:
Overflow
Underflow
Multiplication and division
Adding numbers of very different magnitudes
Subtracting numbers of similar magnitudes
Overflow and underflow occur when the numbers you are dealing with are too large or too small with respect to the machine precision, and it would be my bet that this is not what happens in your system. Nevertheless, one must take into account that multiplication and division operations can lead to overflow and underflow. On the other hand, adding numbers of very different magnitudes (or subtracting numbers of similar magnitudes) can lead to severe loss of precision due to the roundoff errors. From my experience in optimization problems that involve large and small numbers, I would say this could be a reasonable explanation of the behavior of your integrator.
I have two suggestions for you. The first one is of course increasing the precision of your numbers to the maximum available. This might help or not, depending on how ill-conditioned your problem is. The second one is to use a better algorithm to perform the sums in your numerical method. In contrast to naively adding all the numbers sequentially, you could use a more elaborate strategy and divide your sums into sub-sums, effectively reducing roundoff errors. Notable examples of such algorithms are pairwise summation and Kahan summation.
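For reference, here is a minimal Python sketch of Kahan (compensated) summation; the test values are made up purely to show the effect:

def kahan_sum(values):
    # Compensated summation: carries the rounding error of each addition forward.
    total = 0.0
    c = 0.0                      # running compensation for lost low-order bits
    for v in values:
        y = v - c
        t = total + y
        c = (t - total) - y      # recovers the part of y that was lost in the addition
        total = t
    return total

# Adding many small numbers to one huge number: the naive sum drops every small term.
values = [1e16] + [1.0] * 10000
print(sum(values) - 1e16)        # 0.0: the naive sum never sees the 1.0s
print(kahan_sum(values) - 1e16)  # ~10000.0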
I hope this answer offers you some clues. Good luck!

Real running time calculation of matrix multiplication [closed]

Closed. This question needs debugging details. It is not currently accepting answers. Closed 6 years ago.
I would like to calculate approximately the running time of a matrix multiplication problem. Below are my assumptions:
No parallel programming
A 2 Ghz CPU
A square matrix of size n
An O(n^3) algorithm
For example, suppose that n = 1000. Approximately how much time should I expect squaring this matrix to take under the above assumptions?
Thanks.
This depends heavily on the algorithm and the CPU. Even without parallelization, there's a lot of freedom in how the same steps would be represented on a CPU, and there are differences in the clock cycles needed for various operations even between CPUs of the same family. Don't forget, either, that modern CPUs add some parallelization of instructions on their own. Compiler optimization will make a difference by reordering memory accesses and branches, and will likely convert instructions to vectorized ones even if you didn't ask for that. Depending on further factors it may also matter whether your matrices are in a fixed location in memory or accessed through a pointer, and whether they are allocated with fixed size or each row / column dynamically. Don't forget about memory caching, page invalidations, and operating system scheduling, as I did in previous versions of my answer.
If this is for your own rough estimate or for a "typical" case, you won't do much wrong by just writing the program, running it in your specific conditions (as discussed above) in many repetitions for n = 1000, and calculating the average.
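A minimal sketch of that benchmark in Python (assuming NumPy is installed; note that NumPy dispatches the multiplication to an optimized BLAS, so this measures a heavily tuned implementation rather than a naive triple loop):

import time
import numpy as np

n = 1000
a = np.random.rand(n, n)

# Repeat and average to smooth out scheduler and cache noise.
repeats = 5
start = time.perf_counter()
for _ in range(repeats):
    b = a @ a
elapsed = (time.perf_counter() - start) / repeats
print(f"average time to square a {n}x{n} matrix: {elapsed:.3f} s")

# Back-of-envelope comparison: ~2*n^3 = 2e9 floating point operations at 2 GHz
# would suggest roughly one second, but caches, vectorization and the BLAS
# implementation easily change that by an order of magnitude.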
If you want a lot of hard work for a worse result, you can actually do what you probably meant to do in your original question yourself:
see what instructions your specific compiler produces for your specific algorithm under your specific conditions and with specific optimization settings (like here)
pick your specific processor and find its latency table for every instruction that's there,
add them up per iteration and multiply by 1000^3,
divide by the clock frequency.
Seriously, it's not worth the effort; a benchmark is faster, clearer, and more precise anyway (as this approach does not account for what happens in the branch predictor, hyperthreading, memory caching, and other architectural details). If you want it as an exercise, I'll leave that to you.

Who is responsible for the precision? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 8 years ago.
Let's say some researchers have figured out a way to analyze data and they have developed an algorithm for that. At the time, the algorithm is described in a book, using lots of mathematical formulas.
Now the algorithm needs to be implemented in software. The developer can read the formulas and starts translating e.g. Sum(f(x)) [1..n] (seems TeX is not allowed here) to a for loop.
Depending on how the developer converts the formula into code, there might be overflows or truncation in floating point operations. Not knowing much about real-world input values, unit tests might not detect those issues. However, in some cases, this can be avoided just by re-ordering the items or simplifying terms.
I wonder who is responsible for the precision of the output. Is it the mathematician or is it the developer? The mathematician might not know enough about computer number formats while the developer might not know enough about mathematics to restructure the formula.
A simple example:
Given the binomial coefficient "n choose k", which translates to n! / (k! (n-k)!).
A simple implementation would probably use the factorial function and then input the numbers directly (pseudo code):
result = fac(n) / (fac(k) * fac(n-k))
This can lead to overflows for larger n. Knowing that, one could divide n! by k! first and do (pseudo code):
result = 1
for (i = k+1 to n) result *= i
result = result / fac(n-k)
which is a) faster, because it needs fewer calculations, and b) far less prone to overflow.
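To make the difference concrete, here is a small Python sketch of both versions (Python integers do not overflow, so instead of crashing, the naive version simply produces an enormous intermediate value, which is exactly what would overflow a fixed-width integer type):

from math import factorial

def binom_naive(n, k):
    # Direct translation of n! / (k! (n-k)!): the intermediate n! is enormous.
    return factorial(n) // (factorial(k) * factorial(n - k))

def binom_partial(n, k):
    # The reordered version: multiply only (k+1)..n, then divide by (n-k)!.
    result = 1
    for i in range(k + 1, n + 1):
        result *= i
    return result // factorial(n - k)

n, k = 100, 95
assert binom_naive(n, k) == binom_partial(n, k) == 75287520

print(factorial(n).bit_length())   # 525 bits: the naive intermediate overflows any machine integer
partial = 1
for i in range(k + 1, n + 1):
    partial *= i
print(partial.bit_length())        # 34 bits: the reordered intermediate fits in a 64-bit integer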
This science is called numerical analysis
http://en.wikipedia.org/wiki/Numerical_analysis
In my understanding the analysis is on the mathematician's side, but it is the responsibility of the programmer to know that the problem exists and to look for the correct, well-known solutions (like using a Runge-Kutta integrator instead of a simple Euler one).
Short answer: developer.
An algorithm (or just a formula) manipulates arbitrary-precision real numbers as pure mathematical objects.
Code (based on the formula) works with real hardware and must overcome limitations (which depend on your hardware) by using more complex code.
Example: the formula f(x,y) = x * y may lead to very complex source code (if x and y are 64-bit floating point numbers and your hardware is an 8-bit microcontroller without FPU support and without an integer MUL instruction).

Is big-O notation relevant for concurrent world? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 8 years ago.
In most popular languages like C/C++/C#/Erlang/Java we have threads/processes, and the GPGPU computation market is growing. If an algorithm consists of N data-independent steps, we don't get the same performance as when the algorithm requires each step to follow the previous one. So I wonder whether big-O notation still makes sense in a concurrent world, and if it does not, what is the relevant way to analyze algorithm performance?
You can have N or more processors in a distributed environment (GPGPU / cluster / FPGA of the future, where you can get as many cores as you need - a concurrent world, not limited to a fixed number of parallel cores).
Big-O notation is still relevant.
You have a constant number of processors (by assumption), thus only a constant number of things can happen concurrently, thus you can only speed up an algorithm by a constant factor, and big-O ignores constant factors.
So whether you look at the total number of steps, or only consider the number of steps taken by the processor processing the most steps, we end up with exactly the same time complexity, and this still ends up giving us a decent idea of the rate of growth of the time taken in relation to the input size.
... future where you can get as many cores as you need - concurrent world, not limited to the number of parallel cores.
If we even get to the stage where you can run an algorithm with exponential running time on very large input in seconds, then yes, big-O notation, and the study of algorithms, will probably become much less relevant.
Consider, for example, that an O(n!) algorithm with n = 1000 (which is pretty small, to be honest) would require in the region of 4x10^2567 steps, which is about 4x10^2480 times more than the mere 10^87 estimated atoms in the entire observable universe. In short, big-O notation is unlikely to ever become irrelevant.
Even on the assumption of an effectively unlimited number of processors, we can still use big-O notation to indicate the steps taken by the processor processing the most steps (which should indicate the running time).
We could also use it to indicate the number of processors used, if we'd like.
The bottom line is that big-O notation is to show the rate of growth of a function - a function which could represent just about anything. Just because we typically use it to indicate the number of arithmetic computations, steps, comparisons or similar doesn't mean those are the only things we can use it for.
Big-O is a mathematical concept. It's always relevant, it would have been relevant before computers even existed, it's relevant now, it will always be relevant, it's even relevant to hypothetical aliens millions of light years away (if they know about it). Big-O is not just something we use to talk about how running time scales, it has a mathematical definition and it's about functions.
But there are many models of computation (unfortunately many people forget that, and even forget that what they're using is a model at all) and which ones make sense to use is not always the same.
For example, if you're talking about the properties of a parallel algorithm, assuming you have a constant number of processing elements essentially ignores the parallel nature of the algorithm. So in order to be able to express more, a commonly used model (but by no means the only one) is PRAM.
That you don't actually have an unlimited number of processing elements in reality is of no import. It's a model. The whole point is to abstract reality away. You don't have unlimited memory either, which is one of the assumptions of a Turing machine.
For models even further removed from reality, see hypercomputation.
Multithreading and using GPUs just apply parallelization to speed up algorithms, but there are algorithms that cannot be sped up this way.
Even when an algorithm can be sped up by parallelization, an O(N log N) algorithm will still be much faster than an O(N²) algorithm for large N.
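A rough sketch of that point in Python (the flat 1000-fold parallel speedup is an arbitrary assumption standing in for "lots of cores"):

import math

speedup = 1000   # assume the O(N^2) algorithm gets a flat 1000x parallel speedup

for n in (10**3, 10**4, 10**5, 10**6):
    parallel_quadratic = n * n / speedup          # "step count" of the parallelized O(N^2) algorithm
    serial_nlogn = n * math.log2(n)               # "step count" of the serial O(N log N) algorithm
    print(n, parallel_quadratic / serial_nlogn)   # the ratio keeps growing, so O(N log N) eventually wins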

Are bit-wise operations common and useful in real-life programming? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. Closed 9 years ago.
I often bump into interview questions involving a sorted/unsorted array where you are asked to find some property of the array, for example the number that appears an odd number of times, or the missing number in an unsorted array of size one million. Often the question adds constraints such as O(n) runtime complexity or O(1) space complexity.
Both of these problems can be solved pretty efficiently using bit-wise manipulations. Of course these are not the only ones; there is a whole ton of questions like these.
To me, bit-wise programming seems more hack- or intuition-based, because it works in binary rather than decimal. Being a college student without much real-life programming experience, I'm curious whether questions of this type are actually common in real work, or whether they are just brain twisters interviewers use to select the smartest candidates.
If they are indeed useful, in what kind of scenarios are they actually applicable?
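For instance, the first problem mentioned above (the number that appears an odd number of times) has a classic one-line XOR solution; a minimal sketch with a made-up input:

from functools import reduce
from operator import xor

def odd_one_out(values):
    # x ^ x == 0 and XOR is commutative, so every value that appears an even
    # number of times cancels out, leaving the one with an odd count.
    return reduce(xor, values, 0)

print(odd_one_out([4, 7, 9, 7, 4, 9, 9]))   # 9 appears three times, so this prints 9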
Are bit-wise operations common and useful in real-life programming?
The commonality or applicability depends on the problem at hand.
Some real-life projects do benefit from bit-wise operations.
Some examples:
You're setting individual pixels on the screen by directly manipulating the video memory, in which every pixel's color is represented by 1 or 4 bits. So, in every byte you can have packed 8 or 2 pixels and you need to separate them. Basically, your hardware dictates the use of bit-wise operations.
You're dealing with some kind of file format (e.g. GIF) or network protocol that uses individual bits or groups of bits to represent pieces of information. Your data dictates the use of bit-wise operations.
You need to compute some kind of checksum (possibly parity or CRC) or hash value, and some of the most applicable algorithms do this by manipulating bits.
You're implementing (or using) an arbitrary-precision arithmetic library.
You're implementing FFT and you naturally need to reverse bits in an integer or simulate propagation of carry in the opposite direction when adding. The nature of the algorithm requires some bit-wise operations.
You're short of space and need to use as little memory as possible and you squeeze multiple bit values and groups of bits into entire bytes, words, double words and quad words. You choose to use bit-wise operations to save space.
Branches/jumps on your CPU are costly and you want to improve performance by implementing your code as a series of instructions without any branches; bit-wise instructions can help. The simplest example here would be choosing the minimum (or maximum) of two integers. The most natural way of implementing it is with some kind of if statement, which ultimately involves comparison and branching. You choose to use bit-wise operations to improve speed (a sketch appears at the end of this answer).
Your CPU supports floating point arithmetic, but calculating something like a square root is a slow operation, so you instead approximate it using a few fast and simple integer and floating point operations. Again, you benefit from manipulating the bit representation of the floating point format.
You're emulating a CPU or an entire computer and you need to manipulate individual bits (or groups of bits) when decoding instructions, when accessing parts of CPU or hardware registers, when simply emulating bit-wise instructions like OR, AND, XOR, NOT, etc. Your problem flat out requires bit-wise instructions.
You're explaining bit-wise algorithms or tricks or something that needs bit-wise operations to someone else on the web (e.g. here) or in a book. :)
I've personally done all of the above and more in the past 20 years. YMMV, though.
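As an illustration of the branch-free minimum mentioned in the list, here is a sketch in Python (purely illustrative: the comparison still executes here, but in compiled code the same expression lets the compiler avoid a conditional branch):

def branch_free_min(x, y):
    # -(x < y) is 0 when x >= y and -1 (all ones in two's complement) when x < y,
    # so the mask selects either 0 or (x ^ y), yielding y or x without an if.
    return y ^ ((x ^ y) & -(x < y))

assert branch_free_min(3, 7) == 3
assert branch_free_min(-5, 2) == -5
assert branch_free_min(7, 7) == 7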
From my experience, it is very useful when you are aiming for speed and efficiency for large datasets.
I use bit vectors a lot in order to represent very large sets, which makes the storage very efficient and operations such as comparisons and combinations very fast. I have also found that bit matrices are very useful for the same reasons, for example finding intersections of a large number of large binary matrices. Using binary masks to specify subsets is also very useful, for example Matlab and Python's Numpy/Scipy use binary masks (essentially binary matrices) to select subsets of elements from matrices.
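A tiny sketch of that bit-vector idea, using a plain Python integer as the bit vector (the element values are made up):

def to_bitvector(elements):
    # Represent a set of small non-negative integers as the set bits of one integer.
    bv = 0
    for e in elements:
        bv |= 1 << e
    return bv

a = to_bitvector([1, 3, 5, 7, 900])
b = to_bitvector([3, 4, 5, 900])

intersection = a & b                    # set intersection is a single AND
union = a | b                           # set union is a single OR
print(bin(intersection).count("1"))     # size of the intersection: prints 3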
Using bitwise operations strictly depends on your main concerns.
I was once asked to solve a problem: find all combinations of numbers which don't have a repeating digit within them and which are of the form N*i, for a given i.
I made use of bitwise operations and generated all the numbers in considerably better time, but to my surprise I was asked to rewrite the code without the bitwise operators, because people found it unreadable and the code would have to be maintained by many other people. So, if performance is your concern, go for bitwise.
If readability is your concern, reduce their use.
If you want both at the same time, you need to follow a good style of writing code with bitwise operators, in a way that keeps it readable and understandable.
Although you can often "avoid it" in user-level code if you really don't care for it, bit manipulation can be useful in cases where memory consumption is a big issue. Bit operations are often needed, or even required, when dealing with hardware devices or embedded programming in general.
It's common to have I/O registers with many different configuration options addressable through various flag-style bit combinations. Or for small embedded devices where memory is extremely constrained relative to modern PC RAM sizes you may be used to in your normal work.
It's also very handy for some optimizations in hot code, where you want to use a branch-free implementation of something that could be expressed with conditional code, but need quicker run-time performance. For example, finding the nearest power of 2 to a given integer can be implemented quite efficiently on some processors using bit hacks over more common solutions.
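For example, one common variant of that trick, rounding a 32-bit value up to the next power of two, looks like this (a sketch; the final mask is only needed because Python integers are unbounded):

def next_pow2_32(v):
    # Smear the highest set bit into every lower position, then add 1.
    v -= 1
    v |= v >> 1
    v |= v >> 2
    v |= v >> 4
    v |= v >> 8
    v |= v >> 16
    return (v + 1) & 0xFFFFFFFF

assert next_pow2_32(1) == 1
assert next_pow2_32(17) == 32
assert next_pow2_32(4096) == 4096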
There is a great book called "Hacker's Delight" by Henry S. Warren Jr. that is filled with very useful functions for a wide variety of problems that occur in "real world" code. There are also a number of online documents with similar content.
A famous document from the MIT AI lab in the 1970s known as HAKMEM is another example.
