What matrix multiplication algorithm is used in practice?

Naive matrix multiplication is O(n^3). More sophisticated algorithms achieve better asymptotic bounds, but most of them are not useful in practice because of their huge constant-factor overhead.
My question is: which algorithm is favoured by current BLAS libraries? What about on accelerators, e.g. with CUDA?
And what is their asymptotic complexity?
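For a rough sense of why the answer in practice is, as far as I know, a heavily tuned blocked O(n^3) kernel rather than a Strassen-style algorithm, here is a small comparison sketch. It assumes NumPy is linked against an optimized BLAS (OpenBLAS or MKL are common defaults); the matrix size and timing approach are arbitrary choices for illustration.

```python
import time
import numpy as np

def naive_matmul(a, b):
    """Textbook triple-loop O(n^3) matrix multiplication on nested lists."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = a[i][k]
            for j in range(p):
                c[i][j] += aik * b[k][j]
    return c

n = 256
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
naive_matmul(a.tolist(), b.tolist())
t1 = time.perf_counter()
a @ b                      # dispatches to the underlying BLAS gemm routine
t2 = time.perf_counter()

print(f"triple loop: {t1 - t0:.3f} s, BLAS-backed: {t2 - t1:.5f} s")
```

Both are O(n^3) operations; the enormous gap comes from blocking, vectorization, and cache-aware code generation, not from a better asymptotic bound.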

Related

How efficient is efficient when it comes to Polynomial Time Algorithms?

I hope this is the right place for this question.
Polynomial time algorithms! How do polynomial time algorithms (PTAs) actually relate to the processing power, memory size (RAM) and storage of computers?
We consider PTAs to be efficient, yet even for a PTA the running time grows with the input size n. For example, there already exists a PTA that determines whether a number is prime. But what happens if I want to check a number this big: https://justpaste.it/3fnj2? Is the PTA for primality checking still considered efficient? Is there a computer that can decide whether a number that large is prime?
Whether yes or no (maybe no, I don't know), how does the concept of polynomial time algorithms actually apply in the real world? Is there some computing bound or similar for so-called polynomial time algorithms?
I've tried Google searches on this, but all I find are mathematical Big-O explanations. I can't find articles that actually relate the concept of PTAs to computing power. I would appreciate an explanation or links to some resources.
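(As a concrete data point: a probabilistic primality test such as Miller–Rabin handles numbers with hundreds or thousands of digits in well under a second on ordinary hardware. Below is a minimal sketch; note this is not the deterministic polynomial-time AKS test the theory usually refers to, and the round count is arbitrary.)

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test.
    Roughly O(rounds * log(n)^3) with schoolbook arithmetic."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # write n - 1 as d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)            # fast modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False            # definitely composite
    return True                     # prime with overwhelming probability

# 2^2203 - 1 is a known Mersenne prime with 664 digits; this finishes quickly.
print(is_probable_prime(2**2203 - 1))
```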
There are a few things to explain.
Regarding polynomial time as efficient is just a convention. Mathematicians have, in effect, defined a set Efficient_Algorithms = {algorithms whose running time is bounded by a polynomial}. That is only a mathematical definition. Mathematicians don't see your actual hardware and don't care about it; they work with abstract concepts. Yes, by this definition even an O(n^100) algorithm counts as efficient.
But you cannot compare statements from theoretical computer science one-to-one with computer programs running on hardware. Scientists work with formulas and theorems, while computer programs run on physical circuits.
Big-O notation does not help you compare implementations of an algorithm; it compares algorithms, not implementations of them. This can be illustrated as follows. Suppose you have a prime-checking algorithm with a high polynomial complexity. You implement it and see that it does not perform well for practical use cases. So you use a profiler, which tells you where the bottleneck is: 98% of the computation time is spent in matrix multiplications. So you develop a processor that does exactly such calculations extremely fast. Or you buy the most modern graphics card for the purpose. Or you wait 150 years for a new hardware generation. Or you manage to run most of these multiplications in parallel. Imagine you somehow reduce the time spent on matrix multiplications by 95%. With this wonderful hardware you run your algorithm, and suddenly it performs well. So your algorithm was efficient all along; it was only your hardware that was not powerful enough. This is not a thought experiment: such dramatic improvements in computing power happen quite often.
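A small illustration of that profiling step (a toy sketch, not a real primality checker; `toy_pipeline` is a made-up stand-in whose matrix multiplications deliberately dominate the runtime):

```python
import cProfile
import random

def matmul(a, b):
    """Schoolbook matrix multiplication -- the hypothetical hot spot."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def toy_pipeline(n=100):
    """Made-up stand-in for the 'prime checking algorithm' in the text:
    most of its time is spent in matrix multiplications."""
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    b = [[random.random() for _ in range(n)] for _ in range(n)]
    for _ in range(3):
        a = matmul(a, b)        # this is where nearly all the time goes
    return sum(map(sum, a))

# The cumulative-time column shows matmul dominating, which is the cue
# to optimize or offload exactly that operation.
cProfile.run("toy_pipeline()", sort="cumulative")
```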
Most algorithms that have polynomial complexity have it because the problems they solve are themselves of polynomial complexity. Consider matrix multiplication: if you do it on paper, it is O(n^3). It is in the nature of this problem that it has polynomial complexity. In practice and daily life (I think) most problems for which you have a polynomial algorithm are actually polynomial problems, and for a polynomial problem a polynomial algorithm is efficient.
Why do we talk about polynomial algorithms, and why do we consider them efficient? As already said, this is fairly arbitrary, but the following may help as motivation. When talking about "polynomial algorithms", there are two kinds:
Algorithms whose complexity is even lower than polynomial (e.g. linear or logarithmic). I think we can agree these are efficient.
Algorithms that are genuinely polynomial and not lower than polynomial. As illustrated above, in practice these algorithms are often polynomial because they solve problems that are of polynomial nature and therefore require polynomial complexity. Seen this way, we can of course say these algorithms are efficient.
In practice, if you have a linear problem, you will normally recognise it as such, and you would not apply an algorithm with worse complexity to it. This is just practical experience. If you search for an element in a list, for example, you would not expect more comparisons than there are elements in the list. If in such a case you apply an algorithm with complexity O(n^2), then of course that polynomial algorithm is not efficient. But as said, such mistakes are usually so obvious that they don't happen. A concrete example of this trap is sketched below.
So that is my final answer to your question: in practice, software developers have a good feeling for linear complexity, and good developers also develop a feeling for logarithmic complexity. Consequently, you don't have to worry about complexity theory too much. If you have a polynomial algorithm, you normally have a good sense of whether the underlying problem is actually linear or not; if it isn't, your algorithm is efficient. If you have an exponential algorithm, it may not be obvious at first what is going on, but in practice you see the running time, do some experiments, or get complaints from users. Exponential complexity is normally hard to miss.
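Here is the classic way a linear problem accidentally becomes quadratic; the function names are mine, purely for illustration:

```python
def dedup_quadratic(items):
    """Preserves order, but `x not in result` scans a list each time:
    O(n) lookups of O(n) cost each -> O(n^2) overall."""
    result = []
    for x in items:
        if x not in result:
            result.append(x)
    return result

def dedup_linear(items):
    """Same output, but membership checks against a hash set are O(1)
    on average -> O(n) overall."""
    seen = set()
    result = []
    for x in items:
        if x not in seen:
            seen.add(x)
            result.append(x)
    return result

data = [3, 1, 3, 2, 1, 5]
assert dedup_quadratic(data) == dedup_linear(data) == [3, 1, 2, 5]
```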

When is O(n log n) not preferable to O(n^2)?

Is there an example of an algorithm where we prefer O(n^2) time complexity over O(n log n)?
I have seen this question somewhere but did not find an answer.
For a large problem, O(n log n) will always beat O(n^2). For a small problem, the constant factors hidden by big-O notation may cause you to prefer the O(n^2) algorithm. For instance, quicksort, which is O(n log n) on average, is faster than the O(n^2) insertion sort, yet some quicksort implementations switch to insertion sort when a partition gets small (fewer than about ten elements).
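A sketch of that hybrid idea; the cutoff value of 10 is illustrative, since real libraries tune it per platform:

```python
import random

INSERTION_CUTOFF = 10  # illustrative; the best threshold is machine-dependent

def insertion_sort(a, lo, hi):
    """O(n^2) in general, but very fast on tiny ranges (inclusive bounds)."""
    for i in range(lo + 1, hi + 1):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def partition(a, lo, hi):
    """Lomuto partition using the middle element as pivot."""
    mid = (lo + hi) // 2
    a[mid], a[hi] = a[hi], a[mid]
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i

def quicksort(a, lo=0, hi=None):
    """O(n log n) on average; hands small partitions to insertion sort."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        if hi - lo < INSERTION_CUTOFF:
            insertion_sort(a, lo, hi)
            return
        p = partition(a, lo, hi)
        # recurse on the smaller half to bound stack depth, loop on the larger
        if p - lo < hi - p:
            quicksort(a, lo, p - 1)
            lo = p + 1
        else:
            quicksort(a, p + 1, hi)
            hi = p - 1

data = [random.randint(0, 1000) for _ in range(100)]
quicksort(data)
assert data == sorted(data)
```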
There are several reasons to choose an algorithm with a higher time complexity:
Speed: Asymptotic complexity only kicks in for values of n greater than some n_0. It also assumes an abstract machine model that only partially matches real machines, with their multiple levels of cache and constrained memory.
Space: Some algorithms require more space than others and thus become impossible to implement within a given memory budget. Space also influences speed on a real machine: locality of reference affects cache hits and misses, which is one reason quicksort often performs better than mergesort in practice.
Implementation complexity: In some cases the loss in performance is simply negligible, but the development time isn't.
Many naive O(n^2) algorithms are faster on small inputs than their more complicated O(n log(n)) brethren.
For example, the GNU MP Bignum library has a very highly optimized multiplication implementation. But for numbers made up of a couple dozen words it just uses schoolbook multiplication (the best threshold depends heavily on the machine). In fact GMP transitions through a whole sequence of fastest-around-size-X algorithms.
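GMP itself is tuned C and assembly, but the switch-below-a-threshold idea can be sketched in a few lines. The cutoff value here is arbitrary, the builtin `*` merely stands in for the schoolbook basecase, and CPython's own integers already switch multiplication algorithms internally, so this is purely illustrative:

```python
BASECASE_BITS = 512  # arbitrary cutoff for this sketch; GMP tunes its
                     # thresholds per machine and per algorithm pair

def karatsuba(x, y):
    """Karatsuba multiplication for non-negative ints, falling back to the
    builtin product (standing in for schoolbook multiplication) below a
    size threshold, mirroring GMP's basecase fallback."""
    if x.bit_length() <= BASECASE_BITS or y.bit_length() <= BASECASE_BITS:
        return x * y
    m = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> m, x & ((1 << m) - 1)
    yh, yl = y >> m, y & ((1 << m) - 1)
    z0 = karatsuba(xl, yl)                      # low * low
    z2 = karatsuba(xh, yh)                      # high * high
    z1 = karatsuba(xl + xh, yl + yh) - z0 - z2  # cross terms
    return (z2 << (2 * m)) + (z1 << m) + z0

a, b = 3**5000, 7**4000
assert karatsuba(a, b) == a * b
```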
One possibility: the O(n log n) algorithm is recursive, but you can program the O(n^2) algorithm iteratively, and the programming language you must use does not support recursion.
"Preferred" is relative here, by the way. If the data set were large enough, you could emulate recursion with your own stack variable, manipulated in an iteratively implemented version of the "recursive" algorithm (we had to do that exercise in Guy Steele's Comparative Programming class at CMU back in the day).

When would an O(n^2) algorithm be preferred over an O(n) algorithm?

I can think of two situations to use an O(n^2) algorithm instead of an O(n) algorithm:
Because big-O notation only describes asymptotic behaviour, the actual running time of an O(n^2) algorithm may be lower than that of an O(n) algorithm when n is small.
If an O(n) algorithm requires more memory than an O(n^2) algorithm and memory is limited, the O(n^2) algorithm will be preferred.
Are there any other situations in favor of an O(n^2) algorithm?
In cryptography, "inefficient" or unoptimized algorithms are sometimes desirable because they consume similar resources (time, processing power, heat dissipated, memory) regardless of the input they are processing. This makes timing attacks and other side-channel attacks harder.
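A small illustration of that constant-time idea. In Python the standard library already provides hmac.compare_digest; the hand-rolled versions below exist only to show the contrast:

```python
import hmac

def naive_equal(a: bytes, b: bytes) -> bool:
    """Short-circuits on the first mismatch, so its running time leaks
    how many leading bytes match -- a timing side channel."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Examines every byte regardless of where the first mismatch is."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0

# The standard library implements the same idea:
assert hmac.compare_digest(b"secret-tag", b"secret-tag")
assert constant_time_equal(b"secret-tag", b"secret-tag")
assert not constant_time_equal(b"secret-tag", b"guessed-tag")
```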

Is big-O notation a tool to do best, worst, & average case analysis of an algorithm?

Is big-O notation a tool to do best, worst, & average case analysis of an algorithm?
Or is big-O only for worst case analysis, since it is an upper bounding function?
It is called Big O because growth rates are expressed as O(n), O(log N), etc.
The best, worst, and average cases of an algorithm can all be expressed with Big O notation.
For an example of this applied to sorting algorithms, see
http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms
Note that an algorithm can be classified according to multiple, independent criteria such as memory use or CPU use. Often, there is a tradeoff between two or more criteria (e.g. an algorithm that uses little CPU may use quite a bit of memory).
Big "O" is a measure of asymptotic complexity, which is to say, roughly how an algorithm scales as N gets really large.
If the best and worst cases converge to the same asymptotic complexity, you can quote a single value; otherwise you can work them out separately (for example, some sorting algorithms behave completely differently on sorted or almost-sorted data than on unsorted data).
The notation itself doesn't convey this; how you use it does.
... Or is big-O only for worst case analysis ...
If you give just one asymptotic complexity for an algorithm, it doesn't tell the reader whether (or how) the best and worst case differ from the average.
If you give best-case and worst-case complexity, it tells the reader how they differ.
By default, if a single value is listed, it is probably the average complexity, which may (or may not) coincide with the worst case.
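One way to see the best/worst/average distinction concretely: a deliberately naive quicksort with a first-element pivot is O(n log n) on average but O(n^2) on already-sorted input. A minimal sketch, with arbitrary sizes and repeat counts:

```python
import random
import sys
import timeit

sys.setrecursionlimit(5000)   # the worst case recurses about n levels deep

def quicksort_first_pivot(a):
    """Naive quicksort using the first element as pivot.
    Average case O(n log n); worst case O(n^2), hit by sorted input."""
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return quicksort_first_pivot(left) + [pivot] + quicksort_first_pivot(right)

n = 1000
random_input = [random.random() for _ in range(n)]
sorted_input = sorted(random_input)

print("random input:", timeit.timeit(lambda: quicksort_first_pivot(random_input), number=10))
print("sorted input:", timeit.timeit(lambda: quicksort_first_pivot(sorted_input), number=10))
```

The same algorithm, so the notation alone does not tell you which case a quoted bound refers to; the surrounding description has to say whether it is best, worst, or average.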

Implementation of Chazelle's triangulation algorithm

There is a linear-time algorithm for triangulating a simple polygon due to Chazelle (1991), but, AFAIK, there are no standard implementations of it in general mathematical software libraries.
Does anyone know of such an implementation?
See this answer to the question Powerful algorithms too complex to implement:
According to Skiena (author of The Algorithm Design Manual), "[the] algorithm is quite hopeless to implement."
I've looked for an implementation before, but couldn't find one. I think it's safe to assume no-one has implemented it due to its complexity, and I think it also has quite a large constant factor so wouldn't do well against O(n lg n) algorithms that have smaller constant factors.
This is claimed to be an implementation of Chazelle's Algorithm for Triangulating a Simple Polygon in Linear Time (mainly C++ and C).
