Does multiplication take unit time? - algorithm

I have the following problem
Under what circumstances can multiplication be regarded as a unit time
operation?
But I thought multiplication is always considered to be taking unit time. Was I wrong?

It depends on what N is. If N is the number of bits in an arbitrarily large number, then as the number of bits increases, it takes longer to compute the product. However, in most programming languages and applications, the size of numbers are capped to some reasonable number of bits (usually 32 or 64). In hardware, these numbers are multiplied in one step that does not depend on the size of the number.
When the number of bits is a fixed number, like 32, then it doesn't make sense to talk about asymptotic complexity, and you can treat multiplication like an O(1) operation in terms of whatever algorithm you're looking at. When can become arbitrarily large, like with Java's BigInteger class, then multiplication depends on the size of those numbers, as does the memory required to store them.

Only in those cases where you're performing operations on two numbers,of numeric type (emphasis here ,not going into the binary detail), you simply need to assume that the operation being performed is of constant time only.
It's not defined as unit time,but,more strictly, a constant time interval which doesn't change even if we increase the size of number,but, in reality the calculation does utilise subtle more time to perform calculation on large numbers). These are generally considered trivial,unless the numbers being multiplied are too large,like BigIntegers in Java,etc.
But,as soon as we move towards performing multiplication of binary strings, our complexity increases and naive method yields a complexity of O(n^2).
So,to aimplify, we perform a divide and conquer-based multiplication, also known as Karatsuba's algorithm for multiplication, which has a complexity of O(n^1.59) which reduces the total number of multiplications and additions to some lesser number of multiplications and some of the additions.
I hope I haven't misjudged the question. If so,please alert me so that
I can remove this answer. If I understood the question properly, then the other answer posted here seems
incomplete.

The expression unit time is a little ambiguous (and AFAIK not much used).
True unit time is achieved when the multiply is performed in a single clock cycle. This rarely occurs on modern processors.
If the execution time of the multiply does not depend on the particular values of the operands, we can say that it is performed in constant time.
When the operand length is bounded, so that the time never exceeds a given duration, we also say that an operation is performed in constant time.
This constant duration can be used as the timing unit of the running time, so that you count in "multiplies" instead of seconds (ops, flops).
Lastly, you can evaluate the performance of an algorithm in terms of the number of multiplies it performs, independently of the time they take.

Related

Why is exponentiation not atomic?

In calculating the efficiency of algorithms, I have read that the exponentiation operation is not considered to be an atomic operation (like multiplication).
Is it because exponentiation is the same as the multiplication operation repeated several times over?
In principle, you can pick any set of "core" operations on numbers that you consider to take a single time unit to evaluate. However, there are a couple of reasons, though, why we typically don't count exponentiation as one of them.
Perhaps the biggest has to do with how large of an output you produce. Suppose you have two numbers x and y that are each d digits long. Then their sum x + y has (at most) d + 1 digits - barely bigger than what we started with. Their product xy has at most 2d digits - larger than what we started with, but not by a huge amount. On the other hand, the value xy has roughly yd digits, which can be significantly bigger than what we started with. (A good example of this: think about computing 100100, which has about 200 digits!) This means that simply writing down the result of the exponentiation would require a decent amount of time to complete.
This isn't to say that you couldn't consider exponentiation to be a constant-time operation. Rather, I've just never seen it done.
(Fun fact: some theory papers don't consider multiplication to be a constant-time operation, since the complexity of a hardware circuit to multiply two b-bit numbers grows quadratically with the size of b. And some theory papers don't consider addition to be constant-time either, especially when working with variable-length numbers! It's all about context. If you're dealing with "smallish" numbers that fit into machine words, then we can easily count addition and multiplication as taking constant time. If you have huge numbers - say, large primes for RSA encryption - then the size of the numbers starts to impact the algorithm's runtime and implementation.)
This is a matter of definition. For example in hardware-design and biginteger-processing multiplication is not considered an atomic operation (see e.g. this analysis of the karatsuba-algorithm).
On the level that is relevant for general purpose software-design on the other hand, multiplication can be considered as a fairly fast operation on fixed-digit numbers implemented in hardware. Exponentiation on the other hand is rarely implemented in hardware and an upper bound for the complexity can only be given in terms of the exponent, rather than the number of digits.

Does the complexity of mergesort/radix sort change when the keys occupy more than a single word of memory

This is a homework problem.So I am looking for hints rather than the solution. Consider a set of n numbers. Each number is 'k' digits long. Suppose 'k' is much much larger and does not fit into a single word of memory. In such a scenario what is the complexity of mergesort and radix sort?
My analysis is - Asymptotic complexity doesn't depend on the underlying architectural details like the number of words a number occupies etc. May be the constant factor changes and algorithms run slower, but the overall complexity remains the same. For instance,in languages like Python that handle arbitrarily long integers, the algorithms remain the same. But a few of my friends argue that as the number of words occupied by a number 'w' grows towards infinity, the complexity does change.
Am I on the right path?
The runtime of an algorithm can indeed depend on the number of machine words making up the input. As an example, take integer multiplication. The computer can compute the product of two one-word numbers in time O(1), but it can't compute the product of two arbitrarily-sized numbers in time O(1) because the machine has to load each word into memory as part of its computation.
As a hint for radix sort versus mergesort - the mergesort algorithm makes O(n log n) comparisons between elements, but those comparisons might not take time O(1) each. How much time does it take to compare two numbers that require k machine words each? Similarly, radix sort's runtime depends on the number of digits in the number. How many rounds of radix sort do you need if you have k machine words in each number?
Hope this helps!
You're sort of correct. This is a large part of why most complexity analysis will (somewhere, at least implicitly) state that it's working with the count of some basic operations, not actual time. You generally take for granted that most (if not all) of those basic operations (e.g., comparison, swapping, math like addition or subtraction, etc.) are constant time, which lets you translate almost directly from operation count (the actual complexity) to time consumed.
To be entirely accurate, however, asymptotic complexity is (should) normally specified in terms of a count of fundamental operations, though, not actual time consumed.

Why is division more expensive than multiplication?

I am not really trying to optimize anything, but I remember hearing this from programmers all the time, that I took it as a truth. After all they are supposed to know this stuff.
But I wonder why is division actually slower than multiplication? Isn't division just a glorified subtraction, and multiplication is a glorified addition? So mathematically I don't see why going one way or the other has computationally very different costs.
Can anyone please clarify the reason/cause of this so I know, instead of what I heard from other programmer's that I asked before which is: "because".
CPU's ALU (Arithmetic-Logic Unit) executes algorithms, though they are implemented in hardware. Classic multiplications algorithms includes Wallace tree and Dadda tree. More information is available here. More sophisticated techniques are available in newer processors. Generally, processors strive to parallelize bit-pairs operations in order the minimize the clock cycles required. Multiplication algorithms can be parallelized quite effectively (though more transistors are required).
Division algorithms can't be parallelized as efficiently. The most efficient division algorithms are quite complex (The Pentium FDIV bug demonstrates the level of complexity). Generally, they requires more clock cycles per bit. If you're after more technical details, here is a nice explanation from Intel. Intel actually patented their division algorithm.
But I wonder why is division actually slower than multiplication? Isn't division just a glorified subtraction, and multiplication is a glorified addition?
The big difference is that in a long multiplication you just need to add up a bunch of numbers after shifting and masking. In a long division you have to test for overflow after each subtraction.
Lets consider a long multiplication of two n bit binary numbers.
shift (no time)
mask (constant time)
add (neively looks like time proportional to n²)
But if we look closer it turns out we can optimise the addition by using two tricks (there are further optimisations but these are the most important).
We can add the numbers in groups rather than sequentially.
Until the final step we can add three numbers to produce two rather than adding two to produce one. While adding two numbers to produce one takes time proportional to n, adding three numbers to produce two can be done in constant time because we can eliminate the carry chain.
So now our algorithm looks like
shift (no time)
mask (constant time)
add numbers in groups of three to produce two until there are only two left (time proportional to log(n))
perform the final addition (time proportional to n)
In other words we can build a multiplier for two n bit numbers in time roughly proportional to n (and space roughly proportional to n²). As long as the CPU designer is willing to dedicate the logic multiplication can be almost as fast as addition.
In long division we need to know whether each subtraction overflowed before we can decide what inputs to use for the next one. So we can't apply the same parallising tricks as we can with long multiplication.
There are methods of division that are faster than basic long division but still they are slower than multiplication.

why O(1) != O(log(n)) ? for n=[integer, long, ...]

for example, say n = Integer.MAX_VALUE or 2^123 then O(log(n)) = 32 and 123 so a small integer. isn't it O(1) ?
what is the difference ? I think, the reason is O(1) is constant but O(log(n)) not. Any other ideas ?
If n is bounded above, then complexity classes involving n make no sense. There is no such thing as "in the limit as 2^123 approaches infinity", except in the old joke that "a pentagon approximates a circle, for sufficiently large values of 5".
Generally, when analysing the complexity of code, we pretend that the input size isn't bounded above by the resource limits of the machine, even though it is. This does lead to some slightly odd things going on around log n, since if n has to fit into a fixed-size int type, then log n has quite a small bound, so the bound is more likely to be useful/relevant.
So sometimes, we're analysing a slightly idealised version of the algorithm, because the actual code written cannot accept arbitrarily large input.
For example, your average quicksort formally uses Theta(log n) stack in the worst case, obviously so with the fairly common implementation that call-recurses on the "small" side of the partition and loop-recurses on the "big" side. But on a 32 bit machine you can arrange to in fact use a fixed-size array of about 240 bytes to store the "todo list", which might be less than some other function you've written based on an algorithm that formally has O(1) stack use. The morals are that implementation != algorithm, complexity doesn't tell you anything about small numbers, and any specific number is "small".
If you want to account for bounds, you could say that, for example, your code to sort an array is O(1) running time, because the array has to be below the size that fits in your PC's address space, and hence the time to sort it is bounded. However, you will fail your CS assignment if you do, and you won't be providing anyone with any useful information :-)
Obviously if you know that the input will always have a fixed number of elements, the algorithm will always run in constant time. Big-O notation is used to denote worse-case running time, which describes the limit when the number of elements grows infinitely large.
The difference is that n isn't fixed. The idea behind Big-O notation is to get an idea of how the size of the input effects the running time (or memory usage). So if an algorithm always takes the same amount of time, whether n = 1 or n = Integer.MAX_VALUE, we say it is O(1). If the algorithm takes a unit of time longer each time the input size doubles, then we say it is O(logn).
Edit: to answer your specific question on the difference between O(1) and O(logn), I'll give you an example. Let's say we want an algorithm that will find the min element in an unsorted array. One approach is to go through each element and keep track of the current min. Another approach is to sort the array and then return the first element.
The first algorithm is O(n), and the second algorithm is O(nlogn). So let's say we start with an array of 16 elements. The first algorithm will run in time 16, the second algorithm will run in time 16*4. If we increase it to 17, then it becomes 17 and 17*4. We might naively say that the second algorithm takes 4 times as long as the first algorithm (if we treat the logn component as constant).
But let's look at what happens when our array contains 2^32 elements. Now the first algorithm takes 2^32 time to complete, where our second algorithm takes 32*2^32 time to complete. It takes 32 times as long. Yes, it's a small difference, but it is still a difference. If the first algorithm takes 1 minute, the second algorithm will take over half an hour!
I think you will get a better idea if it is called O(n^0).
It is a scaling function depending on the input variable N. It is a function, not number, you should never assume any number for the variable N.
It is just like that you say that a function f(x) is 3 because f(100) = 3, it is wrong. It is a function, not any particular number. A constant function f(x) = 1 is still a function, it will never equal to another function g(x) = N, i.e. g(x)=f(x)
Its the growth rate that you want to look at. O(1) implies no growth at all. While O(logn) does have growth. Even though the growth is small it is still growth.
You’re not thinking big enough. Any algorithm that runs on a computer will either run forever or terminate after some small number of steps — since the computer is only a finite state machine, you cannot write algorithms that run for an arbitrary amount of time and then terminate. By that argument, Big-O notation is only theoretical and has no purpose in a real-life computer program. Even O(2^n) hits an upper limit at O(2^INT_MAX), which is equivalent to O(1).
Realistically, though, Big-O can help you out if you know the constant factors. Even if an algorithm has an upper bound of O(log n), and n can have 32 bits, that could mean the difference between a request taking 1 second and 32 seconds.
Big-O shows how running time (or memory, etc) changes as the size of problem changes.
When size of the problem gets 10 times bigger, an O(n) solution takes 10 times as long, an O(log(n)) solution takes a bit longer, and an O(1) solution takes the same time: O(1) means 'changes as fast as constant 1', but constants don't change.
Familiarize yourself with the big-O notation in a bit more detail.
There is a reason why you leave "O(n)" in, and consider to drop "O(log n)". They both are "constants": the former is less than 32, and the latter is less than 232. But you nevertheless have a natural feeling that you can't call O(n) O(1).
However, if log(n) < 32, it means that O(n*logn) algorithm works thirty two times slower than its O(n) version. Big enough to write "log*n"s?

How can one test time complexity "experimentally"?

Could it be done by keeping a counter to see how many iterations an algorithm goes through, or does the time duration need to be recorded?
The currently accepted won't give you any theoretical estimation, unless you are somehow able to fit the experimentally measured times with a function that approximates them. This answer gives you a manual technique to do that and fills that gap.
You start by guessing the theoretical complexity function of the algorithm. You also experimentally measure the actual complexity (number of operations, time, or whatever you find practical), for increasingly larger problems.
For example, say you guess an algorithm is quadratic. Measure (Say) the time, and compute the ratio of time to your guessed function (n^2):
for n = 5 to 10000 //n: problem size
long start = System.time()
executeAlgorithm(n)
long end = System.time()
long totalTime = end - start
double ratio = (double) time / (n * n)
end
. As n moves towards infinity, this ratio...
Converges to zero? Then your guess is too low. Repeat with something bigger (e.g. n^3)
Diverges to infinity? Then your guess is too high. Repeat with something smaller (e.g. nlogn)
Converges to a positive constant? Bingo! Your guess is on the money (at least approximates the theoretical complexity for as large n values as you tried)
Basically that uses the definition of big O notation, that f(x) = O(g(x)) <=> f(x) < c * g(x) - f(x) is the actual cost of your algorithm, g(x) is the guess you put, and c is a constant. So basically you try to experimentally find the limit of f(x)/g(x); if your guess hits the real complexity, this ratio will estimate the constant c.
Algorithm complexity is defined as (something like:)
the number of operations the algorithm does as a function
of its input size.
So you need to try your algorithm with various input sizes (i.e. for sort - try sorting 10 elements, 100 elements etc.), and count each operation (e.g. assignment, increment, mathematical operation etc.) the algorithm does.
This will give you a good "theoretical" estimation.
If you want real-life numbers on the other hand - use profiling.
As others have mentioned, the theoretical time complexity is a function of number of cpu operations done by your algorithm. In general processor time should be a good approximation for that modulo a constant. But the real run time may vary because of a number of reasons such as:
processor pipeline flushes
Cache misses
Garbage collection
Other processes on the machine
Unless your code is systematically causing some of these things to happen, with enough number of statistical samples, you should have a fairly good idea of the time complexity of your algorithm, based on observed runtime.
The best way would be to actually count the number of "operations" performed by your algorithm. The definition of "operation" can vary: for an algorithm such as quicksort, it could be the number of comparisons of two numbers.
You could measure the time taken by your program to get a rough estimate, but various factors could cause this value to differ from the actual mathematical complexity.
yes.
you can track both, actual performance and number of iterations.
Might I suggest using ANTS profiler. It will provide you this kind of detail while you run your app with "experimental" data.

Resources