Why is this algorithm O(n^2)?

function multiply(x, y)
Input: Two n-bit integers x and y, where y ≥ 0
Output: Their product

if y = 0: return 0
z = multiply(x, ⌊y/2⌋)
if y is even:
    return 2z
else:
    return x + 2z
As stated in the title, why is this function O(n^2)? This is the explanation from the book the above example comes from:
It must terminate after n recursive calls, because at each call y is halved—that is, its number of bits is decreased by one. And each recursive call requires these operations: a division by 2 (right shift); a test for odd/even (looking up the last bit); a multiplication by 2 (left shift); and possibly one addition, a total of O(n) bit operations. The total time taken is thus O(n^2), just as before.
Because the division by 2 and the multiplication by 2 are themselves shifts over n bits, I thought it would be bigger than O(n^2), maybe O(n^3).

Each of the operations (right shift, even/odd test, left shift, and the possible addition) takes a fixed amount of time per bit, so a single recursive call does O(n) work. Remember that O(3n) is still O(n).
Since the function recurses once per bit of y (y loses one bit on every call), that O(n) work is carried out O(n) times, making for a total complexity of O(n^2).
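For what it's worth, here is a small Python transcription of the book's pseudocode (my own sketch, not from the book), with the per-call bit costs noted in comments; n levels of recursion times O(n) bit operations per level gives the O(n^2) bound:

def multiply(x, y):
    # One recursion level per bit of y, since y is halved each call.
    if y == 0:
        return 0
    z = multiply(x, y >> 1)      # right shift: O(n) bit operations
    if y & 1 == 0:               # even/odd test: looks at the last bit
        return z << 1            # left shift: O(n)
    else:
        return x + (z << 1)      # left shift plus addition: O(n)

For example, multiply(13, 11) returns 143.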

Is this a comp-sci textbook or a VLSI textbook? Because the answer depends on the complexity of the operations y==0, y/2, 2z, x+2z, and "y is even".
As an applications developer, I consider those to be constant time operations so they are all O(1). The Multiply function is then either O(log(Y)) or O(N) where N is the number of bits in Y. Same thing. Therefore, I conclude that this entire function is O(N).
Now, a computer engineer might argue that y/2 requires shifting N bits, and thus it is an O(N) operation. There's probably some CPU out there that works that way. Permit me to be absurd for a moment and argue that I could create an implementation for y==0 that takes O(N^47), thus this function is O(N^48). :-)
In reality, any modern-day N-bit processor will do bit shifts of an N-bit number in parallel, so they really are O(1). Maybe back on an 8088 this wasn't the case, but for any modern design it is. So in practice I argue this is O(N), not O(N^2).

Related

Can one compute the nth Fibonacci number in time O(n) or O(1)? Why?

I asked myself whether one can compute the nth Fibonacci number in time O(n) or O(1), and why.
Can someone please explain?
Yes. It is called Binet's Formula, or sometimes, incorrectly, De Moivre's Formula (the real De Moivre's formula is a different one, but De Moivre did discover Binet's formula before Binet), and it involves the golden ratio Phi. The mathematical reasoning behind it (see link) is a bit involved, but doable:
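(For reference, the standard statement is F(n) = (Phi^n - psi^n) / sqrt(5), with Phi = (1 + sqrt(5))/2 and psi = 1 - Phi; since |psi^n| / sqrt(5) < 1/2 for all n ≥ 0, F(n) is just Phi^n / sqrt(5) rounded to the nearest integer.)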
While a floating-point evaluation of it is only approximate, Fibonacci numbers are integers -- so, once you achieve a high enough precision (which depends on n), you can just round the value from Binet's formula to the closest integer.
Precision, however, depends on constants, so you basically have two versions, one with single-precision floats and one with double-precision numbers, with the second also running in constant time but slightly slower. For large n you will need an arbitrary-precision number library, and those have processing times that do depend on the numbers involved; as observed by @MattTimmermans, you'll then probably end up with an O(log^2 n) algorithm. This should happen for values of n large enough that you'd be stuck with a large-number library no matter what (but I'd need to test this to be sure).
Otherwise, the Binet formula is mainly made up of two exponentiations and one division (the three sums and divisions by 2 are probably negligible), while the recursive formula mainly employs function calls and the iterative formula uses a loop. While the first formula is O(1) and the other two are O(n), the actual times are more like a, b·n + c and d·n + e, with values for a, b, c, d and e that depend on the hardware, compiler, implementation, etc. With a modern CPU it is very likely that a is not much larger than b or d, which means that the O(1) formula should be faster for almost every n. But most implementations of the iterative algorithm start with
if (n < 2) {
    return n;
}
which is very likely to be faster for n = 0 and n = 1. I feel confident that Binet's formula is faster for any n beyond the single digits.
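Here is a rough Python sketch of the constant-time evaluation (my own; with 64-bit doubles it stays exact only up to roughly n = 70, beyond which you need more precision):

import math

PHI = (1 + math.sqrt(5)) / 2

def fib_binet(n):
    # F(n) = round(PHI**n / sqrt(5)); exact only while the floating-point
    # value has enough precision (roughly n <= 70 with 64-bit doubles).
    return round(PHI ** n / math.sqrt(5))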
Instead of thinking about the recursive method, think of building the sequence from the bottom up, starting at 1+1.
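A minimal sketch of that bottom-up approach (assuming n >= 0), doing O(n) additions:

def fib_iter(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b          # each step produces the next number
    return a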
You can also use a matrix m like this:
1 1
1 0
and calculate its nth power; then output m^n[0,0].
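A sketch of the matrix approach in Python (my own; it returns F(n) with F(0) = 0, F(1) = 1, so the m^n[0,0] entry mentioned above is F(n+1) in this convention). With exponentiation by squaring it needs only O(log n) matrix multiplications, although on huge Fibonacci numbers each multiplication is no longer constant time:

def mat_mul(a, b):
    # 2x2 matrix product.
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

def fib_matrix(n):
    result = [[1, 0], [0, 1]]              # identity
    m = [[1, 1], [1, 0]]
    while n > 0:                           # exponentiation by squaring
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        n >>= 1
    return result[0][1]                    # the [0][1] entry of m^n is F(n)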

What constitutes exponential time complexity?

I am comparing two algorithms that determine whether a number is prime. I am looking at the upper bound for time complexity, but I can't understand the time complexity difference between the two, even though in practice one algorithm is faster than the other.
This pseudocode runs in exponential time, O(2^n):
Prime(n):
    for i in range(2, n-1)
        if n % i == 0
            return False
    return True
This pseudocode runs in half the time of the previous example, but I'm struggling to understand whether the time complexity is still O(2^n) or not:
Prime(n):
    for i in range(2, (n/2+1))
        if n % i == 0
            return False
    return True
As a simple intuition for what big-O and big-Θ (big-Theta) are about: they describe how the number of operations you need to perform changes when you significantly increase the size of the problem (for example, by a factor of 2).
Linear time complexity means that when you increase the size by a factor of 2, the number of steps you need to perform also increases by about a factor of 2. This is what is called Θ(n), and it is often written, interchangeably but not accurately, as O(n) (the difference between O and Θ is that O provides only an upper bound, while Θ guarantees both upper and lower bounds).
Logarithmic time complexity (Θ(log N)) means that when you increase the size by a factor of 2, the number of steps you need to perform increases only by some fixed amount. For example, using binary search you can find a given element in a list twice as long using just one more loop iteration.
Similarly, exponential time complexity (Θ(a^N) for some constant a > 1) means that if you increase the size of the problem by just 1, you need a times more operations. (Note that there is a subtle difference between Θ(2^N) and 2^Θ(N); the second one is actually more generic. Both lie inside the exponential time class, but neither of the two covers it all; see the wiki for more details.)
Note that those definitions depend significantly on how you define "the size of the task".
As @DavidEisenstat correctly pointed out, there are two possible contexts in which your algorithm can be seen:
Some fixed-width numbers (for example, 32-bit numbers). In such a context an obvious measure of the complexity of the prime-testing algorithm is the value being tested itself. In that case your algorithm is linear.
In practice there are many contexts where a prime-testing algorithm should work for really big numbers. For example, many crypto-algorithms used today (such as Diffie–Hellman key exchange or RSA) rely on very big prime numbers, like 512 bits, 1024 bits and so on. Also, in those contexts the security is measured in the number of those bits rather than in any particular prime value. So in such contexts a natural way to measure the size of the task is the number of bits. And now the question arises: how many operations do we need to perform to check a value of a known size in bits using your algorithm? Obviously, if the value N has m bits, then N ≈ 2^m. So your algorithm goes from linear Θ(N) to exponential 2^Θ(m). In other words, to solve the problem for a value just 1 bit longer, you need to do about 2 times more work (see the small experiment below).
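A small experiment (hypothetical, in Python) that makes this visible: count the loop iterations of the trial-division test for primes that are one bit longer than the previous one; each extra bit roughly doubles the work.

def trial_division_steps(n):
    # Count the loop iterations of the naive primality test Prime(n).
    steps = 0
    for i in range(2, n - 1):
        steps += 1
        if n % i == 0:
            break
    return steps

for p in (251, 509, 1021, 2039, 4093):     # primes near 2^8, 2^9, ..., 2^12
    print(p.bit_length(), trial_division_steps(p))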
Exponential versus linear is a question of how the input is represented and the machine model. If the input is represented in unary (e.g., 7 is sent as 1111111) and the machine can do constant time division on numbers, then yes, the algorithm is linear time. A binary representation of n, however, uses about lg n bits, and the quantity n has an exponential relationship to lg n (n = 2^(lg n)).
Given that the number of loop iterations is within a constant factor for both solutions, they are in the same big O class, Theta(n). This is exponential if the input has lg n bits, and linear if it has n.
I hope this will explain why they are in fact both linear.
Suppose you call the function and count how many times each line is executed:
Prime(n):                          # 1 time
    for i in range(2, n-1)         # about n times
        if n % i == 0              # 1 time per iteration
            return False           # at most 1 time
    return True                    # at most 1 time
# overall -> about n operations

Prime(n):                          # 1 time
    for i in range(2, (n/2+1))     # about n/2 times
        if n % i == 0              # 1 time per iteration
            return False           # at most 1 time
    return True                    # at most 1 time
# overall -> about n/2 operations, which is still O(n)
This shows that Prime is a linear-time function (linear in the value n).
An O(n^2) bound might come from the block of code where this function is called.

Do these two bit-counting algorithms have the same time complexity?

Below is an algorithm I picked up somewhere (I forget where exactly, possibly from this answer) to calculate the number of bits set in an integer, i.e. its Hamming weight.
function hamming_weight($i)
{
    // Count the bits in each 2-bit field (each field can hold 0..3).
    $i = $i - (($i >> 1) & 0x55555555);
    // Sum adjacent 2-bit counts into 4-bit fields.
    $i = ($i & 0x33333333) + (($i >> 2) & 0x33333333);
    // Sum into 8-bit fields, then use the multiplication to add all
    // four byte counts into the top byte, and shift that byte down.
    return ((($i + ($i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
(I happened to have it handy in PHP, but this could really be any language.)
If I'm not terribly mistaken, this runs in O(1) - there are no branches, after all.
Now here's a bit-counting function I wrote myself, which apart from readability I deem inferior:
function hamming_weight_2($i)
{
    $weight = 0;
    for ($k = 1, $s = 0; $k < 0xFFFFFFFF; $k *= 2, $s++)
    {
        $weight += (($i & $k) >> $s);
    }
    return $weight;
}
However, in what way is it inferior? At first I thought "well there's a loop, so this should run in linear time", but then I realized the loop doesn't depend on the size of the input at all. No matter the size of $i, the number of iterations stays the same.
What I'm thus wondering is this:
Can these two algorithms really be said to both run in O(1)?
If so, is there a measure that distinguishes the two? It seems the first one ought to be better in some way.
In this case, looking at the question in terms of big O complexity doesn't make sense because there are a fixed number of bits in your variable. Instead you should count the individual operations:
Algorithm 1:
Bitwise Ands: 4
Bitshifts: 4
Additions/Subtracts: 3
Multiplications: 1
Algorithm 2:
Bitwise Ands: 32
Bitshifts: 32
Additions/Subtracts: 64
Multiplications: 32
Even allowing for replacing those multiplications with additional bitshifts, significantly more work is being done in the second algorithm.
Can these two algorithms really be said to both run in O(1)?
Absolutely, yes. Any algorithm that runs in under a fixed amount of time independently of the size of its input can be said to be O(1).
If so, is there a measure that distinguishes the two? It seems the first one ought to be better in some way?
What distinguishes algorithms with identical asymptotic complexity is a constant factor. This applies to algorithms of any asymptotic complexity, not only O(1) algorithms.
You can figure out the constant by adding up the elementary operations required to perform the computations according to these algorithms. Count operations performed outside a loop, and add the count of operations inside the loop multiplied by the number of times that loop will be executed in the worst case (i.e. 32).
While the two algorithms have identical asymptotic complexity, the first algorithm is said to have a much smaller constant factor, and is, therefore, faster than the second one.
f(n) = O(g(n)) means that f(n) is less than or equal to c*g(n) for all n ≥ N, for some N > 0 and some c > 0. The c component can be important. If one algorithm runs in n nanoseconds and another runs in n hours, they both have the same time in big-O notation, but one is just slightly (a few thousand billion times) faster. Not something to be concerned about, obviously. Scarcely any difference.
PS. It is rarely important to count the number of bits in a single word. For counting the bits in an array of words, both algorithms are sub-optimal.
Well, it depends. Note that neither of them is actually an algorithm; they're implementations. That's different, since an implementation always has a constant number of bits. Yes, always - bigints are also limited by a constant, because the array size is limited by a constant. Clearly it's useless to think that way, though.
So let's look at it differently. First, consider the conceptual algorithms instead of the implementations. Integers are now n bits long, and the code you showed is generalized to their n-bit forms. The first one would have O(log n) steps vs O(n) for the second. But how long do those steps take? It depends on your abstract machine. It is a stackoverflow tradition to pretend that the only abstract machine in "existence" (in the platonic sense) is the RAM machine, or maybe the Turing machine. But there are more. PRAM for example, in which you are not necessarily limited to a constant number of parallel processing elements.
n-bit addition takes O(log n) time on a PRAM machine with sufficient processors (so, at least n), and bitwise operations obviously take only O(1) on a PRAM machine with at least n processors, so that gives you O(log^2 n) for the first algorithm but O(n log n) for the second.
But you can go even further, and assume all operations on n bits take constant time. I'm sure someone will comment that you can't do that, but you can actually assume whatever you want (look up hypercomputation in particular). The usual assumption that operations on O(log n) bits take constant time is pretty weird too if you think about it. Anyway if "n-bit operations are O(1)" is what you're working with, that's O(log n) for the first algorithm and O(n) for the second.
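To make the "conceptual n-bit algorithm" concrete, here is a rough sketch of my own (in Python rather than PHP for brevity, and assuming the width nbits is a power of two). It performs O(log n) masked-add rounds, versus the O(n) iterations of the loop version; for example, popcount_swar(0b10110110, 8) returns 5.

def popcount_swar(x, nbits):
    # Each round merges the bit counts of adjacent fields into fields of
    # double the width, so only O(log nbits) rounds are needed.
    shift = 1
    while shift < nbits:
        reps = nbits // (2 * shift)
        mask = int(('0' * shift + '1' * shift) * reps, 2)   # e.g. 0x55..., 0x33..., 0x0f...
        x = (x & mask) + ((x >> shift) & mask)
        shift *= 2
    return x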

Time complexity measured as a function of the number of bits in the input?

I came across this problem that I am not sure how to solve:
Suppose A(.) is a subroutine that takes as input a number in binary, and takes linear time (that is, O(n), where n is the length (in bits) of the number).
Consider the following piece of code, which starts with an n-bit number x.
while x > 1:
    call A(x)
    x = x - 1
Assume that the subtraction takes O(n) time on an n-bit number.
(a) How many times does the inner loop iterate (as a function of n)? Leave your answer in big-O form.
(b) What is the overall running time (as a function of n), in big-O form?
My guess is that (a) is O(n^2) and (b) is O(n^3). Is this correct? The way I'm thinking about it is that the loop has to compute two steps each time it cycles through, and it will cycle x times, each time subtracting 1 from the n-bit number until x reaches 1. And for part (b), since A(.) takes O(n) time, we multiply that by the time it takes to execute the loop and we then have the overall running time. Is my analysis correct?
Something that might help here is to write x ≈ 2^n, since if x has n bits its value is O(2^n). Therefore, the loop will run O(2^n) times.
Each iteration of the loop does O(n) work, giving an upper bound on the work of O(n · 2^n). This bound ends up being tight. Notice that for the first x/2 iterations of the loop, the value of x will still need n bits. Therefore, as a lower bound on the work done, we get x/2 = 2^(n-1) iterations doing n work each, giving a total of Ω(n · 2^n) work. Thus the work done is Θ(n · 2^n).
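A tiny sanity check (hypothetical, in Python): the number of loop iterations roughly doubles with each extra bit, which is exactly the 2^n growth used above.

def loop_iterations(nbits):
    # Start from the largest nbits-bit value and count the while-loop passes.
    x = (1 << nbits) - 1
    count = 0
    while x > 1:
        count += 1
        x -= 1
    return count

print([loop_iterations(b) for b in range(1, 8)])   # 0, 2, 6, 14, 30, 62, 126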
Hope this helps!

Is the time complexity of the empty algorithm O(0)?

So given the following program:
Is the time complexity of this program O(0)? In other words, is 0 O(0)?
I thought answering this in a separate question would shed some light on this question.
EDIT: Lots of good answers here! We all agree that 0 is O(1). The question is, is 0 O(0) as well?
From Wikipedia:
A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.
From this description, since the empty algorithm requires 0 time to execute, it has an upper bound performance of O(0). This means, it's also O(1), which happens to be a larger upper bound.
Edit:
More formally from CLR (1ed, pg 26):
For a given function g(n), we denote by O(g(n)) the set of functions
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }
The asymptotic time performance of the empty algorithm, executing in 0 time regardless of the input, is therefore a member of O(0).
Edit 2:
We all agree that 0 is O(1). The question is, is 0 O(0) as well?
Based on the definitions, I say yes.
Furthermore, I think there's a bit more significance to the question than many answers indicate. By itself the empty algorithm is probably meaningless. However, whenever a non-trivial algorithm is specified, the empty algorithm could be thought of as lying between consecutive steps of the algorithm being specified as well as before and after the algorithm steps. It's nice to know that "nothingness" does not impact the algorithm's asymptotic time performance.
Edit 3:
Adam Crume makes the following claim:
For any function f(x), f(x) is in O(f(x)).
Proof: let S be a subset of R and T be a subset of R* (the non-negative real numbers), let f(x): S → T, and let c ≥ 1. Then 0 ≤ f(x) ≤ f(x), which leads to 0 ≤ f(x) ≤ c·f(x) for all x ∈ S. Therefore f(x) ∈ O(f(x)).
Specifically, if f(x) = 0 then f(x) ∈ O(0).
It takes the same amount of time to run regardless of the input, therefore it is O(1) by definition.
Several answers say that the complexity is O(1) because the time is a constant and the time is bounded by the product of some coefficient and 1. Well, it is true that the time is a constant and it is bounded that way, but that doesn't mean that the best answer is O(1).
Consider an algorithm that runs in linear time. It is ordinarily designated as O(n) but let's play devil's advocate. The time is bounded by the product of some coefficient and n^2. If we consider O(n^2) to be a set, the set of all algorithms whose complexity is small enough, then linear algorithms are in that set. But it doesn't mean that the best answer is O(n^2).
The empty algorithm is in O(n^2) and in O(n) and in O(1) and in O(0). I vote for O(0).
I have a very simple argument for the empty algorithm being O(0): For any function f(x), f(x) is in O(f(x)). Simply let f(x)=0, and we have that 0 (the runtime of the empty algorithm) is in O(0).
On a side note, I hate it when people write f(x) = O(g(x)), when it should be f(x) ∈ O(g(x)).
Big O is asymptotic notation. To use big O, you need a function - in other words, the expression must be parametrized by n, even if n is not used. It makes no sense to say that the number 5 is O(n), it's the constant function f(n) = 5 that is O(n).
So, to analyze time complexity in terms of big O you need a function of n. Your algorithm always makes arguably 0 steps, but without a varying parameter, talking about asymptotic behaviour makes no sense. Assume that your algorithm is parametrized by n. Only then may you use asymptotic notation. It makes no sense to say that it is O(n^2), or even O(1), if you don't specify what n is (or the variable hidden in O(1))!
As soon as you settle on the number of steps, it's a matter of the definition of big O: the function f(n) = 0 is O(0).
Since this is a low-level question it depends on the model of computation.
Under "idealistic" assumptions, it is possible you don't do anything.
But in Python, you cannot say def f(x):, but only def f(x): pass. If you assume that every instruction, even pass (a NOP), takes time, then the complexity is f(n) = c for some constant c > 0, and you can only say that f is O(1), not O(0).
It's worth noting that big O by itself does not have anything to do with algorithms. For example, you may say sin x = x + O(x^3) when discussing a Taylor expansion. Also, O(1) does not mean constant; it means bounded by a constant.
All of the answers so far address the question as if there is a right and a wrong answer. But there isn't. The question is a matter of definition. Usually in complexity theory the time cost is an integer -- although that too is just a definition. You're free to say that the empty algorithm that quits immediately takes 0 time steps or 1 time step. It's an abstract question because time complexity is an abstract definition. In the real world, you don't even have time steps, you have continuous physical time; it may be true that one CPU has clock cycles, but a parallel computer could easily have asynchronous clocks and in any case a clock cycle is extremely small.
That said, I would say that it's more reasonable to say that the halt operation takes 1 time step rather than that it takes 0 time steps. It does seem more realistic. For many situations it's arguably very conservative, because the overhead of initialization is typically far greater than executing one arithmetic or logical operation. Giving the empty algorithm 0 time steps would only be reasonable to model, for example, a function call that is deleted by an optimizing compiler that knows that the function won't do anything.
It should be O(1). The coefficient is always 1.
Consider:
If something grows like 5n, you don't say O(5n), you say O(n) [in other words, O(1n)]
If something grows like 7n^2, you don't say O(7n^2), you say O(n^2) [in other words, O(1n^2)]
Likewise you should say O(1), not O(some other constant)
There is no such thing as O(0). Even an oracle machine or a hypercomputer requires the time for one operation, e.g. solve(the_goldbach_conjecture), ergo:
All machines, theoretical or real, finite or infinite, produce algorithms with a minimum time complexity of O(1).
But then again, this code right here is O(0):
// Hello world!
:)
I would say it's O(1) by definition, but O(0) if you want to get technical about it: since O(k1g(n)) is equivalent to O(k2g(n)) for any constants k1 and k2, it follows that O(1 * 1) is equivalent to O(0 * 1), and therefore O(0) is equivalent to O(1).
However, the empty algorithm is not like, for example, the identity function, whose definition is something like "return your input". The empty algorithm is more like an empty statement, or whatever happens between two statements. Its definition is "do absolutely nothing with your input", presumably without even the implied overhead of simply having input.
Consequently, the complexity of the empty algorithm is unique in that O(0) has a complexity of zero times whatever function strikes your fancy, or simply zero. It follows that since the whole business is so wacky, and since O(0) doesn't already mean something useful, and since it's slightly ridiculous to even discuss such things, a reasonable special case for O(0) is something like this:
The complexity of the empty algorithm is O(0) in time and space. An algorithm with time complexity O(0) is equivalent to the empty algorithm.
So there you go.
Given the formal definition of Big O:
Let f(x) and g(x) be two functions defined over the set of real numbers. Then, we write:
f(x) = O(g(x)) as x approaches infinity iff there exists a real M and a real x0 so that:
|f(x)| <= M * |g(x)| for every x > x0
As I see it, if we substitute g(x) = 0 (in order to have a program with complexity O(0)), we must have:
|f(x)| <= 0, for every x > x0 (the constraint of existence of a real M and x0 is practically lifted here)
which can only be true when f(x) = 0.
So I would say that not only is the empty program O(0), but it is the only one for which that holds. Intuitively, this should have been true, since O(1) encompasses all algorithms that require a constant number of steps regardless of the size of their task, including 0. It's essentially useless to talk about O(0); it's already in O(1). I suspect it's purely out of simplicity of definition that we use O(1), where it could just as well be O(c) or something similar.
0 = O(f) for every function f, since 0 ≤ |f|, so it is also O(0).
Not only is this a perfectly sensible question, but it is important in certain situations involving amortized analysis, especially when "cost" means something other than "time" (for example, "atomic instructions").
Let's say there is a datastructure featuring multiple operation types, for which an amortized analysis is being conducted. It could well happen that one type of operation can always be funded fully using "coins" deposited during previous operations.
There is a simple example of this: the "multipop stack" described in Cormen, Leiserson, Rivest, Stein [CLRS09, 17.2, p. 457], and also on Wikipedia. Each time an item is pushed, a coin is put on the item, for a total amortized cost of 2. When (multi)pops occur, they can be fully paid for by taking one coin from each item popped, so the amortized cost of MULTIPOP(k) is O(0). To wit:
Note that the amortized cost of MULTIPOP is a constant (0)
...
Moreover, we can also charge MULTIPOP operations nothing. To pop the
first plate, we take the dollar of credit off the plate and use it to
pay the actual cost of a POP operation. To pop a second plate, we
again have a dollar of credit on the plate to pay for the POP
operation, and so on. Thus, we have always charged enough up front to
pay for MULTIPOP operations. In other words, since each plate on the
stack has 1 dollar of credit on it, and the stack always has a
nonnegative number of plates, we have ensured that the amount of
credit is always nonnegative.
Thus O(0) is an important "complexity class" for certain amortized operations.
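A minimal Python sketch of that accounting scheme (hypothetical class and method names, not from CLRS): every push banks one extra "coin" on the element, and multipop spends those coins, so the amortized cost charged to multipop itself is 0.

class MultipopStack:
    def __init__(self):
        self.items = []
        self.credit = 0          # coins banked by earlier pushes

    def push(self, x):
        # Amortized cost 2: 1 pays for the push, 1 is banked as credit.
        self.items.append(x)
        self.credit += 1

    def multipop(self, k):
        # Each popped element is paid for by a banked coin,
        # so the amortized cost charged here is 0.
        popped = []
        while k > 0 and self.items:
            popped.append(self.items.pop())
            self.credit -= 1
            k -= 1
        return popped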
O(1) means the algorithm's time complexity is always constant.
Let's say we have this algorithm (in C):
int doSomething(int n[])
{
    int x = n[0]; // This line accesses an array position, so it takes some time.
    int y = n[1]; // Same here.
    return x + y;
}
I am ignoring the fact that the array could have less than 2 positions, just to keep it simple.
If we count the 2 most expensive lines, we have a total time of 2.
2 = O(1), because:
2 <= c * 1, if c = 2, for every n > 1
If we have this code:
public void doNothing(){}
and count it as having 0 expensive lines, then there is no difference in saying it has O(0), O(1), or O(1000), because for every one of these we can prove the same theorem.
Normally, if the algorithm takes a constant number of steps to complete, we say it has O(1) time complexity.
I guess this is just a convention, because you could use any constant number to represent the function inside the O().
No. It's O(c) by convention whenever you don't have dependence on input size, where c is any positive constant (typically 1 is used - O(1) = O(12.37)).

Resources