Big-O: What are the constants k and n0 in the formal definition of the order of an algorithm?

In my textbook I see the following:
Definition of the order of an algorithm
Algorithm A is order f(n) -- denoted O(f(n)) -- if constants k and n0 exist such that A requires no more than k * f(n) time units to solve a problem of size n >= n0.
I understand: Time requirements for different complexity classes grow at different rates. For instance, with increasing values of n, the time required for O(n) grows much more slowly than O(n^2), which grows more slowly than O(n^3), and so forth.
I do not understand: How k and n0 fit into this definition.
What is n0? Specifically, why does n have subscript 0, what does this subscript mean?
With question 1 answered, what does 'a problem of size n >= n0' mean? A larger data set? More loop repetitions? A growing problem size?
What is k then? Why is k being multiplied by f(n)? What does k have to do with increasing the problem size - n?
I've already looked at:
Big Oh Notation - formal definition
Constants in the formal definition of Big O
What is an easy way for finding C and N when proving the Big-Oh of an Algorithm?
Confused on how to find c and k for big O notation if f(x) = x^2+2x+1

1) n >= n0 means that we agree that for small n, A might need more than k*f(n) operations. E.g., bubble sort might be faster than quicksort or merge sort for very small inputs. The choice of 0 as a subscript is purely the author's preference.
2) Larger input size.
3) k is a constant. Suppose one algorithm performs 1000*n operations for an input of size n, so it is O(n). Another algorithm needs 5*n^2 operations for an input of size n. That means that for an input of size 100 the first algorithm needs 100,000 ops and the second one only 50,000 ops. So for input sizes around 100 you are better off choosing the second one, even though it is quadratic and the first one is linear. The crossover point is n0 = 200: only for n greater than 200 does the quadratic function become more expensive than the linear one (here I assume that k equals 1).
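To make that concrete, here is a small sketch (using the two made-up cost functions from this example) that finds the crossover point numerically:

# Sketch: compare the two hypothetical cost functions from the example above.
def cost_linear(n):      # 1000*n operations: linear, but with a big constant
    return 1000 * n

def cost_quadratic(n):   # 5*n^2 operations: quadratic, small constant
    return 5 * n ** 2

# Find the crossover: the smallest n where the quadratic cost exceeds the
# linear one. This n plays the role of n0 (with k = 1).
n = 1
while cost_quadratic(n) <= cost_linear(n):
    n += 1
print(n)  # 201 -- from n = 201 onwards, 5*n^2 > 1000*n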

n is the problem size, however that is best measured. Thus n0 is a specific constant: the threshold of n beyond which the relationship holds. Its specific value is irrelevant for big-O; only its existence matters.
k is also an arbitrary constant, whose mere existence (in conjunction with n0) is what matters for big-O.
Naturally, people are also interested in smaller problems, and in fact the perfect algorithm for a big problem might be decidedly inefficient for a small one, due to the constants involved.

It means the first value of n for which the rest holds true (i.e., we're only interested in sufficiently large values of n).
Problem size, usually the size of the input.
It means you don't care about the difference (for example) between 3*n^2 and 400*n^2, so any value that is high enough to satisfy the inequality is OK.
All of these conditions aim to simplify the O notation, making the difference between simple and complex operations moot (e.g., you don't care whether an operation takes one cycle or 20 cycles, as long as the number is finite).

Related

Is O(n^3) really more efficient than O(2^n)?

My algorithm class's homework claims that O(n^3) is more efficient than O(2^n).
When I put these functions into a graphing calculator, f(x) = 2^x appears to be consistently more efficient for very large n (starting from around n = 982).
Considering that for a function f(n) = O(g(n)), it must be smaller for all n greater than some n0, wouldn't this mean that from n = 982 onwards we could say that O(2^n) is more efficient?
2^n grows much faster than n^3. Maybe you have entered the wrong values into your calculator or something like that. Also note that 'more efficient' means less time, which means a lower value on the y-axis.
Let me describe the correct plots for those functions (plotted with Wolfram Alpha):
At first 2^n is smaller, but only for a tiny range; after that you can see how 2^n grows beyond n^3.
After this intersection the situation never changes, and 2^n remains vastly greater than n^3. That also holds for the range you analysed, i.e. n > 982, where the plot of n^3 hugs the x-axis by comparison.
Also note that in Big-O notation we always compare functions based on their growth. This is why something like O(n^3) does not contain the functions f with f(x) <= x^3 but rather the functions f with f(x) <= C * x^3, where C is an arbitrary constant; it could be big, it could be small. This accounts for the growth factor in the comparison. Also note that the condition is allowed to fail for finitely many x, but there must exist some bound x' from which point on the condition holds, i.e. for every x > x'.
Compare this explanation to the full mathematical definition from Wikipedia, where C is k, x is n and x' is n_0:
f(x) = O(g(x)) as x approaches infinity if and only if there exist a constant C > 0 and a value x' such that |f(x)| <= C * |g(x)| for all x > x'.
Which states, if true, that f(n) is in the set O(g(n)).
You confuse O(2^n) and 2^n. O(2^n) is actually C*2^n, where C is an arbitrarily chosen positive constant. Likewise, O(n^3) is D*n^3, where D is another arbitrarily chosen positive constant. The claim "O(n^3) is more efficient than O(2^n)" means that, given any fixed C and D, it is always possible to find an n0 such that for any n >= n0, D*n^3 < C*2^n.
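A quick sketch of that claim with arbitrarily chosen constants (here C = 1 on the exponential and a deliberately enormous D = 10^6 on the cubic, just to stack the deck in the cubic's favour): the inequality D*n^3 < C*2^n still starts to hold at some finite n0.

C, D = 1, 10 ** 6   # arbitrary positive constants; D is deliberately huge

# Walk forward until the cubic (with its huge constant) drops below the
# exponential. Past this point 2^n / n^3 keeps growing, so the inequality
# holds for every larger n as well.
n = 1
while D * n ** 3 >= C * 2 ** n:
    n += 1
print(n)  # 36 for these constants: a finite n0 despite the lopsided constants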

Counting the steps of an algorithm for an upper bound

I know that if I have a for loop and a nested for loop, which both iterate from 1 to n, I can multiply the run times of both loops to get O(n^2). This is a clean and simple calculation. However, suppose the iteration counts looked like this:
n = 2, k = 5
n = 3, k = 9
n = 4, k = 14
where k is the number of times the inner for loop iterates. At one point it is larger than n^2, then it is exactly n^2, then it becomes less than n^2. Assuming you cannot determine k based on n, and that these sample points of n may be very far apart, how do you calculate Big-O?
I tried graphing points. And at one point, I could say it was O(n^3) since some points exceed n^2, and further down, it would be O(n^2). Which one should I choose?
You state in your question that k is:
"... At one point, it is larger than n^2"
This is the uncertainty (or non-specificity) in your question that makes it hard to answer rigorously. Anyway, for the remainder of this answer, we shall assume that what you mean by the quote above is that:
For all values of n, the value of k(n) is bounded from above by
C·n^2, for some constant C>0.
From here on, let's refer to this statement as (+).
Now, since you're mentioning Big-O notation, we'll proceed to somewhat loosely define what this actually means:
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus
there exists some constant c such that f(n) is always ≤ c · g(n),
for sufficiently large n (i.e. , n ≥ n0 for some constant n0).
I.e., Big-O notation is a way to describe an upper bound on the asymptotic (limiting) behaviour of our algorithm. You write in your question, however, that:
"And at one point, I could say it was O(n^3) since some points exceed n^2, and further down, it would be O(n^2)"
Now this is a very specific analysis of how the inner loop of your algorithm behaves for specific values of n, and really not something that is related to asymptotic analysis (or Big-O notation). We're not interested in specifics about how the algorithm behaves for specific values of n, but whether we can find some general upper bound for the algorithm given that n is "sufficiently large" (n ≥ n0 for some constant n0).
Now, with these comments above, we can proceed to analysing the asymptotic behaviour of your algorithm.
We can approach this using Sigma notation, making use of statement (+) above, k(i) ≤ C·i^2:
T(n) = Σ_{i=1..n} k(i) ≤ Σ_{i=1..n} C·i^2 = C·n(n+1)(2n+1)/6 ≤ C·n^3, hence T(n) = O(n^3).   (++)
The last step (++) follows from the definition of Big-O notation, which we loosely stated above.
Hence, given that we interpret your information regarding k as (+), your algorithm runs in O(n^3) (which is an upper bound, not necessarily a tight one).
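For intuition, here is a hedged sketch of a loop structure consistent with assumption (+): the inner-iteration count k(i) is a placeholder, and the only property used is that it stays below C·i^2.

def k(i):
    # Placeholder for the unknown inner-iteration count; all we rely on is
    # that it is bounded above by C*i**2 for some constant C (assumption (+)).
    return i * i

def algorithm(n):
    steps = 0
    for i in range(1, n + 1):      # outer loop: n iterations
        for _ in range(k(i)):      # inner loop: k(i) <= C*i^2 iterations
            steps += 1             # one unit of work
    return steps

# Total work <= C*(1^2 + 2^2 + ... + n^2) = C*n*(n+1)*(2n+1)/6 = O(n^3).
print(algorithm(100))  # 338350, roughly 100^3 / 3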

Ambiguity about the Big-oh notation

I am currently trying to learn time complexity of algorithms, Big-O notation and so on. However, some point confuses me a lot. I know that most of the time, the input size of an array or whatever we are dealing with determines the running time of the algorithm. Let's say I have an unsorted array with size N and I want to find the maximum element of this array without using any special algorithm. I just want to iterate over the array and find the maximum element. Since the size of my array is N, this process runs in O(N), i.e. linear time. Let M be an integer that is the square root of N, so N can be written as the square of M, that is M*M or M^2. So I think there is nothing wrong if I want to replace N with M^2. I know that M^2 is also the size of my array, so my Big-O notation could be written as O(M^2). So my new running time looks like it runs in quadratic time. Why does this happen?
You are correct: if it happens that you have some variable M such that M^2 ~= N always holds, you could say the algorithm runs in O(M^2).
But note that now the algorithm runs in time quadratic in M, not quadratic in the size of the input; it is still linear in the size of the input.
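To make that concrete, here is a minimal sketch of the scan from the question: one comparison per element, so the work is proportional to the number of elements, whether you call that count N or M^2 (with M = sqrt(N)).

def find_max(arr):
    # One pass over the array: len(arr) - 1 comparisons, i.e. linear in the
    # number of elements. Renaming that number (N, or M**2) changes nothing.
    maximum = arr[0]
    for value in arr[1:]:
        if value > maximum:
            maximum = value
    return maximum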
The important thing here is defining linear/quadratic, etc.
More precisely, you have to specify linear/quadratic, etc. with respect to something (N or M in your example). The most natural choice is to study the complexity with respect to the size of the input (N in your example).
Another trap with big integers is that the size of n is log(n). So, for instance, if you loop over all smaller integers, your algorithm is not polynomial in the input size.
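A small made-up example of that trap (the divisor-counting function is just for illustration): the loop does work proportional to the value of n, which is exponential in the number of bits needed to write n down.

def count_divisors(n):
    count = 0
    for d in range(1, n + 1):   # n iterations: linear in the *value* of n
        if n % d == 0:
            count += 1
    return count

n = 1_000_003
bits = n.bit_length()           # the input size is about log2(n) bits
# The loop runs n ~ 2**bits times, i.e. exponentially many steps in the
# input size, so this is not a polynomial-time algorithm in that sense.
print(count_divisors(n), bits)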

Trying to understand Big-oh notation

Hi, I would really appreciate some help with Big-O notation. I have an exam on it tomorrow and, while I can define what 'f(x) is O(g(x))' means, I can't say I thoroughly understand it.
The following question ALWAYS comes up on the exam and I really need to try and figure it out. The first part seems easy (I think): do you just pick a value for n, compute them all on a calculator and put them in order? This seems too easy though, so I'm not sure. I'm finding it very hard to find examples online.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2^k?
That guy should help you.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
The order is the same as if you compared their limits at infinity: take lim(a/b); if it is a nonzero constant, they are the same order; infinity or 0 means one of them grows faster than the other.
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2^k?
Find binary search best/worst Big-O.
Find linked list access by index best/worst Big-O.
Make conclusions.
Hey there. Big-O notation is tough to figure out if you don't really understand what the "n" means. You've already seen people talking about how O(n) == O(2n), so I'll try to explain exactly why that is.
When we describe an algorithm as having "order-n space complexity", we mean that the size of the storage space used by the algorithm gets larger with a linear relationship to the size of the problem that it's working on (referred to as n.) If we have an algorithm that, say, sorted an array, and in order to do that sort operation the largest thing we did in memory was to create an exact copy of that array, we'd say that had "order-n space complexity" because as the size of the array (call it n elements) got larger, the algorithm would take up more space in order to match the input of the array. Hence, the algorithm uses "O(n)" space in memory.
Why does O(2n) = O(n)? Because when we talk in terms of O(n), we're only concerned with the behavior of the algorithm as n gets as large as it could possibly be. If n was to become infinite, the O(2n) algorithm would take up two times infinity spaces of memory, and the O(n) algorithm would take up one times infinity spaces of memory. Since two times infinity is just infinity, both algorithms are considered to take up a similar-enough amount of room to be both called O(n) algorithms.
You're probably thinking to yourself "An algorithm that takes up twice as much space as another algorithm is still relatively inefficient. Why are they referred to using the same notation when one is much more efficient?" Because the gain in efficiency for arbitrarily large n when going from O(2n) to O(n) is absolutely dwarfed by the gain in efficiency for arbitrarily large n when going from O(n^2) to O(500n). When n is 10, n^2 is 10 times 10 or 100, and 500n is 500 times 10, or 5000. But we're interested in n as n becomes as large as possible. They cross over and become equal for an n of 500, but once more, we're not even interested in an n as small as 500. When n is 1000, n^2 is one MILLION while 500n is a "mere" half million. When n is one million, n^2 is one thousand billion - 1,000,000,000,000 - while 500n looks on in awe with the simplicity of its five-hundred-million - 500,000,000 - points of complexity. And once more, we can keep making n larger, because when using O(n) logic, we're only concerned with the largest possible n.
(You may argue that when n reaches infinity, n^2 is infinity times infinity, while 500n is five hundred times infinity, and didn't you just say that anything times infinity is infinity? That doesn't actually work for infinity times infinity. I think. It just doesn't. Can a mathematician back me up on this?)
This gives us the weirdly counterintuitive result where O(Seventy-five hundred billion spillion kajillion n) is considered an improvement on O(n * log n). Due to the fact that we're working with arbitrarily large "n", all that matters is how many times and where n appears in the O(). The rules of thumb mentioned in Julia Hayward's post will help you out, but here's some additional information to give you a hand.
One, because n gets as big as possible, O(n^2+61n+1682) = O(n^2), because the n^2 contributes so much more than the 61n as n gets arbitrarily large that the 61n is simply ignored, and the 61n term already dominates the 1682 term. If you see addition inside a O(), only concern yourself with the n with the highest degree.
Two, O(log10 n) = O(log_b n) for any base, because for any base b, log10(n) = log_b(n)/log_b(10). Hence, O(log10 n) = O(log_b(n) * 1/log_b(10)). That 1/log_b(10) figure is a constant, which we've already shown drops out of O() notation.
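If you want to see that change-of-base constant numerically, here is a quick sketch (bases 10 and 2 are just an example; any pair of bases gives some fixed ratio):

import math

# log10(n) / log2(n) is the constant 1/log2(10) ~= 0.30103, independent of n,
# so the two logarithms differ only by a constant factor -- which Big-O ignores.
for n in (10, 1_000, 1_000_000, 10 ** 12):
    print(n, math.log10(n) / math.log2(n))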
Very loosely, you could imagine picking extremely large values of n, and calculating them. Might exceed your calculator's range for large factorials, though.
If the definition isn't clear, a more intuitive description is that "higher order" means "grows faster than, as n grows". Some rules of thumb:
O(n^a) is a higher order than O(n^b) if a > b.
log(n) grows more slowly than any positive power of n
exp(n) grows more quickly than any power of n
n! grows more quickly than exp(kn)
Oh, and as far as complexity goes, ignore the constant multipliers.
That's enough to deduce that the correct order is O(1), O(log n), O(2n) = O(n), O(n log n), O(n^2), O(n!)
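Echoing the earlier suggestion about plugging in large values: here is a minimal sketch that evaluates each function from the question at a few values of n (Python's integers are arbitrary precision, so the factorial column won't overflow anything):

import math

# Columns: n, then 1, log2(n), 2n, n*log2(n), n^2, n! -- the complexity columns
# grow from left to right, matching O(1) < O(log n) < O(2n) = O(n) < O(n log n) < O(n^2) < O(n!).
for n in (10, 100, 1000):
    print(n, 1, round(math.log2(n), 2), 2 * n,
          round(n * math.log2(n)), n ** 2, math.factorial(n))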
For big-O complexities, the rule is that if two things vary only by constant factors, then they are the same. If one grows faster than another ignoring constant factors, then it is bigger.
So O(2n) and O(n) are the same -- they only vary by a constant factor (2). One way to think about it is to just drop the constants, since they don't impact the complexity.
The other problem with picking n and using a calculator is that it will give you the wrong answer for certain n. Big O is a measure of how fast something grows as n increases, but at any given n the complexities might not be in the right order. For instance, at n=2, n^2 is 4 and n! is 2, but n! grows quite a bit faster than n^2.
It's important to get that right, because for running times with multiple terms you can drop the lesser terms -- i.e., if f(n) is 3n^2+2n+5, you can drop the 5 (constant), drop the 2n (3n^2 grows faster), then drop the 3 (constant factor) to get O(n^2)... but if you don't know that n^2 is bigger, you won't get the right answer.
In practice, you can just know that n is linear, log(n) grows more slowly than linear, n^a > n^b if a>b, 2^n is faster than any n^a, and n! is even faster than that. (Hint: try to avoid algorithms that have n in the exponent, and especially avoid ones that are n!.)
For the second part of your question, what happens with a binary search in the worst case? At each step, you cut the search space in half until eventually you find your item (or run out of places to look). That is log2(2^k) = k steps. A search where you just walk through the list to find your item would take n steps. And we know from the first part that O(log(n)) < O(n), which is why binary search is faster than just a linear search.
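Here is a minimal binary-search sketch (a generic implementation, not necessarily the one from your course) that counts how many elements it examines, so you can see the roughly log2(n) behaviour for n = 2^k directly:

def binary_search(sorted_list, target):
    # Returns (index, probes); index is -1 if the target is absent.
    lo, hi, probes = 0, len(sorted_list) - 1, 0
    while lo <= hi:
        probes += 1
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid, probes
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes

k = 10
data = list(range(2 ** k))            # n = 2^k = 1024 ordered elements
print(binary_search(data, 2 ** k))    # target absent: (-1, 11), about log2(n) probes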
Good luck with the exam!
In easy to understand terms the Big-O notation defines how quickly a particular function grows. Although it has its roots in pure mathematics its most popular application is the analysis of algorithms which can be analyzed on the basis of input size to determine the approximate number of operations that must be performed.
The benefit of using the notation is that you can categorize function growth rates by their complexity. Many different functions (an infinite number really) could all be expressed with the same complexity using this notation. For example, n+5, 2*n, and 4*n + 1/n all have O(n) complexity because the function g(n)=n most simply represents how these functions grow.
I put an emphasis on most simply because the focus of the notation is on the dominating term of the function. For example, O(2*n + 5) = O(2*n) = O(n) because n is the dominating term in the growth. This is because the notation assumes that n goes to infinity which causes the remaining terms to play less of a role in the growth rate. And, by convention, any constants or multiplicatives are omitted.
Read Big O notation and Time complexity for a more in-depth overview.
See this question and look for solutions there; here is the first one.

Is the time complexity of the empty algorithm O(0)?

So, given the empty program -- one that does nothing at all:
Is the time complexity of this program O(0)? In other words, is 0 O(0)?
I thought answering this in a separate question would shed some light on this question.
EDIT: Lots of good answers here! We all agree that 0 is O(1). The question is, is 0 O(0) as well?
From Wikipedia:
A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.
From this description, since the empty algorithm requires 0 time to execute, it has an upper bound performance of O(0). This means, it's also O(1), which happens to be a larger upper bound.
Edit:
More formally from CLR (1ed, pg 26):
For a given function g(n), we denote O(g(n)) the set of functions
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }
The asymptotic time performance of the empty algorithm, executing in 0 time regardless of the input, is therefore a member of O(0).
Edit 2:
We all agree that 0 is O(1). The question is, is 0 O(0) as well?
Based on the definitions, I say yes.
Furthermore, I think there's a bit more significance to the question than many answers indicate. By itself the empty algorithm is probably meaningless. However, whenever a non-trivial algorithm is specified, the empty algorithm could be thought of as lying between consecutive steps of the algorithm being specified as well as before and after the algorithm steps. It's nice to know that "nothingness" does not impact the algorithm's asymptotic time performance.
Edit 3:
Adam Crume makes the following claim:
For any function f(x), f(x) is in O(f(x)).
Proof: let S be a subset of R and T be a subset of R* (the non-negative real numbers), let f : S → T, and let c ≥ 1. Then 0 ≤ f(x) ≤ f(x), which leads to 0 ≤ f(x) ≤ c·f(x) for all x ∈ S. Therefore f(x) ∈ O(f(x)).
Specifically, if f(x) = 0 then f(x) ∈ O(0).
It takes the same amount of time to run regardless of the input, therefore it is O(1) by definition.
Several answers say that the complexity is O(1) because the time is a constant and the time is bounded by the product of some coefficient and 1. Well, it is true that the time is a constant and it is bounded that way, but that doesn't mean that the best answer is O(1).
Consider an algorithm that runs in linear time. It is ordinarily designated as O(n) but let's play devil's advocate. The time is bounded by the product of some coefficient and n^2. If we consider O(n^2) to be a set, the set of all algorithms whose complexity is small enough, then linear algorithms are in that set. But it doesn't mean that the best answer is O(n^2).
The empty algorithm is in O(n^2) and in O(n) and in O(1) and in O(0). I vote for O(0).
I have a very simple argument for the empty algorithm being O(0): For any function f(x), f(x) is in O(f(x)). Simply let f(x)=0, and we have that 0 (the runtime of the empty algorithm) is in O(0).
On a side note, I hate it when people write f(x) = O(g(x)), when it should be f(x) ∈ O(g(x)).
Big O is asymptotic notation. To use big O, you need a function - in other words, the expression must be parametrized by n, even if n is not used. It makes no sense to say that the number 5 is O(n), it's the constant function f(n) = 5 that is O(n).
So, to analyze time complexity in terms of big O you need a function of n. Your algorithm always makes, arguably, 0 steps, but without a varying parameter, talking about asymptotic behaviour makes no sense. Assume that your algorithm is parametrized by n. Only now may you use asymptotic notation. It makes no sense to say that it is O(n^2), or even O(1), if you don't specify what n is (or the variable hidden in O(1))!
As soon as you settle on the number of steps, it's a matter of the definition of big O: the function f(n) = 0 is O(0).
Since this is a low-level question it depends on the model of computation.
Under "idealistic" assumptions, it is possible you don't do anything.
But in Python, you cannot write def f(x): with an empty body; you have to write def f(x): pass. If you assume that every instruction, even pass (a NOP), takes time, then the complexity is f(n) = c for some constant c, and since c != 0 you can only say that f is O(1), not O(0).
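As a rough illustration of that model (using the standard-library timeit module; the function name empty is just an example), even a do-nothing Python function has a small but nonzero, input-independent cost per call:

import timeit

def empty(x):
    pass  # even "doing nothing" costs a function call and a no-op

# The measured per-call cost is a small nonzero constant and does not depend
# on the argument, which is why this model gives O(1) rather than O(0).
print(timeit.timeit("empty(0)", globals=globals(), number=1_000_000))
print(timeit.timeit("empty(10**9)", globals=globals(), number=1_000_000))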
It's worth noting that big O by itself does not have anything to do with algorithms. For example, you may say sin x = x + O(x^3) when discussing a Taylor expansion. Also, O(1) does not mean constant; it means bounded by a constant.
All of the answers so far address the question as if there is a right and a wrong answer. But there isn't. The question is a matter of definition. Usually in complexity theory the time cost is an integer --- although that too is just a definition. You're free to say that the empty algorithm that quits immediately takes 0 time steps or 1 time step. It's an abstract question because time complexity is an abstract definition. In the real world, you don't even have time steps, you have continuous physical time; it may be true that one CPU has clock cycles, but a parallel computer could easily have asynchronous clocks, and in any case a clock cycle is extremely small.
That said, I would say that it's more reasonable to say that the halt operation takes 1 time step rather than that it takes 0 time steps. It does seem more realistic. For many situations it's arguably very conservative, because the overhead of initialization is typically far greater than executing one arithmetic or logical operation. Giving the empty algorithm 0 time steps would only be reasonable to model, for example, a function call that is deleted by an optimizing compiler that knows that the function won't do anything.
It should be O(1). The coefficient is always 1.
Consider:
If something grows like 5n, you don't say O(5n), you say O(n) [in other words, O(1n)]
If something grows like 7n^2, you don't say O(7n^2), you say O(n^2) [in other words, O(1n^2)]
Likewise you should say O(1), not O(some other constant)
There is no such thing as O(0). Even an oracle machine or a hypercomputer requires time for at least one operation, e.g. solve(the_goldbach_conjecture); ergo:
All machines, theoretical or real, finite or infinite, produce algorithms with a minimum time complexity of O(1).
But then again, this code right here is O(0):
// Hello world!
:)
I would say it's O(1) by definition, but O(0) if you want to get technical about it: since O(k1g(n)) is equivalent to O(k2g(n)) for any constants k1 and k2, it follows that O(1 * 1) is equivalent to O(0 * 1), and therefore O(0) is equivalent to O(1).
However, the empty algorithm is not like, for example, the identity function, whose definition is something like "return your input". The empty algorithm is more like an empty statement, or whatever happens between two statements. Its definition is "do absolutely nothing with your input", presumably without even the implied overhead of simply having input.
Consequently, the complexity of the empty algorithm is unique in that O(0) has a complexity of zero times whatever function strikes your fancy, or simply zero. It follows that since the whole business is so wacky, and since O(0) doesn't already mean something useful, and since it's slightly ridiculous to even discuss such things, a reasonable special case for O(0) is something like this:
The complexity of the empty algorithm is O(0) in time and space. An algorithm with time complexity O(0) is equivalent to the empty algorithm.
So there you go.
Given the formal definition of Big O:
Let f(x) and g(x) be two functions defined over the set of real numbers. Then, we write:
f(x) = O(g(x)) as x approaches infinity iff there exists a real M and a real x0 so that:
|f(x)| <= M * |g(x)| for every x > x0
As I see it, if we substitute g(x) = 0 (in order to have a program with complexity O(0)), we must have:
|f(x)| <= 0, for every x > x0 (the constraint of existence of a real M and x0 is practically lifted here)
which can only be true when f(x) = 0.
So I would say that not only is the empty program O(0), but it is the only one for which that holds. Intuitively, this should have been expected, since O(1) encompasses all algorithms that require a constant number of steps regardless of the size of their task, including 0. It's essentially useless to talk about O(0); it's already contained in O(1). I suspect it's purely out of simplicity of definition that we use O(1), where it could just as well be O(c) or something similar.
0 = O(f) for every function f, since 0 <= |f|, so it is also O(0).
Not only is this a perfectly sensible question, but it is important in certain situations involving amortized analysis, especially when "cost" means something other than "time" (for example, "atomic instructions").
Let's say there is a data structure featuring multiple operation types, for which an amortized analysis is being conducted. It could well happen that one type of operation can always be funded fully using "coins" deposited during previous operations.
There is a simple example of this: the "multipop stack" described in Cormen, Leiserson, Rivest, Stein [CLRS09, 17.2, p. 457], and also on Wikipedia. Each time an item is pushed, a coin is put on the item, for a total amortized cost of 2. When (multi)pops occur, they can be fully paid for by taking one coin from each item popped, so the amortized cost of MULTIPOP(k) is O(0). To wit:
Note that the amortized cost of MULTIPOP is a constant (0)
...
Moreover, we can also charge MULTIPOP operations nothing. To pop the
first plate, we take the dollar of credit off the plate and use it to
pay the actual cost of a POP operation. To pop a second plate, we
again have a dollar of credit on the plate to pay for the POP
operation, and so on. Thus, we have always charged enough up front to
pay for MULTIPOP operations. In other words, since each plate on the
stack has 1 dollar of credit on it, and the stack always has a
nonnegative number of plates, we have ensured that the amount of
credit is always nonnegative.
Thus O(0) is an important "complexity class" for certain amortized operations.
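For concreteness, here is a small sketch of that multipop stack with the coin accounting tracked explicitly (the class and attribute names are my own, not CLRS's): PUSH is charged an amortized 2, MULTIPOP an amortized 0, and the banked credit never goes negative.

class MultipopStack:
    def __init__(self):
        self.items = []
        self.credit = 0        # one banked "coin" per item currently on the stack

    def push(self, x):         # amortized cost 2: 1 for the push, 1 banked
        self.items.append(x)
        self.credit += 1

    def multipop(self, k):     # amortized cost 0: every pop spends a banked coin
        popped = []
        while self.items and k > 0:
            popped.append(self.items.pop())
            self.credit -= 1
            k -= 1
        assert self.credit >= 0   # the credit is always nonnegative
        return popped

s = MultipopStack()
for i in range(5):
    s.push(i)
print(s.multipop(3), s.credit)   # [4, 3, 2] 2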
O(1) means the algorithm's time complexity is always constant.
Let's say we have this algorithm (in C):
int doSomething(int n[])
{
    int x = n[0]; // This line accesses an array position, so it takes time.
    int y = n[1]; // Same here.
    return x + y;
}
I am ignoring the fact that the array could have less than 2 positions, just to keep it simple.
If we count the 2 most expensive lines, we have a total time of 2.
2 = O(1), because:
2 <= c * 1, if c = 2, for every n > 1
If we have this code:
void doNothing() { }
and we count it as having 0 expensive lines, there is no difference in saying it has O(0), O(1), or O(1000), because for every one of these functions we can prove the same theorem.
Normally, if the algorithm takes a constant number of steps to complete, we say it has O(1) time complexity.
I guess this is just a convention, because you could use any constant number to represent the function inside the O().
No. It's O(c) by convention whenever you don't have dependence on input size, where c is any positive constant (typically 1 is used - O(1) = O(12.37)).
