Related
I‘m trying to wrap my head around the meaning of the Landau-Notation in the context of analysing an algorithm‘s complexity.
What exactly does the O formally mean in Big-O-Notation?
So the way I understand it is that O(g(x)) gives a set of functions which grow as rapidly or slower as g(x), meaning, for example in the case of O(n^2):
where t(x) could be, for instance, x + 3 or x^2 + 5. Is my understanding correct?
Furthermore, are the following notations correct?
I saw the following written down by a tutor. What does this mean? How can you use less or equal, if the O-Notation returns a set?
Could I also write something like this?
So the way I understand it is that O(g(x)) gives a set of functions which grow as rapidly or slower as g(x).
This explanation of Big-Oh notation is correct.
f(n) = n^2 + 5n - 2, f(n) is an element of O(n^2)
Yes, we can say that. O(n^2) in plain English, represents "set of all functions that grow as rapidly as or slower than n^2". So, f(n) satisfies that requirement.
O(n) is a subset of O(n^2), O(n^2) is a subset of O(2^n)
This notation is correct and it comes from the definition. Any function that is in O(n), is also in O(n^2) since growth rate of it is slower than n^2. 2^n is an exponential time complexity, whereas n^2 is polynomial. You can take limit of n^2 / 2^n as n goes to infinity and prove that O(n^2) is a subset of O(2^n) since 2^n grows bigger.
O(n) <= O(n^2) <= O(2^n)
This notation is tricky. As explained here, we don't have "less than or equal to" for sets. I think tutor meant that time complexity for the functions belonging to the set O(n) is less than (or equal to) the time complexity for the functions belonging to the set O(n^2). Anyways, this notation doesn't really seem familiar, and it's best to avoid such ambiguities in textbooks.
O(g(x)) gives a set of functions which grow as rapidly or slower as g(x)
That's technically right, but a bit imprecise. A better description contains the addenda
O(g(x)) gives the set of functions which are asymptotically bounded above by g(x), up to constant factors.
This may seem like a nitpick, but one inference from the imprecise definition is wrong.
The 'fixed version' of your first equation, if you make the variables match up and have one limit sign, seems to be:
This is incorrect: the ratio only has to be less than or equal to some fixed constant c > 0.
Here is the correct version:
where c is some fixed positive real number, that does not depend on n.
For example, f(x) = 3 (n^2) is in O(n^2): one constant c that works for this f is c = 4. Note that the requirement isn't 'for all c > 0', but rather 'for at least one constant c > 0'
The rest of your remarks are accurate. The <= signs in that expression are an unusual usage, but it's true if <= means set inclusion. I wouldn't worry about that expression's meaning.
There's other, more subtle reasons to talk about 'boundedness' rather than growth rates. For instance, consider the cosine function. |cos(x)| is in O(1), but its derivative fluctuates from negative one to positive one even as x increases to infinity.
If you take 'growth rate' to mean something like the derivative, example like these become tricky to talk about, but saying |cos(x)| is bounded by 2 is clear.
For an even better example, consider the logistic curve. The logistic function is O(1), however, its derivative and growth rate (on positive numbers) is positive. It is strictly increasing/always growing, while 1 has a growth rate of 0. This seems to conflict with the first definition without lots of additional clarifying remarks of what 'grow' means.
An always growing function in O(1) (image from the Wikipedia link):
Suppose that I have 2 nested for loops, and 1 array of size N as shown in my code below:
int result = 0;
for( int i = 0; i < N ; i++)
{
for( int j = i; j < N ; j++)
{
result = array[i] + array[j]; // just some funny operation
}
}
Here are 2 cases:
(1) if the constraint is that N >= 1,000,000 strictly, then we can definitely say that the time complexity is O(N^2). This is true for sure as we all know.
(2) Now, if the constraint is that N < 25 strictly, then people could probably say that because we know that definitely, N is always too small, the time complexity is estimated to be O(1) since it takes very little time to run and complete these 2 for loops WITH MODERN COMPUTERS ? Does that sound right ?
Please tell me if the value of N plays a role in deciding the outcome of the time complexity O(N) ? If yes, then how big the value N needs to be in order to play that role (1,000 ? 5,000 ? 20,000 ? 500,000 ?) In other words, what is the general rule of thumb here ?
INTERESTING THEORETICAL QUESTION: If 15 years from now, the computer is so fast that even if N = 25,000,000, these 2 for loops can be completed in 1 second. At that time, can we say that the time complexity would be O(1) even for N = 25,000,000 ? I suppose the answer would be YES at that time. Do you agree ?
tl:dr No. The value of N has no effect on time complexity. O(1) versus O(N) is a statement about "all N" or how the amount of computation increases when N increases.
Great question! It reminds me of when I was first trying to understand time complexity. I think many people have to go through a similar journey before it ever starts to make sense so I hope this discussion can help others.
First of all, your "funny operation" is actually funnier than you think since your entire nested for-loops can be replaced with:
result = array[N - 1] + array[N - 1]; // just some hilarious operation hahaha ha ha
Since result is overwritten each time, only the last iteration effects the outcome. We'll come back to this.
As far as what you're really asking here, the purpose of Big-O is to provide a meaningful way to compare algorithms in a way that is indenependent of input size and independent of the computer's processing speed. In other words, O(1) versus O(N) has nothing to with the size of N and nothing to do with how "modern" your computer is. That all effects execution time of the algorithm on a particular machine with a particular input, but does not effect time complexity, i.e. O(1) versus O(N).
It is actually a statement about the algorithm itself, so a math discussion is unavoidable, as dxiv has so graciously alluded to in his comment. Disclaimer: I'm going to omit certain nuances in the math since the critical stuff is already a lot to explain and I'll defer to the mountains of complete explanations elsewhere on the web and textbooks.
Your code is a great example to understand what Big-O does tell us. The way you wrote it, its complexity is O(N^2). That means that no matter what machine or what era you run your code in, if you were to count the number of operations the computer has to do, for each N, and graph it as a function, say f(N), there exists some quadratic function, say g(N)=9999N^2+99999N+999 that is greater than f(N) for all N.
But wait, if we just need to find big enough coefficients in order for g(N) to be an upper bound, can't we just claim that the algorithm is O(N) and find some g(N)=aN+b with gigantic enough coefficients that its an upper bound of f(N)??? THE ANSWER TO THIS IS THE MOST IMPORTANT MATH OBSERVATION YOU NEED TO UNDERSTAND TO REALLY UNDERSTAND BIG-O NOTATION. Spoiler alert. The answer is no.
For visuals, try this graph on Desmos where you can adjust the coefficients:[https://www.desmos.com/calculator/3ppk6shwem][1]
No matter what coefficients you choose, a function of the form aN^2+bN+c will ALWAYS eventually outgrow a function of the form aN+b (both having positive a). You can push a line as high as you want like g(N)=99999N+99999, but even the function f(N)=0.01N^2+0.01N+0.01 crosses that line and grows past it after N=9999900. There is no linear function that is an upper bound to a quadratic. Similarly, there is no constant function that is an upper bound to a linear function or quadratic function. Yet, we can find a quadratic upper bound to this f(N) such as h(N)=0.01N^2+0.01N+0.02, so f(N) is in O(N^2). This observation is what allows us to just say O(1) and O(N^2) without having to distinguish between O(1), O(3), O(999), O(4N+3), O(23N+2), O(34N^2+4+e^N), etc. By using phrases like "there exists a function such that" we can brush all the constant coefficients under the rug.
So having a quadratic upper bound, aka being in O(N^2), means that the function f(N) is no bigger than quadratic and in this case happens to be exactly quadratic. It sounds like this just comes down to comparing the degree of polynomials, why not just say that the algorithm is a degree-2 algorithm? Why do we need this super abstract "there exists an upper bound function such that bla bla bla..."? This is the generalization necessary for Big-O to account for non-polynomial functions, some common ones being logN, NlogN, and e^N.
For example if the number of operations required by your algorithm is given by f(N)=floor(50+50*sin(N)), we would say that it's O(1) because there is a constant function, e.g. g(N)=101 that is an upper bound to f(N). In this example, you have some bizarre algorithm with oscillating execution times, but you can convey to someone else how much it doesn't slow down for large inputs by simply saying that it's O(1). Neat. Plus we have a way to meaningfully say that this algorithm with trigonometric execution time is more efficient than one with linear complexity O(N). Neat. Notice how it doesn't matter how fast the computer is because we're not measuring in seconds, we're measuring in operations. So you can evaluate the algorithm by hand on paper and it's still O(1) even if it takes you all day.
As for the example in your question, we know it's O(N^2) because there are aN^2+bN+c operations involved for some a, b, c. It can't be O(1) because no matter what aN+b you pick, I can find a large enough input size N such that your algorithm requires more than aN+b operations. On any computer, in any time zone, with any chance of rain outside. Nothing physical effects O(1) versus O(N) versus (N^2). What changes it to O(1) is changing the algorithm itself to the one-liner that I provided above where you just add two numbers and spit out the result no matter what N is. Let's say for N=10 it takes 4 operations to do both array lookups, the addition, and the variable assignment. If you run it again on the same machine with N=10000000 it's still doing the same 4 operations. The amount of operations required by the algorithm doesn't grow with N. That's why the algorithm is O(1).
It's why problems like finding a O(NlogN) algorithm to sort an array are math problems and not nano-technology problems. Big-O doesn't even assume you have a computer with electronics.
Hopefully this rant gives you a hint as to what you don't understand so you can do more effective studying for a complete understanding. There's no way to cover everything needed in one post here. It was some good soul-searching for me, so thanks.
ok so basically i need to understand how i can compare functions so that i can find big O big theta and big omega for algorithms of a program
my mathematics background is not very strong but i have the foundations down
and my question is
is there a mathematical way to find where two functions will intersect and eventually one dominate the other from some point n
for example if i have a function
2n^2 and 64nlog(n) [with log to base 2]
how can i find at which values of n, 2n^2 will upper bound( hope i used the correct term here) 64nlog(n) and also how to apply this to any other function
is it just simply guess work?
You just need to find the intercepts of the two functions. So set them equal to each other and solve for n. Or just plot them and get a rough idea of the intercepts so you know at what size of data you should switch between each algorithm. Also, Wolfram Alpha has useful function solving and plotting tools too, just in case you are a bit rusty on math.
Big O, Big Theta, and Big Omega are about general sorts of growth patterns for algorithms. Each algorithm can be implemented an infinite number of ways. Each implementation may have a specific execution time, which relates to the Big O of the algorithm. You can compare the execution-time-per-input-size of two implementations of an algorithm, but not for an algorithm by itself. Big O, Big Theta, and Big Omega say next-to-nothing about the execution-time of an algorithm. As such, you cannot even guess a specific intersection point of where one algorithm becomes faster, because such a concept makes no sense. We can discuss intersection points in theory, but not in detail, because it has no value for algorithms, only implementations.
It's also important to note that Big O (and similar) have no constant factors, like your functions do.
http://en.wikipedia.org/wiki/Big_O_notation
You can divide out an n on both sides, and a common factor 2, and then solve this:
32 log(n) <= c n for n >= n_0
Let's say c = 32, because then it's log(n) <= n (which looks nice), and n_0 can be 1.
But this sort of thing has already been done. Unless you are in CS class, you can just say that O(n log n) is a subset of O(n2) without proving it, it's well-known anyway.
For big-O, big-Theta, and big-Omega, you don't have to find the intersection point. You just have to prove that one function eventually dominates the other (up to constant factors). One useful tool for this for example is L'Hopital's rule: Take the ratio of the two running time functions and compute the limit as n goes to infinity, and if the limit is infinity/infinity then you can apply L'Hopital's rule to get that the limit of the ratio is equal to the limit of the ratio of derivatives, and sometimes get a solution. For example this shows that log(n) = O(n^c) for any c > 0.
I am just starting to learn the big O concept. What I learned is that if a function f is less than or equal to another constant multiple of function g, then f is O(g).
Now I came across an example in which a string of size "n" takes "2n" (double the size of input) steps of algorithm. So they say the time taken is O(2n) but then they follow this statement by saying As O(2n)=O(n), time complexity is O(n).
I dont understand this. As 2n will always be greater than n, how can we ignore the multple of 2 then? Anything less than or equal to 2n will not necessarily be less than n!
Doesn't it mean that we are somehow equating n and 2n? Sounds confusing. Please clarify in simplest possible way as I am just a beginner in this concept.
Best Regards :)
Big-O and related notations are intended to capture the aspects of algorithm performance that are most inherent to the algorithm, independent of how it is being run and measured.
Constant multipliers depend on the unit of measurement, seconds vs. microseconds vs. instructions vs. loop iterations. Even measured in the same units they will be different if measured on different systems. The same algorithm may take 20n instructions in one instruction set, 30n instructions on another. It may take 0.5n microseconds on one, 10n microseconds on another.
Many of the basic algorithm complexities you will see in the literature were calculated decades ago, but remain meaningful across significant changes in processor architecture and even more significant changes in performance.
Similar considerations apply to start-up and similar overheads.
A f(n) is O(n) if there exist constants N and c such that, for all n>=N, f(n) <= cn. For f(n) = 2n the constants are N=0 and c = 2. The first constant, N, is about ignoring overhead, the second, c, is about ignoring constant multipliers.
... As 2n will always be greater than n, how can we ignore the multple of 2 then? ...
Simply put, with growing n the multiplier loses its importance. The asymptotic behavior of a function describes what happens when n gets large.
Maybe it helps to consider not just O(n) and O(2n), because they are in the same class, but to contrast it with some other common classes. Example: Any O(n^2) algorithm will take longer than any O(n), in the long run (in the short run, their running times might even be reversed). Say you have two algorithms, one with linear time complexity of 100n and another with 8n^2. The quadratic algorithm will be faster for all n =< 12, but slower for all n > 12.
This property – that for any fixed nonnegative c and d you'll find an n, so that cn < dn^2 – constitues a part of the hierarchy of time complexities.
As you alluded to in your first paragraph, the time required to execute the algorithm is proportional to a constant multiple of the input size. You can think of O(n), to be O(C*n), where C is any constant multiplier.
So given the following program:
Is the time complexity of this program O(0)? In other words, is 0 O(0)?
I thought answering this in a separate question would shed some light on this question.
EDIT: Lots of good answers here! We all agree that 0 is O(1). The question is, is 0 O(0) as well?
From Wikipedia:
A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.
From this description, since the empty algorithm requires 0 time to execute, it has an upper bound performance of O(0). This means, it's also O(1), which happens to be a larger upper bound.
Edit:
More formally from CLR (1ed, pg 26):
For a given function g(n), we denote O(g(n)) the set of functions
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }
The asymptotic time performance of the empty algorithm, executing in 0 time regardless of the input, is therefore a member of O(0).
Edit 2:
We all agree that 0 is O(1). The question is, is 0 O(0) as well?
Based on the definitions, I say yes.
Furthermore, I think there's a bit more significance to the question than many answers indicate. By itself the empty algorithm is probably meaningless. However, whenever a non-trivial algorithm is specified, the empty algorithm could be thought of as lying between consecutive steps of the algorithm being specified as well as before and after the algorithm steps. It's nice to know that "nothingness" does not impact the algorithm's asymptotic time performance.
Edit 3:
Adam Crume makes the following claim:
For any function f(x), f(x) is in O(f(x)).
Proof: let S be a subset of R and T be a subset of R* (the non-negative real numbers) and let f(x):S ->T and c ≥ 1. Then 0 ≤ f(x) ≤ f(x) which leads to 0 ≤ f(x) ≤ cf(x) for all x∈S. Therefore f(x) ∈ O(f(x)).
Specifically, if f(x) = 0 then f(x) ∈ O(0).
It takes the same amount of time to run regardless of the input, therefore it is O(1) by definition.
Several answers say that the complexity is O(1) because the time is a constant and the time is bounded by the product of some coefficient and 1. Well, it is true that the time is a constant and it is bounded that way, but that doesn't mean that the best answer is O(1).
Consider an algorithm that runs in linear time. It is ordinarily designated as O(n) but let's play devil's advocate. The time is bounded by the product of some coefficient and n^2. If we consider O(n^2) to be a set, the set of all algorithms whose complexity is small enough, then linear algorithms are in that set. But it doesn't mean that the best answer is O(n^2).
The empty algorithm is in O(n^2) and in O(n) and in O(1) and in O(0). I vote for O(0).
I have a very simple argument for the empty algorithm being O(0): For any function f(x), f(x) is in O(f(x)). Simply let f(x)=0, and we have that 0 (the runtime of the empty algorithm) is in O(0).
On a side note, I hate it when people write f(x) = O(g(x)), when it should be f(x) ∈ O(g(x)).
Big O is asymptotic notation. To use big O, you need a function - in other words, the expression must be parametrized by n, even if n is not used. It makes no sense to say that the number 5 is O(n), it's the constant function f(n) = 5 that is O(n).
So, to analyze time complexity in terms of big O you need a function of n. Your algorithm always makes arguably 0 steps, but without a varying parameter talking about asymptotic behaviour makes no sense. Assume that your algorithm is parametrized by n. Only now you may use asymptotic notation. It makes no sense to say that it is O(n2), or even O(1), if you don't specify what is n (or the variable hidden in O(1))!
As soon as you settle on the number of steps, it's a matter of the definition of big O: the function f(n) = 0 is O(0).
Since this is a low-level question it depends on the model of computation.
Under "idealistic" assumptions, it is possible you don't do anything.
But in Python, you cannot say def f(x):, but only def f(x): pass. If you assume that every instruction, even pass (NOP), takes time, then the complexity is f(n) = c for some constant c, and unless c != 0 you can only say that f is O(1), not O(0).
It's worth noting big O by itself does not have anything to do with algorithms. For example, you may say sin x = x + O(x3) when discussing Taylor expansion. Also, O(1) does not mean constant, it means bounded by constant.
All of the answers so far address the question as if there is a right and a wrong answer. But there isn't. The question is a matter of definition. Usually in complexity theory the time cost is an integer --- although that too is just a definition. You're free to say that the empty algorithm that quits immediately takes 0 time steps or 1 time step. It's an abstract question because time complexity is an abstract definition. In the real world, you don't even have time steps, you have continuous physical time; it may be true that one CPU has clock cycles, but a parallel computer could easily have asynchronoous clocks and in any case a clock cycle is extremely small.
That said, I would say that it's more reasonable to say that the halt operation takes 1 time step rather than that it takes 0 time steps. It does seem more realistic. For many situations it's arguably very conservative, because the overhead of initialization is typically far greater than executing one arithmetic or logical operation. Giving the empty algorithm 0 time steps would only be reasonable to model, for example, a function call that is deleted by an optimizing compiler that knows that the function won't do anything.
It should be O(1). The coefficient is always 1.
Consider:
If something grows like 5n, you don't say O(5n), you say O(n) [in other words, O(1n)]
If something grows like 7n^2, you don't say O(7n^2), you say O(n^2) [in other words, O(1n^2)]
Likewise you should say O(1), not O(some other constant)
There is no such thing as O(0). Even an oracle machine or a hypercomputer require the time for one operation, i.e. solve(the_goldbach_conjecture), ergo:
All machines, theoretical or real, finite or infinite produce algorithms with a minimum time complexity of O(1).
But then again, this code right here is O(0):
// Hello world!
:)
I would say it's O(1) by definition, but O(0) if you want to get technical about it: since O(k1g(n)) is equivalent to O(k2g(n)) for any constants k1 and k2, it follows that O(1 * 1) is equivalent to O(0 * 1), and therefore O(0) is equivalent to O(1).
However, the empty algorithm is not like, for example, the identity function, whose definition is something like "return your input". The empty algorithm is more like an empty statement, or whatever happens between two statements. Its definition is "do absolutely nothing with your input", presumably without even the implied overhead of simply having input.
Consequently, the complexity of the empty algorithm is unique in that O(0) has a complexity of zero times whatever function strikes your fancy, or simply zero. It follows that since the whole business is so wacky, and since O(0) doesn't already mean something useful, and since it's slightly ridiculous to even discuss such things, a reasonable special case for O(0) is something like this:
The complexity of the empty algorithm is O(0) in time and space. An algorithm with time complexity O(0) is equivalent to the empty algorithm.
So there you go.
Given the formal definition of Big O:
Let f(x) and g(x) be two functions defined over the set of real numbers. Then, we write:
f(x) = O(g(x)) as x approaches infinity iff there exists a real M and a real x0 so that:
|f(x)| <= M * |g(x)| for every x > x0
As I see it, if we substitute g(x) = 0 (in order to have a program with complexity O(0)), we must have:
|f(x)| <= 0, for every x > x0 (the constraint of existence of a real M and x0 is practically lifted here)
which can only be true when f(x) = 0.
So I would say that not only the empty program is O(0), but it is the only one for which that holds. Intuitively, this should've been true since O(1) encompasses all algorithms that require a constant number of steps regardless of the size of its task, including 0. It's essentially useless to talk about O(0); it's already in O(1). I suspect it's purely out of simplicity of definition that we use O(1), where it could as well be O(c) or something similar.
0 = O(f) for all function f, since 0 <= |f|, so it is also O(0).
Not only is this a perfectly sensible question, but it is important in certain situations involving amortized analysis, especially when "cost" means something other than "time" (for example, "atomic instructions").
Let's say there is a datastructure featuring multiple operation types, for which an amortized analysis is being conducted. It could well happen that one type of operation can always be funded fully using "coins" deposited during previous operations.
There is a simple example of this: the "multipop queue" described in Cormen, Leiserson, Rivest, Stein [CLRS09, 17.2, p. 457], and also on Wikipedia. Each time an item is pushed, a coin is put on the item, for a total amortized cost of 2. When (multi) pops occur, they can be fully paid for by taking one coin from each item popped, so the amortized cost of MULTIPOP(k) is O(0). To wit:
Note that the amortized cost of MULTIPOP is a constant (0)
...
Moreover, we can also charge MULTIPOP operations nothing. To pop the
first plate, we take the dollar of credit off the plate and use it to
pay the actual cost of a POP operation. To pop a second plate, we
again have a dollar of credit on the plate to pay for the POP
operation, and so on. Thus, we have always charged enough up front to
pay for MULTIPOP operations. In other words, since each plate on the
stack has 1 dollar of credit on it, and the stack always has a
nonnegative number of plates, we have ensured that the amount of
credit is always nonnegative.
Thus O(0) is an important "complexity class" for certain amortized operations.
O(1) means the algorithm's time complexity is always constant.
Let's say we have this algorithm (in C):
void doSomething(int[] n)
{
int x = n[0]; // This line is accessing an array position, so it is time consuming.
int y = n[1]; // Same here.
return x + y;
}
I am ignoring the fact that the array could have less than 2 positions, just to keep it simple.
If we count the 2 most expensive lines, we have a total time of 2.
2 = O(1), because:
2 <= c * 1, if c = 2, for every n > 1
If we have this code:
public void doNothing(){}
And we count it as having 0 expansive lines, there is no difference in saying it has O(0) O(1), or O(1000), because for every one of these functions, we can prove the same theorem.
Normally, if the algorithm takes a constant number of steps to complete, we say it has O(1) time complexity.
I guess this is just a convention, because you could use any constant number to represent the function inside the O().
No. It's O(c) by convention whenever you don't have dependence on input size, where c is any positive constant (typically 1 is used - O(1) = O(12.37)).