This question already has answers here:
What is the difference between Θ(n) and O(n)?
(9 answers)
Closed 5 years ago.
I am a little confused in a specific case with the Big O notation and the Asymptotic behavior of algorithms. While I was reading the blog http://discrete.gr/complexity/ that describes these notations very well I came across this statement whether it is true or false:
A O( n ) algorithm is Θ( 1 )
The answer says that this may or may not be true depending on the algorithm. In the general case it's false. If an algorithm is Θ( 1 ), then it certainly is O( n ). But if it's O( n ) then it may not be Θ( 1 ). For example, a Θ( n ) algorithm is O( n ) but not Θ( 1 ).
I am trying a little hard to comprehend this answer. I understand that Big O implies that a program can asymptotically be no worse. So I interpret that above statement where O( n ) is worse than Θ( 1 ) and is true.
Can someone explain with an example?
If you know that a task requires exactly one week (= Θ(1)), you can surely say that you can do it in at most a year (= O(n)).
If you know that a task requires at most a year (= O(n)), you cannot be sure that you'll do it in a week (= Θ(1)).
If you know that a task requires exactly one year (= Θ(n)), you could also say that it requires at most a year (= O(n)), and you're sure it cannot be done in a week (≠ Θ(1)).
Consider the optimized bubble sort, which has an asymptotically tight upper bound of O(n2), and an asymptotically tight lower bound of Ω(n) (when the array is already sorted). The arrangement of items determines how many operations the algorithm will have to perform.
Contrast that to summing a list of integers. In this case, you always have to look at each item in the list exactly once. The upper bound is O(n), and the lower bound is Ω(n). There is no arrangement of items that will change the complexity of the algorithm. In this case, when the tight upper and lower bounds are the same, we say that the algorithmic complexity is Θ(n).
If an algorithm is Θ(n), then the complexity will never exceed O(n), and it will never be less than O(n). So it can't possibly be O(1) or Θ(1).
But if an algorithm is described as O(n), it could be Ω(1). For example, finding the first non-zero value in a list of integers. If the first item in the list is non-zero, then you don't have to look at any other numbers. But if the list is all zeroes, you end up looking at them all. So the upper bound is O(n) and the lower bound is Ω(1).
The example basically tries to cover two ideas:
If an algorithm is Θ(f(n)), it means it is both Ω(f(n)) and O(f(n)). It is asymptotically neither better nor worse than f(n), it has the same asymptotic behavior.
Functions that areO(1) can be seen as a subset of functions that are O(n). This can be generalized but no need to get too formal here I guess to avoid mathematically incorrect disasters from my side. It means if it can never do worse than a constant, then it can never do worse than a linear function.
So the idea is to break up the Θ to O and Ω notations. And then identify which is subset of which.
Also it's nice to notice that ANY algorithm (that has a non null complexity at least) is Ω(1). An algorithm can never do better than a constant.
Example Mammals are humans.
The answer: no, not in general. All humans are mammals though. Humans are a subset of mammals.
I tried to think of another example but it was either too technical or not clear enough. So I'll leave here this not so well drawn but rather clear graph here. I found by googling o omega theta and looking for images. There are a few other good images too.
(source: daveperrett.com)
Basically, for each function f in this graph: Anything above it is Ω(f(n)) because it can never do better than f, it can never be below it as n increases; Anything below it is O(f(n)) because it can never be worse than f, it can never be above it as n increases.
The graph doesn't show well the insignificance of constants asymptotically. There are other graphs that show it better. I just put it here because it had lots of functions at a time.
Related
I have a doubt in this particular question where the answer says that big-Oh(n^2) algorithm will not run faster than big-Oh(n^3) algorithm whereas if the notation in both cases was theta instead then it would have been true but why is that so?
I would love it if anyone could explain it to me in detail because I couldn't find any source from where my doubt could get clarified.
Part 1 paraphrased (note the phrasing in the question is ambiguous where "number" is quantified -- it has to be picked after you choose the two algorithms, but I assume that's what's intended).
Given functions f and g with f=Theta(n^2) and g=Theta(n^3), then there exists a number N such that f(n) < g(n) for all n>N.
Part 2 paraphrased:
Given functions f and g with f=O(n^2) and g=O(n^3), then for all n, f(n) < g(n).
1 is true, and you can prove it by application of the definition of big-Theta.
2 is false (as a general statement), and you can disprove it by finding a single example of f and g for which it is false. For example, f(n) = 2, g(n) = 1. Big O is a kind of upper bound, so these constant functions work. The counter-examples given in the question are f(n)=n, g(n)=log(n), but the same principle applies.
the answer says that big-Oh(n^2) algorithm will not run faster than big-Oh(n^3) algorithm
It is more subtle: an O(𝑛²) algorithm could run slower than a O(𝑛³) algorithm. It is not will, but could.
The answer gives one reason, but actually there are two:
Big O notation only gives an upper bound. Quoted from Wikipedia:
A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.
So anything that is O(𝑛) is also O(𝑛²) and O(𝑛³), but not necessarily vice versa. The answer says that an algorithm that is O(𝑛³) could maybe have a tighter bound that is O(log𝑛). True, this may sound silly, because why should one then say it is O(𝑛³) when it is also O(log𝑛)? Then it seems more reasonable to just talk about O(log𝑛). And this is what we commonly do. But there is also a second reason:
The second option does not have the contraint "n > number" in its claim. This is essential, because irrespective of time complexities, the running time of an algorithm for a given value of 𝑛 cannot be determined from its time complexity. An algorithm that is O(𝑛log𝑛) may take 10 seconds to do its job, while an algorithm that is O(𝑛²) may take 8 seconds to get the same result, even though its time complexity is worse. When comparing time complexities you only get information about asymptotic behaviour, i.e. when 𝑛 is larger than a large enough number.
Because this extra constraint is part of the first claim, it is indeed true.
I have some confusion regarding the Asymptotic Analysis of Algorithms.
I have been trying to understand this upper bound case, seen a couple of youtube videos. In one of them, there was an example of this equation
where we have to find the upper bound of the equation 2n+3. So, by looking at this, one can say that it is going o be O(n).
My first question :
In algorithmic complexity, we have learned to drop the constants and find the dominant term, so is this Asymptotic Analysis to prove that theory? or does it have other significance? otherwise, what is the point of this analysis when it is always going to be the biggest n in the equation, example- if it were n+n^2+3, then the upper bound would always be n^2 for some c and n0.
My second question :
as per rule the upper bound formula in Asymptotic Analysis must satisfy this condition f(n) = O(g(n)) IFF f(n) < c.g(n) where n>n0,c>0, n0>=1
i) n is the no of inputs, right? or does n represent the number of steps we perform? and does f(n) represents the algorithm?
ii) In the following video to prove upper bound of the equation 2n+3 could be n^2 the presenter considered c =1, and that is why to satisfy the equation n had to be >= 3 whereas one could have chosen c= 5 and n=1 as well, right? So then why were, in most cases in the video, the presenter was changing the value of n and not c to satisfy the conditions? is there a rule, or is it random? Can I change either c or n(n0) to satisfy the condition?
My Third Question:
In the same video, the presenter mentioned n0 (n not) is the number of steps. Is that correct? I thought n0 is the limit after which the graph becomes the upper bound (after n0, it satisfies the condition for all values of n); hence n0 also represents the input.
Would you please help me understand because people come up with different ideas in different explanations, and I want to understand them correctly?
Edit
The accepted answer clarified all of the questions except the first one. I have gone through many articles on the web, and here I am documenting my conclusion if anyone else has the same question. This will help them.
My first question was
In algorithmic complexity, we have learned to drop the constants and
find the dominant term, so is this Asymptotic Analysis to prove that
theory?
No, Asymptotic Analysis describes the algorithmic complexity, which is all about understanding or visualizing the Asymptotic behavior or the tail behavior of a function or a group of functions by plotting mathematical expression.
In computer science, we use it to evaluate (note: evaluate is not measuring) the performance of an algorithm in terms of input size.
for example, these two functions belong to the same group
mySet = set()
def addToMySet(n):
for i in range(n):
mySet.add(i*i)
mySet2 = set()
def addToMySet2(n):
for i in range(n):
for j in range(500):
mySet2.add(i*j)
Even though the execution time of the addToMySet2(n) is always > the execution time of addToMySet(n), the tail behavior of both of these functions would be the same with respect to the largest n, if one plot them in a graph the tendency of that graph for both of the functions would be linear thus they belong to the same group. Using Asymptotic Analysis, we get to see the behavior and group them.
A mistake that I made assuming upper bound represents the worst case. In reality, The upper bound of any algorithm is associated with all of the best, average, and worst cases. so the correct way of putting that would be
upper/lower bound in the best/average/worst case of an
algorithm
.
We can't relate the upper bound of an algorithm with the worst-case time complexity and the lower bound with the best-case complexity. However, an upper bound can be higher than the worst-case because upper bounds are usually asymptotic formulae that have been proven to hold.
I have seen this kind of question like find the worst-case time complexity of such and such algorithm, and the answer is either O(n) or O(n^2) or O(log-n), etc.
For example, if we consider the function addToMySet2(n), one would say the algorithmic time complexity of that function is O(n), which is technically wrong because there are three factors bound, bound type, (inclusive upper bound and strict upper bound) and case are involved determining the algorithmic time complexity.
When one denote O(n) it is derived from this Asymptotic Analysis f(n) = O(g(n)) IFF for any c>0, there is a n0>0 from which f(n) < c.g(n) (for any n>n0) so we are considering upper bound of best/average/worst case. In the above statement the case is missing.
I think We can consider, when not indicated, the big O notation generally describes an asymptotic upper bound on the worst-case time complexity. Otherwise, one can also use it to express asymptotic upper bounds on the average or best case time complexities
The whole point of asymptotic analysis is to compare algorithms performance scaling. For example, if I write two version of the same algorithm, one with O(n^2) time complexity and the other with O(n*log(n)) time complexity, I know for sure that the O(n*log(n)) one will be faster when n is "big". How big? it depends. You actually can't know unless you benchmark it. What you know is at some point, the O(n*log(n)) will always be better.
Now with your questions:
the "lower" n in n+n^2+3 is "dropped" because it is negligible when n scales up compared to the "dominant" one. That means that n+n^2+3 and n^2 behave the same asymptotically. It is important to note that even though 2 algorithms have the same time complexity, it does not mean they are as fast. For example, one could be always 100 times faster than the other and yet have the exact same complexity.
(i) n can be anything. It may be the size of the input (eg. an algorithm that sorts a list) but it may also be the input itself (eg. an algorithm that give the n-th prime number) or a number of iteration, etc
(ii) he could have taken any c, he chose c=1 as an example as he could have chosen c=1.618. Actually the correct formulation would be:
f(n) = O(g(n)) IFF for any c>0, there is a n0>0 from which f(n) < c.g(n) (for any n>n0)
the n0 from the formula is a pure mathematical construct. For c>0, it is the n value from which the function f is bounded by g. Since n can represent anything (size of a list, input value, etc), it is the same for n0
I had seen in one of the videos (https://www.youtube.com/watch?v=A03oI0znAoc&t=470s) that, If suppose f(n)= 2n +3, then BigO is O(n).
Now my question is if I am a developer, and I was given O(n) as upperbound of f(n), then how I will understand, what exact value is the upper bound. Because in 2n +3, we remove 2 (as it is a constant) and 3 (because it is also a constant). So, if my function is f(n) where n = 1, I can't say g(n) is upperbound where n = 1.
1 cannot be upperbound for 1. I find hard understanding this.
I know it is a partial (and probably wrong answer)
From Wikipedia,
Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation.
In your example,
f(n) = 2n+3 has the same growth rate as f(n) = n
If you plot the functions, you will see that both functions have the same linear growth; and as n -> infinity, the difference between the 2 gets minimal.
In Big O notation, f(n) = 2n+3 when n=1 means nothing; you need to look at the trend, not discreet values.
As a developer, you will consider big-O as a first indication for deciding which algorithm to use. If you have an algorithm which is say, O(n^2), you will try to understand whether there is another one which is, say, O(n). If the problem is inherently O(n^2), then the big-O notation will not provide further help and you will need to use other criterion for your decision. However, if the problem is not inherently O(n^2), but O(n), you should discard any algorithm that happen to be O(n^2) and find an O(n) one.
So, the big-O notation will help you to better classify the problem and then try to solve it with an algorithm whose complexity has the same big-O. If you are lucky enough as to find 2 or more algorithms with this complexity, then you will need to ponder them using a different criterion.
I am really very confused in asymptotic notations. As far as I know, Big-O notation is for worst cast, omega is for best case and theta is for average case. However, I have always seen Big O being used everywhere, even for best case. For e.g. in the following link, see the table where time complexities of different sorting algorithms are mentioned-
https://en.wikipedia.org/wiki/Best,_worst_and_average_case
Everywhere in the table, big O notation is used irrespective of whether it is best case, worst case or average case. Then what is the use of other two notations and where do we use it?
As far as I know, Big-O notation is for worst cast, omega is for best case and theta is for average case.
They aren't. Omicron is for (asymptotic) upper bound, omega is for lower bound and theta is for tight bound, which is both an upper and a lower bound. If the lower and upper bound of an algorithm are different, then the complexity cannot be expressed with theta notation.
The concept of upper,lower,tight bound are orthogonal to the concept of best,average,worst case. You can analyze the upper bound of each case, and you can analyze different bounds of the worst case (and also any other combination of the above).
Asymptotic bounds are always in relation to the set of variables in the expression. For example, O(n) is in relation to n. The best, average and worst cases emerge from everything else but n. For example, if n is the number of elements, then the different cases might emerge from the order of the elements, or the number of unique elements, or the distribution of values.
However, I have always seen Big O being used everywhere, even for best case.
That's because the upper bound is almost always the one that is the most important and interesting when describing an algorithm. We rarely care about the lower bound. Just like we rarely care about the best case.
The lower bound is sometimes useful in describing a problem that has been proven to have a particular complexity. For example, it is proven that worst case complexity of all general comparison sorting algorithms is Ω(n log n). If the sorting algorithm is also O(n log n), then by definition, it is also Θ(n log n).
Big O is for upper bound, not for worst case! There is no notation specifically for worst case/best case. The examples you are talking about all have Big O because they are all upper bounded by the given value. I suggest you take another look at the book from which you learned the basics because this is immensely important to understand :)
EDIT: Answering your doubt- because usually, we are bothered with our at-most performance i.e. when we say, our algorithm performs in O(logn) in the best case-scenario, we know that its performance will not be worse than logarithmic time in the given scenario. It is the upper bound that we seek to reduce usually and hence we usually mention big O to compare algorithms. (not to say that we never mention the other two)
O(...) basically means "not (much) slower than ...".
It can be used for all three cases ("the worst case is not slower than", "the best case is not slower than", and so on).
Omega is the oppsite: You can say, something can't be much faster than ... . Again, it can be used with all three cases. Compared to O(...), it's not that important, because telling someone "I'm certain my program is not faster than yours" is nothing to be proud of.
Theta is a combination: It's "(more or less) as fast as" ..., not just slower/faster.
The Big-O notation is somethin like this >= in terms of asymptotic equality.
For example if you see this :
x = O(x^2) it does say x <= x^2 (in asymptotic terms).
And it does mean "x is at most as complex as x^2", which is something that you are usually interesting it.
Even when you compare Best/Average case, you can say "At best possible input, I will have AT MOST this complexity".
There are two things mixed up: Big O, Omega, Theta, are purely mathematical constructions. For example, O (f (N)) is the set of functions which are less than c * f (n), for some c > 0, and for all n >= some minimum value N0. With that definition, n = O (f (n^4)), because n ≤ n^4. 100 = O (f (n)), because 100 <= n for n ≥ 100, or 100 <= 100 * n for n ≥ 1.
For an algorithm, you want to give worst case speed, average case speed, rarely the best case speed, sometimes amortised average speed (that's when running an algorithm once does work that can be used when it's run again. Like calculating n! for n = 1, 2, 3, ... where each calculation can take advantage of the previous one). And whatever speed you measure, you can give a result in one of the notations.
For example, you might have an algorithm where you can prove that the worst case is O (n^2), but you cannot prove whether there are faster special cases or not, and you also cannot prove that the algorithm isn't actually faster, like O (n^1.9). So O (n^2) is the only thing that you can prove.
So I've been trying to understand Big O notation as well as I can, but there are still some things I'm confused about. So I keep reading that if something is O(n), it usually is referring to the worst-case of an algorithm, but that it doesn't necessarily have to refer to the worst case scenario, which is why we can say the best-case of insertion sort for example is O(n). However, I can't really make sense of what that means. I know that if the worst-case is O(n^2), it means that the function that represents the algorithm in its worst case grows no faster than n^2 (there is an upper bound). But if you have O(n) as the best case, how should I read that as? In the best case, the algorithm grows no faster than n? What I picture is a graph with n as the upper bound, like
If the best case scenario of an algorithm is O(n), then n is the upper bound of how fast the operations of the algorithm grow in the best case, so they cannot grow faster than n...but wouldn't that mean that they can grow as fast as O(log n) or O(1), since they are below the upper bound? That wouldn't make sense though, because O(log n) or O(1) is a better scenario than O(n), so O(n) WOULDN'T be the best case? I'm so lost lol
Big-O, Big-Θ, Big-Ω are independent from worst-case, average-case, and best-case.
The notation f(n) = O(g(n)) means f(n) grows no more quickly than some constant multiple of g(n).
The notation f(n) = Ω(g(n)) means f(n) grows no more slowly than some constant multiple of g(n).
The notation f(n) = Θ(g(n)) means both of the above are true.
Note that f(n) here may represent the best-case, worst-case, or "average"-case running time of a program with input size n.
Furthermore, "average" can have many meanings: it can mean the average input or the average input size ("expected" time), or it can mean in the long run (amortized time), or both, or something else.
Often, people are interested in the worst-case running time of a program, amortized over the running time of the entire program (so if something costs n initially but only costs 1 time for the next n elements, it averages out to a cost of 2 per element). The most useful thing to measure here is the least upper bound on the worst-case time; so, typically, when you see someone asking for the Big-O of a program, this is what they're looking for.
Similarly, to prove a problem is inherently difficult, people might try to show that the worst-case (or perhaps average-case) running time is at least a certain amount (for example, exponential).
You'd use Big-Ω notation for these, because you're looking for lower bounds on these.
However, there is no special relationship between worst-case and Big-O, or best-case and Big-Ω.
Both can be used for either, it's just that one of them is more typical than the other.
So, upper-bounding the best case isn't terribly useful. Yes, if the algorithm always takes O(n) time, then you can say it's O(n) in the best case, as well as on average, as well as the worst case. That's a perfectly fine statement, except the best case is usually very trivial and hence not interesting in itself.
Furthermore, note that f(n) = n = O(n2) -- this is technically correct, because f grows more slowly than n2, but it is not useful because it is not the least upper bound -- there's a very obvious upper bound that's more useful than this one, namely O(n). So yes, you're perfectly welcome to say the best/worst/average-case running time of a program is O(n!). That's mathematically perfectly correct. It's just useless, because when people ask for Big-O they're interested in the least upper bound, not just a random upper bound.
It's also worth noting that it may simply be insufficient to describe the running-time of a program as f(n). The running time often depends on the input itself, not just its size. For example, it may be that even queries are trivially easy to answer, whereas odd queries take a long time to answer.
In that case, you can't just give f as a function of n -- it would depend on other variables as well. In the end, remember that this is just a set of mathematical tools; it's your job to figure out how to apply it to your program and to figure out what's an interesting thing to measure. Using tools in a useful manner needs some creativity, and math is no exception.
Informally speaking, best case has O(n) complexity means that when the input meets
certain conditions (i.e. is best for the algorithm performed), then the count of
operations performed in that best case, is linear with respect to n (e.g. is 1n or 1.5n or 5n).
So if the best case is O(n), usually this means that in the best case it is exactly linear
with respect to n (i.e. asymptotically no smaller and no bigger than that) - see (1). Of course,
if in the best case that same algorithm can be proven to perform at most c * log N operations
(where c is some constant), then this algorithm's best case complexity would be informally
denoted as O(log N) and not as O(N) and people would say it is O(log N) in its best case.
Formally speaking, "the algorithm's best case complexity is O(f(n))"
is an informal and wrong way of saying that "the algorithm's complexity
is Ω(f(n))" (in the sense of the Knuth definition - see (2)).
See also:
(1) Wikipedia "Family of Bachmann-Landau notations"
(2) Knuth's paper "Big Omicron and Big Omega and Big Theta"
(3)
Big Omega notation - what is f = Ω(g)?
(4)
What is the difference between Θ(n) and O(n)?
(5)
What is a plain English explanation of "Big O" notation?
I find it easier to think of O() as about ratios than about bounds. It is defined as bounds, and so that is a valid way to think of it, but it seems a bit more useful to think about "if I double the number/size of inputs to my algorithm, does my processing time double (O(n)), quadruple (O(n^2)), etc...". Thinking about it that way makes it a little bit less abstract - at least to me...