Hi, I have started learning algorithm analysis, and I have a doubt about asymptotic analysis.
Let's say I have a function f(n) = 5n^3 + 2n^2 + 23.
Now I need to find the Big-Oh, Big-Omega and Theta notations for the above function.
Big-Oh:
f(n) <= (5 + 2 + 23) n^3 // raising every term to n^3 gives a value that is always at least f(n) for n >= 1
f(n) <= 30n^3
f(n) belongs to Big-Oh(n^3)
Big-Omega:
n^3 <= f(n)
f(n) belongs to Big-Omega(n^3)
Theta:
n^3 <= f(n) <= 30 n^3
f(n) belongs to Theta ( n^3)
So here,
f(n) belongs to Big-Oh(n^3)
f(n) belongs to Big-Omega(n^3)
f(n) belongs to Theta(n^3)
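(As a quick sanity check of these constants, one can just evaluate both sides; this is a small Python snippet, purely illustrative, with f standing for the example function above:)

    # Illustrative check: for n >= 1 the chosen constants sandwich f(n)
    # between n^3 and 30*n^3, matching the derivation above.
    def f(n):
        return 5*n**3 + 2*n**2 + 23

    for n in [1, 10, 100, 1000]:
        assert n**3 <= f(n) <= 30 * n**3
        print(n, n**3, f(n), 30 * n**3)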
Like this, for any polynomial the order of growth for the Oh, Omega and Theta notations is the same (in our case it is the order of n^3).
When the order of growth is the same for all the notations, then what is the use of showing them with different notations, and where exactly can they be used? Kindly give me a practical example if possible.
Big Theta (Θ) is when our upper bound (O) and lower bound (Ω) are the same; in other words, it's a tight bound. That's one reason to show both O and Ω (or all three).
Why is this useful? Because Θ is a tight bound, it's much stronger than O. With O you can say that the above is O(n^1000) and you are still technically correct. A lot of times O != Ω and you don't have that tight bound.
Why do we usually talk about O, then? Well, because in most cases we are interested in an upper bound (on the worst-case scenario) of our algorithm. Sometimes we simply don't know the Θ and we go with O instead. Also, it's important to notice that many people simply misuse those symbols, don't fully understand them and/or are simply not precise enough, and use O in places where Θ could be used.
For instance, quicksort does not have a tight bound (unless we are talking specifically about either best/average or worst-case analysis), as it's Ω(n log n) in the best and average cases but O(n^2) in the worst case. On the other hand, mergesort is both Ω(n log n) and O(n log n), therefore it's also Θ(n log n).
All in all, it's all theoretical, as quicksort in practice is in most cases faster: you usually don't hit the worst case, and the operations done by quicksort are cheaper.
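To see the quicksort contrast concretely, here is a minimal Python sketch of my own (illustrative only, not a benchmark) that counts partition comparisons for quicksort with a deterministic first-element pivot: on an already-sorted list the count grows roughly like n^2/2, while on a shuffled list it grows roughly like n log n:

    import random

    def quicksort_comparisons(a):
        """Count partition comparisons made by quicksort with the FIRST element as pivot."""
        count = 0
        def qs(lst):
            nonlocal count
            if len(lst) <= 1:
                return lst
            pivot, rest = lst[0], lst[1:]
            count += len(rest)                 # one pivot comparison per partitioned element
            left = [x for x in rest if x < pivot]
            right = [x for x in rest if x >= pivot]
            return qs(left) + [pivot] + qs(right)
        qs(a)
        return count

    for n in [100, 200, 400]:
        already_sorted = list(range(n))        # worst case for this pivot choice: ~n^2/2
        shuffled = already_sorted[:]
        random.shuffle(shuffled)               # typical case: ~n log n
        print(n, quicksort_comparisons(already_sorted), quicksort_comparisons(shuffled))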
In the example you gave, the execution time is known and fixed. So it is in O(n^3), Ω(n^3) and thus in Θ(n^3) for all cases.
However, for an algorithm, the execution time may, and not so rarely does, depend on the input the algorithm is running on.
E.g., searching for a key in a linked list requires going through all the list members in the worst case, and that's linear time.
In the best case, the key you're looking for is at the very beginning of the list, and that's constant time.
So the algorithm is in O(n) and in Ω(1). There's no single f(n) for which we can specify Θ(f(n)) for this algorithm.
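A tiny sketch of that search (Python, with an ordinary list standing in for the linked list; the function name and values are just illustrative):

    def search_steps(items, key):
        """Linear search; returns how many elements were examined."""
        steps = 0
        for x in items:
            steps += 1
            if x == key:
                break
        return steps

    data = list(range(1, 1001))
    print(search_steps(data, 1))     # best case: key at the front -> 1 step (Omega(1))
    print(search_steps(data, 1000))  # worst case: key at the end -> 1000 steps (O(n))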
The running time of the above function can be bounded in the following manner.

Is it Omega and Big-O of n?

5n^3 + 2n^2 + 23 >= c*n
5n^2 + 2n + 23/n >= c

As n grows towards infinity, the left-hand side only gets larger, so a constant c (e.g. c = 1) that is smaller than or equal to the left-hand side does exist; hence the running time is Omega of n.

5n^2 + 2n + 23/n <= c

As n tends to infinity, this inequality cannot hold: no constant c can stay greater than or equal to the left-hand side. Hence the running time is not Big-O of n.

Is it Omega and Big-O of n^3?

5n^3 + 2n^2 + 23 >= c*n^3
5 + 2/n + 23/n^3 >= c

This inequality holds (e.g. with c = 5 for all n >= 1), so it is Omega of n^3.

5 + 2/n + 23/n^3 <= c

This inequality also holds (e.g. with c = 30 for all n >= 1), so it is Big-O of n^3.

Since it is both Omega and Big-O of n^3, it is Theta of n^3 as well.
Similarly, it is Omega of n^2 but not Big-O of n^2.
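The same argument can be eyeballed numerically (illustrative Python): f(n)/n keeps growing, so no constant c can satisfy f(n) <= c*n, while f(n)/n^3 settles near 5, so suitable constants exist on both sides:

    def f(n):
        return 5*n**3 + 2*n**2 + 23

    # f(n)/n grows without bound; f(n)/n^3 approaches 5.
    for n in [10, 100, 1000, 10000]:
        print(n, f(n) / n, f(n) / n**3)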
Related
If f(n) = 3n + 8,
for this we say or prove that f(n) = Ω(n).
Why do we not use Ω(1) or Ω(log n) or .... to describe the growth rate of our function?
In the context of studying the complexity of algorithms, the Ω asymptotic bound can serve at least two purposes:
check if there is any chance of finding an algorithm with an acceptable complexity;
check if we have found an optimal algorithm, i.e. such that its O bound matches the known Ω bound.
For these purposes, a tight bound is preferable (mandatory).
Also note that f(n)=Ω(n) implies f(n)=Ω(log(n)), f(n)=Ω(1) and all lower growth rates, and we needn't repeat that.
You can actually do that. Check the Big Omega notation here and let's take Ω(log n) as an example. We have:
f(n) = 3n + 8 = Ω(log n)
because
limsup (n→∞) of (3n + 8)/log(n) = ∞ > 0
(according to the 1914 Hardy-Littlewood definition), or
3n + 8 >= c * log(n) for some constant c > 0 and all sufficiently large n
(according to the Knuth definition).
For the definition of liminf and limsup symbols (with pictures) please check here.
Perhaps what was really meant is Θ (Big Theta), that is, both O() and Ω() simultaneously.
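Just to illustrate numerically why that (non-tight) lower bound is still valid: the ratio (3n + 8)/log(n) grows without bound, so in particular its limsup is > 0 (a small Python check, illustrative only):

    import math

    # (3n + 8)/log(n) keeps growing, so 3n + 8 is Omega(log n) - just not tightly.
    for n in [10, 100, 1000, 10**6]:
        print(n, (3*n + 8) / math.log(n))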
For example, I have f(N) = 5N + 3 from a program. I want to know what the Big-O of this function is. We take the highest order term and say it is O(N).
Is this the correct method to find the Big-O of any program: dropping lower order terms and constants?
If we got O(N) by simply looking at the complexity function 5N + 3, then what is the purpose of the formula F(N) <= C * G(N)?
I got to know that this formula is just for comparing two functions. My question is:
In this formula, F(N) <= C * G(N), I have F(N) = 5N + 3, but what is this upper bound G(N)? Where does it come from? Where do we take it from?
I have studied many books and many posts, but I am still facing confusion.
Q: Is this the correct method to find the Big-O of any program: dropping lower order terms and constants?
Yes, most people who have at least some experience with examining time complexities use this method.
Q: If we got O(N) by simply looking at the complexity function 5N + 3, then what is the purpose of the formula F(N) <= C * G(N)?
To formally prove that you correctly estimated the big-oh for a certain algorithm. Imagine that you have F(N) = 5N^2 + 10 and (incorrectly) conclude that the big-oh complexity for this example is O(N). By using this formula you can quickly see that this is not true, because there does not exist a constant C such that 5N^2 + 10 <= C * N holds for large values of N. This would imply C >= 5N + 10/N, but no matter how large a constant C you choose, there is always an N larger than it, so the inequality does not hold.
Q: In this formula, F(N) <= C * G(N), I have F(N) = 5N + 3, but what is this upper bound G(N)? Where does it come from? Where do we take it from?
It comes from examining F(N), specifically by finding its highest order term. You need some math knowledge to estimate which function grows faster than another; for a start, check this useful link. There are several classes of complexities: constant, logarithmic, polynomial, exponential, and so on. However, in most cases it is easy to find the highest order term of a function. If you are not sure, you can always plot a graph of the function or formally prove that one function grows faster than the other. For example, if F(N) = log(N^3) + sqrt(N), maybe it is not clear at first glance which is the highest order term, but if you calculate or plot log(N^3) for N = 1, 10 and 1000 and sqrt(N) for the same values, it quickly becomes clear that sqrt(N) grows faster, so the big-oh for this function is O(sqrt(N)).
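The quick calculation described above might look like this (illustrative Python; log here is the natural logarithm, but any base gives the same conclusion):

    import math

    # Compare log(N^3) = 3*log(N) with sqrt(N): sqrt(N) pulls ahead as N grows,
    # so log(N^3) + sqrt(N) is O(sqrt(N)).
    for N in [1, 10, 1000, 10**6]:
        print(N, math.log(N**3), math.sqrt(N))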
I am really confused about what Big O, Big Theta and Big Omega represent: best case, worst case and average case, or upper bound and lower bound?
If the answer is upper bound and lower bound, then upper bound and lower bound of what? For example, let's consider an algorithm. Does it have three different expressions or rates of growth for the best case, worst case and average case, and for every case can we find its Big O, Theta and Omega?
Lastly, we know merge sort via the divide and conquer approach has a rate of growth or time complexity of n*log(n); is that the rate of growth of the best case or the worst case, and how do we relate Big O, Theta and Omega to this? Please can you explain via a hypothetical expression.
The notations are all about asymptotic growth. Whether they describe the worst case or the average case depends only on what you say they should express.
E.g., quicksort is a randomized algorithm for sorting. Let's say we use it deterministically and always choose the first element in the list as the pivot. Then for every n there exists an input of length n on which it needs on the order of n² steps, so the worst case is O(n²). But on random lists the average case is O(n log n).
So here I used Big O for the average and the worst case.
Basically this notation is for simplification. If you have an algorithm that does exactly 5n³ - 4n² - 3·log n steps, you can simply write O(n³), get rid of all the crap after n³ and also forget about the constants.
You can use Big O to get rid of all monomials except the one with the biggest exponent, and of all constant factors (constant means they don't grow; 10^100 is also a constant).
In the end, O(f(n)) gives you a set of functions that all have the upper bound f(n): g(n) is in O(f(n)) if you can find a constant c such that g(n) ≤ c⋅f(n) for all sufficiently large n.
To make it a little easier:
I have explained that Big O is an upper bound, but not a strict one: n³ is in O(n³), but so is n².
So you can think of Big O as a "less than or equal".
The same goes for the others.
Little o is a strict "less than": n² is in o(n³), but n³ is not.
Big Omega is a "greater than or equal": n³ is in Ω(n³) and so is n⁴.
Little omega is a strict "greater than": n³ is not in ω(n³), but n⁴ is.
And Big Theta is something like "equal", so n³ is in Θ(n³), but neither n² nor n⁴ is.
I hope this helps a little.
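A numeric way to see that "less / equal / greater" analogy (illustrative Python; for these simple polynomials, the ratio against n³ tending to 0, a constant, or infinity corresponds to o(n³), Θ(n³) and ω(n³) respectively):

    # Ratios against n^3: n^2/n^3 -> 0 (so n^2 is in o(n^3)),
    # n^3/n^3 -> 1 (so n^3 is in Theta(n^3)),
    # n^4/n^3 -> infinity (so n^4 is in omega(n^3)).
    for n in [10, 100, 1000, 10000]:
        print(n, n**2 / n**3, n**3 / n**3, n**4 / n**3)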
Another way people often (loosely) put it: O is used for an upper bound, typically on the worst case, and Ω for a lower bound such as the best case. For example, think of sorting algorithms: some of them run in linear time if the items are already in order, since they essentially just have to check that the order is correct, while all of them have a worst-case input on which they must do the most work to order everything.
Assume f, g to be asymptotically non-negative functions.
In brief,
1) Big-O Notation
f(n) = O(g(n)) if asymptotically, f(n) ≤ c · g(n), for some constant c.
2) Theta Notation
f(n) = Θ(g(n)) if asymptotically, c1 · g(n) ≤ f(n) ≤ c2 · g(n),
for some constants c1, c2; that is, up to constant factors, f(n)
and g(n) are asymptotically similar.
3) Omega Notation
f(n) = Ω(g(n)) if asymptotically, f(n) ≥ c · g(n), for some constant
c.
(Informally, asymptotically and up to constant factors, f is at least g).
4) Small o Notation
f(n) = o(g(n)) if lim (n→∞) f(n)/g(n) = 0.
That is, for every c > 0, asymptotically, f(n) < c · g(n), (i.e., f is an order lower than g).
For the last part of your question: merge sort is Θ(n log(n)); that is, both the worst-case and best-case running times asymptotically behave like c1*n*log(n) + c2 for some constants c1, c2.
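To illustrate that claim, here is a rough Python sketch of my own (purely illustrative) that counts merge comparisons; dividing the count by n*log2(n) gives values that stay within a small constant range for sorted, reversed and random inputs, which is what Θ(n log n) in both best and worst case means:

    import math
    import random

    def merge_sort_comparisons(a):
        """Return (sorted copy of a, number of element comparisons made while merging)."""
        if len(a) <= 1:
            return list(a), 0
        mid = len(a) // 2
        left, c_left = merge_sort_comparisons(a[:mid])
        right, c_right = merge_sort_comparisons(a[mid:])
        merged, i, j, comps = [], 0, 0, c_left + c_right
        while i < len(left) and j < len(right):
            comps += 1                      # one comparison per merge step
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, comps

    n = 4096
    for name, data in [("sorted", list(range(n))),
                       ("reversed", list(range(n, 0, -1))),
                       ("random", random.sample(range(n), n))]:
        _, comps = merge_sort_comparisons(data)
        print(name, comps, round(comps / (n * math.log2(n)), 2))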
I know that in Big O notation we only consider the highest order polynomial term, because we are basically placing a theoretical worst-case bound on compute-time complexity, but sometimes I get confused about when we can legally drop terms or treat them as constants. For example, if our equation ran in
O((n^3)/3), we pull out the "1/3" fraction, treat it as a constant, drop it, and then say our algorithm runs in O(n^3) time.
What about the case of O((n^3)/((log(n))^2))? In this case could we pull out the 1/((log(n))^2) term, treat it as a constant, drop it, and then ultimately conclude our algorithm is O(n^3)? It does not look like we can, but what differentiates this case from the one above? Both factors can look like constants because their values are relatively small compared to the leading polynomial term in the numerator, but in the second case the denominator really brings down the worst-case bound as n gets larger and larger.
At this point, it is a good idea to go back and look at the formal definition of big O notation. Essentially, when we say that f(x) is O(g(x)) we mean that there exists a constant factor a and a starting input n0 such that for all x >= n0 we have f(x) <= a*g(x).
For a concrete example, to prove that 3*x^2 + 17 is O(x^2) we can use a = 20 and n0 = 1.
From this definition it becomes easy to see why the constant factors get dropped: it's just a matter of adjusting a to compensate. As for your 1/log(n) question, if f(x) is O(g(x)) and g(x) is O(h(x)), then f(x) is also O(h(x)). So yes, 10*n^3/log(n) + n is O(n^3), but that is not a tight upper bound, and it is a weaker statement than saying that 10*n^3/log(n) + n is O(n^3/log(n)). For a tight bound you would want to use big-Theta notation instead.
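A quick numeric illustration of the difference (Python, illustrative only): the ratio of (n^3)/3 to n^3 is a fixed constant (1/3), which a bigger a can absorb, while the ratio of n^3/(log n)^2 to n^3 keeps shrinking toward 0, which is exactly why the 1/(log n)^2 factor cannot be treated as a constant:

    import math

    # (n^3/3) / n^3 is always 1/3; (n^3/log(n)^2) / n^3 = 1/log(n)^2 keeps shrinking.
    for n in [10, 100, 1000, 10**6]:
        print(n, (n**3 / 3) / n**3, (n**3 / math.log(n) ** 2) / n**3)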
Any value which is fixed and does not depend on a variable (like n) can be treated as a constant. You can separate out the constants, remove the lower order terms and classify the result as the complexity. Big O notation also states that if
f(x) <= c*g(x)
then f(x) is O(g(x)). For example:
n^3 * 5 -> here 5 is a constant. Complexity is O(n^3)
4*(n^3)/((log(n))^2) + 7 -> here 7 and 4 are constants. Complexity is O(n^3/(log(n))^2)
I have used the Master Theorem to solve recurrence relations. I have gotten one down to Θ(3n^2 - 9n). Does this equal Θ(n^2)? I have another recurrence for which the solution is Θ(2n^3 - 100n^2). In Big Theta notation do you always use only the largest term? So my second one would be Θ(n^3)? It just seems like 100n^2 would be important in the second case. So will it matter if I discard it?
Any suggestions?
Yes, your assumptions are correct. The first one is Θ(n^2) and the second one is Θ(n^3). When you are using Θ notation you only keep the largest term.
In the case of your second recurrence, consider n = 1000: then n^3 = 1,000,000,000, whereas 100n^2 is just 100,000,000. As the value of n increases, n^3 becomes more and more dominant over 100n^2.
For theoretical purposes you don't need to consider the constant, however large it might be. But practical applications might prefer an algorithm with a small constant even if its complexity is higher. For example, it might be better to use an algorithm with complexity 0.01n^3 over an algorithm with complexity 10000n^2 if the value of n is not very large.
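To make "if the value of n is not very large" concrete (illustrative Python): 0.01*n^3 stays below 10000*n^2 for every n below one million, so the "slower" cubic algorithm actually does less work on that whole range:

    # 0.01*n^3 < 10000*n^2 exactly when n < 1,000,000.
    for n in [1000, 100000, 1000000, 2000000]:
        print(n, 0.01 * n**3, 10000 * n**2, 0.01 * n**3 < 10000 * n**2)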
If we have the function f(n) = 3n^2 - 9n, lower order terms and constants can be ignored; we consider the higher order term because it plays the major role in the growth of the function.
By considering only the higher order term we can easily find an upper bound. Here is the example:
f(n) = 3n^2 - 9n
For all sufficiently large values of n >= 1:
3n^2 <= 3n^2
and -9n <= n^2
thus, f(n) = 3n^2 - 9n <= 3n^2 + n^2 <= 4n^2
The upper bound of f(n) is 4n^2; that means for all sufficiently large n >= 1, the value of f(n) will not be greater than 4n^2.
Therefore f(n) = O(n^2) with c = 4 and n0 = 1. A matching lower bound also holds (e.g. n^2 <= 3n^2 - 9n for all n >= 5), so f(n) = Θ(n^2).
We can also find the bound directly by ignoring the lower order terms and constants in the equation f(n) = 3n^2 - 9n; the result is the same, Θ(n^2).
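A quick numeric check of those constants (illustrative Python; the n0 = 5 for the lower bound is my own choice, picked so that n^2 <= 3n^2 - 9n holds):

    def f(n):
        return 3*n**2 - 9*n

    # Upper bound: f(n) <= 4*n^2 with c = 4 from n0 = 1 on.
    assert all(f(n) <= 4 * n**2 for n in range(1, 10001))
    # Matching lower bound: n^2 <= f(n) with c = 1 from n0 = 5 on.
    assert all(n**2 <= f(n) for n in range(5, 10001))
    print("bounds hold on the tested range, consistent with f(n) = Theta(n^2)")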