Understanding when to use theta for time complexity - big-o

I (believe) I understand the definitions of Big-O, Big-Ω and Big-Θ; in that Big-O is the asymptotic upper bound, Big-Ω is the asymptotic lower bound and Big-Θ is the asymptotic tight bound. However, I keep getting confused with the usage of Θ in certain situations, such as in an insertion sort:
From what I understand this says that the insertion sort will:
Take at least linear time (it won't run any faster than linear time); according to Big-Ω.
Take at most n^2 time (it won't take any longer than n^2); according to Big-O.
The confusion arises from my understanding of when to use Big-Θ. I was led to believe that you can only use Big-Θ when the values of Big-O and Big-Ω are the same. If that's the case, why is insertion sort considered to be Θ(n^2) when the Ω and O values are different?

Basically, you can only use Big-Θ when there is no asymptotic gap between the upper bound and the lower bound on the running time of the algorithm:
In your example, insertion-sort takes at most O(n^2) time (in the worst-case) and it takes Ω(n) time (in the best-case). So, O(n^2) is the time upper bound of the algorithm, and Ω(n) is the lower-bound on the algorithm. Since these two are not the same you cannot use Big-Θ to describe the running time of the insertion-sort algorithm.
However, consider the selection sort algorithm. Its worst-case running time is O(n^2), and its best-case running time is Ω(n^2). Therefore, since the upper bound and the lower bound are the same (asymptotically), you can say that the running time of the selection sort algorithm is Θ(n^2).
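As a rough illustration of the gap described above, here is a minimal sketch that counts key comparisons instead of measuring time. The function names and the choice of n = 100 are illustrative assumptions, not part of the original answer.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of values with insertion sort; return the number of key comparisons."""
    a = list(values)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break  # key is already in place relative to a[j]
            a[j + 1] = a[j]  # shift the larger element one slot right
            j -= 1
        a[j + 1] = key
    return comparisons


def selection_sort_comparisons(values):
    """Sort a copy of values with selection sort; return the number of key comparisons."""
    a = list(values)
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1  # always scans the whole unsorted suffix
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return comparisons


n = 100
sorted_input = list(range(n))
reversed_input = list(range(n, 0, -1))

print(insertion_sort_comparisons(sorted_input))    # 99   (n - 1, linear)
print(insertion_sort_comparisons(reversed_input))  # 4950 (n(n-1)/2, quadratic)
print(selection_sort_comparisons(sorted_input))    # 4950
print(selection_sort_comparisons(reversed_input))  # 4950
```

Insertion sort's count depends on the input, while selection sort's count is n(n-1)/2 for every input, which is why a single Θ(n^2) statement covers all of selection sort's cases.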

Related

Isn't big-O asymptotic notation, O(f(n)), the slowest runtime an algorithm can have? (It gives an asymptotic upper bound, which means slowest runtime.)

I was reading a book called "Introduction to Algorithms", and in its analysis of Strassen's algorithm for matrix multiplication it said this:
"One might at first think that any matrix multiplication algorithm must take omega(n^3) time, since the natural definition of matrix multiplication requires that many multiplications. You would be incorrect, however: we have a way to multiply matrices in O(n^3) time."
Isn't O(n^3) time slower than omega(n^3) time, since omega gives the asymptotic lower bound, meaning the fastest runtime?
Then why does the book say that we can do it in O(n^3), as if it were faster than omega(n^3) time?
First of all it is not true that, as people commonly seem to believe, Big-O is worst case, Big-Omega is best case, and Big-Theta is average case.
Big-O is an upper bound. We are often interested in an upper bound on the worst case so Big-O gets associated with worst case behavior, but we can also be interested in an upper bound on average case behavior, etc.
When asymptotic notation is applied to running times, "higher" functions are worse, so upper bounds are good news. If an algorithm has an upper bound of O(n^3) on its running time, this is better than it having a lower bound of Ω(n^3), because a lower bound means that it could be worse, could be slower, and that it is no better than the lower bound.

Difficulty in comprehending asymptotic notation

As far as I know and research,
Big - Oh notation describes the worst case of an algorithm time complexity.
Big - Omega notation describes the best case of an algorithm time complexity.
Big - Theta notation describes the average case of an algorithm time complexity.
Source
However, in recent days, I've seen a discussion in which some guys tell that
O for worst case is a "forgivable" misconception. Θ for best is plain
wrong. Ω for best is also forgivable. But they remain misconceptions.
The big-O notation can represent any complexity. In fact it can
describe the asymptotic behavior of any function.
I remember as well that I learnt the former in my university course. I'm still a student, and if I have learnt them wrongly, could you please explain them to me?
Bachmann-Landau notation has absolutely nothing whatsoever to do with computational complexity of algorithms. This should already be very obvious by the fact that the idea of a computing machine that computes an algorithm didn't really exist in 1894, when Bachmann and Landau invented this notation.
Bachmann-Landau notation describes the growth rates of functions by grouping them together in sets of functions that grow at roughly the same rate. Bachmann-Landau notation does not say anything about what those functions mean. They are just functions. In fact, they don't have to mean anything at all.
All that they mean, is this:
f ∈ ο(g): f grows slower than g
f ∈ Ο(g): f does not grow significantly faster than g
f ∈ Θ(g): f grows as fast as g
f ∈ Ω(g): f does not grow significantly slower than g
f ∈ ω(g): f grows faster than g
It does not say anything about what f or g are, or what they mean.
Note that there are actually two conflicting, incompatible definitions of Ω; the one given here is the one that is more useful for computational complexity theory. Also note that these are only very broad intuitions; when in doubt, you should look at the definitions.
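For reference, the intuitions listed above correspond to the following commonly used formal definitions. This is a sketch of the usual textbook convention (with the Ω that is standard in complexity theory), assuming f and g are eventually positive functions of n; the symbols c and n_0 are the customary names for the constant and the threshold.

```latex
\begin{align*}
f \in O(g)      &\iff \exists c > 0,\ \exists n_0,\ \forall n \ge n_0:\ f(n) \le c\, g(n) \\
f \in \Omega(g) &\iff \exists c > 0,\ \exists n_0,\ \forall n \ge n_0:\ f(n) \ge c\, g(n) \\
f \in \Theta(g) &\iff f \in O(g) \ \text{and}\ f \in \Omega(g) \\
f \in o(g)      &\iff \forall c > 0,\ \exists n_0,\ \forall n \ge n_0:\ f(n) < c\, g(n) \\
f \in \omega(g) &\iff \forall c > 0,\ \exists n_0,\ \forall n \ge n_0:\ f(n) > c\, g(n)
\end{align*}
```

Nothing in these definitions mentions algorithms, inputs, or cases; they only compare two functions.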
If you want, you can use Bachmann-Landau notation to describe the growth rate of a population of rabbits as a function of time, or the growth rate of a human's belly as a function of beers.
Or, you can use it to describe the best-case step complexity, worst-case step complexity, average-case step complexity, expected-case step complexity, amortized step complexity, best-case time complexity, worst-case time complexity, average-case time complexity, expected-case time complexity, amortized time complexity, best-case space complexity, worst-case space complexity, average-case space complexity, expected-case space complexity, or amortized space complexity of an algorithm.
These assertions are at best inaccurate and at worst wrong. Here is the truth.
The running time of an algorithm is not a function of N (as it also depends on the particular data set), so you cannot discuss its asymptotic complexity directly. All you can say is that it lies between the best and worst cases.
The worst case, best case and average case running times are functions of N, though the average case depends on the probabilistic distribution of the data, so is not uniquely defined.
Then the asymptotic notation is such that
O(f(N)) denotes an upper bound, which can be tight or not;
Ω(f(N)) denotes a lower bound, which can be tight or not;
Θ(f(N)) denotes a bilateral bound, i.e. the conjunction of O(f(N)) and Ω(f(N)); it is perforce tight.
So,
All of the worst-case, best-case and average-case complexities have a Θ bound, as they are functions; in practice this bound can be too difficult to establish and we content ourselves with looser O or Ω bounds instead.
It is absolutely not true that the Θ bound is reserved for the average case.
Examples:
The worst-case of quicksort is Θ(N²) and its best and average cases are Θ(N Log N). So by language abuse, the running time of quicksort is O(N²) and Ω(N Log N).
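The worst and average cases of insertion sort are Θ(N²), while its best case (already sorted input) is Θ(N). For selection sort, the worst, best and average cases are all three Θ(N²).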
Any algorithm that needs to look at all input is best-case, average-case and worst-case Ω(N) (and without more information we can't tell any upper bound).
Matrix multiplication is known to be doable in time O(N³). But we still don't know precisely the Θ bound of this operation, which is N² times a slowly growing function. Thus Ω(N²) is an obvious but not tight lower bound. (Worst, best and average cases have the same complexity.)
There is some confusion stemming from the fact that the best/average/worst cases take well-defined durations, while the general case lies in a range (it is in fact a random variable). In addition, algorithm analysis is often tedious, if not intractable, and we use simplifications that can lead to loose bounds.
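To make the quicksort example above concrete, here is a small sketch that counts pivot comparisons for one particular variant: first element as pivot, distinct keys. The variant, the function name and the input size are assumptions chosen for illustration; other pivot rules have different worst-case inputs. The function only counts comparisons and does not return a sorted list.

```python
import random


def quicksort_comparisons(values):
    """Count pivot comparisons made by a first-element-pivot quicksort."""
    comparisons = 0
    stack = [list(values)]  # explicit stack avoids deep recursion on bad inputs
    while stack:
        part = stack.pop()
        if len(part) <= 1:
            continue
        pivot, rest = part[0], part[1:]
        comparisons += len(rest)  # every other element is compared with the pivot once
        stack.append([x for x in rest if x < pivot])
        stack.append([x for x in rest if x >= pivot])
    return comparisons


n = 200
already_sorted = list(range(n))
random.seed(0)
shuffled = random.sample(range(n), n)

print(quicksort_comparisons(already_sorted))  # 19900 = n(n-1)/2, the quadratic worst case
print(quicksort_comparisons(shuffled))        # far smaller, on the order of n log n
```

Each of the two counts is a well-defined function of n for its family of inputs, so each has its own Θ bound; only the loose statement "the running time of quicksort" needs the O(N²) and Ω(N log N) pair.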

Is "best case performance Θ(1) -> running time ≠ Θ(log n)" valid?

This is an argument for justifying that the running time of an algorithm can't be considered Θ(f(n)) but should be O(f(n)) instead.
E.g. this question about binary search: Is binary search theta log (n) or big O log(n)
MartinStettner's response is even more confusing.
Consider *-case performances:
Best-case performance: Θ(1)
Average-case performance: Θ(log n)
Worst-case performance: Θ(log n)
He then quotes Cormen, Leiserson, Rivest: "Introduction to Algorithms":
What we mean when we say "the running time is O(n^2)" is that the worst-case running time (which is a function of n) is O(n^2) ...
Doesn't this suggest the terms running time and worst-case running time are synonymous?
Also, if running time refers to a function with natural input f(n), then there has to be Θ class which contains it, e.g. Θ(f(n)), right? This indicates that you are obligated to use O notation only when the running time is not known very precisely (i.e. only an upper bound is known).
When you write O(f(n)), that means that the running time of your algorithm is bounded above by a function c*f(n), where c is a constant. That also means that your algorithm may complete in far fewer steps than c*f(n). We often use Big-O notation because we want to include the possibility that the algorithm completes faster than we indicate. On the other hand, Theta(f(n)) means that the number of steps is always within a constant factor of f(n), both above and below. Binary search is O(log(n)) because usually it will complete in about log(n) steps, but could complete in one step if you get lucky (best-case performance).
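The difference between the lucky case and the usual case can be seen by counting loop iterations in a plain binary search. This is a minimal sketch; the function name and the 1023-element test array are illustrative assumptions.

```python
def binary_search_steps(sorted_values, target):
    """Return (index or None, number of loop iterations) for a standard binary search."""
    lo, hi = 0, len(sorted_values) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid, steps
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, steps


data = list(range(1023))
print(binary_search_steps(data, 511))  # (511, 1): the middle element, best case
print(binary_search_steps(data, 0))    # (0, 10): about log2(n) steps
print(binary_search_steps(data, -1))   # (None, 10): absent key, also about log2(n) steps
```

The best case is Θ(1) and the worst case is Θ(log n), so the only bound that holds for every input at once is the upper bound O(log n).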
I always get confused when I read about running times.
For me, a running time is the time an implementation of an algorithm needs to be executed on a computer. This can differ in many ways and so is a complicated thing.
So I think complexity of an algorithm is a better word.
Now, the complexity is (in most cases) a worst-case-complexity. If you know an upper bound for the worst case, you also know that it can only get better in other cases.
So, if you know that there exist some (maybe trivial) cases where your algorithm does only a few (a constant number of) steps and stops, you don't have to care about a lower bound, and so you (normally) use an upper bound in big-O or little-o notation.
If you do your calculations carefully, you can also use the Θ notation.
But notice: all complexities only hold for the cases they are attached to. This means: if you make assumptions like "Input is a best case", this affects your calculation and also the resulting complexity. In the case of binary search, you posted the complexity for three different assumptions.
You can generalize it by saying: "The complexity of Binary Search is in O(log n)", since Θ(log n) means "Ω(log n) and O(log n)" and O(1) is a subset of O(log n).
To summarize:
- If you know the very precise function for the complexity, you can give the complexity in Θ-notation
- If you want to get an overall upper bound, you have to use O-notation, unless the lower bound over all input cases does not differ from the upper bound.
- In most cases you have to use the O-notation, since the algorithms are too complex to get a close upper and lower bound.

How to determine whether to put big O, theta or omega notation in the time complexity of an algorithm

For example, if the time complexity of merge sort is O(n log n), then why is it big O and not theta or omega? I know the definitions of these, but what I do not understand is how to determine the notation based on the definitions.
For most algorithms, you are basically concerned with the upper bound on the running time. For example, you have some algorithm to sort an array of numbers. You would most likely be concerned with how fast the algorithm runs in the worst possible case.
Hence the complexity of merge sort is mostly written as O(n log n), even though it would be better to express it as theta(n log n), because theta notation is a tighter bound. And merge sort runs in theta(n log n) time because it always takes this much time, up to a constant factor, no matter what the input is.
You will rarely see omega notation, again mostly because we are concerned with upper bounds on the running time and not lower bounds.
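The claim that merge sort behaves the same way on every input, up to a constant factor, can be checked by counting comparisons. The sketch below assumes a plain top-down merge sort; the function name, n = 1024 and the three test inputs are illustrative choices.

```python
import math
import random


def merge_sort_comparisons(values):
    """Return (sorted list, number of element comparisons) for top-down merge sort."""
    if len(values) <= 1:
        return list(values), 0
    mid = len(values) // 2
    left, c_left = merge_sort_comparisons(values[:mid])
    right, c_right = merge_sort_comparisons(values[mid:])
    merged = []
    comparisons = c_left + c_right
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged, comparisons


n = 1024
inputs = {
    "sorted": list(range(n)),
    "reversed": list(range(n, 0, -1)),
    "shuffled": random.sample(range(n), n),
}
for name, data in inputs.items():
    _, count = merge_sort_comparisons(data)
    # The ratio to n*log2(n) stays between 0.5 and 1 for every input.
    print(name, count, round(count / (n * math.log2(n)), 3))
```

For n = 1024 the count is 5120 on sorted and reversed input and never exceeds 9217 on any input, so both the best and the worst case are within a constant factor of n log n.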

Different upper bounds and lower bounds of same algorithm

So I just started learning about Asymptotic bounds for an algorithm
Question:
What can we say about theta of a function if for the algorithm we find different lower and upper bounds?? (say omega(n) and O(n^2)). Or rather what can we say about tightness of such an algorithm?
The book which I read says Theta is for same upper and lower bounds of the function.
What about in this case?
I don't think you can say anything, in that case.
The definition of Θ(f(n)) is:
A function is Θ(f(n)) if and only if it is Ω(f(n)) and O(f(n)).
For some pathological function that exhibits those behaviors, such as one oscillating between n and n^2, no simple Θ bound exists.
Example:
f(n) = n    if n is odd
f(n) = n^2  if n is even
Your bounds Ω(n) and O(n^2) would be tight on this, but f is not Θ(g(n)) for any simple function g such as n or n^2.
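A quick numeric check of the oscillating example: the sketch below prints f(n)/n and f(n)/n^2 for a few hand-picked values of n. The sample values are an arbitrary illustrative choice.

```python
def f(n):
    """The oscillating example: n for odd n, n^2 for even n."""
    return n if n % 2 == 1 else n * n


for n in (10, 11, 1000, 1001, 100000, 100001):
    # f(n)/n grows without bound on even n, so f is not O(n);
    # f(n)/n^2 shrinks toward 0 on odd n, so f is not Ω(n^2).
    print(n, f(n) / n, f(n) / n**2)
```

Since f is neither O(n) nor Ω(n^2), it is neither Θ(n) nor Θ(n^2), even though Ω(n) and O(n^2) are each attained infinitely often.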
See also: What is the difference between Θ(n) and O(n)?
Just for a bit of practicality, one algorithm that is not in Θ(f(n)) for any f(n) would be insertion sort. It runs in Ω(n) since for a list that is already sorted, you only need one operation for the insert in every step, but it runs in O(n^2) in the general case. Constructing functions that oscillate or are non-continuous otherwise usually is done more for didactic purposes, but in my experience such functions rarely, if ever, appear with actual algorithms.
Regarding tightness, I only ever heard that in this context with reference to the upper and lower bounds proposed for algorithms. Again regarding the example of insertion sort, the given bounds are tight in the sense that there are instances of the problem that actually can be done in time linear in their size (the lower bound) and other instances of the problem that will not execute in time less than quadratic in their size. Bounds that are valid, but not tight for insertion sort would be Ω(1) since you can't sort lists of arbitrary size in constant time, and O(n^3) because you can always sort a list of n elements in strictly O(n^2) time, which is an order of magnitude less, so you can certainly do it in O(n^3). What bounds are for is to give us a crude idea of what we can expect as performance of our algorithms so we get an idea of how efficient our solutions are; tight bounds are the most desirable, since they both give us that crude idea and that idea is optimal in the sense that there are extreme cases (which sometimes are also the general case) where we actually need all the complexity the bound allows.
The average case complexity is not a bound; it "only" describes how efficient an algorithm is "in most cases"; take for example quick sort which has a best-case complexity of Ω(n), a worst case complexity of O(n^2) and an average case complexity of O(n log n). This tells us that for almost all cases, quick sort is as fast as sorting gets in general (i.e. the average case complexity), while there are instances of the problem that it solves faster than that (best case complexity -> lower bound) and also instances of the problem that take quick sort longer to solve than that (worst case complexity -> upper bound).

Resources