Big theta notation of insertion sort algorithm

I'm studying asymptotic notations from the book and I can't understand what the author means. I know that if f(n) = Θ(n^2) then f(n) = O(n^2). However, I understand from the author's words that for the insertion sort algorithm f(n) = Θ(n) and f(n) = O(n^2).
Why? Does the big omega or big theta change with different inputs?
He says that:
"The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input. "
However, it is different for big-O notation. What does he mean? What is the difference between them?
I'm so confused. I'm copy-pasting the passage below:
Since O-notation describes an upper bound, when we use it to bound the worst-case running time of an algorithm, we have a bound on the running time of the algorithm on every input. Thus, the O(n^2) bound on the worst-case running time of insertion sort also applies to its running time on every input. The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input. For example, when the input is already sorted, insertion sort runs in Θ(n) time.

Does the big omega or big theta change with different inputs?
Yes. To give a simpler example, consider linear search in an array from left to right. In the average case, this algorithm takes f(n) = a·n/2 + b expected steps for some constants a and b (and roughly twice that in the worst case). But when the leftmost element is guaranteed to always hold the key you're looking for, it always takes a + b steps.
Since Θ denotes a strict bound, and Θ(n) != Θ(n²), it follows that the Θ for the two kinds of input is different.
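To make this concrete, here is a minimal Python sketch (the function name linear_search and the step counter are mine, added purely for illustration):

def linear_search(arr, key):
    # Scan left to right; count how many elements we inspect.
    steps = 0
    for i, value in enumerate(arr):
        steps += 1
        if value == key:
            return i, steps    # best case: key at index 0, just 1 step
    return -1, steps           # worst case: key absent, all n steps

# Best case: the key sits at the front, one comparison regardless of n.
print(linear_search([7, 3, 5, 9], 7))   # (0, 1)
# Worst case: the key is missing, all n elements are inspected.
print(linear_search([7, 3, 5, 9], 4))   # (-1, 4)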
EDIT: as for Θ and big-O being different on the same input, yes, that's perfectly possible. Consider the following (admittedly trivial) example.
When we set n to 5, then n = 5 and n < 6 are both true. But when we set n to 1, then n = 5 is false while n < 6 is still true.
Similarly, big-O is just an upper bound, just like < on numbers, while Θ is a strict bound like =.
(Actually, Θ is more like a < n < b for constants a and b, i.e. it defines something analogous to a range of numbers, but the principle is the same.)
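To make that "range" analogy precise, here is the standard formal definition (as in CLRS), stated only for reference:

\[
f(n) = \Theta(g(n)) \iff \exists\, c_1, c_2, n_0 > 0 :\; c_1\, g(n) \le f(n) \le c_2\, g(n) \quad \text{for all } n \ge n_0 .
\]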

Refer to CLRS, 3rd edition, page 44 (Asymptotic notation, functions, and running times). It says:
Even when we use asymptotic notation to apply to the running time of an algorithm, we need to understand which running time we mean. Sometimes we are interested in the worst-case running time. Often, however, we wish to characterize the running time no matter what the input. In other words, we often wish to make a blanket statement that covers all inputs, not just the worst case.
Takeaways from the above paragraph:
The worst case provides an upper limit on the running time of an algorithm.
Thus, the O(n^2) bound on the worst-case running time of insertion sort also applies to its running time on every input.
But the Θ(n^2) bound on the worst-case running time of insertion sort does not imply a Θ(n^2) bound on its running time on every input,
because the best-case running time of insertion sort is Θ(n) (when the list is already sorted).
We usually state the worst-case time complexity of an algorithm, but when the best and average cases are taken into account, the time complexity varies with the case.

In simple words, the running time of a program is described as a function of its input size, i.e. f(n).
The = here is asymmetric, so an + b = O(n) means that f(n) = an + b belongs to the set O(n). We can also say an + b = O(n^2), and it is true, because f(n) belongs to the set O(n^2) as well.
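As a quick worked check (a sketch, assuming the constants a and b are positive), the membership follows directly from the defining inequality:

\[
a n + b \;\le\; (a + b)\, n \;\le\; (a + b)\, n^2 \qquad \text{for all } n \ge 1 ,
\]

so c = a + b and n_0 = 1 witness an + b ∈ O(n^2), and the first inequality alone already gives an + b ∈ O(n).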
Thus Big-Oh (O) only gives an upper bound; you could say the notation makes a blanket statement, meaning that all inputs of a given size are covered, not just the worst-case ones (for example, in the case of insertion sort, an array of size n in reverse order).
So n = O(n^2) is true, but it would be an abuse to use it when stating the worst-case running time of an algorithm, since the worst-case running time gives an upper bound on the running time for any input. And as we all know, for insertion sort the running time depends on how sorted the input array of a fixed size already is: if the array is sorted, the running time will be linear.
So we need a tight asymptotic bound notation to describe the worst case, which Θ notation provides; thus the worst case of insertion sort is Θ(n^2) and the best case is Θ(n).
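A small Python sketch illustrates both bounds (the shift counter and the particular test arrays are mine, added only for illustration): on an already-sorted array the inner loop never runs, so the work is linear, while on a reverse-sorted array it runs j-1 times for every j, giving about n^2/2 shifts.

def insertion_sort(a):
    a = list(a)            # work on a copy
    shifts = 0             # count how often an element is moved right
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            shifts += 1
            i -= 1
        a[i + 1] = key
    return a, shifts

n = 100
print(insertion_sort(list(range(n)))[1])          # 0 shifts: sorted input, best case Θ(n)
print(insertion_sort(list(range(n, 0, -1)))[1])   # 4950 = n(n-1)/2 shifts: reversed input, worst case Θ(n^2)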

we have a bound on the running time of the algorithm on every input
It means that if there is a set of inputs with running time n^2 while others take less, then the algorithm is O(n^2).
The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input
He is saying that the converse is not true: if an algorithm is O(n^2), it does not mean every single input will run in quadratic time.

My academic theory on the insertion sort algorithm is far in the past, but from what I understand of your copy-paste:
big-O notation always describes the worst case, but big-Theta describes some sort of average for typical data.
take a look at this : What is the difference between Θ(n) and O(n)?

Related

Is Big Omega of any linear algorithm n or can it also be 1?

If we have a linear algorithm (for example, finding whether a number exists in a given array of numbers), does this mean that its Omega is n? The number of steps would be n, and the tightest bound I can make is c*n where c = 1.
But as far as I know, Omega also describes the best-case scenario, which in this case would be 1, because the searched element could be at the first position of the array, and that accounts for only one step. So, by this logic, its Omega is 1.
Which variant is the correct one and why? Thanks.
There is a large confusion about what is described using the asymptotic notation.
The running time of an algorithm is in general a function of the number of elements, but also of the particular values of the inputs. Hence T(x) where x is an input of n elements is not a function of n alone.
Now one can study the worst case and best case: to determine these, you choose a configuration of the input corresponding to the slowest or fastest execution time, and these are functions of n only. An additional option is the expected (or average) running time, which corresponds to a given statistical distribution of the input. This is also a function of n alone.
Now, Tworst(n), Tbest(n), Texpected(n) can have upper bounds, denoted by O(f(n)), and lower bounds, denoted by Ω(f(n)). When these bounds coincide, the notation Θ(f(n)) is used.
In the case of a linear search, the best case is Θ(1) and the worst and expected cases are Θ(n). Hence the running time for arbitrary input is Ω(1) and O(n).
Addendum:
The holy grail of algorithmics is the discovery of efficient algorithms, i.e. those whose effective running time is of the same order as the best behavior that can be achieved by any algorithm at all.
For instance, it is obvious that the worst-case of any search algorithm is Ω(n) because whatever the search order, you may have to perform n comparisons (for instance if the key is not there). As the linear search is worst-case O(n), it is worst-case efficient. It is also best-case efficient, but this is not so interesting.
If you have a linear-time algorithm, that means the time complexity has a linear upper bound, namely O(n). This does not mean that it also has a linear lower bound. In your example, finding out whether an element exists, the lower bound is Ω(1). Here, Ω(n) is simply wrong.
Doing a linear search on an array to find the minimal element takes exactly n steps in all cases. So here the lower bound is Ω(n). But Ω(1) would also be correct, since a constant number of steps is also a lower bound for n steps, though it is not a tight lower bound.
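A minimal Python sketch of the second case (the function name find_min and the comparison counter are mine, for illustration only): membership search can stop early, but finding the minimum never can.

def find_min(arr):
    # Must inspect every element: exactly len(arr) - 1 comparisons, always.
    comparisons = 0
    smallest = arr[0]
    for value in arr[1:]:
        comparisons += 1
        if value < smallest:
            smallest = value
    return smallest, comparisons

print(find_min([4, 2, 9, 1]))   # (1, 3): always n - 1 comparisons, so Ω(n) is the tight lower bound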

What is the relation/difference between worst case time complexity of an algorithm and its upper bound?

What is the relation/difference between worst case time complexity of an algorithm and its upper bound?
The term "upper bound" is not very clear, as it may refer to two possible things:
The upper bound of the algorithm - a bound such that the algorithm can never run "slower" than it, i.e. never take more time. This is basically its worst-case performance, so if this is what you mean - the answer is pretty simple.
big-O notation, which provides an upper bound on the complexity of the algorithm under a specific analysis. The big-O notation is a set of functions, and can be applied to any analysis of an algorithm, including worst case, average case, and even best case.
Let's take Quick Sort as an example.
Quick Sort is said to have O(n^2) worst-case performance and O(n log n) average-case performance. How can one algorithm have two complexities? Simple: the function representing the analysis of the average case and the one representing the worst case are completely different functions - and we can apply big-O notation to each of them, there is no restriction about it.
Moreover, we can even apply it to the best case. Consider a small optimization to quicksort, where it first checks if the array is already sorted, and if it is - it stops immediately. This is effectively an O(n) operation, and there is some input that will exhibit this behavior - so we can now say that the algorithm's best-case complexity is O(n).
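Here is a hedged Python sketch of that optimization (the is_sorted pre-check is the hypothetical tweak described above, not part of standard quicksort, and the list-comprehension partitioning is just one simple way to write it):

def is_sorted(a):
    # O(n) scan; this is what makes the best case linear.
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

def quick_sort(a):
    if is_sorted(a):          # already sorted (or empty/singleton): stop immediately
        return list(a)
    pivot = a[len(a) // 2]
    left  = [x for x in a if x < pivot]
    mid   = [x for x in a if x == pivot]
    right = [x for x in a if x > pivot]
    return quick_sort(left) + mid + quick_sort(right)

print(quick_sort([3, 1, 4, 1, 5, 9, 2, 6]))   # [1, 1, 2, 3, 4, 5, 6, 9]

On an already-sorted input the top-level is_sorted check succeeds and the whole call finishes after one linear scan; on other inputs the pre-check only adds linear work per call and the usual average and worst cases remain.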
The difference between the worst case and big O (the upper bound) is that the worst case is a case that actually happens to your code, while the upper bound is an overestimate, an assumption that we make in order to calculate the big O; it doesn't have to happen.
Example on insertion sort:
Worst case:
The numbers are all arranged in reverse order, so you need to shift and move every single number.
Pseudo-code:
for j = 2 to n
    key = a[j]
    i = j - 1
    while i > 0 and a[i] > key
        a[i+1] = a[i]
        i = i - 1
    end while
    a[i+1] = key
end for
Upper bound:
We assume that the inner loop does i = n-1 iterations every single time, but in fact the count varies from one outer iteration to the next and cannot be n-1 every time; we assumed/overestimated it in order to calculate the big O.
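As a small worked comparison (added here, not part of the original answer): the overestimate charges n-1 inner iterations to each of the n-1 outer iterations, while the exact worst case (reverse-sorted input) sums the true counts; both are Θ(n^2):

\[
\underbrace{(n-1)(n-1)}_{\text{overestimate}} \;\ge\; \sum_{j=2}^{n}(j-1) \;=\; \frac{n(n-1)}{2} \quad \text{(exact worst case)} .
\]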

Is "best case performance Θ(1) -> running time ≠ Θ(log n)" valid?

This is an argument for justifying that the running time of an algorithm can't be considered Θ(f(n)) but should be O(f(n)) instead.
E.g. this question about binary search: Is binary search theta log (n) or big O log(n)
MartinStettner's response is even more confusing.
Consider *-case performances:
Best-case performance: Θ(1)
Average-case performance: Θ(log n)
Worst-case performance: Θ(log n)
He then quotes Cormen, Leiserson, Rivest: "Introduction to Algorithms":
What we mean when we say "the running time is O(n^2)" is that the worst-case running time (which is a function of n) is O(n^2) ...
Doesn't this suggest the terms running time and worst-case running time are synonymous?
Also, if the running time refers to a function of the input size, f(n), then there has to be a Θ class which contains it, e.g. Θ(f(n)), right? This indicates that you are obligated to use O notation only when the running time is not known very precisely (i.e. only an upper bound is known).
When you write O(f(n)), that means that the running time of your algorithm is bounded above by a function c*f(n), where c is a constant. That also means that your algorithm can complete in far fewer steps than c*f(n). We often use Big-O notation because we want to include the possibility that the algorithm completes faster than we indicate. On the other hand, Theta(f(n)) means that the algorithm always completes in on the order of c*f(n) steps. Binary search is O(log(n)) because usually it will complete in about log(n) steps, but could complete in one step if you get lucky (best-case performance).
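For illustration, here is a short Python sketch of binary search with a step counter (the counter and the test values are mine, added only to show the lucky first-probe case against the usual halving behaviour):

def binary_search(arr, key):
    # arr must be sorted in ascending order.
    lo, hi, steps = 0, len(arr) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if arr[mid] == key:
            return mid, steps      # best case: key is the first middle element, 1 step
        elif arr[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps               # worst case: about log2(n) + 1 steps

arr = list(range(16))
print(binary_search(arr, 7))    # (7, 1): lucky first probe
print(binary_search(arr, 0))    # (0, 4): several halvings needed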
I always get confused when I read about running times.
For me, a running time is the time an implementation of an algorithm needs to execute on a computer. This can differ in many ways, so it is a complicated thing.
So I think complexity of an algorithm is a better word.
Now, the complexity is (in most cases) a worst-case complexity. If you know an upper bound for the worst case, you also know that it can only get better in other cases.
So, if you know that there exist some (maybe trivial) cases where your algorithm does only a few (constant number of) steps and stops, you don't have to care about a lower bound, and so you (normally) use an upper bound in big-O or little-o notation.
If you do your calculations carefully, you can also use the Θ notation.
But notice: all complexities only hold for the cases they are attached to. This means: if you make assumptions like "the input is a best case", this affects your calculation and also the resulting complexity. In the case of binary search you posted the complexity for three different assumptions.
You can generalize it by saying: "The complexity of binary search is in O(log n)", since Θ(log n) means "Ω(log n) and O(log n)" and O(1) is a subset of O(log n).
To summarize:
- If you know the very precise function for the complexity, you can give the complexity in Θ-notation.
- If you want to give an overall upper bound, you have to use O-notation, unless the lower bound over all input cases coincides with the upper bound.
- In most cases you have to use the O-notation, since the algorithms are too complex to get a close upper and lower bound.

Contradiction in Cormen regarding Insertion sort

In Cormen, Theorem 3.1 says that:
For example, the best-case running time of insertion sort is Big-Omega(n), whereas the worst-case running time of insertion sort is Big-Oh(n^2). The running time of insertion sort therefore falls between Big-Omega(n) and Big-Oh(n^2).
Now if we look at the Exercise 3.1-6 it asks
Prove that the running time of an algorithm is Big-theta(g(n)) iff its worst case running time is Big-oh(g(n)) and its best case running time is big-omega(g(n))
Am I the only one who sees a contradiction here?
I mean, if we abide by the statement that has to be proved, we conclude that for an asymptotically tight bound (f(n) = Big-Theta(g(n))) we need f(n) = Big-Omega(g(n)) in the algorithm's best case and f(n) = Big-Oh(g(n)) in its worst case.
But in the case of insertion sort, the best-case time complexity is Big-Omega(n) and the worst-case time complexity is Big-Oh(n^2).
I think you are a bit confused here. Let me clarify a few points for you.
Running time can mean two things: the actual running time of the program, or the bounding function like Theta or Big-Oh (so it helps to call the latter the time complexity, in order to avoid confusion). Hereafter we will use running time for the program's actual running time, and time complexity to denote the Big-Oh/Theta notation.
To pick up Big-Oh read my answer here.
Once you are clear on Big-Oh, the other functions fall into place easily. When we say T(n) is Omega(g(n)), we mean that to the right of some point k the curve c.g(n) bounds the running-time curve from below. Or in other words:
T(n) >= c.g(n) for all n >= k, and for some constant c independent of the input size.
And Theta notation is like saying "I am just one function, but using different constants you can make me bound the running-time curve from above and from below".
So when we say T(n) is theta(g(n)), we mean
c1.g(n) <= T(n) <= c2.g(n) for all n >= k, and for some constants c1 and c2 independent of the input size.
Now that we know what the notations mean, let's see where CLRS created the confusion.
For example, the best case running time of insertion sort is big-omega(n), whereas worst case running time of Insertion sort is Big-oh(n^2). The running time of insertion sort therefore falls between big-omega(n) and Bigoh(n^2)
Here, by running time, CLRS means the actual running time T(n). It's poorly worded, and it's not your fault that you misunderstood. In fact, I would go ahead and say that it's wrong: there is nothing like "falls in between"; a function is either in the set O(g(n)) or it isn't. So it's an error.
Prove that the running time of an algorithm is Big-theta(g(n)) iff its worst case running time is Big-oh(g(n)) and its best case running time is big-omega(g(n))
Here CLRS means the running time function T(n) and they want you to figure out the time complexity.
There is no contradiction here. The question only states to prove that Big-Theta(g(n)) is asymptotically tightly bound by Big-O(g(n)) and Big-Omega(g(n)). If you prove the question, you only prove that a function runs in Big-Theta(g(n)) if and only if it runs between Big-O(g(n)) and Big-Omega(g(n)).
Insertion sort runs from Big-Omega(n) to Big-Oh(n^2), so the running time of insertion sort CANNOT be tightly bounded by Big-Theta(n^2).
As a matter of fact, CLRS never uses Big-Theta(n^2) to tightly bound insertion sort.
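To spell that out with the criterion from Exercise 3.1-6 (a sketch, using the usual CLRS definitions of worst-case and best-case running time):

\[
T(n) = \Theta(g(n)) \iff T_{\mathrm{worst}}(n) = O(g(n)) \ \text{and}\ T_{\mathrm{best}}(n) = \Omega(g(n)) .
\]

For insertion sort with g(n) = n^2, the worst case is O(n^2) but the best case is Θ(n), which is not Ω(n^2); with g(n) = n, the best case is Ω(n) but the worst case is not O(n). So the running time over all inputs is neither Θ(n^2) nor Θ(n), exactly as stated above.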
There's no contradiction, since CLRS mentioned nothing about insertion sort of being theta(N^2).

Different upper bounds and lower bounds of same algorithm

So I just started learning about Asymptotic bounds for an algorithm
Question:
What can we say about the Theta of a function if we find different lower and upper bounds for the algorithm (say Omega(n) and O(n^2))? Or rather, what can we say about the tightness of such an algorithm?
The book which I read says Theta is for same upper and lower bounds of the function.
What about in this case?
I don't think you can say anything, in that case.
The definition of Θ(f(n)) is:
A function is Θ(f(n)) if and only if it is Ω(f(n)) and O(f(n)).
For some pathological function that exhibits those behaviors, such as one oscillating between n and n^2, the Θ bound wouldn't be defined.
Example:
f(n) = n if n is odd
f(n) = n^2 if n is even
Your bounds Ω(n) and O(n^2) would be tight for this f, but it is neither Θ(n) nor Θ(n^2).
See also: What is the difference between Θ(n) and O(n)?
Just for a bit of practicality: one algorithm whose running time is not in Θ(f(n)) for any f(n) would be insertion sort. It runs in Ω(n), since for a list that is already sorted you only need one operation for the insert in every step, but it runs in O(n^2) in the general case. Constructing functions that oscillate or are otherwise non-continuous is usually done more for didactic purposes, but in my experience such functions rarely, if ever, appear with actual algorithms.
Regarding tightness, I have only ever heard the term in this context with reference to the upper and lower bounds proposed for algorithms. Again regarding the example of insertion sort, the given bounds are tight in the sense that there are instances of the problem that actually can be solved in time linear in their size (the lower bound), and other instances that will not execute in time less than quadratic in their size.
Bounds that are valid but not tight for insertion sort would be Ω(1), since you can't sort lists of arbitrary size in constant time, and O(n^3), because you can always sort a list of n elements in O(n^2) time, which is a factor of n less, so you can certainly also do it in O(n^3).
What bounds are for is to give us a crude idea of what we can expect as the performance of our algorithms, so we get an idea of how efficient our solutions are. Tight bounds are the most desirable, since they both give us that crude idea and that idea is optimal, in the sense that there are extreme cases (which sometimes are also the general case) where we actually need all the complexity the bound allows.
The average case complexity is not a bound; it "only" describes how efficient an algorithm is "in most cases"; take for example quick sort which has a best-case complexity of Ω(n), a worst case complexity of O(n^2) and an average case complexity of O(n log n). This tells us that for almost all cases, quick sort is as fast as sorting gets in general (i.e. the average case complexity), while there are instances of the problem that it solves faster than that (best case complexity -> lower bound) and also instances of the problem that take quick sort longer to solve than that (worst case complexity -> upper bound).
