Contradiction in Cormen regarding Insertion sort - algorithm

Cormen, in Theorem 3.1, says that
For example, the best case running time of insertion sort is big-omega(n), whereas worst case running time of Insertion sort is Big-oh(n^2).
The running time of insertion sort therefore falls between big-omega(n) and Big-oh(n^2)
Now if we look at Exercise 3.1-6, it asks
Prove that the running time of an algorithm is Big-theta(g(n)) iff its worst case running time is Big-oh(g(n)) and its best case running time is big-omega(g(n))
Am I the only one who sees a contradiction here?
I mean, if we abide by the statement that has to be proved, we conclude that for asymptotically tight bounds (f(n) = Big-theta(g(n))) we need to have f(n) = big-omega(g(n)) for the algorithm's best case and Big-oh(g(n)) in its worst case.
But in the case of Insertion sort, the best case time complexity is big-omega(n) and the worst case time complexity is Big-oh(n^2).

I think you are a bit confused here. Let me clarify a few points for you.
Running time can mean two things: the actual running time of the program, or the bounding function like theta or big-oh (so it helps to call this time complexity, in order to avoid the confusion). Hereafter we will use running time for the program's actual running time, and time complexity to denote the Big-Oh/theta notation.
To pick up Big-Oh read my answer here.
Once you are clear with Big-Oh, the other functions fall in place easily. When we say T(n) is Omega(g(n)), we mean that to the right of some point k the curve c.g(n) bounds the running time curve from below. Or in other words:
T(n)>=c.g(n) for all n>=k, and for some constant c independent of input size.
And theta notation is like saying "I am just one function, but using different constants you can make me bound the running time curve from above and from below"
So when we say T(n) is theta(g(n)), we mean
c1.g(n) <= T(n) <= c2.g(n) for all n>=k, and for some constants c1 and c2 independent of input size.
Now that we know what the functions mean, let's see where CLRS caused the confusion.
For example, the best case running time of insertion sort is big-omega(n), whereas worst case running time of Insertion sort is Big-oh(n^2). The running time of insertion sort therefore falls between big-omega(n) and Big-oh(n^2)
Here by running time CLRS means the actual running time T(n). It's poorly worded, and it's not your fault that you misunderstood. In fact I would go ahead and say that it's wrong. There is no such thing as "falls in between": a function is either in the set O(g(n)) or it isn't. So it's an error.
Prove that the running time of an algorithm is Big-theta(g(n)) iff its worst case running time is Big-oh(g(n)) and its best case running time is big-omega(g(n))
Here CLRS means the running time function T(n) and they want you to figure out the time complexity.

There is no contradiction here. The question only states to prove that Big-Theta(g(n)) is asymptotically tightly bound by Big-O(g(n)) and Big-Omega(g(n)). If you prove the question, you only prove that a function runs in Big-Theta(g(n)) if and only if it runs between Big-O(g(n)) and Big-Omega(g(n)).
The insertion sort runs from Big-Omega(n) to Big-Oh(n^2), so the running time of insertion sort CANNOT be tightly bound to Big-Theta(n^2).
As a matter of fact, CLRS never uses Big-Theta(n^2) to tightly bound insertion sort.
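To make the two bounds concrete, here is a minimal Python sketch that counts key comparisons made by insertion sort on an already sorted input and on a reversed one. The function name `insertion_sort_comparisons` and the choice of counting comparisons (rather than measuring time) are assumptions made for illustration, not something taken from CLRS.

```python
def insertion_sort_comparisons(items):
    """Sort a copy of items with insertion sort; return (sorted_list, comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison of key against a[j]
            if a[j] <= key:
                break                 # key is already in place relative to a[j]
            a[j + 1] = a[j]           # shift the larger element one slot right
            j -= 1
        a[j + 1] = key
    return a, comparisons


n = 10
_, best = insertion_sort_comparisons(range(n))          # already sorted
_, worst = insertion_sort_comparisons(range(n, 0, -1))  # reverse sorted
print(best, worst)  # 9 45, i.e. n - 1 and n * (n - 1) / 2
```

The sorted input costs n - 1 comparisons and the reversed input costs n(n - 1)/2, so no single Θ(g(n)) describes the running time over all inputs, which is exactly why the exercise needs the same g(n) in both of its bounds.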

There's no contradiction, since CLRS mentioned nothing about insertion sort being theta(N^2).

Related

How can we prove that the running time bound of an algorithm is tight?

Suppose we can prove that an algorithm, invoked with an input of size n, runs in time O(f(n)).
I want to prove that this running time bound is tight. Two questions:
Wouldn't it suffice to give a special input and show that the running time is at least f(n)?
I've read that one possibility for proving the tightness is to "reduce sorting to it". I've no idea what is meant by that
Wouldn't it suffice to give a special input and show that the running
time is at least f(n)?
Yes, assuming you are talking about the worst case complexity.
If you are talking about worst case complexity, and you have proved it is running in O(f(n)), then if you find an input that is "worse" than C*f(n) for some constant C, you have effectively proved that the algorithm (under worst case performance) is in Ω(f(n)), and since O(f(n)) [intersection] Ω(f(n)) = Theta(f(n)), it means your algorithm is running in Theta(f(n)) under worst case analysis.
Note that it should actually be a "family" of examples: if I object "yes, but this is correct only for small n values, and not for n>N (for some N)", you can show me that this family of examples also covers the case where n>N, and the bound is still valid.
Symmetrically, if you proved an algorithm has best case performance of Ω(f(n)), and you found some input that runs "better" (faster) than C*f(n) for some constant C, you have effectively proved the algorithm is Theta(f(n)) under best case analysis.
This trick does NOT work for average case analysis, where you should calculate the expectation of the run time, and a single example is not helpful.
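As a hedged illustration of the "family of examples" idea, the sketch below uses a naive pairwise duplicate check, an algorithm picked only for this example and not mentioned in the question. Its O(n^2) upper bound is clear from the two nested loops; the family "all elements distinct" exists for every size n and forces n(n - 1)/2 comparisons, which is at least n^2/4 for n >= 2, so the worst case is Theta(n^2).

```python
def has_duplicate_steps(items):
    """Naive duplicate check; returns (found_duplicate, comparisons_made)."""
    steps = 0
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
            if items[i] == items[j]:
                return True, steps
    return False, steps


def worst_case_input(n):
    """One member of the bad family for each size n: all elements distinct."""
    return list(range(n))


for n in (2, 4, 8, 16, 32, 64, 128):
    _, steps = has_duplicate_steps(worst_case_input(n))
    # The lower bound C * f(n) with C = 1/4 and f(n) = n^2 holds for every size tried.
    print(n, steps, steps >= n * n / 4)
```

A single input of one fixed size would not be enough; what matters is that the same construction works for arbitrarily large n.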
I've read that one possibility for proving the tightness is to "reduce
sorting to it". I've no idea what is meant by that
This is done to prove a much stronger claim, that there is no algorithm (at all) that can solve some problem at the desired time.
The common usage of it is to assume there is some black box algorithm A that runs in some o(g(n)) time, and then use A to build a sorting algorithm that runs in o(nlogn) time. However, since sorting is an Ω(nlogn) problem, we have a contradiction, and we can conclude that our assumptions about A are wrong, and it cannot run in the desired time.
This helps us to prove a stronger claim: not only does OUR algorithm have this lower bound, but ALL algorithms that solve the specific problem have this lower bound.
ad 1.: Yes, but you must be able to find such an input of size n for any n. An example of size 3 that is processed in 9 steps doesn't really prove anything.
ad 2.: If you can modify a sequence of elements so that your algorithm effectively sorts it (you get a sorted sequence after some processing of the output), you've reduced sorting to it. And because comparison-based sorting needs Ω(n log(n)) time in the worst case, this can be used to prove that your algorithm cannot be asymptotically faster than n log(n).
Of course, the input and output processing functions must themselves run in o(n log(n)) time for this argument to be valid, otherwise you could sort the array in the processing step and "prove" that an O(1) algorithm that just returns the input array in fact needs at least n log(n) time :).
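A textbook illustration of "reducing sorting to it" uses the convex hull problem, which is not mentioned in the question and is brought in here only as an example. Map each number x to the point (x, x^2); every point lies on a parabola, so every point is a hull vertex, and a hull routine that returns vertices in counterclockwise order starting from the leftmost point hands back the numbers in sorted order. The sketch below assumes distinct integer inputs. The names `convex_hull_ccw` and `sort_via_hull` are hypothetical, and the gift-wrapping hull used as the "black box" is itself quadratic; it only stands in for whatever hull algorithm is being analysed.

```python
def cross(o, a, b):
    """Positive if b lies to the left of the ray o -> a, negative if to the right."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_ccw(points):
    """Black box: hull vertices in counterclockwise order from the leftmost point.

    Gift wrapping, chosen only because it does not sort internally.
    """
    if len(points) <= 1:
        return list(points)
    start = min(points)               # leftmost point
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = next(p for p in points if p != current)
        for p in points:
            # Keep the candidate that has every other point on its left.
            if p != current and cross(current, candidate, p) < 0:
                candidate = p
        current = candidate
        if current == start:
            break
    return hull


def sort_via_hull(values):
    """Sort distinct integers using only O(n) work around the hull black box."""
    points = [(x, x * x) for x in values]           # O(n) input processing
    return [x for x, _ in convex_hull_ccw(points)]  # O(n) output processing


print(sort_via_hull([5, -2, 0, 7, -9, 3]))  # [-9, -2, 0, 3, 5, 7]
```

Because the wrapping around the black box is linear, a hull algorithm faster than n log n would give a sorting algorithm faster than n log n, which is the contradiction the answers describe (under a comparison-style model of computation).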

Asymptotic Analysis questions

I found a couple of questions on geeksforgeeks.org that I can't seem to understand (#1 and #3). I was hoping someone could clarify the answers for me:
clarify whether true/valid or false
1. Time Complexity of QuickSort is Θ(n^2)
I answered true but it is false, why? If quicksort has a time complexity of O(n^2) and we know that Θ(g(n))={f(n) where c1*g(n) <= f(n) <=c2*g(n) and n >= n0} then doesn't that prove that it is true since c2*g(n) being the upper bound can equal f(n)?
2. Time Complexity of QuickSort is O(n^2) - true
3. Time complexity of all computer algorithms can be written as Ω(1)
This is true but I have no understanding of why this is true. A search algorithm can have a lower bound of Ω(1) assuming we find what we were looking for on the first element, but how does this hold true for ALL computer algorithms, such as the insertion sort algorithm where the worst case is O(n^2)?
link:
http://www.geeksforgeeks.org/analysis-of-algorithms-set-3asymptotic-notations/
"Time Complexity of QuickSort is Θ(n^2)": this means that for every input of size n, the time taken by the algorithm to produce the output grows like f(n)=n^2, up to constant factors. But we know this is not true for quicksort, because for some inputs its running time grows like g(n)=nlogn. So we need to specify whether we mean the worst, best or average case. It is correct to say "Worst case time complexity of quicksort is Θ(n^2)".
"Time Complexity of QuickSort is O(n^2)": this means that for each input of size n, the running time of the algorithm is at most a constant multiple of f(n)=n^2. This allows for inputs on which the running time is less than f(n)=n^2. We know the best case time complexity of quicksort is g(n)=nlogn, and g(n)<f(n). As this statement covers all the cases, the statement is true.
Similarly it is correct to say "Time complexity of quicksort is Ω(nlogn)", because this means the running time of the algorithm is at least nlogn, and n^2>nlogn.
"Time complexity of all computer algorithms can be written as Ω(1)": here 1 represents a constant time function. The above statement implies that to execute any computer algorithm we need at least some constant amount of time, which is correct for all computer algorithms.
The worst case scenario for QuickSort is O(n^2). But you expect it to run in O(n log n) time. Hence the running time of the algorithm varies per case and you cannot use the theta symbol to give the general running time of the algorithm.
And of course the lower bound on the running time of any algorithm is constant time (Ω(1)). The algorithm doesn't have to reach this lower bound, but it has to be run, and so it performs at least one operation.
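The gap between the cases can be seen by counting comparisons. The sketch below is a minimal quicksort variant that always picks the first element as pivot; that pivot rule, and the name `quicksort_comparisons`, are assumptions made for this illustration (the question does not say which variant is meant). Each element of a partition is charged one comparison against the pivot.

```python
def quicksort_comparisons(items):
    """Count pivot comparisons made by first-element-pivot quicksort."""
    comparisons = 0
    stack = [list(items)]             # explicit stack avoids deep recursion
    while stack:
        part = stack.pop()
        if len(part) <= 1:
            continue
        pivot, rest = part[0], part[1:]
        comparisons += len(rest)      # every other element is compared to the pivot
        stack.append([x for x in rest if x < pivot])
        stack.append([x for x in rest if x >= pivot])
    return comparisons


print(quicksort_comparisons(range(8)))               # 28: sorted input, n(n-1)/2
print(quicksort_comparisons([4, 2, 6, 1, 3, 5, 7]))  # 10: pivots split evenly
```

On sorted input the count is n(n - 1)/2, while an input whose pivots split evenly needs far fewer comparisons, so O(n^2) holds for every input but Θ(n^2) holds only for the worst case.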

Still sort of confused about Big O notation

So I've been trying to understand Big O notation as well as I can, but there are still some things I'm confused about. So I keep reading that if something is O(n), it usually is referring to the worst-case of an algorithm, but that it doesn't necessarily have to refer to the worst case scenario, which is why we can say the best-case of insertion sort for example is O(n). However, I can't really make sense of what that means. I know that if the worst-case is O(n^2), it means that the function that represents the algorithm in its worst case grows no faster than n^2 (there is an upper bound). But if you have O(n) as the best case, how should I read that? In the best case, the algorithm grows no faster than n? What I picture is a graph with n as the upper bound.
If the best case scenario of an algorithm is O(n), then n is the upper bound of how fast the operations of the algorithm grow in the best case, so they cannot grow faster than n...but wouldn't that mean that they can grow as fast as O(log n) or O(1), since they are below the upper bound? That wouldn't make sense though, because O(log n) or O(1) is a better scenario than O(n), so O(n) WOULDN'T be the best case? I'm so lost lol
Big-O, Big-Θ, Big-Ω are independent from worst-case, average-case, and best-case.
The notation f(n) = O(g(n)) means f(n) grows no more quickly than some constant multiple of g(n).
The notation f(n) = Ω(g(n)) means f(n) grows no more slowly than some constant multiple of g(n).
The notation f(n) = Θ(g(n)) means both of the above are true.
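For reference, the three informal statements above correspond to the usual formal definitions, written out here in LaTeX. The symbols c, c_1, c_2 and n_0 are the customary names for the constants and are not taken from the answer itself.

```latex
\begin{align*}
f(n) = O(g(n))      &\iff \exists\, c > 0,\ n_0 :\ 0 \le f(n) \le c\, g(n) \quad \text{for all } n \ge n_0 \\
f(n) = \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 :\ 0 \le c\, g(n) \le f(n) \quad \text{for all } n \ge n_0 \\
f(n) = \Theta(g(n)) &\iff \exists\, c_1, c_2 > 0,\ n_0 :\ 0 \le c_1\, g(n) \le f(n) \le c_2\, g(n) \quad \text{for all } n \ge n_0
\end{align*}
```

Nothing in these definitions mentions best, worst or average case; that choice is made when deciding which function f(n) to plug in.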
Note that f(n) here may represent the best-case, worst-case, or "average"-case running time of a program with input size n.
Furthermore, "average" can have many meanings: it can mean the average input or the average input size ("expected" time), or it can mean in the long run (amortized time), or both, or something else.
Often, people are interested in the worst-case running time of a program, amortized over the running time of the entire program (so if something costs n initially but only costs 1 for each of the next n elements, it averages out to a cost of about 2 per element). The most useful thing to measure here is the least upper bound on the worst-case time; so, typically, when you see someone asking for the Big-O of a program, this is what they're looking for.
Similarly, to prove a problem is inherently difficult, people might try to show that the worst-case (or perhaps average-case) running time is at least a certain amount (for example, exponential).
You'd use Big-Ω notation for these, because you're looking for lower bounds on these.
However, there is no special relationship between worst-case and Big-O, or best-case and Big-Ω.
Both can be used for either; it's just that one of them is more typical than the other.
So, upper-bounding the best case isn't terribly useful. Yes, if the algorithm always takes O(n) time, then you can say it's O(n) in the best case, as well as on average, as well as the worst case. That's a perfectly fine statement, except the best case is usually very trivial and hence not interesting in itself.
Furthermore, note that f(n) = n = O(n^2) -- this is technically correct, because f grows more slowly than n^2, but it is not useful because it is not the least upper bound -- there's a very obvious upper bound that's more useful than this one, namely O(n). So yes, you're perfectly welcome to say the best/worst/average-case running time of a program is O(n!). That's mathematically perfectly correct. It's just useless, because when people ask for Big-O they're interested in the least upper bound, not just a random upper bound.
It's also worth noting that it may simply be insufficient to describe the running-time of a program as f(n). The running time often depends on the input itself, not just its size. For example, it may be that even queries are trivially easy to answer, whereas odd queries take a long time to answer.
In that case, you can't just give f as a function of n -- it would depend on other variables as well. In the end, remember that this is just a set of mathematical tools; it's your job to figure out how to apply it to your program and to figure out what's an interesting thing to measure. Using tools in a useful manner needs some creativity, and math is no exception.
Informally speaking, best case has O(n) complexity means that when the input meets
certain conditions (i.e. is best for the algorithm performed), then the count of
operations performed in that best case, is linear with respect to n (e.g. is 1n or 1.5n or 5n).
So if the best case is O(n), usually this means that in the best case it is exactly linear
with respect to n (i.e. asymptotically no smaller and no bigger than that) - see (1). Of course,
if in the best case that same algorithm can be proven to perform at most c * log N operations
(where c is some constant), then this algorithm's best case complexity would be informally
denoted as O(log N) and not as O(N) and people would say it is O(log N) in its best case.
Formally speaking, "the algorithm's best case complexity is O(f(n))"
is an informal and wrong way of saying that "the algorithm's complexity
is Ω(f(n))" (in the sense of the Knuth definition - see (2)).
See also:
(1) Wikipedia "Family of Bachmann-Landau notations"
(2) Knuth's paper "Big Omicron and Big Omega and Big Theta"
(3) Big Omega notation - what is f = Ω(g)?
(4) What is the difference between Θ(n) and O(n)?
(5) What is a plain English explanation of "Big O" notation?
I find it easier to think of O() as about ratios than about bounds. It is defined as bounds, and so that is a valid way to think of it, but it seems a bit more useful to think about "if I double the number/size of inputs to my algorithm, does my processing time double (O(n)), quadruple (O(n^2)), etc...". Thinking about it that way makes it a little bit less abstract - at least to me...
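The doubling idea can be tried out directly. The sketch below counts loop iterations instead of measuring wall-clock time so that the result is deterministic; the two toy step counters and the helper `doubling_ratio` are made up for this illustration.

```python
def count_linear(n):
    """Steps taken by a single loop over n items."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def count_quadratic(n):
    """Steps taken by two nested loops over n items."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


def doubling_ratio(count, n):
    """How much more work does doubling the input size cause?"""
    return count(2 * n) / count(n)


print(doubling_ratio(count_linear, 50))     # 2.0
print(doubling_ratio(count_quadratic, 50))  # 4.0
```

With real timings the ratios are only approximate, and lower-order terms can blur them for small n, so this is a rule of thumb rather than a definition.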

Big theta notation of insertion sort algorithm

I'm studying asymptotic notations from the book and I can't understand what the author means. I know that if f(n) = Θ(n^2) then f(n) = O(n^2). However, I understand from the author's words that for insertion sort function algorithm f(n) = Θ(n) and f(n)=O(n^2).
Why? Does the big omega or big theta change with different inputs?
He says that:
"The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input. "
However it is different for big-oh notation. What does he mean? What is the difference between them?
I'm so confused. I'm copy pasting it below:
Since O-notation describes an upper bound, when we use it to bound the worst-case running
time of an algorithm, we have a bound on the running time of the algorithm on every input.
Thus, the O(n^2) bound on worst-case running time of insertion sort also applies to its running
time on every input. The Θ(n^2) bound on the worst-case running time of insertion sort,
however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input.
For example, when the input is already sorted, insertion sort runs in
Θ(n) time.
Does the big omega or big theta change with different inputs?
Yes. To give a simpler example, consider linear search in an array from left to right. In the worst and average case, this algorithm takes f(n) = a × n/2 + b expected steps for some constants a and b. But when the left element is guaranteed to always hold the key you're looking for, it always takes a + b steps.
Since Θ denotes a strict bound, and Θ(n) != Θ(n²), it follows that the Θ for the two kinds of input is different.
EDIT: as for Θ and big-O being different on the same input, yes, that's perfectly possible. Consider the following (admittedly trivial) example.
When we set n to 5, then n = 5 and n < 6 are both true. But when we set n to 1, then n = 5 is false while n < 6 is still true.
Similarly, big-O is just an upper bound, just like < on numbers, while Θ is a strict bound like =.
(Actually, Θ is more like a < n < b for constants a and b, i.e. it defines something analogous to a range of numbers, but the principle is the same.)
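The linear search example can be written down as a small sketch that counts comparisons; the function name `linear_search_comparisons` is an assumption for illustration. With the key in the leftmost slot the count is 1 for every n, and with the key absent it is n.

```python
def linear_search_comparisons(items, key):
    """Scan left to right; return how many elements were compared with key."""
    for comparisons, value in enumerate(items, start=1):
        if value == key:
            return comparisons
    return len(items)                 # key absent: every element was compared


for n in (10, 100, 1000):
    data = list(range(n))
    best = linear_search_comparisons(data, 0)    # key in the leftmost position
    worst = linear_search_comparisons(data, -1)  # key not present at all
    print(n, best, worst)
```

The best-case count stays constant while the worst-case count grows linearly, so the tight bound is Θ(1) for one kind of input and Θ(n) for the other, while O(n) covers both.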
Refer to CLRS edition 3,
page 44 (Asymptotic notation, functions, and running times).
It says:
Even when we use asymptotic notation to apply to the running time of an algorithm, we need to understand which running time we mean. Sometimes we are interested in the worst-case running time. Often, however, we wish to characterize the running time no matter what the input. In other words, we often wish to make a blanket statement that covers all inputs, not just the worst case.
Takeaways from the above paragraph:
The worst case provides the upper limit for the running time of an algorithm.
Thus, the O(n^2) bound on worst-case running time of insertion sort also applies to its running time on every input.
But Theta(n^2) bound on the worst-case running time of insertion sort, however, does not imply Theta(n^2) bound on the running time of insertion sort on every input.
This is because the best case running time of insertion sort is Theta(n) (when the list is already sorted).
We usually write the worst case time complexity of an algorithm, but when the best case and average case are taken into account, the time complexity varies according to these cases.
In simple words, the running time of a program is described as a function of its input size, i.e. f(n).
The = here is asymmetric: an+b=O(n) means that f(n)=an+b belongs to the set O(n). So we can also say an+b=O(n^2), and it's true because f(n), for any constants a and b, also belongs to the set O(n^2).
Thus Big-Oh (O) only gives an upper bound; you can say the notation gives a blanket statement, which means all the inputs of a given size are covered, not just the worst-case ones (for insertion sort, the worst case is an array of size n in reverse order).
So n=O(n^2) is true, but it is too loose to be useful when describing the worst case running time of an algorithm, because the worst case running time gives an upper bound on the running time for any input. And as we all know, in the case of insertion sort the running time depends on how sorted the input already is for a given array size. So if the array is sorted, the running time will be linear.
So we need a tight asymptotic bound notation to describe our worst case, which is provided by Θ notation; thus the worst case of insertion sort will be Θ(n^2) and the best case will be Θ(n).
we have a bound on the running time of the algorithm on every input
It means that if there is a set of inputs with running time n^2 while other have less, then the algorithm is O(n^2).
The Θ(n^2) bound on the worst-case running time of insertion sort,
however, does not imply a Θ(n^2) bound on the running time of
insertion sort on every input
He is saying that the converse is not true. If an algorithm is O(n^2), it does not mean every single input will run in quadratic time.
My academic theory on the insertion sort algorithm is far away in the past, but from what I understand in your copy-paste:
big-O notation always describes the worst case, but big-Theta describes some sort of average for typical data.
Take a look at this: What is the difference between Θ(n) and O(n)?

Analysis of shell sort

I am reading a book on algorithms, and the analysis of the Shellsort algorithm says the following:
The worst-case running time of Shellsort, using Shell's increments, is Theta(n square).
The proof requires showing not only an upper bound on the worst-case
running time but also showing that there exists some input that
actually takes Omega(n square) time to run. We prove
the lower bound first, by constructing a bad case.
My question on the above is:
Why does the author mention a bad case to check the lower bound? I thought that to get a lower bound we should take the best case.
Could you please clarify the above?
Thanks!
The reason he is considering both upper bound and lower bound is because he wants to express worst-case time using Theta(Θ) notation.
The theta notation requires you to establish both a upper bound and a
lower bound.
Theta is a bounding constraint (both upper and lower). To show an algorithm has running time Theta(f(n)) you have to establish two things:
1) in the worst case, the algorithm runs in time O(f(n)) for all cases [worst case complexity]
2) The algorithm must take time Omega(f(n)) [there are real examples that hit the worst case scenario]
You establish the second part by finding particularly bad cases for the algorithm.
To show that something is Theta(f(n)), one has to show both an upper and a lower bound, which is what the text is doing.
The claim that "the worst-case time is Theta(n square)" requires one to demonstrate both an upper and a lower bound for said worst-case time.
Similarly, a statement about the average-case time being Theta(f(n)) would require two bounds on the average-case time.
And so on.
As @Patrick87 succinctly puts it:
the bound is orthogonal to the case.
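To see what "constructing a bad case" can look like, here is a hedged sketch of one commonly cited construction for Shell's increments (n/2, n/4, ..., 1) with n a power of two: interleave the n/2 smallest and the n/2 largest values. The quoted passage does not spell out the book's construction, so treat this as an assumption, and the function names are made up. Every pass with an even gap only compares elements of the same half, which are already in order, so all the work is left to the final gap-1 pass.

```python
def shellsort_shifts(items):
    """Shellsort with Shell's increments; returns (sorted_list, shifts_per_gap)."""
    a = list(items)
    n = len(a)
    shifts_per_gap = {}
    gap = n // 2
    while gap >= 1:
        shifts = 0
        for i in range(gap, n):
            value = a[i]
            j = i
            while j >= gap and a[j - gap] > value:
                a[j] = a[j - gap]     # move the larger element up by one gap
                j -= gap
                shifts += 1
            a[j] = value
        shifts_per_gap[gap] = shifts
        gap //= 2
    return a, shifts_per_gap


def bad_input(n):
    """Interleave small and large halves: 1, n/2+1, 2, n/2+2, ... (n a power of two)."""
    half = n // 2
    out = []
    for k in range(half):
        out += [k + 1, half + k + 1]
    return out


result, shifts = shellsort_shifts(bad_input(16))
print(shifts)  # {8: 0, 4: 0, 2: 0, 1: 28}
```

The last pass performs (n/2)(n/2 - 1)/2 shifts, which grows like n^2, so this family of inputs gives the Omega(n square) lower bound on the worst case; combined with the upper bound it yields Theta(n square).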

Resources