Analysis of the Shellsort algorithm

I am reading a book on algorithms, and its analysis of the Shellsort algorithm says the following:

The worst-case running time of Shellsort, using Shell's increments, is Theta(n^2).
The proof requires showing not only an upper bound on the worst-case
running time but also showing that there exists some input that
actually takes Omega(n^2) time to run. We prove
the lower bound first, by constructing a bad case.
My question on the above: why is the author constructing a bad case to establish the lower bound? I thought that to get a lower bound we should take the best case. Could someone clarify this?
Thanks!

The reason he considers both an upper bound and a lower bound is that he wants to express the worst-case time using Theta (Θ) notation, and Theta notation requires you to establish both an upper bound and a lower bound.

Theta is a bounding constraint (both upper and lower). To show that an algorithm has running time Theta(f(n)), you have to establish two things:
1) In the worst case, the algorithm runs in time O(f(n)) on all inputs [the worst-case upper bound].
2) The algorithm must take time Omega(f(n)) on some input [there are real inputs that actually hit the worst-case scenario].
You establish the second part by finding particularly bad cases for the algorithm, as sketched below.
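For concreteness, here is a minimal Python sketch (mine, not the book's; shellsort_moves and bad_case are hypothetical helper names) of the classic bad case for Shell's increments. For n a power of two, every increment except the final 1 is even, so the small values parked in the odd positions never interact with the large values in the even positions until the last pass, which then does quadratic work:

    def shellsort_moves(a):
        """Shellsort with Shell's increments n/2, n/4, ..., 1; returns element moves."""
        moves = 0
        gap = len(a) // 2
        while gap > 0:
            for i in range(gap, len(a)):
                tmp = a[i]
                j = i
                while j >= gap and a[j - gap] > tmp:
                    a[j] = a[j - gap]
                    j -= gap
                    moves += 1
                a[j] = tmp
            gap //= 2
        return moves

    def bad_case(n):
        """n a power of two: large half in even slots, small half in odd slots."""
        a = [0] * n
        a[::2] = range(n // 2, n)   # even positions: the n/2 largest values, ascending
        a[1::2] = range(n // 2)     # odd positions: the n/2 smallest values, ascending
        return a

    for n in [256, 512, 1024]:
        print(n, shellsort_moves(bad_case(n)))
    # Doubling n roughly quadruples the move count: Theta(n^2) on this family.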

To show that something is Theta(f(n)), one has to show both an upper and a lower bound, which is what the text is doing.
The claim that "the worst-case time is Theta(n^2)" requires one to demonstrate both an upper and a lower bound for said worst-case time.
Similarly, a statement about the average-case time being Theta(f(n)) would require two bounds on the average-case time.
And so on.
As #Patrick87 succinctly puts it:
the bound is orthogonal to the case.

Related

Is the Big-Omega time complexity of all search algorithms O(1)?

I understand that Big Omega defines the lower bound of a function (or its best-case runtime).
Considering that almost every search algorithm could "luck out" and find the target element on the first iteration, would it be fair to say that its Big-Omega time complexity is Ω(1)?
I also understand that defining Ω(1) as the Big Omega may not be useful (other lower bounds may be tighter, i.e. closer to the evaluated function), but the question is: is it correct?
I've found multiple sources claiming that linear search is Ω(n), even though some cases could complete in a single step, which is different from the best-case scenario as I understand it.
The lower bound (𝛺) is not the fastest answer a given algorithm can give.
The lower bound of a given problem is equal to the worst case scenario of the best algorithm that solves the problem. When doing complexity analysis, you should never forget that "luck" is always in the hands of the input (the instance the algorithm is trying to solve).
When trying to find a lower bound, you imagine the "perfect algorithm" and you try to "trap" it in a very hard case. Usually the algorithm is not defined and is only described in terms of its (hypothetical) performance. You would use arguments such as: "If the ideal algorithm is that fast, it cannot have this particular piece of knowledge and will therefore fail on this particular instance, i.e. the ideal algorithm doesn't exist." Replace "ideal" with the lower bound you are trying to prove.
For example, the lower bound for the min-search problem in an unsorted array is Ω(n). The proof is quite trivial and, like most such proofs, goes by contradiction: an algorithm A in o(n) will not see at least one item of the input array; if the item it did not see happens to be the minimum, A will fail. The contradiction proves that the problem is in Ω(n).
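As a toy illustration of that argument (my own sketch; sloppy_min is a hypothetical procedure, not a real algorithm anyone proposes), here is a "min-finder" that inspects only n - 1 elements, together with the adversarial input that defeats it:

    def sloppy_min(a):
        """A hypothetical o(n) 'algorithm': it never inspects the last element."""
        best = a[0]
        for x in a[:-1]:              # looks at only n - 1 of the n items
            if x < best:
                best = x
        return best

    a = [5, 3, 8, 1]                  # the adversary hides the minimum in the unread slot
    print(sloppy_min(a), min(a))      # prints "3 1": sloppy_min fails on this instance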
The notations O, o, Θ, Ω, and ω are used in characterizing mathematical functions; for example, f(n) = n^3 log n is in O(n^4) and in Ω(n^3).
So, the question is what mathematical functions we apply them to.
The mathematical functions that we tend to be interested in are things like "the worst-case time complexity of such-and-such algorithm, as a function of the size of its input", or "the average-case space complexity of such-and-such procedure, as a function of the largest element in its input". (Note: when we just say "the complexity of such-and-such algorithm", that's usually shorthand for its worst-case time complexity, as a function of some characteristic of its input that's hopefully obvious in context. Either way, it's still a mathematical function.)
We can use any of these notations in characterizing those functions. In particular, it's fine to use Ω in characterizing the worst case or average case.
We can also use any of these notations in characterizing things like "the best-case […]". That's unusual, but there are times when it may be relevant. Notably, we're not limited to Ω for that; just as we can use Ω in characterizing the worst case, we can also use O in characterizing the best case. It's all about what characterizations we're interested in.
You are confusing two different topics: Lower/upper bound, and worst-case/best-case time complexity.
The short answer to your question is: Yes, all search algorithms have a lower bound of Ω(1). Linear search (in the worst case, and on average) also has a lower bound of Ω(n), which is a stronger and more useful claim. The analogy is that 1 < π but also 3 < π, the latter being more useful. So in this sense, you are right.
However, your confusion seems to be between the notations for complexity classes (big-O, big-Ω, big-θ etc), and the concepts of best-case, worst-case, average case. The point is that the best case and the worst case time complexities of an algorithm are completely different functions, and you can use any of the notations above to describe any of them. (NB: Some claim that big-Ω automatically and exclusively describes best case time complexity and that big-O describes worst case, but this is a common misconception. They just describe complexity classes and you can use them with any mathematical functions.)
It is correct to say that the average time complexity of linear search is Ω(n), because we are just talking about the function that describes its average time complexity. Its best-case complexity is a different function, which happens not to be Ω(n) because, as you say, it can be constant time.
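As a quick sanity check of that claim about the average case, here is a small Python sketch (mine; probes is a hypothetical helper) showing that the average number of probes of linear search, taken over all n possible targets, is (n + 1) / 2, which is indeed Ω(n), even though the best case is a single probe:

    def probes(a, target):
        """Number of elements linear search inspects before finding target."""
        for i, x in enumerate(a):
            if x == target:
                return i + 1
        return len(a)

    n = 1000
    a = list(range(n))
    average = sum(probes(a, t) for t in a) / n
    print(average)                    # 500.5, i.e. (n + 1) / 2: the average case is Theta(n)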

Big Oh complexity of polynomial times log N

I understand O(N lg N) is linearithmic. But what is O(N^m (lg N))? Would it just be considered polynomial running time, since the polynomial part grows faster?
If you know that the asymptotic behaviour of a function or algorithm can be described by O(N^m log N), you should probably stick with just that. However, you could naturally say that one upper bound on the time complexity of that same function/algorithm is a polynomial one, i.e. O(N^(m+1)).
This is acceptable, as the Big-O upper bound on the asymptotic behaviour need not be a tight one.
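As a quick justification of that non-tight bound (a sketch, assuming N >= 1 and m >= 0):

    log N <= N                            for all N >= 1
    N^m * log N <= N^m * N = N^(m+1)      multiplying both sides by N^m
    hence N^m log N = O(N^(m+1))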
Now, let's say you have some algorithm and have found an upper asymptotic bound on it of O(N^(m+1)), but you know that you used quite coarse tools when deriving this bound; i.e., possibly there exist tighter asymptotic bounds. Before proceeding on a crusade in calculus and analysis, however, you could ask yourself: is this bound good enough for my purposes (e.g. making sure the algorithm doesn't run in exponential time)? If so, just use the so-so but acceptable bound you've derived.
If you've already derived a tighter bound, however, it is probably best to stick with that bound when presenting the asymptotic behaviour of your function or algorithm.

Understanding Big-Ω (Big-Omega) notation

I was doing some reading on logarithms and the rate of growth of the running time of algorithms.
I have, however, a problem understanding the Big-Ω (Big-Omega) notation.
I know that we use it for 'asymptotic lower bounds', and that we can express the idea that an algorithm takes at least a certain amount of time.
Consider this example:
var a = [1,2,3,4,5,6,7,8,9,10];
Somebody chooses a number. I write a program that tries to guess this number using a linear search (1, 2, 3, 4... until it guesses the number).
I can say that the running time of the algorithm would be a function of the size of its input, so these are true (n is the number of elements in an array):
Θ(n) (Big-Theta notation - asymptotically tight bound)
O(n) (Big-O notation - upper bound)
When it comes to Big-Ω, in my understanding, the algorithm's running time would be Ω(1), since one is the least number of guesses needed to find the chosen number (if, for example, a player chose 1, the first item in the array).
I reckon this because of the definition I found on Khan Academy:
Sometimes, we want to say that an algorithm takes at least a certain
amount of time, without providing an upper bound. We use big-Ω
notation; that's the Greek letter "omega."
Am I right to say that this algorithm's running time is Ω(1)? Is it also true that it is Ω(n)? If yes, why?
Big O, Theta, and Omega notation all refer to how a solution scales asymptotically as the size of the problem tends to infinity; however, they should really be prefaced with what you are measuring.
Usually when one talks about O(n), one means that the worst-case complexity is O(n); however, one does sometimes see it used for typical running times, particularly for heuristics or algorithms which have an element of randomness or are not strictly guaranteed to converge at all.
So here, we are presumably talking about the worst-case complexity, which is Theta(n); and since it is Theta(n), it is also O(n) and Omega(n).
One way to prove a lower bound when it is unknown is to say: X is the easiest case for this algorithm; here the best case is O(1), so we can say that the algorithm takes at least Omega(1) and at most O(n), with Theta unknown. That is correct usage, but the aim is to get the highest bound for Omega that is still true, and the lowest bound for O that is still true. Here Omega(n) is obvious, so it's better to say Omega(n) than Omega(1), just as it's better to say O(n) rather than O(n^2).
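To make that concrete with the array from the question, here is a small Python sketch (mine; guesses_needed is a hypothetical helper) of the guessing program, counting guesses for the luckiest and unluckiest choices:

    def guesses_needed(a, chosen):
        """Guess a[0], a[1], ... until we hit the chosen number; return the count."""
        for count, guess in enumerate(a, start=1):
            if guess == chosen:
                return count
        return len(a)

    a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(guesses_needed(a, 1))       # best case: 1 guess, so Omega(1) is true but weak
    print(guesses_needed(a, 10))      # worst case: n guesses, so the worst case is Omega(n)

The best case stays at one guess no matter how large the array gets, while the worst case grows with n; that is why Omega(n) is the bound worth stating for the worst case.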

Relation between worst case and average case running time of an algorithm

Let's say A(n) is the average running time of an algorithm and W(n) is the worst. Is it correct to say that
A(n) = O(W(n))
is always true?
Big O notation is kind of tricky, since it only defines an upper bound on the execution time of a given algorithm.
What this means is: if f(x) = O(g(x)), then for every other function h(x) such that g(x) < h(x), you'll also have f(x) = O(h(x)). The problem is, are those overestimated execution times useful? And the clear answer is: not at all. What you usually want is the "smallest" upper bound you can get, but this is not strictly required by the definition, so you can play around with it.
You can get stricter bounds using the other notations, such as Big Theta.
So, the answer to your question is yes, A(n) = O(W(n)), but that doesn't give any useful information about the algorithm.
If A(n) and W(n) are functions, then yes, you can make such a statement in the general case; it follows directly from the formal definition of big-O.
Note that in big-O terms there is little point in doing so, since it makes the real complexity harder to see. (In general, the three cases - worst, average, best - exist precisely to show the complexity more clearly.)
Yes, it is not a mistake to say so.
People use asymptotic notation to convey the growth of running time on specific cases in terms of input size. Comparing the average-case complexity with the worst-case complexity doesn't provide much insight into the function's growth in either case.
Whilst it is not wrong, it fails to provide more information than what we already know.
I'm unsure of exactly what you're trying to ask, but bear in mind the below.
The typical algorithm used to show the difference between average and worst case running time complexities is Quick Sort with poorly chosen pivots.
On average with a random sample of unsorted data, the runtime complexity is n log(n). However, with an already sorted set of data where pivots are taken from either the front/end of the list, the runtime complexity is n^2.
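To see those two cases side by side, here is a rough sketch (my own; it counts comparisons rather than wall-clock time, and quicksort_comparisons is a hypothetical helper) of a first-element-pivot quicksort run on random versus already-sorted input:

    import random

    def quicksort_comparisons(a):
        """First-element-pivot quicksort (iterative); returns the comparison count."""
        comparisons = 0
        stack = [(0, len(a) - 1)]
        while stack:
            lo, hi = stack.pop()
            if lo >= hi:
                continue
            pivot = a[lo]
            i = lo + 1
            for j in range(lo + 1, hi + 1):    # Lomuto-style partition
                comparisons += 1
                if a[j] < pivot:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[lo], a[i - 1] = a[i - 1], a[lo]  # place the pivot
            stack.append((lo, i - 2))
            stack.append((i, hi))
        return comparisons

    n = 2000
    print("random:", quicksort_comparisons(random.sample(range(n), n)))  # on the order of n log n
    print("sorted:", quicksort_comparisons(list(range(n))))              # n(n-1)/2 = 1999000

The random-input count is on the order of n log n while the sorted-input count is exactly n(n - 1)/2, and n log n = O(n^2) is consistent with A(n) = O(W(n)).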

Contradiction in Cormen regarding Insertion sort

In Cormen, Theorem 3.1 says:
For example, the best-case running time of insertion sort is Big-Omega(n), whereas the worst-case running time of insertion sort is Big-Oh(n^2).
The running time of insertion sort therefore falls between Big-Omega(n) and Big-Oh(n^2).
Now if we look at the Exercise 3.1-6 it asks
Prove that the running time of an algorithm is Big-theta(g(n)) iff its worst case running time is Big-oh(g(n)) and its best case running time is big-omega(g(n))
Am I the only one who sees a contradiction here?
I mean, if we abide by what the exercise asks us to prove, we conclude that for an asymptotically tight bound (f(n) = Big-Theta(g(n))) we need f(n) = Big-Omega(g(n)) in the algorithm's best case and f(n) = Big-Oh(g(n)) in its worst case.
But for insertion sort, the best-case time complexity is Big-Omega(n) while the worst-case time complexity is Big-Oh(n^2).
I think you are a bit confused here. Let me clarify a few points for you.
"Running time" can mean two things: the actual running time of the program, or the bounding function like Theta or Big-Oh (so it helps to call the latter the time complexity, to avoid confusion). Hereafter we will use "running time" for the program's actual running time, and "time complexity" for the Big-Oh/Theta notation.
Once you are clear on Big-Oh, the other functions fall into place easily. When we say T(n) is Omega(g(n)), we mean that to the right of some point k the curve c.g(n) bounds the running-time curve from below. Or, in other words:
T(n) >= c.g(n) for all n >= k, and for some constant c independent of the input size.
And Theta notation is like saying "I am just one function, but using different constants you can make me bound the running-time curve from above and from below."
So when we say T(n) is Theta(g(n)), we mean
c1.g(n) <= T(n) <= c2.g(n) for all n >= k, and for some constants c1 and c2 independent of the input size.
Now that we know what the functions mean, let's see where the confusion in CLRS comes from.
For example, the best-case running time of insertion sort is Big-Omega(n), whereas the worst-case running time of insertion sort is Big-Oh(n^2). The running time of insertion sort therefore falls between Big-Omega(n) and Big-Oh(n^2).
Here, by "running time" CLRS means the actual running time T(n). It's poorly worded, and it's not your fault that you misunderstood. In fact, I would go ahead and say that it's wrong: there is no such thing as "falling in between"; a function is either in the set O(g(n)) or it isn't. So it's an error.
Prove that the running time of an algorithm is Big-theta(g(n)) iff its worst case running time is Big-oh(g(n)) and its best case running time is big-omega(g(n))
Here CLRS means the running time function T(n) and they want you to figure out the time complexity.
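To see why no single Theta fits T(n) here, consider this small sketch (mine, not CLRS's; insertion_sort_comparisons is a hypothetical helper) counting insertion sort's comparisons on its best-case (sorted) and worst-case (reverse-sorted) inputs:

    def insertion_sort_comparisons(a):
        """Standard insertion sort; returns the number of key comparisons."""
        comparisons = 0
        for i in range(1, len(a)):
            key = a[i]
            j = i - 1
            while j >= 0:
                comparisons += 1
                if a[j] > key:
                    a[j + 1] = a[j]    # shift the larger element right
                    j -= 1
                else:
                    break
            a[j + 1] = key
        return comparisons

    n = 1000
    print(insertion_sort_comparisons(list(range(n))))         # sorted input: n - 1 = 999
    print(insertion_sort_comparisons(list(range(n, 0, -1))))  # reversed: n(n-1)/2 = 499500

Since the best and worst cases grow at different rates, no single Theta(g(n)) can describe T(n) over all inputs, which is exactly the point of the exercise.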
There is no contradiction here. The exercise only asks you to prove that a running time is Big-Theta(g(n)) if and only if it is bounded above by Big-Oh(g(n)) in the worst case and below by Big-Omega(g(n)) in the best case, for the same g(n).
Insertion sort runs from Big-Omega(n) up to Big-Oh(n^2), so its running time CANNOT be tightly bounded by Big-Theta(n^2).
As a matter of fact, CLRS never uses Big-Theta(n^2) to tightly bound insertion sort.
There's no contradiction, since CLRS never claims that insertion sort is Theta(n^2).
