Difficulty in comprehending asymptotic notation - algorithm

As far as I know and research,
Big - Oh notation describes the worst case of an algorithm time complexity.
Big - Omega notation describes the best case of an algorithm time complexity.
Big - Theta notation describes the average case of an algorithm time complexity.
Source
However, in recent days, I've seen a discussion in which some guys tell that
O for worst case is a "forgivable" misconception. Θ for best is plain
wrong. Ω for best is also forgivable. But they remain misconceptions.
The big-O notation can represent any complexity. In fact it can
describe the asymptotic behavior of any function.
I remember as well I learnt the former one in my university course. I'm still a student and if I know them wrongly, please could you explain me?

Bachmann-Landau notation has absolutely nothing whatsoever to do with computational complexity of algorithms. This should already be very obvious by the fact that the idea of a computing machine that computes an algorithm didn't really exist in 1894, when Bachmann and Landau invented this notation.
Bachmann-Landau notation describes the growth rates of functions by grouping them together in sets of functions that grow at roughly the same rate. Bachmann-Landau notation does not say anything about what those functions mean. They are just functions. In fact, they don't have to mean anything at all.
All that they mean, is this:
f ∈ ο(g): f grows slower than g
f ∈ Ο(g): f does not grow significantly faster than g
f ∈ Θ(g): f grows as fast as g
f ∈ Ω(g): f does not grow significantly slower than g
f ∈ ω(g): f grows faster than g
It does not say anything about what f or g are, or what they mean.
Note that there are actually two conflicting, incompatible definitions of Ω; The one given here is the one that is more useful for computational complexity theory. Also note that these are only very broad intuitions, when in doubt, you should look at the definitions.
If you want, you can use Bachmann-Landau notation to describe the growth rate of a population of rabbits as a function of time, or the growth rate of a human's belly as a function of beers.
Or, you can use it to describe the best-case step complexity, worst-case step complexity, average-case step complexity, expected-case step complexity, amortized step complexity, best-case time complexity, worst-case time complexity, average-case time complexity, expected-case time complexity, amortized time complexity, best-case space complexity, worst-case space complexity, average-case space complexity, expected-case space complexity, or amortized space complexity of an algorithm.

These assertions are at best inaccurate and at worst wrong. Here is the truth.
The running time of an algorithm is not a function of N (as it also depends on the particular data set), so you cannot discuss its asymptotic complexity directly. All you can say is that it lies between the best and worst cases.
The worst case, best case and average case running times are functions of N, though the average case depends on the probabilistic distribution of the data, so is not uniquely defined.
Then the asymptotic notation is such that
O(f(N)) denotes an upper bound, which can be tight or not;
Ω(f(N)) denotes a lower bound, which can be tight or not;
Θ(f(N)) denotes a bilateral bound, i.e. the conjunction of O(f(N)) and Ω(f(N)); it is perforce tight.
So,
All of the worst-case, best-case and average-case complexities have a Θ bound, as they are functions; in practice this bound can be too difficult to establish and we content ourselves with looser O or Ω bounds instead.
It is absolutely not true that the Θ bound is reserved for the average case.
Examples:
The worst-case of quicksort is Θ(N²) and its best and average cases are Θ(N Log N). So by language abuse, the running time of quicksort is O(N²) and Ω(N Log N).
The worst-case, best-case and average cases of insertion sort are all three Θ(N²).
Any algorithm that needs to look at all input is best-case, average-case and worst-case Ω(N) (and without more information we can't tell any upper bound).
Matrix multiplication is known to be doable in time O(N³). But we still don't know precisely the Θ bound of this operation, which is N² times a slowly growing function. Thus Ω(N²) is an obvious but not tight lower bound. (Worst, best and average cases have the same complexity.)
There is some confusion stemming from the fact that the best/average/worst cases take well-defined durations, while tthe general case lies in a range (it is in fact a random variable). And in addition, algorithm analysis is often tedious, if not intractable, and we use simplifications that can lead to loose bounds.

Related

How to know if algorithm is big O or big theta

Can someone shortly explain why an algorithm would be O(f(n)) and not Θ(f(n). I get that to be Θf(n)) it must be O(f(n)) and Ω(f(n)) but how do you know if a particular algorithm is Θf(n)) or O(f(n)). I think it's hard for me to not see big O as a worst case run time. I know it's just a bound but how is the bound determined. Like I can see a search in a binary search tree as running in constant time if the element is in the root, but I think this has nothing to do with big O.
I think it is very important to distinguish bounds from cases.
Big O, Big Ω, and Big Θ all concern themselves with bounds. They are ways of defining behavior of an algorithm, and they give information on the growth of the number of operations as n (number of inputs) grows increasingly large.
In class the Big O of a worst case is often discussed and I think that can be confusing sometimes because it conflates the idea of asymptotic behavior with a singular worst case. Big O is concerned with behavior as n approaches infinity, not a singular case. Big O(f(x)) is an upper bound. It is a guarantee that regardless of input, the algorithm will having a running time no worse than some positive constant multiplied by f(x).
As you mentioned Θ(f(x)) only exists if Big O is O(f(x)) and Big Ω is Ω(f(x)). In the question title you asked how to determine if an algorithm is Big O or Big Θ. The answer is that the algorithm could be both Θ(f(x)) and O(f(x)). In cases where Big Θ exists, the lower bound is some positive constant A multiplied by f(x) and the upper bound is some positive constant C multiplied by f(x). This means that when Θ(f(x)) exists, the algorithm can perform no worse than C*f(x) and no better than A*f(x), regardless of any input. When Θ(f(x)) exists, you are given a guarantee of how the algorithm will behave no matter what kind of input you feed it.
When Θ(f(x)) exists, so does O(f(x)). In these cases, it is valid to state that the running time of the algorithm is O(f(x)) or that the running time of the algorithm is Θ(f(x)). They are both true statements. Giving the Big Θ notation is just more informative, since it provides information on both the upper and lower bound. Big O only provides information on the upper bound.
When Big O and Big Ω have different functions for their bounds (i.e when Big O is O(g(x)) and Big Ω is Ω(h(x)) where g(x) does not equal h(x)), then Big Θ does not exist. In these cases, if you want to provide a guarantee of the upper bound on the algorithm, you must use Big O notation.
Above all, it is imperative that you differentiate between bounds and cases. Bounds give guarantees on the behavior of the algorithm as n becomes very large. Cases are more on an individual basis.
Let's make an example here. Imagine an algorithm BestSort that sorts a list of numbers by first checking if it is sorted and if it is not, sort it by using MergeSort. This algorithm BestSort has a best case complexity of Ω(n) since it may discover a sorted list and it has a worst case complexity of O(n log(n)) which it inherits from Mergesort. Thus this algorithm has no Theta complexity. Compare this to pure Mergesort which is always Θ(n log(n)) even if the list is already sorted.
I hope this helps you a bit.
EDIT
Since there has been some confusion I will provide some kind of pseudo code:
BestSort(A) {
If (A is sorted) // this check has a complexity of O(n)
return A;
else
return mergesort(A); // mergesort has a complexity of Θ(n log(n))
}

Is "best case performance Θ(1) -> running time ≠ Θ(log n)" valid?

This is an argument for justifying that the running time of an algorithm can't be considered Θ(f(n)) but should be O(f(n)) instead.
E.g. this question about binary search: Is binary search theta log (n) or big O log(n)
MartinStettner's response is even more confusing.
Consider *-case performances:
Best-case performance: Θ(1)
Average-case performance: Θ(log n)
Worst-case performance: Θ(log n)
He then quotes Cormen, Leiserson, Rivest: "Introduction to Algorithms":
What we mean when we say "the running time is O(n^2)" is that the worst-case running time (which is a function of n) is O(n^2) ...
Doesn't this suggest the terms running time and worst-case running time are synonymous?
Also, if running time refers to a function with natural input f(n), then there has to be Θ class which contains it, e.g. Θ(f(n)), right? This indicates that you are obligated to use O notation only when the running time is not known very precisely (i.e. only an upper bound is known).
When you write O(f(n)) that means that the running time of your algorithm is bounded above by a function c*f(n) where c is a constant. That also means that that your algorithm can complete in much less number of steps than c*f(n). We often use the Big-O notation because we want to include the possibility that the algorithm completes faster than we indicate. On the other hand, Theta(f(n)) means that the algorithm always completes in c*f(n) steps. Binary search is O(log(n)) because usually it will complete in log(n) steps, but could complete in one step if you get lucky (best-case performance).
I get always confused, if I read about running times.
For me a running time is the time an implementation of an algorithm needs to be executed on a computer. This can be differ in many ways and so is a complicated thing.
So I think complexity of an algorithm is a better word.
Now, the complexity is (in most cases) a worst-case-complexity. If you know an upper bound for the worst case, you also know that it can only get better in other cases.
So, if you know, that there exist some (maybe trivial) cases, where your algorithm does only a few (constant number) steps and stops, you don't have to care about an lower bound and so you (normaly) use an upper bound in big-O or little-o notation.
If you do your calculations carfully, you can also use the Θ notation.
But notice: all complexities only hold on the cases they are attached to. This means: If you make assumtions like "Input is a best case", this effects your calculation and also the resulting complexity. In the case of binary search you posted the complexity for three different assumtions.
You can generalize it by saying: "The complexity of Binary Search is in O(log n)", since Θ(log n) means "Ω(log n) and O(log n)" and O(1) is a subset of O(log n).
To summerize:
- If you know the very precise function for the complexity, you can give the complexity in Θ-notation
- If you want to get an overall upper bound, you have to use O-notation, until the lower bound over all input cases does not differ from the upper bound.
- In most cases you have to use the O-notation, since the algorithms are too complex to get a close upper and lower bound.

Still sort of confused about Big O notation

So I've been trying to understand Big O notation as well as I can, but there are still some things I'm confused about. So I keep reading that if something is O(n), it usually is referring to the worst-case of an algorithm, but that it doesn't necessarily have to refer to the worst case scenario, which is why we can say the best-case of insertion sort for example is O(n). However, I can't really make sense of what that means. I know that if the worst-case is O(n^2), it means that the function that represents the algorithm in its worst case grows no faster than n^2 (there is an upper bound). But if you have O(n) as the best case, how should I read that as? In the best case, the algorithm grows no faster than n? What I picture is a graph with n as the upper bound, like
If the best case scenario of an algorithm is O(n), then n is the upper bound of how fast the operations of the algorithm grow in the best case, so they cannot grow faster than n...but wouldn't that mean that they can grow as fast as O(log n) or O(1), since they are below the upper bound? That wouldn't make sense though, because O(log n) or O(1) is a better scenario than O(n), so O(n) WOULDN'T be the best case? I'm so lost lol
Big-O, Big-Θ, Big-Ω are independent from worst-case, average-case, and best-case.
The notation f(n) = O(g(n)) means f(n) grows no more quickly than some constant multiple of g(n).
The notation f(n) = Ω(g(n)) means f(n) grows no more slowly than some constant multiple of g(n).
The notation f(n) = Θ(g(n)) means both of the above are true.
Note that f(n) here may represent the best-case, worst-case, or "average"-case running time of a program with input size n.
Furthermore, "average" can have many meanings: it can mean the average input or the average input size ("expected" time), or it can mean in the long run (amortized time), or both, or something else.
Often, people are interested in the worst-case running time of a program, amortized over the running time of the entire program (so if something costs n initially but only costs 1 time for the next n elements, it averages out to a cost of 2 per element). The most useful thing to measure here is the least upper bound on the worst-case time; so, typically, when you see someone asking for the Big-O of a program, this is what they're looking for.
Similarly, to prove a problem is inherently difficult, people might try to show that the worst-case (or perhaps average-case) running time is at least a certain amount (for example, exponential).
You'd use Big-Ω notation for these, because you're looking for lower bounds on these.
However, there is no special relationship between worst-case and Big-O, or best-case and Big-Ω.
Both can be used for either, it's just that one of them is more typical than the other.
So, upper-bounding the best case isn't terribly useful. Yes, if the algorithm always takes O(n) time, then you can say it's O(n) in the best case, as well as on average, as well as the worst case. That's a perfectly fine statement, except the best case is usually very trivial and hence not interesting in itself.
Furthermore, note that f(n) = n = O(n2) -- this is technically correct, because f grows more slowly than n2, but it is not useful because it is not the least upper bound -- there's a very obvious upper bound that's more useful than this one, namely O(n). So yes, you're perfectly welcome to say the best/worst/average-case running time of a program is O(n!). That's mathematically perfectly correct. It's just useless, because when people ask for Big-O they're interested in the least upper bound, not just a random upper bound.
It's also worth noting that it may simply be insufficient to describe the running-time of a program as f(n). The running time often depends on the input itself, not just its size. For example, it may be that even queries are trivially easy to answer, whereas odd queries take a long time to answer.
In that case, you can't just give f as a function of n -- it would depend on other variables as well. In the end, remember that this is just a set of mathematical tools; it's your job to figure out how to apply it to your program and to figure out what's an interesting thing to measure. Using tools in a useful manner needs some creativity, and math is no exception.
Informally speaking, best case has O(n) complexity means that when the input meets
certain conditions (i.e. is best for the algorithm performed), then the count of
operations performed in that best case, is linear with respect to n (e.g. is 1n or 1.5n or 5n).
So if the best case is O(n), usually this means that in the best case it is exactly linear
with respect to n (i.e. asymptotically no smaller and no bigger than that) - see (1). Of course,
if in the best case that same algorithm can be proven to perform at most c * log N operations
(where c is some constant), then this algorithm's best case complexity would be informally
denoted as O(log N) and not as O(N) and people would say it is O(log N) in its best case.
Formally speaking, "the algorithm's best case complexity is O(f(n))"
is an informal and wrong way of saying that "the algorithm's complexity
is Ω(f(n))" (in the sense of the Knuth definition - see (2)).
See also:
(1) Wikipedia "Family of Bachmann-Landau notations"
(2) Knuth's paper "Big Omicron and Big Omega and Big Theta"
(3)
Big Omega notation - what is f = Ω(g)?
(4)
What is the difference between Θ(n) and O(n)?
(5)
What is a plain English explanation of "Big O" notation?
I find it easier to think of O() as about ratios than about bounds. It is defined as bounds, and so that is a valid way to think of it, but it seems a bit more useful to think about "if I double the number/size of inputs to my algorithm, does my processing time double (O(n)), quadruple (O(n^2)), etc...". Thinking about it that way makes it a little bit less abstract - at least to me...

Big O for worst-case running time and Ω is for the best-case, but why is Ω used in worst case sometimes?

I'm confused, I thought that you use Big O for worst-case running time and Ω is for the best-case? Can someone please explain?
And isn't (lg n) the best-case? and (nlg n) is the worst case? Or am I misunderstanding something?
Show that the worst-case running time of Max-Heapify on a heap of size
n is Ω(lg n). ( Hint: For a heap with n nodes, give node values that
cause Max-Heapify to be called recursively at every node on a path
from the root down to a leaf.)
Edit: no this is not homework. im practicing and this has an answer key buy im confused.
http://www-scf.usc.edu/~csci303/cs303hw4solutions.pdf Problem 4(6.2 - 6)
Edit 2: So I misunderstood the question not about Big O and Ω?
It is important to distinguish between the case and the bound.
Best, average, and worst are common cases of interest when analyzing algorithms.
Upper (O, o) and lower (Omega, omega), along with Theta, are common bounds on functions.
When we say "Algorithm X's worst-case time complexity is O(n)", we're saying that the function which represents Algorithm X's performance, when we restrict inputs to worst-case inputs, is asymptotically bounded from above by some linear function. You could speak of a lower bound on the worst-case input; or an upper or lower bound on the average, or best, case behavior.
Case != Bound. That said, "upper on the worst" and "lower on the best" are pretty sensible sorts of metrics... they provide absolute bounds on the performance of an algorithm. It doesn't mean we can't talk about other metrics.
Edit to respond to your updated question:
The question asks you to show that Omega(lg n) is a lower bound on the worst case behavior. In other words, when this algorithm does as much work as it can do for a class of inputs, the amount of work it does grows at least as fast as (lg n), asymptotically. So your steps are the following: (1) identify the worst case for the algorithm; (2) find a lower bound for the runtime of the algorithm on inputs belonging to the worst case.
Here's an illustration of the way this would look for linear search:
In the worst case of linear search, the target item is not in the list, and all items in the list must be examined to determine this. Therefore, a lower bound on the worst-case complexity of this algorithm is O(n).
Important to note: for lots of algorithms, the complexity for most cases will be bounded from above and below by a common set of functions. It's very common for the Theta bound to apply. So it might very well be the case that you won't get a different answer for Omega than you do for O, in any event.
Actually, you use Big O for a function which grows faster than your worst-case complexity, and Ω for a function which grows more slowly than your worst-case complexity.
So here you are asked to prove that your worst case complexity is worse than lg(n).
O is the upper limit (i.e, worst case)
Ω is the lower limit (i.e., best case)
The example is saying that in the worst input for max-heapify ( i guess the worst input is reverse-ordered input) the running time complexity must be (at least) lg n . Hence the Ω (lg n) since it is the lower limit on the execution complexity.

Different upper bounds and lower bounds of same algorithm

So I just started learning about Asymptotic bounds for an algorithm
Question:
What can we say about theta of a function if for the algorithm we find different lower and upper bounds?? (say omega(n) and O(n^2)). Or rather what can we say about tightness of such an algorithm?
The book which I read says Theta is for same upper and lower bounds of the function.
What about in this case?
I don't think you can say anything, in that case.
The definition of Θ(f(n)) is:
A function is Θ(f(n)) if and only if it is Ω(f(n)) and O(f(n)).
For some pathological function that exhibits those behaviors, such as oscillating between n and n^2, it wouldn't be defined.
Example:
f(x) = n if n is odd
n^2 if n is even
Your bounds Ω(n) and O(n^2) would be tight on this, but Θ(f(n)) is not defined for any function.
See also: What is the difference between Θ(n) and O(n)?
Just for a bit of practicality, one algorithm that is not in Θ(f(n)) for any f(n) would be insertion sort. It runs in Ω(n) since for a list that is already sorted, you only need one operation for the insert in every step, but it runs in O(n^2) in the general case. Constructing functions that oscillate or are non-continuous otherwise usually is done more for didactic purposes, but in my experience such functions rarely, if ever, appear with actual algorithms.
Regarding tightness, I only ever heard that in this context with reference to the upper and lower bounds proposed for algorithms. Again regarding the example of insertion sort, the given bounds are tight in the sense that there are instances of the problem that actually can be done in time linear in their size (the lower bound) and other instances of the problem that will not execute in time less than quadratic in their size. Bounds that are valid, but not tight for insertion sort would be Ω(1) since you can't sort lists of arbitrary size in constant time, and O(n^3) because you can always sort a list of n elements in strictly O(n^2) time, which is an order of magnitude less, so you can certainly do it in O(n^3). What bounds are for is to give us a crude idea of what we can expect as performance of our algorithms so we get an idea of how efficient our solutions are; tight bounds are the most desirable, since they both give us that crude idea and that idea is optimal in the sense that there are extreme cases (which sometimes are also the general case) where we actually need all the complexity the bound allows.
The average case complexity is not a bound; it "only" describes how efficient an algorithm is "in most cases"; take for example quick sort which has a best-case complexity of Ω(n), a worst case complexity of O(n^2) and an average case complexity of O(n log n). This tells us that for almost all cases, quick sort is as fast as sorting gets in general (i.e. the average case complexity), while there are instances of the problem that it solves faster than that (best case complexity -> lower bound) and also instances of the problem that take quick sort longer to solve than that (worst case complexity -> upper bound).

Resources