Complexity of Binary Search - algorithm

I am watching the Berkeley University online lecture and am stuck on the problem below.
Problem: Assume you have a collection of CDs that is already sorted. You want to find the list of CDs whose titles start with "Best Of."
Solution: We will use binary search to find the first occurrence of "Best Of", and then we print until the title is no longer "Best Of".
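A minimal sketch of that approach in Python (the helper name, the use of the standard bisect module, and the sample data are my own illustration, not from the lecture):

    import bisect

    def titles_starting_with(sorted_titles, prefix):
        """Return all titles that start with `prefix`, assuming the list is sorted."""
        # Binary search for the leftmost position where a matching title could sit: O(log n)
        start = bisect.bisect_left(sorted_titles, prefix)
        matches = []
        i = start
        # Linear scan over the k matching titles: O(k)
        while i < len(sorted_titles) and sorted_titles[i].startswith(prefix):
            matches.append(sorted_titles[i])
            i += 1
        return matches

    cds = ["Abbey Road", "Best Of ABBA", "Best Of Queen", "Thriller"]
    print(titles_starting_with(cds, "Best Of"))   # ['Best Of ABBA', 'Best Of Queen']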
Additional question: Find the complexity of this algorithm.
Upper Bound: Binary search's upper bound is O(log n), so once we have found the first match we print, let's say, k titles, so it is O(log n + k).
Lower Bound: Binary search's lower bound is Omega(1), assuming we are lucky and the record title is the middle title. In this case it is Omega(k).
This is the way I analyzed it.
But in the lecture, the lecturer used best case and worst case.
I have two questions about it:
Why do we need to use best case and worst case? Aren't big-O and Omega considered the best and worst cases the algorithm can perform?
His analysis was
Worst Case: Θ(log n + k)
Best Case: Θ(k)
If I use the concept of Worst Case as referring to the data and having nothing to do with the algorithm, then yes, his analysis is right.
This is because, assuming the worst case (the CD title is at the end or not found), the big-O and Omega are both log n, therefore it is Θ(log n + k).
Assuming you do not do "best case" and "worst case", then how do you analyze the algorithm? Is my analysis right?

Why do we need to use best case and worst case? Aren't big-O and Omega considered the best and worst cases the algorithm can perform?
No, the Ο and Ω notations only describe bounds on a function that characterizes the asymptotic behavior of the algorithm.
Ω describes the lower bound: f(n) ∈ Ω(g(n)) means that, asymptotically, f(n) is not less than g(n)·k for some positive constant k, so f(n) is eventually at least as large as g(n)·k.
Ο describes the upper bound: f(n) ∈ Ο(g(n)) means that, asymptotically, f(n) is not more than g(n)·k for some positive constant k, so f(n) is eventually at most as large as g(n)·k.
These two can be applied on both the best case and the worst case for binary search:
best case: first element you look at is the one you are looking for
Ω(1): you need at least one lookup
Ο(1): you need at most one lookup
worst case: element is not present
Ω(log n): you need at least log n steps until you can say that the element you are looking for is not present
Ο(log n): you need at most log n steps until you can say that the element you are looking for is not present
You see, the Ω and Ο values are identical. In that case you can say the tight bound for the best case is Θ(1) and for the worst case is Θ(log n).
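To make the two cases concrete, here is a small Python sketch (the lookup counter is purely for illustration and not part of the algorithm):

    def binary_search(arr, target):
        """Return (index or None, number of lookups made)."""
        lo, hi = 0, len(arr) - 1
        lookups = 0
        while lo <= hi:
            mid = (lo + hi) // 2
            lookups += 1
            if arr[mid] == target:
                return mid, lookups
            elif arr[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return None, lookups

    arr = list(range(1023))           # n = 1023
    print(binary_search(arr, 511))    # best case: the middle element, 1 lookup
    print(binary_search(arr, -1))     # worst case: absent, 10 lookups ~ log2(n)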
But often we only want to know the upper bound or the tight bound, as the lower bound alone does not carry much practical information.
Assuming you do not do "best case" and "worst case", then how do you analyze the algorithm? Is my analysis right?
Yes, your analysis seems correct.

For your first question, there is a difference between the best-case runtime of an algorithm, the worst-case runtime of an algorithm, big-O notation, and big-Ω notation. The best- and worst-case runtimes of an algorithm are actual mathematical functions with precise values that tell you the maximum and minimum amount of work an algorithm will do. To describe the growth rates of these functions, we can use big-O and big-Ω notation. However, it's also possible to describe the best-case behavior of a function with big-Ω notation or the worst-case behavior with big-O notation. For example, we might know that the worst-case runtime of an algorithm is O(n²) without actually knowing which function describes that worst-case behavior exactly. This might occur if we wanted to upper-bound the worst-case behavior so that we know it isn't catastrophically bad. It's possible that in this case the worst-case behavior is actually Θ(n), in which case O(n²) is a bad upper bound, but saying that the worst-case behavior is O(n²) still indicates that it isn't terrible. Similarly, we might say that the best-case behavior of an algorithm is Ω(n) if, for example, we know that it must do at least linear work but can't tell whether it must do much more than that.
As to your second question, the analysis of the worst-case and best-case behaviors of the algorithm are absolutely dependent on the structure of the input data, and it's usually quite difficult to analyze the best- and worst-case behavior of an algorithm without seeing how it performs on different data sets. It's perfectly reasonable to do a worst-case analysis by exhibiting some specific family of inputs (note that it has to be a family of inputs, not just one input, since we need to get an asymptotic bound) and showing that the algorithm must run poorly on that input. You can do a best-case analysis in the same way.
Hope this helps!

Related

Still not understanding Big-O vs Worst Case Time Complexity

The worst case for time taken by linear search is when the item is at the end of the list/array, or doesn't exist. In this case, the algorithm will need to perform n comparisons, to see if each element is the required value, assuming n is the length of the array/list.
From what I've understood of big-O notation, it makes sense to say that the time complexity of this algorithm is O(n), as it COULD happen that the worst case occurs, and big-O is used when we want to make a conservative estimate of the "worst case".
From a lot of posts and answers on Stack Overflow, it seems this thinking is flawed, with claims made such as "Big-O notation has nothing to do with worst-case analysis."
Please help me to understand the distinction in a way that doesn't just add to my confusion, as the answers here: Why big-Oh is not always a worst case analysis of an algorithm? do.
I'm not seeing how big-O has NOTHING to do with worst case analysis. From my current hilltop, it looks like big-O expresses how the worst case grows as the input size grows, which seems very much "to do" with worst-case analysis.
Statements such as this, from https://medium.com/omarelgabrys-blog/the-big-scary-o-notation-ce9352d827ce :
As an example, worst case analysis gives the maximum number of operations assuming that the input is in the worst possible state, while the big o notation express the max number of operations done in the worst case.
don't help much, as I cannot see what distinction is being referred to.
Any added clarity much appreciated.
The big-O notation is indeed independent of the worst-case analysis. It applies to any function you want.
In the case of a linear search,
the worst-case complexity is O(n) (in fact even Θ(n)),
the average-case complexity is O(n) (in fact even Θ(n)),
the best-case complexity is O(1) (in fact even Θ(1)).
So big-O and worst-case are different concepts, though a big-O bound for the running time of an algorithm must hold for the worst-case.
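For instance, a plain linear search makes those three cases easy to see (a quick sketch; the comparison counter exists only for illustration):

    def linear_search(arr, target):
        """Return (index or None, number of comparisons made)."""
        for i, value in enumerate(arr):
            if value == target:
                return i, i + 1      # found after i + 1 comparisons
        return None, len(arr)        # absent: all n elements were compared

    data = list(range(1, 101))       # n = 100
    print(linear_search(data, 1))    # best case: first element, 1 comparison   -> Θ(1)
    print(linear_search(data, -1))   # worst case: not present, 100 comparisons -> Θ(n)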
This is the case:
If an algorithm that finds a solution to a problem is in O(f(n)), that means the worst-case scenario of finding the solution with that algorithm is in O(f(n)). In other words, if the worst-case scenario can be handled in g(n) steps by the algorithm, then g(n) is in O(f(n)).
For example, for the search algorithm, as you have mentioned, we know that the worst-case scenario can be handled in O(n). Now, although the algorithm is in O(n), we can say the algorithm is in O(n^2) as well. As you see, here lies the distinction between Big-Oh complexity and the worst-case scenario.
In sum, the worst-case scenario complexity of an algorithm is a subset of the Big-Oh complexity of the algorithm.

Comparison sort algorithms require Ω(nlgn) comparisons in the worst case

This was taken from the popular book called Intro to Algorithms. The author states that any comparison sort algorithm requires Ω(nlgn) comparisons in the worst case. Taking the bubble sort algorithm as an example, in the worst case we have an upper bound of O(n^2). Omega represents the lower or least bound, so wouldn't the lower bound of the worst case be Ω(n^2) as well? How would bubble sort have a lower bound such as the suggested Ω(nlgn), rather than n^2, for its worst-case performance? In its worst-case performance, bubble sort can't take AT LEAST nlgn.
The author said ANY algorithm: no algorithm can do better than Ω(N Log(N)) in the worst case.
The reason is easy to understand: any comparison-based sorting algorithm is a binary decision tree (a long, dynamic sequence of if-then-else). Since the algorithm must be able to process any permutation of the data, it must handle each of the N! possible input orderings differently, so the tree must have at least that many leaves. Hence the height of the decision tree, i.e. the worst-case complexity, is at least Lg(N!) = Ω(N.Log(N)).
When the decision tree is well balanced (Heapsort), the height is also O(N.Log(N)).
When the decision tree is strongly imbalanced (Bubblesort), the height can become O(N²).
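If it helps to see the Lg(N!) = Ω(N.Log(N)) claim numerically, here is a quick sanity check in Python (an illustration of the growth rate, not a proof):

    import math

    # The decision tree needs at least N! leaves, so its height (the worst-case
    # number of comparisons) is at least lg(N!), which grows like N * lg(N).
    for n in (10, 100, 1000, 10000):
        lg_factorial = math.lgamma(n + 1) / math.log(2)   # lg(n!)
        print(n, round(lg_factorial), round(n * math.log2(n)))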
Addendum:
As Ω denotes a lower bound, any smaller lower bound is also valid. So as the worst case of Bubble sort is Θ(N²), it is also Ω(N²), Ω(N.Log(N)), Ω(N), Ω(Log N), Ω(1)...
To oversimplify slightly*, when we talk about lower bounds for algorithmic problems, we're interested in how the best algorithm does in the worst case. The best comparison-based sorting algorithms (e.g., mergesort) use roughly n log n comparisons in the worst case, so the lower bound for sorting is quoted as Omega(n log n). Algorithms that are not the best, e.g., bubble sort, may do materially worse than the best algorithm in the worst case. In the best case, they may do better than the best algorithm. Neither of these facts is inconsistent with the lower bound for sorting.
*There may not be one best algorithm.
You need to focus on what "at least" means, denoted by Ω.
"Bubble sort requires Ω(nlg(n)) comparisons in the worst case" is not a False statement, because it DOES require AT LEAST knlg(n) comparisons for some constant k.
Yes, we know that bubble sort requires Ω(n^2) in the worst case. However, this does not make the above statement False. Thus what the author claims is correct.
Here's an example, hopefully to clarify the situation:
"I can do at least 50 push-ups regardless of how tired I am"
So knowing that, is the following statement False?
"I can do at least 20 push-ups regardless of how tired I am"
The author states that any comparison sort algorithm requires Ω(nlgn) comparisons in the worst case.
There are several ways to think about the wording of this claim, with increasing degrees of formalism/pedantry.
Most colloquially: Read the phrase "any comparison sort algorithm" as "the best comparison-based sorting algorithm you can think of." Bubblesort doesn't even enter the picture, because mergesort is better (in the worst case). It's analogous to something like "Any airplane will stall when its airspeed drops below 1mph." Some airplanes will stall sooner, but even the best airplane you can think of won't be able to beat the claim.
Slightly more formally: Insert the words "at least." Any comparison-based sorting algorithm requires at least Ω(nlgn) comparisons in the worst case. Some require even more than that.
Slightly more formally: Observe that the Ω-notation is already defined to mean "at least," so, really, you don't have to say it. Saying "at least Ω(nlgn)" is redundant, like saying "ATM machine" or "PIN number." Likewise, the big O-notation is defined to mean "at most." Bubblesort is O(n²). "But when you give it sorted input, it runs in linear time!" Yes; when we say "O(n²)" we mean it takes at most quadratic time. It's allowed to take less in some cases. Similarly, a naïvely implemented quicksort is Ω(nlgn). "But when you give it adversarial input, it runs in quadratic time!" Yes; when we say "Ω(nlgn)" we mean it takes at least time proportional to nlgn. It's allowed to take more in some cases.
Formally (AFAIK): In fact, the O and Ω notations refer to sets of functions.
The set O(1) is defined as the set of all functions f such that ∃c∀n: f(n) < c. (Sorry, StackOverflow doesn't support math markup.)
The set O(n²) is the set of all functions f such that ∃c∀n: f(n) < cn². Notice that O(1) ⊂ O(n) ⊂ O(n²).
Ω is the same idea with the sign flipped: The set Ω(n lg n) is the set of all functions f such that ∃c∀n: f(n) > cn lg n. Notice that Ω(1) ⊃ Ω(n) ⊃ Ω(n²).
So, your book is basically saying that if we consider the function gA(n) = number of comparisons made by algorithm A on an input of size n, where A represents a comparison-based sorting algorithm, then
∀A: gA ∈ Ω(n lg n)
which is to say
∀A: ∃c∀n: gA(n) > cn lg n
(That is, it is impossible to find any comparison-based sorting algorithm A such that ∀c∃n: gA(n) < cn lg n.)
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.4.8/d3.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/knockout/2.3.0/knockout-min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.23/angular.min.js"></script>

Still sort of confused about Big O notation

So I've been trying to understand Big O notation as well as I can, but there are still some things I'm confused about. I keep reading that if something is O(n), it usually refers to the worst case of an algorithm, but that it doesn't necessarily have to refer to the worst-case scenario, which is why we can say the best case of insertion sort, for example, is O(n). However, I can't really make sense of what that means. I know that if the worst case is O(n^2), it means that the function that represents the algorithm in its worst case grows no faster than n^2 (there is an upper bound). But if you have O(n) as the best case, how should I read that? In the best case, the algorithm grows no faster than n? What I picture is a graph with n as the upper bound.
If the best case scenario of an algorithm is O(n), then n is the upper bound of how fast the operations of the algorithm grow in the best case, so they cannot grow faster than n...but wouldn't that mean that they can grow as fast as O(log n) or O(1), since they are below the upper bound? That wouldn't make sense though, because O(log n) or O(1) is a better scenario than O(n), so O(n) WOULDN'T be the best case? I'm so lost lol
Big-O, Big-Θ, Big-Ω are independent from worst-case, average-case, and best-case.
The notation f(n) = O(g(n)) means f(n) grows no more quickly than some constant multiple of g(n).
The notation f(n) = Ω(g(n)) means f(n) grows no more slowly than some constant multiple of g(n).
The notation f(n) = Θ(g(n)) means both of the above are true.
Note that f(n) here may represent the best-case, worst-case, or "average"-case running time of a program with input size n.
Furthermore, "average" can have many meanings: it can mean the average input or the average input size ("expected" time), or it can mean in the long run (amortized time), or both, or something else.
Often, people are interested in the worst-case running time of a program, amortized over the running time of the entire program (so if something costs n initially but only costs 1 time for the next n elements, it averages out to a cost of 2 per element). The most useful thing to measure here is the least upper bound on the worst-case time; so, typically, when you see someone asking for the Big-O of a program, this is what they're looking for.
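As a tiny sketch of that amortization arithmetic (the numbers are purely illustrative):

    n = 1000
    expensive_step = n            # one operation that costs n up front
    cheap_steps = n * 1           # the next n operations cost 1 each
    total = expensive_step + cheap_steps
    print(total / n)              # 2.0 -> an amortized cost of about 2 per element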
Similarly, to prove a problem is inherently difficult, people might try to show that the worst-case (or perhaps average-case) running time is at least a certain amount (for example, exponential).
You'd use Big-Ω notation for these, because you're looking for lower bounds on these.
However, there is no special relationship between worst-case and Big-O, or best-case and Big-Ω.
Both can be used for either, it's just that one of them is more typical than the other.
So, upper-bounding the best case isn't terribly useful. Yes, if the algorithm always takes O(n) time, then you can say it's O(n) in the best case, as well as on average, as well as the worst case. That's a perfectly fine statement, except the best case is usually very trivial and hence not interesting in itself.
Furthermore, note that f(n) = n = O(n²) -- this is technically correct, because f grows more slowly than n², but it is not useful because it is not the least upper bound -- there's a very obvious upper bound that's more useful than this one, namely O(n). So yes, you're perfectly welcome to say the best/worst/average-case running time of a program is O(n!). That's mathematically perfectly correct. It's just useless, because when people ask for Big-O they're interested in the least upper bound, not just a random upper bound.
It's also worth noting that it may simply be insufficient to describe the running-time of a program as f(n). The running time often depends on the input itself, not just its size. For example, it may be that even queries are trivially easy to answer, whereas odd queries take a long time to answer.
In that case, you can't just give f as a function of n -- it would depend on other variables as well. In the end, remember that this is just a set of mathematical tools; it's your job to figure out how to apply it to your program and to figure out what's an interesting thing to measure. Using tools in a useful manner needs some creativity, and math is no exception.
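As a contrived sketch of the even/odd example above (answer_query is entirely made up), where the cost depends on the input value itself rather than just its size:

    def answer_query(q):
        """Hypothetical query: trivial for even q, slow for odd q."""
        if q % 2 == 0:
            return 0                 # constant-time answer for even queries
        return sum(range(q))         # work proportional to q for odd queries

    print(answer_query(10**6))       # constant work
    print(answer_query(10**6 + 1))   # work linear in q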
Informally speaking, best case has O(n) complexity means that when the input meets certain conditions (i.e. is best for the algorithm performed), then the count of operations performed in that best case is linear with respect to n (e.g. is 1n or 1.5n or 5n). So if the best case is O(n), usually this means that in the best case it is exactly linear with respect to n (i.e. asymptotically no smaller and no bigger than that) - see (1). Of course, if in the best case that same algorithm can be proven to perform at most c * log N operations (where c is some constant), then this algorithm's best-case complexity would informally be denoted as O(log N) and not as O(N), and people would say it is O(log N) in its best case.
Formally speaking, "the algorithm's best case complexity is O(f(n))" is an informal and wrong way of saying that "the algorithm's complexity is Ω(f(n))" (in the sense of the Knuth definition - see (2)).
See also:
(1) Wikipedia "Family of Bachmann-Landau notations"
(2) Knuth's paper "Big Omicron and Big Omega and Big Theta"
(3) Big Omega notation - what is f = Ω(g)?
(4) What is the difference between Θ(n) and O(n)?
(5) What is a plain English explanation of "Big O" notation?
I find it easier to think of O() as about ratios than about bounds. It is defined as bounds, and so that is a valid way to think of it, but it seems a bit more useful to think about "if I double the number/size of inputs to my algorithm, does my processing time double (O(n)), quadruple (O(n^2)), etc...". Thinking about it that way makes it a little bit less abstract - at least to me...

Big O for worst-case running time and Ω is for the best-case, but why is Ω used in worst case sometimes?

I'm confused. I thought that you use Big O for the worst-case running time and Ω for the best case? Can someone please explain?
And isn't (lg n) the best case and (n lg n) the worst case? Or am I misunderstanding something?
Show that the worst-case running time of Max-Heapify on a heap of size
n is Ω(lg n). ( Hint: For a heap with n nodes, give node values that
cause Max-Heapify to be called recursively at every node on a path
from the root down to a leaf.)
Edit: No, this is not homework. I'm practicing and this has an answer key, but I'm confused.
http://www-scf.usc.edu/~csci303/cs303hw4solutions.pdf Problem 4(6.2 - 6)
Edit 2: So I misunderstood the question? It's not about Big O and Ω?
It is important to distinguish between the case and the bound.
Best, average, and worst are common cases of interest when analyzing algorithms.
Upper (O, o) and lower (Omega, omega), along with Theta, are common bounds on functions.
When we say "Algorithm X's worst-case time complexity is O(n)", we're saying that the function which represents Algorithm X's performance, when we restrict inputs to worst-case inputs, is asymptotically bounded from above by some linear function. You could speak of a lower bound on the worst-case input; or an upper or lower bound on the average, or best, case behavior.
Case != Bound. That said, "upper on the worst" and "lower on the best" are pretty sensible sorts of metrics... they provide absolute bounds on the performance of an algorithm. It doesn't mean we can't talk about other metrics.
Edit to respond to your updated question:
The question asks you to show that Omega(lg n) is a lower bound on the worst case behavior. In other words, when this algorithm does as much work as it can do for a class of inputs, the amount of work it does grows at least as fast as (lg n), asymptotically. So your steps are the following: (1) identify the worst case for the algorithm; (2) find a lower bound for the runtime of the algorithm on inputs belonging to the worst case.
Here's an illustration of the way this would look for linear search:
In the worst case of linear search, the target item is not in the list, and all items in the list must be examined to determine this. Therefore, a lower bound on the worst-case complexity of this algorithm is Ω(n).
Important to note: for lots of algorithms, the complexity for most cases will be bounded from above and below by a common set of functions. It's very common for the Theta bound to apply. So it might very well be the case that you won't get a different answer for Omega than you do for O, in any event.
Actually, you use Big O for a function which grows at least as fast as your worst-case complexity, and Ω for a function which grows no faster than your worst-case complexity.
So here you are asked to prove that your worst case complexity is worse than lg(n).
O is the upper limit (i.e., worst case)
Ω is the lower limit (i.e., best case)
The example is saying that for the worst input to Max-Heapify (I guess the worst input is reverse-ordered input), the running-time complexity must be (at least) lg n. Hence the Ω(lg n), since it is the lower limit on the execution complexity.
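Here is a rough Python sketch of the construction the hint describes (0-based indexing; the depth counter is mine, added only to show how far the root value sinks):

    def max_heapify(a, i, depth=0):
        """Textbook Max-Heapify; returns the recursion depth for illustration."""
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < len(a) and a[left] > a[largest]:
            largest = left
        if right < len(a) and a[right] > a[largest]:
            largest = right
        if largest != i:
            a[i], a[largest] = a[largest], a[i]
            return max_heapify(a, largest, depth + 1)
        return depth

    n = 1023
    heap = list(range(n, 0, -1))   # a strictly decreasing array is a valid max-heap
    heap[0] = 0                    # make the root smaller than every descendant
    print(max_heapify(heap, 0))    # 9: the value sinks down a whole root-to-leaf path, ~lg n levels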

Different upper bounds and lower bounds of same algorithm

So I just started learning about Asymptotic bounds for an algorithm
Question:
What can we say about the Theta of a function if we find different lower and upper bounds for the algorithm (say Ω(n) and O(n^2))? Or rather, what can we say about the tightness of such an algorithm?
The book which I read says Theta is for when the upper and lower bounds of the function are the same.
What about in this case?
I don't think you can say anything, in that case.
The definition of Θ(f(n)) is:
A function is Θ(f(n)) if and only if it is Ω(f(n)) and O(f(n)).
For some pathological function that exhibits those behaviors, such as oscillating between n and n^2, it wouldn't be defined.
Example:
f(n) = n    if n is odd
f(n) = n^2  if n is even
Your bounds Ω(n) and O(n^2) would be tight on this, but f(n) is not Θ(g(n)) for any single function g(n).
See also: What is the difference between Θ(n) and O(n)?
Just for a bit of practicality, one algorithm that is not in Θ(f(n)) for any f(n) would be insertion sort. It runs in Ω(n) since for a list that is already sorted, you only need one operation for the insert in every step, but it runs in O(n^2) in the general case. Constructing functions that oscillate or are non-continuous otherwise usually is done more for didactic purposes, but in my experience such functions rarely, if ever, appear with actual algorithms.
Regarding tightness, I have only ever heard it in this context with reference to the upper and lower bounds proposed for algorithms. Again, regarding the example of insertion sort, the given bounds are tight in the sense that there are instances of the problem that actually can be done in time linear in their size (the lower bound) and other instances of the problem that will not execute in time less than quadratic in their size. Bounds that are valid, but not tight, for insertion sort would be Ω(1), since you can't sort lists of arbitrary size in constant time, and O(n^3), because you can always sort a list of n elements in O(n^2) time, which is a factor of n less, so you can certainly do it in O(n^3). What bounds are for is to give us a crude idea of what we can expect as the performance of our algorithms, so we get an idea of how efficient our solutions are; tight bounds are the most desirable, since they give us that crude idea and that idea is optimal, in the sense that there are extreme cases (which sometimes are also the general case) where we actually need all the complexity the bound allows.
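To see those insertion sort bounds concretely, here is a small sketch that counts element comparisons (the counter exists only for illustration):

    def insertion_sort_comparisons(items):
        """Sort a copy of `items`; return the number of element comparisons made."""
        a = list(items)
        comparisons = 0
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0:
                comparisons += 1
                if a[j] <= key:
                    break
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return comparisons

    n = 1000
    print(insertion_sort_comparisons(range(n)))          # already sorted: n - 1 comparisons    -> the Ω(n) bound is tight
    print(insertion_sort_comparisons(range(n, 0, -1)))   # reverse sorted: ~n^2 / 2 comparisons -> the O(n^2) bound is tight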
The average case complexity is not a bound; it "only" describes how efficient an algorithm is "in most cases"; take for example quick sort which has a best-case complexity of Ω(n), a worst case complexity of O(n^2) and an average case complexity of O(n log n). This tells us that for almost all cases, quick sort is as fast as sorting gets in general (i.e. the average case complexity), while there are instances of the problem that it solves faster than that (best case complexity -> lower bound) and also instances of the problem that take quick sort longer to solve than that (worst case complexity -> upper bound).
