Still not understanding Big-O vs Worst Case Time Complexity - algorithm

The worst case for time taken by linear search is when the item is at the end of the list/array, or doesn't exist. In this case, the algorithm will need to perform n comparisons, to see if each element is the required value, assuming n is the length of the array/list.
From what I've understood of big-O notation, it makes sense to say that the time complexity of this algorithm is O(n), as it COULD happen that the worst case occurs, and big-O is used when we want to make a conservative estimate of the "worst case".
From a lot of posts and answers on Stack Overflow, it seems this thinking is flawed, with claims such as "Big-O notation has nothing to do with worst-case analysis."
Please help me to understand the distinction in a way that doesn't just add to my confusion, as the answers here: Why big-Oh is not always a worst case analysis of an algorithm? do.
I'm not seeing how big-O has NOTHING to do with worst case analysis. From my current hilltop, it looks like big-O expresses how the worst case grows as the input size grows, which seems very much "to do" with worst-case analysis.
Statements such as this, from https://medium.com/omarelgabrys-blog/the-big-scary-o-notation-ce9352d827ce :
As an example, worst case analysis gives the maximum number of operations assuming that the input is in the worst possible state, while the big o notation express the max number of operations done in the worst case.
don't help much, as I cannot see what distinction is being referred to.
Any added clarity much appreciated.

The big-O notation is indeed independent of the worst-case analysis. It applies to any function you want.
In the case of a linear search,
the worst-case complexity is O(n) (in fact even Θ(n)),
the average-case complexity is O(n) (in fact even Θ(n)),
the best-case complexity is O(1) (in fact even Θ(1)).
So big-O and worst-case are different concepts, though a big-O bound for the running time of an algorithm must hold for the worst-case.
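To make the three cases concrete, here is a minimal Python sketch of linear search; the example data and targets are just illustrative choices that trigger each case.

    def linear_search(items, target):
        """Return the index of target in items, or -1 if absent."""
        for i, value in enumerate(items):
            if value == target:
                return i      # found: i + 1 comparisons performed
        return -1             # not found: n comparisons performed

    data = [3, 1, 4, 1, 5, 9, 2, 6]
    linear_search(data, 3)    # best case: 1 comparison, Theta(1)
    linear_search(data, 6)    # worst "present" case: n comparisons
    linear_search(data, 7)    # absent: n comparisons, Theta(n) worst case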

This is the case:
If an algorithm that solves a problem is in O(f(n)), that means the worst-case scenario for solving the problem with that algorithm is in O(f(n)). In other words, if the worst-case scenario can be handled in g(n) steps by the algorithm, then g(n) is in O(f(n)).
For example, for the search algorithm, as you have mentioned, we know that the worst-case scenario can be handled in O(n). Now, although the algorithm is in O(n), we can say the algorithm is in O(n^2) as well. As you see, here is the distinction between Big-Oh complexity and the worst-case scenario.
In sum, the class describing the worst-case complexity of an algorithm is a subset of any Big-Oh class that validly describes the algorithm, e.g. O(n) ⊆ O(n^2).
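A one-line worked bound makes that containment concrete (a sketch using the standard definition of Big-Oh): any function in O(n) is automatically in O(n^2), because the defining inequality only gets easier to satisfy.

    g(n) \le c\,n \ \text{for all } n \ge n_0
    \;\Longrightarrow\; g(n) \le c\,n \le c\,n^2 \ \text{for all } n \ge \max(n_0, 1),

    \text{hence } g \in O(n) \Rightarrow g \in O(n^2), \quad \text{i.e. } O(n) \subseteq O(n^2).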

Related

Is the Big-Omega time complexity of all search algorithms O(1)?

I understand that Big Omega defines the lower bound of a function (or best-case runtime).
Considering that almost every search algorithm could "luck out" and find the target element on the first iteration, would it be fair to say that its Big-Omega time complexity is O(1)?
I also understand that defining O(1) as the big Omega may not be useful (other lower bounds may be tighter, i.e. closer to the evaluated function), but the question is: is it correct?
I've found multiple sources claiming that linear search is Ω(n), even though some cases could complete in a single step, which is different from the best-case scenario as I understand it.
The lower bound (Ω) is not the fastest answer a given algorithm can give.
The lower bound of a given problem is equal to the worst case scenario of the best algorithm that solves the problem. When doing complexity analysis, you should never forget that "luck" is always in the hands of the input (the instance the algorithm is trying to solve).
When trying to find a lower bound, you will imagine the "perfect algorithm" and you will try to "trap" it in a very hard case. Usually the algorithm is not defined and is only described based on its (hypothetical) performance. You would use arguments such as: "if the ideal algorithm is that fast, it will not have this particular knowledge and will therefore fail on this particular instance, i.e. the ideal algorithm doesn't exist". Replace "ideal" with the lower bound you are trying to prove.
For example, the lower bound for the min-search problem in an unsorted array is Ω(n). The proof is quite simple and, as is often the case, goes by contradiction: an algorithm A in o(n) will fail to look at at least one item of the input array; if the item it did not see was the minimum, A will fail. The contradiction proves that the problem is in Ω(n).
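To make the adversary argument concrete, here is a minimal Python min-search (an illustration, not a formal proof); it inspects every index, and skipping any index would let an adversary hide the minimum there.

    def find_min(items):
        """Return the minimum of a non-empty unsorted list.

        Every index is inspected exactly once: any algorithm that skips
        an index can be defeated by placing the minimum at that index,
        which is the adversary argument behind the Omega(n) lower bound.
        """
        best = items[0]
        for value in items[1:]:
            if value < best:
                best = value
        return best

    print(find_min([7, 3, 9, 1, 4]))  # 1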
Maybe you can have a look at that answer I gave on a similar question.
The notations O, o, Θ, Ω, and ω are used in characterizing mathematical functions; for example, f(n) = n^3 log n is in O(n^4) and in Ω(n^3).
So, the question is what mathematical functions we apply them to.
The mathematical functions that we tend to be interested in are things like "the worst-case time complexity of such-and-such algorithm, as a function of the size of its input", or "the average-case space complexity of such-and-such procedure, as a function of the largest element in its input". (Note: when we just say "the complexity of such-and-such algorithm", that's usually shorthand for its worst-case time complexity, as a function of some characteristic of its input that's hopefully obvious in context. Either way, it's still a mathematical function.)
We can use any of these notations in characterizing those functions. In particular, it's fine to use Ω in characterizing the worst case or average case.
We can also use any of these notations in characterizing things like "the best-case […]"; that's unusual, but there are times when it may be relevant. But, notably, we're not limited to Ω for that: just as we can use Ω in characterizing the worst case, we can also use O in characterizing the best case. It's all about what characterizations we're interested in.
You are confusing two different topics: Lower/upper bound, and worst-case/best-case time complexity.
The short answer to your question is: Yes, all search algorithms have a lower bound of Ω(1). Linear search (in the worst case, and on average) also has a lower bound of Ω(n), which is a stronger and more useful claim. The analogy is that 1 < π but also 3 < π, the latter being more useful. So in this sense, you are right.
However, your confusion seems to be between the notations for complexity classes (big-O, big-Ω, big-θ etc), and the concepts of best-case, worst-case, average case. The point is that the best case and the worst case time complexities of an algorithm are completely different functions, and you can use any of the notations above to describe any of them. (NB: Some claim that big-Ω automatically and exclusively describes best case time complexity and that big-O describes worst case, but this is a common misconception. They just describe complexity classes and you can use them with any mathematical functions.)
It is correct to say that the average time complexity of linear search is Ω(n), because we are just talking about the function that describes its average time complexity. Its best-case complexity is a different function, which happens not to be Ω(n) because, as you say, it can be constant time.

Difficulty in comprehending asymptotic notation

As far as I know and research,
Big - Oh notation describes the worst case of an algorithm time complexity.
Big - Omega notation describes the best case of an algorithm time complexity.
Big - Theta notation describes the average case of an algorithm time complexity.
Source
However, in recent days, I've seen a discussion in which some guys tell that
O for worst case is a "forgivable" misconception. Θ for average is plain wrong. Ω for best is also forgivable. But they remain misconceptions.
The big-O notation can represent any complexity. In fact it can
describe the asymptotic behavior of any function.
I remember learning the former in my university course as well. I'm still a student, and if I have these wrong, could you please explain them to me?
Bachmann-Landau notation has absolutely nothing whatsoever to do with computational complexity of algorithms. This should already be obvious from the fact that the idea of a computing machine that computes an algorithm didn't really exist in 1894, when Bachmann introduced this notation (Landau adopted and popularized it later).
Bachmann-Landau notation describes the growth rates of functions by grouping them together in sets of functions that grow at roughly the same rate. Bachmann-Landau notation does not say anything about what those functions mean. They are just functions. In fact, they don't have to mean anything at all.
All that they mean, is this:
f ∈ ο(g): f grows slower than g
f ∈ Ο(g): f does not grow significantly faster than g
f ∈ Θ(g): f grows as fast as g
f ∈ Ω(g): f does not grow significantly slower than g
f ∈ ω(g): f grows faster than g
It does not say anything about what f or g are, or what they mean.
Note that there are actually two conflicting, incompatible definitions of Ω; the one given here is the one that is more useful for computational complexity theory. Also note that these are only very broad intuitions; when in doubt, you should look at the definitions.
If you want, you can use Bachmann-Landau notation to describe the growth rate of a population of rabbits as a function of time, or the growth rate of a human's belly as a function of beers.
Or, you can use it to describe the best-case step complexity, worst-case step complexity, average-case step complexity, expected-case step complexity, amortized step complexity, best-case time complexity, worst-case time complexity, average-case time complexity, expected-case time complexity, amortized time complexity, best-case space complexity, worst-case space complexity, average-case space complexity, expected-case space complexity, or amortized space complexity of an algorithm.
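To underline that these are all just functions the notation can describe, here is a small hypothetical Python sketch (the helper name is made up) that tabulates the best-, worst-, and average-case comparison counts of linear search at one input size.

    def comparisons_used(items, target):
        """Count the comparisons one linear search makes on this input."""
        count = 0
        for value in items:
            count += 1
            if value == target:
                break
        return count

    n = 8
    data = list(range(n))
    counts = [comparisons_used(data, t) for t in data]
    print(min(counts))                 # best case: 1 comparison
    print(max(counts))                 # worst case: n comparisons
    print(sum(counts) / len(counts))   # average case: (n + 1) / 2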
These assertions are at best inaccurate and at worst wrong. Here is the truth.
The running time of an algorithm is not a function of N (as it also depends on the particular data set), so you cannot discuss its asymptotic complexity directly. All you can say is that it lies between the best and worst cases.
The worst case, best case and average case running times are functions of N, though the average case depends on the probabilistic distribution of the data, so is not uniquely defined.
Then the asymptotic notation is such that
O(f(N)) denotes an upper bound, which can be tight or not;
Ω(f(N)) denotes a lower bound, which can be tight or not;
Θ(f(N)) denotes a bilateral bound, i.e. the conjunction of O(f(N)) and Ω(f(N)); it is perforce tight.
So,
All of the worst-case, best-case and average-case complexities have a Θ bound, as they are functions; in practice this bound can be too difficult to establish and we content ourselves with looser O or Ω bounds instead.
It is absolutely not true that the Θ bound is reserved for the average case.
Examples:
The worst case of quicksort is Θ(N²) and its best and average cases are Θ(N log N). So, by abuse of language, the running time of quicksort is O(N²) and Ω(N log N). (A sketch illustrating this gap appears at the end of this answer.)
The worst-case, best-case and average cases of insertion sort are all three Θ(N²).
Any algorithm that needs to look at all input is best-case, average-case and worst-case Ω(N) (and without more information we can't tell any upper bound).
Matrix multiplication is known to be doable in time O(N³). But we still don't know precisely the Θ bound of this operation, which is conjectured to be N² times a slowly growing function. Thus Ω(N²) is an obvious lower bound, not known to be tight. (Worst, best and average cases have the same complexity.)
There is some confusion stemming from the fact that the best/average/worst cases take well-defined durations, while the general case lies in a range (it is in fact a random variable). In addition, algorithm analysis is often tedious, if not intractable, and we use simplifications that can lead to loose bounds.
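As a hedged illustration of the quicksort example above, here is a sketch that counts the comparisons of a naive first-element-pivot quicksort (a teaching implementation written for this answer, not a tuned one): a sorted input triggers the Θ(N²) worst case, while a shuffled input typically stays near N log N.

    import random

    def quicksort_comparisons(items):
        """Naive quicksort with first-element pivot; returns its comparison count."""
        if len(items) <= 1:
            return 0
        pivot = items[0]
        smaller = [x for x in items[1:] if x < pivot]
        larger = [x for x in items[1:] if x >= pivot]
        # Count len(items) - 1 pivot comparisons at this level, as a
        # standard in-place partition would perform.
        return (len(items) - 1
                + quicksort_comparisons(smaller)
                + quicksort_comparisons(larger))

    n = 500
    print(quicksort_comparisons(list(range(n))))   # sorted input: n(n-1)/2 = 124750
    shuffled = list(range(n))
    random.shuffle(shuffled)
    print(quicksort_comparisons(shuffled))         # shuffled: on the order of n log n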

Is an algorithm with a worst-case time complexity of O(n) always faster than an algorithm with a worst-case time complexity of O(n^2)?

This question has appeared in my algorithms class. Here's my thought:
I think the answer is no, an algorithm with worst-case time complexity of O(n) is not always faster than an algorithm with worst-case time complexity of O(n^2).
For example, suppose we have total-time functions S(n) = 99999999n and T(n) = n^2. Then clearly S(n) = O(n) and T(n) = O(n^2), but T(n) is faster than S(n) for all n < 99999999.
Is this reasoning valid? I'm slightly skeptical that, while this is a counterexample, it might be a counterexample to the wrong idea.
Thanks so much!
Big-O notation says nothing about the speed of an algorithm for any given input; it describes how the time increases with the number of elements. If your algorithm executes in constant time, but that time is 100 billion years, then it's certainly slower than many linear, quadratic and even exponential algorithms for large ranges of inputs.
But that's probably not really what the question is asking. The question is asking whether an algorithm A1 with worst-case complexity O(N) is always faster than an algorithm A2 with worst-case complexity O(N^2); and by faster it probably refers to the complexity itself. In which case you only need a counter-example, e.g.:
A1 has normal complexity O(log n) but worst-case complexity O(n^2).
A2 has normal complexity O(n) and worst-case complexity O(n).
In this example, A1 is normally faster (i.e. scales better) than A2 even though it has a greater worst-case complexity.
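Either way, the asker's concrete S/T counterexample can be checked mechanically; here is a tiny sketch (names S and T taken from the question) that evaluates both cost functions at a small and a large n.

    def S(n):          # linear, but with a huge constant factor
        return 99999999 * n

    def T(n):          # quadratic, with constant factor 1
        return n ** 2

    # T(n) < S(n) exactly when n < 99999999:
    print(T(100) < S(100))        # True: the "worse" O(n^2) algorithm wins here
    print(T(10**9) < S(10**9))    # False: the O(n) algorithm wins eventually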
Since the question says always, it is enough to find a single counterexample to prove that the answer is no.
This example uses O(n^2) and O(n log n), but the same is true for O(n^2) and O(n).
One simple example is bubble sort, where you keep comparing pairs until the array is sorted. Bubble sort is O(n^2).
If you use bubble sort (with the usual early-exit check) on an already sorted array, it finishes in a single O(n) pass, faster than other algorithms of time complexity O(n log n).
You're talking about worst-case complexity here, and for some algorithms the worst case never happens in a practical application.
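Here is a minimal sketch of bubble sort with the early-exit check assumed above; on already sorted input it stops after one linear pass.

    def bubble_sort(items):
        """In-place bubble sort with an early-exit check."""
        n = len(items)
        for i in range(n - 1):
            swapped = False
            for j in range(n - 1 - i):
                if items[j] > items[j + 1]:
                    items[j], items[j + 1] = items[j + 1], items[j]
                    swapped = True
            if not swapped:       # no swaps: already sorted, stop early
                break
        return items

    print(bubble_sort([5, 2, 4, 1, 3]))   # unsorted input: up to O(n^2) work
    print(bubble_sort([1, 2, 3, 4, 5]))   # best case: a single O(n) pass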
Saying that an algorithm runs faster than another means it runs faster for all inputs of all sizes. So the answer to your question is obviously no, because worst-case time complexity is not an accurate measure of running time: it measures the order of growth of the number of operations in the worst case.
In practice, the running time depends on the implementation, and is not only about the number of operations. For example, one has to care about memory allocation, cache efficiency, and spatial/temporal locality. And obviously, one of the most important things is the input data.
If you want examples of an algorithm running faster than another while having a higher worst-case complexity, look at the sorting algorithms and their running times depending on the input.
You are correct in every sense: you provide a counterexample to the statement. If this is for an exam, then it should earn you full marks, period.
Yet for a better understanding of big-O notation and complexity, I will share my own reasoning below. I also suggest keeping the following graph in mind whenever you are confused, especially the O(n) and O(n^2) lines:
Big-O notation
My own reasoning when I first learnt computational complexity is that,
Big-O notation says that for sufficiently large inputs ("sufficiently large" depends on the exact formulas; using the graph, around n = 20 when comparing the O(n) and O(n^2) lines), a higher-order algorithm will always be slower than a lower-order one.
That means that for small inputs, there is no guarantee that a higher-order algorithm will run slower than a lower-order one.
But Big-O notation does tell you one piece of information: as the input size keeps increasing, it eventually passes a "sufficient" size, after which the higher-order algorithm is always slower, and such a "sufficient" size is guaranteed to exist.
Worst-time complexity
While Big-O notation provides an upper bound on the running time of an algorithm, an algorithm generally also has a best-case, average-case, and worst-case complexity, depending on the structure of the input and the implementation of the algorithm.
The famous example is sorting algorithm: QuickSort vs MergeSort!
QuickSort, with a worst case of O(n^2)
MergeSort, with a worst case of O(n lg n)
However, in practice Quick Sort is usually faster than Merge Sort!
So, if your question is about worst-case complexity, quicksort and merge sort may be the best counterexample I can think of (because both of them are common and famous).
Therefore, combining the two parts: no matter whether you look at input size, input structure, or algorithm implementation, the answer to your question is NO.
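A rough timing sketch of that QuickSort vs MergeSort point (simple textbook implementations written for this answer, not the tuned library versions; actual timings vary by machine and input):

    import random, time

    def quicksort(a):
        """Simple three-way quicksort with middle-element pivot."""
        if len(a) <= 1:
            return a
        pivot = a[len(a) // 2]
        return (quicksort([x for x in a if x < pivot])
                + [x for x in a if x == pivot]
                + quicksort([x for x in a if x > pivot]))

    def mergesort(a):
        """Top-down merge sort."""
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left, right = mergesort(a[:mid]), mergesort(a[mid:])
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    data = [random.random() for _ in range(100_000)]
    for sort in (quicksort, mergesort):
        t0 = time.perf_counter()
        sort(data)
        print(sort.__name__, time.perf_counter() - t0)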

Time Complexity of Algorithms (Big Oh notation)

Hey just a quick question,
I've just started looking into algorithm analysis and I'm attempting to learn Big-Oh notation.
The algorithm I'm looking at contains a quicksort (of complexity O(nlog(n))) to sort a dataset, and then the algorithm that operates upon the set itself has a worst case run-time of n/10 and complexity O(n).
I believe that the overall complexity of the algorithm would just be O(n), because it's of the highest order, so it makes the complexity of the quicksort redundant. However, could someone confirm this or tell me if I'm doing something wrong?
Wrong.
Quicksort has worst-case complexity O(n^2). But even if you use an O(n log n) sorting algorithm, that is still more than O(n): the sorting step dominates, so the overall complexity is that of the sort, not O(n).
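The general rule, as a short worked bound: the costs of sequential phases add up, and the sum is dominated by the fastest-growing term.

    c_1\,n\log n + c_2\,n \;\le\; (c_1 + c_2)\,n\log n \quad \text{for all } n \ge 2,

    \text{hence } c_1\,n\log n + c_2\,n \in \Theta(n\log n), \text{ which grows strictly faster than } n.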

Complexity of Binary Search

I am watching the Berkley Uni online lecture and stuck on the below.
Problem: Assume you have a collection of CDs that is already sorted. You want to find the list of CDs whose titles start with "Best Of".
Solution: We will use binary search to find the first occurrence of "Best Of", and then we print until the title is no longer "Best Of".
Additional question: Find the complexity of this Algorithm.
Upper bound: the binary search upper bound is O(log n); once we have found the first match, we print, say, k titles, so it is O(log n + k).
Lower bound: the binary search lower bound is Ω(1), assuming we are lucky and the sought title is the middle title. In this case it is Ω(k).
This is the way I analyzed it.
But in the lecture, the lecturer used best case and worst case.
I have two questions about it:
Why do we need to use best case and worst case? Aren't big-O and Omega considered to be the best and worst cases the algorithm can perform?
His analysis was
Worst case: Θ(log n + k)
Best case: Θ(k)
If I use the concept of worst case as referring to the data and having nothing to do with the algorithm, then yes, his analysis is right.
This is because, assuming the worst case (CD title at the end, or not found at all), the big O and the Omega are both log n, therefore it is Θ(log n + k).
Assuming you do not do "best case" and "worst case", then how do you analyze the algorithm? Is my analysis right?
Why do we need to use best case and worst case? Aren't big-O and Omega considered to be the best and worst cases the algorithm can perform?
No, the Ο and Ω notations only describe bounds on a function that captures the asymptotic behavior of the algorithm. Here is a good summary:
Ω describes the lower bound: f(n) ∈ Ω(g(n)) means the asymptotic behavior of f(n) is not less than g(n)·k for some positive k, so f(n) is eventually at least as much as g(n)·k.
Ο describes the upper bound: f(n) ∈ Ο(g(n)) means the asymptotic behavior of f(n) is not more than g(n)·k for some positive k, so f(n) is eventually at most as much as g(n)·k.
These two can be applied on both the best case and the worst case for binary search:
best case: first element you look at is the one you are looking for
Ω(1): you need at least one lookup
Ο(1): you need at most one lookup
worst case: element is not present
Ω(log n): you need at least log n steps until you can say that the element you are looking for is not present
Ο(log n): you need at most log n steps until you can say that the element you are looking for is not present
You see, the Ω and Ο values are identical. In that case you can say the tight bound for the best case is Θ(1) and for the worst case is Θ(log n).
But often we only want to know the upper bound or the tight bound, as the lower bound carries little practical information.
Assuming you do not do "best case" and "worst case", then how do you analyze the algorithm? Is my analysis right?
Yes, your analysis seems correct.
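A minimal Python sketch of the Θ(log n + k) procedure analyzed above, using the standard bisect module to find the leftmost match (the CD titles are made up):

    import bisect

    titles = ["Abbey Road", "Best Of A", "Best Of B", "Best Of C", "Thriller"]

    # Binary search for the leftmost title >= "Best Of": O(log n)
    start = bisect.bisect_left(titles, "Best Of")

    # Print the k matching titles: O(k)
    i = start
    while i < len(titles) and titles[i].startswith("Best Of"):
        print(titles[i])
        i += 1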
For your first question, there is a difference between the best-case runtime of an algorithm, the worst-case runtime of an algorithm, big-O, and big-Ω notations. The best- and worst-case runtimes of an algorithm are actual mathematical functions with precise values that tell you the maximum and minimum amount of work an algorithm will do. To describe the growth rates of these functions, we can use big-O and big-Ω notation. However, it's possible to describe the best-case behavior of a function with big-Ω notation, or the worst case with big-O notation. For example, we might know that the worst-case runtime of an algorithm is O(n^2) without knowing exactly which function in O(n^2) the worst-case behavior is. This might occur if we wanted to upper-bound the worst case just to know that it isn't catastrophically bad. It's possible that the worst case is actually Θ(n), in which case the O(n^2) is a loose upper bound, but saying that the worst case is O(n^2) still indicates that it isn't terrible. Similarly, we might say that the best-case behavior of an algorithm is Ω(n) if, for example, we know that it must do at least linear work but can't tell whether it must do much more than this.
As to your second question, the analysis of the worst-case and best-case behaviors of the algorithm are absolutely dependent on the structure of the input data, and it's usually quite difficult to analyze the best- and worst-case behavior of an algorithm without seeing how it performs on different data sets. It's perfectly reasonable to do a worst-case analysis by exhibiting some specific family of inputs (note that it has to be a family of inputs, not just one input, since we need to get an asymptotic bound) and showing that the algorithm must run poorly on that input. You can do a best-case analysis in the same way.
Hope this helps!
