worst case of time complexity - complexity-theory

Why do we always consider the worst-case scenario in time complexity? For example, the time complexity of linear search is n and that of binary search is log(n), and these are the worst-case figures. So why do we use the worst case rather than the best or average case?

The reason we consider the worst-case scenario is to rate the efficiency of the algorithm. Think of it as the maximum cost you will incur when you run it on different inputs. In large-scale applications where performance matters and inputs vary, the time you must budget for is determined by the worst case: we can never be sure whether a given input is a best case or an average case, but we can be sure it won't be worse than the worst case.
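To make the contrast concrete, here is a minimal sketch in Python (the million-element data list is just a made-up example) showing why the worst case is the figure you budget for: searching for a value that is not present forces linear search through every element, while binary search still needs only about log2(n) probes.

    def linear_search(a, x):
        """Return the index of x in list a, or -1. Worst case: x absent or last -> n comparisons."""
        for i, v in enumerate(a):
            if v == x:
                return i
        return -1

    def binary_search(a, x):
        """Return the index of x in sorted list a, or -1. Worst case: ~log2(n) iterations."""
        lo, hi = 0, len(a) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if a[mid] == x:
                return mid
            if a[mid] < x:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1

    data = list(range(1_000_000))
    # Best case for both: the very first element probed is the target.
    # Worst case: a value that is not present at all -- this is the cost we budget for.
    linear_search(data, -1)   # scans all 1,000,000 elements: O(n)
    binary_search(data, -1)   # about 20 halvings: O(log n)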

Related

Best asymptotic notation

If an algorithm's worst-case running time is 6n^4 + 2 and its best-case running time is 67 + 6n^3, what is the most appropriate asymptotic notation?
I'm trying to learn about Big O notation.
Is it Θ(n^2)?
Essentially, Big-Oh time complexity analysis can be applied to the best-case, worst-case, or average number of operations an algorithm performs. "Is it Θ(n^2)?" You should specify which case you are asking about. Or do you mean: is it Θ(n^2) for all cases? (That is obviously not correct.)
Having said that, we know the algorithm performs 6n^4 + 2 operations in the worst case, so it has Θ(n^4) worst-case complexity. I've used theta here because I know exactly how many operations are performed. In the best case it performs 67 + 6n^3 operations, so its best-case time complexity is Θ(n^3).
What about average time complexity? I can't know it as long as I am not given the probability distribution of the inputs. It may be that best-case-like scenarios rarely occur and the average time complexity is Θ(n^4), or vice versa. So we cannot infer the average-case time complexity from the worst-case and best-case complexities unless we are given the input probability distribution, the algorithm itself, or a recurrence relation. (If the best and worst time complexities are the same, then of course the average time complexity equals them.)
If the algorithm is provided, we can calculate the average time complexity by making some basic assumptions about the input (such as all inputs being equally likely). For example, in linear search the best case is O(1) and the worst case is O(n). Assuming an equally likely distribution, you can conclude that the average time complexity is O(n) using the expectation formula: the sum over all inputs i of (probability of input i) × (number of operations for that input).
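As a small illustration of that expectation formula, the sketch below (assuming a successful linear search with the target equally likely to be at any of the n positions) computes the expected number of comparisons; it comes out to (n + 1) / 2, which is still Θ(n).

    def expected_linear_search_comparisons(n):
        """E[comparisons] = sum over positions i of P(target at i) * (comparisons to reach i)."""
        return sum((1 / n) * (i + 1) for i in range(n))

    print(expected_linear_search_comparisons(1000))  # 500.5, i.e. (n + 1) / 2 -> still Theta(n)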
Lastly, your average time complexity CANNOT be Θ(n^2), because your best and worst time complexities are both worse than quadratic. It makes no sense to expect this algorithm to perform n^2 operations on average when it performs n^3 operations even in the best case.
Time complexity for best case <= time complexity for average case <= time complexity for worst case

Combinations of Asymptotic Time Complexities with Best, Average and Worst Case Inputs

I'm confused by numerous claims that asymptotic notation has nothing to do with best-case, average-case and worst-case time complexity. If this is the case, then presumably the following combinations are all valid:
Time Complexity: O(n)
Best case - upper bound for the best case input
For the best possible input, the number of basic operations carried out by this algorithm will never exceed some constant multiple of n.
Average case - upper bound for average case input
For an average input, the number of basic operations carried out by this algorithm will never exceed some constant multiple of n.
Worst case - upper bound for worst case input
For the worst possible input, the number of basic operations carried out by this algorithm will never exceed some constant multiple of n.
Time Complexity: Θ(n)
Best case - tight bound for the best case input
For the best possible input, the number of basic operations carried out by this algorithm will neither exceed some constant multiple of n nor fall below some (smaller) constant multiple of n.
Average case - tight bound for average case input
For an average input, the number of basic operations carried out by this algorithm will neither exceed some constant multiple of n nor fall below some (smaller) constant multiple of n.
Worst case - tight bound for worst case input
For the worst possible input, the number of basic operations carried out by this algorithm will neither exceed some constant multiple of n nor fall below some (smaller) constant multiple of n.
Time Complexity: Ω(n)
Best case - lower bound for the best case input
For the best possible input, the number of basic operations carried out by this algorithm will never be less than some constant multiple of n.
Average case - lower bound for average case input
For an average input, the number of basic operations carried out by this algorithm will never be less than some constant multiple of n.
Worst case - lower bound for worst case input
For the worst possible input, the number of basic operations carried out by this algorithm will never be less than some constant multiple of n.
Which of the above make sense? Which are generally used in practice when assessing the efficiency of an algorithm in terms of time taken to execute as input grows? As far as I can tell, several of them are redundant and/or contradictory.
I'm really not seeing how the concepts of upper, tight and lower bounds have nothing to do with best, average and worst case inputs. This is one of those topics that the further I look into it, the more confused I become. I would be very grateful if someone could provide some clarity on the matter for me.
All of the statements make sense.
In practice, unless otherwise stated, we should be talking about the worst case input in all cases.
We're often concerned about the average case input, although the definition of "average case" gets a little dodgy. It's usually better to talk about the "expected time", since that is a more precise mathematical definition that corresponds to what people usually mean by "average case". People are unfortunately often sloppy here, and you'll often see complexity statements that refer to expected time instead of worst case time, but don't mention it.
We're rarely concerned about the best case input.
Unfortunately, it's also common to see people confusing the notions of best-vs-worst-case input vs upper-vs-lower bounds, especially on SO and other informal sites.
You can always go back to the definitions, as you have already done, to figure out what statements really mean.
"This algorithm runs in X(n2) time" Means that given the function f(n) for the worst-case execution time vs problem size in any environment, that function will be in the set X(n2).
"This algorithm runs in X(n2) expected time" Means that given the function f(n) for the mathematical expectation of execution time vs problem size in any environment, that function will be in the set X(n2).
Finally, note that in any environment actually only applies to specific computing models. We usually assume a random access machine, so complexity statements are not valid for, e.g., Turing machine implementations.
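To see that the case (best/worst) and the bound (O/Θ/Ω) are independent choices, here is a small illustrative sketch: it counts the comparisons insertion sort performs on a best-case (already sorted) and a worst-case (reverse-sorted) input. Each case gives its own function of n, and each of those functions can then be described with an upper, lower, or tight bound. (The comparison counter is a simplification for illustration, not a formal cost model.)

    def insertion_sort_ops(a):
        """Sort a copy of a, returning the number of comparisons performed."""
        a = list(a)
        ops = 0
        for i in range(1, len(a)):
            j = i
            while j > 0:
                ops += 1                      # one comparison
                if a[j - 1] > a[j]:
                    a[j - 1], a[j] = a[j], a[j - 1]
                    j -= 1
                else:
                    break
        return ops

    n = 1000
    print(insertion_sort_ops(range(n)))           # best case (sorted): n - 1 comparisons, Theta(n)
    print(insertion_sort_ops(range(n, 0, -1)))    # worst case (reversed): ~n^2 / 2, Theta(n^2)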

Average case complexity examples

I found many examples for worst-case and best-case complexity, but in most of them the average-case complexity was the same as the worst-case complexity. Are there examples where the average-case complexity differs from the worst-case complexity? If there are, please give some examples, both recursive and iterative.
Algorithms based on partitioning using a pivot will typically have poor worst-case performance, because of the possibility of choosing an inefficient partition (i.e. most elements end up in the same side):
Quicksort has time complexity O(n log n) in the average case, but O(n^2) in the worst case.
Quickselect has time complexity O(n) in the average case, but O(n^2) in the worst case.
Note that both quicksort and quickselect can be implemented with or without recursion.
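For instance, here is a minimal quickselect sketch (iterative, Lomuto-style partition that always picks the last element as pivot, chosen only for brevity): on random data it discards a constant fraction of the array per round on average, giving O(n) expected time, but an already-sorted input keeps hitting the worst pivot and degrades it to O(n^2).

    def quickselect(a, k):
        """Return the k-th smallest element (0-based) of list a.
        Average case O(n); worst case O(n^2), e.g. when a is already sorted
        and the last element is always chosen as pivot."""
        a = list(a)
        lo, hi = 0, len(a) - 1
        while True:
            pivot = a[hi]                      # naive pivot choice; causes the worst case on sorted input
            i = lo
            for j in range(lo, hi):            # Lomuto partition
                if a[j] < pivot:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[i], a[hi] = a[hi], a[i]
            if i == k:
                return a[i]
            elif k < i:
                hi = i - 1
            else:
                lo = i + 1

    print(quickselect([7, 1, 5, 3, 9], 2))     # 5, the median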
Algorithms based on hashes also typically have poor worst-case performance, because of the possibility of hash collision. For example, basic operations on a hash table are O(1) time in the average case, but O(n) time in the worst case.
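As a quick illustration using Python's built-in dict (the BadKey class is a deliberately artificial example), forcing every key to the same hash value makes each lookup fall back to probing through colliding entries, which is the O(n) worst case; with a normal hash the same lookups are O(1) on average.

    class BadKey:
        """Key whose hash is constant, so every entry collides in the hash table."""
        def __init__(self, v):
            self.v = v
        def __hash__(self):
            return 42                         # all keys land in the same bucket
        def __eq__(self, other):
            return isinstance(other, BadKey) and self.v == other.v

    table = {BadKey(i): i for i in range(1000)}
    print(table[BadKey(500)])                 # correct result, but the lookup must probe
                                              # through colliding keys: O(n) instead of O(1)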
More generally, algorithms designed for efficient performance on roughly uniform data are likely to perform worse on very non-uniform data; for example, interpolation search degrades from an average of O(log log n) on uniform data to O(n) in the worst case.
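A minimal interpolation-search sketch (assuming a sorted list of integers) looks like this; the position estimate works well when the keys are roughly uniformly spread, but on badly skewed data the estimate keeps landing near one end and the search degrades toward O(n).

    def interpolation_search(a, x):
        """Search for x in a sorted list of integers.
        ~O(log log n) on uniformly distributed keys, O(n) in the worst case."""
        lo, hi = 0, len(a) - 1
        while lo <= hi and a[lo] <= x <= a[hi]:
            if a[hi] == a[lo]:                 # all remaining keys equal; avoid division by zero
                return lo if a[lo] == x else -1
            # Guess the position by linear interpolation between the endpoints.
            pos = lo + (x - a[lo]) * (hi - lo) // (a[hi] - a[lo])
            if a[pos] == x:
                return pos
            if a[pos] < x:
                lo = pos + 1
            else:
                hi = pos - 1
        return -1

    print(interpolation_search(list(range(0, 2000, 2)), 1234))   # uniform keys: found in very few probes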
Yes, in absolute terms the average-case complexity tends to lie between the best and worst cases.
However, complexities depend on the input size and distribution, and since they are expressed as functions of n, the average case ends up equal to either the best case or the worst case most of the time.
So for the specific examples you are looking for, the answer is usually no.
Moreover, average-case analysis comes with several constraints, e.g.:
* A probability distribution on the inputs has to be specified.
* A simple assumption is that all inputs of size n are equally likely.
* Sometimes this assumption does not hold for real inputs.
* The analysis might be a difficult mathematical challenge.
* Even if the average-case complexity is good, the worst-case complexity may be very bad.
So it is less likely to be used.

Big O algorithms minimum time

I know that for some problems, no matter what algorithm you use to solve them, there will always be a certain minimum amount of time required. I know Big O captures the worst case (maximum time needed), but how can you find the minimum time required as a function of n? Can we find the minimum time needed for sorting n integers, or perhaps for finding the minimum of n integers?
What you are looking for is called best-case complexity. It is a mostly useless analysis for algorithms; worst-case analysis is the most important, and average-case analysis is sometimes used in special scenarios.
The best-case complexity depends on the algorithm. For example, in a linear search the best case is when the searched number is at the beginning of the array, and in a binary search it is when the number is at the first dividing point. In these cases the complexity is O(1).
For a single problem, the best-case complexity may vary depending on the algorithm. For example, let's discuss some basic sorting algorithms.
In bubble sort the best case is when the array is already sorted, but even then you have to scan all elements once to be sure, so the best case is O(n) (see the sketch below). The same goes for insertion sort.
For quicksort/mergesort/heapsort the best-case complexity is O(n log n).
For selection sort it is O(n^2).
So from the above you can see that the complexity (whether best, worst, or average) depends on the algorithm, not on the problem.
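To illustrate the bubble sort best case mentioned above, here is a minimal sketch (the pass counter is added purely for illustration and is not part of the standard algorithm): on an already-sorted array the early-exit check stops after a single pass of n - 1 comparisons, i.e. O(n), while a reverse-sorted array needs about n passes, i.e. O(n^2).

    def bubble_sort(a):
        """Sort list a in place; returns the number of passes, for illustration."""
        n = len(a)
        passes = 0
        for end in range(n - 1, 0, -1):
            passes += 1
            swapped = False
            for j in range(end):
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
                    swapped = True
            if not swapped:                    # already sorted: stop early
                break
        return passes

    print(bubble_sort(list(range(1000))))         # best case (sorted): 1 pass  -> O(n)
    print(bubble_sort(list(range(1000, 0, -1))))  # worst case (reversed): 999 passes -> O(n^2)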

Difference between average case and amortized analysis

I am reading an article on amortized analysis of algorithms. The following is a text snippet.
Amortized analysis is similar to average-case analysis in that it is
concerned with the cost averaged over a sequence of operations.
However, average case analysis relies on probabilistic assumptions
about the data structures and operations in order to compute an
expected running time of an algorithm. Its applicability is therefore
dependent on certain assumptions about the probability distribution of
algorithm inputs.
An average case bound does not preclude the possibility that one will
get “unlucky” and encounter an input that requires more-than-expected
time even if the assumptions for probability distribution of inputs are
valid.
My questions about above text snippet are:
In the first paragraph, how does average-case analysis “rely on probabilistic assumptions about data structures and operations?” I know average-case analysis depends on probability of input, but what does the above statement mean?
In the second paragraph, what does the author mean by saying we can get "unlucky" even if the assumptions about the input distribution are valid?
Thanks!
Average-case analysis makes assumptions about the input that may not be met in certain cases. Therefore, if your input is not random, in the worst case the actual performance of an algorithm may be much slower than the average case.
Amortized analysis makes no such assumptions, but it considers the total performance of a sequence of operations instead of just one operation.
Dynamic array insertion provides a simple example of amortized analysis. One strategy is to allocate a fixed-size array and, as new elements are inserted, allocate a new array of double the old length when necessary. In the worst case an insertion can require time proportional to the length of the entire list, so in the worst case insertion is an O(n) operation. However, you can guarantee that such a worst case is infrequent, so insertion is an O(1) operation using amortized analysis. Amortized analysis holds no matter what the input is.
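Here is a minimal sketch of that doubling strategy (a toy DynamicArray class, not Python's built-in list, which already does something similar internally): it counts element copies so you can see that n appends cost fewer than 3n copies in total, i.e. O(1) amortized per append, even though a single append occasionally costs O(n).

    class DynamicArray:
        """Toy growable array that doubles its capacity when full."""
        def __init__(self):
            self.capacity = 1
            self.size = 0
            self.data = [None] * self.capacity
            self.copies = 0                   # total element copies, to illustrate the amortized cost

        def append(self, x):
            if self.size == self.capacity:    # worst case: O(n) copy into a bigger array
                self.capacity *= 2
                new_data = [None] * self.capacity
                for i in range(self.size):
                    new_data[i] = self.data[i]
                    self.copies += 1
                self.data = new_data
            self.data[self.size] = x
            self.copies += 1                  # writing the new element
            self.size += 1

    arr = DynamicArray()
    n = 10_000
    for i in range(n):
        arr.append(i)
    print(arr.copies, arr.copies / n)         # fewer than 3n copies total -> O(1) amortized per append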
To get the average-case time complexity, you need to make assumptions about what the "average case" is. If inputs are strings, what's the "average string"? Does only length matter? If so, what is the average length of strings I will get? If not, what is the average character(s) in these strings? It becomes difficult to answer these questions definitively if the strings are, for instance, last names. What is the average last name?
In most interesting statistical samples, the maximum value is greater than the mean. This means that your average case analysis will sometimes underestimate the time/resources needed for certain inputs (which are problematic). If you think about it, for a symmetrical PDF, average case analysis should underestimate as much as it overestimates. Worst case analysis, OTOH, considers only the most problematic case(s), and so is guaranteed to overestimate.
Consider searching for a given value in an unsorted array (linear search). You may know it has O(n) running time, but to be more precise it does about n/2 comparisons in the average case. Why? Because we are making an assumption about the data: we assume the value is equally likely to be at any position.
If we change this assumption and say, for example, that the probability of the value being at position i increases with i, we could prove a different comparison count, or even a different asymptotic bound.
In the second paragraph, the author says that with average-case analysis we can be very unlucky and observe a measured average greater than the theoretical one; recalling the previous example, if we are unlucky on m different arrays of size n and the value is in the last position every time, then we will measure an average of about n comparisons, not n/2. This simply cannot happen once an amortized bound has been proven.
