Dropping the less significant terms in the middle of calculating time complexity?

We know that for some algorithm with a time complexity of, let's say, T(n) = n^2 + n + 1, we can drop the less significant terms and say that it has a worst case of O(n^2).
What about when we're in the middle of calculating the time complexity of an algorithm, such as T(n) = 2T(n/2) + n + log(n)? Can we drop the less significant terms and just say T(n) = 2T(n/2) + n = O(n log(n))?

In this case, yes, you can safely discard the dominated (log n) term. In general, you can do this any time you only need the asymptotic behaviour rather than the exact formula.
When you apply the Master theorem to solve a recurrence relation like
T(n) = a T(n/b) + f(n)
asymptotically, then you don't need an exact formula for f(n), just the asymptotic behaviour, because that's how the Master theorem works.
In your example, a = 2, b = 2, so the critical exponent is c = 1. Then the Master theorem tells us that T(n) is in Θ(n log n) because f(n) = n + log n, which is in Θ(n^c) = Θ(n).
We would have reached the same conclusion using f(n) = n, because that's also in Θ(n). Applying the theorem only requires knowing the asymptotic behaviour of f(n), so in this context it's safe to discard dominated terms which don't affect f(n)'s asymptotic behaviour.
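As a quick sanity check (my own sketch, not part of the answer above, and assuming a base case of T(1) = 1), you can iterate the recurrence numerically and watch T(n)/(n log n) stay bounded:

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n: int) -> float:
    # Assumed base case: T(1) = 1.
    if n <= 1:
        return 1.0
    return 2 * T(n // 2) + n + math.log(n)

for k in range(10, 25, 2):
    n = 2 ** k
    print(f"n = 2^{k:2d}   T(n) / (n log n) = {T(n) / (n * math.log(n)):.3f}")

The ratio stays bounded (it slowly drifts towards 1/ln 2 ≈ 1.44), which is what Θ(n log n) predicts; summed over the whole recursion, the extra log(n) term only contributes a lower-order O(n) amount.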

First of all, you need to understand that T(n) = n^2 + n + 1 is a closed-form expression; in simple terms, that means you can plug in some value for n and you will get the value of the whole expression.
On the other hand, T(n) = 2T(n/2) + n + log(n) is a recurrence relation: the expression is defined recursively, and to get a closed-form expression you have to solve the recurrence.
Now, to answer your question: in general we drop lower-order terms and coefficients when we can clearly see the highest-order term; in T(n) = n^2 + n + 1 it's n^2. But in a recurrence relation there is no such highest-order term, because it's not a closed-form expression.
One thing to observe, though, is that the highest-order term in the closed-form expression of a recurrence relation is the depth of the recursion tree multiplied by the highest-order term in the recurrence. In your case that is depthOf(2T(n/2)) * n, which comes out to log(n) * n, so in big-O notation it's O(n log n).
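To make that recursion-tree argument concrete, here is a tiny sketch (my own illustration, assuming n is a power of two) that lists the non-recursive work per level of T(n) = 2T(n/2) + n:

import math

def level_costs(n: int):
    # Non-recursive work at each level of the tree for T(n) = 2T(n/2) + n:
    # at level i there are 2^i subproblems of size n/2^i each.
    costs = []
    size, count = n, 1
    while size >= 1:
        costs.append(count * size)   # 2^i * (n / 2^i) = n work per level
        size //= 2
        count *= 2
    return costs

n = 1024
costs = level_costs(n)
print(len(costs), "levels, each costing", costs[0])          # ~log2(n) levels of cost n
print("total:", sum(costs), " vs  n*log2(n):", n * math.log2(n))

Every level costs n and there are about log2(n) + 1 levels, so the total comes out to roughly n * log2(n).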

Related

When solving a recurrence relation how to determine if the result is a big O or a Big theta?

I'm solving some recurrences and I didn't quite understand when to put O or Θ on the result.
If I have a recurrence like this:
T(n) = 2T(n/2) + n
and I solve it with the iterative method, I know the result is O(n log n).
Could I also use Θ(n log n)?
Is there any verification I could do to establish this?
By definition,
If T(n) = O(g(n)) as well as T(n) = Ω(g(n)),
then we can say that T(n) = Θ(g(n))
In the case of T(n) = 2T(n/2) + n you're dividing the n inputs into two branches and then merging those inputs from the bottom up. This method is also known as Divide and Conquer.
So at every level of the tree you're doing n operations in total. How many levels are there? That is the height of the tree; for the binary tree here, the height is log_2(n). Thus the total number of operations you're doing is at most n * log_2(n).
Since the height of the tree is exactly log_2(n) and the work per level is exactly n, the total number of operations is also at least n * log_2(n).
Thus for your above function, T(n) = Θ(n * log_2(n)).
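If you want a quick numeric confirmation of both bounds (my own sketch, assuming the base case T(1) = 1 and picking k1 = 0.5, k2 = 2 as candidate constants):

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n: int) -> int:
    # Assumed base case: T(1) = 1.
    return 1 if n <= 1 else 2 * T(n // 2) + n

k1, k2 = 0.5, 2.0                       # candidate constants for the two bounds
for k in range(4, 21, 4):
    n = 2 ** k
    g = n * math.log2(n)
    print(n, k1 * g <= T(n) <= k2 * g)  # True -> consistent with Theta(n log n)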

Expected running time value of a random algorithm

Given this algorithm, I am required to:
Find the recurrence for the expected value of the running time.
Find as tight an upper bound as possible.
I am actually a bit lost so if someone could help...
Recursive formula for worst case: T(n) = T(n/2) + n
Recursive formula for best case: T(n) = T(1) + n
Recursive formula for expected case: T(n) = T(n/4) + n
Worst case: 2n = O(n)
Best case: n = O(n)
Expected case: 4n/3 = O(n)
Some people here seem to be confused about the log(n) factor. A log(n) factor would only appear if the recurrence were T(n) = 2T(n/2) + n, i.e. if the function were called TWICE recursively with half the input.
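For the curious, here is a small sketch (my own, with T(1) = 1 assumed as the base case) that checks the geometric sums behind the constants 2n and 4n/3 quoted above:

from functools import lru_cache

def solve(shrink):
    # Build T(n) = T(n / shrink) + n with the assumed base case T(1) = 1.
    @lru_cache(maxsize=None)
    def T(n):
        return 1 if n <= 1 else T(n // shrink) + n
    return T

n = 4 ** 10                   # power of 4, so both n/2 and n/4 divide evenly
half, quarter = solve(2), solve(4)
print(half(n) / n)            # ~2.0   for T(n) = T(n/2) + n
print(quarter(n) / n)         # ~1.333 for T(n) = T(n/4) + n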

Master Theorem confusion with the three cases

I know that we can apply the Master Theorem to find the running time of a divide and conquer algorithm, when the recurrence relation has the form of:
T(n) = a*T(n/b) + f(n)
We know the following :
a is the number of subproblems into which the algorithm divides the original problem
b determines the size of the sub-problems, i.e. each sub-problem has size n/b
and finally, f(n) encompasses the cost of dividing the problem and combining the results of the subproblems.
Now we find something (I will come back to the term "something")
and we have 3 cases to check.
The case that f(n) = O(n^(log_b(a) - ε)) for some ε > 0; then T(n) is Θ(n^log_b(a)).
The case that f(n) = Θ(n^log_b(a)); then T(n) is Θ(n^log_b(a) * log n).
If n^(log_b(a) + ε) = O(f(n)) for some constant ε > 0, and if a*f(n/b) <= c*f(n) for some constant
c < 1 and almost all n, then T(n) = Θ(f(n)).
All fine; now I come back to the term "something". How can we use general examples (i.e. ones that use variables rather than actual numbers) to decide which case the algorithm falls into?
For instance, consider the following:
T(n) = 8T(n/2) + n
So a = 8, b = 2 and f(n) = n
How will I proceed then? How can I decide which case it is? And while the condition on f(n) is stated in big-O notation, how are these two things comparable?
The above is just an example to show you where I don't get it, so the question is in general.
Thanks
As CLRS suggests, the basic idea is comparing f(n) with n^(log_b a), i.e. n to the power (log of a to the base b). In your hypothetical example, we have:
f(n) = n
n^(log_b a) = n^(log_2 8) = n^3, i.e. n cubed, as your recurrence yields 8 problems of half the size at every step.
Thus, in this case, n^(log_b a) is larger, because f(n) = n is O(n^(3 - ε)), i.e. polynomially smaller than n^3, and the solution is: T(n) = Θ(n^3).
Clearly, the number of subproblems vastly outpaces the work (linear, f(n) = n) you are doing for each subproblem. Thus, intuition tells us, and the master theorem verifies, that it is the n^(log_b a) term that dominates the recurrence.
There is a subtle technicality here: the master theorem requires that f(n) be not only smaller than n^(log_b a) O-wise, but polynomially smaller, i.e. by a factor of n^ε for some ε > 0.
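If f(n) happens to be a plain polynomial n^d, the comparison can be written down mechanically. Here is a rough sketch of that decision procedure (master_case is my own helper name, not a standard function; case 3 relies on the fact that the regularity condition automatically holds for polynomial f):

import math

def master_case(a: int, b: int, d: float) -> str:
    # Classify T(n) = a*T(n/b) + n^d by comparing d with log_b(a).
    # (The float equality check is fine for these illustrative inputs only.)
    crit = math.log(a, b)
    if d < crit:
        return f"case 1: T(n) = Theta(n^{crit:g})"
    if d == crit:
        return f"case 2: T(n) = Theta(n^{crit:g} * log n)"
    return f"case 3: T(n) = Theta(n^{d:g})"   # regularity condition holds for f(n) = n^d

print(master_case(8, 2, 1))   # 8T(n/2) + n    -> case 1: Theta(n^3)
print(master_case(2, 2, 1))   # 2T(n/2) + n    -> case 2: Theta(n^1 * log n)
print(master_case(2, 2, 2))   # 2T(n/2) + n^2  -> case 3: Theta(n^2)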

time complexity and size of the input

I'm studying for an exam which is mostly about the time complexity. I've encountered a problem while solving these four questions.
1) If we prove that an algorithm has a time complexity of Theta(n^2), is it possible that it runs in O(n) time for ALL inputs?
2) If we prove that an algorithm has a time complexity of Theta(n^2), is it possible that it runs in O(n) time for SOME inputs?
3) If we prove that an algorithm has a time complexity of O(n^2), is it possible that it runs in O(n) time for SOME inputs?
4) If we prove that an algorithm has a time complexity of O(n^2), is it possible that it runs in O(n) time for ALL inputs?
Can anyone tell me how to answer such questions? I'm mostly confused when they ask about "all" or "some" inputs.
thanks
gkovacs90's answer provides a good link: WIKI
T(n) = O(n^3) means T(n) grows asymptotically no faster than n^3: a constant k > 0 exists such that for all n > N, T(n) < k*n^3.
T(n) = Θ(n^3) means T(n) grows asymptotically as fast as n^3: two constants k1, k2 > 0 exist such that for all n > N, k1*n^3 < T(n) < k2*n^3.
So if T(n) = n^3 + 2*n + 3,
then T(n) = Θ(n^3) is more appropriate than T(n) = O(n^3), since we have more information about the way T(n) behaves asymptotically.
T(n) = Θ(n^3) means that for n > N the curve of T(n) will "approach" and "stay close" to the curve of k*n^3, with k > 0.
T(n) = O(n^3) means that for n > N the curve of T(n) will always stay under the curve of k*n^3, for some k > 0.
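A quick numeric illustration of these definitions for T(n) = n^3 + 2*n + 3 (my own sketch; k1 = 1, k2 = 2, N = 2 are just one choice of constants that happens to work):

def T(n):
    return n**3 + 2*n + 3

k1, k2, N = 1, 2, 2
print(all(k1 * n**3 < T(n) < k2 * n**3 for n in range(N + 1, 1000)))   # True -> consistent with Theta(n^3)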
1: No
2: Yes. As gkovacs90 says, for small values of n you can have an O(n) running time, but I would say no for big enough inputs; the notations Theta and Big-O only mean something asymptotically.
3: Yes
4: Yes
Example for number 4 (dumb but still true): for an array A : Int[], compute the sum of the values. Your algorithm will certainly be:
# Sum the values of the int array A in a single pass, so T(n) = n for an array of length n.
def array_sum(A):
    total = 0
    for a in A:
        total = total + a
    return total
If n is the length of the array A, the time complexity is T(n) = n. So T(n) = O(n^2), since T(n) will not grow faster than n^2. And still, for every array, we have a running time of O(n).
If you find such a result for a time (or memory) complexity, then you can (and certainly should) refine the Big-O / Theta of your function (here, obviously, we have Θ(n)).
Some last points :
T(n)=Θ(g(n)) implies T(n)=O(g(n)).
In computational complexity theory, the complexity is sometimes computed for best, worst and average cases.
A "barfoot" explanation:
Big O notation is for setting an upper bound. By definition, there is always an index(or an input-length) from wich the notation is correct. So below this index, anything can happen.
For example sorting an array(O(n^2)) with one element takes less time, than writing the elements to the output(O(n)). ( we don't sort, we know it is in the right order, so it takes 0 time ).
So the answers:
1: No
2: Yes
3: Yes
4: Yes
You can find a detailed, understandable description at WIKI.
And HERE you can find a simpler explanation.

Is worst case analysis not equal to asymptotic bounds

Can someone explain to me why this is true? I heard a professor mention this in his lecture.
The two notions are orthogonal.
You can have worst case asymptotics. If f(n) denotes the worst case time taken by a given algorithm with input n, you can have eg. f(n) = O(n^3) or other asymptotic upper bounds of the worst case time complexity.
Likewise, you can have g(n) = O(n^2 log n) where g(n) is the average time taken by the same algorithm with (say) uniformly distributed (random) inputs of size n.
Or you can have h(n) = O(n) where h(n) is the average time taken by the same algorithm with particularly distributed random inputs of size n (eg. almost sorted sequences for a sorting algorithm).
Asymptotic notation is a "measure". You have to specify what you want to count: worst case, best case, average, etc.
Sometimes, you are interested in stating asymptotic lower bounds of (say) the worst case complexity. Then you write f(n) = Omega(n^2) to state that in the worst case, the complexity is at least n^2. The big-Omega notation is opposite to big-O: f = Omega(g) if and only if g = O(f).
Take quicksort as an example. Each recursive call of quicksort on a sublist of size n has a running time T(n) of
T(n) = O(n) + 2 T[ (n-1)/2 ]
in the 'best case', if the unsorted input list is split into two equal sublists of size (n-1)/2 in each call. Solving for T(n) gives O(n log n) in this case. If the partition is not perfect and the two sublists are not of equal size, i.e.
T(n) = O(n) + T(k) + T(n - 1 - k),
we still obtain O(n log n) as long as each sublist is a constant fraction of n (say k = n/100), just with a larger constant factor; the recursion tree then still has depth O(log n), with O(n) work per level.
However, in the 'worst case' no real division of the input list takes place, i.e. one of the sublists is always empty:
T(n) = O(n) + T(0) + T(n - 1) = O(n) + O(n-1) + T(n-2) = ... = O(n) + O(n-1) + ... + O(1).
This happens e.g. if we take the first element of a sorted list as the pivot element.
Here, T(0) means one of the resulting sublists is empty and therefore takes no computing time (since it has zero elements). All the remaining work T(n-1) is needed for the second sublist. In this case, we obtain O(n²).
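To see the two regimes side by side, here is a rough empirical sketch (my own, counting only element-vs-pivot comparisons, with the first element always used as the pivot) comparing a random input with an already-sorted one:

import math, random, sys

def quicksort_comparisons(a):
    # Count element-vs-pivot comparisons of a quicksort that always picks a[0] as the pivot.
    if len(a) <= 1:
        return 0
    pivot, left, right = a[0], [], []
    for x in a[1:]:                       # n - 1 comparisons against the pivot
        (left if x < pivot else right).append(x)
    return len(a) - 1 + quicksort_comparisons(left) + quicksort_comparisons(right)

sys.setrecursionlimit(10000)              # the sorted input recurses n levels deep
n = 2000
print("random:", quicksort_comparisons(random.sample(range(n), n)),
      " vs n*log2(n) =", int(n * math.log2(n)))
print("sorted:", quicksort_comparisons(list(range(n))),
      " vs n^2/2 =", n * n // 2)

On the random input the count stays within a small constant factor of n*log2(n), while on the already-sorted input it is about n²/2, matching the two recurrences above.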
If an algorithm had no worst-case scenario, it would not only be O[f(n)] but also o[f(n)] (big-O vs. little-o notation).
The asymptotic bound is the behaviour as n goes to infinity; mathematically it is just a limit as n goes to infinity. The worst-case behaviour, however, applies to a finite number of operations.
