What will be the time complexity of this code fragment?

In this fragment, the iterator i advances by adding its own log value on each pass. What will be the big-O time complexity of the fragment?
i = 10
while (i < n) {
    i = i + log(i);
}

Interesting question! Here's an initial step toward working out the runtime.
Let's imagine that we have reached a point in the loop where the value of i has reached 2^k for some natural number k. Then for i to increase to 2^(k+1), we'll need approximately 2^k / k iterations to elapse. Why? That's because
The amount i needs to increase is 2^(k+1) - 2^k = 2^k (2 - 1) = 2^k, and
At each step, increasing i by log i will increase i by roughly log 2^k = k.
We can therefore break the algorithm apart into "stages." In the first stage, we grow from size 2^3 to size 2^4, requiring (roughly) 2^3 / 3 steps. In the second stage, we grow from size 2^4 to size 2^5, requiring (roughly) 2^4 / 4 steps. After repeating this process many times, we eventually grow from size n/2 = 2^(log n - 1) to size n = 2^(log n), requiring (roughly) 2^(log n) / log n steps. The total amount of work done is therefore given by
2^3 / 3 + 2^4 / 4 + 2^5 / 5 + ... + 2^(log n) / log n.
The goal now is to find some bounds on this expression. We can see that the sum is at least equal to its last term, which is 2^(log n) / log n = n / log n, so the work done is Ω(n / log n). We can also see that the work done is less than
2^3 + 2^4 + 2^5 + ... + 2^(log n)
≤ 2^(log n + 1) (sum of a geometric series)
= 2n,
so the work done is O(n). That sandwiches the runtime between Ω(n / log n) and O(n).
It turns out that Θ(n / log n) is indeed a tight bound here, which can be proved by doing a more nuanced analysis of the summation.
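If you want to sanity-check the Θ(n / log n) bound numerically, here is a small Python sketch (mine, not part of the original answer; the helper name count_iterations is made up). It counts the loop's iterations and prints n / log n next to the count; the two columns grow at the same rate, which is what Θ(n / log n) predicts.

import math

def count_iterations(n):
    # Run the fragment from the question and count how many times the loop body executes.
    i, steps = 10.0, 0
    while i < n:
        i += math.log(i)   # natural log, as in the fragment
        steps += 1
    return steps

for n in (10**4, 10**5, 10**6, 10**7):
    print(n, count_iterations(n), round(n / math.log(n)))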

Let's start with the definition of g(n) = O(f(n)): saying that the function g is of order O(f(n)) means there exist a number n0 and a constant c such that g(n) <= c·f(n) for all n > n0.
Looking at the worst case, the while loop runs at most n times (i starts at 10 and grows by at least log(10) >= 1 per iteration), so we can say your code is of order O(n).
Now consider the following loop instead:
(*) while (i < n) {
    i = i + i;
}
which clearly advances faster than the original one (it skips over more values per iteration), so we can use it to estimate a lower bound. Examining (*), we see that the counter doubles in each iteration; put differently, each pass discards half of the remaining range, so the loop in (*) has worst-case asymptotic runtime O(log n).
Since the original code sits between these two loops, we can say its asymptotic lower bound is Ω(log n) and its asymptotic upper bound is O(n).
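For comparison, here is the same kind of iteration count for the doubling loop (*) (again my own sketch, not part of the answer): it finishes in about log2(n) passes, which is what makes it a valid Ω(log n) reference point.

import math

def doubling_iterations(n):
    # The comparison loop (*): i doubles on every pass.
    i, steps = 10, 0
    while i < n:
        i += i
        steps += 1
    return steps

for n in (10**3, 10**6, 10**9):
    print(n, doubling_iterations(n), math.ceil(math.log2(n / 10)))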

Related

If stack operations are constant time O(1), what is the time complexity of this algorithm?

BinaryConversion:
The input is a positive integer n, and the output is the binary representation of n on a stack.
What would the time complexity be here? I'm thinking it's O(n), since the while loop halves n every time, meaning the values handled for an input of size n decrease as n/2, n/4, n/8, etc.
Applying the sum of a geometric series with a = n and r = 1/2, we get 2n.
Any help appreciated! I'm still a noob.
create empty stack S
while n > 0 do
    push (n mod 2) onto S
    n = floor(n / 2)
end while
return S
If the loop was
while n > 0:
    for i in range(n):
        # some action
    n = n // 2
Then the complexity would have been O(n + n/2 + n/4 ... 1) ~ O(n), and your answer would have been correct.
while n > 0 do
    # some action
    n = n / 2
Here, however, the complexity is simply the number of times the loop runs, since the amount of work done in each iteration is O(1). So the answer is O(log(n)), since n is halved each time.
The number of iterations is the number of times you have to divide n by 2 to get 0, which is O(log n).
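Here is a runnable Python version of the pseudocode (a sketch of mine; binary_conversion is just a name I picked) that also counts the loop iterations. The count comes out to floor(log2(n)) + 1, i.e. the number of bits of n, which is O(log n).

import math

def binary_conversion(n):
    # Push the bits of n onto a list used as a stack (least significant bit first).
    stack, iterations = [], 0
    while n > 0:
        stack.append(n % 2)
        n //= 2              # floor(n / 2)
        iterations += 1
    return stack, iterations

for n in (5, 100, 10**6):
    bits, count = binary_conversion(n)
    print(n, count, math.floor(math.log2(n)) + 1)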

Is complexity O(log(n)) equivalent to O(sqrt(n))?

My professor just taught us, as a rule of thumb, that any operation that halves the length of the input has O(log(n)) complexity. Why is it not O(sqrt(n))? Aren't the two equivalent?
They are not equivalent: sqrt(N) increases a lot more quickly than log2(N). There is no constant C such that sqrt(N) < C·log(N) for all values of N greater than some minimum value.
An easy way to grasp this is that log2(N) is roughly the number of (binary) digits of N, while sqrt(N) is a number that itself has about half as many digits as N. Or, to state that as an equality:
log2(N) = 2·log2(sqrt(N))
So you need to take the logarithm(!) of sqrt(N) to bring it down to the same order of complexity as log2(N).
For example, for a binary number with 11 digits, 0b10000000000 (= 2^10), the square root is 0b100000, but the logarithm is only 10.
Assuming natural logarithms (otherwise just multiply by a constant), we have
lim {n->inf} log n / sqrt(n)                (an inf/inf form)
= lim {n->inf} (1/n) / (1/(2*sqrt(n)))      (by L'Hospital)
= lim {n->inf} 2*sqrt(n) / n
= lim {n->inf} 2 / sqrt(n)
= 0 < inf
Refer to https://en.wikipedia.org/wiki/Big_O_notation for the limit-based (alternative) definition of O(·); from the limit above we can therefore say log n = O(sqrt(n)).
You can also compare the growth of the two functions directly: with the natural logarithm, log n is upper bounded by sqrt(n) for all n > 0.
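A quick numeric comparison (my own sketch, sticking with natural logs as above) makes the gap obvious: sqrt(n) races ahead while log(n) barely moves.

import math

# Tabulate both functions at increasing n.
for n in (10**2, 10**4, 10**6, 10**8):
    print(n, round(math.log(n), 1), round(math.sqrt(n), 1))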
Just compare the two functions:
sqrt(n) vs. log(n)
n^(1/2) vs. log(n)
Take the log of both sides:
log(n^(1/2)) vs. log(log(n))
(1/2)·log(n) vs. log(log(n))
It is clear that const·log(n) > log(log(n)) for large n, so sqrt(n) grows strictly faster than log(n).
No, it's not equivalent.
@trincot gave an excellent explanation with an example in his answer. I'm adding one more point. Your professor taught you that
any operation that halves the length of the input has O(log(n)) complexity.
It's also true that
any operation that reduces the length of the input by 2/3 has O(log3(n)) complexity,
any operation that reduces the length of the input by 3/4 has O(log4(n)) complexity,
any operation that reduces the length of the input by 4/5 has O(log5(n)) complexity,
and so on.
It's true for every reduction of the input length by (B-1)/B: the operation then has O(logB(n)) complexity.
N.B.: logB(n) means the base-B logarithm of n.
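A small sketch of mine (the helper name rounds is made up) that keeps only 1/B of the input each round shows the round count matching ceil(logB(n)) for several bases.

import math

def rounds(n, B):
    # Throw away (B-1)/B of the input each round; count rounds until nothing is left.
    count = 0
    while n > 1:
        n /= B
        count += 1
    return count

n = 10**6
for B in (2, 3, 4, 5):
    print(B, rounds(n, B), math.ceil(math.log(n, B)))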
One way to approach the problem is to compare the rates of growth of the two functions, i.e. their derivatives:
(1) d/dn sqrt(n) = 1/(2·sqrt(n))
(2) d/dn log(n) = 1/n
As n increases we see that (2) is less than (1). When n = 10,000, (1) equals 0.005 while (2) equals 0.0001.
Hence log(n) grows more slowly, and is the better complexity as n increases.
No, they are not equivalent; you can even prove that
O(n**k) > O(log(n, base))
for any k > 0 and base > 1 (k = 1/2 in case of sqrt).
When talking about O(f(n)) we want to investigate the behaviour for large n, and limits are a good tool for that. Suppose the two big-O classes were equivalent:
O(n**k) = O(log(n, base))
which would mean there is some finite constant C such that
n**k <= C * log(n, base)
starting from some large enough n. Put in other terms (log(n, base) is nonzero for large n, and both functions are continuously differentiable):
lim(n->+inf) n**k / log(n, base)
would have to be finite (at most C). To find the limit's value we can use L'Hospital's rule, i.e. take the derivatives of numerator and denominator and divide them:
lim( n**k / log(n, base) ) =
lim( [k * n**(k-1)] / [1 / (n * ln(base))] ) =
ln(base) * k * lim(n**k) = +infinity
so we can conclude that there is no constant C such that n**k < C * log(n, base) for all large n, or in other words
O(n**k) > O(log(n, base))
No, it isn't.
When we are dealing with time complexity, we think of the input as a very large number. So let's take n = 2^18. Then sqrt(n) comes to 2^9 = 512 operations, while log(n) is only 18 (taking log base 2 here). Clearly 2^9 is much, much greater than 18.
So we can say that O(log n) grows more slowly than O(sqrt n).
To prove that sqrt(n) grows faster than lg n (base 2), you can take the limit of the second over the first and show that it approaches 0 as n approaches infinity.
lim(n->inf) of (lg n / sqrt(n))
Applying L'Hospital's rule:
= lim(n->inf) of (2 / (sqrt(n) * ln 2))
Since sqrt(n) increases without bound as n increases, while 2 and ln 2 are constants, this proves
lim(n->inf) of (2 / (sqrt(n) * ln 2)) = 0

About time complexity of algorithm

What is the time complexity of the code below?
int j = 1;
while (j < n) {
    j += log(j + 5);
}
By expanding the first three terms of the sum, you can see that it is essentially a sum of iterated log(log(j)) terms.
Since O(j) >> O(log(j)), it follows that O(log(j)) >> O(log(log(j))); the first term therefore overshadows all of the other terms.
The sum is therefore O(log(j)) per iteration, which means the time complexity is on the order of n / log(n).
Numerical tests show that this is actually about O(n^0.82...).
With the edited code (+= instead of =), I'll conjecture that the asymptotic time complexity of this code (assuming that log() and other elementary operations take constant time) is Θ(n / log n).
It's easy to show that the loop takes at least about n / log(n+5) iterations to complete, since it's counting up to n and each iteration increments the counter by an amount strictly less than log(n+5); and since log(n+5) <= log n + log 5 for n >= 2, that is at least n / (log n + log 5) iterations. Thus, the asymptotic time complexity is at least Ω(n / log n).
Showing that it's also O(n / log n), and thus Θ(n / log n), seems a bit trickier. Basically, my argument is that, for sufficiently large n, incrementing the counter j from exp(k) up to exp(k+1) takes on the order of C × exp(k) / k iterations (for a constant C ≈ (1 − exp(−1)) / 5, if I'm not mistaken). Letting h = ceil(log(n)), incrementing the counter from 1 to n thus takes at most
T = C × ( exp(h) / h + exp(h−1) / (h−1) + exp(h−2) / (h−2) + … + exp(1) / 1 )
iterations. For large n (and thus h), the leading exp(h) / h term should dominate the rest, such that T ≤ C2 × exp(h) / h for some constant C2, and thus the loop should run in O(exp(h) / h) = O(n / log n) time.
That said, I admit that this is just a proof sketch, and there might be gaps or errors in my argument. In particular, I have not actually determined an upper bound for the constant(?) C2. (C2 = C × h would obviously satisfy the inequality, but would only yield an O(n) upper bound on the runtime; C2 = C / (1 − exp(−1)) would give the desired bound, but is obviously too low.) Thus, I cannot completely rule out the possibility that the actual time complexity might be (very slightly) higher.
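For what it's worth, here is a quick numerical check of the Θ(n / log n) conjecture (a sketch of mine, not part of the answer): count the loop's passes and print n / log n next to them. The two columns grow in lockstep, with their ratio drifting only slowly toward a constant.

import math

def loop_iterations(n):
    # Count passes of: j = 1; while (j < n) j += log(j + 5);
    j, steps = 1.0, 0
    while j < n:
        j += math.log(j + 5)
        steps += 1
    return steps

for n in (10**4, 10**5, 10**6, 10**7):
    steps = loop_iterations(n)
    print(n, steps, round(n / math.log(n)), round(steps / (n / math.log(n)), 2))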

What is the complexity of multiple runs of an O(n log n) algorithm?

If the problem size is n, and an algorithm reduces the problem size by half each time, I believe the complexity is O(n log n), e.g. merge sort. So, basically, you are running a log(n) algorithm (the comparison) n times...
Now the problem is: I have a problem of size n, my algorithm is able to reduce the size by half in one run, and each run takes O(n log n). What is the complexity in this case?
If the problem takes n steps at size n, plus an additional run at size floor(n/2) when n > 1, then it takes O(n) time in total: n + n/2 + n/4 + ... =~ 2n = O(n).
Similarly, if each run takes time O(n log n) and an additional run at size floor(n/2) when n > 1, the total time is O(n log n).
Since the size of the problem gets halved in each iteration and at each level the time taken is n log n, the recurrence relation is
T(n) = T(n/2) + n log n
Applying the Master theorem: comparing with T(n) = a·T(n/b) + f(n), we have a = 1 and b = 2.
Hence n^(log_b a) = n^(log_2 1) = n^0 = 1.
Thus f(n) = n log n grows (polynomially) faster than n^(log_b a) = 1.
This is case 3 of the Master theorem, so T(n) = Θ(f(n)) = Θ(n log n).
Hence the complexity is T(n) = Θ(n log n).
EDIT after comments:
If the size of the problem halves at every run, you'll have log(n) runs to complete it. Since every run takes n*log(n) time, you'll have log(n) runs of n*log(n) each. The total complexity will be:
O(n log(n)^2)
If I don't misunderstand the question, the first run completes in time (proportional to) n log n. The second run has only n/2 left, so it completes in (n/2) log(n/2), and so on.
For large n, which is what you assume when analyzing the time complexity, log(n/2) = log n − log 2 can be replaced by log n.
Summing over "all" steps:
log(n) * (n + n/2 + n/4 + ...) = 2n log(n), i.e. time complexity O(n log n).
In other words, the time complexity is the same as for your first/basic run; all the others together "only" contribute the same amount once more.
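You can check this numerically: the sketch below (mine) evaluates the recurrence T(n) = T(n/2) + n*log2(n) directly and prints T(n) / (n*log2(n)). The ratio settles just below 2, confirming that everything after the first run adds only about one more n log n.

import math

def T(n):
    # T(n) = T(floor(n/2)) + n*log2(n), with T(1) = 0.
    if n <= 1:
        return 0
    return T(n // 2) + n * math.log2(n)

for n in (2**10, 2**15, 2**20):
    print(n, round(T(n) / (n * math.log2(n)), 3))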
I'm pretty sure that comes to O(n^2 log n). You create a geometric series n + n/2 + n/4 + ... = 2n (for large n), but you ignore the coefficient and just get n.
This is fine unless you mean the inner n log n to use the same n value as the outer n.
Edit:
I think that what the OP means here is that in each run the inner n log n also gets halved. In other words, the total is
n log n + (n/2) log(n/2) + (n/4) log(n/4) + ... + (n/2^(k-1)) log(n/2^(k-1)) + ...
If this is the case, then one thing to consider is that at some point
2^(k-1) > n
At that point the log breaks down (because the log of a number between 0 and 1 is negative). But you don't really need the log there, as all that is left is 1 operation per iteration, so from there on you are just adding 1s.
This occurs at k ≈ log n / log 2. So, for the first log n / log 2 terms, we have the sum as we had above, and after that it is just a sum of 1s:
n log n + (n/2) log(n/2) + (n/4) log(n/4) + ... + (n/2^(log n / log 2)) log(n/2^(log n / log 2)) + 1 + 1 + 1 + ... ((n − log n / log 2) times)
Unfortunately, this expression is not an easy one to simplify...

Running time complexity of bubble sort

I was looking at the bubble sort algorithm on Wikipedia; it seems that the worst case is O(n^2).
Let's take an array of size n:
int a[] = [1, 2, 3, 4, 5, ..., n]
For any n elements, the total number of comparisons is (n - 1) + (n - 2) + ... + 2 + 1 = n(n - 1)/2, or O(n^2).
Can anyone explain to me how n(n - 1)/2 equals O(n^2)? I am not able to understand how they came to the conclusion that the worst-case analysis of this algorithm is O(n^2).
They are looking at the behaviour as n gets closer to infinity, where n(n - 1)/2 is practically the same as n·n/2 = n^2/2.
And since they are only looking at how the running time grows as n increases, constant factors are irrelevant. In this case, when n doubles the algorithm takes about 4 times longer to execute, so we end up with n^2, i.e. O(n^2).
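If you want to see the n(n - 1)/2 count come out of the code itself, here is a small Python sketch of mine that counts every comparison bubble sort makes on a reverse-sorted (worst-case) array and prints n(n - 1)/2 next to it.

def bubble_sort_comparisons(a):
    # Plain bubble sort; count every element-to-element comparison.
    a = list(a)
    comparisons = 0
    for i in range(len(a) - 1):
        for j in range(len(a) - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return comparisons

for n in (10, 100, 1000):
    worst_case = list(range(n, 0, -1))    # reverse-sorted input
    print(n, bubble_sort_comparisons(worst_case), n * (n - 1) // 2)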

Resources