Time complexity for nested while loop problem

Hi community, I am currently having issues understanding this data structures problem:
function Prob(n):
    x = 0
    i = 5
    while i <= n ^ 2 * sqrt(n) do:
        j = n * logbase5(n)   // In my solution, for the first iteration K I set everything equal to c
        while j >= 3 do:      // So far my solution is that T(n) is in theta(n^2.5), but I don't think it's right.
                              // Iteration 1: K = n log(n), Iteration 2: B = log(n).
            x = x + i - j
            j = j - 5
        end while
        i = 3 * i
    end while
    return x
end function

Let's observe two things here:
j is only affected by n, so its starting value is the same on every run of the outer loop: n * logbase5(n). Since we subtract 5 on each run, the inner loop does about (n * logbase5(n)) / 5 iterations, which is theta(n * log(n)). (Notice that logbase5(n) is just there to confuse you, since it is log(n) / log(5), i.e. log(n) divided by a constant; the -5 also doesn't change the asymptotic behavior.)
i is changed in the outer loop by i = 3 * i, but not in the inner loop. The upper bound is (n ^ 2) * sqrt(n), which is n ^ 2.5 as you have noted, but each iteration we triple i. Thus, this loop runs about logbase3(n ^ 2.5) times (because i = 3 ^ # of iterations, more or less). Again, logbase3(n) = log(n) / log(3) and log(n ^ m) = m * log(n), so we get theta(log(n)).
All in all, the outer loop runs theta(log(n)) times and each of those runs does theta(n * log(n)) work in the inner loop, so we get theta(n * log(n) * log(n)), i.e. theta(n * log^2(n)).
Remark - I'm using the CS log (meaning base 2); this is the common notation, and as you can see, it only differs by a constant factor from any other non-variable base.
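If you want to sanity-check that result empirically, here is a small Python sketch (my own code; the function and variable names are mine) that counts how many times the inner statement runs and divides that count by n * log2(n)^2. The ratio should settle towards a constant as n grows, which is what theta(n * log^2(n)) predicts:

import math

def count_prob_ops(n):
    # Mirrors Prob(n) from the question, but counts inner-statement executions.
    count = 0
    i = 5
    while i <= n ** 2 * math.sqrt(n):
        j = n * math.log(n, 5)
        while j >= 3:
            count += 1        # stands in for "x = x + i - j"
            j -= 5
        i *= 3
    return count

for n in (100, 1_000, 10_000):
    c = count_prob_ops(n)
    print(n, c, c / (n * math.log2(n) ** 2))   # the last column should level off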

Related

What is the best way to argue about big O or about theta?

We're asked to show that $n + 4\sqrt{n} = O(n)$ with a good argument and a logical build-up for it, but it's not said what a good argument would look like. I know that $2n+4\sqrt{n}$ is always bigger for $n = 1$, but I wouldn't know how to argue it or how to build it up logically, since I just thought about it and it happened to be true. Can someone help out with this example so I would know how to do it?
You should look at the following site: https://en.wikipedia.org/wiki/Big_O_notation
For big-O notation we would say that a function such as X^3 + X^2 + 100X is O(X^3). The idea is that as X grows very large, the X^3 term becomes the dominant factor in the expression.
You can apply the same logic to your expression: which term becomes dominant?
If this is not clear, you should try to plot both terms and see how they scale. That may make it clearer.
A proof is a convincing, logical argument. When in doubt, a good way to write a convincing, logical argument is to use an accepted template for your argument. Then, others can simply check that you have used the template correctly and, if so, the validity of your argument follows.
A useful template for showing asymptotic bounds is mathematical induction. To use this, you show that what you are trying to prove is true for specific simple cases, called base cases, then you assume it is true in all cases up to a certain size (the induction hypothesis) and you finish the proof by showing the hypothesis implies the claim is true for cases of the very next size. If done correctly, you will have shown the claim (parameterized by a natural number n) is true for a fixed n and for all larger n. This is exactly what is required for proving asymptotic bounds.
In your case: we want to show that n + 4 * sqrt(n) = O(n). Recall that the (one?) formal definition of big-Oh is the following:
A function f is bound from above by a function g, written f(n) = O(g(n)), if there exist constants c > 0 and n0 > 0 such that for all n > n0, f(n) <= c * g(n).
Consider the case n = 0. We have n + 4 * sqrt(n) = 0 + 4 * 0 = 0 <= 0 = c * 0 = c * n for any constant c. If we now assume the claim is true for all n up to and including k, can we show it is true for n = k + 1? This would require (k + 1) + 4 * sqrt(k + 1) <= c * (k + 1). There are now two cases:
k + 1 is not a perfect square. Since we are doing analysis of algorithms it is implied that we are using integer math, so sqrt(k + 1) = sqrt(k) in this case. Therefore, (k + 1) + 4 * sqrt(k + 1) = (k + 4 * sqrt(k)) + 1 <= (c * k) + 1 <= c * (k + 1) by the induction hypothesis provided that c > 1.
k + 1 is a perfect square. Since we are doing analysis of algorithms it is implied that we are using integer math, so sqrt(k + 1) = sqrt(k) + 1 in this case. Therefore, (k + 1) + 4 * sqrt(k + 1) = (k + 4 * sqrt(k)) + 5 <= (c * k) + 5 <= c * (k + 1) by the induction hypothesis provided that c >= 5.
Because these two cases cover all possibilities and in each case the claim is true for n = k + 1 when we choose c >= 5, we see that n + 4 * sqrt(n) <= 5 * n for all n >= 0 = n0. This concludes the proof that n + 4 * sqrt(n) = O(n).
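As a quick Python sanity check of the bound we just proved (my own sketch; math.isqrt is used to match the integer-math convention of the proof), n + 4 * sqrt(n) <= 5 * n should hold for every n >= 0:

import math

# Check the proved inequality over a range of n.
assert all(n + 4 * math.isqrt(n) <= 5 * n for n in range(100_000))
print("n + 4*sqrt(n) <= 5*n holds for all n in [0, 100000)")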

How many times is x=x+1 executed in theta notation in terms of n?

I'm taking Data Analysis and Algorithms in the Summer.
The question: Find a Θ-notation in terms of n for the number of times the statement x = x + 1 is executed.
for i = 1 to 526
    for j = 1 to n^2 (lg n)^3
        for k = 1 to n
            x = x + 1
I'm confused about how to find the answer. The first line would be 526, then the second line is clearly n^2 times (lg n)^3, but would that maybe be 3(n^2) lg n? And then the third line is just n. So combined they would be 526 * n^3 (lg n)^3, and in terms of n it would be something like Θ(n^3 (lg n)^3). I'm not sure.
Also to make sure I understand this type of problem I have
for i = 1 to |nlgn|
    for j = 1 to i
        x = x + 1
The answer would be just nlgn because the i on the second line isn't important right?
Answering the first part:
for i = 1 to 526
    for j = 1 to n^2 (lg n)^3
        for k = 1 to n
            x = x + 1
This program will run 526 * n^2 * (lg n)^3 * n times = 526 * n^3 * (lg n)^3 times.
So, x = x +1 will be executed 526 * n^3 * (lg n)^3 times.
Coming to Big-theta notation,
Since 526 is just a constant factor, we have
c1 * n^3 * (lg n)^3 <= 526 * n^3 * (lg n)^3 <= c2 * n^3 * (lg n)^3
for some positive constants c1 <= 526 and c2 >= 526 and all n > 1, so the big-theta notation will be
θ(n^3 * (lg n)^3).
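If it helps to make that count concrete, here is a short Python sketch (mine; the non-integer loop bound n^2 (lg n)^3 is truncated with int(), which is one possible reading of the pseudocode) that actually runs the three loops for tiny n and prints the count next to 526 * n^3 * (lg n)^3:

import math

def count_increments(n):
    # Run the three loops literally and count executions of x = x + 1.
    count = 0
    for i in range(1, 527):                                    # i = 1 .. 526
        for j in range(1, int(n ** 2 * math.log2(n) ** 3) + 1):
            for k in range(1, n + 1):
                count += 1                                     # the x = x + 1 statement
    return count

for n in (2, 3, 4):
    print(n, count_increments(n), 526 * n ** 3 * math.log2(n) ** 3)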
Answering the second part:
for i = 1 to |nlgn|
    for j = 1 to i
        x = x + 1
Your assumption is incorrect. The inner loop guided by j also matters when counting the x = x + 1 statement, because the statement is the body of the inner loop, which itself is the body of the outer loop.
So, here, x = x + 1 will be evaluated
= 1 + 2 + ... + n*lg(n) times
= [n*lg(n) * (n*lg(n) + 1)] / 2 times,
which is Θ(n^2 * (lg n)^2).
The answer for the second example is not n log n. You cannot simply multiply the bounds of the two for loops together. The second loop forms a sum, since it runs from 1 to i, and i can be at most n log n. This sum is 1 + 2 + ... + n log n, and it can be evaluated using the formula for the sum of the first natural numbers. Therefore the sum is (n log n * (n log n + 1)) / 2.
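To see the triangular-sum argument concretely, here is a small Python sketch (my own; it reads |n lg n| as "n * lg n rounded down to an integer") that counts the executions and checks them against the closed form m(m+1)/2:

import math

def count_increments(n):
    m = int(n * math.log2(n))          # outer bound, rounded down
    count = 0
    for i in range(1, m + 1):          # i = 1 .. m
        for j in range(1, i + 1):      # j = 1 .. i
            count += 1                 # the x = x + 1 statement
    return count

for n in (4, 16, 64):
    m = int(n * math.log2(n))
    assert count_increments(n) == m * (m + 1) // 2
print("the count matches m * (m + 1) / 2 with m = floor(n * lg n)")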

Efficient Algorithm to Solve a Recursive Formula

I am given a formula f(n) where f(n) is defined, for all non-negative integers, as:
f(0) = 1
f(1) = 1
f(2) = 2
f(2n) = f(n) + f(n + 1) + n (for n > 1)
f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
My goal is to find, for any given number s, the largest n where f(n) = s. If there is no such n return None. s can be up to 10^25.
I have a brute force solution using both recursion and dynamic programming, but neither is efficient enough. What concepts might help me find an efficient solution to this problem?
I want to add a little complexity analysis and estimate the size of f(n).
If you look at one recursive call of f(n), you notice that the input n is basically divided by 2 before f is called two more times, where one call always has an even and one an odd input.
So the call tree is basically a binary tree where, at any specific depth k, half of the nodes provide a summand of approximately n/2^(k+1). The depth of the tree is log₂(n).
So the value of f(n) is in total about Θ(n/2 ⋅ log₂(n)).
Just to notice: this holds for even and odd inputs, but for even inputs the value is bigger by an additional summand of about n/2. (I use Θ-notation so I don't have to think too much about the constants.)
Now to the complexity:
Naive brute force
To calculate f(n) you have to call f(n) Θ(2^(log₂(n))) = Θ(n) times.
So if you want to calculate the values of f(n) until you reach s (or notice that there is no n with f(n)=s) you have to calculate f(n) s⋅log₂(s) times, which is in total Θ(s²⋅log(s)).
Dynamic programming
If you store every result of f(n), the time to calculate f(n) reduces to Θ(1) (but it requires much more memory). So the total time complexity would reduce to Θ(s⋅log(s)).
Notice: Since we know f(n) ≤ f(n+2) for all n, you don't have to sort the values of f(n) in order to do a binary search.
Using binary search
Algorithm (input is s):
1. Set l = 1 and r = s.
2. Set n = (l+r)/2 and round it to the next even number.
3. Calculate val = f(n).
4. If val == s, then return n.
5. If val < s, then set l = n; else set r = n.
6. Go to step 2.
If you found a solution, fine. If not: try it again but round in step 2 to odd numbers. If this also does not return a solution, no solution exists at all.
This will take you Θ(log(s)) for the binary search and Θ(s) for the calculation of f(n) each time, so in total you get Θ(s⋅log(s)).
As you can see, this has the same complexity as the dynamic programming solution, but you don't have to save anything.
Notice: r = s does not hold as an initial upper limit for all s. However, if s is big enough, it does. To be safe, you can change the algorithm:
check first if f(s) < s. If it is, you can set l = s and r = 2s (or 2s+1 if it has to be odd).
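Here is a rough Python sketch of that binary-search idea (my own code, not part of the answer): it memoizes f with lru_cache, searches the even and odd indices separately (using f(n) <= f(n+2)), and replaces the r = s starting bound with a doubling search, which sidesteps the caveat above. If several n of the same parity shared the value s, this would report the smallest of them, so it is only a sketch of the search, not a full answer to "largest n".

from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    # the recurrence from the question
    if n <= 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        return f(m) + f(m + 1) + m       # f(2m) = f(m) + f(m+1) + m
    return f(m - 1) + f(m) + 1           # f(2m+1) = f(m-1) + f(m) + 1

def search(s, parity):
    # Binary search over k, where n = 2*k + parity and f is non-decreasing in k.
    g = lambda k: f(2 * k + parity)
    hi = 1
    while g(hi) < s:                     # doubling search for an upper bound on k
        hi *= 2
    lo = 0
    while lo < hi:
        mid = (lo + hi) // 2
        if g(mid) < s:
            lo = mid + 1
        else:
            hi = mid
    return 2 * lo + parity if g(lo) == s else None

print(search(7, 0), search(7, 1))        # f(4) = 7, so this prints 4 and None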
Can you calculate the value of f(x), for x from 0 to MAX_SIZE, just once?
What I mean is: calculate the values by DP.
f(0) = 1
f(1) = 1
f(2) = 2
f(3) = 3
f(4) = 7
f(5) = 4
... ...
f(MAX_SIZE) = ???
If the 1st step is illegal, exit. Otherwise, sort the values from small to big.
Such as 1,1,2,3,4,7,...
Now you can find whether there exists an n satisfying f(n) = s in O(log(MAX_SIZE)) time.
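A small Python sketch of this table-based idea (my own code; MAX_SIZE is an assumed cap, so it only finds answers below that cap, which is exactly the memory tradeoff mentioned in the other answers):

from bisect import bisect_right

MAX_SIZE = 1_000_000
f = [0] * MAX_SIZE
f[0], f[1], f[2] = 1, 1, 2
f[3] = f[0] + f[1] + 1                               # f(3) = f(0) + f(1) + 1
for m in range(2, MAX_SIZE // 2):
    f[2 * m] = f[m] + f[m + 1] + m
    f[2 * m + 1] = f[m - 1] + f[m] + 1

pairs = sorted((v, n) for n, v in enumerate(f))      # sorted by value, then by n
values = [v for v, _ in pairs]

def lookup(s):
    # Largest n < MAX_SIZE with f(n) == s, or None if the table has no match.
    i = bisect_right(values, s)
    if i > 0 and values[i - 1] == s:
        return pairs[i - 1][1]                       # last match has the largest n
    return None

print(lookup(7))                                     # f(4) = 7, so this prints 4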
Unfortunately, you don't mention how fast your algorithm should be. Perhaps you need to find some really clever rewrite of your formula to make it fast enough, in this case you might want to post this question on a mathematics forum.
The running time of your formula is O(n) for f(2n + 1) and O(n log n) for f(2n), according to the Master theorem, since:
T_even(n) = 2 * T(n / 2) + n / 2
T_odd(n) = 2 * T(n / 2) + 1
So the running time for the overall formula is O(n log n).
So if n is the answer to the problem, this algorithm would run in approx. O(n^2 log n), because you have to perform the formula roughly n times.
You can make this a little bit quicker by storing previous results, but of course, this is a tradeoff with memory.
Below is such a solution in Python.
D = {}

def f(n):
    if n in D:
        return D[n]
    if n == 0 or n == 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        # f(2n) = f(n) + f(n + 1) + n (for n > 1)
        y = f(m) + f(m + 1) + m
    else:
        # f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
        y = f(m - 1) + f(m) + 1
    D[n] = y
    return y

def find(s):
    n = 0
    y = 0
    even_sol = None
    while y < s:
        y = f(n)
        if y == s:
            even_sol = n
            break
        n += 2
    n = 1
    y = 0
    odd_sol = None
    while y < s:
        y = f(n)
        if y == s:
            odd_sol = n
            break
        n += 2
    print(s, even_sol, odd_sol)

find(9992)
This recurrence produces increasing values for both 2n and 2n+1, so the moment you get a value bigger than s you can stop your algorithm.
To make an effective algorithm you either have to find a nice formula that computes the value directly, or compute it in a small loop, which will be much, much more effective than your recursion. A naive recursion does roughly O(n) work for every single evaluation of f(n) (its call tree has about n nodes), whereas this loop computes all values up to n in O(n) total.
This is how the loop could look:
int[] values = new int[1000];
values[0] = 1;
values[1] = 1;
values[2] = 2;
values[3] = values[0] + values[1] + 1;                 // f(3) = f(0) + f(1) + 1
for (int i = 2; 2 * i + 1 < values.length; i++) {
    values[2 * i] = values[i] + values[i + 1] + i;     // f(2i) = f(i) + f(i+1) + i, for i > 1
    values[2 * i + 1] = values[i - 1] + values[i] + 1; // f(2i+1) = f(i-1) + f(i) + 1, for i >= 1
}
And inside this loop, add a condition that can break out of it with success or failure.

How do I evaluate and explain the running time for this algorithm?

Pseudo code :
s <- 0
for i = 1 to n do
    if A[i] = 1 then
        for j = 1 to n do
            {constant number of elementary operations}
        endfor
    else
        s <- s + A[i]
    endif
endfor
where A is an array of n integers, each of which is a random value between 1 and 6.
I'm at a loss here... Picking from my notes and some other sources online, I get
T(n) = C1(N) + C2 + C3(N) + C4(N) + C5
where C1(N) and C3(N) are the for loops, and C4(N) is the constant number of elementary operations. Though I have a strong feeling that I'm wrong.
You are looping from 1..n, and each loop ALSO loops from 1..n (in the worst case). O(n^2) right?
Put another way: You shouldn't be adding C3(N). That would be the case if you had two independent for loops from 1..n. But you have a nested loop. You will run the inner loop N times, and the outer loop N times. N*N = O(n^2).
Let's think for a second about what this algorithm does.
You are basically looping through every element in the array at least once (the outer for, the one with i). Sometimes, when the value A[i] is 1, you also run through the whole inner loop with the j for.
In your worst case scenario, you are running against an array of all 1's.
In that case, your complexity is:
- time to init s
- n * (time to test A[i] == 1 + n * time of {constant ...})
Which means: T = T(s) + n * (T(test) + n * T(const)) = n^2 * T(const) + n * T(test) + T(s).
Asymptotically, this is a O(n^2).
But this was a worst-case analysis (the one you should perform most of the times). What if we want an average case analysis?
In that case, assuming a uniform distribution of values in A, you are going to enter the j for loop, on average, 1/6 of the time.
So you'd get:
- time to init s
- n * (time to test A[i] == 1 + 1/6 * n * time of {constant ...} + 5/6 * T(increment s))
Which means: T = T(s) + n * (T(test) + 1/6 * n * T(const) + 5/6 * T(inc s)) = 1/6 * n^2 * T(const) + n * (5/6 * T(inc s) + T(test)) + T(s).
Again, asymptotically this is still O(n^2), but depending on the value of T(inc s) the total could be larger or smaller than in the other case.
Fun exercise: can you estimate the expected average run time for a generic distribution of values in A, instead of a uniform one?
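For the uniform case analysed above, here is a rough empirical check in Python (my own sketch; it only counts the work of the inner j loop, the dominant term): the average count over random dice-valued arrays should be close to n^2 / 6.

import random

def inner_ops(A):
    # Count the elementary operations done by the inner j loop.
    ops = 0
    for a in A:
        if a == 1:
            ops += len(A)          # the j loop does n constant-time operations
    return ops

random.seed(0)
n, trials = 2_000, 20
avg = sum(inner_ops([random.randint(1, 6) for _ in range(n)]) for _ in range(trials)) / trials
print(avg, n * n / 6)              # the two numbers should be of similar size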

What is the running time for this function?

I have 3 questions about this function:
Sum = 0
MyFunction(N)
    M = 1,000,000
    If (N > 1)
        For I = 1 to M do
            Sum = 0
            J = 1
            Do
                Sum = Sum + J
                J = J + 2
            While J < N
        End For
        If (MyFunction(N / 2) % 3 == 0)
            Return (2 * MyFunction(N / 2))
        Else
            Return (4 * MyFunction(N / 2))
        End If
    Else
        Return 1
    End If
End MyFunction
First question is : What's the Complexity of the non-recursive part of code?
I think the non-recursive part is this loop:
For I = 1 to M do
    Sum = 0
    J = 1
    Do
        Sum = Sum + J
        J = J + 2
    While J < N
End For
and my answer is M * log(n), but my slides say it's not M * log(n)!
I need an explanation for this.
The second question is: What's the correct recurrence for the previous code of MyFunction?
When I saw these lines of code
If (MyFunction(N / 2) % 3 == 0)
    Return (2 * MyFunction(N / 2))
Else
    Return (4 * MyFunction(N / 2))
End If
I think that it's T(n) = T(n/2) + Theta(non-recursive),
because the if will execute only one of the 2 calls.
Again this answer is wrong.
The third one is: What's the complexity of MyFunction?
My answer based on the 2 questions is T(n) = T(n/2) + M * lg n
and the total running time is M * lg n.
Let's look at this one piece at a time.
First, here's the non-recursive part of the code:
For I = 1 to M do
    Sum = 0
    J = 1
    Do
        Sum = Sum + J
        J = J + 2
    While J < N
End For
The outer loop will run Θ(M) times. Since M is a fixed constant (one million), the loop will run Θ(1) times.
Inside the loop, the inner while loop will run Θ(N) times, since on each iteration J increases by two and stops as soon as J meets or exceeds N. Therefore, the total work done by this loop nest is Θ(N): Θ(N) work Θ(1) times.
Now, let's look at this part:
If (MyFunction(N / 2) % 3 == 0)
    Return (2 * MyFunction(N / 2))
Else
    Return (4 * MyFunction(N / 2))
End If
The if statement makes one recursive call on an input of size N / 2 to evaluate the condition, and then, whichever branch is taken, there is always a second recursive call of size N / 2 (since you're not caching the result).
This gives the following recurrence relation for the runtime:
T(n) = 2T(n / 2) + Θ(n)
Using the Master Theorem, this solves to Θ(n log n).
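If you want to see that recurrence behave numerically, here is a tiny Python sketch (mine): it evaluates T(n) = 2*T(n/2) + n directly, and the ratio T(n) / (n * log2(n)) settles towards a constant, which is what the Θ(n log n) answer predicts.

from functools import lru_cache
import math

@lru_cache(maxsize=None)
def T(n):
    # the recurrence T(n) = 2 T(n/2) + n, with T(1) = 1
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

for k in range(10, 21, 2):
    n = 2 ** k
    print(n, T(n) / (n * math.log2(n)))    # ratio approaches a constant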
Hope this helps!

Resources