Calculating tilde-complexity of for-loop with cubic index - for-loop

Say I have following algorithm:
for(int i = 1; i < N; i *= 3) {
sum++
}
I need to calculate the complexity using tilde-notation, which basically means that I have to find a tilde-function so that when I divide the complexity of the algorithm by this tilde-function, the limit in infinity has to be 1.
I don't think there's any need to calculate the exact complexity, we can ignore the constants and then we have a tilde-complexity.
By looking at the growth off the index, I assume that this algorithm is
~ log N
But rather than having a binary logarithmic function, the base in this case is 3.
Does this matter for the exact notation? Is the order of growth exactly the same and thus can we ignore the base when using Tilde-notation? Do I approach this correctly?

You are right, the for loop executes ceil(log_3 N) times, where log_3 N denotes the base-3 logarithm of N.
No, you cannot ignore the base when using the tilde notation.
Here's how we can derive the time complexity.
We will assume that each iteration of the for loop costs C, for some constant C>0.
Let T(N) denote the number of executions of the for-loop. Since at j-th iteration the value of i is 3^j, it follows that the number of iterations that we make is the smallest j for which 3^j >= N. Taking base-3 logarithms of both sides we get j >= log_3 N. Because j is an integer, j = ceil(log_3 N). Thus T(N) ~ ceil(log_3 N).
Let S(N) denote the time complexity of the for-loop. The "total" time complexity is thus C * T(N), because the cost of each of T(N) iterations is C, which in tilde notation we can write as S(N) ~ C * ceil*(log_3 N).

Related

How are the following equivalent to O(N)

I am reading an example where the following are equivalent to O(N):
O(N + P), where P < N/2
O(N + log N)
Can someone explain in laymen terms how it is that the two examples above are the same thing as O(N)?
We always take the greater one in case of addition.
In both the cases N is bigger than the other part.
In first case P < N/2 < N
In second case log N < N
Hence the complexity is O(N) in both the cases.
Let f and g be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x -> infinite
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M |g(x)| for all x > x0
So in your case 1:
f(N) = N + P <= N + N/2
We could set M = 2 Then:
|f(N)| <= 3/2|N| <= 2|N| (N0 could any number)
So:
N+p = O(N)
In your second case, we could also set M=2 and N0=1 to satify that:
|N + logN| <= 2 |N| for N > 1
Big O notation usually only provides an upper bound on the growth rate of the function, wiki. Meaning for your both cases, as P < N and logN < N. So that O(N + P) = O(2N) = O(N), The same to O(N + log N) = O(2N) = O(N). Hope that can answer your question.
For the sake of understanding you can assume that O(n) represents that the complexity is of the order of n and also that O notation represents the upper bound(or the complexity in worst case). So, when I say that O(n+p) it represents that the order of n+p.
Let's assume that in worst case p = n/2, then what would be order of n+n/2? It would still be O(n), that is, linear because constants do form a part of the Big-O notation.
Similary, for O(n+logn) because logn can never be greater than n. So, overall complexity turns out to be linear.
In short
If N is a function and C is a constant:
O(N+N/2):
If C=2, then for any N>1 :
(C=2)*N > N+N/2,
2*N>3*N/2,
2> 3/2 (true)
O(N+logN):
If C=2, then for any N>2 :
(C=2)*N > N+logN,
2*N > N+logN,
2>(N+logN)/N,
2> 1 + logN/N (limit logN/N is 0),
2>1+0 (true)
Counterexample O(N^2):
No C exists such that C*N > N^2 :
C > N^2/N,
C>N (contradiction).
Boring mathematical part
I think the source of confusion is that equals sign in O(f(x))=O(N) does not mean equality! Usually if x=y then y=x. However consider O(x)=O(x^2) which is true, but reverse is false: O(x^2) != O(x)!
O(f(x)) is an upper bound of how fast a function is growing.
Upper bound is not an exact value.
If g(x)=x is an upper bound for some function f(x), then function 2*g(x) (and in general anything growing faster than g(x)) is also an upper bound for f(x).
The formal definition is: for function f(x) to be bound by some other function g(x) if you chose any constant C then, starting from some x_0 g(x) is always greater than f(x).
f(x)=N+N/2 is the same as 3*N/2=1.5*N. If we take g(x)=N and our constant C=2 then 2*g(x)=2*N is growing faster than 1.5*N:
If C=2 and x_0=1 then for any n>(x_0=1) 2*N > 1.5*N.
same applies to N+log(N):
C*N>N+log(N)
C>(N+logN)/N
C>1+log(N)/N
...take n_0=2
C>1+1/2
C>3/2=1.5
use C=2: 2*N>N+log(N) for any N>(n_0=2),
e.g.
2*3>3+log(3), 6>3+1.58=4.68
...
2*100>100+log(100), 200>100+6.64
...
Now interesting part is: no such constant exist for N & N^2. E.g. N squared grows faster than N:
C*N > N^2
C > N^2/N
C > N
obviously no single constant exists which is greater than a variable. Imagine such a constant exists C=C_0. Then starting from N=C_0+1 function N is greater than constant C, therefore such constant does not exist.
Why is this useful in computer science?
In most cases calculating exact algorithm time or space does not make sense as it would depend on hardware speed, language overhead, algorithm implementation details and many other factors.
Big O notation provides means to estimate which algorithm is better independently from real world complications. It's easy to see that O(N) is better than O(N^2) starting from some n_0 no matter which constants are there in front of two functions.
Another benefit is ability to estimate algorithm complexity by just glancing at program and using Big O properties:
for x in range(N):
sub-calc with O(C)
has complexity of O(N) and
for x in range(N):
sub-calc with O(C_0)
sub-calc with O(C_1)
still has complexity of O(N) because of "multiplication by constant rule".
for x in range(N):
sub-calc with O(N)
has complexity of O(N*N)=O(N^2) by "product rule".
for x in range(N):
sub-calc with O(C_0)
for y in range(N):
sub-calc with O(C_1)
has complexity of O(N+N)=O(2*N)=O(N) by "definition (just take C=2*C_original)".
for x in range(N):
sub-calc with O(C)
for x in range(N):
sub-calc with O(N)
has complexity of O(N^2) because "the fastest growing term determines O(f(x)) if f(x) is a sum of other functions" (see explanation in the mathematical section).
Final words
There is much more to Big-O than I can write here! For example in some real world applications and algorithms beneficial n_0 might be so big that an algorithm with worse complexity works faster on real data.
CPU cache might introduce unexpected hidden factor into otherwise asymptotically good algorithm.
Etc...

Why is the Big-O complexity of this algorithm O(n^2)?

I know the big-O complexity of this algorithm is O(n^2), but I cannot understand why.
int sum = 0;
int i = 1; j = n * n;
while (i++ < j--)
sum++;
Even though we set j = n * n at the beginning, we increment i and decrement j during each iteration, so shouldn't the resulting number of iterations be a lot less than n*n?
During every iteration you increment i and decrement j which is equivalent to just incrementing i by 2. Therefore, total number of iterations is n^2 / 2 and that is still O(n^2).
big-O complexity ignores coefficients. For example: O(n), O(2n), and O(1000n) are all the same O(n) running time. Likewise, O(n^2) and O(0.5n^2) are both O(n^2) running time.
In your situation, you're essentially incrementing your loop counter by 2 each time through your loop (since j-- has the same effect as i++). So your running time is O(0.5n^2), but that's the same as O(n^2) when you remove the coefficient.
You will have exactly n*n/2 loop iterations (or (n*n-1)/2 if n is odd).
In the big O notation we have O((n*n-1)/2) = O(n*n/2) = O(n*n) because constant factors "don't count".
Your algorithm is equivalent to
while (i += 2 < n*n)
...
which is O(n^2/2) which is the same to O(n^2) because big O complexity does not care about constants.
Let m be the number of iterations taken. Then,
i+m = n^2 - m
which gives,
m = (n^2-i)/2
In Big-O notation, this implies a complexity of O(n^2).
Yes, this algorithm is O(n^2).
To calculate complexity, we have a table the complexities:
O(1)
O(log n)
O(n)
O(n log n)
O(n²)
O(n^a)
O(a^n)
O(n!)
Each row represent a set of algorithms. A set of algorithms that is in O(1), too it is in O(n), and O(n^2), etc. But not at reverse. So, your algorithm realize n*n/2 sentences.
O(n) < O(nlogn) < O(n*n/2) < O(n²)
So, the set of algorithms that include the complexity of your algorithm, is O(n²), because O(n) and O(nlogn) are smaller.
For example:
To n = 100, sum = 5000. => 100 O(n) < 200 O(n·logn) < 5000 (n*n/2) < 10000(n^2)
I'm sorry for my english.
Even though we set j = n * n at the beginning, we increment i and decrement j during each iteration, so shouldn't the resulting number of iterations be a lot less than n*n?
Yes! That's why it's O(n^2). By the same logic, it's a lot less than n * n * n, which makes it O(n^3). It's even O(6^n), by similar logic.
big-O gives you information about upper bounds.
I believe you are trying to ask why the complexity is theta(n) or omega(n), but if you're just trying to understand what big-O is, you really need to understand that it gives upper bounds on functions first and foremost.

Finding big-omega of an algorithm for maximum subarray sum

I have an algorithm and I am trying to figure out the best-case scenario (asymptotic notation) for this algorithm that solves the maximum subarray sum problem:
-- Pseudocode --
// Input: An n-element array A of numbers, indexed from 1 to n.
// Output: The maximum subarray sum of Array A.
Algorithm MaxSubSlow(A):
m = 0.
for j = 1 to n do:
for k = j to n do:
s = 0
for i = j to k do:
s = s + A[i]
if s > m then:
m = s
return m
Looking at the algorithm, using asymptotic notation math it is easy to determine the worst-case scenario (each loop runs, in its worst case, all n times) so the worst-case complexity class would be O(N^3).
However, my textbook states that this algorithm also runs in Big-Omega(N^3) time; that is, the lower bound is equal to its upper bound. It does not offer an explanation as to why, though.
How would you formally calculate and prove this? Do you have to prove that for the algorithm, there is a subset of numbers (i, j, k) such that each loop in the algorithm will run at least n times? If so, how do you do that?
Intuitively, even if you change the input keeping the same size n, it performs the same number of operations: the number of operation only depends from the input size, not the specific input. So best and worst case scenario are the same.
The computation of o() is exactly the same as O() in this case

Algorithm Running Time for O(n.m^2)

I would like to know, because I couldn't find any information online, how is an algorithm like O(n * m^2) or O(n * k) or O(n + k) supposed to be analysed?
Does only the n count?
The other terms are superfluous?
So O(n * m^2) is actually O(n)?
No, here the k and m terms are not superfluous,they do have a valid existence and essential for computing time complexity. They are wrapped together to provide a concrete-complexity to the code.
It may seem like the terms n and k are independent to each other in the code,but,they both combinedly determines the complexity of the algorithm.
Say, if you've to iterate a loop of size n-elements, and, in between, you have another loop of k-iterations, then the overall complexity turns O(nk).
Complexity of order O(nk), you can't dump/discard k here.
for(i=0;i<n;i++)
for(j=0;j<k;j++)
// do something
Complexity of order O(n+k), you can't dump/discard k here.
for(i=0;i<n;i++)
// do something
for(j=0;j<k;j++)
// do something
Complexity of order O(nm^2), you can't dump/discard m here.
for(i=0;i<n;i++)
for(j=0;j<m;j++)
for(k=0;k<m;k++)
// do something
Answer to the last question---So O(n.m^2) is actually O(n)?
No,O(nm^2) complexity can't be reduced further to O(n) as that would mean m doesn't have any significance,which is not the case actually.
FORMALLY: O(f(n)) is the SET of ALL functions T(n) that satisfy:
There exist positive constants c and N such that, for all n >= N,
T(n) <= c f(n)
Here are some examples of when and why factors other than n matter.
[1] 1,000,000 n is in O(n). Proof: set c = 1,000,000, N = 0.
Big-Oh notation doesn't care about (most) constant factors. We generally leave constants out; it's unnecessary to write O(2n), because O(2n) = O(n). (The 2 is not wrong; just unnecessary.)
[2] n is in O(n^3). [That's n cubed]. Proof: set c = 1, N = 1.
Big-Oh notation can be misleading. Just because an algorithm's running time is in O(n^3) doesn't mean it's slow; it might also be in O(n). Big-Oh notation only gives us an UPPER BOUND on a function.
[3] n^3 + n^2 + n is in O(n^3). Proof: set c = 3, N = 1.
Big-Oh notation is usually used only to indicate the dominating (largest
and most displeasing) term in the function. The other terms become
insignificant when n is really big.
These aren't generalizable, and each case may be different. That's the answer to the questions: "Does only the n count? The other terms are superfluous?"
Although there is already an accepted answer, I'd still like to provide the following inputs :
O(n * m^2) : Can be viewed as n*m*m and assuming that the bounds for n and m are similar then the complexity would be O(n^3).
Similarly -
O(n * k) : Would be O(n^2) (with the bounds for n and k being similar)
and -
O(n + k) : Would be O(n) (again, with the bounds for n and k being similar).
PS: It would be better not to assume the similarity between the variables and to first understand how the variables relate to each other (Eg: m=n/2; k=2n) before attempting to conclude.

What's the big O of this little code snippet?

for i := 1 to n do
j := 2;
while j < i do
j := j^4;
I'm really confused when it comes to Big-O notation, so I'd like to know if it's O(n log n). That's my gut, but I can't prove it. I know the while loop is probably faster than log n, but I don't know by how much!
Edit: the caret denotes exponent.
The problem is the number of iterations the while loop is executed for a given i.
On every iteration j:= j^4 and at the beginning j := 2, so after x iterations j = 24^x
j < i is equivalent to x < log_4(log_2(i))
I'd risk a statement, that the complexity is O(n * log_4(log_2(n)))
You can get rid of constant factors in Big O notation. log_4(x) = log(x) / log(4) and log(4) is a constant. Similarly you can change log_2(x) to log(x). The complexity can be expressed as O(n*log(log(n)))
Off the cuff, I'd guess is it is O(n slog4 n) where slog represents the inverse of the tetration operator. Tetration is the next operation after exponentiation. Just like multiplication is iterated addition, and exponentiation is iterated multiplication, tetration is iterated exponentiation.
My reasoning is, if you multiplied j by 4 each iteration then the function would be O(n log4 n). But since you are exponentiating it each iteration, you need a correspondingly more powerful operator than log: slog.

Resources