Big-O time complexity, nested for and while loop - algorithm

I am trying to understand Big-O time complexity and am unfortunately struggling; I cannot seem to grasp the concept. I know my results are correct for the following two code fragments, but how I got there seems to be wrong. Would someone be able to explain where I'm misunderstanding (please, though, without using sigma notation)? Thank you!
Code Fragment                                  Time Complexity
sum ← 0                                        O(1)
for i ← 1 to n do                              O(n)
    for j ← 1 to n do                          O(n)
        k ← 1                                  O(1)
        while k < n do
            k ← k * C                          O(log n) - affected by C (due to multiplication)
            sum ← sum + 1                      O(1)
-----------
O(1 x n x n x 1 x [log n] x 1)
O(n^2 log n)
Code Fragment                                  Time Complexity
sum ← 0                                        O(1)
for i ← 1 to n do                              O(n)
    for j ← 1 to i do                          O(n)
        k ← 1                                  O(1)
        while k < n do
            k ← k + C                          O(n) - not affected by C, since k grows by a constant each time (about n/C iterations)
            sum ← sum + 1                      O(1)
-----------
O(1 x n x n x 1 x n x 1)
O(n^3)

I see minor errors in the computation, though the final results are correct.
In the first algorithm:
O(1 x n x n x 1 x [log n] x 1)
should be
1 + n x n x (1 + (O(log n) x 2)) = O(n^2 * log n)
In the second algorithm:
O(1 x n x n x 1 x n x 1)
should be
1 + n x O(n) x (1 + (O(n) x 2)) = O(n^3)
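If it helps to see the counts concretely, here is a small Python sketch (my own addition, using a hypothetical constant C = 2 and a made-up helper name count_fragment_one) that counts how many times the innermost statement of the first fragment executes and compares the count against n^2 log2 n:

import math

def count_fragment_one(n, C=2):
    # counts executions of the innermost statement "sum <- sum + 1"
    count = 0
    for i in range(1, n + 1):          # for i <- 1 to n
        for j in range(1, n + 1):      # for j <- 1 to n
            k = 1
            while k < n:               # about log_C(n) iterations
                k = k * C
                count += 1
    return count

for n in (10, 100, 500):
    print(n, count_fragment_one(n), round(n * n * math.log2(n)))

The exact counts and n^2 log2 n grow at the same rate, matching the O(n^2 log n) result.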


Why is m + k log m = O(m + k log k)?

Paredes and Navarro state that
m + k log m = O(m + k log k)
This gives an immediate "tighter looking" bound for incremental sorting. That is, if a partial or incremental sorting algorithm is O(m + k log m), then it is automatically O(m + k log k), where the k smallest elements are sorted from a set of size m. Unfortunately, their explanation is rather difficult for me to understand. Why does it hold?
Specifically, they state
Note that m + k log m = O(m + k log k), as they can differ only if k = o(m^α) for any α > 0, and then m dominates k log m.
This seems to suggest they're talking about k as a function of m along some path, but it's very hard to see how k = o(m^α) plays into things, or where to place the quantifiers in their statement.
There are various ways to define big-O notation for multi-variable functions, which would seem to make the question difficult to approach. Fortunately, it doesn't actually matter exactly which definition you pick, as long as you make the entirely reasonable assumption that m > 0 and k >= 1. That is, in the incremental sorting context, you assume that you need to obtain at least the first element from a set with at least one element.
Theorem
If m and k are real numbers, m > 0, and k >= 1, then m + k log m <= 2(m + k log k).
Proof
Suppose for the sake of contradiction that
m + k log m > 2(m + k log k)
Rearranging terms,
k log m - 2k log k > m
By the power property for logarithms,
k log m - k (log (k^2)) > m
By the quotient property for logarithms,
k (log (m / k^2)) > m
Dividing by k (which is positive),
log (m / k^2) > m/k
Since k >= 1, k^2 >= k, so (since m >= 0) m / k >= m / k^2. Thus
log (m / k^2) > m / k^2
The logarithm of a number can never exceed that number, so we have reached a contradiction.
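For what it's worth, here is a quick numerical sanity check of the theorem (my own sketch, not part of the proof), evaluating both sides on a small grid of m > 0 and k ≥ 1:

import math

# check m + k*log(m) <= 2*(m + k*log(k)) on a grid of m > 0, k >= 1
for m in [0.5, 1, 2, 10, 100, 10**6]:
    for k in [1, 2, 10, 100, 10**6]:
        assert m + k * math.log(m) <= 2 * (m + k * math.log(k)), (m, k)
print("inequality held for all sampled (m, k)")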

How does my randomly partitioned array look in the general case?

I have an array of n random integers
I choose a random integer and partition by the chosen random integer (all integers smaller than the chosen integer will be on the left side, all bigger integers will be on the right side)
What will be the size of my left and right side in the average case, if we assume no duplicates in the array?
I can easily see that there is a 1/n chance that the array is split exactly in half, if we are lucky. Additionally, there is a 1/n chance that the array is split so that the left side has length n/2 − 1 and the right side has length n/2 + 1, and so on.
Could we derive from this observation the "average" case?
You can probably find a better explanation (and certainly the proper citations) in a textbook on randomized algorithms, but here's the gist of average-case QuickSort, in two different ways.
First way
Let C(n) be the expected number of comparisons required on average for a random permutation of 1...n. Since the expectation of the sum of the number of comparisons required for the two recursive calls equals the sum of the expectations, we can write a recurrence that averages over the n possible divisions:
C(0) = 0
C(n) = n − 1 + (1/n) sum_(i=0)^(n−1) (C(i) + C(n−1−i))
Rather than pull the exact solution out of a hat (or peek at the second way), I'll show you how I'd get an asymptotic bound.
First, I'd guess the asymptotic bound. Obviously I'm familiar with QuickSort and my reasoning here is fabricated, but since the best case is O(n log n) by the Master Theorem, that's a reasonable place to start.
Second, I'd guess an actual bound: 100 n log (n + 1). I use a big constant because why not? It doesn't matter for asymptotic notation and can only make my job easier. I use log (n + 1) instead of log n because log n is undefined for n = 0, and 0 log (0 + 1) = 0 covers the base case.
Third, let's try to verify the inductive step. Assuming that C(i) ≤ 100 i log (i + 1) for all i ∈ {0, ..., n−1},
C(n) = n − 1 + (1/n) sum_(i=0)^(n−1) (C(i) + C(n−1−i))            [by definition]
     = n − 1 + (2/n) sum_(i=0)^(n−1) C(i)                          [by symmetry]
     ≤ n − 1 + (2/n) sum_(i=0)^(n−1) 100 i log(i + 1)              [by the inductive hypothesis]
     ≤ n − 1 + (2/n) ∫_0^n 100 x log(x + 1) dx                     [upper Darboux sum]
     = n − 1 + (2/n) (50 (n² − 1) log(n + 1) − 25 (n − 2) n)       [WolframAlpha FTW, I forgot how to integrate]
     = n − 1 + 100 (n − 1/n) log(n + 1) − 50 (n − 2)
     = 100 (n − 1/n) log(n + 1) − 49 n + 99.
Well, that's irritating. It's almost what we want, but the + 99 messes things up a little bit. We can extend the base cases to n = 1 and n = 2 by inspection and then assume that n ≥ 3 to finish the bound:
C(n) = 100 (n − 1/n) log(n + 1) − 49 n + 99
     ≤ 100 n log(n + 1) − 49 n + 99
     ≤ 100 n log(n + 1).          [since n ≥ 3 implies 49 n ≥ 99]
Once again, no one would publish such a messy derivation. I wanted to show how one could work it out formally without knowing the answer ahead of time.
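To sanity-check the bound numerically, here is a small Python sketch (my addition, with a made-up helper name expected_comparisons) that computes C(n) exactly from the recurrence and verifies C(n) ≤ 100 n log(n + 1) for small n, using the natural logarithm:

import math

def expected_comparisons(N):
    # C[n] = n - 1 + (1/n) * sum_{i=0}^{n-1} (C[i] + C[n-1-i]), with C[0] = 0
    C = [0.0] * (N + 1)
    for n in range(1, N + 1):
        C[n] = n - 1 + (1.0 / n) * sum(C[i] + C[n - 1 - i] for i in range(n))
    return C

C = expected_comparisons(200)
assert all(C[n] <= 100 * n * math.log(n + 1) for n in range(201))
print(C[200])   # grows like 2 n ln n, comfortably under the bound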
Second way
How else can we derive how many comparisons QuickSort does in expectation? Another possibility is to exploit the linearity of expectation by summing, over each pair of elements, the probability that those elements are compared. What is that probability? We observe that a pair {i, j} is compared if and only if, at the leaf-most invocation where both i and j are still in the subarray, either i or j is chosen as the pivot. This happens with probability 2/(j + 1 − i), since the pivot is equally likely to be i, j, or any of the j − (i + 1) elements that fall between them. Therefore,
C(n) = sum_(i=1)^n sum_(j=i+1)^n 2/(j + 1 − i)
     = sum_(i=1)^n sum_(d=2)^(n+1−i) 2/d
     = sum_(i=1)^n 2 (H(n+1−i) − 1)          [where H(k) is the k-th harmonic number]
     = 2 sum_(i=1)^n H(i) − 2n
     = 2 (n + 1) (H(n+1) − 1) − 2n.          [WolframAlpha FTW again]
Since H(n) is Θ(log n), this is Θ(n log n), as expected.
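Here is another small sketch (my addition) that checks the closed form 2 (n + 1) (H(n + 1) − 1) − 2 n against the recurrence from the first way, for small n:

def harmonic(k):
    # H(k) = 1 + 1/2 + ... + 1/k
    return sum(1.0 / d for d in range(1, k + 1))

def closed_form(n):
    return 2 * (n + 1) * (harmonic(n + 1) - 1) - 2 * n

N = 50
C = [0.0] * (N + 1)
for n in range(1, N + 1):   # the recurrence from the first way
    C[n] = n - 1 + (1.0 / n) * sum(C[i] + C[n - 1 - i] for i in range(n))

assert all(abs(C[n] - closed_form(n)) < 1e-9 for n in range(N + 1))
print(closed_form(N))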

Why does this algorithm run in O(n log n)?

We're supposed to "consider the following algorithm, which operates on an array A[1 . . n] of integers."
for i in [1 . . n] do
    A[i] ← 0
for i in [1 . . n] do
    j ← i
    while j ≤ n do
        A[j] ← A[j] + 1
        j ← j + i
The assignment asks us to demonstrate that this algorithm runs in O(n log n).
The first loop quite clearly adds n to the run time, which would simply be dropped as a lower-order term.
The nested loops will run faster than a pure O(n^2) algorithm, since the while loop doesn't always run n times. When i = 1 it runs n times, when i = 2 it runs n − 1 times, and so on up to i = n, where it runs once.
But, using the same method as Gauss summing the integers between 1 and 100, we can see that the while loop will run an average of (n+1)/2 times. Multiplying by n for the for loop, we get (n^2 + n)/2, which simplifies to O(n^2), not O(n log n).
How does this result in an O(n log n) running time?
Consider the following:
for i in [1 . . n] do
    j ← i
    while j ≤ n do
        A[j] ← A[j] + 1
        j ← j + i
Each time, j is incremented by i, not by 1. As such, the while loop will iterate n times for i = 1, then roughly n/2 times, then n/3 times, ... down to n/n times for i = n.
Ignoring integer rounding, we can write this in the following format:
n(1 + 1/2 + 1/3 + ... + 1/n).
We are dealing with the harmonic series, multiplied by an extra factor of n. Since the harmonic series 1 + 1/2 + ... + 1/n is Θ(log n), the total is O(n log n).
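A small Python sketch (my addition, with made-up helper names) that counts the inner-loop iterations and compares them with the harmonic estimate n · H(n):

def count_iterations(n):
    # counts iterations of the inner while loop (A[j] updates)
    count = 0
    for i in range(1, n + 1):
        j = i
        while j <= n:
            count += 1
            j += i
    return count

def harmonic(n):
    return sum(1.0 / d for d in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    print(n, count_iterations(n), round(n * harmonic(n)))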

3D Matrix traversal Big-O

My attempt at the Big-O of each of these two algorithms:
1) Algorithm threeD(matrix, n)
    // a 3D matrix of size n x n x n
    layer ← 0
    while (layer < n)
        row ← 0
        while (row < layer)
            col ← 0
            while (col < row)
                print matrix[layer][row][col]
                col ← col + 1
            done
            row ← row + 1
        done
        layer ← layer * 2
    done
O((n^2)log(n)) because the two outer loops are each O(N) and the innermost one seems to be O(log n)
2) Algorithm Magic(n)
    // Integer, n > 0
    i ← 0
    while (i < n)
        j ← 0
        while (j < power(2, i))
            j ← j + 1
        done
        i ← i + 1
    done
O(N) for outer loop, O(2^n) for inner? = O(n(2^n))?
1. Algorithm
First of all: this algorithm never terminates, because layer is initialized to zero. layer is only ever multiplied by 2, so it will never get bigger than zero, and in particular never bigger than n.
To make it work, you have to start with layer > 0.
So let's start with layer = 1.
The time complexity can then be written as T(n) = T(n/2) + n^2.
You can see this as follows: at the end, layer is at most n, and the two inner loops then do about n^2 steps. Before that, layer was only half as big. So you do the n^2 steps in the last round of the outer loop, plus all the steps of the rounds before, written as T(n/2).
The Master Theorem gives Θ(n^2).
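As a sanity check (my own sketch, not part of the original answer), here is the algorithm with the layer = 1 fix, instrumented to count executions of the print statement; the count stays within a constant factor of n^2:

def count_threeD(n):
    count = 0
    layer = 1                  # the fix: start at 1, not 0
    while layer < n:
        row = 0
        while row < layer:
            col = 0
            while col < row:
                count += 1     # stands in for the print statement
                col += 1
            row += 1
        layer *= 2
    return count

for n in (16, 64, 256, 1024):
    print(n, count_threeD(n), n * n)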
2. Algorithm
You can just count the steps:
2^0 + 2^1 + 2^2 + ... + 2^(n−1) = sum_(i=0)^(n−1) 2^i = 2^n − 1
To see this simplification, just look at binary numbers: the sum of the steps corresponds to a binary number consisting of n ones (like 1111 1111), and such a number equals 2^n − 1.
So the time complexity is Theta(2^n).
Note: neither of your Big-O bounds is wrong; these are just tighter bounds.
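And a matching check for the second algorithm (again my own sketch, with a made-up helper name magic_steps): count the inner-loop steps of Magic(n) and verify that the total is exactly 2^n − 1:

def magic_steps(n):
    steps = 0
    i = 0
    while i < n:
        j = 0
        while j < 2 ** i:      # while (j < power(2, i))
            j += 1
            steps += 1
        i += 1
    return steps

for n in range(1, 16):
    assert magic_steps(n) == 2 ** n - 1
print("magic_steps(n) == 2^n - 1 for n = 1..15")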

Running time T(n) of algorithm

I have analyzed the running time of the following algorithm. I derived a Θ bound, but could its running time also be expressed with Big-O?
                                  Cost   Time
1. for i ← 1 to n                  c1     n
2.     do for j ← i to n           c2     n
3.         do k ← k + j            c3     n − 1

T(n) = c1·n + c2·n + c3·(n − 1)
     = n(c1 + c2) + n − 1
     = n + n − 1

Or T(n) = Θ(n), so the running time is Θ(n).
Your loops will execute sum_(i=1)^n (n − i + 1) = n(n + 1)/2 times (the well-known arithmetic progression formula), which can also be estimated as O(n^2), since big-O gives a majorizing (upper) estimate.

                                  Cost   Time
1. for i ← 1 to n                  c1     n
2.     do for j ← i to n           c2     n
3.         do k ← k + j            c3     1

T(n) = n · n · 1 = O(n^2)    (Giulio Franco)
It's a nested loop that performs a constant-time operation inside.
do k ← k + j is constant time because the operation takes a fixed amount of time regardless of the input values.

loop (n)
    loop (n)
        constant time (1)

When it's a loop inside a loop, you multiply: n · n · 1 = O(n^2).

loop (n)
loop (n)

These loops aren't nested; this would be n + n, i.e. O(n + n), which reduces to O(n).
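A short sketch (my addition, with a made-up helper name count_line3) that counts how many times line 3 (k ← k + j) actually executes; the total is n(n + 1)/2, which is Θ(n^2), consistent with the answers above:

def count_line3(n):
    count = 0
    for i in range(1, n + 1):
        for j in range(i, n + 1):      # do for j <- i to n
            count += 1                 # stands in for k <- k + j
    return count

for n in (10, 100, 1000):
    assert count_line3(n) == n * (n + 1) // 2
print("count matches n(n+1)/2, i.e. Theta(n^2)")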
