Big(O) for this algorithm

What is the big(O) of this algorithm?
I know that it is similar to O(log(n)), but instead of i being halved each time, it shrinks by an exponentially growing factor.
sum = 0
i = n
j = 2
while (i >= 1)
    sum = sum + i
    i = i / j
    j = 2 * j

The denominator d is
d := 2^(k * (k + 1) / 2)
in the k-th iteration of the loop. Thus you have to solve for when d becomes larger than n, which is when the fraction drops below 1:
2^(k * (k + 1) / 2) > n
for k and fixed n. Inserting
solve 2^(k * (k + 1) / 2) > n for k
in WolframAlpha gives (for n > 1) essentially
k > (sqrt(8 * log2(n) + 1) - 1) / 2
Thus, you have a running time of O(sqrt(log n)) for your algorithm, when you remove the irrelevant constants from the formula.
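To sanity-check this bound, here is a small Python sketch (not part of the original answer; the helper name iterations is mine, and it uses integer division as an approximation of the exact division in the analysis) that counts the loop iterations and compares them with sqrt(2 * log2(n)):

import math

def iterations(n):
    # count the loop iterations of the algorithm in the question
    i, j, count = n, 2, 0
    while i >= 1:
        count += 1
        i //= j        # integer division; the analysis above assumes exact division
        j *= 2
    return count

for n in (10**3, 10**6, 10**9, 10**12):
    print(n, iterations(n), math.sqrt(2 * math.log2(n)))   # the two counts track each other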


What's the complexity of f(n) = sum_{i=0}^{n} (a_i * i)?

This is a test question I failed because I thought this complexity would be O(n), but it appears I'm wrong and it's O(n^2). Why not O(n)?
First, notice that the question does not ask for the time complexity of a function calculating f(n), but rather the complexity of the function f(n) itself. You can think of f(n) as being the time complexity of some other algorithm if you are more comfortable talking about time complexity.
This is indeed O(n^2), provided the sequence a_i is bounded by a constant and each a_i is at least 1.
By the assumption, for all i, a_i <= c for some constant c.
Hence, a_1*1+...+a_n*n <= c * (1 + 2 + ... + n). Now we need to show that 1 + 2 +... + n = O(n^2) to complete the proof.
1 + 2 + ... + n <= n + n + ... + n = n * n = n ^ 2
and
1 + 2 + ... + n >= n / 2 + (n / 2 + 1) + ... + n >= (n / 2) * (n / 2) = n^2/4
So the complexity is actually Theta(n^2).
Note that if a_i were not bounded by a constant, e.g. a_i = i, then the result would not hold.
In that case, f(n) = 1^2 + 2^2 + ... + n^2, and you can show easily (using the same method as before) that f(n) = Omega(n^3), which means it is not O(n^2).
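As a quick numeric illustration of the two cases (my own addition, not part of the original answer), the following Python snippet compares a bounded choice of a_i with a growing one; the printed ratios settle near constants, matching Theta(n^2) and Theta(n^3) respectively:

def f(n, a):
    # f(n) = sum of a(i) * i for i = 1..n
    return sum(a(i) * i for i in range(1, n + 1))

for n in (10, 100, 1000):
    bounded = f(n, lambda i: 2)     # a_i = 2, bounded by a constant
    growing = f(n, lambda i: i)     # a_i = i, not bounded
    print(n, bounded / n**2, growing / n**3)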
Preface: I'm not super great with complexity theory, but I'll take a stab.
I think what is confusing is that it's not a time-complexity problem, but rather the complexity of the function itself.
For the easy part, i just goes up to n, i.e. 1, 2, 3, ..., n. For a_i, all entries must be positive, so a could be something like 2, 5, 1, ... for n entries. Summing n terms, each of size up to about a constant times n, gives roughly n * n = O(n^2).
Even in the best case, a_i = 1 for all i, the sum 1 + 2 + ... + n is still Theta(n^2); only the constant changes.
Unless a_i is itself allowed to grow with n, it's definitely O(n^2).
Here is another way to arrive at O(n*n), if the sum itself is to be returned as the result:
int sum = 0;
for (int i = 0; i <= n; i++) {
    for (int j = 0; j <= n; j++) {
        if (i == j) {
            sum += A[i] * j;
        }
    }
}
return sum;

Is there a better algorithm to assign numbers to combinations?

It is well known that Pascal's identity can be used to encode a combination of k elements out of n into a number from 0 to (n \choose k) - 1 (let's call this number a combination index) using a combinatorial number system. Assuming constant time for arithmetic operations, this algorithm takes O(n) time.†
I have an application where k ≪ n and an algorithm in O(n) time is infeasible. Is there an algorithm to bijectively assign a number between 0 and (n \choose k) - 1 to a combination of k elements out of n whose runtime is of order O(k) or similar? The algorithm does not need to compute the same mapping as the combinatorial number system, however, the inverse needs to be computable in a similar time complexity.
† More specifically, the algorithm computing the combination from the combination index runs in O(n) time. Computing the combination index from the combination works in O(k) time if you pre-compute the binomial coefficients.
This answer elaborates on an approach sketched in the comments.
For a given combination index N, to find the k-th digit we need to find c_k such that (c_k \choose k) <= N and ((c_k + 1) \choose k) > N.
Set P(i,k) = i!/(i-k)!.
P(i, k) = i * (i-1) * ... * (i-k+1)
substitute x = i - (k-1)/2
= (x+(k-1)/2) * (x+(k-1)/2-1) * ... * (x-(k-1)/2+1) * (x-(k-1)/2)
= (x^2 - ((k-1)/2)^2) * (x^2 - ((k-1)/2-1)^2) * ...
= x^k - sum(((k - 2i - 1)/2)^2) * x^(k-2) + O(x^(k-4))
= x^k - O(x^(k-2))
P(i, k) = (i - (k-1)/2)^k - O(i^(k-2))
From above inequality:
(c_k \choose k) <= N
P(c_k, k) <= N * k!
c_k ~= (N * k!)^(1/k) + (k-1)/2
I am not sure how large the O(c_k^(k-2)) part is. I suppose it doesn't have too much influence. If it is of the order of (c_k + 1)/(c_k - k + 1), then the approximation is very good. That is because:
((c_k+1) \choose k) = (c_k \choose k) * (c_k + 1) / (c_k - k + 1)
I would try an algorithm something like:
For given k:
    precalculate k!
For given N:
    for i in (k, ..., 1):
        calculate an estimate c_i = (N * i!)^(1/i) + (i - 1)/2
        (*) compare P(c_i, i) with N * i!
            if P(c_i, i) is smaller, check c_i + 1
            if it is larger, check c_i - 1
        repeat (*) until P(c_i, i) <= N * i! < P(c_i + 1, i)
        N = N - (c_i \choose i)   (i.e. N - P(c_i, i)/i!)
If the approximation is good, the number of correction steps is much smaller than k, so finding one digit is O(k).
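Here is a rough Python sketch of that digit-finding loop (my own; the helper names decode_combination and encode_combination are hypothetical, and the floating-point estimate would need an exact integer k-th root for very large N):

from math import comb, factorial

def decode_combination(N, k):
    # Recover digits c_k > c_{k-1} > ... > c_1 >= 0 with N = sum_i C(c_i, i),
    # starting from the estimate c_i ~ (N * i!)^(1/i) + (i - 1)/2 and correcting it locally.
    digits = []
    for i in range(k, 0, -1):
        c = int(round((N * factorial(i)) ** (1.0 / i) + (i - 1) / 2))
        c = max(c, i - 1)
        while comb(c, i) > N:          # correct the estimate downwards ...
            c -= 1
        while comb(c + 1, i) <= N:     # ... or upwards, until C(c, i) <= N < C(c+1, i)
            c += 1
        digits.append(c)
        N -= comb(c, i)
    return digits                      # digits[0] corresponds to i = k

def encode_combination(digits):
    # inverse mapping: combination (c_k, ..., c_1) -> index N
    k = len(digits)
    return sum(comb(c, k - j) for j, c in enumerate(digits))

# round-trip check for small parameters
n, k = 20, 3
for N in range(comb(n, k)):
    assert encode_combination(decode_combination(N, k)) == N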

What is the time complexity of the given snippet?

for i = 1 to n do
    for j = 1 to i do
        for k = 1 to j do
What is its time complexity in terms of 'n'?
The inner-most loop will obviously run j times. Assuming that it contains operations worth 1 time unit, this will be:
T_inner(j) = j
The middle loop will run i times, i.e.
T_middle(i) = Sum {j from 1 to i} T_inner(j)
= Sum {j from 1 to i} j
= i/2 * (1 + i)
Finally:
T_outer(n) = Sum {i from 1 to n} T_middle(i)
= Sum {i from 1 to n} (i/2 * (1 + i))
= 1/6 * n * (1 + n) * (2 + n)
= 1/6 n^3 + 1/2 n^2 + 1/3 n
And this is obviously O(n^3).
Note: This only counts the operations in the innermost block. It neglects the operations necessary to perform the loops themselves, but if you include those, you will see that the time complexity is the same.
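A quick way to convince yourself of the closed form is to count the innermost executions directly; this small Python check (mine, not from the original answer; count_ops is a made-up helper name) compares the brute-force count with n(n+1)(n+2)/6:

def count_ops(n):
    # count how often the innermost statement executes
    count = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            for k in range(1, j + 1):
                count += 1
    return count

for n in (1, 5, 10, 50):
    assert count_ops(n) == n * (n + 1) * (n + 2) // 6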

Efficient Algorithm to Solve a Recursive Formula

I am given a formula f(n) where f(n) is defined, for all non-negative integers, as:
f(0) = 1
f(1) = 1
f(2) = 2
f(2n) = f(n) + f(n + 1) + n (for n > 1)
f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
My goal is to find, for any given number s, the largest n where f(n) = s. If there is no such n return None. s can be up to 10^25.
I have a brute force solution using both recursion and dynamic programming, but neither is efficient enough. What concepts might help me find an efficient solution to this problem?
I want to add a little complexity analysis and estimate the size of f(n).
If you look at one recursive call of f(n), you notice that the input n is basically divided by 2 before f is called two more times, where one call always has an even and the other an odd input.
So the call tree is basically a binary tree, where at each depth k about half of the nodes contribute a summand of roughly n/2^(k+1). The depth of the tree is log₂(n).
So the value of f(n) is in total about Θ(n/2 ⋅ log₂(n)).
Just to note: this holds for even and odd inputs, but for even inputs the value is larger by an additional summand of about n/2. (I use Θ-notation so I do not have to think too much about constants.)
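If you want to see this growth numerically, here is a small self-contained Python check (my own addition, using a memoized version of the recurrence): it prints f(n)/(n·log₂(n)), and the ratio levels off toward a constant as n grows, consistent with f(n) = Θ(n log n).

from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def f(n):
    # the recurrence from the question
    if n <= 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    return f(m) + f(m + 1) + m if n % 2 == 0 else f(m - 1) + f(m) + 1

for n in (2**10, 2**14, 2**18, 2**22):
    print(n, f(n) / (n * log2(n)))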
Now to the complexity:
Naive brute force
To calculate f(n), the function f has to be called Θ(2^(log₂ n)) = Θ(n) times.
So if you want to calculate the values of f(n) until you reach s (or notice that there is no n with f(n)=s) you have to calculate f(n) s⋅log₂(s) times, which is in total Θ(s²⋅log(s)).
Dynamic programming
If you store every result of f(n), the time to calculate a single f(n) value reduces to Θ(1) (but it requires much more memory). So the total time complexity would reduce to Θ(s⋅log(s)).
Notice: Since we know f(n) ≤ f(n+2) for all n, you don't have to sort the values of f(n) and do a binary search.
Using binary search
Algorithm (input is s):
1. Set l = 1 and r = s.
2. Set n = (l + r)/2 and round it to the next even number.
3. Calculate val = f(n).
4. If val == s, then return n.
5. If val < s, then set l = n, else set r = n.
6. Go to step 2.
If you find a solution, fine. If not: try it again, but in step 2 round to odd numbers. If this also does not return a solution, no solution exists at all.
This will take you Θ(log(s)) for the binary search and Θ(s) for the calculation of f(n) each time, so in total you get Θ(s⋅log(s)).
As you can see, this has the same complexity as the dynamic programming solution, but you don't have to save anything.
Notice: r = s is not a valid initial upper limit for every s. However, if s is big enough, it is. To be safe, you can change the algorithm:
first check whether f(s) >= s; if it is, r = s works. If not (f(s) < s), you can set l = s and r = 2s (or 2s + 1 if it has to be odd).
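Here is a small Python sketch of this binary-search approach (my own addition, not the answerer's code; it memoizes f, searches each parity separately, uses r = 2s + parity as a generous upper bound following the note above, and steps to the largest index in a run of equal values since the question asks for the largest n):

from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    if n <= 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        return f(m) + f(m + 1) + m       # f(2m) = f(m) + f(m+1) + m, m > 1
    return f(m - 1) + f(m) + 1           # f(2m+1) = f(m-1) + f(m) + 1, m >= 1

def search_parity(s, parity):
    # binary search over n of one parity, relying on f(n) <= f(n + 2)
    lo, hi = parity, 2 * s + parity
    while lo <= hi:
        mid = (lo + hi) // 2
        if mid % 2 != parity:
            mid += 1
        val = f(mid)
        if val == s:
            while f(mid + 2) == s:       # move to the largest index with this value
                mid += 2
            return mid
        if val < s:
            lo = mid + 2
        else:
            hi = mid - 2
    return None

def find_largest(s):
    hits = [c for c in (search_parity(s, 0), search_parity(s, 1)) if c is not None]
    return max(hits) if hits else None

print(find_largest(9992))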
Can you calculate the value of f(x) for every x from 0 to MAX_SIZE just once?
What I mean is: calculate the values by DP.
f(0) = 1
f(1) = 1
f(2) = 2
f(3) = 3
f(4) = 7
f(5) = 4
... ...
f(MAX_SIZE) = ???
If this first step is not feasible, the approach does not apply. Otherwise, sort the values from small to big,
such as 1, 1, 2, 3, 4, 7, ...
Now you can find whether there exists an n with f(n) = s in O(log(MAX_SIZE)) time by binary search.
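A minimal Python sketch of this precompute-and-look-up idea (my own; MAX_SIZE is an assumed cap, so this only works when the sought n is known to be at most MAX_SIZE, which is not the case for s up to 10^25):

from bisect import bisect_left

MAX_SIZE = 10**5                               # assumed cap on n

f = [0] * (MAX_SIZE + 1)
f[0] = f[1] = 1
f[2] = 2
f[3] = f[0] + f[1] + 1                         # f(3) = f(2*1 + 1)
for m in range(2, MAX_SIZE // 2 + 1):
    if 2 * m <= MAX_SIZE:
        f[2 * m] = f[m] + f[m + 1] + m         # f[m + 1] is already filled, since m + 1 < 2m
    if 2 * m + 1 <= MAX_SIZE:
        f[2 * m + 1] = f[m - 1] + f[m] + 1

pairs = sorted((val, n) for n, val in enumerate(f))   # sort by value

def largest_n(s):
    # binary search for value s, then take the last (largest-n) match
    i = bisect_left(pairs, (s, -1))
    best = None
    while i < len(pairs) and pairs[i][0] == s:
        best = pairs[i][1]
        i += 1
    return best

print(largest_n(9992))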
Unfortunately, you don't mention how fast your algorithm should be. Perhaps you need to find some really clever rewrite of your formula to make it fast enough, in this case you might want to post this question on a mathematics forum.
The running time of your formula is O(n) for f(2n + 1) and O(n log n) for f(2n), according to the Master theorem, since:
T_even(n) = 2 * T(n / 2) + n / 2
T_odd(n) = 2 * T(n / 2) + 1
So the running time for the overall formula is O(n log n).
So if n is the answer to the problem, this algorithm would run in approx. O(n^2 log n), because you have to perform the formula roughly n times.
You can make this a little bit quicker by storing previous results, but of course, this is a tradeoff with memory.
Below is such a solution in Python.
D = {}

def f(n):
    if n in D:
        return D[n]
    if n == 0 or n == 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        # f(2n) = f(n) + f(n + 1) + n (for n > 1)
        y = f(m) + f(m + 1) + m
    else:
        # f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
        y = f(m - 1) + f(m) + 1
    D[n] = y
    return y

def find(s):
    n = 0
    y = 0
    even_sol = None
    while y < s:
        y = f(n)
        if y == s:
            even_sol = n
            break
        n += 2
    n = 1
    y = 0
    odd_sol = None
    while y < s:
        y = f(n)
        if y == s:
            odd_sol = n
            break
        n += 2
    print(s, even_sol, odd_sol)
find(9992)
The values of this recurrence increase along both the 2n and the 2n+1 branch, so as soon as you reach a value bigger than s you can stop your algorithm.
To make an effective algorithm you either have to find a nice closed formula that calculates the value directly, or compute the values in a small loop, which is much, much more effective than the plain recursion: recomputing f(n) recursively without memoization costs about Θ(n) calls per evaluation, while a single bottom-up loop computes all values up to n in O(n) total.
This is how the loop can look:
int[] values = new int[1000];          // use long or BigInteger for larger values
values[0] = 1;
values[1] = 1;
values[2] = 2;
values[3] = values[0] + values[1] + 1; // f(3) = f(0) + f(1) + 1
for (int i = 2; i < values.length / 2 - 1; i++) {
    values[2 * i] = values[i] + values[i + 1] + i;
    values[2 * i + 1] = values[i - 1] + values[i] + 1;
}
Inside this loop, add a condition to break out of it early on success or failure.

complexity of a randomized search algorithm

Consider the following randomized search algorithm on a sorted array a of length n (in increasing order). x can be any element of the array.
size_t randomized_search(value_t a[], size_t n, value_t x)
{
    size_t l = 0;
    size_t r = n - 1;
    while (true) {
        size_t j = rand_between(l, r);
        if (a[j] == x) return j;
        if (a[j] < x) l = j + 1;
        if (a[j] > x) r = j - 1;
    }
}
What is the expected complexity, in Big Theta terms (bounded both below and above), of this function when x is selected uniformly at random from a?
Although this seems to be log(n), I carried out an experiment with instruction counts and found that the result grows a little faster than log(n) (according to my data, (log(n))^1.1 fits the result even better).
Someone told me that this algorithm has an exact Big Theta complexity (so obviously log(n)^1.1 is not the answer). So, could you please give the time complexity along with your approach to proving it? Thanks.
Update: the data from my experiment, with a log(n) fit and a log(n)^1.1 fit produced in Mathematica (plots not reproduced here).
If you're willing to switch to counting three-way compares, I can tell you the exact complexity.
Suppose that the key is at position i, and I want to know the expected number of compares with position j. I claim that position j is examined if and only if it's the first position between i and j inclusive to be examined. Since the pivot element is selected uniformly at random each time, this happens with probability 1/(|i - j| + 1).
The total complexity is the expectation over i <- {1, ..., n} of sum_{j=1}^n 1/(|i - j| + 1), which is
sum_{i=1}^n 1/n sum_{j=1}^n 1/(|i - j| + 1)
= 1/n sum_{i=1}^n (sum_{j=1}^i 1/(i - j + 1) + sum_{j=i+1}^n 1/(j - i + 1))
= 1/n sum_{i=1}^n (H(i) + H(n + 1 - i) - 1)
= 1/n sum_{i=1}^n H(i) + 1/n sum_{i=1}^n H(n + 1 - i) - 1
= 1/n sum_{i=1}^n H(i) + 1/n sum_{k=1}^n H(k) - 1 (k = n + 1 - i)
= 2 H(n + 1) - 3 + 2 H(n + 1)/n - 2/n
= 2 H(n + 1) - 3 + O(log n / n)
= 2 log n + O(1)
= Theta(log n).
(log means natural log here.) Note the -3 in the low order terms. This makes it look like the number of compares is growing faster than logarithmic at the beginning, but the asymptotic behavior dictates that it levels off. Try excluding small n and refitting your curves.
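If you want to check the closed form 2 H(n + 1) - 3 + 2 H(n + 1)/n - 2/n numerically, here is a small simulation (my own addition; compares_to_find and harmonic are made-up helper names) that mirrors the C snippet from the question and counts one three-way compare per loop iteration:

import random

def compares_to_find(a, x):
    # count loop iterations (three-way compares) of the randomized search
    l, r = 0, len(a) - 1
    count = 0
    while True:
        j = random.randint(l, r)
        count += 1
        if a[j] == x:
            return count
        if a[j] < x:
            l = j + 1
        else:
            r = j - 1

def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))

n, trials = 1000, 20000
a = list(range(n))
empirical = sum(compares_to_find(a, random.choice(a)) for _ in range(trials)) / trials
H = harmonic(n + 1)
predicted = 2 * H - 3 + 2 * H / n - 2 / n
print(empirical, predicted)    # the two averages should agree closely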
Assuming rand_between to implement sampling from a uniform probability distribution in constant time, the expected running time of this algorithm is Θ(lg n). Informal sketch of a proof: the expected value of rand_between(l, r) is (l+r)/2, the midpoint between them. So each iteration is expected to skip half of the array (assuming the size is a power of two), just like a single iteration of binary search would.
More formally, borrowing from an analysis of quickselect, observe that when you pick a random pivot, half of the time it lies between ¼n and ¾n. In that case, neither the left nor the right subarray has more than ¾n elements. The other half of the time, neither has more than n elements (obviously). That leads to the recurrence relation
T(n) = ½T(¾n) + ½T(n) + f(n)
where f(n) is the amount of work in each iteration. Subtracting ½T(n) from both sides, then doubling both sides, we have
½T(n) = ½T(¾n) + f(n)
T(n) = T(¾n) + 2f(n)
Now, since 2f(n) = Θ(1) = Θ(n ᶜ log⁰ n) where c = log(1) = 0, it follows by the master theorem that T(n) = Θ(n⁰ lg n) = Θ(lg n).
