Time complexity for a specific recursive algorithm

How should one go about finding the time complexity of sum1?
func sum1(x []int) int {
    // Returns the sum of all the elements in the list x.
    return sum(x, 0, len(x)-1)
}

func sum(x []int, i int, j int) int {
    // Returns the sum of the elements from x[i] to x[j].
    if i > j {
        return 0
    }
    if i == j {
        return x[i]
    }
    mid := (i + j) / 2
    return sum(x, i, mid) + sum(x, mid+1, j)
}
Is it correct that the number of steps required for this specific algorithm is
T(n) = 1 + 2*T(n/2),
where n is the number of elements in the array?

The i > j case can never happen via sum1 unless the caller passes in an empty list, so let's ignore it for the calculation of the time complexity.
Otherwise a call to sum(x, i, j) either returns an element of x or adds two things together. If the result is the sum x[i] + x[i+1] + ... + x[j], then there must be j-i+1 calls that return an element of x and j-i calls that perform an addition. Thus there is a total of 2(j-i)+1 calls to sum, and so the complexity of sum1 is O(len(x)).
Note that this code is pointless -- it's just a very complicated and overhead-heavy way of doing the same(*) as the naive code for i = 0..n-1 {sum += x[i]} return sum. (*) assuming addition is associative.
If you want a recursive formulation of T(n) (where n is j-i+1), then it's T(1)=1, T(n) = T(floor(n/2)) + T(ceil(n/2)) + 1. You will get the right complexity if you approximate it with T(n)=2T(n/2)+1, but it isn't quite right unless you state that n is a power of 2. This is a common approximation for dealing with divide-and-conquer array algorithms.
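As a sanity check, here is a small Python sketch (count_calls is a helper of my own that mirrors the recursion of sum) confirming the 2(j-i)+1 call count, and hence the recurrence above:
def count_calls(i, j):
    # mirrors sum(x, i, j) but counts calls instead of adding values
    if i >= j:          # the i > j and i == j base cases each cost one call
        return 1
    mid = (i + j) // 2
    return 1 + count_calls(i, mid) + count_calls(mid + 1, j)

for n in range(1, 100):
    assert count_calls(0, n - 1) == 2 * n - 1   # 2(j-i)+1 with i=0, j=n-1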


Fast algorithm for sum of steps taken by the Euclidean algorithm over pairs of numbers under an upper bound

Note: This may involve a good deal of number theory, but the formula I found online is only an approximation, so I believe an exact solution requires some sort of iterative calculation by a computer.
My goal is to find an efficient algorithm (in terms of time complexity) to solve the following problem for large values of n:
Let R(a,b) be the number of steps that the Euclidean algorithm takes to find the GCD of nonnegative integers a and b. That is, R(a,b) = 1 + R(b,a%b), and R(a,0) = 0. Given a natural number n, find the sum of R(a,b) for all 1 <= a,b <= n.
For example, if n = 2, then the solution is R(1,1) + R(1,2) + R(2,1) + R(2,2) = 1 + 2 + 1 + 1 = 5.
Since there are n^2 pairs corresponding to the numbers to be added together, simply computing R(a,b) for every pair can do no better than O(n^2), regardless of the efficiency of R. Thus, to improve the efficiency of the algorithm, a faster method must somehow calculate the sum of R(a,b) over many values at once. There are a few properties that I suspect might be useful:
If a = b, then R(a,b) = 1
If a < b, then R(a,b) = 1 + R(b,a)
R(a,b) = R(ka,kb) where k is some natural number
If b <= a, then R(a,b) = R(a+b,b)
If b <= a < 2b, then R(a,b) = R(2a-b,a)
Because of the first two properties, it is only necessary to find the sum of R(a,b) over pairs where a > b. I tried combining this with the third property in a method that computes R(a,b) only for pairs where a > b and a and b are coprime. The total sum is then n plus the sum of (n / a) * (2 * R(a,b) + 1) over all such pairs (using integer division for n / a); a quick brute-force check of this reduction is sketched below. This algorithm still had time complexity O(n^2), I discovered, because Euler's totient function is roughly linear.
I don't need any specific code solution, I just need to figure out the procedure for a more efficient algorithm. But if the programming language matters, my attempts to solve this problem have used C++.
Side note: I have found that a formula has been discovered that nearly solves this problem, but it is only an approximation. Note that the formula calculates the average rather than the sum, so it would just need to be multiplied by n^2. If the formula could be expanded to reduce the error, it might work, but from what I can tell, I'm not sure if this is possible.
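For reference, here is a brute-force check of the coprime-pair reduction described above, in Python (the helper names are mine; R follows the definition in the problem statement):
from math import gcd

def R(a, b):
    return 1 + R(b, a % b) if b else 0

def brute(n):
    return sum(R(a, b) for a in range(1, n + 1) for b in range(1, n + 1))

def via_coprime_pairs(n):
    # n from the diagonal pairs (a, a), plus, for each coprime pair a > b >= 1,
    # floor(n/a) multiples (ka, kb), each contributing R(ka, kb) + R(kb, ka) = 2*R(a, b) + 1
    total = n
    for a in range(2, n + 1):
        for b in range(1, a):
            if gcd(a, b) == 1:
                total += (n // a) * (2 * R(a, b) + 1)
    return total

assert brute(50) == via_coprime_pairs(50)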
Using Stern-Brocot, due to symmetry, we can look at just one of the four subtrees rooted at 1/3, 2/3, 3/2 or 3/1. The time complexity is still O(n^2), but it obviously performs fewer calculations. The version below uses the subtree rooted at 2/3 (or at least that's the one I used to think it through :). Also note that we only care about the denominators there, since the numerators are lower, and that the code relies on rules 2 and 3 as well.
C++ code (takes about a tenth of a second for n = 10,000):
#include <iostream>
using namespace std;

long g(int n, int l, int mid, int r, int fromL, int turns) {
    long right = 0;
    long left = 0;
    if (mid + r <= n)
        right = g(n, mid, mid + r, r, 1, turns + (1 ^ fromL));
    if (mid + l <= n)
        left = g(n, l, mid + l, mid, 0, turns + fromL);
    // Multiples
    int k = n / mid;
    // This subtree is rooted at 2/3
    return 4 * k * turns + left + right;
}

long f(int n) {
    // 1/1, 2/2, 3/3 etc.
    long total = n;
    // 1/2, 2/4, 3/6 etc.
    if (n > 1)
        total += 3 * (n >> 1);
    if (n > 2)
        // Technically 3 turns for 2/3, but we can avoid a subtraction
        // per call by starting with 2. (I guess that means it could be
        // another subtree, but I haven't thought it through.)
        total += g(n, 2, 3, 1, 1, 2);
    return total;
}

int main() {
    cout << f(10000);
    return 0;
}
I think this is a hard problem. We can avoid division and reduce the space usage to linear at least via the Stern-Brocot tree.
def f(n, a, b, r):
    return r if a + b > n else r + f(n, a + b, b, r) + f(n, a + b, a, r + 1)

def R_sum(n):
    return sum(f(n, d, d, 1) for d in range(1, n + 1))

def R(a, b):
    return 1 + R(b, a % b) if b else 0

def test(n):
    print(R_sum(n))
    print(sum(R(a, b) for a in range(1, n + 1) for b in range(1, n + 1)))

test(100)

Finding xth smallest element in unsorted array

I've been trying some coding algorithm exercises, and one topic in particular has stood out to me. I've been trying to find a good answer to this but I've been stuck in analysis paralysis. Let's say I have an array of unsorted integers and I want to determine the xth smallest element in this array.
I know of two options to go about this:
Option 1: Run a sort algorithm, sorting elements least to greatest, and look up the xth element. To my understanding, the time complexity of this is O(n*log(n)) with O(1) extra space.
Option 2: Heapify the array, turning it into a min heap. Then pop() the top of the heap x times. To my understanding, this is O(n) + O(x*log(n)).
I can't tell which is the optimal answer, and maybe I have a fundamental misunderstanding of priority queues and when to use them. I've tried to measure runtime, but I feel like I'm getting conflicting results. Maybe with option 2 it depends on how big x is. And maybe there is a better algorithm altogether. If someone could help, I'd appreciate it!
Worst-case time complexity of approach 2 should be O(n + n*log(n)), since the maximum value of x is n.
For the average case, time complexity = O(n + ((1+2+3+...+n)/n) * log(n)) = O(n + ((n+1)/2)*log(n)), i.e. O(n + n*log(n)).
Therefore approach 1 is at least as efficient as approach 2 (asymptotically they are both O(n*log(n))), but still not optimal.
PS: I would like you to have a look at the quickselect algorithm, which works in O(n) in the average case.
This algorithm's complexity can revolve around two data points:
Value of x.
Value of n.
Space complexity
In both approaches the space complexity remains O(1).
Time complexity
Approach 1
Best case: O(nlog(n)) for sorting, plus O(1) for the lookup when x == 1.
Average case: O(nlog(n)) if we consider all elements unique, O(x + nlog(n)) if there are duplicates.
Worst case: O(n + nlog(n)) for the case x == n.
Approach 2
Best case: O(n), as just the heapify is required when x == 1.
Average case: O(n + xlog(n)).
Worst case: O(n + nlog(n)) for the case x == n.
Now, coming to analyzing these algorithms at runtime, the general guidelines are:
1. Always test with large values of n.
2. Have a good spread for the values being tested (here, x).
3. Do multiple iterations of the analysis in a clean environment (array created fresh before each experiment, etc.) and take the average of all results.
4. Check the code complexity of any predefined functions for their exact implementation; in this case the sort (which can be ~2nlog(n), etc.) and the various heap operations.
Considering all of the above under ideal conditions, method 2 should perform better than method 1.
Although approach 1 has less time complexity, both of these algorithms may use auxiliary space (std::sort typically needs O(log n) auxiliary space, and std::stable_sort up to O(n)). Another way of doing this in constant space is via binary search: you can binary-search over the value range for the xth element. Let l be the smallest element of the array and r the largest; then the time complexity will be O(nlog(r-l)).
// binary search over the value range [l, r]; after the loop,
// ans is the largest value with fewer than x array elements <= it
int ans = l - 1;
while (l <= r) {
    int mid = (l + r) / 2;
    int cnt = 0;
    for (int i = 0; i < n; i++) {
        if (a[i] <= mid)
            cnt++;
    }
    if (cnt < x) {
        ans = mid;
        l = mid + 1;
    } else {
        r = mid - 1;
    }
}
Now look for the smallest element larger than ans that is present in the array.
Time complexity: O(nlog(r-l)) + O(n) (for the last step).
Space complexity: O(1).
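For illustration, here are the same two steps as a self-contained Python sketch (the function name and structure are mine; it assumes integer values):
def xth_smallest(a, x):
    lo, hi = min(a), max(a)
    ans = lo - 1                 # largest value with fewer than x elements <= it
    while lo <= hi:
        mid = (lo + hi) // 2
        cnt = sum(1 for v in a if v <= mid)
        if cnt < x:
            ans = mid
            lo = mid + 1
        else:
            hi = mid - 1
    # last step: smallest array element strictly greater than ans
    return min(v for v in a if v > ans)

print(xth_smallest([7, 2, 9, 4, 4, 1], 3))   # prints 4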
You can find the xth element in O(n); there are also two simple heap algorithms that improve on your option 2 complexity. I'll start with the latter.
Simple heap algorithm №1: O(x + (n-x) log x) worst-case complexity
Create a max heap out of the first x elements; for each of the remaining elements, if it is smaller than the heap's maximum, pop the max and push the element instead:
import heapq
from typing import List

def findKthSmallest(nums: List[int], k: int) -> int:
    heap = [-n for n in nums[:k]]     # max heap of the first k values, stored as negatives
    heapq.heapify(heap)
    for num in nums[k:]:
        if -num > heap[0]:            # num is smaller than the largest of the k candidates
            heapq.heapreplace(heap, -num)
    return -heap[0]
Simple heap algorithm №2: O(n + x log x)
Turn the whole array into a min heap, and insert its root into an auxiliary min heap.
k-1 times pop an element from the second heap, and push back its children from the first heap.
Return the root of the second heap.
import heapq
from typing import List

def findKthSmallest(nums: List[int], k: int) -> int:
    x = nums.copy()
    heapq.heapify(x)
    s = [(x[0], 0)]                   # auxiliary min heap of (value, index into x)
    for _ in range(k - 1):
        ind = heapq.heappop(s)[1]
        if 2 * ind + 1 < len(x):
            heapq.heappush(s, (x[2 * ind + 1], 2 * ind + 1))
        if 2 * ind + 2 < len(x):
            heapq.heappush(s, (x[2 * ind + 2], 2 * ind + 2))
    return s[0][0]
Which of these is faster? It depends on values of x and n.
A more complicated algorithm by Frederickson would allow you to find the xth smallest element in a heap in O(x), but that would be overkill, since the xth smallest element in an unsorted array can be found in O(n) worst-case time.
Median-of-medians algorithm: O(n) worst-case time
Described in [1].
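Here is a short, unoptimized Python sketch of the idea (it copies sublists rather than working in place; mom_select is my own name for it):
def mom_select(A, k):
    # return the k-th smallest element of A (0-based), median-of-medians style
    if len(A) <= 5:
        return sorted(A)[k]
    # median of each group of 5, then recurse to get the median of those medians
    groups = [sorted(A[i:i + 5]) for i in range(0, len(A), 5)]
    medians = [g[len(g) // 2] for g in groups]
    pivot = mom_select(medians, len(medians) // 2)
    lows = [x for x in A if x < pivot]
    pivots = [x for x in A if x == pivot]
    highs = [x for x in A if x > pivot]
    if k < len(lows):
        return mom_select(lows, k)
    if k < len(lows) + len(pivots):
        return pivot
    return mom_select(highs, k - len(lows) - len(pivots))

print(mom_select([7, 2, 9, 4, 4, 1], 2))   # prints 4 (the 3rd smallest)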
Quickselect algorithm: O(n) average time, O(n^2) worst-case time
def partition(A, lo, hi):
    """rearrange A[lo:hi+1] and return j such that
    A[lo:j] <= pivot
    A[j] == pivot
    A[j+1:hi+1] >= pivot
    """
    pivot = A[lo]
    if A[hi] > pivot:
        A[lo], A[hi] = A[hi], A[lo]
    # now A[hi] <= A[lo], and A[hi] and A[lo] need to be exchanged
    i = lo
    j = hi
    while i < j:
        A[i], A[j] = A[j], A[i]
        i += 1
        while A[i] < pivot:
            i += 1
        j -= 1
        while A[j] > pivot:
            j -= 1
    # now put pivot in the j-th place
    if A[lo] == pivot:
        A[lo], A[j] = A[j], A[lo]
    else:
        # then A[hi] == pivot
        j += 1
        A[j], A[hi] = A[hi], A[j]
    return j
def quickselect(A, left, right, k):
    # returns the element that would be at index k if A were sorted;
    # call as quickselect(A, 0, len(A) - 1, k)
    pivotIndex = partition(A, left, right)
    if k == pivotIndex:
        return A[k]
    elif k < pivotIndex:
        return quickselect(A, left, pivotIndex - 1, k)
    else:
        return quickselect(A, pivotIndex + 1, right, k)
Introselect: O(n) worst-case time
Basically, do quickselect, but if recursion gets too deep, switch to median-of-medians.
import numpy as np
from typing import List

def findKthSmallest(nums: List[int], k: int) -> int:
    return np.partition(nums, k, kind='introselect')[k]
Rivest-Floyd algorithm: O(n) average time, O(n^2) worst-case time
Another way to speed up quickselect:
import math

C1 = 600
C2 = 0.5
C3 = 0.5

def rivest_floyd(A, left, right, k):
    assert k < len(A)
    while right > left:
        if right - left > C1:
            # select a random sample from A
            N = right - left + 1
            I = k - left + 1
            Z = math.log(N)
            S = C2 * math.exp(2/3 * Z)  # sample size
            SD = C3 * math.sqrt(Z * S * (N - S) / N) * math.copysign(1, I - N/2)
            # select subsample such that kth element lies between newleft and newright most of the time
            newleft = max(left, k - int(I * S / N + SD))
            newright = min(right, k + int((N - I) * S / N + SD))
            rivest_floyd(A, newleft, newright, k)
        A[left], A[k] = A[k], A[left]
        # partition2 is assumed to be a partition routine that pivots on A[left] (not shown here)
        j = partition2(A, left, right)
        if j <= k:
            left = j + 1
        if k <= j:
            right = j - 1
    return A[k]
[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4, pp. 220-223.

Finding the time complexity of an exponential algorithm

Problem: Find best way to cut a rod of length n.
Each cut is integer length.
Assume that each length i rod has a price p(i).
Given: a rod of length n, and a list of prices p, which provides the price of each possible integer length between 0 and n.
Find best set of cuts to get maximum price.
Can use any number of cuts, from 0 to n-1.
There is no cost for a cut.
Below I present a naive algorithm for this problem.
CUT-ROD(p, n)
    if (n == 0)
        return 0
    q = -infinity
    for i = 1 to n
        q = max(q, p[i] + CUT-ROD(p, n-1))
    return q
How can I prove that this algorithm is exponential? Step by step.
I can see that it is exponential. However, I'm not able to prove it.
Let's translate the code to C++ for clarity:
#include <algorithm>
#include <vector>

std::vector<int> prices;   // prices[i] is the price of a piece of length i+1, filled elsewhere

int cut_rod(int n) {
    if (n == 0) {
        return 0;
    }
    int q = -1;
    int res = cut_rod(n - 1);   // cache the recursive result instead of recomputing it
    for (int i = 0; i < n; i++) {
        q = std::max(q, prices[i] + res);
    }
    return q;
}
Note: we are caching the result of cut_rod(n-1) to avoid unnecessarily increasing the complexity of the algorithm. Here we can see that cut_rod(n) calls cut_rod(n-1), which calls cut_rod(n-2), and so on down to cut_rod(0). For cut_rod(n), the function iterates over the array n times. Therefore the time complexity of the algorithm is O(n + (n-1) + (n-2) + ... + 1) = O(n(n+1)/2) = O(n^2).
EDIT:
If we use the exact same algorithm as the one in the question (without caching), its time complexity is O(n!), since cut_rod(n) calls cut_rod(n-1) n times, cut_rod(n-1) calls cut_rod(n-2) n-1 times, and so on. Therefore the time complexity is O(n*(n-1)*(n-2)*...*1) = O(n!).
I am unsure if this counts as a step-by-step solution, but it can be shown easily by induction/substitution. Treating the loop call as CUT-ROD(p, n-i), the usual form of this recursion, the running time satisfies T(0) = 1 and T(n) = 1 + (T(0) + T(1) + ... + T(n-1)). Just assume T(i) = 2^i for all i < n; then we show that it holds for n: T(n) = 1 + (2^0 + 2^1 + ... + 2^(n-1)) = 1 + (2^n - 1) = 2^n, so the algorithm is exponential.

Generating M distinct random numbers (one at a time) from a given range 0..N-1 in less than O(M) memory

Is there any method to do this?
I mean, we cannot even keep an "in"/selected flag array over {0, 1, ..., N-1} (because that's at least O(N) memory).
M can be equal to N. N can be greater than 2^64. The result should be uniformly random, and ideally every possible sequence should be obtainable (but it doesn't have to be).
Also, full-range PRNGs (and friends) aren't suitable, because they will give the same sequence each time.
Time complexity doesn't matter.
If you don't care what order the random selection comes out in, then it can be done in constant memory. The selection comes out in order.
The answer hinges on estimating the probability that the smallest value in a random selection of M distinct values of the set {0, ..., N-1} is i, for each possible i. Call this value p(i, M, N). With more mathematics than I have the patience to type into an interface which doesn't support Latex, you can derive some pretty good estimates for the p function; here, I'll just show the simple, non-time-efficient approach.
Let's just focus on p(0, M, N), which is the probability that a random selection of M out of N objects will include the first object. Then we can iterate through the objects (that is, the numbers 0...N-1) one at a time, deciding for each one whether it is included or not by flipping a weighted coin. We just need to compute the coin's weight for each flip.
By definition, there are C(N, M) possible M-selections of a set of N objects. Of these, C(N-1, M) do not include the first element (that's the count of M-selections of N-1 objects, which is all the M-selections of the set missing one element). Similarly, C(N-1, M-1) selections do include the first element (that is, all the (M-1)-selections of the (N-1)-set, with the first element added to each selection).
These two values add up to C(N, M); that's the well-known recurrence for computing binomial coefficients.
So p(0, M, N) is just C(N-1, M-1) / C(N, M). Since C(N, M) = N!/(M!*(N-M)!), we can simplify that fraction to M/N. As expected, if M == N, that works out to 1 (a selection of M out of M objects must include every object).
So now we know the probability that the first object will be in the selection. We can then reduce the size of the set, and either reduce the remaining selection size or not, depending on whether the coin flip determined that we did or did not include the first object. So here's the final algorithm, in pseudo-code, based on the existence of the weighted random boolean function:
w(x, y) => true with probability x / y; otherwise false.
I'll leave the implementation of w for the reader, since it's trivial.
So:
Generate a random M-selection from the set 0...N-1
Parameters: M, N
Set i = 0
while M > 0:
    if w(M, N):
        output i
        M = M - 1
    N = N - 1
    i = i + 1
It might not be immediately obvious that that works, but note that:
the output i statement must be executed exactly M times, since it is coupled with a decrement of M, and the while loop executes until M is 0
The closer M gets to N, the higher the probability that M will be decremented. If we ever get to the point where M == N, then both will be decremented in lockstep until they both reach 0.
i is incremented exactly when N is decremented, so it must always be in the range 0...N-1. In fact, it's redundant; we could output N-1 instead of outputting i, which would change the algorithm to produce sets in decreasing order instead of increasing order. I didn't do that because I think the above is easier to understand.
The time complexity of that algorithm is O(N+M) which must be O(N). If N is large, that's not great, but the problem statement said that time complexity doesn't matter, so I'll leave it there.
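A minimal Python sketch of the pseudo-code above (the names sample_in_order and w are mine; w flips the weighted coin using a uniform random integer):
import random

def w(x, y):
    # true with probability x / y
    return random.randrange(y) < x

def sample_in_order(M, N):
    # yield a uniformly random M-subset of {0, ..., N-1} in increasing order,
    # using O(1) extra memory
    i = 0
    while M > 0:
        if w(M, N):
            yield i
            M -= 1
        N -= 1
        i += 1

print(list(sample_in_order(5, 20)))   # e.g. [2, 7, 11, 12, 19]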
PRNGs that don't map their state space to a lower number of bits for output should work fine. Examples include Linear Congruential Generators and Tausworthe generators. They will give the same sequence if you use the same seed to start them, but that's easy to change.
Brute force:
Since time complexity doesn't matter, this would be a solution under the invariant 0 < M <= N. nextRandom(N) is a function which returns a random integer in [0..N):
// a is an int array holding the values generated so far;
// a, N and M are assumed to be declared elsewhere
void init() {
    for (int idx = 0; idx < N; idx++) {
        a[idx] = -1;
    }
    for (int idx = 0; idx < M; idx++) {
        getNext();
    }
}

int getNext() {
    for (int idx = 1; idx < M; idx++) {
        a[idx - 1] = a[idx];
    }
    while (true) {
        int r = nextRandom(N);
        int idx = 0;
        while (idx < M && a[idx] != r) idx++;
        if (idx == M) {
            a[idx - 1] = r;
            return r;
        }
    }
}
O(M) solution: it is a recursive solution for simplicity. It assumes a function nextRandom() which returns a random number in [0..1):
rnd(0, 0, N, M); // to get next M distinct random numbers
int rnd(int idx, int n1, int n2, int m) {
    if (n1 >= n2 || m <= 0) return idx;
    int r = nextRandom(n2 - n1) + n1;
    int m1 = (int) ((m - 1.0) * (r - n1) / (n2 - n1) + nextRandom()); // gives [0..m-1]
    int m2 = m - m1 - 1;
    idx = rnd(idx, n1, r - 1, m1);
    print r;
    return rnd(idx + 1, r + 1, n2, m2);
}
The idea is to select a random r in [0..N) in the first step, which splits the range into two sub-ranges of N1 and N2 elements (N1 + N2 == N - 1). We then repeat the same step for [0..r) (which has N1 elements) and [r+1..N) (N2 elements), choosing M1 and M2 (M1 + M2 == M - 1) such that M1/M2 == N1/N2. M1 and M2 must be integers, but the proportion can give non-integer results, so we round the values probabilistically (1.2 will give 1 with p=0.8 and 2 with p=0.2, etc.).

How to find 2 numbers and their sum in an unsorted array

This was an interview question that I was asked to solve: given an unsorted array, find 2 numbers and their sum in the array. (That is, find three numbers in the array such that one is the sum of the other two.) Please note, I have seen the question about finding 2 numbers when the sum (int k) is given. However, this question expects you to find both the numbers and their sum in the array. Can it be solved in O(n), O(log n) or O(nlogn)?
There is a standard solution of going through each integer and then doing a binary search on it. Is there a better solution?
public static void findNumsAndSum(int[] l) {
    if (l == null || l.length < 2) {
        return;
    }
    // sort the array so binary search can be used (requires java.util.Arrays)
    Arrays.sort(l);
    // binarySearch(arr, key, from, to) is assumed to report whether key occurs in arr[from, to)
    BinarySearch bs = new BinarySearch();
    for (int i = 0; i < l.length; i++) {
        for (int j = i + 1; j < l.length; j++) {
            int sum = l[i] + l[j];
            if (l[l.length - 1] < sum) {
                continue;
            }
            if (bs.binarySearch(l, sum, j + 1, l.length)) {
                System.out.println("Found the sum: " + l[i] + "+" + l[j]
                        + "=" + sum);
            }
        }
    }
}
This is very similar to the standard problem 3SUM, which many related questions are about.
Your solution is O(n^2 lg n); there are O(n^2) algorithms based on sorting the array, which work with slight modification for this variant. The best known lower bound is Ω(n lg n) (because you can use this problem to perform a comparison sort, if you're clever about it). If you can find a subquadratic algorithm or a tighter lower bound, you'll get some publications out of it. :)
Note that if you're willing to bound the integers to fall in the range [-u, u], there's a solution for the a + b + c = 0 problem in time O(n + u lg u) using the Fast Fourier Transform. It's not immediately obvious to me how to adjust it to the a + b = c problem, though.
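For concreteness, here is one way the sorting-based O(n^2) approach can be adapted to the a + b = c variant (a Python sketch; the function name and exact index handling are my own):
def find_sum_pair(nums):
    # sort, then treat each element c = a[t] as the candidate sum and scan the
    # rest of the array with two pointers looking for a[i] + a[j] == c
    a = sorted(nums)
    n = len(a)
    for t in range(n):
        i, j = 0, n - 1
        while i < j:
            if i == t:
                i += 1
                continue
            if j == t:
                j -= 1
                continue
            s = a[i] + a[j]
            if s == a[t]:
                return a[i], a[j], a[t]
            if s < a[t]:
                i += 1
            else:
                j -= 1
    return None

print(find_sum_pair([8, 3, 14, 5, 21]))   # prints (3, 5, 8)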
You can solve it in O(nlog(n)) as follows:
Sort your array in O(nlog(n)), ascending. You need two indices pointing to the left and right ends of your array; let's call them i and j, i being the left one and j the right one.
Now calculate the sum array[i] + array[j].
If this sum is greater than k, reduce j by one.
If this sum is smaller than k, increase i by one.
Repeat until the sum equals k.
With this algorithm you can find the solution in O(nlog(n)), and it is pretty simple to implement.
Sorry. It seems that I didn't read your post carefully enough ;)
