How to calculate the space complexity for backtracking? - algorithm

Question Description
What is a good way to calculate the space (stack space) and time complexity of backtracking algorithms?
Example
We want each output combination to have a length of exactly K, and the solution set must be unique
Input:
arr: [1,2,3,4,5]
K: 4
Output:
[1,2,3,4] //[2,1,3,4] is invalid because it's == [1,2,3,4]
[1,2,3,5]
[1,2,4,5]
[1,3,4,5]
[2,3,4,5]
// arr == [1,2,3,4,5]
// K == 4
// backtrack(arr,4,0,{},...other params we don't care)
void backtrack(vector<int> &arr, int K, int start, vector<int> &sol, ...)
{
    if(sol.size() == K)
    {
        //...
    }
    else
    {
        for(int i = start; i < arr.size(); i++)
        {
            sol.push_back(arr[i]);
            backtrack(arr, K, i+1,sol,...);
            sol.pop_back();
        }
    }
}
I think:
The worst-case space complexity is O(k), because when I recurse into f1(), the calls f2() to f5() are not made until the whole subtree of f1() has finished.
[]
f1() f2() f3() f4() f5()
f1.1() f1.2() ...
The worst-case time complexity is O(n^k), where n is the length of the array.

Strictly speaking, the time complexity isn't O(n^k) but closer to O(sum for i = 1 to k of C(n, i)), where C(n, i) is the binomial coefficient. That is because the search doesn't restart from the beginning of arr but continues from the element after the last one in the current solution, and it doesn't prune states that can no longer be completed, such as [3] (only 4 and 5 remain, which is too few to reach length 4). In general, the time complexity in such cases is the number of states times the average time needed to get from one state to the next.
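To make the state count concrete, here is a minimal Python sketch that mirrors the backtrack function from the question and counts the states it visits and the deepest recursion level. The helper name count_states and the stats dictionary are made up for this illustration; the array is assumed to hold distinct values.

```python
from math import comb

def count_states(n, k):
    """Run the question's backtracking on arr = [0..n-1] and return
    (states visited, deepest recursion level, solutions found)."""
    arr = list(range(n))
    sol = []
    stats = {"states": 0, "max_depth": 0, "solutions": 0}

    def backtrack(start, depth):
        stats["states"] += 1
        stats["max_depth"] = max(stats["max_depth"], depth)
        if len(sol) == k:
            stats["solutions"] += 1
            return
        for i in range(start, len(arr)):
            sol.append(arr[i])
            backtrack(i + 1, depth + 1)
            sol.pop()

    backtrack(0, 0)
    return stats["states"], stats["max_depth"], stats["solutions"]

states, depth, solutions = count_states(5, 4)
# The root (empty solution) plus one state per increasing sequence of length 1..k.
print(states, 1 + sum(comb(5, i) for i in range(1, 5)), depth, solutions)
```

The recursion never goes deeper than k levels, which matches the O(k) stack-space estimate from the question (not counting the arrays themselves).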

Related

Time complexity for a specific recursive algorithm

How should one go about finding the time complexity of sum1?
func sum1(x []int) int {
    // Returns the sum of all the elements in the list x.
    return sum(x, 0, len(x)-1)
}
func sum(x []int, i int, j int) int {
    // Returns the sum of the elements from x[i] to x[j]
    if i > j {
        return 0
    }
    if i == j {
        return x[i]
    }
    mid := (i + j) / 2
    return sum(x, i, mid) + sum(x, mid+1, j)
}
Is it correct that the number of steps required for this specific algorithm is
T(n)= 1 + 2*T(n/2) ?
where n is the number of elements in the array?
The i>j case can never happen via sum1 unless the user passes in an empty list, so let's ignore it for the calculation of the time complexity.
Otherwise a call to sum(x, i, j) either returns an element of x, or adds two things together. If the result is the sum x[i]+x[i+1]+...+x[j], then there must be j-i+1 calls that return an element of x, and j-i calls that perform an addition. Thus there are 2j-2i+1 calls to sum in total, and so the complexity of sum1 is O(len(x)).
Note that this code is pointless -- it's just a very complicated and overhead-heavy way of doing the same(*) as the naive code for i = 0..n-1 {sum += x[i]} return sum. (*) assuming addition is associative.
If you want a recursive formulation of T(n) (where n is j-i+1), then it's T(1)=1, T(n) = T(floor(n/2)) + T(ceil(n/2)) + 1. You will get the right complexity if you approximate it with T(n)=2T(n/2)+1, but it isn't quite right unless you state that n is a power of 2. This is a common approximation for dealing with divide-and-conquer array algorithms.
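As a quick sanity check of the call count, the following Python sketch counts the calls that sum would make on a range of n elements. The function count_calls is a hypothetical helper written for this check, not part of the original code; it assumes i <= j, as in any call coming from sum1 on a non-empty list.

```python
def count_calls(i, j):
    """Number of calls made by sum(x, i, j) from the question, for i <= j."""
    if i == j:
        return 1
    mid = (i + j) // 2
    # One call for this node plus the calls made by both halves.
    return 1 + count_calls(i, mid) + count_calls(mid + 1, j)

for n in (1, 2, 3, 7, 8, 100):
    # 2j - 2i + 1 with i = 0 and j = n - 1 is 2n - 1.
    print(n, count_calls(0, n - 1), 2 * n - 1)
```

The count is 2n - 1 whether or not n is a power of 2, which is consistent with the floor/ceil form of the recurrence.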

Runtime analysis of memoized recursive function

The code below computes whether s1 can be reduced to t and, if so, in how many ways.
Let's say the length of s1 is n and the length of t is m. I think the worst-case runtime of the code below is O(n^m) without memoization. Say we memoize the sub-problems of s1, i.e. the substrings that recur. Then the runtime would be O(m*n), since we need to recurse m times for each n. Is this reasoning correct?
static int distinctSeq(String s1, String t) {
    if (s1.length() == t.length()) {
        if (s1.equals(t))
            return 1;
        else
            return 0;
    }
    int count = 0;
    for (int i = 0; i < s1.length(); i++) {
        // Remove the character at position i and recurse on the shorter string.
        String ss = s1.substring(0, i)+ s1.substring(i + 1);
        count += distinctSeq(ss, t);
    }
    return count;
}
As @meowgoesthedog already mentioned, your initial solution has a time complexity of O(n!/m!):
If you are starting with s1 of length n, and n > m then you can go into n different states by excluding one symbol from the original string.
You will continue doing it until your string has the length of m. The number of ways to come to the length of m from n using the given algorithm is n*(n-1)*(n-2)*...*(m+1), which is effectively n!/m!
For each string of length m formed by excluding symbols from initial string of length n you will have to compare string derived from s1 and t, which will require m operations (length of the strings), so the complexity from the previous step should be multiplied by m, but considering that you have factorial in the big-O, another *m won't change the asymptotic complexity.
Now about the memoization. If you add memoization, the algorithm only transitions to states that haven't been visited yet, which means the task is to count the number of distinct strings that can be obtained from s1 by removing symbols. For simplicity we will assume that all symbols of s1 are different. The number of states with length x is the number of ways of removing n-x different symbols from s1 disregarding the order, which is the binomial coefficient C(n,x) = n!/((n-x)! * x!).
The algorithm passes through every length from n down to m, so the overall number of states is Sum(k=m...n, C(n,k)) = n!/((n-m)!*m!) + n!/((n-m-1)!*(m+1)!) + ... + n!/(0!*n!). Since we are after the asymptotic complexity, we are interested in the largest term of that sum, which is the one with k as close as possible to n/2. If m is less than n/2, then C(n, n/2) is present in the sum and is the largest term; otherwise C(n,m) is the largest term. The sum has n-m+1 terms, so it is at most n-m+1 times the largest term.
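The state count can be checked empirically. The sketch below is a hypothetical Python helper (count_memoized_states is not from the original post) that explores the same deletion process with a visited set and compares the number of distinct states against the binomial sum. It assumes all symbols of s are different and len(s) >= m.

```python
from math import comb

def count_memoized_states(s, m):
    """Count distinct strings reachable from s by deleting one symbol at a
    time, stopping once the length reaches m."""
    seen = set()

    def visit(cur):
        if cur in seen:
            return  # memoized: this state has already been expanded
        seen.add(cur)
        if len(cur) == m:
            return
        for i in range(len(cur)):
            visit(cur[:i] + cur[i + 1:])

    visit(s)
    return len(seen)

n, m = 6, 2
print(count_memoized_states("abcdef", m), sum(comb(n, k) for k in range(m, n + 1)))
```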

Maximum Sub-Set Sum

I'm trying to solve a slightly different variation of the Maximum Sub-Set Sum problem. Instead of consecutive elements, I want to find the elements that give the largest sum in the array. For example, given the following array:
{1,-3,-5,3,2,-7,1} the output should be 7 (the subset with the largest sum is {1,3,2,1}).
Here the code I use to calculate the max sum:
static int max(int a, int b)
{
    if (a >= b)
        return a;
    return b;
}
// static so that it can be called from main
static int func(List<Integer> l, int idx, int sum)
{
    if (idx < 0)
        return sum;
    return max ( func(l,idx - 1, sum+l.get(idx)), func(l,idx-1,sum) );
}
public static void main(String[] args) {
    List<Integer> l = new LinkedList<Integer>();
    l.add(-2);
    l.add(-1);
    l.add(-3);
    l.add(-4);
    l.add(-1);
    l.add(-2);
    l.add(-1);
    l.add(-5);
    System.out.println(func(l,l.size()-1,0));
}
It works when I use positive and negative numbers together in the same array. However, the problem starts when I use only negative numbers - the output always is 0. I guess it happens because I send 0 as the sum at the very first time when I call the function. Can someone tell me how should I change my function so it will work with only negative numbers as well.
Your solution is unnecessarily complicated and inefficient (it has an O(2^n) time complexity).
Here's a simple and efficient (O(N) time, O(1) extra space) way to do it:
If there's at least one non-negative number in the list, return the sum of all positive numbers.
Return the largest element in the list otherwise.
Here's some code:
def get_max_non_empty_subset_sum(xs):
    max_elem = max(xs)
    if max_elem < 0:
        return max_elem
    return sum(x for x in xs if x > 0)
The special case would be when all elements are negative. In this case find the least negative number. That would be your answer. This can be done in O(N) time complexity.
PSEUDO CODE
ALLNEG=true
LEAST_NEG= -INFINITY
for number in list:
    if number>=0
        ALLNEG=false
        break
    else if number>LEAST_NEG
        LEAST_NEG=number
if ALLNEG==true
    answer=LEAST_NEG
else
    ...

Can this be solved in linear time complexity?

Given an array of N integers (elements are either positive or -1), and another integer M.
For each 1 <= i <= N, we can jump from index i to any of the indexes i + 1, i + 2, ..., i + M of the array. Starting from index 1, is there a linear O(N) algorithm that finds the minimum cost as well as the path to reach the Nth index, where the cost is the sum of all elements on the path from 1 to N? I have a dynamic programming solution with a complexity of O(N*M).
Note: If A[i] is -1, then it means that we can't land on ith index.
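For reference, the O(N*M) dynamic programming baseline mentioned in the question could look roughly like the Python sketch below. The asker's actual code is not shown, so this is only one plausible reading: the function name min_cost_path is hypothetical, indexes are 0-based here, and the cost is assumed to include both the first and the last element.

```python
import math

def min_cost_path(a, m):
    """a[i] == -1 means index i cannot be landed on.
    Returns (cost, path) or None if the last index is unreachable."""
    n = len(a)
    if n == 0 or a[0] == -1 or a[-1] == -1:
        return None
    best = [math.inf] * n   # best[i]: cheapest cost of a path ending at i
    prev = [-1] * n         # prev[i]: predecessor of i on that path
    best[0] = a[0]
    for i in range(1, n):
        if a[i] == -1:
            continue
        # Look back at the (up to) m indexes that can jump to i.
        for j in range(max(0, i - m), i):
            if best[j] + a[i] < best[i]:
                best[i] = best[j] + a[i]
                prev[i] = j
    if best[-1] == math.inf:
        return None
    path = []
    i = n - 1
    while i != -1:
        path.append(i)
        i = prev[i]
    return best[-1], path[::-1]

print(min_cost_path([1, -1, 5, 2, 1], 2))
```

The inner loop over the previous m indexes is what gives the O(N*M) bound that the question hopes to improve on.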
If I'm understanding your problem right, A* would likely provide your best runtime. For every i, i+1 through i+M would be the child nodes, and h would be the cost from i to N assuming every following node had a cost of 1 (so for instance if N=11 and M=4 then h=3 for i=2, because that would be the minimum number of jumps necessary to reach the final index).
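The heuristic described above is just a ceiling division. A tiny Python sketch (the function name jumps_lower_bound is made up for illustration) reproduces the N=11, M=4, i=2 example. Since every landable element is a positive integer, each remaining jump costs at least 1, so this value should never overestimate the remaining cost.

```python
def jumps_lower_bound(i, n, m):
    """Fewest jumps of length at most m needed to get from index i to index n."""
    return (n - i + m - 1) // m  # ceiling of (n - i) / m

print(jumps_lower_bound(2, 11, 4))
```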
New Approach
Assumption: the graph is unweighted, i.e. the cost is the number of jumps.
Under that assumption, this approach solves the question in linear time.
The algorithm goes as follows.
int A[N]; // Contains the initial values
int result[N]; // Initialise all entries with positive infinity (INT_MAX in C)
bool visited[N]; // Initialise all entries with '0', meaning no index has been visited yet
int current_index = 1
cost = 0
result[current_index] = cost
visited[current_index] = true
while(current_index less than N) {
    cost = cost + 1 // Increase the cost by 1 at each level
    int last_index = -1 /* This plays the important role: it stores the last index
                           that can be reached from the current index. It is initialised
                           with -1, meaning it does not point to any valid index */
    for(i in 1 to M) {
        temp_index = current_index + i;
        if(temp_index <= N AND visited[temp_index] == false AND A[temp_index] != -1) {
            result[temp_index] = cost
            visited[temp_index] = true
            last_index = temp_index
        }
    }
    if(last_index == -1) {
        print "Not possible to reach"
        break
    } else {
        current_index = last_index
    }
}
// Finally print the value of result[N]
print result[N]
Let me know how this approach works for you.
=========================================================================
Previous Approach
This approach is not linear either, but it should be more efficient than your dynamic programming approach: yours always takes O(N*M) time, whereas this one can be reduced to O(n*M), where n is the number of elements in the array that are not -1.
Assumption: A[1] and A[N] are not -1, and there are no more than M-1 consecutive -1 values in the array. Otherwise, we can't finish the job.
Now, do BFS described as follows:
int A[N]; // Contains the initial values
int result[N]; // Initialise all entries with positive infinity (INT_MAX in C)
bool visited[N]; // Initialise all entries with '0', meaning no index has been visited yet
queue Q; // create a queue
index = 1
cost = 0
push index in rear of Q.
result[index] = cost
visited[index] = true
while(Q is not empty) {
    index = pop the value from the front of the Q.
    cost = result[index] + 1 // one more jump than it took to reach index
    for(i in 1 to M) {
        temp_index = index + i;
        if(temp_index <= N AND visited[temp_index] == false AND A[temp_index] != -1) {
            push temp_index in rear of Q.
            result[temp_index] = cost
            visited[temp_index] = true
        }
    }
}
// Finally print the value of result[N]
print result[N]
Note: Worst case time-complexity would be same as the DP one.
If you have any doubts about the algorithm, comments are most welcome. And do share if anyone has a better approach. After all, we are here to learn.

Generating M distinct random numbers (one at a time) from a given range 0..N-1 in less than O(M) memory

Is there any method to do this?
I mean, we cannot even work with an "in" array of {0,1,..,N-1} (because that takes at least O(N) memory).
M can be equal to N. N can be > 2^64. The result should be uniformly random and should preferably be able to produce every possible sequence (but it doesn't have to).
Also, full-range PRNGs (and friends) aren't suitable, because they give the same sequence each time.
Time complexity doesn't matter.
If you don't care what order the random selection comes out in, then it can be done in constant memory. The selection comes out in order.
The answer hinges on estimating the probability that the smallest value in a random selection of M distinct values from the set {0, ..., N-1} is i, for each possible i. Call this value p(i, M, N). With more mathematics than I have the patience to type into an interface which doesn't support LaTeX, you can derive some pretty good estimates for the p function; here, I'll just show the simple, non-time-efficient approach.
Let's just focus on p(0, M, N), which is the probability that a random selection of M out of N objects will include the first object. Then we can iterate through the objects (that is, the numbers 0...N-1) one at a time; deciding for each one whether it is included or not by flipping a weighted coin. We just need to compute the coin's weights for each flip.
By definition, there are C(N, M) possible M-selections of a set of N objects. Of these, C(N-1, M) do not include the first element. (That's the count of M-selections of N-1 objects, which is all the M-selections of the set missing one element.) Similarly, C(N-1, M-1) selections do include the first element (that is, all the (M-1)-selections of the (N-1)-set, with the first element added to each selection).
These two values add up to C(N, M); this is the well-known recurrence for computing C.
So p(0, M, N) is just C(N-1, M-1)/C(N, M). Since C(N, M) = N!/(M!*(N-M)!), we can simplify that fraction to M/N. As expected, if M == N, that works out to 1 (M of N objects must include every object).
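The simplification to M/N can be checked numerically for small values. This is only a sanity check in Python using exact fractions; the helper name check_first_element_probability is made up for this illustration.

```python
from fractions import Fraction
from math import comb

def check_first_element_probability(limit):
    """Verify C(N-1, M) + C(N-1, M-1) == C(N, M) and
    C(N-1, M-1) / C(N, M) == M / N for all 1 <= M <= N < limit."""
    for n in range(1, limit):
        for m in range(1, n + 1):
            if comb(n - 1, m) + comb(n - 1, m - 1) != comb(n, m):
                return False
            if Fraction(comb(n - 1, m - 1), comb(n, m)) != Fraction(m, n):
                return False
    return True

print(check_first_element_probability(12))
```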
So now we know the probability that the first object will be in the selection. We can then reduce the size of the set, and either reduce the remaining selection size or not, depending on whether the coin flip determined that we did or did not include the first object. So here's the final algorithm, in pseudo-code, based on the existence of the weighted random boolean function:
w(x, y) => true with probability x / y; otherwise false.
I'll leave the implementation of w for the reader, since it's trivial.
So:
Generate a random M-selection from the set 0...N-1
Parameters: M, N
Set i = 0
while M > 0:
    if w(M, N):
        output i
        M = M - 1
    N = N - 1
    i = i + 1
It might not be immediately obvious that that works, but note that:
the output i statement must be executed exactly M times, since it is coupled with a decrement of M, and the while loop executes until M is 0
The closer M gets to N, the higher the probability that M will be decremented. If we ever get to the point where M == N, then both will be decremented in lockstep until they both reach 0.
i is incremented exactly when N is decremented, so it must always be in the range 0...N-1. In fact, it's redundant; we could output N-1 instead of outputting i, which would change the algorithm to produce sets in decreasing order instead of increasing order. I didn't do that because I think the above is easier to understand.
The time complexity of that algorithm is O(N+M), which is O(N) because M <= N. If N is large, that's not great, but the problem statement said that time complexity doesn't matter, so I'll leave it there.
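The pseudo-code above translates almost line for line into Python. In this sketch, w is implemented with random.randrange, which is one possible choice for the weighted coin and not something the answer prescribes; the generator name random_selection is also made up. It assumes 0 <= M <= N.

```python
import random

def w(x, y, rng=random):
    """Return True with probability x / y (assumes 0 <= x <= y and y > 0)."""
    return rng.randrange(y) < x

def random_selection(m, n, rng=random):
    """Yield m distinct values from 0..n-1 in increasing order, using O(1) memory."""
    i = 0
    while m > 0:
        if w(m, n, rng):
            yield i
            m -= 1
        n -= 1
        i += 1

print(list(random_selection(3, 10)))
```

Because it is a generator, the values come out one at a time, and Python's arbitrary-precision integers mean N above 2^64 is not a problem for the arithmetic (only for the running time).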
PRNGs that don't map their state space to a lower number of bits for output should work fine. Examples include Linear Congruential Generators and Tausworthe generators. They will give the same sequence if you use the same seed to start them, but that's easy to change.
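As an illustration of the PRNG idea, here is a minimal Python sketch of a full-period linear congruential generator. The details are assumptions that the answer does not spell out: the modulus is rounded up to a power of two, the multiplier and increment are drawn at random subject to the Hull-Dobell full-period conditions (increment odd, multiplier congruent to 1 mod 4), and states outside 0..N-1 are skipped. The name lcg_permutation is hypothetical.

```python
import random

def lcg_permutation(n, seed=None):
    """Yield every value in [0, n) exactly once, in a seed-dependent order,
    keeping only a constant number of integers in memory."""
    rng = random.Random(seed)
    modulus = 1
    while modulus < n:
        modulus *= 2
    a = 4 * rng.randrange(modulus) + 1  # a % 4 == 1
    c = 2 * rng.randrange(modulus) + 1  # c is odd, so coprime to the modulus
    state = rng.randrange(modulus)
    # A full-period LCG visits each of the modulus states exactly once.
    for _ in range(modulus):
        if state < n:
            yield state
        state = (a * state + c) % modulus

print(list(lcg_permutation(10, seed=1)))
```

Taking the first M values gives M distinct numbers. As the question notes, a generator like this cannot produce every possible sequence, and the statistical quality of a plain LCG is limited.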
Brute force:
If time complexity doesn't matter, this is a solution for any 0 < M <= N. nextRandom(N) is a function which returns a random integer in [0..N):
init() {
    for (int idx = 0; idx < N; idx++) {
        a[idx] = -1;
    }
    for (int idx = 0; idx < M; idx++) {
        getNext();
    }
}
int getNext() {
    for (int idx = 1; idx < M; idx++) {
        a[idx -1] = a[idx];
    }
    while (true) {
        r = nextRandom(N);
        idx = 0;
        while (idx < M && a[idx] != r) idx++;
        if (idx == M) {
            a[idx - 1] = r;
            return r;
        }
    }
}
O(M) solution: this is a recursive solution, for simplicity. In addition to nextRandom(N) above, it assumes a nextRandom() which returns a random number in [0..1):
rnd(0, 0, N, M); // to get next M distinct random numbers
int rnd(int idx, int n1, int n2, int m) {
    if (n1 >= n2 || m <= 0) return idx;
    int r = nextRandom(n2 - n1) + n1;
    int m1 = (int) ((m-1.0)*(r-n1)/(n2-n1) + nextRandom()); // gives [0..m-1]
    int m2 = m - m1 - 1;
    idx = rnd(idx, n1, r, m1); // left sub-range [n1..r); the upper bound is exclusive
    print r;
    return rnd(idx+1, r+1, n2, m2);
}
The idea is to select a random r in [0..N) in the first step, which splits the range into two sub-ranges with N1 and N2 elements (N1+N2==N-1). We then repeat the same step for [0..r), which has N1 elements, and [r+1..N), which has N2 elements, choosing M1 and M2 (M1+M2==M-1) so that M1/M2 == N1/N2. M1 and M2 must be integers, but the proportion can give non-integer results, so we round the values probabilistically (1.2 gives 1 with p=0.8 and 2 with p=0.2, etc.).
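The probabilistic rounding step can be shown in isolation. This Python sketch uses the same trick as the m1 line in the code, adding a uniform number in [0..1) and truncating; the name random_round is made up for illustration, and x is assumed to be non-negative.

```python
import math
import random

def random_round(x, rng=random):
    """Round x down or up at random so that the expected value equals x.
    For x = 1.2 this returns 1 with probability 0.8 and 2 with probability 0.2."""
    return math.floor(x + rng.random())

rng = random.Random(0)
samples = [random_round(1.2, rng) for _ in range(20000)]
print(sum(samples) / len(samples))
```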
