What is the time complexity of the code below? It makes recursive calls in multiple places, so it should probably be 3^n, but each call also initializes an array of length n that is used later, and that confuses me. What would the time complexity be if we added an additional array for memoization? The code below is a solution for the HackerRank Java 1D Array (Hard) task.
public static boolean solve(int n, int m, int[] arr, boolean[] visited, int curr) {
    if (curr + m >= n || curr + 1 == n) {
        return true;
    }
    boolean[] newVisited = new boolean[n];
    for (int i = 0; i < n; i++) {
        newVisited[i] = visited[i];
    }
    boolean s = false;
    if (!visited[curr + 1] && arr[curr + 1] == 0) {
        newVisited[curr + 1] = true;
        s = solve(n, m, arr, newVisited, curr + 1);
    }
    if (s) {
        return true;
    }
    if (m > 1 && arr[curr + m] == 0 && !visited[curr + m]) {
        newVisited[curr + m] = true;
        s = solve(n, m, arr, newVisited, curr + m);
    }
    if (s) {
        return true;
    }
    if (curr > 0 && arr[curr - 1] == 0 && !visited[curr - 1]) {
        newVisited[curr - 1] = true;
        s = solve(n, m, arr, newVisited, curr - 1);
    }
    return s;
}
Your implementation does indeed have exponential complexity. I had not really thought about this part of your question, and it is a bit tedious to construct a true worst case, but one at-least-pretty-bad scenario is to set the first n-m elements of arr to 0 and the last m elements to 1. That produces a lot of branching without any benefit from a memoization mechanism. I would guess your solution is at least exponential in n/m.
Here is another solution. We can rephrase the problem as a graph problem. Let the elements of your array be the vertices of a directed graph, and let there be an edge between every pair of vertices of one of the forms (x, x-1), (x, x+1), and (x, x+m), provided both endpoints of the edge have value 0. Add an additional vertex t to the graph, and add an edge to t from every vertex with value 0 in {n-m+1, n-m+2, ..., n}. The graph then has no more than 3n+m edges. Your problem is now equivalent to determining whether there is a path from vertex 0 to t in the graph we have just constructed. This can be decided by running a depth-first search starting from vertex 0, which has complexity O(|E|), in our case O(n+m).
Coming back to your solution, you are doing pretty much the same thing, perhaps without realizing it. The only real difference is that you copy the visited array into newVisited, so you never actually share any of that bookkeeping between branches. Just eliminate newVisited, use visited wherever you currently use newVisited, and see what happens.
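A sketch of that suggested fix (the class name is mine, not from the original): with a single shared visited array, each cell is marked at most once, so the whole search does O(1) work per cell plus its at-most-three moves.

```java
// Sketch of the fix: share one visited array across all recursive calls,
// so every cell is expanded at most once and the search is linear in n.
public class OneDArray {
    public static boolean solve(int n, int m, int[] arr, boolean[] visited, int curr) {
        if (curr + m >= n || curr + 1 == n) {
            return true;
        }
        visited[curr] = true; // mark once; this cell is never expanded again
        if (!visited[curr + 1] && arr[curr + 1] == 0
                && solve(n, m, arr, visited, curr + 1)) {
            return true;
        }
        if (m > 1 && arr[curr + m] == 0 && !visited[curr + m]
                && solve(n, m, arr, visited, curr + m)) {
            return true;
        }
        return curr > 0 && arr[curr - 1] == 0 && !visited[curr - 1]
                && solve(n, m, arr, visited, curr - 1);
    }
}
```

Since nothing is ever copied or unmarked, this is just a depth-first search over the cells, matching the O(n+m) graph argument above.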
This is my solution to the problem where, given a binary tree, you are asked to find the maximum sum over sets of nodes that are not directly linked. "Directly linked" refers to a parent-child relationship, just to be clear.
My solution
If the current node is visited, you're not allowed to visit the nodes at the next level. If the current node, however, is not visited, you may or may not visit the nodes at the next level.
It passes all tests. However, what is the runtime complexity of this recursive binary tree traversal? I think it's 2^n because at every node you have two choices, use it or don't use it, and each of those choices gives the next level two choices of its own, and so on.
Space complexity: no additional storage is used, but since this is a recursive implementation, stack space is used, and the maximum number of stack frames is the height of the tree, which can be n in the worst case. So O(n)?
public int rob(TreeNode root) {
    return rob(root, false);
}

public int rob(TreeNode root, boolean previousStateUsed) {
    if (root == null)
        return 0;
    if (root.left == null && root.right == null) {
        if (previousStateUsed)
            return 0;
        return root.val;
    }
    if (previousStateUsed) {
        int leftSumIfCurrentIsNotUsed = rob(root.left, false);
        int rightSumIfCurrentIsNotUsed = rob(root.right, false);
        return leftSumIfCurrentIsNotUsed + rightSumIfCurrentIsNotUsed;
    } else {
        int leftSumIfCurrentIsNotUsed = rob(root.left, false);
        int rightSumIfCurrentIsNotUsed = rob(root.right, false);
        int leftSumIfCurrentIsUsed = rob(root.left, true);
        int rightSumIfCurrentIsUsed = rob(root.right, true);
        return Math.max(leftSumIfCurrentIsNotUsed + rightSumIfCurrentIsNotUsed,
                        leftSumIfCurrentIsUsed + rightSumIfCurrentIsUsed + root.val);
    }
}
Your current recursive solution is O(2^n). To see why, take a complete binary tree as an example and cross out alternating layers of nodes. The remaining nodes number about n/2 (this will vary, but you can always remove alternating layers so that at least n/2 - 1 nodes remain in the worst case). None of these remaining nodes conflict with each other, so every subset of them is a valid combination, and the search must consider all of them. Therefore the algorithm takes at least Omega(2^(n/2)) time in the worst case. You can probably get a tighter bound, but this should make it clear that your solution will not scale well.
This problem is a pretty common adaptation of the Max Non-Adjacent Sum Problem.
You should be able to use dynamic programming on this; I would highly recommend it. Imagine we are finding the solution for node i. Assume we already have the solutions for nodes i.left and i.right, and also for their children (i's grandchildren). We then have two options for i's maximum solution:
max-sum(i.left) + max-sum(i.right)
i.val + max-sum(i.left.left) + max-sum(i.left.right) + max-sum(i.right.left) + max-sum(i.right.right)
Take the max of these two and that's your solution for i. You can do this as bottom-up DP or add memoization to your current program; either works. The best part is, your solution is now O(n)!
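A direct implementation of that recurrence with memoization might look like the following sketch (the TreeNode class and method names here are my own, not from the question):

```java
import java.util.HashMap;
import java.util.Map;

public class TreeRob {
    static class TreeNode {
        int val;
        TreeNode left, right;
        TreeNode(int v) { val = v; }
    }

    private final Map<TreeNode, Integer> memo = new HashMap<>();

    // max-sum(i) = max( max-sum(i.left) + max-sum(i.right),
    //                   i.val + sum of max-sum over i's grandchildren )
    public int maxSum(TreeNode node) {
        if (node == null) return 0;
        Integer cached = memo.get(node);
        if (cached != null) return cached;

        int skip = maxSum(node.left) + maxSum(node.right);
        int take = node.val;
        if (node.left != null) {
            take += maxSum(node.left.left) + maxSum(node.left.right);
        }
        if (node.right != null) {
            take += maxSum(node.right.left) + maxSum(node.right.right);
        }
        int best = Math.max(skip, take);
        memo.put(node, best); // each node is solved exactly once, hence O(n)
        return best;
    }
}
```

The memo map guarantees each node is evaluated once, which is what collapses the 2^n branching into linear time.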
Given an array of N integers (elements are either positive or -1), and another integer M.
For each 1 <= i <= N, we can jump to indexes i+1, i+2, ..., i+M of the array. Starting from index 1, is there a linear O(N) algorithm that can find the minimum cost, as well as the path, to reach the Nth index, where cost is the sum of all elements on the path from 1 to N? I have a dynamic programming solution with complexity O(N*M).
Note: If A[i] is -1, then it means that we can't land on ith index.
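The O(N*M) dynamic program mentioned above might look like the following sketch (0-indexed; the class name and the convention of returning -1 for an unreachable end are my assumptions):

```java
public class MinCostJump {
    static final int INF = Integer.MAX_VALUE / 2;

    // dp[i] = minimum sum of elements on a valid path from index 0 to i;
    // a[i] == -1 marks an index that cannot be landed on.
    public static int minCost(int[] a, int m) {
        int n = a.length;
        int[] dp = new int[n];
        java.util.Arrays.fill(dp, INF);
        if (a[0] == -1) return -1;
        dp[0] = a[0];
        for (int i = 1; i < n; i++) {
            if (a[i] == -1) continue; // cannot land here
            // try every predecessor within jump range M
            for (int j = Math.max(0, i - m); j < i; j++) {
                if (dp[j] < INF) dp[i] = Math.min(dp[i], dp[j] + a[i]);
            }
        }
        return dp[n - 1] >= INF ? -1 : dp[n - 1];
    }
}
```

The inner loop over up to M predecessors is exactly where the extra factor of M comes from.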
If I'm understanding your problem right, A* would likely give your best runtime. For every i, the nodes i+1 through i+M would be its children, and the heuristic h would be the cost from i to N assuming every following node costs 1 (for instance, if N=11 and M=4, then h=3 for i=2, since that is the minimum number of jumps needed to reach the final index).
New Approach
Assumption: the graph is unweighted.
This approach solves the question in linear time.
The algorithm goes as follows.
int A[N];        // contains the initial values
int result[N];   // initialise all with positive infinity (INT_MAX in C)
bool visited[N]; // initialise all with false: no index has been visited yet

current_index = 1
cost = 0
result[current_index] = cost
visited[current_index] = true

while (current_index < N) {
    cost = cost + 1     // the cost grows by 1 at each level
    last_index = -1     /* plays the important role: it saves the last (furthest)
                           index reachable from current_index; -1 means no valid
                           index has been found yet */
    for (i in 1 to M) {
        temp_index = current_index + i
        if (temp_index <= N AND visited[temp_index] == false AND A[temp_index] != -1) {
            result[temp_index] = cost
            visited[temp_index] = true
            last_index = temp_index
        }
    }
    if (last_index == -1) {
        print "Not possible to reach"
        break
    } else {
        current_index = last_index
    }
}

// Finally print the cost of reaching index N
print result[N]
Do let me know when you are done with this approach.
=========================================================================
Previous Approach
Although this approach is not linear either, trust me, it will work more efficiently than your dynamic programming approach: your approach always takes O(N*M) time, while this one can reduce to O(n*M), where n is the number of elements in the array without -1 values.
Assumption: I am assuming the values of A[1] and A[N] are not -1, and that there are no more than M-1 consecutive -1 values in the array; otherwise the job cannot be finished.
Now, do BFS described as follows:
int A[N];        // contains the initial values
int result[N];   // initialise all with positive infinity (INT_MAX in C)
bool visited[N]; // initialise all with false: no index has been visited yet
queue Q;         // create a queue

index = 1
push index into the rear of Q
result[index] = 0
visited[index] = true

while (Q is not empty) {
    index = pop the value from the front of Q
    for (i in 1 to M) {
        temp_index = index + i
        if (temp_index <= N AND visited[temp_index] == false AND A[temp_index] != -1) {
            push temp_index into the rear of Q
            result[temp_index] = result[index] + 1 // one more jump than its parent
            visited[temp_index] = true
        }
    }
}

// Finally print the cost of reaching index N
print result[N]
Note: the worst-case time complexity would be the same as the DP one.
Any doubts regarding the algorithm? Comments are most welcome. And do share if anyone has a better approach than mine; after all, we are here to learn.
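Under the unweighted assumption (cost = number of jumps), the BFS above could be sketched in Java as follows (0-indexed; the class name is mine, and each node's cost is assigned as its parent's cost plus one rather than via a shared counter, which is an adjustment of mine):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

public class JumpBfs {
    // Returns the minimum number of jumps from index 0 to index n-1,
    // or -1 if index n-1 is unreachable; a[i] == -1 marks a blocked index.
    public static int minJumps(int[] a, int m) {
        int n = a.length;
        int[] result = new int[n];
        Arrays.fill(result, -1); // -1 doubles as "not visited yet"
        Queue<Integer> q = new ArrayDeque<>();
        result[0] = 0;
        q.add(0);
        while (!q.isEmpty()) {
            int index = q.poll();
            for (int i = 1; i <= m; i++) {
                int next = index + i;
                if (next < n && result[next] == -1 && a[next] != -1) {
                    result[next] = result[index] + 1; // one more jump than parent
                    q.add(next);
                }
            }
        }
        return result[n - 1];
    }
}
```

Each index enters the queue at most once and offers at most M successors, matching the O(n*M) bound discussed above.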
Can this problem be done using only one dp array?
It is the zigzag problem from topcoder (http://community.topcoder.com/stat?c=problem_statement&pm=1259&rd=4493)
A sequence of numbers is called a zig-zag sequence if the differences between successive numbers strictly alternate between positive and negative. The first difference (if one exists) may be either positive or negative. A sequence with fewer than two elements is trivially a zig-zag sequence.
For example, 1,7,4,9,2,5 is a zig-zag sequence because the differences (6,-3,5,-7,3) are alternately positive and negative. In contrast, 1,4,7,2,5 and 1,7,4,5,5 are not zig-zag sequences, the first because its first two differences are positive and the second because its last difference is zero.
Given a sequence of integers, sequence, return the length of the longest subsequence of sequence that is a zig-zag sequence. A subsequence is obtained by deleting some number of elements (possibly zero) from the original sequence, leaving the remaining elements in their original order.
For reference: the DP with two arrays uses an array A[1..n], where A[i] is the maximum length of a zig-zag sequence ending with a zig on element i, and an array B[1..n], where B[i] is the maximum length of a zig-zag sequence ending with a zag on element i. For i from 1 to n, this DP uses the previous entries of the A array to compute B[i], and the previous entries of the B array to compute A[i]. At the cost of an extra loop, it would be possible to recreate the B entries on demand and thus use only the A array. I'm not sure this solves your problem, though.
(Also, since the input arrays are so short, there are any number of encoding tricks not worth mentioning.)
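A direct implementation of the two-array DP described above might look like this sketch (Java; the array names up/down and the class name are mine):

```java
public class ZigZag {
    // up[i]   = length of the longest zig-zag subsequence ending at i
    //           whose final difference is positive (a "zig");
    // down[i] = the same with a negative final difference (a "zag").
    public static int longestZigZag(int[] seq) {
        int n = seq.length;
        if (n == 0) return 0;
        int[] up = new int[n];
        int[] down = new int[n];
        up[0] = down[0] = 1;
        int best = 1;
        for (int i = 1; i < n; i++) {
            up[i] = down[i] = 1;
            for (int j = 0; j < i; j++) {
                // a rising step extends a subsequence that ended falling,
                // and vice versa; equal values extend nothing
                if (seq[j] < seq[i]) up[i] = Math.max(up[i], down[j] + 1);
                if (seq[j] > seq[i]) down[i] = Math.max(down[i], up[j] + 1);
            }
            best = Math.max(best, Math.max(up[i], down[i]));
        }
        return best;
    }
}
```

With the inner scan over all earlier elements this runs in O(n^2), which is fine for the short inputs this problem allows.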
Here's an attempt; I'm returning the pair of indices between which the input is a zigzag. For your 2nd input (1,4,7,2,5), it returns the index pair (1, 4), since 4,7,2,5 is a zigzag.
You can figure out if the whole array is zigzag based on the result.
public class LongestZigZag
{
    private readonly int[] _input;

    public LongestZigZag(int[] input)
    {
        _input = input;
    }

    public Tuple<int, int> Sequence()
    {
        var indices = new Tuple<int, int>(int.MinValue, int.MinValue);
        if (_input.Length <= 2) return indices;

        for (int i = 2; i < _input.Length; i++)
        {
            var firstDiff = _input[i - 1] - _input[i - 2];
            var secondDiff = _input[i] - _input[i - 1];
            if ((firstDiff > 0 && secondDiff < 0) || (firstDiff < 0 && secondDiff > 0))
            {
                var index1 = indices.Item1;
                if (index1 == int.MinValue)
                {
                    index1 = i - 2;
                }
                indices = new Tuple<int, int>(index1, i);
            }
            else
            {
                indices = new Tuple<int, int>(int.MinValue, int.MinValue);
            }
        }
        return indices;
    }
}
Dynamic programming takes O(n^2) time for this problem. I have designed code with linear time complexity O(n): in one pass over the array, it gives the length of the longest possible zig-zag sequence. I have tested it against many test cases provided by different sites for this problem and got correct results.
Here is my C implementation of code:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int i, n;
    int count = 0;

    scanf(" %d", &n);
    int *a = (int*)malloc(n * sizeof(*a)); /* was n*sizeof(a): wrong element size */
    for (i = 0; i < n; i++)
    {
        scanf(" %d", &a[i]); /* e.g. 1,7,5,10,13,15,10,5,16,8 */
    }

    i = 0;
    /* check i < n-1 before reading a[i+1] to avoid out-of-bounds access */
    if (n > 1 && a[0] < a[1])
    {
        count++;
        while (i < n - 1 && a[i] <= a[i + 1])
            i++;
        if (i == n - 1 && a[i - 1] < a[i])
        {
            count++;
            i++;
        }
    }
    while (i < n - 1)
    {
        count++;
        while (i < n - 1 && a[i] >= a[i + 1])
        {
            i++;
        }
        if (i == n - 1 && a[i - 1] > a[i])
        {
            count++;
            break;
        }
        if (i < n - 1)
            count++;
        while (i < n - 1 && a[i] <= a[i + 1])
        {
            i++;
        }
        if (i == n - 1 && a[i - 1] < a[i])
        {
            count++;
            break;
        }
    }
    printf("%d", count);
    free(a);
    return 0;
}
Every solution you work out with dynamic programming (to my knowledge on the topic, so don't take it for granted) comes down to representing a "solution space" (meaning every possible correct solution, not necessarily an optimal one) as a DAG (directed acyclic graph).
For example, if you are looking for a longest rising subseqence, then the solution space can be represented as the following DAG:
Nodes are labeled with the numbers of the sequence
Edge e(u, v) between two nodes indicates that valueOf(u) < valueOf(v) (where valueOf(x) is the value associated with node x)
In dynamic programming, finding an optimal solution to the problem is the same thing as traversing this graph in the right way. The information provided by that graph is in some sense represented by that DP array.
In this case we have two ordering operations. If we represented both of them on one such graph, that graph would not be acyclic; we would need at least two graphs (one for the < relation and one for >).
If the topological ordering requires two DAGs, the solution requires two DP arrays, or some clever way of indicating which edge in your DAG corresponds to which ordering operation (which in my opinion needlessly complicates the problem).
Hence no, you can't do it with just one DP array. You will need at least two, at least if you want a simple solution approached purely with dynamic programming.
The recursive call for this problem should look something like this (the directions of the relations might be wrong, I haven't checked it):
S - given sequence (array of integers)
P(i), Q(i) - length of the longest zigzag subsequence on elements S[0 -> i] inclusive (the longest sequence that is correct, where S[i] is the last element)
P(i) = 1   if i == 0, otherwise
       max(Q(j) + 1 over all 0 <= j < i with S[i] < S[j])

Q(i) = 0   if i == 0, otherwise    # 0 because we are pedantic about "is zig the
                                   # first relation, or is it zag?"; if we
                                   # aren't, this can be a 1.
       max(P(j) + 1 over all 0 <= j < i with S[i] > S[j])

With the right memoization (two DP arrays) this runs in O(n^2), since each P(i) and Q(i) scans all earlier indices. These calls return the length of the solution; the actual sequence can be recovered by storing a "parent pointer" whenever a max value is found, and then traversing backwards along these pointers.
I have written a backtracking algorithm that computes the minimum number of cliques needed to cover a graph. I have tested it, but I could not work out its worst-case time complexity despite trying many times.
I know this problem is NP-hard, but I think it should still be possible to give a worst-case time complexity based on the code. What is the worst-case time complexity of this code? Any ideas? How would you formalize the recursive equation?
I have tried to write understandable code. If you have any questions, write a comment.
I would be very glad for tips, references, and answers.
Thanks for the tips, guys :).
EDIT
As M C commented, I am basically trying to solve the Clique cover problem.
Pseudocode:
function countCliques(graph, vertice, cliques, numberOfCliques, minimumSolution)
    for i = 1 .. numberOfCliques + 1 loop
        if i > minimumSolution then
            return
        end if
        if fitToClique(cliques(i), vertice, graph) then
            addVerticeToClique(cliques(i), vertice)
            if vertice == 0 then // last vertice
                minimumSolution = numberOfCliques
                printResult(result)
            else
                if i == numberOfCliques + 1 then // i == numberOfCliques + 1 always means a new clique
                    countCliques(graph, vertice - 1, cliques, numberOfCliques + 1, minimumSolution)
                else
                    countCliques(graph, vertice - 1, cliques, numberOfCliques, minimumSolution)
                end if
            end if
            deleteVerticeFromClique(cliques(i), vertice)
        end if
    end loop
end function

bool fitToClique(clique, vertice, graph)
    for i = 1 .. cliqueSize loop
        verticeFromClique = clique(i)
        if not connected(verticeFromClique, vertice) then
            return false
        end if
    end loop
    return true
end function
Code
static int c = 0; // global counter of solutions found

int countCliques(int** graph, int currentVertice, int** result, int numberOfSubset, int& minimum) {
    // if solution
    if (currentVertice == -1) {
        // if a better solution
        if (minimum > numberOfSubset) {
            minimum = numberOfSubset;
            printf("New minimum result:\n");
            print(result, numberOfSubset);
        }
        c++;
    } else {
        // not yet a solution: try to insert into an existing clique,
        // or create a new clique (hence the +1 in the loop bound)
        for (int i = 0; i < numberOfSubset + 1; i++) {
            if (i > minimum) {
                break;
            }
            // if it fits
            if (fitToSubset(result[i], currentVertice, graph)) {
                // insert
                result[i][0]++;
                result[i][result[i][0]] = currentVertice;
                // try to insert the next vertice
                countCliques(graph, currentVertice - 1, result, (i == numberOfSubset) ? (i + 1) : numberOfSubset, minimum);
                // delete the vertice from the clique
                result[i][0]--;
            }
        }
    }
    return c;
}

bool fitToSubset(int *subSet, int currentVertice, int **graph) {
    int subsetLength = subSet[0];
    for (int i = 1; i < subsetLength + 1; i++) {
        if (graph[subSet[i]][currentVertice] != 1) {
            return false;
        }
    }
    return true;
}

void print(int **result, int n) {
    for (int i = 0; i < n; i++) {
        int m = result[i][0];
        printf("[");
        for (int j = 1; j < m; j++) {
            printf("%d, ", result[i][j] + 1);
        }
        printf("%d]\n", result[i][m] + 1);
    }
}

int** readFile(const char* file, int& v, int& e) {
    int from, to;
    int **graph;
    FILE *graphFile;
    fopen_s(&graphFile, file, "r");
    fscanf_s(graphFile, "%d %d", &v, &e);
    graph = (int**)malloc(v * sizeof(int*)); // was sizeof(int): wrong on 64-bit
    for (int i = 0; i < v; i++) {
        graph[i] = (int*)calloc(v, sizeof(int));
    }
    while (fscanf_s(graphFile, "%d %d", &from, &to) == 2) {
        graph[from - 1][to - 1] = 1;
        graph[to - 1][from - 1] = 1;
    }
    fclose(graphFile);
    return graph;
}
The time complexity of your algorithm is very closely linked to listing the compositions of an integer, of which there are O(2^N).
The compositions alone are not enough, though, as there is also a combinatorial aspect, governed by one rule: each clique must contain the highest-numbered unused vertex.
Take the composition 2-2-1 (N = 5) as an example. The first clique must contain vertex 5, leaving 4 unused vertices; its second member is a free choice of 1 out of 4, leaving 3 unused. The second clique must contain the highest remaining vertex, so 2 unused vertices remain, and a choice of 1 out of 2 decides its final member. That leaves a single vertex for the last clique. So this composition can be realized in 1*C(4,1)*1*C(2,1)*1 = 8 possible ways, as follows:
(5,4),(3,2),(1)
(5,4),(3,1),(2)
(5,3),(4,2),(1)
(5,3),(4,1),(2)
(5,2),(4,3),(1)
(5,2),(4,1),(3)
(5,1),(4,3),(2)
(5,1),(4,2),(3)
The above example shows the shape of the worst case, which occurs when the composition contains as many 2s as possible. The count is (N-1)(N-3)(N-5)...(1) or (N-1)(N-3)(N-5)...(2), which I would still loosely call O(N!). However, this case is actually impossible: it would require a complete graph, which would be caught right away and would limit the solution to a single clique, of which there is only one.
Given the variations of the compositions, the number of possible compositions, O(2^N), is probably a fair starting point for the upper bound. The fact that there are O(3^(N/3)) maximal cliques is another useful piece of information, as the algorithm could theoretically find all of them, although that isn't tight either: some maximal cliques are found multiple times while others are not found at all.
A tighter upper bound is difficult for two main reasons. First, the algorithm progressively limits the maximum number of cliques (the size of the composition, if you like), which puts an upper limit on the computation spent per clique. Second, missing edges cause a large number of possible variations to be ignored, which almost ensures that the vast majority of the O(N!) variations are never explored. Together these make a tight upper bound hard to state. If this isn't enough for an answer, you might want to take the question to the Mathematics Stack Exchange, as a better answer will require a fair bit of mathematical analysis.
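The counting in the 2-2-1 example can be sketched as code, under the rule stated above that each clique must contain the highest-numbered unused vertex (class and method names are mine):

```java
public class CliqueCompositions {
    // Number of ways to realize a composition of N as an ordered list of
    // cliques: each clique's highest member is forced (the largest unused
    // vertex), and its remaining part-1 members are chosen freely from the
    // vertices still unused.
    public static long waysForComposition(int[] composition) {
        int unused = 0;
        for (int part : composition) unused += part;
        long ways = 1;
        for (int part : composition) {
            unused--;                              // the forced highest vertex
            ways *= binomial(unused, part - 1);    // free choices for the rest
            unused -= part - 1;
        }
        return ways;
    }

    // C(n, k) via the multiplicative formula; exact at every step
    static long binomial(int n, int k) {
        long r = 1;
        for (int i = 1; i <= k; i++) r = r * (n - k + i) / i;
        return r;
    }
}
```

For the composition 2-2-1 this reproduces the 1*C(4,1)*1*C(2,1)*1 = 8 count worked out above.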
I am trying to find and solve the recurrence relation for a dynamic programming approach to UVA #11450. As a disclaimer, this is part of a homework assignment that I have mostly finished but am confused about the analysis.
Here is my (working) code:
int shop(int m, int c, int items[][21], int sol[][20]) {
    if (m < 0) return NONE;                  // No money left
    if (c == 0) return 0;                    // No garments left
    if (sol[m][c] != NONE) return sol[m][c]; // We've been here before
    // For each model of the current garment
    for (int i = 1; i <= items[c-1][0]; i++) {
        // Save the result
        int result = shop(m - items[c-1][i], c - 1, items, sol);
        // If there was a valid result, record it for next time
        if (result != NONE) sol[m][c] = max(sol[m][c], result + items[c-1][i]);
    }
    return sol[m][c];
}
I am having trouble with a few aspects of the analysis:
What is the basic operation? My initial reaction would be subtraction, since each call subtracts one from C.
Since the recursive call is within a loop, does that just mean multiplication in the recurrence relation?
How do I factor the dynamic table into the recurrence relation? I know that some problems decompose into linear ones when a table is used, but I'm not sure how this one decomposes.
I know that the complexity (according to Algorithmist) is O(M*C*max(K)) where K is the number of models of each garment, but I'm struggling to work backwards to get the recurrence relation. Here's my guess:
S(c) = k * S(c-1) + 1, S(0) = 0
However, this fails to take M into account.
Thoughts?
You can think of each DP state (m,c) as a vertex of a graph, where the recursive calls to states (m-item_i, c-1) are edges from (m,c) to (m-item_i, c-1).
Memoization of your recursion means that you start the search from each vertex at most once, and process its outgoing edges only once. So your algorithm is essentially a linear-time search of this graph, with complexity O(|V|+|E|). There are M*C vertices and at most max(K) edges going out of each one, so you can bound the number of edges by O(M*C*max(K)).