What is the time complexity of this leetcode problem solution? - performance

I was watching this video on the Partition Equal Subset Sum problem. My code is basically the same as his in the video, with a few changes. He says the time complexity is O(n * sum(nums)), but he doesn't really explain why. I thought the time complexity would be O(n * 2^n): the outer loop is clearly O(n), and I thought the inner loop is O(2^n) because the size of the set could reach 2^n, with 2 choices (include or exclude) per number. If this is incorrect, why?
def canPartition(self, nums: List[int]) -> bool:
    total_sum = sum(nums)
    if total_sum % 2:
        return False
    sums = set([0])
    target = total_sum // 2
    for num in nums:
        sums_to_add = set()
        for pos_sum in sums:
            if pos_sum + num == target:
                return True
            sums_to_add.add(pos_sum + num)
        sums.update(sums_to_add)
    return False
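For intuition about why the inner set cannot actually grow to 2^n elements: every value it holds is an integer between 0 and sum(nums), so it can contain at most sum(nums) + 1 distinct entries. A minimal standalone sketch (the input list here is just a made-up example) comparing the two bounds:

nums = [3, 1, 4, 2, 2, 5, 7]      # hypothetical example input
total = sum(nums)
sums = {0}
for num in nums:
    sums |= {s + num for s in sums}
print(2 ** len(nums), len(sums), total + 1)
# 2**7 = 128 possible subsets, but len(sums) can never exceed total + 1 = 25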

Related

What would be the time complexity and space complexity of brute force approach of matrix chain multiplication?

I know the time complexity and space complexity of matrix chain multiplication using dynamic programming would be O(n^3) and O(n^2).
But I want to know the time as well as the space complexity of the brute force approach for this problem, which can be implemented with the code below.
import sys

def MatrixChainOrder(p, i, j):
    if i == j:
        return 0
    _min = sys.maxsize
    for k in range(i, j):
        count = (MatrixChainOrder(p, i, k)
                 + MatrixChainOrder(p, k + 1, j)
                 + p[i - 1] * p[k] * p[j])
        if count < _min:
            _min = count
    # Return minimum count
    return _min

# Example usage:
# arr = [1, 2, 3, 4, 3]
# n = len(arr)
# p is the array name, i = 1, j = n - 1
# MatrixChainOrder(arr, 1, n - 1)
Please elaborate...
The stack depth is linear, hence so is the space usage.
As for time, we get a recurrence
T(1) = 1
T(n) = sum_{k=1}^{n-1} (T(k) + T(n-k)) = 2 sum_{k=1}^{n-1} T(k).
We can verify that the solution is
T(1) = 1
T(n) = 2 * 3^(n-2) for n >= 2
(indeed, T(n+1) = 2 sum_{k=1}^{n} T(k) = T(n) + 2 sum_{k=1}^{n-1} T(k) = 3 T(n)),
so the running time is Θ(3^n).
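A quick numerical check of that closed form (a standalone Python sketch, not part of the original answer): iterate the recurrence directly and compare it with 2 * 3^(n-2).

T = {1: 1}
for n in range(2, 12):
    T[n] = 2 * sum(T[k] for k in range(1, n))   # T(n) = 2 * sum_{k<n} T(k)
    assert T[n] == 2 * 3 ** (n - 2)
print([T[n] for n in sorted(T)])   # 1, 2, 6, 18, 54, 162, ...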

Time complexity of this power function

I have this code to calculate the power of a certain number:
func power(x, n int) int {
    if n == 0 {
        return 1
    }
    if n == 1 {
        return x
    }
    if n%2 == 0 {
        return power(x, n/2) * power(x, n/2)
    } else {
        return power(x, n/2) * power(x, n/2) * x
    }
}
So the total number of executions is 1 + 2 + 4 + ... + 2^k,
and according to the formula for a geometric progression
a(1 - r^n) / (1 - r)
the sum of the executions is 2^(k+1) - 1 = Θ(2^k), where k is the height of the binary recursion tree.
Hence the time complexity is 2^(log n).
Am I correct? Thanks :)
Yes.
Another way of thinking about the complexity of recursive functions is (branching factor) ** (height of the recursion tree).
Each call here makes two recursive calls and divides n by two, so the height of the tree is log n, and the time complexity is 2**(log n), which is O(n).
See a much more formal proof here:
https://cs.stackexchange.com/questions/69539/time-complexity-of-recursive-power-code
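To make the "two calls per level" argument concrete, here is a small standalone Python sketch (a translation of the Go function with an added call counter; power_calls is a hypothetical name) showing that the number of calls grows linearly with n:

def power_calls(x, n):
    """Return (x**n, number of recursive calls), mirroring the Go code."""
    if n == 0:
        return 1, 1
    if n == 1:
        return x, 1
    a, c1 = power_calls(x, n // 2)
    b, c2 = power_calls(x, n // 2)   # the second, identical recursive call
    result = a * b * (x if n % 2 else 1)
    return result, 1 + c1 + c2

for n in (15, 255, 4095):
    print(n, power_calls(2, n)[1])   # the call count is Θ(n), not Θ(log n)

Computing power(x, n/2) once, storing it, and squaring it would cut the call count down to Θ(log n).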
Every time you divide n by 2 unless n <= 1. So think: how many times can you reduce n to 1 just by dividing by 2? Let's see,
n = 26
n1 = 13
n2 = 6 (take the floor of 13/2)
n3 = 3
n4 = 1 (take the floor of 3/2)
Let's say the x-th power of 2 is greater than or equal to n. Then,
2^x >= n
or, log2(2^x) >= log2(n)
or, x >= log2(n)
That is how you find the time complexity of your algorithm as log2(n).

Time complexity: theory vs reality

I'm currently doing an assignment that requires us to discuss time complexities of different algorithms.
Specifically sum1 and sum2
def sum1(a):
    """Return the sum of the elements in the list a."""
    n = len(a)
    if n == 0:
        return 0
    if n == 1:
        return a[0]
    return sum1(a[:n/2]) + sum1(a[n/2:])

def sum2(a):
    """Return the sum of the elements in the list a."""
    return _sum(a, 0, len(a)-1)

def _sum(a, i, j):
    """Return the sum of the elements from a[i] to a[j]."""
    if i > j:
        return 0
    if i == j:
        return a[i]
    mid = (i+j)/2
    return _sum(a, i, mid) + _sum(a, mid+1, j)
Using the Master theorem, my best guess for both of these is
T(n) = 2*T(n/2)
which according to Wikipedia should equate to O(n), if I haven't made any mistakes in my assumptions. However, when I benchmark with arrays of different lengths N containing random integers in the range 1 to 100, I get the following result.
I've tried running the benchmark multiple times and I get the same result each time. sum2 seems to be about twice as fast as sum1, which baffles me since they should perform the same number of operations (?).
My question is: are these algorithms both linear, and if so, why do their run times differ?
In case it matters, I'm running these tests on Python 2.7.14.
sum1 looks like O(n) on the surface, but for sum1 T(n) is actually 2T(n/2) + 2*n/2. This is because of the list slicing operations which themselves are O(n). Using the master theorem, the complexity becomes O(n log n) which causes the difference.
Thus, for sum1, the time taken t1 = k1 * n log n. For sum2, time taken t2 = k2 * n.
Since you are plotting a time vs log n graph, let x = log n. Then,
t1 = k1 * x * 10^x
t2 = k2 * 10^x
With suitable values for k1 and k2, you get a graph very similar to yours. From your data, when x = 6, 0.6 ~ k1 * 6 * 10^6 or k1 ~ 10^(-7) and 0.3 ~ k2 * 10^6 or k2 = 3 * 10^(-7).
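To see that hidden slicing cost directly, here is a small standalone sketch (Python 3, so integer division is written //; sliced_elements is a hypothetical helper) that counts how many list elements sum1's slices copy:

import math

def sliced_elements(n):
    """Total elements copied by sum1's two slices on a list of length n."""
    if n <= 1:
        return 0
    half = n // 2
    return n + sliced_elements(half) + sliced_elements(n - half)

for n in (10 ** 3, 10 ** 4, 10 ** 5):
    print(n, sliced_elements(n), round(n * math.log2(n)))
# the copied-element count tracks n * log2(n), which sum2 avoids entirely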
Your graph has log10(N) on the x-axis, which means that the right-most data points are for an N value that's ten times the previous ones. And, indeed, they take roughly ten times as long. So that's a linear progression, as you expect.

Expected running time of randomized binary search

I want to calculate the expected running time of randomized binary search of the following pseudo-code, where instead of considering the midpoint as the pivot, a random point is selected:
BinarySearch(x, A, start, end)
    if(start == end)
        if(A[end] == x)
            return end
        else
            return -1
    else
        mid = RANDOM(start, end)
        if(A[mid] == x)
            return mid
        else if(A[mid] > x)
            return BinarySearch(x, A, start, mid-1)
        else
            return BinarySearch(x, A, mid+1, end)
I looked at this previous question, which has the following:
T(n) = sum ( T(r)*Pr(search space becomes r) ) + O(1) = sum ( T(r) )/n + O(1)
How is this obtained?
sum( T(r)*Pr(search space becomes r) )
And in the last line of calculation, how was this obtained?
T(n) = 1 + 1/2 + 1/3 + ... + 1/(n-1) = H(n-1) < H(n) = O(log n)
sum( T(r)*Pr(search space becomes r) )
This line is obtained by observing that you can choose any point to partition the array, so to get the expected time you sum over all possibilities, each multiplied by its probability. See expected value.
T(n) = 1 + 1/2 + 1/3 + ... + 1/(n-1) = H(n-1) < H(n) = O(log n)
About this line: you can think of the sum as the integral of 1/x on [1, n], which is log(n) - log(1) = log(n). See the Harmonic series.
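To see the harmonic-sum behaviour numerically, here is a small standalone simulation sketch (it assumes a sorted array of distinct integers and a target that is always present, so the recursion from the pseudo-code never receives an empty range):

import math
import random

def calls(a, x, lo, hi):
    """Number of calls made by the randomized binary search above."""
    if lo == hi:
        return 1
    mid = random.randint(lo, hi)
    if a[mid] == x:
        return 1
    if a[mid] > x:
        return 1 + calls(a, x, lo, mid - 1)
    return 1 + calls(a, x, mid + 1, hi)

n, trials = 1024, 20000
a = list(range(n))
avg = sum(calls(a, random.randrange(n), 0, n - 1) for _ in range(trials)) / trials
print(avg, math.log(n))   # the average call count tracks H(n) ~ ln(n) = O(log n)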
I would argue that the recurrence only holds when the target element is the first/last element of the array. Assume that the target element is in the middle, then in the first call we reduce the size of the array to be up to n/2, not n as in the recursion. Moreover, the position of the target element may change with each recursive call. For the proof of O(log n) complexity you may want to see my answer which uses another approach here.

Efficient Algorithm to Solve a Recursive Formula

I am given a formula f(n) where f(n) is defined, for all non-negative integers, as:
f(0) = 1
f(1) = 1
f(2) = 2
f(2n) = f(n) + f(n + 1) + n (for n > 1)
f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
My goal is to find, for any given number s, the largest n where f(n) = s. If there is no such n return None. s can be up to 10^25.
I have a brute force solution using both recursion and dynamic programming, but neither is efficient enough. What concepts might help me find an efficient solution to this problem?
I want to add a little complexity analysis and estimate the size of f(n).
If you look at one recursive call of f(n), you notice that the input n is basically divided by 2 before f is called two more times, where one call always gets an even and one an odd input.
So the call tree is basically a binary tree where, at each depth k, half of the nodes contribute a summand of approximately n/2^(k+1). The depth of the tree is log₂(n).
So the value of f(n) is in total about Θ(n/2 ⋅ log₂(n)).
Just to note: this holds for even and odd inputs, but for even inputs the value is bigger by an additional summand of about n/2. (I use Θ-notation so as not to have to think too much about constants.)
Now to the complexity:
Naive brute force
To calculate f(n) you have to call f Θ(2^(log₂ n)) = Θ(n) times.
So if you want to calculate the values of f(n) until you reach s (or notice that there is no n with f(n)=s) you have to calculate f(n) s⋅log₂(s) times, which is in total Θ(s²⋅log(s)).
Dynamic programming
If you store every result of f(n), the time to calculate each f(n) reduces to Θ(1) (but it requires much more memory). So the total time complexity would reduce to Θ(s⋅log(s)).
Notice: Since we know f(n) ≤ f(n+2) for all n, you don't have to sort the values of f(n) before doing a binary search.
Using binary search
Algorithm (input is s):
1. Set l = 1 and r = s.
2. Set n = (l+r)/2 and round it to the next even number.
3. Calculate val = f(n).
4. If val == s, return n.
5. If val < s, set l = n; else set r = n.
6. Go to step 2.
If you find a solution, fine. If not: try it again, but round in step 2 to odd numbers. If this also does not return a solution, no solution exists at all.
This will take you Θ(log(s)) for the binary search and Θ(s) for the calculation of f(n) each time, so in total you get Θ(s⋅log(s)).
As you can see, this has the same complexity as the dynamic programming solution, but you don't have to save anything.
Notice: r = s does not hold as an initial upper limit for all s. However, if s is big enough, it holds. To be safe, you can change the algorithm:
check first whether f(s) < s. If not, you can set l = s and r = 2s (or 2s+1 if it has to be odd).
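A compact standalone sketch of the even-n search described above (it uses a memoised f via functools.lru_cache rather than the dictionary shown in the answer further below, searches over m with n = 2m so the rounding step is implicit, relies on f(n) ≤ f(n+2), and keeps the simple bounds l = 1, r = s; the odd case is analogous):

from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    if n <= 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        return f(m) + f(m + 1) + m    # f(2n) = f(n) + f(n+1) + n, n > 1
    return f(m - 1) + f(m) + 1        # f(2n+1) = f(n-1) + f(n) + 1, n >= 1

def search_even(s):
    """Binary search over even n = 2*m with the simple bounds l = 1, r = s."""
    lo, hi = 1, s // 2
    while lo <= hi:
        m = (lo + hi) // 2
        val = f(2 * m)
        if val == s:
            return 2 * m
        if val < s:
            lo = m + 1
        else:
            hi = m - 1
    return None

print(search_even(7), search_even(13))   # f(4) = 7 and f(6) = 13, so 4 and 6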
Can you calculate the value of f(x) for every x from 0 to MAX_SIZE just once?
What I mean is: calculate the values by DP.
f(0) = 1
f(1) = 1
f(2) = 2
f(3) = 3
f(4) = 7
f(5) = 4
... ...
f(MAX_SIZE) = ???
If the first step is not feasible, exit. Otherwise, sort the values from small to big,
such as 1, 1, 2, 3, 4, 7, ...
Now you can find whether there exists an n with f(n) = s in O(log(MAX_SIZE)) time.
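A minimal sketch of that idea (MAX_SIZE is an arbitrary hypothetical cutoff; instead of sorting and binary searching, this version inverts the table into a dict so each value maps straight to the largest n that produces it):

MAX_SIZE = 10 ** 5
f = [0] * (MAX_SIZE + 1)
f[0] = f[1] = 1
f[2] = 2
f[3] = f[0] + f[1] + 1                           # f(2*1 + 1) = f(0) + f(1) + 1
for m in range(2, MAX_SIZE // 2 + 1):
    if 2 * m <= MAX_SIZE:
        f[2 * m] = f[m] + f[m + 1] + m           # f(2n) = f(n) + f(n+1) + n
    if 2 * m + 1 <= MAX_SIZE:
        f[2 * m + 1] = f[m - 1] + f[m] + 1       # f(2n+1) = f(n-1) + f(n) + 1
largest_n = {}
for n, value in enumerate(f):
    largest_n[value] = n                         # later n overwrite earlier ones
print(largest_n.get(7), largest_n.get(6))        # 4 and 7, since f(4)=7, f(7)=6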
Unfortunately, you don't mention how fast your algorithm should be. Perhaps you need to find some really clever rewrite of your formula to make it fast enough, in this case you might want to post this question on a mathematics forum.
The running time of your formula is O(n) for f(2n + 1) and O(n log n) for f(2n), according to the Master theorem, since:
T_even(n) = 2 * T(n / 2) + n / 2
T_odd(n) = 2 * T(n / 2) + 1
So the running time for the overall formula is O(n log n).
So if n is the answer to the problem, this algorithm would run in approx. O(n^2 log n), because you have to perform the formula roughly n times.
You can make this a little bit quicker by storing previous results, but of course, this is a tradeoff with memory.
Below is such a solution in Python.
D = {}

def f(n):
    if n in D:
        return D[n]
    if n == 0 or n == 1:
        return 1
    if n == 2:
        return 2
    m = n // 2
    if n % 2 == 0:
        # f(2n) = f(n) + f(n + 1) + n (for n > 1)
        y = f(m) + f(m + 1) + m
    else:
        # f(2n + 1) = f(n - 1) + f(n) + 1 (for n >= 1)
        y = f(m - 1) + f(m) + 1
    D[n] = y
    return y

def find(s):
    n = 0
    y = 0
    even_sol = None
    while y < s:
        y = f(n)
        if y == s:
            even_sol = n
            break
        n += 2
    n = 1
    y = 0
    odd_sol = None
    while y < s:
        y = f(n)
        if y == s:
            odd_sol = n
            break
        n += 2
    print(s, even_sol, odd_sol)

find(9992)
The recursion produces increasing values for 2n and 2n + 1 in every iteration, so the moment you see a value bigger than s you can stop your algorithm.
To make an efficient algorithm you either have to find a nice closed formula that computes the value directly, or compute the values in a simple loop, which will be much, much more efficient than your recursion. Your recursion is generally O(2^n), whereas the loop is O(n).
This is how the loop could look:
int[] values = new int[1000];
values[0] = 1;
values[1] = 1;
values[2] = 2;
values[3] = values[0] + values[1] + 1;                   // f(3) = f(0) + f(1) + 1
for (int i = 2; 2 * i + 1 < values.length; i++) {
    values[2 * i] = values[i] + values[i + 1] + i;       // f(2n) = f(n) + f(n+1) + n
    values[2 * i + 1] = values[i - 1] + values[i] + 1;   // f(2n+1) = f(n-1) + f(n) + 1
}
And inside this loop, add a condition that breaks out of it early on success or failure.
