How can I solve this dynamic programming problem? - algorithm

I got stuck on a problem while studying dynamic programming.
I have a string of digits. I need to find the length of the longest substring such that the sum of the digits in its first half equals the sum of the digits in its second half.
For example,
Input string: 142124
Output : 6
When the input string is "142124", the digit sum of the first half (142) equals the digit sum of the second half (124), so the entire string is the longest such substring. Therefore, the output is 6, the length of the entire string.
Input string: 9430723
Output: 4
The longest substring whose first half and second half have equal digit sums is "4307", so the output is 4.
I solved this problem this way
int maxSubStringLength(char* str) {
    int n = strlen(str);
    int maxLen = 0;
    int sum[n][n];

    for (int i = 0; i < n; i++)
        sum[i][i] = str[i] - '0';

    for (int len = 2; len <= n; len++) {
        for (int i = 0; i < n - len + 1; i++) {
            int j = i + len - 1;
            int k = len / 2;
            sum[i][j] = sum[i][j-k] + sum[j-k+1][j];
            if (len % 2 == 0 && sum[i][j-k] == sum[j-k+1][j] && len > maxLen)
                maxLen = len;
        }
    }
    return maxLen;
}
This code has O(n^2) time complexity and O(n^2) space complexity.
However, the problem asks for O(1) space complexity while keeping O(n^2) time complexity.
Is it possible to solve this problem with the space complexity of O (1)?

You can solve this problem with O(1) space complexity and O(n^2) time complexity.
Here is one approach:
Go from m = 0 to n - 2. This denotes the middle of the substring (you split after the m-th character).
For i = 1 to n (break when you run out of bounds), extend the left and right sums by one character each; whenever they are equal, compare i to the best found so far and update it if it is better.
The answer is 2 times best (because best denotes the length of a half).
In Java it would be something like this:
public int maxSubstringLength(String s) {
    int best = 0;
    for (int m = 0; m < s.length() - 1; m++) {
        int l = 0; // left sum
        int r = 0; // right sum
        for (int i = 1; m - i + 1 >= 0 && m + i < s.length(); i++) {
            // raw char values are compared; since both halves contain i
            // characters each, the '0' offsets cancel out
            l += s.charAt(m - i + 1);
            r += s.charAt(m + i);
            if (l == r && i > best)
                best = i;
        }
    }
    return 2 * best;
}
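For the sample inputs above this returns 6 for "142124" (best split is after the third character) and 4 for "9430723" (the substring "4307").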


How to find the time complexity of these two programs? [duplicate]

int sum = 0;
for (int i = 1; i < n; i++) {
    for (int j = 1; j < i * i; j++) {
        if (j % i == 0) {
            for (int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
}
I don't understand how when j = i, 2i, 3i... the last for loop runs n times. I guess I just don't understand how we came to that conclusion based on the if statement.
Edit: I know how to compute the complexity for all the loops except for why the last loop executes i times based on the mod operator... I just don't see how it's i. Basically, why can't j % i go up to i * i rather than i?
Let's label the loops A, B and C:
int sum = 0;
// loop A
for (int i = 1; i < n; i++) {
    // loop B
    for (int j = 1; j < i * i; j++) {
        if (j % i == 0) {
            // loop C
            for (int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
}
Loop A iterates O(n) times.
Loop B iterates O(i²) times per iteration of A. For each of these iterations:
j % i == 0 is evaluated, which takes O(1) time.
On 1/i of these iterations, loop C iterates j times, doing O(1) work per iteration. Since j is O(i²) on average, and this is only done for 1/i iterations of loop B, the average cost is O(i² / i) = O(i).
Multiplying all of this together, we get O(n × i² × (1 + i)) = O(n × i³). Since i is on average O(n), this is O(n⁴).
The tricky part of this is saying that the if condition is only true 1/i of the time:
Basically, why can't j % i go up to i * i rather than i?
In fact, j does go up to j < i * i, not just up to j < i. But the condition j % i == 0 is true if and only if j is a multiple of i.
The multiples of i within the range are i, 2*i, 3*i, ..., (i-1) * i. There are i - 1 of these, so loop C is reached i - 1 times despite loop B iterating i * i - 1 times.
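Counting exactly (my own aside): for a fixed i, loop C performs i + 2i + ... + (i-1)*i = i²(i-1)/2 increments, and summing this over i = 1, ..., n-1 gives roughly n⁴/8 operations in total, which also explains the ratio of about 0.125 measured in one of the answers below.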
The first loop consumes n iterations.
The second loop consumes n*n iterations per iteration of the first loop. Imagine the case when i = n; then j goes up to n*n.
The third loop contributes, on average, another factor of n: it is entered only for the i multiples of i, and i is bounded by n in the worst case.
Thus, the code complexity is O(n × n × n × n).
I hope this helps you understand.
All the other answers are correct; I just want to add the following.
I wanted to see whether the reduction in executions of the inner k-loop was sufficient to bring the actual complexity below O(n⁴), so I wrote the following:
for (int n = 1; n < 363; ++n) {
    int sum = 0;
    for (int i = 1; i < n; ++i) {
        for (int j = 1; j < i * i; ++j) {
            if (j % i == 0) {
                for (int k = 0; k < j; ++k) {
                    sum++;
                }
            }
        }
    }
    long cubic = (long) Math.pow(n, 3);
    long hypCubic = (long) Math.pow(n, 4);
    double relative = (double) (sum / (double) hypCubic);
    System.out.println("n = " + n + ": iterations = " + sum +
            ", n³ = " + cubic + ", n⁴ = " + hypCubic + ", rel = " + relative);
}
After executing this, it becomes obvious that the complexity is in fact n⁴. The last lines of output look like this:
n = 356: iterations = 1989000035, n³ = 45118016, n⁴ = 16062013696, rel = 0.12383254507467704
n = 357: iterations = 2011495675, n³ = 45499293, n⁴ = 16243247601, rel = 0.12383580700180696
n = 358: iterations = 2034181597, n³ = 45882712, n⁴ = 16426010896, rel = 0.12383905075183874
n = 359: iterations = 2057058871, n³ = 46268279, n⁴ = 16610312161, rel = 0.12384227647628734
n = 360: iterations = 2080128570, n³ = 46656000, n⁴ = 16796160000, rel = 0.12384548432498857
n = 361: iterations = 2103391770, n³ = 47045881, n⁴ = 16983563041, rel = 0.12384867444612208
n = 362: iterations = 2126849550, n³ = 47437928, n⁴ = 17172529936, rel = 0.1238518469862343
What this shows is that the ratio between the actual iteration count and n⁴ is asymptotic towards a value around 0.124... (actually 0.125). While it does not give us the exact value, we can deduce the following:
Time complexity is n⁴/8 ~ f(n), where f is your function/method.
The Wikipedia page on Big O notation states, in the table 'Family of Bachmann–Landau notations', that ~ means the limit of the ratio of the two operands is 1. Or:
f is equal to g asymptotically
(I chose 363 as the excluded upper bound because n = 362 is the last value for which we get a sensible result. After that, sum exceeds the int range and the relative value becomes negative.)
User kaya3 figured out the following:
The asymptotic constant is exactly 1/8 = 0.125, by the way; here's the exact formula via Wolfram Alpha.
Remove if and modulo without changing the complexity
Here's the original method:
public static long f(int n) {
    int sum = 0;
    for (int i = 1; i < n; i++) {
        for (int j = 1; j < i * i; j++) {
            if (j % i == 0) {
                for (int k = 0; k < j; k++) {
                    sum++;
                }
            }
        }
    }
    return sum;
}
If you're confused by the if and modulo, you can just refactor them away, with j jumping directly from i to 2*i to 3*i ... :
public static long f2(int n) {
    int sum = 0;
    for (int i = 1; i < n; i++) {
        for (int j = i; j < i * i; j = j + i) {
            for (int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
    return sum;
}
To make it even easier to calculate the complexity, you can introduce an intermediary j2 variable, so that every loop variable is incremented by 1 at each iteration:
public static long f3(int n) {
    int sum = 0;
    for (int i = 1; i < n; i++) {
        for (int j2 = 1; j2 < i; j2++) {
            int j = j2 * i;
            for (int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
    return sum;
}
You can use a debugger or old-school System.out.println calls to check that the (i, j, k) triplets are the same in each method.
Closed form expression
As mentioned by others, you can use the fact that the sum of the first n integers is equal to n * (n+1) / 2 (see triangular numbers). If you use this simplification for every loop, you get :
public static long f4(int n) {
    return (n - 1) * n * (n - 2) * (3 * n - 1) / 24;
}
It is obviously not the same complexity as the original code but it does return the same values.
If you google the first terms, you can notice that 0 0 0 2 11 35 85 175 322 546 870 1320 1925 2717 3731 appear in "Stirling numbers of the first kind: s(n+2, n).", with two 0s added at the beginning. It means that sum is the Stirling number of the first kind s(n, n-2).
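For completeness, here is a sketch (my own addition) of how that closed form can be derived from f3, using the standard formulas for the sums of the first m squares and cubes:

sum = Σ (over i = 1..n-1) Σ (over j2 = 1..i-1) j2 * i
    = Σ (over i = 1..n-1) i * (i-1) * i / 2
    = 1/2 * ( Σ i³ - Σ i² )                      (sums over i = 1..n-1)
    = 1/2 * [ (n-1)² * n² / 4 - (n-1) * n * (2n-1) / 6 ]
    = (n-1) * n * (n-2) * (3n-1) / 24

The leading term is 3n⁴ / 24 = n⁴ / 8, which matches the 0.125 ratio measured in the previous answer.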
Let's have a look at the first two loops.
The first one is simple, it's looping from 1 to n. The second one is more interesting. It goes from 1 to i squared. Let's see some examples:
e.g. n = 4
i = 1
j loops from 1 to 1^2
i = 2
j loops from 1 to 2^2
i = 3
j loops from 1 to 3^2
In total, the i and j loops combined perform 1^2 + 2^2 + 3^2 iterations.
There is a formula for the sum of the first n squares, n * (n+1) * (2n + 1) / 6, which is O(n^3).
You have one last k loop which runs from 0 to j, but only when j % i == 0. Since j goes from 1 to i^2, the condition j % i == 0 is true about i times. Since i is bounded by n, this contributes one extra factor of O(n).
So you have O(n^3) from the i and j loops and another O(n) from the k loop, for a grand total of O(n^4).

Dynamic programming based zigzag puzzle

I found this interesting dynamic programming problem where it's required to re-order a sequence of integers in order to maximize the output.
Steve has got N liquor bottles. Alcohol quantity of ith bottle is given by A[i]. Now he wants to have one drink from each of the bottles, in such a way that the total hangover is maximised.
Total hangover is calculated as follow (Assume the 'alcohol quantity' array uses 1-based indexing) :
int hangover = 0;
for (int i = 2; i <= N; i++) {
    hangover += i * abs(A[i] - A[i-1]);
}
So, obviously the order in which he drinks from each bottle changes the Total hangover. He can drink the liquors in any order but not more than one drink from each bottle. Also once he starts drinking a liquor he will finish that drink before moving to some other liquor.
Steve is confused about the order in which he should drink so that the hangover is maximized. Help him find the maximum hangover he can have, if he can drink the liquors in any order.
Input Format:
The first line contains the number of test cases T. The first line of each test case contains N, denoting the number of bottles. The next line contains N space-separated integers denoting the alcohol quantity of each bottle.
2
7
83 133 410 637 665 744 986
4
1 5 9 11
I tried everything I could but wasn't able to achieve an O(n^2) solution. Simply calculating the total hangover over all permutations has O(n!) time complexity. Can this problem be solved more efficiently?
Thanks!
My hunch: use a sort of "greedy chaining algorithm" instead of DP.
1) find the pair with the greatest difference (O(n^2))
2) starting from either, find successively the next element with the greatest difference, forming a sort of "chain" (2 x O(n^2))
3) once you've done it for both you'll have two "sums". Return the largest one as your optimal answer.
This greedy strategy should work because the nature of the problem itself is greedy: choose the largest difference for the last bottle, because this has the largest index, so the result will always be larger than some "compromising" alternative (one that distributes smaller but roughly uniform differences to the indices).
Complexity: O(3n^2). Can prob. reduce it to O(3/2 n^2) if you use linked lists instead of a static array + boolean flag array.
Pseudo-ish code:
int hang_recurse(int* A, int N, int I, int K, bool* F)
{
    int sum = 0;
    for (int j = 2; j <= N; j++, I--)
    {
        // find the unused bottle with the greatest difference to bottle K
        int maxdiff = 0, maxidx = K;
        for (int i = 1; i <= N; i++)
        {
            if (F[i] == false)
            {
                int diff = abs(A[K] - A[i]);
                if (diff > maxdiff)
                {
                    maxdiff = diff;
                    maxidx = i;
                }
            }
        }
        K = maxidx;
        F[K] = true;
        sum += maxdiff * I;
    }
    return sum;
}
int hangover(int* A, int N)
{
    bool* F = new bool[N + 1]();   // 1-based "already used" flags, value-initialised to false
    int maxdiff = 0;
    int maxidx_i = 1, maxidx_j = 2;
    // find the pair with the greatest difference
    for (int j = 2; j <= N; j++)
    {
        for (int i = 1; i <= N; i++)
        {
            int diff = abs(A[j] - A[i]);
            if (diff > maxdiff)
            {
                maxdiff = diff;
                maxidx_i = i;
                maxidx_j = j;
            }
        }
    }
    F[maxidx_i] = F[maxidx_j] = true;
    int maxsum = max(hang_recurse(A, N, N - 1, maxidx_i, F),
                     hang_recurse(A, N, N - 1, maxidx_j, F));
    delete [] F;
    return maxdiff * N + maxsum;
}

Maximum subarray sum modulo M

Most of us are familiar with the maximum sum subarray problem. I came across a variant of this problem which asks the programmer to output the maximum of all subarray sums modulo some number M.
The naive approach to solve this variant would be to find all possible subarray sums (which would be of the order of N^2 where N is the size of the array). Of course, this is not good enough. The question is - how can we do better?
Example: Let us consider the following array:
6 6 11 15 12 1
Let M = 13. In this case, subarray 6 6 (or 12 or 6 6 11 15 or 11 15 12) will yield maximum sum ( = 12 ).
We can do this as follows:
Maintain an array sum where index i contains the prefix sum modulo M of the elements from 0 to i.
For each index i, we need to find the maximum subarray sum that ends at this index:
For each subarray (start + 1, i), we know that the mod sum of this subarray is
int a = (sum[i] - sum[start] + M) % M
So we can only achieve a sub-sum larger than sum[i] if sum[start] is larger than sum[i], and it should be as close to sum[i] as possible.
This can be done easily using a binary search tree.
Pseudo code:
int[] sum;
sum[0] = A[0];

Tree tree;
tree.add(sum[0]);

int result = sum[0];
for (int i = 1; i < n; i++) {
    sum[i] = sum[i - 1] + A[i];
    sum[i] %= M;
    int a = tree.getMinimumValueLargerThan(sum[i]);
    result = max((sum[i] - a + M) % M, result);
    tree.add(sum[i]);
}
print result;
Time complexity: O(n log n)
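To make the search step concrete, here is how it plays out on the example above (6 6 11 15 12 1, M = 13). The prefix sums modulo 13 are 6, 12, 10, 12, 11, 12. At i = 4 the prefix sum is 11 and the tree already contains {6, 10, 12}; the minimum stored value larger than 11 is 12, giving (11 - 12 + 13) % 13 = 12, which corresponds to the subarray 11 15 12 (or just 12) and is the maximum.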
Let A be our input array with zero-based indexing. We can reduce A modulo M without changing the result.
First of all, let's reduce the problem to a slightly easier one by computing an array P representing the prefix sums of A, modulo M:
A = 6 6 11 2 12 1
P = 6 12 10 12 11 12
Now let's process the possible left borders of our solution subarrays in decreasing order. This means that we will first determine the optimal solution that starts at index n - 1, then the one that starts at index n - 2 etc.
In our example, if we chose i = 3 as our left border, the possible subarray sums are represented by the suffix P[3..n-1] plus a constant a = A[i] - P[i]:
a = A[3] - P[3] = 2 - 12 = 3 (mod 13)
P + a = * * * 2 1 2
The global maximum is attained for one of these left borders. Since we can insert the suffix values from right to left as we go, we have now reduced the problem to the following:
Given a set of values S and integers x and M, find the maximum of S + x modulo M
This one is easy: Just use a balanced binary search tree to manage the elements of S. Given a query x, we want to find the largest value in S that is smaller than M - x (that is the case where no overflow occurs when adding x). If there is no such value, just use the largest value of S. Both can be done in O(log |S|) time.
Total runtime of this solution: O(n log n)
Here's some C++ code to compute the maximum sum. It would need some minor adaptions to also return the borders of the optimal subarray:
#include <bits/stdc++.h>
using namespace std;

int max_mod_sum(const vector<int>& A, int M) {
    vector<int> P(A.size());
    for (int i = 0; i < A.size(); ++i)
        P[i] = (A[i] + (i > 0 ? P[i-1] : 0)) % M;
    set<int> S;
    int res = 0;
    for (int i = A.size() - 1; i >= 0; --i) {
        S.insert(P[i]);
        int a = (A[i] - P[i] + M) % M;
        auto it = S.lower_bound(M - a);
        if (it != begin(S))
            res = max(res, *prev(it) + a);
        res = max(res, (*prev(end(S)) + a) % M);
    }
    return res;
}

int main() {
    // random testing to the rescue
    for (int i = 0; i < 1000; ++i) {
        int M = rand() % 1000 + 1, n = rand() % 1000 + 1;
        vector<int> A(n);
        for (int i = 0; i < n; ++i)
            A[i] = rand() % M;
        int should_be = 0;
        for (int i = 0; i < n; ++i) {
            int sum = 0;
            for (int j = i; j < n; ++j) {
                sum = (sum + A[j]) % M;
                should_be = max(should_be, sum);
            }
        }
        assert(should_be == max_mod_sum(A, M));
    }
}
For me, all the explanations here were hard to follow, since I didn't get the searching/sorting part; how we search/sort was unclear.
We all know that we need to build prefixSum, meaning the sum of all elements from 0 to i, modulo m.
I guess what we are looking for is clear.
Knowing that subarray[i][j] = (prefix[i] - prefix[j] + m) % m (the modulo sum from index i to j), the maximum for a given prefix[i] is always that prefix[j] which is as close as possible to prefix[i], but slightly bigger.
E.g. for m = 8 and prefix[i] = 5, we are looking for the next value after 5 that is in our prefix array.
For efficient search (binary search) we sort the prefixes.
What we cannot do is build the prefix sums first and then iterate again from 0 to n looking up indexes in the fully sorted prefix array, because we might find an endIndex which is smaller than our startIndex, which is no good.
Therefore, we iterate from 0 to n, treating the current index as the endIndex of our potential max subarray sum, and look in our sorted prefix array (which is empty at the beginning and only contains the prefixes before endIndex).
import bisect

def maximumSum(coll, m):
    n = len(coll)
    maxSum, prefixSum = 0, 0
    sortedPrefixes = []
    for endIndex in range(n):
        prefixSum = (prefixSum + coll[endIndex]) % m
        maxSum = max(maxSum, prefixSum)
        # smallest stored prefix strictly greater than the current one
        startIndex = bisect.bisect_right(sortedPrefixes, prefixSum)
        if startIndex < len(sortedPrefixes):
            maxSum = max(maxSum, prefixSum - sortedPrefixes[startIndex] + m)
        bisect.insort(sortedPrefixes, prefixSum)
    return maxSum
From your question, it seems that you have created an array to store the cumulative sums (Prefix Sum Array), and are calculating the sum of the sub-array arr[i:j] as (sum[j] - sum[i] + M) % M. (arr and sum denote the given array and the prefix sum array respectively)
Calculating the sum of every sub-array results in a O(n*n) algorithm.
The question that arises is -
Do we really need to consider the sum of every sub-array to reach the desired maximum?
No!
For a given j, the value (sum[j] - sum[i] + M) % M is maximized when sum[i] is just greater than sum[j]; the best case is a remainder of M - 1.
Searching for such a sum[i] in a sorted structure reduces the algorithm to O(n log n).
You can take a look at this explanation! https://www.youtube.com/watch?v=u_ft5jCDZXk
There are already a bunch of great solutions listed here, but I wanted to add one that has O(nlogn) runtime without using a balanced binary tree, which isn't in the Python standard library. This solution isn't my idea, but I had to think a bit as to why it worked. Here's the code, explanation below:
def maximumSum(a, m):
    prefixSums = [(0, -1)]
    for idx, el in enumerate(a):
        prefixSums.append(((prefixSums[-1][0] + el) % m, idx))
    prefixSums = sorted(prefixSums)
    maxSeen = prefixSums[-1][0]
    for (a, a_idx), (b, b_idx) in zip(prefixSums[:-1], prefixSums[1:]):
        if a_idx > b_idx and b > a:
            maxSeen = max((a - b) % m, maxSeen)
    return maxSeen
As with the other solutions, we first calculate the prefix sums, but this time we also keep track of the index of the prefix sum. We then sort the prefix sums, as we want to find the smallest difference between prefix sums modulo m - sorting lets us just look at adjacent elements as they have the smallest difference.
At this point you might think we're neglecting an essential part of the problem - we want the smallest difference between prefix sums, but the larger prefix sum needs to appear before the smaller prefix sum (meaning it has a smaller index). In the solutions using trees, we ensure that by adding prefix sums one by one and recalculating the best solution.
However, it turns out that we can look at adjacent elements and just ignore ones that don't satisfy our index requirement. This confused me for some time, but the key realization is that the optimal solution will always come from two adjacent elements. I'll prove this via a contradiction. Let's say that the optimal solution comes from two non-adjacent prefix sums x and z, with indices k and i respectively, where z > x (it's sorted!) and k > i:
x ... z
k ... i
Let's consider one of the numbers between x and z, and let's call it y with index j. Since the list is sorted, x < y < z.
x ... y ... z
k ... j ... i
The prefix sum y must have index j < i, otherwise it would be part of a better solution with z. But if j < i, then j < k and y and x form a better solution than z and x! So any elements between x and z must form a better solution with one of the two, which contradicts our original assumption. Therefore the optimal solution must come from adjacent prefix sums in the sorted list.
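For the example in the question, maximumSum([6, 6, 11, 15, 12, 1], 13) returns 12, matching the expected maximum.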
Here is Java code for maximum subarray sum modulo. We handle the case where we cannot find a least element in the tree strictly greater than s[i].
public static long maxModulo(long[] a, final long k) {
    long[] s = new long[a.length];
    TreeSet<Long> tree = new TreeSet<>();
    s[0] = a[0] % k;
    tree.add(s[0]);
    long result = s[0];
    for (int i = 1; i < a.length; i++) {
        s[i] = (s[i - 1] + a[i]) % k;
        // find least element in the tree strictly greater than s[i]
        Long v = tree.higher(s[i]);
        if (v == null) {
            // can't find v, then compare result and s[i]
            result = Math.max(s[i], result);
        } else {
            result = Math.max((s[i] - v + k) % k, result);
        }
        tree.add(s[i]);
    }
    return result;
}
Few points from my side that might hopefully help someone understand the problem better.
You do not need to add +M to the modulo calculation here: in Python, the % operator returns a non-negative result for a positive modulus, so a % M == (a + M) % M (note that this does not hold in languages like Java or C++, where % can be negative).
As mentioned, the trick is to build the proxy sum table such that
proxy[n] = (a[1] + ... + a[n]) % M
This then allows one to represent maxSubarraySum[i, j] as
maxSubarraySum[i, j] = (proxy[j] - proxy[i]) % M
The implementation trick is to build the proxy table as we iterate through the elements, instead of pre-building it first and then using it. This is because for each new element a[i] we want to compute proxy[i] and find a proxy[j] that is bigger than, but as close as possible to, proxy[i] (ideally bigger by 1, because this results in a remainder of M - 1). For this we need a clever data structure that keeps the proxy table sorted as we build it and
lets us quickly find the closest bigger element to proxy[i]. bisect.bisect_right is a good choice in Python.
See my Python implementation below (hope this helps but I am aware this might not necessarily be as concise as others' solutions):
import bisect

def maximumSum(a, m):
    prefix_sum = [a[0] % m]
    prefix_sum_sorted = [a[0] % m]
    current_max = prefix_sum_sorted[0]
    for elem in a[1:]:
        prefix_sum_next = (prefix_sum[-1] + elem) % m
        prefix_sum.append(prefix_sum_next)
        # index of the closest stored prefix strictly bigger than prefix_sum_next
        idx_closest_bigger = bisect.bisect_right(prefix_sum_sorted, prefix_sum_next)
        if idx_closest_bigger >= len(prefix_sum_sorted):
            current_max = max(current_max, prefix_sum_next)
            bisect.insort_right(prefix_sum_sorted, prefix_sum_next)
            continue
        if prefix_sum_sorted[idx_closest_bigger] > prefix_sum_next:
            current_max = max(current_max, (prefix_sum_next - prefix_sum_sorted[idx_closest_bigger]) % m)
        bisect.insort_right(prefix_sum_sorted, prefix_sum_next)
    return current_max
Complete Java implementation with O(n log n) runtime:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.TreeSet;
import java.util.stream.Stream;

public class MaximizeSumMod {

    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        Long times = Long.valueOf(in.readLine());
        while (times --> 0) {
            long[] pair = Stream.of(in.readLine().split(" ")).mapToLong(Long::parseLong).toArray();
            long mod = pair[1];
            long[] numbers = Stream.of(in.readLine().split(" ")).mapToLong(Long::parseLong).toArray();
            printMaxMod(numbers, mod);
        }
    }

    private static void printMaxMod(long[] numbers, Long mod) {
        Long maxSoFar = (numbers[numbers.length - 1] + numbers[numbers.length - 2]) % mod;
        maxSoFar = (maxSoFar > (numbers[0] % mod)) ? maxSoFar : numbers[0] % mod;
        numbers[0] %= mod;
        for (Long i = 1L; i < numbers.length; i++) {
            long currentNumber = numbers[i.intValue()] % mod;
            maxSoFar = maxSoFar > currentNumber ? maxSoFar : currentNumber;
            numbers[i.intValue()] = (currentNumber + numbers[i.intValue() - 1]) % mod;
            maxSoFar = maxSoFar > numbers[i.intValue()] ? maxSoFar : numbers[i.intValue()];
        }
        if (mod.equals(maxSoFar + 1) || numbers.length == 2) {
            System.out.println(maxSoFar);
            return;
        }
        long previousNumber = numbers[0];
        TreeSet<Long> set = new TreeSet<>();
        set.add(previousNumber);
        for (Long i = 2L; i < numbers.length; i++) {
            Long currentNumber = numbers[i.intValue()];
            Long ceiling = set.ceiling(currentNumber);
            if (ceiling == null) {
                set.add(numbers[i.intValue() - 1]);
                continue;
            }
            if (ceiling.equals(currentNumber)) {
                set.remove(ceiling);
                Long greaterCeiling = set.ceiling(currentNumber);
                if (greaterCeiling == null) {
                    set.add(ceiling);
                    set.add(numbers[i.intValue() - 1]);
                    continue;
                }
                set.add(ceiling);
                ceiling = greaterCeiling;
            }
            Long newMax = (currentNumber - ceiling + mod);
            maxSoFar = maxSoFar > newMax ? maxSoFar : newMax;
            set.add(numbers[i.intValue() - 1]);
        }
        System.out.println(maxSoFar);
    }
}
Adding STL C++11 code based on the solution suggested by #Pham Trung. Might be handy.
#include <iostream>
#include <set>

int main() {
    int N;
    std::cin >> N;
    for (int nn = 0; nn < N; nn++) {
        long long n, m;
        std::set<long long> mSet;
        long long maxVal = 0; // positive input values
        long long sumVal = 0;
        std::cin >> n >> m;
        mSet.insert(m);
        for (long long q = 0; q < n; q++) {
            long long tmp;
            std::cin >> tmp;
            sumVal = (sumVal + tmp) % m;
            auto itSub = mSet.upper_bound(sumVal);
            maxVal = std::max(maxVal, (m + sumVal - *itSub) % m);
            mSet.insert(sumVal);
        }
        std::cout << maxVal << "\n";
    }
}
As you can read on Wikipedia, there is a solution called Kadane's algorithm, which computes the maximum subarray sum by looking at the maximum subarray ending at position i, for all positions i, in a single pass over the array. This solves the problem with runtime complexity O(n).
Unfortunately, I think Kadane's algorithm isn't able to find all possible solutions when more than one solution exists.
An implementation in Java; I haven't tested it:
public int[] kadanesAlgorithm(int[] array) {
    int start_old = 0;
    int start = 0;
    int end = 0;
    int found_max = 0;
    int max = array[0];

    for (int i = 0; i < array.length; i++) {
        max = Math.max(array[i], max + array[i]);
        found_max = Math.max(found_max, max);
        if (max < 0)
            start = i + 1;
        else if (max == found_max) {
            start_old = start;
            end = i;
        }
    }
    return Arrays.copyOfRange(array, start_old, end + 1);
}
I feel my thoughts are aligned with what has been posted already, but just in case, a Kotlin O(N log N) solution:
fun maximumSum(a: LongArray, m: Long): Long {
    val seen = sortedSetOf(0L)
    var prev = 0L
    return a.map { x ->
        val z = (prev + x) % m
        prev = z
        seen.add(z)
        // smallest previously seen prefix strictly greater than z, if any
        seen.higher(z)?.let { y ->
            (z - y + m) % m
        } ?: z
    }.maxOrNull() ?: 0L
}
Implementation in Java using TreeSet:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.TreeSet;

public class Main {

    public static void main(String[] args) throws IOException {
        BufferedReader read = new BufferedReader(new InputStreamReader(System.in));
        String[] str = read.readLine().trim().split(" ");
        int n = Integer.parseInt(str[0]);
        long m = Long.parseLong(str[1]);

        str = read.readLine().trim().split(" ");
        long[] arr = new long[n];
        for (int i = 0; i < n; i++) {
            arr[i] = Long.parseLong(str[i]);
        }

        long maxCount = 0L;
        TreeSet<Long> tree = new TreeSet<>();
        tree.add(0L);
        long prefix = 0L;

        for (int i = 0; i < n; i++) {
            prefix = (prefix + arr[i]) % m;
            maxCount = Math.max(prefix, maxCount);
            Long temp = tree.higher(prefix);
            // System.out.println(temp); // debug output
            if (temp != null) {
                maxCount = Math.max((prefix - temp + m) % m, maxCount);
            }
            // System.out.println(maxCount);
            tree.add(prefix);
        }
        System.out.println(maxCount);
    }
}
Here is an implementation in Java for this problem, using a TreeSet for an optimized solution:
public static long maximumSum2(long[] arr, long n, long m)
{
    long prefix = 0;
    long maxim = 0;

    TreeSet<Long> S = new TreeSet<Long>();
    S.add((long) 0);

    // Traversing the array.
    for (int i = 0; i < n; i++)
    {
        // Finding prefix sum.
        prefix = (prefix + arr[i]) % m;

        // Finding maximum of prefix sum.
        maxim = Math.max(maxim, prefix);

        // Finding the first element that is not less than
        // "prefix + 1", i.e., strictly greater than prefix.
        long it = S.higher(prefix) != null ? S.higher(prefix) : 0;
        // boolean isFound = false;
        // for (long j : S)
        // {
        //     if (j >= prefix + 1)
        //         if (isFound == false) {
        //             it = j;
        //             isFound = true;
        //         }
        //         else {
        //             if (j < it) {
        //                 it = j;
        //             }
        //         }
        // }
        if (it != 0)
        {
            maxim = Math.max(maxim, prefix - it + m);
        }

        // Adding prefix to the set.
        S.add(prefix);
    }
    return maxim;
}
public static int MaxSequence(int[] arr)
{
    int maxSum = 0;
    int partialSum = 0;
    int negative = 0;

    for (int i = 0; i < arr.Length; i++)
    {
        if (arr[i] < 0)
        {
            negative++;
        }
    }
    if (negative == arr.Length)
    {
        return 0;
    }

    foreach (int item in arr)
    {
        partialSum += item;
        maxSum = Math.Max(maxSum, partialSum);
        if (partialSum < 0)
        {
            partialSum = 0;
        }
    }
    return maxSum;
}
Modify Kadane's algorithm to keep track of the number of occurrences. Below is the code.
#python3
#source: https://github.com/harishvc/challenges/blob/master/dp-largest-sum-sublist-modulo.py
#Time complexity: O(n)
#Space complexity: O(n)
def maxContiguousSum(a, K):
    sum_so_far = 0
    max_sum = 0
    count = {}  # keep track of occurrence
    for i in range(0, len(a)):
        sum_so_far += a[i]
        sum_so_far = sum_so_far % K
        if sum_so_far > 0:
            max_sum = max(max_sum, sum_so_far)
            if sum_so_far in count.keys():
                count[sum_so_far] += 1
            else:
                count[sum_so_far] = 1
        else:
            assert sum_so_far == 0, "Logic error"  # after % K the sum cannot be negative
            #IMPORTANT: reset sum_so_far
            sum_so_far = 0
    return max_sum, count[max_sum]

a = [6, 6, 11, 15, 12, 1]
K = 13
max_sum, count = maxContiguousSum(a, K)
print("input >>> %s max sum=%d #occurrence=%d" % (a, max_sum, count))

Longest positive sum substring

I was wondering how I could get the longest positive-sum subsequence in a sequence:
For example, I have -6 3 -4 4 -5, so the longest positive subsequence is 3 -4 4. In fact the sum is positive (3), and we couldn't add -6 nor -5 without it becoming negative.
It is easily solvable in O(N^2), but I think something much faster could exist, like O(N log N).
Do you have any idea?
EDIT: the order must be preserved, and you can skip any number from the substring
EDIT2: I'm sorry if I caused confusion by using the term "subsequence"; as @beaker pointed out, I meant substring
An O(n) space and time solution; I will start with the code (sorry, Java ;-) and try to explain it later:
public static int[] longestSubarray(int[] inp) {
    // array containing prefix sums up to a certain index i
    int[] p = new int[inp.length];
    p[0] = inp[0];
    for (int i = 1; i < inp.length; i++) {
        p[i] = p[i - 1] + inp[i];
    }
    // array Q from the description below
    int[] q = new int[inp.length];
    q[inp.length - 1] = p[inp.length - 1];
    for (int i = inp.length - 2; i >= 0; i--) {
        q[i] = Math.max(q[i + 1], p[i]);
    }

    int a = 0;
    int b = 0;
    int maxLen = 0;
    int curr;
    int[] res = new int[] {-1, -1};

    while (b < inp.length) {
        curr = a > 0 ? q[b] - p[a - 1] : q[b];
        if (curr >= 0) {
            if (b - a > maxLen) {
                maxLen = b - a;
                res = new int[] {a, b};
            }
            b++;
        } else {
            a++;
        }
    }
    return res;
}
we are operating on input array A of size n
Let's define array P as the array containing the prefix sum up to index i, so P[i] = sum(0, i) where i = 0, 1, ..., n-1
let's notice that if u < v and P[u] <= P[v] then u will never be our ending point
because of the above we can define an array Q which has Q[n-1] = P[n-1] and Q[i] = max(P[i], Q[i+1])
now let's consider M_{a,b} which shows us the maximum sum subarray starting at a and ending at b or beyond. We know that M_{0,b} = Q[b] and that M_{a,b} = Q[b] - P[a-1]
with the above information we can now initialise our a, b = 0 and start moving them. If the current value of M is bigger or equal to 0 then we know we will find (or already found) a subarray with sum >= 0, we then just need to compare b-a with the previously found length. Otherwise there's no subarray that starts at a and adheres to our constraints so we need to increment a.
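For the example from the question (-6 3 -4 4 -5) this returns {1, 3}: the subarray 3 -4 4, whose inclusive bounds give length res[1] - res[0] + 1 = 3.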
Let's make a naive implementation and then improve it.
We move from left to right calculating partial sums, and for each position we find the left-most earlier partial sum such that the current partial sum is greater than it.
input a
int partialSums[len(a)]
for i in range(len(a)):
    partialSums[i] = (i == 0 ? 0 : partialSums[i - 1]) + a[i]
    if partialSums[i] > 0:
        answer = max(answer, i + 1)
    else:
        for j in range(i):
            if partialSums[i] - partialSums[j] > 0:
                answer = max(answer, i - j)
                break
This is O(n²). Now the part of finding the left-most "good" sum can actually be maintained via a BST, where each node is represented as a pair (partial sum, index) compared by partial sum. Also, each node should support a special field min that holds the minimum of the indices in its subtree.
Now instead of the straightforward search of an appropriate partial sum we could descend the BST using the current partial sum as a key following the next three rules (assuming C is the current node, L and R are the roots of the left and the right subtrees respectively):
Maintain the current minimal index of "good" partial sums found in curMin, initially +∞.
If C.partial_sum is "good" then update curMin with C.index.
If we go to R then update curMin with L.min.
And then update the answer with i - curMin, also add the current partial sum to the BST.
That would give us O(n * log n).
We can easily have an O(n log n) solution for the longest subsequence.
First, sort the array, remembering the original indexes.
Pick the largest numbers one by one, stopping when adding the next would make the sum non-positive, and you have your answer.
Recover their original order.
Pseudo code
sort(data);
int length = 0;
long sum = 0;
boolean[] result = new boolean[n];
for (int i = n; i >= 1; i--) {
    if (sum + data[i] <= 0)
        break;
    sum += data[i];
    result[data[i].index] = true;
    length++;
}
for (int i = 1; i <= n; i++)
    if (result[i])
        print i;
So, rather than waiting, I will propose an O(n log n) solution for the longest positive substring.
First, we create an array prefix which is the prefix sum of the array.
Second, we use binary search to look for the longest length that has a positive sum.
Pseudocode
int[] prefix = new int[n];
for (int i = 1; i <= n; i++) {
    prefix[i] = data[i];
    if (i - 1 >= 1)
        prefix[i] += prefix[i - 1];
}

int min = 0;
int max = n;
int result = 0;
while (min <= max) {
    int mid = (min + max) / 2;
    boolean ok = false;
    for (int i = 1; i <= n; i++) {
        // sum of the segment with length mid ending at index i
        if (i > mid && prefix[i] - prefix[i - mid] > 0) {
            ok = true;
            break;
        }
    }
    if (ok) {
        result = max(result, mid);
        min = mid + 1;
    } else {
        max = mid - 1;
    }
}
OK, so the above algorithm is wrong, as pointed out by piotrekg2. What we need to do is:
create an array prefix which is the prefix sum of the array.
Sort the prefix array, and remember the original index of each prefix.
Iterate through the sorted prefix array, storing the minimum index seen so far; the maximum difference between the current index and that minimum is the answer (a runnable sketch follows the pseudo code below).
Note: when comparing values in prefix, if two entries have equal values, the one with the smaller index is considered larger; this avoids counting subarrays whose sum is 0.
Pseudo code:
class Node {
    int val, index;
}

Node[] prefix = new Node[n];
for (int i = 1; i <= n; i++) {
    prefix[i] = new Node(data[i], i);
    if (i - 1 >= 1)
        prefix[i].val += prefix[i - 1].val;
}

sort(prefix);

int min = prefix[1].index;
int result = 0;
for (int i = 2; i <= n; i++) {
    if (prefix[i].index > min)
        result = max(prefix[i].index - min, result);
    min = min(min, prefix[i].index);
}
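If it helps, here is a runnable Java sketch of that corrected idea (my own illustration; the method name longestPositiveSum and the extra leading 0 prefix entry are additions, not part of the pseudo code above):

import java.util.Arrays;

public class LongestPositiveSum {

    // Longest substring (contiguous) with strictly positive sum in O(n log n),
    // following the sorted-prefix idea above.
    static int longestPositiveSum(int[] a) {
        int n = a.length;
        long[] prefix = new long[n + 1];   // prefix[k] = a[0] + ... + a[k-1], prefix[0] = 0
        for (int k = 0; k < n; k++) {
            prefix[k + 1] = prefix[k] + a[k];
        }

        Integer[] order = new Integer[n + 1];
        for (int k = 0; k <= n; k++) order[k] = k;
        // sort prefix indices by value; on ties the larger index comes first,
        // so two equal prefixes (a zero-sum subarray) are never paired up
        Arrays.sort(order, (x, y) -> prefix[x] != prefix[y]
                ? Long.compare(prefix[x], prefix[y])
                : Integer.compare(y, x));

        int best = 0;
        int minIndex = Integer.MAX_VALUE;  // smallest original index seen so far
        for (int k : order) {
            if (k > minIndex) best = Math.max(best, k - minIndex);
            minIndex = Math.min(minIndex, k);
        }
        return best;                       // 0 if no positive-sum substring exists
    }

    public static void main(String[] args) {
        System.out.println(longestPositiveSum(new int[]{-6, 3, -4, 4, -5})); // prints 3
    }
}

For the example -6 3 -4 4 -5 it prints 3, the length of 3 -4 4.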

How to calculate Time Complexity for a given algorithm

i, j, N, and sum are all integers. N is the input.
( Code1 )
i = N;
while (i > 1)
{
    i = i / 2;
    for (j = 0; j < 1000000; j++)
    {
        sum = sum + j;
    }
}
( Code2 )
sum = 0;
d = 1;
d = d << (N - 1);
for (i = 0; i < d; i++)
{
    for (j = 0; j < 1000000; j++)
    {
        sum = sum + i;
    }
}
How do I calculate the step count and time complexity for Code1 and Code2?
To calculate the time complexity, try to understand what takes how much time, and in terms of which variable you are measuring.
If we say an addition ("+") takes O(1) steps, then we can check how many times it is done in terms of N.
The first code divides i by 2 in each step, meaning it does log(N) iterations. So the time complexity is
O(log(N) * 1000000) = O(log(N))
The second code goes from 0 to 2 to the power of N-1, so the complexity is:
O(2^(N-1) * 1000000) = O(2^(N-1))
But this is just theory, because d is capped at 2^32 / 2^64 or some other fixed width, so in practice it might not grow as O(2^(N-1)).
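As a quick sanity check (my own example, not part of the answer above): for N = 8, Code1's while loop runs 3 times (i becomes 4, 2, 1), which is floor(log2 N), each time doing 1,000,000 additions; Code2 computes d = 1 << 7 = 128, so its outer loop runs 128 times, again with 1,000,000 additions each.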
