Finding best algorithm for sum of a section of an array's values - algorithm

Given an array of n integers in the locations A[1], A[2], …, A[n], describe an O(n^2) time algorithm to
compute the sum A[i] + A[i+1] + … + A[j] for all i, j, 1 ≤ i < j ≤ n.
I've tried multiple ways of solving this problem, but none of them run in O(n^2) time.
So for an array containing {1,2,3,4}
You would output:
1+2 = 3
1+2+3 = 6
1+2+3+4 = 10
2+3 = 5
2+3+4 = 9
3+4 = 7
The answer does not need to be in a specific language, pseudocode is preferred.

Good preparation is everything.
You could create an array of integrals (prefix sums):
I[0..n] = (0, I[0] + A[1], I[1] + A[2], ..., I[n-1]+A[n]);
This will cost you O(n) * O(1) (looping over all elements and doing one addition);
Now you can calculate each Sum(A, i, j) with just a single subtraction: I[j] - I[i-1];
so this has O(1)
Looping over all combinations of i and j with 1 <= (i,j) <= n has O(n^2).
So you end up with O(n) * O(1) + O(n^2) * O(1) = O(n^2) .
Edit:
Your array A starts at 1 - adapted to this - this also solves the little quirk with i-1
So the integral array I starts with index 0 and is 1 element larger than A
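As a minimal sketch of the idea above, here is the integral array in Python. The function name `all_range_sums` is made up for illustration, and the code uses 0-based indexing (so `prefix[0]` plays the role of I[0]) and only emits pairs with i < j, as in the question:

```python
def all_range_sums(a):
    """Return the sums a[i] + ... + a[j] for all i < j (0-based, inclusive)."""
    n = len(a)
    # prefix[k] holds a[0] + ... + a[k-1]; prefix[0] = 0 corresponds to I[0].
    prefix = [0] * (n + 1)
    for k in range(n):
        prefix[k + 1] = prefix[k] + a[k]
    # One subtraction per (i, j) pair, so O(n^2) overall.
    return [prefix[j + 1] - prefix[i]
            for i in range(n)
            for j in range(i + 1, n)]


print(all_range_sums([1, 2, 3, 4]))  # [3, 6, 10, 5, 9, 7]
```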

You may first have thought of the most naive idea:
Naive idea
Create a function that for given values of i and of j will return the sum A[i] + ... + A[j].
function sumRange(A, i, j):
    sum = 0
    for k = i to j
        sum = sum + A[k]
    return sum
Then generate all pairs of i and j (with i < j) and call the above function for each pair:
for i = 1 to n
    for j = i+1 to n
        output sumRange(A, i, j)
This is not O(n²), because already the two loops on i and j represent O(n²) iterations, and then the function will perform yet another loop, making it O(n³).
Better idea
The above can be improved. Look at the repetition it performs: the sum calculated for given values of i and j can be reused when j increases by 1. There is no need to start from scratch and sum the values between i and (now) j-1 again, only to add one more value to it.
We should just remember what the previous sum was, and add A[j] to it.
So without a separate function:
for i = 1 to n
    sum = A[i]
    for j = i+1 to n
        sum = sum + A[j]
        output sum
Note how the sum is not reset to 0 once it is output. It is preserved, so that when j is incremented, only one value needs to be added to it.
Now it is O(n²). Note also how it does not require an extra array for storage. It only needs the memory for a few variables (i, j, sum), so its space complexity is O(1).
As the number of sums you need to output is O(n²), there is no way to improve this time complexity any further.
NB: I assume here that single array values do not constitute a "sum". As you stated in your question, i < j, and also in your example you only showed sums of at least two array values. The above can be easily adapted to also include single value "sums" if ever that were needed.
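For readers who want something runnable, the running-sum loop could look roughly like this in Python. The generator name `running_range_sums` is hypothetical, and the indexing is 0-based rather than the 1-based indexing of the pseudocode:

```python
def running_range_sums(a):
    """Yield a[i] + ... + a[j] for every i < j, reusing the previous sum."""
    for i in range(len(a)):
        total = a[i]              # start of a new row: just A[i]
        for j in range(i + 1, len(a)):
            total += a[j]         # extend the previous sum by one value
            yield total


print(list(running_range_sums([1, 2, 3, 4])))  # [3, 6, 10, 5, 9, 7]
```

Because it yields the sums one at a time, this keeps the O(1) extra space mentioned above.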

Related

Efficiently sum max(Ai+Bj, Bi+Aj) over all i, j

You are given two integer arrays A and B of length N. You have to find the value of the double summation:
Z=Σ Σ max(Ai+Bj, Bi+Aj)
Here is my brute force algorithm
for loop (i to length)
    for loop (j to length)
        sum+=Math.max(A[i]+B[j], A[j]+B[i]);
Please tell me a better efficient algorithm for this.
Rewrite the sum as Z = Σi Σj [max(Ai−Bi, Aj−Bj) + Bi + Bj] by using the distributive property of plus over max. Then construct C = A−B, sort it, and return Σi (2i+1)Ci + 2n Σi Bi (using zero-based indexing).
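A small Python sketch of this rewrite, checked against the brute force. It assumes the sum runs over all ordered pairs (i, j), including i = j, as in the question's loops; the function names are invented for illustration. After sorting C, the element at zero-based index i is the maximum of exactly 2i+1 ordered pairs, which is where the (2i+1) factor comes from.

```python
def fast_pair_max_sum(A, B):
    """O(n log n): sum of max(A[i]+B[j], B[i]+A[j]) over all ordered pairs."""
    n = len(A)
    C = sorted(a - b for a, b in zip(A, B))
    # Sorted C[i] is the larger value in 2*i + 1 ordered pairs (zero-based i).
    return sum((2 * i + 1) * c for i, c in enumerate(C)) + 2 * n * sum(B)


def brute_pair_max_sum(A, B):
    """O(n^2) reference implementation of the same double sum."""
    n = len(A)
    return sum(max(A[i] + B[j], B[i] + A[j])
               for i in range(n) for j in range(n))


A, B = [3, 1, 4], [2, 7, 1]
print(fast_pair_max_sum(A, B), brute_pair_max_sum(A, B))  # 72 72
```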
A minor improvement I can think of is to skip the results you have already computed. That is, instead of starting the inner loop from 0, you can start it at j = i, since the results for j < i were already computed in earlier iterations.
To achieve this, you can change the instruction in the inner loop to the following:
if i != j
    sum += 2 * Math.max(A[i]+B[j], A[j]+B[i]);
else
    sum += Math.max(A[i]+B[j], A[j]+B[i]);
The reason is that every pair (i, j) with i != j is visited twice by the original loops.

Find scalar interval containing maximum elements from population A and zero elements from population B

Given two large sets A and B of scalar (floating point) values, what algorithm would you use to find the (scalar) range [x0,x1] containing zero elements from B and the maximum number of elements from A?
Is sorting complexity (O(n log n)) unavoidable?
Create a single list with all values, where each value is marked with two counts: one count that relates to set A, and another that relates to set B. Initially these counts are 1 and 0, when the value comes from set A, and 0 and 1 when the value comes from set B. So entries in this list could be tuples (value, countA, countB). This operation is O(n).
Sort these tuples. O(nlogn)
Merge tuples with duplicate values into one tuple, and accumulate the values in the corresponding counters, so that the tuple tells us how many times the value occurs in set A and how many times in set B. O(n)
Traverse this list in sorted order and maintain the largest sum of counts for countA of a series of adjacent tuples where countB is always 0, and the minimum and maximum value of that range. O(n)
The sorting is the determining factor of the time complexity: O(nlogn).
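The steps above might be sketched in Python as follows. The function name `best_interval` and the return shape `(count, x0, x1)` are assumptions made for this example; merging duplicates is done with `itertools.groupby` on the sorted tuples.

```python
from itertools import groupby


def best_interval(A, B):
    """Return (count, x0, x1): the most A values in a range with no B value."""
    # Tag each value with (countA, countB) and sort: O(n log n).
    tagged = sorted([(x, 1, 0) for x in A] + [(x, 0, 1) for x in B])
    best = (0, None, None)
    run_count, run_lo = 0, None
    # groupby merges tuples that share the same value.
    for value, group in groupby(tagged, key=lambda t: t[0]):
        group = list(group)
        count_a = sum(t[1] for t in group)
        count_b = sum(t[2] for t in group)
        if count_b > 0:
            run_count, run_lo = 0, None   # a B value ends the current run
            continue
        if run_lo is None:
            run_lo = value
        run_count += count_a
        if run_count > best[0]:
            best = (run_count, run_lo, value)
    return best


print(best_interval([1.0, 2.0, 3.0, 6.0, 7.0, 8.0, 9.0], [5.0, 2.5]))
# (4, 6.0, 9.0)
```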
Sort both A and B in O(|A| log |A| + |B| log |B|). Then apply the following algorithm, which has complexity O(|A| + |B|):
i = j = k = 0
best_interval = (0, 1)
while i < len(B) - 1:
    lo = B[i]
    hi = B[i+1]
    j = k # We can skip ahead from last iteration.
    while j < len(A) and A[j] <= lo:
        j += 1
    k = j # We can skip ahead from the above loop.
    while k < len(A) and A[k] < hi:
        k += 1
    if k - j > best_interval[1] - best_interval[0]:
        best_interval = (j, k)
    i += 1
x0 = A[best_interval[0]]
x1 = A[best_interval[1]-1]
It may look quadratic at first glance, but note that we never decrease j and k, so it really is just a linear scan with three pointers.
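One thing to watch for: as written, the scan only considers gaps between two consecutive B values, so elements of A below min(B) or above max(B) are never counted. A possible variant, not part of the answer above, pads B with infinite sentinels. The function name `best_interval_sorted` and the `None` return for "no valid interval" are assumptions of this sketch:

```python
import math


def best_interval_sorted(A, B):
    """Three-pointer scan; returns (x0, x1) or None if no A value avoids B."""
    A = sorted(A)
    # Sentinels (an addition to the original scan) so that the gaps below
    # min(B) and above max(B) are examined as well.
    B = [-math.inf] + sorted(B) + [math.inf]
    j = k = 0
    best = (0, 0)  # half-open index range into A, initially empty
    for i in range(len(B) - 1):
        lo, hi = B[i], B[i + 1]
        j = k                      # skip ahead from the last iteration
        while j < len(A) and A[j] <= lo:
            j += 1
        k = j                      # skip ahead from the loop above
        while k < len(A) and A[k] < hi:
            k += 1
        if k - j > best[1] - best[0]:
            best = (j, k)
    if best[0] == best[1]:
        return None
    return A[best[0]], A[best[1] - 1]


print(best_interval_sorted([1, 2, 3, 6, 7, 8, 9], [5, 2.5]))  # (6, 9)
```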

Get highest score in this game: choosing and removing elements in an array

Given an array arr of n integers, what is the highest score that a player can reach, playing the following game?
Choose an index 0 < i < n-1 in the array
Add arr[i-1] * arr[i+1] points to the score (initially the score is 0)
Shrink the array by removing element i (for all j >= i: arr[j] = arr[j+1]; then n = n - 1)
Repeat steps 1-3 until n == 2.
Do the above until there are only 2 elements (which are the first and the last element because you can't remove them).
What is the highest score you can get ?
Example
arr = [1 2 3 4]
Choose i=2, get: 2*4 = 8 points, remove 3
Remaining: arr = [1 2 4]
Choose i=1, get 1*4 = 4 points, remove 2
Remaining: arr = [1 4].
The sum of points is 8 + 4 = 12, which is the highest possible score on this example.
I think it is related to Dynamic programming but I'm not sure how to solve it.
This problem has a dynamic programming approach similar to Matrix-chain multiplication problem. You can find further explanation in the book "Introduction to Algorithms", 3rd Edition (Cormen, page 370).
Let's find the optimal substructure property and then use it to construct an optimal solution to the problem from optimal solutions to subproblems.
Notation: Ci..j, where i ≤ j, stands for elements Ci,Ci+1,...,Cj.
Definition: A removal sequence for Ci..j is a permutation of i+1,i+2,...,j-1.
A removal sequence for Ci..j is optimal if the score achieved by removing the elements of Ci..j in that order is maximum among all possible removal sequences for Ci..j.
1. Characterize the structure of an optimal solution
If the problem is nontrivial, i.e. i + 1 < j, then any solution has a last removed element, whose index k lies in the range i < k < j. Such a k splits the problem into Ci..k and Ck..j. That is, for some value k, we first remove the non-extremal elements of Ci..k and Ck..j, and then we remove element k. Removing the non-extremal elements of Ci..k does not affect the score obtained by removing the non-extremal elements of Ck..j, and vice versa, so the two subproblems are independent. Then, for a given removal sequence in which the kth element is removed last, the score of Ci..j equals the sum of the scores of Ci..k and Ck..j, plus the score of removing the kth element (C[i] * C[j], since at that point its only remaining neighbours are C[i] and C[j]).
The optimal substructure of this problem is as follows. Suppose there is an optimal removal sequence O for Ci..j that ends at kth-element, then the ordering of removed elements from Ci..k must be optimal too. We can prove it by contradiction: If there was a removal sequence for Ci..k that scored higher than removal subsequence extracted from O for Ci..k then we can produce another removal sequence for Ci..j with higher score than optimal removal sequence (contradiction). A similar observation holds for the ordering of removed elements from Ck..j in the optimal removal sequence for Ci..j: it must be optimal too.
We can build an optimal solution for nontrivial instances of the problem by splitting the problem into two subproblems, finding optimal solutions to the subproblem instances, and then combining these optimal subproblem solutions.
2. Recursively define the value of an optimal solution.
For this problem our subproblems are the maximum score obtained in Ci..j for 1 ≤ i ≤ j ≤ N. Let S[i, j] be the maximum score obtained in Ci..j; for the full problem, the highest score when evaluating the given rules is S[1, N].
We can define S[i, j] recursively as follows:
If j ≤ i + 1 then S[i, j] = 0
If i + 1 < j then S[i, j] = max_{i < k < j} {S[i, k] + S[k, j] + C[i] * C[j]}
We ensure that we search for the correct place to split because we consider all possible places, so that we are sure of having examined the optimal one.
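The recurrence can be written almost verbatim as a memoized (top-down) function. This is only a sketch: the name `max_score` is invented here, indices are 0-based (so the full problem is S(0, n-1) rather than S[1, N]), and Python's recursion limit makes it suitable for small inputs only.

```python
from functools import lru_cache


def max_score(arr):
    """Highest score for the removal game, via the recurrence for S[i, j]."""
    n = len(arr)

    @lru_cache(maxsize=None)
    def S(i, j):
        # Nothing can be removed between adjacent (or identical) endpoints.
        if j <= i + 1:
            return 0
        # Try every k as the last removed element between i and j.
        return max(S(i, k) + S(k, j) + arr[i] * arr[j]
                   for k in range(i + 1, j))

    return S(0, n - 1)


print(max_score([1, 2, 3, 4]))  # 12, matching the example in the question
```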
3. Compute the value of an optimal solution
You can use your favorite method to compute S:
top-down approach (recursive)
bottom-up approach (iterative)
I would use bottom-up for computing the solution since it would be < 5 lines long in almost any programming language.
Example in C++11:
// S is an (N+1) x (N+1) table initialised to 0; C is 1-indexed.
for (int l = 2; l <= N; ++l) // increasing length intervals
    for (int i = 1, j = i + l; j <= N; ++i, ++j)
        for (int k = i + 1; k < j; ++k)
            S[i][j] = max(S[i][j], S[i][k] + S[k][j] + C[i] * C[j]);
4. Time Complexity and Space Complexity
There are nC2 + n = Θ(n^2) subproblems, and each subproblem performs an operation whose running time is Θ(l), where l is the length of the subproblem, so the total running time of the algorithm is Θ(n^3) (the O(n^3) part is easy to spot :-)). The algorithm also requires Θ(n^2) space to store the S table.

Maximum sum of all contiguous subarrays of prime length

I was recently asked this question in an interview,
Given an array of non-negative integers find the
maximum cumulative sum that could be obtained such that the length of all the
participating subarray is a prime number. I tried to come up with a solution for this using Dynamic Programming but unfortunately could not.
Eg: If the array is [9,8,7,6,5,4,3,1,2,2] it should return 46 (sum of the subarray [9,8,7,6,5,4,3] of length 7 and [2,2] of length 2). You cannot combine [9,8,7,6,5,4,3] and [1,2,2] since it would result in a contiguous subarray (idempotency) of length 10 which is non prime.
Can anyone explain how to solve such problems using DP? Thanks.
What you can do:
take the length of the list and go back until you find a prime number
get a window of elements and sum them
check if it's the maximum sum and in case it is, store it
go to the next window
This works because all integers are non-negative; it would not work otherwise.
Basically something like this (a rough Python sketch):
def is_prime(n):
    # Trial division; fine for list lengths.
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

input_list = (8, 1, 3, 4, 5, 2)
n = len(input_list)
# Go back from the list length until we find a prime number.
window_size = n
while window_size > 1 and not is_prime(window_size):
    window_size -= 1
max_sum = -1
if window_size >= 2:
    # Slide a window of that size over the list and keep the best sum.
    for i in range(n - window_size + 1):
        current_sum = sum(input_list[i:i + window_size])
        if max_sum < current_sum:
            max_sum = current_sum
print(max_sum)
How about something like this (approximately) O(n * n / log n) time, O(n) space, DP?
Let f(i) represent the greatest sum up to index i where a[i] is either excluded from a contiguous subset or is the last of a subset of prime length. Then:
f(i) = sum(a[0]...a[i]) if (i + 1) is prime, otherwise
max(
// a[i] excluded
f(i-1),
f(i-2),
// a[i] is last of a subset
sum(a[i - primes[j] + 1]...a[i]) + f(i - primes[j] - 1)
for primes[j] <= i
)
(Summing the intervals can be done in O(1) time with O(n) preprocessing of prefix-sums.)
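One way the recurrence might look in Python, under a few assumptions: the helper names (`primes_up_to`, `max_prime_run_sum`) are made up for this sketch, the table is shifted by one so that `f[i]` covers the first i elements, and the array is assumed non-negative as in the question. The redundant f(i-2) term is dropped because f never decreases.

```python
def primes_up_to(n):
    """Return all primes <= n with a simple sieve."""
    flags = [True] * (n + 1)
    primes = []
    for p in range(2, n + 1):
        if flags[p]:
            primes.append(p)
            for m in range(p * p, n + 1, p):
                flags[m] = False
    return primes


def max_prime_run_sum(a):
    """Best total when every chosen contiguous run has prime length."""
    n = len(a)
    prefix = [0] * (n + 1)            # prefix[i] = a[0] + ... + a[i-1]
    for i, x in enumerate(a):
        prefix[i + 1] = prefix[i] + x
    primes = primes_up_to(n)
    f = [0] * (n + 1)                 # f[i]: best total within a[0..i-1]
    for i in range(1, n + 1):
        best = f[i - 1]               # a[i-1] is excluded
        for p in primes:
            if p > i:
                break
            start = i - p             # run covers a[start..i-1]
            # a[start-1] must stay unused, so look one further back.
            before = f[start - 1] if start >= 1 else 0
            best = max(best, prefix[i] - prefix[start] + before)
        f[i] = best
    return f[n]


print(max_prime_run_sum([9, 8, 7, 6, 5, 4, 3, 1, 2, 2]))  # 46
```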
Others have already solved the problem for non-negative integers.
If the array also contains negative numbers, the following algorithm will still work.
I think you have to slightly tweak Kadane's algorithm.
The changes for the modified Kadane's algorithm are below; all lines marked ** are changes.
Initialize:
    max_so_far = 0
    max_ending_here = 0
    ** MAX_SO_FAR_FOR_PRIMES = 0
    ** SUB_ARRAY_SIZE = 0
Loop for each element of the array
    max_ending_here = max_ending_here + a[i]
    ** SUB_ARRAY_SIZE = SUB_ARRAY_SIZE + 1 // since a[i] included in subarray, increase sub_array_size
    if(max_ending_here < 0)
        max_ending_here = 0
        ** SUB_ARRAY_SIZE = 0
    if(max_so_far < max_ending_here)
        max_so_far = max_ending_here
    ** if(MAX_SO_FAR_FOR_PRIMES < max_ending_here && isPrime(SUB_ARRAY_SIZE)) // comparing when SUB_ARRAY_SIZE is Prime.
        ** MAX_SO_FAR_FOR_PRIMES = max_ending_here
return MAX_SO_FAR_FOR_PRIMES
Basically, take one more variable MAX_SO_FAR_FOR_PRIMES, which keeps the maximum sum subarray for prime sized subarray so far.
Also store the SUB_ARRAY_SIZE, which stores the size of the sub-array anytime during looping.
Now just compare your MAX_SO_FAR_FOR_PRIMES with your max_ending_here whenever SUB_ARRAY_SIZE is prime, and update MAX_SO_FAR_FOR_PRIMES accordingly.

Find largest continuous sum such that the minimum of it and its complement is largest

I'm given a sequence of numbers a_1,a_2,...,a_n. Its sum is S=a_1+a_2+...+a_n and I need to find a subsequence a_i,...,a_j such that min(S-(a_i+...+a_j),a_i+...+a_j) is the largest possible (both sums must be non-empty).
Example:
1,2,3,4,5 the sequence is 3,4, because then min(S-(a_i+...+a_j),a_i+...+a_j)=min(8,7)=7 (and it's the largest possible which can be checked for other subsequences).
I tried to do this the hard way.
I load all values into the array tab[n].
I do this n-1 times: tab[i] += tab[i-1], so that tab[j] is the sum from the beginning up to j.
I check all possible sums a_i+...+a_j = tab[j]-tab[i-1], subtract each from the total sum, take the minimum, and see if it's larger than the best so far.
It takes O(n^2). This makes me very sad and miserable. Is there a better way?
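For reference, the O(n^2) approach described above could be sketched like this in Python. The function name `best_split_bruteforce` and the `(score, i, j)` return shape are assumptions for this example, and indices are 0-based. It can serve as a baseline for checking faster solutions on small inputs.

```python
def best_split_bruteforce(a):
    """Return (score, i, j) maximising min(sum(a[i..j]), S - sum(a[i..j]))."""
    n = len(a)
    tab = [0] * (n + 1)               # tab[k] = a[0] + ... + a[k-1]
    for idx, x in enumerate(a):
        tab[idx + 1] = tab[idx] + x
    total = tab[n]
    best = None
    for i in range(n):
        for j in range(i, n):
            if j - i + 1 == n:
                continue              # the complement must be non-empty
            inner = tab[j + 1] - tab[i]
            score = min(inner, total - inner)
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


print(best_split_bruteforce([1, 2, 3, 4, 5]))  # (7, 2, 3), i.e. the run 3,4
```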
Seems like this can be done in O(n) time.
Compute the sum S. The ideal subsequence sum is the longest one which gets closest to S/2.
Start with i=j=0 and increase j until sum(a_i..a_j) and sum(a_i..a_{j+1}) are as close as possible to S/2. Note which ever is closer and save the values of i_best,j_best,sum_best.
Increment i and then increase j again until sum(a_i..a_j) and sum(a_i..a_{j+1}) are as close as possible to S/2. Note which ever is closer and replace the values of i_best,j_best,sum_best if they are better. Repeat this step until done.
Note that both i and j are never decremented, so they are changed a total of at most O(n) times. Since all other operations take only constant time, this results in an O(n) runtime for the entire algorithm.
Let's first do some clarifications.
A subsequence of a sequence is actually a subset of the indices of the sequence. Having said that, and specifically in the case where your sequence has distinct elements, your problem reduces to the famous Partition problem, which is known to be NP-complete. In that case, you can solve the problem in O(Sn), where "n" is the number of elements and "S" is the total sum. This is not polynomial time, as "S" can be arbitrarily large.
So let's consider the case of a contiguous subsequence. You need to go over the array elements twice. The first run sums them up into some "S". In the second run you carefully adjust the array length. Let's assume you know that a[i] + a[i + 1] + ... + a[j] > S / 2. Then you let i = i + 1 to reduce the sum. Conversely, if it was smaller, you would increase j.
This code runs in O(n).
Python code:
from math import fabs
a = [1, 2, 3, 4, 5]
i = 0
j = 0
S = sum(a)
s = 0
# Grow the window a[i..j] until adding a[j] would push the sum past S / 2.
while s + a[j] <= S / 2:
    s = s + a[j]
    j = j + 1
s = s + a[j]
best_case = (i, j)
best_difference = fabs(S / 2 - s)
while True:
    # Keep the window whose sum is closest to S / 2.
    if fabs(S / 2 - s) < best_difference:
        best_case = (i, j)
        best_difference = fabs(S / 2 - s)
    if s > S / 2:
        # Sum too large: drop the leftmost element.
        s -= a[i]
        i += 1
    else:
        # Sum too small: take in the next element on the right.
        j += 1
        if j == len(a):
            break
        s += a[j]
print(best_case)
i = best_case[0]
j = best_case[1]
print("Best subarray = ", a[i:j + 1])
print("Best sum = ", sum(a[i:j + 1]))
