Find the value in range L to R in given array - algorithm

Given an array A and two indices L and R, find the value of
Summation(AS[i]*AS[j]*AS[k])
over all i < j < k, where AS is the sorted set of all distinct elements of A in the range L to R inclusive.
Example:
Let A=(4,4,1,6,1,3), L=0 and R=3. This gives AS=(1,4,6), so Ans=1*4*6=24.
I don't have any approach better than O(n^3), which is very slow.
Please suggest a faster approach.
The number of elements in A is up to 10^5.

As the question commenters said, determining AS can be done using a hash table H. Simply iterate through the elements of A from index L to R and insert each element into H. The result is the set of elements you need. You still need to sort that set; for that you can copy the elements of H into an array and sort the array. The result is AS. This takes no more than O(N log N) steps, where N = R - L.
What the commenters did not say is how to compute the sum efficiently. It can be done in O(N) steps. Here is how.
We first make the following observation:
Sum(AS[j]*AS[k], a <= j < k <= b) =
1/2*(AS[a] + AS[a+1] + ... + AS[b])^2 -
1/2*(AS[a]^2 + AS[a+1]^2 + ... + AS[b]^2)
We expand our target sum as follows:
S = Sum(AS[i]*AS[j]*AS[k]) =
AS[L] * Sum(AS[j]*AS[k], L+1 <= j < k <= R) + (iteration 1)
AS[L+1] * Sum(AS[j]*AS[k], L+2 <= j < k <= R) + (iteration 2)
...
AS[R-2] * Sum(AS[j]*AS[k], R-1 <= j < k <= R). (iteration R-L-1)
We now apply the observation.
To determine the sums of the form Sum(AS[j]*AS[k], a <= j < k <= b) efficiently, we can first compute
S1 = AS[L] + AS[L+1] + ... + AS[R]
S2 = AS[L]^2 + AS[L+1]^2 + ... + AS[R]^2
and then incrementally subtract the leading term from each sum as we iterate through the elements of AS from index L to R-2.
Thus, determining the sum you want can be done in O(N) steps after you determine AS. Provided that you use some comparison sort method the whole algorithm should take O(|A|) + O(NlogN) + O(N) steps.
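For concreteness, here is a minimal Java sketch of the approach described above; the class and method names are purely illustrative:

import java.util.TreeSet;

public class TripleProductSum {
    static long tripleProductSum(int[] A, int L, int R) {
        // Build AS: the sorted set of distinct values in A[L..R].
        TreeSet<Integer> set = new TreeSet<>();
        for (int idx = L; idx <= R; idx++) set.add(A[idx]);
        long[] as = new long[set.size()];
        int n = 0;
        for (int v : set) as[n++] = v;          // TreeSet iterates in ascending order

        // Suffix sum of values (s1) and of squares (s2), used in the pairwise identity.
        long s1 = 0, s2 = 0;
        for (long v : as) { s1 += v; s2 += v * v; }

        long total = 0;
        for (int i = 0; i + 2 < n; i++) {
            s1 -= as[i];                        // drop AS[i] from the suffix
            s2 -= as[i] * as[i];
            // Sum(AS[j]*AS[k], i < j < k) over the remaining suffix is (s1^2 - s2) / 2.
            total += as[i] * ((s1 * s1 - s2) / 2);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] A = {4, 4, 1, 6, 1, 3};
        System.out.println(tripleProductSum(A, 0, 3));   // prints 24, matching the example
    }
}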

Related

Efficiently sum max(Ai+Bj, Bi+Aj) over all i, j

You are given two integer arrays A and B of length N. You have to find the value of the double summation
Z = Σi Σj max(Ai+Bj, Bi+Aj)
Here is my brute force algorithm:
for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
        sum += Math.max(A[i] + B[j], A[j] + B[i]);
Please tell me a more efficient algorithm for this.
Rewrite the sum as Z = Σi Σj [max(Ai−Bi, Aj−Bj) + Bi + Bj] by using the distributive property of plus over max. Then construct C = A−B, sort it, and return Σi (2i+1)Ci + 2n Σi Bi (using zero-based indexing).
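A minimal Java sketch of this O(N log N) idea (the class and method names are illustrative only):

import java.util.Arrays;

public class MaxPairSum {
    static long sumMaxPairs(long[] A, long[] B) {
        int n = A.length;
        long[] C = new long[n];
        long sumB = 0;
        for (int i = 0; i < n; i++) { C[i] = A[i] - B[i]; sumB += B[i]; }
        Arrays.sort(C);                            // ascending
        long z = 2L * n * sumB;                    // contribution of Bi + Bj over all ordered pairs
        for (int i = 0; i < n; i++)
            z += (2L * i + 1) * C[i];              // C[i] is the max in exactly 2i+1 ordered pairs
        return z;
    }

    public static void main(String[] args) {
        long[] A = {1, 2}, B = {3, 1};
        // Brute force: max(4,4) + max(2,5) + max(5,2) + max(3,3) = 17
        System.out.println(sumMaxPairs(A, B));     // prints 17
    }
}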
A minor improvement I can think of is to avoid recomputing results you already have: instead of beginning the inner loop at 0, start it at j = i, since the results for j < i were already computed in earlier iterations of the outer loop.
To achieve this, you can change the instruction in the inner loop to the following:
if (i != j)
    sum += 2 * Math.max(A[i] + B[j], A[j] + B[i]);
else
    sum += Math.max(A[i] + B[j], A[j] + B[i]);
The factor of 2 accounts for the fact that, in the original loops, every pair with i != j is visited twice.

Finding best algorithm for sum of a section of an array's values

Given an array of n integers in the locations A[1], A[2], …, A[n], describe an O(n^2) time algorithm to
compute the sum A[i] + A[i+1] + … + A[j] for all i, j, 1 ≤ i < j ≤ n.
I've tried multiple ways of solving this problem, but none of them run in O(n^2) time.
So for an array containing {1,2,3,4}
You would output:
1+2 = 3
1+2+3 = 6
1+2+3+4 = 10
2+3 = 5
2+3+4 = 9
3+4 = 7
The answer does not need to be in a specific language, pseudocode is preferred.
Good preparation is everything.
You could create an array of integrals (prefix sums):
I[0..n] = (0, I[0] + A[1], I[1] + A[2], ..., I[n-1] + A[n])
This will cost you O(n) * O(1) (looping over all elements and doing one addition each).
Now you can calculate each Sum(A, i, j) with just a single subtraction: I[j] - I[i-1],
so each sum takes O(1).
Looping over all combinations of i and j with 1 <= i < j <= n is O(n^2).
So you end up with O(n) * O(1) + O(n^2) * O(1) = O(n^2).
Edit:
Your array A starts at index 1; the above is adapted to that, which also resolves the little quirk with i-1: the integral array I starts at index 0 and is one element larger than A.
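A small Java sketch of this prefix-sum idea, assuming A is 1-based as in the question (index 0 of A is unused); the method name is illustrative:

static void printAllRangeSums(int[] A, int n) {
    long[] I = new long[n + 1];                    // I[0] = 0, I[k] = A[1] + ... + A[k]
    for (int k = 1; k <= n; k++)
        I[k] = I[k - 1] + A[k];
    for (int i = 1; i <= n; i++)
        for (int j = i + 1; j <= n; j++)
            System.out.println(I[j] - I[i - 1]);   // A[i] + ... + A[j] in O(1)
}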
Edit:
You may first have thought of the most naive idea:
Naive idea
Create a function that for given values of i and of j will return the sum A[i] + ... + A[j].
function sumRange(A, i, j):
    sum = 0
    for k = i to j
        sum = sum + A[k]
    return sum
Then generate all pairs of i and j (with i < j) and call the above function for each pair:
for i = 1 to n
    for j = i+1 to n
        output sumRange(A, i, j)
This is not O(n²), because already the two loops on i and j represent O(n²) iterations, and then the function will perform yet another loop, making it O(n³).
Better idea
The above can be improved. Look at the repetition it performs. The sum calculated for given values of i and j can be reused when j increases by 1: there is no need to start from scratch and sum the values between i and (now) j-1 again, only to add one more value to them.
We should just remember what the previous sum was, and add A[j] to it.
So without a separate function:
for i = 1 to n
    sum = A[i]
    for j = i+1 to n
        sum = sum + A[j]
        output sum
Note how the sum is not reset to 0 once it is output. It is preserved, so that when j is incremented, only one value needs to be added to it.
Now it is O(n²). Note also how it does not require an extra array for storage. It only needs the memory for a few variables (i, j, sum), so its space complexity is O(1).
As the number of sums you need to output is O(n²), there is no way to improve this time complexity any further.
NB: I assume here that single array values do not constitute a "sum". As you stated in your question, i < j, and also in your example you only showed sums of at least two array values. The above can be easily adapted to also include single value "sums" if ever that were needed.
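For completeness, the same O(n²)-time, O(1)-extra-space loop in Java (again with 1-based A, index 0 unused; the method name is illustrative):

static void outputAllSums(int[] A, int n) {
    for (int i = 1; i <= n; i++) {
        long sum = A[i];
        for (int j = i + 1; j <= n; j++) {
            sum += A[j];                    // extend the previous sum by one value
            System.out.println(sum);        // A[i] + ... + A[j]
        }
    }
}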

Finding median in merged array of two sorted arrays

Assume we have 2 sorted arrays of integers with sizes n and m. What is the best way to find the median of all m + n numbers?
It's easy to do this with log(n) * log(m) complexity, but I want to solve this problem in log(n) + log(m) time. Is there any suggestion for how to do that?
Explanation
The key point is to recursively discard half of A or half of B at each step, by comparing the medians of the remaining parts of A and B:
if (aMid < bMid) keep [aMid + 1 ... n] and [bLeft ... m]
else             keep [bMid + 1 ... m] and [aLeft ... n]
// where n and m are the lengths of arrays A and B
As in the following; the time complexity is O(log(m + n)):
public double findMedianSortedArrays(int[] A, int[] B) {
    int m = A.length, n = B.length;
    int l = (m + n + 1) / 2;
    int r = (m + n + 2) / 2;
    return (getkth(A, 0, B, 0, l) + getkth(A, 0, B, 0, r)) / 2.0;
}

public double getkth(int[] A, int aStart, int[] B, int bStart, int k) {
    if (aStart > A.length - 1) return B[bStart + k - 1];
    if (bStart > B.length - 1) return A[aStart + k - 1];
    if (k == 1) return Math.min(A[aStart], B[bStart]);

    int aMid = Integer.MAX_VALUE, bMid = Integer.MAX_VALUE;
    if (aStart + k/2 - 1 < A.length) aMid = A[aStart + k/2 - 1];
    if (bStart + k/2 - 1 < B.length) bMid = B[bStart + k/2 - 1];

    if (aMid < bMid)
        return getkth(A, aStart + k / 2, B, bStart, k - k / 2); // Check: aRight + bLeft
    else
        return getkth(A, aStart, B, bStart + k / 2, k - k / 2); // Check: bRight + aLeft
}
Hope it helps! Let me know if you need more explanation on any part.
Here's a very good solution I found in Java on Stack Overflow. It's a method of finding the Kth and (K+1)th smallest items in the two arrays, where K is the center of the merged array.
If you have a function for finding the Kth item of two arrays, then finding the median of the two is easy:
Calculate the average of the Kth and (K+1)th items of X and Y.
But then you'll need a way to find the Kth item of two lists (remember K is one-indexed from here on):
If X contains zero items then the Kth smallest item of X and Y is the Kth smallest item of Y
Otherwise if K == 1 then the smallest item of X and Y is the smaller of the first items of X and Y (min(X[0], Y[0]))
Otherwise;
i. Let A be min(length(X), K / 2)
ii. Let B be min(length(Y), K / 2)
iii. If X[A] > Y[B] then recurse from step 1 with X, Y' (all elements of Y from B to the end of Y) and K' = K - B; otherwise recurse with X' (all elements of X from A to the end of X), Y, and K' = K - A
If I find the time tomorrow I will verify that this algorithm works in Python as stated and provide the example source code; it may have some off-by-one errors as-is.
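In the meantime, here is a hedged Java sketch of that recursion (zero-based array indices, one-based K; the second split size is capped at K - a so the discard argument stays valid; names are illustrative):

static int kthSmallest(int[] X, int xs, int[] Y, int ys, int k) {
    // If one array is exhausted, the answer comes directly from the other.
    if (xs >= X.length) return Y[ys + k - 1];
    if (ys >= Y.length) return X[xs + k - 1];
    if (k == 1) return Math.min(X[xs], Y[ys]);

    int a = Math.min(X.length - xs, k / 2);
    int b = Math.min(Y.length - ys, k - a);            // keeps a + b <= k
    if (X[xs + a - 1] > Y[ys + b - 1])
        return kthSmallest(X, xs, Y, ys + b, k - b);   // the first b items of Y rank below k
    else
        return kthSmallest(X, xs + a, Y, ys, k - a);   // the first a items of X rank below k
}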
Take the median element in list A and call it a. Compare a to the center elements in list B; let's call them b1 and b2 (if B has odd length then exactly where you split B depends on your definition of the median of an even-length list, but the procedure is almost identical regardless). If b1 ≤ a ≤ b2 then a is the median of the merged array. This can be done in constant time since it requires exactly two comparisons.
If a is greater than b2 then we add the top half of A to the top of B and repeat. B will no longer be sorted, but it doesn't matter. If a is less than b1 then we add the bottom half of A to the bottom of B and repeat. These will iterate log(n) times at most (if the median is found sooner then stop, of course).
It is possible that this will not find the median. If this is the case then the median is in B. If so, perform the same algorithm with A and B reversed. This will require log(m) iterations. In total you will have performed at most 2*(log(n)+log(m)) iterations of a constant time operation, so you have solved the problem in order log(n)+log(m) time.
This is essentially the same answer as was given by iehrlich, but written out more explicitly.
Yes, this can be done. Given two arrays, A and B, in the worst-case scenario you have to first perform a binary search in A, and then, if it fails, binary search in B looking for the median. On each step of a binary search, you check if the current element is actually a median of a merged A+B array. Such check takes constant time.
Let's see why such check is constant. For simplicity, let's assume that |A| + |B| is an odd number, and that all numbers in both arrays are different. You can remove these restrictions later by applying the usual median definition approach (i.e., how to calculate the median of an array containing duplicates, or of an array with even length). Anyway, given that, we know for sure, that in the merged array there will be (|A| + |B| - 1) / 2 elements to the right and to the left of an actual median. In the process of a binary search in A, we know the index of current element x in array A (let it be i). Now, if x satisfies the condition B[j] < x < B[j+1], where i + j == (|A| + |B| - 1) / 2, then x is your median.
The overall complexity is O(log(max(|A|, |B|)) time and O(1) memory.

Count number of subsequences with given k modulo sum

Given an array a of n integers, count how many subsequences (not necessarily contiguous) have sum % k = 0:
1 <= k < 100
1 <= n <= 10^6
1 <= a[i] <= 1000
An O(n^2) solution is easily possible, but a faster approach, O(n log n) or O(n), is needed.
This is the subset sum problem.
A simple solution is this:
s = 0
dp[x] = how many subsequences we can build with sum x
dp[0] = 1, 0 elsewhere
for i = 1 to n:
    s += a[i]
    for j = s down to a[i]:
        dp[j] = dp[j] + dp[j - a[i]]
Then you can simply return the sum of all dp[x] such that x % k == 0. This has a high complexity though: about O(n*S), where S is the sum of all of your elements. The dp array must also have size S, which you probably can't even afford to declare for your constraints.
A better solution is to not iterate over sums larger than or equal to k in the first place. To do this, we will use 2 dp arrays:
dp1, dp2 = arrays of size k
dp1[0] = dp2[0] = 1, 0 elsewhere
for i = 1 to n:
    mod_elem = a[i] % k
    for j = 0 to k - 1:
        dp2[j] = dp2[j] + dp1[(j - mod_elem + k) % k]
    copy dp2 into dp1
return dp1[0]
Its complexity is O(n*k), which is well within the given constraints.
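A minimal Java sketch of this O(n*k) DP; like the pseudocode, it counts the empty subsequence, and in practice the counts would be taken modulo something since they grow exponentially:

static long countZeroModSubsequences(int[] a, int k) {
    long[] dp1 = new long[k], dp2 = new long[k];
    dp1[0] = dp2[0] = 1;                              // the empty subsequence has sum 0
    for (int x : a) {
        int m = x % k;
        for (int j = 0; j < k; j++)
            dp2[j] += dp1[(j - m + k) % k];           // extend every old subsequence by x
        System.arraycopy(dp2, 0, dp1, 0, k);          // dp1 and dp2 agree again
    }
    return dp1[0];
}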
There's an O(n + k^2 lg n)-time algorithm. Compute a histogram c(0), c(1), ..., c(k-1) of the input array mod k (i.e., there are c(r) elements that are r mod k). Then compute
    product_{r = 0 .. k-1} (1 + x^r)^c(r)   mod (1 - x^k)
as follows, where the constant term of the reduced polynomial is the answer.
Rather than evaluate each factor with a fast exponentiation method and then multiply, we turn things inside out. If all c(r) are zero, then the answer is 1. Otherwise, recursively evaluate
    P = product_{r = 0 .. k-1} (1 + x^r)^floor(c(r)/2)   mod (1 - x^k).
and then compute
    Q = product_{r = 0 .. k-1} (1 + x^r)^(c(r) - 2*floor(c(r)/2))   mod (1 - x^k),
in time O(k^2) for the latter computation by exploiting the sparsity of the factors. The result is P^2 Q mod (1 - x^k), computed in time O(k^2) via naive convolution.
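A hedged Java sketch of this recursion, assuming the histogram c has already been computed (the counts overflow long quickly, so a real implementation would work modulo some modulus; names are illustrative):

// Multiply two polynomials of degree < k under the relation x^k = 1 (i.e. mod 1 - x^k).
static long[] mulModXkMinus1(long[] p, long[] q, int k) {
    long[] r = new long[k];
    for (int i = 0; i < k; i++)
        for (int j = 0; j < k; j++)
            r[(i + j) % k] += p[i] * q[j];            // naive O(k^2) convolution with wraparound
    return r;
}

// Returns product over r of (1 + x^r)^c[r], reduced mod (1 - x^k).
static long[] productPower(long[] c, int k) {
    boolean allZero = true;
    for (long v : c) if (v != 0) { allZero = false; break; }
    if (allZero) {
        long[] one = new long[k];
        one[0] = 1;                                   // the polynomial 1
        return one;
    }
    long[] half = new long[k];
    for (int r = 0; r < k; r++) half[r] = c[r] / 2;   // P uses floor(c(r)/2)
    long[] P = productPower(half, k);
    long[] acc = mulModXkMinus1(P, P, k);             // P^2
    for (int r = 0; r < k; r++)
        if (c[r] % 2 == 1) {                          // sparse factor (1 + x^r) for each odd count
            long[] next = new long[k];
            for (int i = 0; i < k; i++) {
                next[i] += acc[i];
                next[(i + r) % k] += acc[i];
            }
            acc = next;
        }
    return acc;
}

With c[r] set to the number of a[i] with a[i] % k == r, the answer (including the empty subsequence) would be productPower(c, k)[0].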
Traverse a and count a[i] mod k; there ought to be k such counts.
Recurse and memoize over the distinct partitions of k, 2*k, 3*k...etc. with parts less than or equal to k, adding the products of the appropriate counts.
For example, if k were 10, some of the partitions would be 1+2+7 and 1+2+3+4; but while memoizing, we would only need to calculate once how many pairs mod k in the array produce (1 + 2).
For example, k = 5, a = {1,4,2,3,5,6}:
counts of a[i] mod k: {1,2,1,1,1}
products of distinct partitions of k:
5 => 1
4,1 => 2
3,2 => 1
products of distinct partitions of 2 * k with parts <= k:
5,4,1 => 2
5,3,2 => 1
4,1,3,2 => 2
products of distinct partitions of 3 * k with parts <= k:
5,4,1,3,2 => 2
answer = 11
{1,4} {4,6} {2,3} {5}
{1,4,2,3} {1,4,5} {4,6,2,3} {4,6,5} {2,3,5}
{1,4,2,3,5} {4,6,2,3,5}

Get the first x elements of a Heapsort

I'm preparing for a Google developer interview and working on algorithm questions. I need to figure out how to get the first x elements in an array of size n using the Heapsort algorithm. What part of the algorithm needs to be modified to get just the first x smallest elements?
This is the Heapsort algorithm from Introduction to Algorithms by Cormen Leiserson (page 155):
HEAPSORT(A)
{
    BUILD-MAX-HEAP(A)
    for i = A.length down to 2
        exchange A[1] with A[i]
        A.heap-size = A.heap-size - 1
        MAX-HEAPIFY(A, 1)
}
These are the component algorithms:
BUILD-MAX-HEAP(A)
    A.heap-size = A.length
    for i = floor(A.length / 2) down to 1
        MAX-HEAPIFY(A, i)

MAX-HEAPIFY(A, i)
    l = LEFT(i)
    r = RIGHT(i)
    if l <= A.heap-size and A[l] > A[i]
        largest = l
    else largest = i
    if r <= A.heap-size and A[r] > A[largest]
        largest = r
    if largest != i
        exchange A[i] with A[largest]
        MAX-HEAPIFY(A, largest)
I can't figure out what part to modify to get the x smallest elements of the sorted array. I also need to find the time complexity of the modified algorithm.
By flipping the comparisons in MAX-HEAPIFY we can change it into MIN-HEAPIFY, and with it we can easily obtain a min-heap.
The first element of this heap is then the smallest element. Remove it, move the last element of the heap to the first position, and call MIN-HEAPIFY again to restore the heap property. Repeating this x times yields the x smallest elements.
Time complexity: O(n) to build the heap, plus log(n) + log(n - 1) + ... + log(n - x + 1) ≈ O(x log n) for the extractions.
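A short Java sketch of this idea, using java.util.PriorityQueue as the min-heap instead of rewriting the CLRS pseudocode (the method name is illustrative):

static int[] firstXSmallest(int[] A, int x) {
    java.util.PriorityQueue<Integer> heap = new java.util.PriorityQueue<>();  // min-heap
    for (int v : A) heap.add(v);              // a linear-time BUILD-MIN-HEAP also works here
    int[] result = new int[x];
    for (int i = 0; i < x; i++)
        result[i] = heap.poll();              // each extraction costs O(log n)
    return result;
}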
