Partition the array with minimal difference - algorithm

Given an array A of N integers, I need to find X such that the difference between (A[1] * A[2] * ... * A[X]) and (A[X+1] * A[X+2] * ... * A[N]) is the minimum possible, i.e. I need to minimize | (A[1] * A[2] * ... * A[X]) - (A[X+1] * A[X+2] * ... * A[N]) |, and if there are multiple such values of X, print the smallest one.
Constraints:-
1 <= N <= 10^5
1 <= A[i] <= 10^18.
I am not able to find an efficient approach to this problem.
What would be the best approach to solve it? Is there any special algorithm for multiplying a large quantity of numbers?

The idea is to use a form of prefix and suffix products.
Let:
pre[i] = A[1] * A[2] * ... A[i] and
suf[i] = A[i] * A[i + 1] * ... A[N]
You can compute these arrays in O(N) time, as:
pre[i] = A[i] * pre[i - 1] with pre[1] = A[1] and
suf[i] = A[i] * suf[i + 1] with suf[N] = A[N]
Then, iterate from i = 1 to N - 1 and compute the minimum of:
abs(pre[i] - suf[i + 1])
Observe that pre[i] - suf[i + 1] is the same as:
(A[1] * A[2] * ... * A[i]) - (A[i + 1] * A[i + 2] ... * A[N])
which is exactly what you want to compute.
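As a minimal sketch of the approach above (function name is mine; Python's arbitrary-precision integers make the products exact, though for N up to 10^5 and values up to 10^18 the products become enormous, which is what motivates the logarithm idea in the next answer):

```python
def best_split(A):
    """Return the smallest X (1-based) minimizing |prod(A[1..X]) - prod(A[X+1..N])|."""
    n = len(A)
    pre = [1] * (n + 1)          # pre[i] = A[1] * ... * A[i] (1-based)
    for i in range(1, n + 1):
        pre[i] = pre[i - 1] * A[i - 1]
    suf = [1] * (n + 2)          # suf[i] = A[i] * ... * A[N]
    for i in range(n, 0, -1):
        suf[i] = suf[i + 1] * A[i - 1]
    best_x, best_diff = 1, None
    for x in range(1, n):        # both sides must be non-empty
        diff = abs(pre[x] - suf[x + 1])
        if best_diff is None or diff < best_diff:
            best_x, best_diff = x, diff
    return best_x
```

For example, best_split([10, 2, 4, 5]) returns 2 because 10 * 2 equals 4 * 5; ties go to the smaller X because only a strictly smaller difference replaces the current best.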

You can do it in O(n): in the first pass, compute the product of all elements of the array (P). In the second pass, start with the left part equal to one and the right part equal to P; on each step i, multiply the left part by A[i] and divide the right part by A[i]. Continue the process while the left part is less than the right part.
Since you have a large array of large numbers, you would need some big-number multiplication. So you are probably better off moving to an array of logarithms of A[i], LA[i], and minimizing the new criterion (the difference of log sums).
Edit:
As mentioned by @CiaPan, the precision of a standard 64-bit floating-point number is not enough for the log operation here (since values may be up to 10^18).
So to solve this problem you should first split the values of the source array into pairs such that:
s[2*i] = a[i].toDouble / (10.0^9)
s[2*i+1] = a[i] / s[2*i]
Array s is twice as long as the source array a, but its values do not exceed 10^9, so it is safe to apply the log operation; then find the desired split index sX for array s and divide it by 2 to get X for array a.
Extra-precision logarithm logic is not required.
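One subtlety worth making explicit: since the total product P is fixed, writing the two sides as sqrt(P)*e^t and sqrt(P)*e^(-t) gives |left - right| = 2*sqrt(P)*|sinh(t)|, which is increasing in |t|, so minimizing the gap between log sums selects the same X as minimizing the absolute difference of products. A sketch of the log-based criterion (names are mine; plain math.log on 64-bit doubles is used, which is exactly why the pair-splitting above matters for values near 10^18):

```python
import math

def best_split_log(A):
    """Smallest X minimizing |sum(log left) - sum(log right)|."""
    logs = [math.log(a) for a in A]
    total = sum(logs)
    left = 0.0
    best_x, best_gap = 1, float("inf")
    for x in range(1, len(A)):
        left += logs[x - 1]
        gap = abs(left - (total - left))
        if gap < best_gap:
            best_x, best_gap = x, gap
    return best_x
```

On small inputs this agrees with the exact product comparison, e.g. best_split_log([10, 2, 4, 5]) gives 2.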


Prefix sum variation

I'm trying to parallelize some software that evaluates some recursive linear equations. I think some of them might be adapted into prefix sums. A couple of examples of the kinds of equations I'm dealing with are below.
The standard prefix sum is defined as:
y[i] = y[i-1] + x[i]
One equation I'm interested in looks like prefix sum, but with a multiplication:
y[i] = A*y[i-1] + x[i]
Another is having deeper recursion:
y[i] = y[i-1] + y[i-2] + x[i]
Outside of ways of tackling these two variations, I'm wondering if there are resources that cover how to adapt problems like the above into prefix-sum form, or, more generally, techniques for adapting prefix sums to make them more flexible.
(1)
y[i] = A*y[i-1] + x[i]
can be written as
y[z] = A^z * y[0] + Sum(A^(z-j) * x[j]),
where j ranges over [1, z].
A^z * y[0] can be calculated in O(log(z)).
Sum(A^(z-j) * x[j]) can be calculated in O(z).
If the maximum size of the sequence is known beforehand (say max), then you can precompute a modified prefix sum array of x as
prefix_x[i] = A*prefix_x[i-1] + x[i]
then Sum(A^(z-j) * x[j]) is simply prefix_x[z]
and the query becomes O(1) with O(max) precomputation.
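A small sketch of case (1) under these definitions (1-indexed x with x[0] unused; all names are mine):

```python
def precompute_prefix(A, x):
    """prefix_x[i] = A * prefix_x[i-1] + x[i], so that
    prefix_x[z] = Sum of A^(z-j) * x[j] for j in [1, z]."""
    n = len(x) - 1                 # x[1..n] hold the inputs
    prefix_x = [0] * (n + 1)
    for i in range(1, n + 1):
        prefix_x[i] = A * prefix_x[i - 1] + x[i]
    return prefix_x

def query(A, y0, prefix_x, z):
    """y[z] = A^z * y[0] + prefix_x[z]; the power is the O(log z) part."""
    return A ** z * y0 + prefix_x[z]
```

Checking against the recurrence directly: with A = 2, y[0] = 1 and x = (3, 5, 7), the recurrence gives y[1] = 5, y[2] = 15, y[3] = 37, and the formula reproduces those values.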
(2)
y[i] = y[i-1] + y[i-2] + x[i]
can be written as
y[z] = (F[z] * y[1] + F[z-1] * y[0]) + Sum(F[z-j+1] * x[j]),
where j ranges over [2, z] and F[x] is the x-th Fibonacci number.
(F[z] * y[1] + F[z-1] * y[0]) can be calculated in O(log(z)).
Sum(F[z-j+1] * x[j]) can be calculated in O(z).
If the maximum size of the sequence is known beforehand (say max), then you can precompute a modified prefix sum array of x as
prefix_x[i] = prefix_x[i-1] + prefix_x[i-2] + x[i]
then Sum(F[z-j+1] * x[j]) is simply prefix_x[z]
and the query becomes O(1) with O(max) precomputation.
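A matching sketch for case (2), with the base cases prefix_x[0] = prefix_x[1] = 0 made explicit (names are mine; the Fibonacci numbers are built naively here, whereas the O(log z) version would obtain them by matrix exponentiation):

```python
def fib_prefix(x):
    """prefix_x[i] = prefix_x[i-1] + prefix_x[i-2] + x[i], with
    prefix_x[0] = prefix_x[1] = 0, so that prefix_x[z] equals
    Sum of F[z-j+1] * x[j] for j in [2, z]."""
    n = len(x) - 1
    p = [0] * (n + 1)
    for i in range(2, n + 1):
        p[i] = p[i - 1] + p[i - 2] + x[i]
    return p

def query2(y0, y1, prefix_x, z):
    """y[z] = F[z]*y[1] + F[z-1]*y[0] + prefix_x[z], with F[1] = F[2] = 1."""
    F = [0, 1, 1]
    while len(F) <= z:
        F.append(F[-1] + F[-2])
    return F[z] * y1 + F[z - 1] * y0 + prefix_x[z]
```

With y[0] = 1, y[1] = 2 and x = (_, _, 3, 4, 5), the direct recurrence gives y[2] = 6, y[3] = 12, y[4] = 23, which the closed form reproduces.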

How to get the probability P(Xi) = 1/(k+1) when A[i] != x

The question:
Now consider a deterministic linear search algorithm, which we refer to as
DETERMINISTIC-SEARCH. Specifically, the algorithm searches A for x in order,
considering A[1], A[2], ..., A[n] until either it finds A[i] = x or it reaches the end of the array. Assume that all possible permutations of the input array are equally likely.
Suppose that there are k >= 1 indices i such that A[i] = x. What is the average-case running time of DETERMINISTIC-SEARCH?
I'm confused how to get the P(Xi) when A[i] is not equal to x. I know P(min(p1,..., pk) > i) = P(p1 > i) * ... * P(pk > i) = [(n-i) / n]^k = A, so why is P(Xi) not equal to A?
Suppose you have n different values in array A, so there are n! different permutations of the values in A. Suppose A[1] == x; that means you will find it with one comparison. Now count the number of permutations with x in the first place: it is (n-1)!, one for each arrangement of the other n-1 locations. Hence, the probability that A[1] == x is (n-1)!/n! = 1/n.
Now compute the same thing for the second place. It will again be 1/n (the same analysis as for the first place applies), but you need two comparisons to find x.
Therefore, the expected number of comparisons is \sum_{i=1}^n i * 1/n = 1*1/n + 2*1/n + ... + n*1/n = (1 + 2 + ... + n)/n = (n+1)/2.
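This k = 1 analysis can be checked by brute force over all permutations (a small sketch; names are mine):

```python
from fractions import Fraction
from itertools import permutations

def average_comparisons(n):
    """Average number of comparisons DETERMINISTIC-SEARCH makes over all
    n! permutations of n distinct values when searching for one fixed
    value x (here x = 0): this should equal (n+1)/2."""
    total, count = 0, 0
    for perm in permutations(range(n)):
        total += perm.index(0) + 1    # comparisons until A[i] == x
        count += 1
    return Fraction(total, count)
```

For instance, average_comparisons(4) is 5/2 and average_comparisons(5) is 3, matching (n+1)/2 exactly.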

SubArray Sum of Fibonacci Number

I have an array A of N numbers. For every subarray, I take the Fibonacci number of that subarray's sum; I have to find the total of these values over all subarrays.
For example:
A = {1,1}
F[Sum A[1,1]] = F[1] = 1
F[Sum A[1,2]] = F[2] = 1
F[Sum A[2,2]] = F[1] = 1
Ans = 1 + 1 + 1 = 3
The question is similar to this one, but I need to compute the sum for a standard Fibonacci sequence.
Here is the source code. What property is used here?
Can anyone explain the math behind this? How do I avoid an O(N^2) solution? How do I modify the source code for standard Fibonacci numbers?
Here is a solution that runs in O(N * log MAX_A * k^3) time, where MAX_A is the maximum value in the array and k is the number of variables used in the recursive formula (it's equal to 2 for Fibonacci and 3 for the problem linked in your question). I will not refer to the source code you have provided, as it's pretty hard to read, but it looks like it uses similar ideas:
Let's learn how to compute the n-th Fibonacci number quickly. It's easy to see that f(n) = (F^n * [1, 0].transposed())[1], where F is the matrix [[1, 1], [1, 0]] (it would be [[1, 1, 1], [1, 0, 0], [0, 0, 1]] for the original problem). We can find the n-th power of a matrix quickly using binary exponentiation.
Let's develop this idea further. In terms of matrices, we're asked to evaluate the following expression:
sum for all valid L, R of ((F^(a[L] + ... + a[R]) * [1, 0].transposed())[1])
or, equivalently (as matrix multiplication is distributive with respect to addition)
((sum for all valid L, R of F^(a[L] + ... + a[R])) * [1, 0].transposed())[1]
Now we need to figure out how to compute this sum efficiently. Let's learn to do it incrementally. Let's assume that all subarrays that end in a position n or smaller have already been processed and we want to add the next one. According to the problem statement, we should add F^a[n + 1] + F^(a[n + 1] + a[n]) + ... + F^(a[n + 1] + a[n] + ... + a[0]) to the answer, which is equal to F^a[n + 1] * (F^a[n] + F^(a[n] + a[n - 1]) + ... + F^(a[n] + a[n - 1] + ... + a[0])).
That's it. We already have an efficient solution. Here a pseudo code:
pref_sum = I // identity matrix. This variable holds the second
// term of the product used in step 3
total_sum = zeros // zero matrix of an appropriate size
// It holds the answer to the problem
for i = 0 .. N - 1
cur_matrix = binary_exponentiation(F, a[i]) // A matrix for the a[i]-th fibonacci number
// Everything is updated according to the formulas shown above
total_sum += pref_sum * cur_matrix
pref_sum = pref_sum * cur_matrix + I
// Now we just need to multiply the total_sum by an appropriate vector
// which is [1, 0].transposed() in the case of Fibonacci numbers
Assuming that the size of the matrix F is constant (it's, again, 2 in case of Fibonacci numbers), the time complexity is O(N * (log MAX_A + const)) = O(N * log MAX_A) as there is just one binary exponentiation for each element of the original array followed by a constant number of matrix multiplications and additions.
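A compact sketch of the pseudocode above for the Fibonacci case (names are mine; plain Python integers are used, whereas a real solution would reduce modulo the problem's modulus inside every matrix operation). It exploits the fact that the (0, 1) entry of F^s is exactly the s-th Fibonacci number:

```python
def subarray_fib_sum(a):
    """Sum of F(sum of subarray) over all subarrays of a."""
    def mul(X, Y):
        return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
                [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

    def add(X, Y):
        return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

    def mpow(M, e):                  # binary exponentiation
        R = [[1, 0], [0, 1]]
        while e:
            if e & 1:
                R = mul(R, M)
            M = mul(M, M)
            e >>= 1
        return R

    F = [[1, 1], [1, 0]]
    I = [[1, 0], [0, 1]]
    pref_sum, total_sum = I, [[0, 0], [0, 0]]
    for v in a:
        cur = mpow(F, v)
        total_sum = add(total_sum, mul(pref_sum, cur))   # all subarrays ending here
        pref_sum = add(mul(pref_sum, cur), I)
    # (F^s)[0][1] equals the s-th Fibonacci number, so the answer is
    # the (0, 1) entry of the summed matrix.
    return total_sum[0][1]
```

On the example from the question, subarray_fib_sum([1, 1]) yields F(1) + F(2) + F(1) = 3.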

Given 2 arrays of non-negative numbers, find the minimum sum of products

Given two arrays A and B, each containing n non-negative numbers, remove a>0 elements from the end of A and b>0 elements from the end of B. Evaluate the cost of such an operation as X*Y where X is the sum of the a elements removed from A and Y the sum of the b elements removed from B. Keep doing this until both arrays are empty. The goal is to minimize the total cost.
Using dynamic programming and the fact that an optimal strategy will always take exactly one element from either A or B I can find an O(n^3) solution. Now I'm curious to know if there is an even faster solution to this problem?
EDIT: Stealing an example from @recursive in the comments:
A = [1, 9, 1] and B = [1, 9, 1]. Possible to do with a cost of 20: (1) * (1 + 9) + (9 + 1) * (1).
Here's O(n^2). Let CostA(i, j) be the min cost of eliminating A[1..i], B[1..j] in such a way that the first removal takes only one element from B. Let CostB(i, j) be the min cost of eliminating A[1..i], B[1..j] in such a way that the first removal takes only one element from A. We have mutually recursive recurrences
CostA(i, j) = A[i] * B[j] + min(CostA(i - 1, j),
CostA(i - 1, j - 1),
CostB(i - 1, j - 1))
CostB(i, j) = A[i] * B[j] + min(CostB(i, j - 1),
CostA(i - 1, j - 1),
CostB(i - 1, j - 1))
with base cases
CostA(0, 0) = 0
CostA(>0, 0) = infinity
CostA(0, >0) = infinity
CostB(0, 0) = 0
CostB(>0, 0) = infinity
CostB(0, >0) = infinity.
The answer is min(CostA(n, n), CostB(n, n)).
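A direct sketch of these recurrences (names are mine), filling the two tables bottom-up with the base cases above:

```python
def min_total_cost(A, B):
    """O(n^2) DP over the mutually recursive CostA/CostB recurrences."""
    n = len(A)
    INF = float("inf")
    A = [0] + A                       # switch to 1-based indexing
    B = [0] + B
    costA = [[INF] * (n + 1) for _ in range(n + 1)]
    costB = [[INF] * (n + 1) for _ in range(n + 1)]
    costA[0][0] = costB[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            costA[i][j] = A[i] * B[j] + min(costA[i - 1][j],
                                            costA[i - 1][j - 1],
                                            costB[i - 1][j - 1])
            costB[i][j] = A[i] * B[j] + min(costB[i][j - 1],
                                            costA[i - 1][j - 1],
                                            costB[i - 1][j - 1])
    return min(costA[n][n], costB[n][n])
```

On the example from the question, min_total_cost([1, 9, 1], [1, 9, 1]) returns 20.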

Special case of sparse matrices multiplication

I'm trying to come up with a fast algorithm to find the result of the operation Transpose(A) * L * A, where
L - is a symmetric n x n matrix with real numbers.
A - is a sparse n x m matrix, m < n. Each row has one and only one non-zero element, and it's equal to 1. It's also guaranteed that every column has at most two non-zero elements.
I came up with one algorithm, but I feel like there should be something faster than this.
Let's represent every column of A as a pair of row numbers with non-zero elements. If a column has only one non-zero element, its row number is listed twice. E.g. for a 5 x 3 matrix with ones at rows 0 and 2 of column 0, rows 1 and 3 of column 1, and row 4 of column 2, such a representation would be
column 0: [0, 2]; column 1: [1, 3]; column 2: [4, 4]
Or we can list it as a single array: A = [0, 2, 1, 3, 4, 4]. Now, L' = L * A can be calculated as:
for (i = 0; i < A.length; i += 2):
    if A[i] != A[i + 1]:
        # sum of two column vectors, i/2-th column of L'
        L'[i/2] = L[A[i]] + L[A[i + 1]]
    else:
        L'[i/2] = L[A[i]]
To calculate L'' = Transpose(A) * L', we do it one more time:
for (i = 0; i < A.length; i += 2):
    if A[i] != A[i + 1]:
        # sum of two row vectors, i/2-th row of L''
        L''[i/2] = L'[A[i]] + L'[A[i + 1]]
    else:
        L''[i/2] = L'[A[i]]
The time complexity of such an approach is O(m*n + m*n), and the space complexity (to get the final result) is O(n*n). I'm wondering if it's possible to improve it to O(m*m) in terms of space and/or performance?
The second loop combines at most 2m rows of L', so if m is much smaller than n there will be several rows of L' that are never used.
One way to avoid calculating and storing these unused entries is to change your first loop into a function and only calculate the individual elements of L' as they are needed.
def L'(row, col):
    i = col * 2
    if A[i] != A[i + 1]:
        # sum of two column vectors, i/2-th column of L'
        return L[row][A[i]] + L[row][A[i + 1]]
    else:
        return L[row][A[i]]

for (i = 0; i < A.length; i += 2):
    if A[i] != A[i + 1]:
        for (k = 0; k < m; k++):
            L''[i/2][k] = L'(A[i], k) + L'(A[i + 1], k)
    else:
        for (k = 0; k < m; k++):
            L''[i/2][k] = L'(A[i], k)
This should then have space and time complexity O(m*m).
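A runnable sketch of this lazy scheme (names are mine; L' is not a valid Python identifier, so it becomes Lp, and the pair array is passed in explicitly as Apairs):

```python
def compute_Lpp(L, Apairs, m):
    """Only the m x m result L'' is ever stored; elements of
    L' = L * A are computed on demand."""
    def Lp(row, col):
        # element (row, col) of L' = L * A
        i = col * 2
        if Apairs[i] != Apairs[i + 1]:
            return L[row][Apairs[i]] + L[row][Apairs[i + 1]]
        return L[row][Apairs[i]]

    Lpp = [[0] * m for _ in range(m)]    # L'' = Transpose(A) * L'
    for i in range(0, len(Apairs), 2):
        for k in range(m):
            if Apairs[i] != Apairs[i + 1]:
                Lpp[i // 2][k] = Lp(Apairs[i], k) + Lp(Apairs[i + 1], k)
            else:
                Lpp[i // 2][k] = Lp(Apairs[i], k)
    return Lpp
```

On a small instance this agrees with a dense computation of Transpose(A) * L * A.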
The operation Transpose(A) * L works as follows:
For each column of A we see:
column 1 has `1` in row 1 and 3
column 2 has `1` in row 2 and 4
column 3 has `1` in row 5
The output matrix B = Transpose(A) * L has three rows which are equal to:
Row(B, 1) = Row(L, 1) + Row(L, 3)
Row(B, 2) = Row(L, 2) + Row(L, 4)
Row(B, 3) = Row(L, 5)
If we multiply C = B * A:
Column(C, 1) = Column(B, 1) + Column(B, 3)
Column(C, 2) = Column(B, 2) + Column(B, 4)
Column(C, 3) = Column(B, 5)
If you follow through this in a algorithmic way, you should achieve something very similar to what Peter de Rivaz has suggested.
The time complexity of your algorithm is O(n^2), not O(m*n): the rows and columns of L have length n, and since every row of A has exactly one non-zero element while every column has at most two, m and n are within a factor of two of each other.
If a[k] is the column where row k of A has a 1, then you can write:
A[k,i] = δ(a[k], i)
and the product P = A^T * L * A is:
P[i,j] = Σ(k,l) A^T[i,k] * L[k,l] * A[l,j]
       = Σ(k,l) A[k,i] * L[k,l] * A[l,j]
       = Σ(k,l) δ(a[k],i) * L[k,l] * δ(a[l],j)
If we turn this around and look at what happens to the elements of L, we see that L[k,l] is added to P[a[k],a[l]], and it's easy to get O(m^2) space complexity using O(n^2) time complexity.
Because a[k] is defined for all k=0..n-1, we know that every element of L must appear somewhere in the product. Because there are O(n^2) distinct elements in L, you can't do better than O(n^2) time complexity.
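A sketch of this scatter formulation (names are mine): each element L[k][l] is simply added into P[a[k]][a[l]], which directly gives O(n^2) time and O(m^2) space for the result.

```python
def sandwich(L, a, m):
    """P = Transpose(A) * L * A, where a[k] is the column holding the
    single 1 in row k of A."""
    n = len(a)
    P = [[0] * m for _ in range(m)]
    for k in range(n):
        for l in range(n):
            P[a[k]][a[l]] += L[k][l]
    return P
```

Comparing against a dense triple product on a small instance confirms the scatter rule.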