Prefix sum variation - algorithm

I'm trying to parallelize some software that evaluates some recursive linear equations, and I think some of them might be adapted into prefix sums. A couple of examples of the kinds of equations I'm dealing with are below.
The standard prefix sum is defined as:
y[i] = y[i-1] + x[i]
One equation I'm interested in looks like prefix sum, but with a multiplication:
y[i] = A*y[i-1] + x[i]
Another is having deeper recursion:
y[i] = y[i-1] + y[i-2] + x[i]
Besides ways of tackling these two variations, I'm wondering if there are resources that cover how to adapt problems like the above into prefix-sum form, or, more generally, techniques for adapting prefix sum to make it more flexible.

(1)
y[i] = A*y[i-1] + x[i]
can be written as
y[z] = A^z * y[0] + Sum(A^(z-j) * x[j])
, where j ∈ [1, z].
A^z * y[0] can be calculated in O(log(z))
Sum(A^(z-j) * x[j]) can be calculated in O(z).
If the maximum size of the sequence is known beforehand (say max), then you can precompute a modified prefix sum array of x as
prefix_x[i] = A*prefix_x[i-1] + x[i]
then Sum(A^(z-j) * x[j]) is simply prefix_x[z]
and the query becomes O(1) with O(max) precomputation.
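Here is a minimal Python sketch of scheme (1) (function names are mine, not from the question): build the modified prefix array once, then answer each y[z] with one fast exponentiation.

```python
def build_prefix(x, A):
    # prefix_x[i] = A * prefix_x[i-1] + x[i], with prefix_x[0] = 0
    # (x[0] is a placeholder so indices match the 1-based notation above)
    prefix = [0] * len(x)
    for i in range(1, len(x)):
        prefix[i] = A * prefix[i - 1] + x[i]
    return prefix

def query(z, y0, A, prefix):
    # y[z] = A^z * y[0] + Sum(A^(z-j) * x[j]); pow(A, z) is O(log z)
    return pow(A, z) * y0 + prefix[z]
```

A quick way to convince yourself: compute y directly from the recurrence y[i] = A*y[i-1] + x[i] and compare against query for every index.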
(2)
y[i] = y[i-1] + y[i-2] + x[i]
can be written as
y[z] = (F[z] * y[1] + F[z-1] * y[0]) + Sum(F[z-j+1] * x[j])
, where j ∈ [2, z] and F[x] = the xth Fibonacci number.
(F[z] * y[1] + F[z-1] * y[0]) can be calculated in O(log(z))
Sum(F[z-j+1] * x[j]) can be calculated in O(z).
If the maximum size of the sequence is known beforehand (say max), then you can precompute a modified prefix sum array of x as
prefix_x[i] = prefix_x[i-1] + prefix_x[i-2] + x[i]
then Sum(F[z-j+1] * x[j]) is simply prefix_x[z]
and the query becomes O(1) with O(max) precomputation.
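Scheme (2) can be sketched the same way in Python (names are mine; fib is written as a simple O(n) loop for brevity, though fast doubling gives the O(log z) bound mentioned above):

```python
def build_prefix_fib(x):
    # prefix_x[i] = prefix_x[i-1] + prefix_x[i-2] + x[i],
    # with prefix_x[0] = prefix_x[1] = 0 (x[0], x[1] are placeholders)
    prefix = [0] * len(x)
    for i in range(2, len(x)):
        prefix[i] = prefix[i - 1] + prefix[i - 2] + x[i]
    return prefix

def fib(n):
    # F[1] = F[2] = 1; linear here for clarity, O(log n) via fast doubling
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def query_fib(z, y0, y1, prefix):
    # y[z] = F[z]*y[1] + F[z-1]*y[0] + prefix_x[z]
    return fib(z) * y1 + fib(z - 1) * y0 + prefix[z]
```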

Related

Partition the array with minimal difference

Given an array A of N integers, I need to find X such that the difference between (A[1] * A[2] * ... * A[X]) and (A[X+1] * A[X+2] * ... * A[N]) is the minimum possible, i.e. I need to minimize | (A[1] * A[2] * ... * A[X]) - (A[X+1] * A[X+2] * ... * A[N]) |, and if there are multiple such values of X, print the smallest one.
Constraints:-
1 <= N <= 10^5
1 <= A[i] <= 10^18.
I am not able to find an efficient approach to solve this problem.
What would be the best approach? Is there a special algorithm for multiplying a large quantity of numbers?
The idea is to use a form of prefix and suffix products.
Let:
pre[i] = A[1] * A[2] * ... A[i] and
suf[i] = A[i] * A[i + 1] * ... A[N]
You can compute these arrays in O(n) time, as:
pre[i] = A[i] * pre[i - 1] with pre[1] = A[1], and
suf[i] = A[i] * suf[i + 1] with suf[N] = A[N]
Then, iterate from i = 1 to N - 1 and compute the minimum of:
abs(pre[i] - suf[i + 1])
(taking the smallest i in case of ties).
Observe that pre[i] - suf[i + 1] is the same as:
(A[1] * A[2] * ... * A[i]) - (A[i + 1] * A[i + 2] ... * A[N])
which is exactly what you want to compute.
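In Python, where integers have arbitrary precision, this prefix/suffix-product scan can be sketched directly (function name is mine; note the big-integer products get expensive for very large inputs, which is what motivates the logarithm idea discussed in the other answer):

```python
def min_split(A):
    # Exact prefix/suffix products; Python ints never overflow,
    # only grow, so this is exact but slow for huge arrays.
    n = len(A)
    pre = [1] * (n + 1)      # pre[i] = A[0] * ... * A[i-1]  (1-based product)
    suf = [1] * (n + 2)      # suf[i] = A[i-1] * ... * A[n-1]
    for i in range(1, n + 1):
        pre[i] = pre[i - 1] * A[i - 1]
    for i in range(n, 0, -1):
        suf[i] = suf[i + 1] * A[i - 1]
    best_x, best = 1, None
    for x in range(1, n):    # X ranges over 1..N-1 so both parts are non-empty
        d = abs(pre[x] - suf[x + 1])
        if best is None or d < best:   # strict '<' keeps the smallest X on ties
            best, best_x = d, x
    return best_x
```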
You can also do it in O(n) without extra arrays: in a first pass, compute the product P of all elements; in a second pass, start with the left part equal to 1 and the right part equal to P, and at each step i multiply the left by A[i] and divide the right by A[i]. Continue the process while the left is less than the right.
Since the products of these large numbers overflow standard integer types, you need some form of big-number multiplication. Alternatively, you may be better off working with an array of logarithms of A[i], LA[i], and minimizing the corresponding criterion instead.
Edit:
As mentioned by @CiaPan, the precision of a standard 64-bit floating-point number is not enough for the log operation here (since values may be up to 10^18).
So to solve this problem you should first split values of the source array to pairs such that:
s[2*i] = a[i].toDouble / (10.0^9)
s[2*i+1] = a[i]/s[2*i]
Array s is twice as long as the source array a, but its values do not exceed 10^9, so it is safe to apply the log operation; then find the desired sX for array s and divide it by 2 to get X for array a.
Extra-precision logarithm logic is not required.
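A minimal log-based sketch in Python (function name is mine). One caveat worth stating explicitly: minimizing |log(left) − log(right)| minimizes the *ratio* of the two products, which is the "new criteria" mentioned above, not exactly the absolute difference; for ranking candidate split points it is usually the same answer, but not provably so in general.

```python
import math

def min_split_log(A):
    # Work with sums of logarithms instead of huge products.
    # NOTE: this minimizes |log(left) - log(right)|, i.e. the products'
    # ratio criterion, not exactly |left - right|.
    logs = [math.log(a) for a in A]
    total = sum(logs)
    best_x, best = 1, float('inf')
    left = 0.0
    for x in range(1, len(A)):        # both parts stay non-empty
        left += logs[x - 1]
        d = abs(left - (total - left))
        if d < best:                  # strict '<' keeps the smallest X on ties
            best, best_x = d, x
    return best_x
```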

Finite Difference Method for Solving ODEs Algorithm

I'm trying to devise an algorithm for the finite difference method, but I'm a bit confused. The ODE in question is y'' - 5y' + 10y = 10x, with y(0)=0 and y(1)=100. So I need a way to somehow obtain the coefficients that will multiply "y_i" in the discretized relation (given as an image in the original post), and then store the resultant coefficients into a matrix, which will be the matrix of the system I'll solve through Gauss-Jordan. The question boils down to how to obtain these coefficients and move them to the matrix. I thought about working out the coefficients by hand and then just inputting the matrix, but I need to do this for steps of size 0.1, 0.01 and 0.001, so that's really not a viable option here.
Let us assume the more general case of the ODE
c1 * y''(x) + c2 * y'(x) + c3 * y(x) + c4 * x = 0
with the boundary conditions
y(0) = lo
y(1) = hi
And you want to solve this for x ∈ [0, 1] with a step size of h = 1 / n (so n + 1 is the number of samples). We want to solve for the yi = y(h * i), with i ∈ [0, n]. To do this, we set up a linear system
A y = b
Every interior yi will impose a single linear constraint. Hence, we have n - 1 rows in A and n - 1 columns corresponding to the unknown yi.
To set up A and b, we can simply slide a window over our unknown yi (I assume zero-based indexing).
A = 0 //the zero matrix
b = 0 //the zero vector
for i from 1 to n - 1
    //we are going to create the constraint for yi and store it in row i-1
    //coefficient for yi+1
    coeff = c1 / h^2 + c2 / h
    if i + 1 < n
        A(i - 1, i) = coeff
    else
        b(i - 1) -= coeff * hi //we already know yi+1
    //coefficient for yi
    coeff = -2 * c1 / h^2 - c2 / h + c3
    A(i - 1, i - 1) = coeff
    //coefficient for yi-1
    coeff = c1 / h^2
    if i - 1 > 0
        A(i - 1, i - 2) = coeff
    else
        b(i - 1) -= coeff * lo //we already know yi-1
    //coefficient for x
    b(i - 1) -= c4 * i * h
next
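The pseudocode above, as a runnable Python sketch (function names and the elimination helper are mine; Gauss-Jordan is used since that is what the question intends to solve the system with). For the question's ODE y'' - 5y' + 10y = 10x, the general form gives c1 = 1, c2 = -5, c3 = 10, c4 = -10.

```python
def gauss_jordan(A, b):
    # Plain Gauss-Jordan elimination with partial pivoting
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def solve_ode(c1, c2, c3, c4, lo, hi, n):
    # Build the (n-1)x(n-1) system for the interior samples y_1..y_{n-1}
    h = 1.0 / n
    A = [[0.0] * (n - 1) for _ in range(n - 1)]
    b = [0.0] * (n - 1)
    for i in range(1, n):
        coeff = c1 / h**2 + c2 / h              # y[i+1] (forward difference for y')
        if i + 1 < n:
            A[i - 1][i] = coeff
        else:
            b[i - 1] -= coeff * hi              # boundary: y[n] = hi
        A[i - 1][i - 1] = -2 * c1 / h**2 - c2 / h + c3   # y[i]
        coeff = c1 / h**2                       # y[i-1]
        if i - 1 > 0:
            A[i - 1][i - 2] = coeff
        else:
            b[i - 1] -= coeff * lo              # boundary: y[0] = lo
        b[i - 1] -= c4 * i * h                  # move the c4*x term to the RHS
    return [lo] + gauss_jordan(A, b) + [hi]
```

A useful sanity check is y'' = 0 with y(0) = 0, y(1) = 1, whose exact solution y = x the scheme reproduces up to rounding.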

Variation on 0/1 Knapsack Algorithm

I'm very new to programming and have been asked to solve a program for work. Right now we are dealing with a typical 0/1 Knapsack problem, in which the benefit/value is maximized given mass and volume constraints.
My task is to basically reverse this and minimize either the volume or mass given a value constraint. In other words, I want my benefit score to be greater than or equal to a set value and then see how small I can get the knapsack given that threshold value.
I have tried researching this problem elsewhere and am sure that it probably has a formal name, but I am unable to find it. If anyone has any information I would greatly appreciate it. I am at a bit of a loss as to how to go about solving this type of problem, as you cannot use the same recursion formulas.
Let's call the weight of item i w(i), and its value v(i). Order the items arbitrarily, and define f(i, j) to be the minimum possible capacity of a knapsack that holds a subset of the first i items totalling at least a value of j.
To calculate f(i, j), we can either include the ith item or not in the knapsack, so
f(i>0, j>0) = min(g(i, j), h(i, j)) # Can include or exclude ith item; pick the best
f(_, 0) = 0 # Don't need any capacity to reach value of 0
f(i<=0, j>0) = infinity # Can't get a positive value with <= 0 items
g(i, j) = f(i-1, j) # Capacity needed if we exclude ith item
h(i, j) = f(i-1, max(0, j-v(i))) + w(i) # Capacity needed if we include ith item
In the last line, max(0, j-v(i)) just makes sure that the second argument in the recursive call to f() does not go negative in the case where v(i) > j.
Memoising this gives a pseudopolynomial O(nc)-time, O(nc)-space algorithm, where n is the number of items and c is the value threshold. You can save space (and possibly time, although not in the asymptotic sense) by calculating it in bottom-up fashion -- this would bring the space complexity down to O(c), since while calculating f(i, ...) you only ever need access to f(i-1, ...), so you only need to keep the previous and current "rows" of the DP matrix.
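The bottom-up, row-compressed version of this DP can be sketched in Python (function name is mine; the single array f plays the role of the current row f(i, ·)):

```python
def min_capacity(weights, values, threshold):
    # f[j] = minimum total weight of a subset of the items seen so far
    #        whose total value is at least j
    INF = float('inf')
    f = [INF] * (threshold + 1)
    f[0] = 0  # value 0 needs no capacity
    for w, v in zip(weights, values):
        # iterate j downwards so each item is used at most once (0/1)
        for j in range(threshold, 0, -1):
            take = f[max(0, j - v)] + w   # include this item
            if take < f[j]:
                f[j] = take
    return f[threshold]  # inf if the threshold is unreachable
```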
If I understand your question correctly, the problem you wish to solve is of the form:
let mass_i be the mass of item i, let vol_i the volume, and let val_i be its value.
Let x_i be a binary variable, where x_i is one if and only if the item is in the knapsack.
minimize (mass_1 * x_1 + ... + mass_n * x_n) //The case where you are minimizing mass
s.t. mass_1 * x_1 + ... + mass_n * x_n >= MinMass
vol_1 * x_1 + ... + vol_n * x_n >= MinVolume
val_1 * x_1 + ... + val_n * x_n >= MinValue
x_i in {0,1} for all i
A trick you can use to make it more "knapsacky" is to substitute x_i with 1 - y_i, where y_i is one if and only if item i is not in the knapsack. Then you get an equivalent problem of the form:
let mass_i be the mass of item i, let vol_i the volume, and let val_i be its value.
Let y_i be a binary variable, where y_i is one if and only if the item is NOT in the knapsack.
maximize (mass_1 * y_1 + ... + mass_n * y_n) //The case where you are minimizing mass
s.t. mass_1 * y_1 + ... + mass_n * y_n <= mass_1 + ... + mass_n - MinMass
vol_1 * y_1 + ... + vol_n * y_n <= vol_1 + ... + vol_n - MinVolume
val_1 * y_1 + ... + val_n * y_n <= val_1 + ... + val_n - MinValue
y_i in {0,1} for all i
which is a knapsack problem with 3 constraints. The solution y can easily be transformed into an equivalent solution for your original problem by setting x_i = 1 - y_i.

Find the sum of the digits of all the numbers from 1 to N [duplicate]

This question already has answers here: Sum of Digits till a number which is given as input (2 answers). Closed 6 years ago.
Problem:
Find the sum of the digits of all the numbers from 1 to N (both ends included)
Time Complexity should be O(logN)
For N = 10 the sum is 1+2+3+4+5+6+7+8+9+(1+0) = 46
For N = 11 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) = 48
For N = 12 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) +(1+2)= 51
This recursive solution works like a charm, but I'd like to understand the rationale for reaching such a solution. I believe it's based on finite induction, but can someone show exactly how to solve this problem?
I've pasted (with minor modifications) the aforementioned solution:
static long Solution(long n)
{
    if (n <= 0)
        return 0;
    if (n < 10)
        return (n * (n + 1)) / 2; // sum of arithmetic progression
    long x = long.Parse(n.ToString().Substring(0, 1)); // first digit
    long y = long.Parse(n.ToString().Substring(1)); // remaining digits
    int power = (int)Math.Pow(10, n.ToString().Length - 1);
    // how to reach this recursive solution?
    return (power * Solution(x - 1))
        + (x * (y + 1))
        + (x * Solution(power - 1))
        + Solution(y);
}
Unit test (which is NOT O(logN)):
long count = 0;
for (int i = 1; i <= N; i++)
{
    foreach (var c in i.ToString().ToCharArray())
        count += int.Parse(c.ToString());
}
Or:
Enumerable.Range(1, N).SelectMany(
    n => n.ToString().ToCharArray().Select(
        c => int.Parse(c.ToString())
    )
).Sum();
This is actually an O(n^log10(2))-time solution (log10(2) is approximately 0.3). Not sure if that matters. We have n = xy, where xy denotes digit concatenation, not multiplication. Here are the four key lines with commentary underneath.
return (power * Solution(x - 1))
This counts the contribution of the x place for the numbers from 1 inclusive to x*power exclusive. This recursive call doesn't contribute to the complexity because it returns in constant time.
+ (x * (y + 1))
This counts the contribution of the x place for the numbers from x*power inclusive to n inclusive.
+ (x * Solution(power - 1))
This counts the contribution of the lower-order places for the numbers from 1 inclusive to x*power exclusive. This call is on a number one digit shorter than n.
+ Solution(y);
This counts the contribution of the lower-order places for the numbers from x*power inclusive to n inclusive. This call is on a number one digit shorter than n.
We get the time bound from applying Case 1 of the Master Theorem. To get the running time down to O(log n), we can compute Solution(power - 1) analytically. I don't remember offhand what the closed form is.
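For reference, the closed form exists: Solution(10^d - 1) sums the digits of 1..10^d - 1, where each of the d digit positions cycles through 0..9 exactly 10^(d-1) times, giving 45 * d * 10^(d-1). A quick sketch checking it against brute force (function names are mine):

```python
def brute_digit_sum(n):
    # brute force: sum of the digits of all numbers 1..n
    return sum(int(c) for k in range(1, n + 1) for c in str(k))

def closed_form(d):
    # digit sum of 1..10**d - 1: each of the d positions takes every value
    # 0..9 exactly 10**(d-1) times, contributing 45 * 10**(d-1) per position
    return 45 * d * 10 ** (d - 1)
```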
After thinking for a while (and finding similar answers), I guess I could achieve the rationale that gave me another solution.
Definitions
Let S(n) be the sum of the digits of all numbers 1 <= k <= n.
Let D(k) be the plain digit sum of k alone.
(I'll omit parentheses for clarity, so consider Dx = D(x).)
If n >= 10, let's decompose n by splitting off the last digit: n = 10*k + r (k, r being integers).
We need to sum S(n) = S(10*k + r) = S(10*k) + D(10*k+1) + ... + D(10*k+r)
The first part, S(10*k), follows a pattern:
S(10*1) = D1+D2+...+D10 = (1+2+3+...+9)*1 + D10
S(10*2) = D1+D2+...+D20 = (1+2+3+...+9)*2 + 1*9 + D10 + D20
S(10*3) = D1+D2+...+D30 = (1+2+3+...+9)*3 + 1*9 + 2*9 + D10 + D20 + D30
So S(10*k) = (1+2+3+...+9)*k + 9*S(k-1) + S(k-1) + D(10*k) = 45*k + 10*S(k-1) + D(k)
Regarding the last part, we know that D(10*k+x) = D(10*k)+D(x) = D(k)+x, so this last part can be simplified:
D(10*k+1) + ... + D(10*k+r) = D(k)+1 + D(k)+2 + ... + D(k)+r = r*D(k) + (1+2+...+r) = r*D(k) + r*(1+r)/2
So, adding both parts of the equation (and grouping the D(k) terms) we have:
S(n) = 45*k + 10*S(k-1) + (1+r)*D(k) + r*(1+r)/2
And replacing k and r we have:
S(n) = 45*(n/10) + 10*S((n/10)-1) + (1+n%10)*D(n/10) + (n%10)*(1+n%10)/2
Pseudocode:
S(n):
    if n <= 0, return 0
    if n < 10, return n*(1+n)/2
    r = n%10  # decompose n = 10*k + r (k, r integers)
    k = n/10  # integer division
    return 45*k + 10*S(k-1) + (1+r)*D(k) + r*(1+r)/2
D(n):
    just sum the digits
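The pseudocode above, as a runnable Python sketch (function names are mine):

```python
def D(n):
    # plain digit sum of n
    return sum(int(c) for c in str(n))

def S(n):
    # sum of the digits of all numbers 1..n; O(log n) recursion depth
    if n <= 0:
        return 0
    if n < 10:
        return n * (n + 1) // 2
    k, r = divmod(n, 10)  # n = 10*k + r
    return 45 * k + 10 * S(k - 1) + (1 + r) * D(k) + r * (1 + r) // 2
```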
First algorithm (the one from the original question) in C#
static BigInteger Solution(BigInteger n)
{
    if (n <= 0)
        return 0;
    if (n < 10)
        return (n * (n + 1)) / 2; // sum of arithmetic progression
    long x = long.Parse(n.ToString().Substring(0, 1)); // first digit
    long y = long.Parse(n.ToString().Substring(1)); // remaining digits
    BigInteger power = BigInteger.Pow(10, n.ToString().Length - 1);
    var log = Math.Round(BigInteger.Log10(power)); // BigInteger.Log10 can give rounding errors like 2.99999
    return (power * Solution(x - 1)) // x place, numbers from 1 (inclusive) to x*power (exclusive)
        + (x * (y + 1)) // x place, numbers from x*power (inclusive) to n (inclusive)
        // x * Solution(power - 1), the lower-order places up to x*power, replaced by its closed form:
        + (x * 45 * new BigInteger(log) * BigInteger.Pow(10, (int)log - 1))
        + Solution(y); // lower-order places, numbers from x*power (inclusive) to n (inclusive)
}
Second algorithm (deduced from formula above) in C#
static BigInteger Solution2(BigInteger n)
{
    if (n <= 0)
        return 0;
    if (n < 10)
        return (n * (n + 1)) / 2; // sum of arithmetic progression
    BigInteger r = BigInteger.ModPow(n, 1, 10); // decompose n = 10*k + r
    BigInteger k = BigInteger.Divide(n, 10);
    return 45 * k
        + 10 * Solution2(k - 1) // 10*S((n/10)-1)
        + (1 + r) * (k.ToString().ToCharArray().Select(x => int.Parse(x.ToString())).Sum()) // (1+n%10)*D(n/10)
        + (r * (r + 1)) / 2; // n%10*(1+n%10)/2
}
EDIT: According to my tests, this runs faster than both the original version (which used recursion twice) and the version modified to calculate Solution(power - 1) in a single step.
PS: I'm not sure, but I guess that if I had split off the first digit of the number instead of the last, I might have reached a solution like the original algorithm.

Efficient Algorithm to Compute The Summation of the Function

We are given N points of the form (x,y) and we need to compute the following function:
F(i,j) = ( | X[i] - X[j] | ) * ( | Y[i] - Y[j] | )
Compute Summation of F(i,j) for all ordered pairs (i,j)
N <= 300000
I am looking for a O(N log N) solution.
My initial thought was to sort the points by X and then use a BIT but I'm not being able to formulate a clear solution.
I have a solution using O(N log(M)) time and O(M) memory, where M is the size of range of Y. It's similar to what you are thinking.
First sort the points so that the X coordinates are increasing.
Let's write A for the sum of (X[i] - X[j]) * (Y[i] - Y[j]) for all pairs i > j such that Y[i] > Y[j], and B for the sum of the same expression for all pairs i > j such that Y[i] < Y[j].
The sum A + B can be calculated easily in O(N) time, and the final answer can be calculated from A - B. Thus it suffices to calculate A.
Now create a binary indexed tree whose nodes are indexed by intervals of the form [a, b) with b = a + 2^k for some k. (Not a well-formed sentence, but you know what I mean, right?) The root node should cover the interval [Y_min, Y_max] of possible values of Y.
For any node indexed by [a, b) and for any i, let f(a, b, i) be the following polynomial:
f(a, b, i)(X, Y) = sum of (X - X[j]) * (Y - Y[j]) for all j such that j < i and Y[j] < Y
It is of the form P * XY + Q * X + R * Y + S, thus such a polynomial can be represented by the four numbers P, Q, R, S.
Now, beginning with i = 0, you may calculate f(a, b, i)(X[i], Y[i]). To go from i to i + 1, you only need to update those intervals [a, b) containing Y[i]. When you reach i = N, the value of A has been calculated.
If you can afford O(M) memory, then this should work fine.
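The four-aggregate idea (the P, Q, R, S coefficients above) can be sketched with an array-based Fenwick tree over coordinate-compressed Y values, which brings the memory down to O(N). All names are mine; this computes the total over unordered pairs, so double it if you want the sum over ordered pairs. It accumulates A directly and the signed total T = A + B in the same pass, then returns A - B = 2A - T.

```python
def total_F(points):
    # Sum of |Xi-Xj| * |Yi-Yj| over all unordered pairs, O(N log N).
    pts = sorted(points)                         # sort by X
    ys = sorted({y for _, y in pts})
    rank = {y: i + 1 for i, y in enumerate(ys)}  # 1-based ranks for the BIT
    m = len(ys)
    # Four Fenwick trees aggregating: count, sum X, sum Y, sum X*Y
    cnt = [0] * (m + 1); sx = [0] * (m + 1)
    sy = [0] * (m + 1);  sxy = [0] * (m + 1)

    def update(entries, pos):
        while pos <= m:
            for tree, v in entries:
                tree[pos] += v
            pos += pos & -pos

    def query(pos):  # aggregates over all stored points with Y-rank <= pos
        c = x = y = xy = 0
        while pos > 0:
            c += cnt[pos]; x += sx[pos]; y += sy[pos]; xy += sxy[pos]
            pos -= pos & -pos
        return c, x, y, xy

    A = 0   # sum over pairs i > j with Y[i] > Y[j] of (Xi-Xj)(Yi-Yj)
    T = 0   # signed sum over all pairs i > j (equals A + B)
    c_all = x_all = y_all = xy_all = 0
    for X, Y in pts:
        c, x, y, xy = query(rank[Y] - 1)         # earlier points, strictly smaller Y
        A += c * X * Y - Y * x - X * y + xy
        T += c_all * X * Y - Y * x_all - X * y_all + xy_all
        update([(cnt, 1), (sx, X), (sy, Y), (sxy, X * Y)], rank[Y])
        c_all += 1; x_all += X; y_all += Y; xy_all += X * Y
    return 2 * A - T
```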
