SubArray Sum of Fibonacci Number - algorithm

I have an array A of N numbers. I have to find the sum, over all subarrays of A, of the Fibonacci number indexed by that subarray's sum.
For example:
A = {1,1}
F[Sum A[1,1]] = F[1] = 1
F[Sum A[1,2]] = F[2] = 1
F[Sum A[2,2]] = F[1] = 1
Ans = 1 + 1 + 1 = 3
The question is similar to this one, but I need to compute the sum for a standard Fibonacci sequence.
Here is the source code. What property is used here?
Can anyone explain the math behind this? How can an O(N^2) solution be avoided? How would the source code be modified for standard Fibonacci numbers?

Here is a solution that runs in O(N * log MAX_A * k^3) time, where MAX_A is the maximum value in the array and k is the number of variables in the recurrence (k = 2 for Fibonacci and k = 3 for the problem you linked in your question). I will not refer to the source code you have provided, as it is pretty hard to read, but it looks like it uses similar ideas:
Let's learn how to compute the n-th Fibonacci number quickly. It's easy to see that [f(n+1), f(n)].transposed() = F^n * [1, 0].transposed(), where F is the matrix [[1, 1], [1, 0]] (it would be [[1, 1, 1], [1, 0, 0], [0, 0, 1]] for the problem you linked); in particular, f(n) = (F^n)[0][1]. We can find the n-th power of a matrix quickly using binary exponentiation.
Let's develop this idea further. In terms of matrices, we're asked to evaluate the following expression:
sum over all valid L, R of ((F^(a[L] + ... + a[R]) * [1, 0].transposed())[1])
or, equivalently (as matrix multiplication distributes over addition),
((sum over all valid L, R of F^(a[L] + ... + a[R])) * [1, 0].transposed())[1]
Now we need to figure out how to compute this sum efficiently. Let's do it incrementally. Assume that all subarrays ending at position n or earlier have already been processed and we want to add those ending at the next position. According to the problem statement, we should add F^a[n + 1] + F^(a[n + 1] + a[n]) + ... + F^(a[n + 1] + a[n] + ... + a[0]) to the answer, which is equal to F^a[n + 1] * (I + F^a[n] + F^(a[n] + a[n - 1]) + ... + F^(a[n] + a[n - 1] + ... + a[0])). Note the identity matrix I inside the parentheses: it accounts for the subarray consisting of a[n + 1] alone.
That's it. We already have an efficient solution. Here is the pseudocode:
pref_sum = I   // identity matrix; holds the second factor of the product
               // shown above, i.e. I + F^a[i] + F^(a[i] + a[i-1]) + ...
total_sum = zeros   // zero matrix of an appropriate size;
                    // holds the answer to the problem
for i = 0 .. N - 1:
    cur_matrix = binary_exponentiation(F, a[i])   // the matrix for the a[i]-th Fibonacci number
    // everything is updated according to the formulas shown above
    total_sum += pref_sum * cur_matrix
    pref_sum = pref_sum * cur_matrix + I
// Now we just need to multiply total_sum by an appropriate vector
// ([1, 0].transposed() in the case of Fibonacci numbers) and take
// the component corresponding to f(s)
Assuming that the size of the matrix F is constant (again, 2 in the case of Fibonacci numbers), the time complexity is O(N * (log MAX_A + const)) = O(N * log MAX_A), as there is just one binary exponentiation for each element of the original array, followed by a constant number of matrix multiplications and additions.
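For concreteness, here is a runnable Python sketch of the pseudocode above. Python and the modulus are my assumptions (contest versions of this problem typically ask for the answer modulo a large prime; without a modulus the matrix entries grow huge):

def subarray_fib_sum(a, mod=10**9 + 7):
    def mul(X, Y):  # 2x2 matrix product mod `mod`
        return [[(X[i][0] * Y[0][j] + X[i][1] * Y[1][j]) % mod
                 for j in range(2)] for i in range(2)]
    def add(X, Y):  # 2x2 matrix sum mod `mod`
        return [[(X[i][j] + Y[i][j]) % mod for j in range(2)] for i in range(2)]
    def mat_pow(X, p):  # binary exponentiation
        R = [[1, 0], [0, 1]]
        while p:
            if p & 1:
                R = mul(R, X)
            X = mul(X, X)
            p >>= 1
        return R
    F = [[1, 1], [1, 0]]
    I = [[1, 0], [0, 1]]
    pref = I                  # pref_sum in the pseudocode
    total = [[0, 0], [0, 0]]  # total_sum in the pseudocode
    for x in a:
        cur = mat_pow(F, x)
        total = add(total, mul(pref, cur))
        pref = add(mul(pref, cur), I)
    return total[0][1]        # (F^s)[0][1] = f(s), summed over all subarray sums s

print(subarray_fib_sum([1, 1]))  # 3, matching the example above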

Related

Maximize sum of two numbers plus distance between them

We are given a square matrix of numbers, e.g.
1 9 2
3 8 3
2 1 1
The distance between adjacent numbers is 2. We want to find such two numbers, in the same row or in the same column, that their sum plus the distance between them is maximal. For example, in the example above, such numbers are 9 and 8 and the max result is 9+8+1*2 = 19. We want to find just the maximal result, we don't need which specific numbers sum to it.
That looks like a DP problem to me, but I can't think of an elegant solution.
One can solve the 1D problem (that is, given a list of numbers, find the pair which maximizes sum+distance) using dynamic programming.
bi = 0
best = -10**9  # anything large and negative
for i in range(1, n):  # note: range(1, n), since a is 0-indexed with n elements
    best = max(best, a[i] + a[bi] + (i - bi)*2)
    if a[i] - i*2 > a[bi] - bi*2:
        bi = i
After this code finishes, best stores the maximum sum + distance over all pairs in the list. It works because at iteration i, bi holds the index j < i that maximizes a[j] - 2*j; the number at that index is the best partner (to the left of i) for the number at i.
Once you have this, the 2D problem is straightforward: go through each row and column and apply the 1D algorithm, and return the maximum pair found. Overall for an n by n matrix, this runs in O(n^2) time, which is clearly asymptotically optimal since every element in the matrix needs to be read at least once.
Here is working Python3 code:
def max_sum_dist_1D(a):
    bi = 0
    best = -10**9
    for i in range(1, len(a)):
        best = max(best, a[i] + a[bi] + (i - bi)*2)
        if a[i] - i*2 > a[bi] - bi*2:
            bi = i
    return best

def max_sum_dist_2D(M):
    best_row = max(max_sum_dist_1D(row) for row in M)
    best_col = max(max_sum_dist_1D(col) for col in zip(*M))
    return max(best_row, best_col)

M = [[1, 9, 2], [3, 8, 3], [2, 1, 1]]
print(max_sum_dist_2D(M))

Sum of continuous sequences

Given an array A with N elements, I want to find the sum of the minimum elements of all possible contiguous subsequences of A. I know that if N is small we can enumerate all possible subsequences, but since N is up to 10^5, what is the best way to find this sum?
Example: let N = 3 and A = [1, 2, 3]; then the answer is 10, as the possible contiguous subsequences are {(1), (2), (3), (1,2), (1,2,3), (2,3)}, so the sum of minimum elements = 1 + 2 + 3 + 1 + 1 + 2 = 10.
Let's fix one element a[i]. We want to know the position L of the rightmost element smaller than a[i] located to the left of i. We also need the position R of the leftmost element smaller than a[i] located to the right of i.
If we know L and R, we should add (i - L) * (R - i) * a[i] to the answer: a[i] is the minimum of exactly those subarrays whose left endpoint lies in (L, i] and whose right endpoint lies in [i, R), and there are (i - L) * (R - i) such subarrays.
It is possible to precompute L and R for all i in linear time using a stack. Pseudocode:
s = new Stack
L = new int[n]
fill(L, -1)
for i <- 0 ... n - 1:
    while !s.isEmpty() && s.top().first > a[i]:
        s.pop()
    if !s.isEmpty():
        L[i] = s.top().second
    s.push(pair(a[i], i))
We can reverse the array and run the same algorithm to find R.
How do we deal with equal elements? Let's treat a[i] as the pair <a[i], i>. All elements are distinct now.
The time complexity is O(n).
Here is the full pseudocode (I assume that int can hold any integer value here; you should choose a suitable type to avoid overflow in real code. I also assume that all elements are distinct):
int[] getLeftSmallerElementPositions(int[] a):
    s = new Stack
    L = new int[n]
    fill(L, -1)
    for i <- 0 ... n - 1:
        while !s.isEmpty() && s.top().first > a[i]:
            s.pop()
        if !s.isEmpty():
            L[i] = s.top().second
        s.push(pair(a[i], i))
    return L

int[] getRightSmallerElementPositions(int[] a):
    R = getLeftSmallerElementPositions(reversed(a))
    for i <- 0 ... n - 1:
        R[i] = n - 1 - R[i]
    return reversed(R)

int findSum(int[] a):
    L = getLeftSmallerElementPositions(a)
    R = getRightSmallerElementPositions(a)
    int res = 0
    for i <- 0 ... n - 1:
        res += (i - L[i]) * (R[i] - i) * a[i]
    return res
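Here is a runnable Python sketch of the pseudocode above (Python is my choice, not the answer's). Instead of pairing each value with its index, it breaks ties by using a strict comparison on one side and a non-strict one on the other, which has the same effect:

def sum_of_subarray_minimums(a):
    n = len(a)
    left = [-1] * n   # left[i]: nearest index to the left holding a smaller element
    right = [n] * n   # right[i]: nearest index to the right holding a smaller element
    stack = []
    for i in range(n):
        while stack and a[stack[-1]] > a[i]:
            stack.pop()
        left[i] = stack[-1] if stack else -1
        stack.append(i)
    stack = []
    for i in range(n - 1, -1, -1):
        while stack and a[stack[-1]] >= a[i]:  # >= breaks ties by index
            stack.pop()
        right[i] = stack[-1] if stack else n
        stack.append(i)
    return sum((i - left[i]) * (right[i] - i) * a[i] for i in range(n))

print(sum_of_subarray_minimums([1, 2, 3]))  # 10, matching the example above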
If the list is sorted, you can consider all sets of size 1, then 2, then 3, up to N. This is initially somewhat inefficient, but an optimized version is below. Here's some pseudocode.
let A = {1, 2, 3}
let total_sum = 0
for set_size <- 1 to N:
    total_sum += sum(A[1 : N - (set_size - 1)])
First, sets with one element: {{1}, {2}, {3}}: sum each of the elements.
Then, sets of two elements: {{1, 2}, {2, 3}}: sum each element but the last.
Then, sets of three elements: {{1, 2, 3}}: sum each element but the last two.
But this algorithm is inefficient. To optimize it to O(n), multiply each i-th element by N - i and sum (indexing from zero here). The intuition is that the first element is the minimum of N sets, the second element is the minimum of N - 1 sets, etc.
I know it's not a Python question, but sometimes code helps:

import numpy as np

A = [1, 2, 3]
# This is [3, 2, 1]
scale = list(range(len(A), 0, -1))
# Take the element-wise product of the vectors, and sum
print(sum(a*b for (a, b) in zip(A, scale)))
# Or just use the dot product
print(np.dot(A, scale))

Algorithm to find entries of array summing to 0 in O(n log(n)) time

I'm working on the following problem:
Let A be an array of length n with each element -10n <= A[i] <= 10n. Create an algorithm running in O(nlog(n)) time that determines whether or not there exist entries A[i], A[j], and A[k] (i, j, and k not necessarily distinct) such that A[i] + A[j] + A[k] = 0.
I'm approaching it in the following way. Define a polynomial p of degree n-1 such that the coefficient on the x^k term is A[k]. Then use the FFT to multiply p with itself, and then multiply the resulting polynomial again by p. If any of the coefficients in the resulting polynomial are 0, then return true. Else, return false. Since the FFT is O(nlog(n)), this algorithm is then O(nlog(n)).
The problem I'm running into is that the FFT combines like terms, so to speak. Thus, the existence of a zero coefficient does not imply that such entries exist.
Could anyone suggest a modification to this algorithm to improve it?
If I remember it right, the way to solve this problem is:
Define a polynomial where the coefficient on the term x^k is the number of occurrences of the element k - 10n in the array (its degree is at most 20n, since the elements lie in [-10n, 10n]). For instance, if n = 8, the coefficient on x^5 is the number of occurrences of -75 (-75 = 5 - 10*8).
Use the FFT, with at least 60n + 1 points so that the cube does not wrap around, to raise that polynomial to the third power.
See if the coefficient on x^(30n) is non-zero. If it is, there's an answer.
Here's a sample implementation in Python; it seems to work for all the cases I came up with:
import numpy as np
from numpy.fft import fft, ifft

def hasZeroSum(a):
    n = len(a)
    b = [0 for x in range(n * 60 + 1)]
    for el in a: b[el + 10 * n] += 1
    f = fft(b, n * 60 + 1)
    f = np.power(f, 3)
    res = ifft(f, n * 60 + 1)
    return np.absolute(res[n * 30]) > 0.5

print(hasZeroSum([-11, -5, 2, 3, 7]))
print(hasZeroSum([-11, -5, 2, 4, 8]))
Prints
True
False

3SUM problem (finding triplets) in better than O(n^2)

Consider the following problem:
Given an unsorted array of integers, find all triplets that satisfy x^2 + y^2 = z^2.
For example, if the given array is 1, 3, 7, 5, 4, 12, 13, the answer should be 5, 12, 13 and 3, 4, 5.
I suggested the following algorithm with complexity O(n^2):
Sort the array in descending order, O(n log n).
Square each element, O(n).
Now it reduces to the problem of finding all triplets (a, b, c) in a sorted array such that a = b + c.
The interviewer was insisting on a solution better than O(n^2).
I have read about the 3SUM problem on Wikipedia, which notes that the problem can be solved in O(n + u log u) if the numbers are in the range [-u, u], assuming the array can be represented as a bit vector. But I am not able to get a clear picture of the further explanation.
Can someone please help me in understanding what is going on with a nice example?
First of all, finding all triplets is O(n^3) in the worst case. Suppose you have n = 3k numbers: k of them are 3, k are 4, and k are 5.
3,...,3,4,...,4,5,...,5
There are k^3 = n^3/27 = O(n^3) such triplets. Just printing them takes O(n^3) time.
Next is an explanation of the 3SUM problem in the following form:
Given numbers s_1, ..., s_n, each in the range [-u, u], how many triplets a, b, c are there such that a + b = c?
Transform: compute the 2u + 1 numbers a_-u, ..., a_0, ..., a_u, where a_i is the number of elements s_j with s_j = i. This is done in O(n + u).
Count the triplets that use zeros first: res = a_0 * sum(i = -u..u, i ≠ 0, C(a_i, 2)) + C(a_0, 3), and then set a_0 = 0. (A zero combines with two equal elements, since 0 + s = s, and three zeros give 0 + 0 = 0; zeroing a_0 excludes zeros from the counting below.)
Build a polynomial P(x) = sum(i = -u..u, a_i * x^(i+u)).
Find Q(x) = P(x) * P(x) using FFT.
Note that Q(x) = sum(i = -2u..2u, b_i * x^(i+2u)), where b_i is the number of pairs s_j, s_k such that s_j + s_k = i (this includes using the same element twice).
For all even i, set b_i = b_i - a_(i/2). This removes the pairs that use the same element twice.
Sum all b_i * a_i / 2 and add this to res.
Example (to keep it simple, I will assume the range of numbers is [0..u] and won't use the +u shift in the powers of x):
Suppose we have the numbers 1, 2, 3:
a_0 = 0, a_1 = 1, a_2 = 1, a_3 = 1
res = 0
P(x) = x + x^2 + x^3
Q(x) = x^2 + 2x^3 + 3x^4 + 2x^5 + x^6
After the subtraction: b_2 = 0, b_3 = 2, b_4 = 2, b_5 = 2, b_6 = 0
res += 0*1/2 + 2*1/2 + 2*0/2 + 2*0/2 + 0*0/2 = 1
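Here is a hedged Python sketch of this counting procedure for the simplified nonnegative case (the function name and the use of numpy's FFT are my choices, not the answer's):

import numpy as np

def count_triplets(s):
    # Count triplets of distinct indices (j, k, l) with s_j + s_k = s_l,
    # for nonnegative integers (the [0..u] case, no +u shift).
    u = max(s)
    a = [0] * (u + 1)
    for v in s:
        a[v] += 1
    # Triplets that use zeros: 0 + x = x needs two equal elements; 0 + 0 = 0 needs three zeros
    res = a[0] * sum(c * (c - 1) // 2 for c in a[1:])
    res += a[0] * (a[0] - 1) * (a[0] - 2) // 6
    a[0] = 0  # exclude zeros; with positive values, s_l > s_j, s_k, so l differs from j and k
    size = 2 * u + 1  # P(x)^2 has degree at most 2u
    f = np.fft.rfft(a, size)
    b = np.rint(np.fft.irfft(f * f, size)).astype(int)  # b[i] = ordered pairs with s_j + s_k = i
    for i in range(0, size, 2):
        b[i] -= a[i // 2]  # remove pairs that use the same element twice
    return res + sum(b[i] * a[i] // 2 for i in range(u + 1))

print(count_triplets([1, 2, 3]))  # 1, the triplet 1 + 2 = 3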
Another possibility (who can fathom the mind of an interviewer?) would be to rewrite the equation as:
x^2 + y^2 = z^2
x^2 = z^2 - y^2 = (z-y)(z+y)
If we knew the prime factorisation of x^2 then we could simply iterate through all possible factorisations into a pair of numbers p, q (with p < q) and compute:
x^2 = p*q = (z-y)(z+y)
p+q = (z-y)+(z+y) = 2z
q-p = (z+y)-(z-y) = 2y
z = (p+q)/2
y = (q-p)/2
So given a factorisation x^2 = p*q we can work out the z and y values. By putting all the integer values into a set we can then check each candidate in O(1) time per check by looking to see whether those z, y values are in the array (taking care that negative values are also detected).
Wikipedia says that a randomly chosen integer has about log(n) divisors, so this should take about n log(n) overall, assuming you can do the factorisation fast enough (e.g. if you knew all the integers were under a million, you could precompute an array of the smallest prime factor for each integer).
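A minimal Python sketch of this idea, under the simplifying assumption that factor pairs are enumerated by plain trial division up to sqrt(x^2) (the smallest-factor sieve suggested above is omitted, so this sketch does not achieve the n log n bound):

import math

def pythagorean_triplets(arr):
    vals = set(arr)
    found = set()
    for x in arr:
        sq = x * x
        # enumerate factorisations sq = p * q with p <= q
        for p in range(1, math.isqrt(sq) + 1):
            if sq % p:
                continue
            q = sq // p
            if (p + q) % 2:  # z and y must be integers
                continue
            z, y = (p + q) // 2, (q - p) // 2
            if z in vals and y in vals:
                found.add(tuple(sorted((x, y, z))))
    return found

print(pythagorean_triplets([1, 3, 7, 5, 4, 12, 13]))  # {(3, 4, 5), (5, 12, 13)}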

Special case of sparse matrices multiplication

I'm trying to come up with a fast algorithm to compute the product Transpose(A) * L * A, where
L is a symmetric n x n matrix of real numbers.
A is a sparse n x m matrix, m < n. Each row has one and only one non-zero element, and it's equal to 1. It's also guaranteed that every column has at most two non-zero elements.
I came up with one algorithm, but I feel like there should be something faster.
Let's represent every column of A as a pair of row numbers with non-zero elements. If a column has only one non-zero element, its row number is listed twice. E.g. for the following matrix
1 0 0
0 1 0
1 0 0
0 1 0
0 0 1
such a representation would be
column 0: [0, 2]; column 1: [1, 3]; column 2: [4, 4]
Or we can list it as a single array: A = [0, 2, 1, 3, 4, 4]. Now, L' = L * A can be calculated as:
for (i = 0; i < A.length; i += 2):
    if A[i] != A[i + 1]:
        # sum of two column vectors gives the i/2-th column of L'
        L'[i/2] = L[A[i]] + L[A[i + 1]]
    else:
        L'[i/2] = L[A[i]]
To calculate L'' = Transpose(A) * L' we do it one more time:
for (i = 0; i < A.length; i += 2):
    if A[i] != A[i + 1]:
        # sum of two row vectors gives the i/2-th row of L''
        L''[i/2] = L'[A[i]] + L'[A[i + 1]]
    else:
        L''[i/2] = L'[A[i]]
The time complexity of such an approach is O(mn + m^2), and the space complexity (to get the final result) is O(n^2). I'm wondering if it's possible to improve it to O(m^2) in terms of space and/or performance?
The second loop combines at most 2m rows of L', so if m is much smaller than n there will be several rows of L' that are never used.
One way to avoid calculating and storing these unused entries is to change your first loop into a function and only calculate the individual elements of L' as they are needed.
def L'(row, col):
    i = col * 2
    if A[i] != A[i + 1]:
        # element (row, col) of L' is the sum of two elements of L
        return L[row][A[i]] + L[row][A[i + 1]]
    else:
        return L[row][A[i]]

for (i = 0; i < A.length; i += 2):
    if A[i] != A[i + 1]:
        for (k = 0; k < m; k++):
            L''[i/2][k] = L'(A[i], k) + L'(A[i + 1], k)
    else:
        for (k = 0; k < m; k++):
            L''[i/2][k] = L'(A[i], k)
This should then have space and time complexity O(m*m).
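A runnable Python adaptation of this idea, as a sketch (the names lp_elem and transpose_a_l_a are mine, since L' is not a valid Python identifier; A is the asker's pair-per-column array):

def lp_elem(L, A, row, col):
    # element (row, col) of L' = L * A, computed on demand
    i = col * 2
    if A[i] != A[i + 1]:
        return L[row][A[i]] + L[row][A[i + 1]]
    return L[row][A[i]]

def transpose_a_l_a(L, A):
    # m x m result of Transpose(A) * L * A, never materialising L'
    m = len(A) // 2
    res = [[0] * m for _ in range(m)]
    for j in range(m):
        i = j * 2
        for k in range(m):
            res[j][k] = lp_elem(L, A, A[i], k)
            if A[i] != A[i + 1]:
                res[j][k] += lp_elem(L, A, A[i + 1], k)
    return res

# With L = identity, the result is just Transpose(A) * A:
I5 = [[1 if r == c else 0 for c in range(5)] for r in range(5)]
print(transpose_a_l_a(I5, [0, 2, 1, 3, 4, 4]))  # [[2, 0, 0], [0, 2, 0], [0, 0, 1]]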
The operation Transpose(A) * L works as follows:
For each column of A we see (1-indexed):
column 1 has `1` in rows 1 and 3
column 2 has `1` in rows 2 and 4
column 3 has `1` in row 5
The output matrix B = Transpose(A) * L has three rows, which are equal to:
Row(B, 1) = Row(L, 1) + Row(L, 3)
Row(B, 2) = Row(L, 2) + Row(L, 4)
Row(B, 3) = Row(L, 5)
If we multiply C = B * A:
Column(C, 1) = Column(B, 1) + Column(B, 3)
Column(C, 2) = Column(B, 2) + Column(B, 4)
Column(C, 3) = Column(B, 5)
If you follow through this in an algorithmic way, you should achieve something very similar to what Peter de Rivaz has suggested.
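A quick numpy check of these row/column identities on the example matrix (numpy, the random symmetric L, and 0-based indexing are my choices):

import numpy as np

rng = np.random.default_rng(0)
L = rng.integers(0, 10, (5, 5))
L = L + L.T  # make L symmetric
A = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])
# B = Transpose(A) * L: each row of B is a sum of rows of L (0-based here)
B = np.array([L[0] + L[2], L[1] + L[3], L[4]])
# C = B * A: each column of C is a sum of columns of B
C = np.stack([B[:, 0] + B[:, 2], B[:, 1] + B[:, 3], B[:, 4]], axis=1)
assert np.array_equal(C, A.T @ L @ A)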
The time complexity of your algorithm is O(n^2), not O(m*n). The rows and columns of L have length n, and since every row of A contains exactly one 1 while every column contains at most two, m is Θ(n).
If a[k] is the column where row k of A has a 1, then you can write:
A[k,i] = δ(a[k], i)
and the product P = A^T * L * A is:
P[i,j] = Σ(k,l) A^T[i,k] * L[k,l] * A[l,j]
       = Σ(k,l) A[k,i] * L[k,l] * A[l,j]
       = Σ(k,l) δ(a[k], i) * L[k,l] * δ(a[l], j)
If we turn this around and look at what happens to the elements of L, we see that L[k,l] is added to P[a[k],a[l]], and it's easy to get O(m^2) space complexity using O(n^2) time complexity.
Because a[k] is defined for all k=0..n-1, we know that every element of L must appear somewhere in the product. Because there are O(n^2) distinct elements in L, you can't do better than O(n^2) time complexity.
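That scatter formulation translates directly into a short sketch (the function name is mine):

def scatter_product(L, a, m):
    # P[a[k]][a[l]] += L[k][l]: O(n^2) time, O(m^2) space
    n = len(a)
    P = [[0] * m for _ in range(m)]
    for k in range(n):
        for l in range(n):
            P[a[k]][a[l]] += L[k][l]
    return P

# For the example matrix A above, a = [0, 1, 0, 1, 2]
# (row k of A has its single 1 in column a[k])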
