How to solve this recurrence? - algorithm

I have a recurrence like this f(n)=(2*f(n-1)+2*f(n-2))%10007;
Now, for a particular n, I need to find:
g(n) = (f(n)f(0) + f(n-1)f(1) + ... + f(0)f(n)) % 10007.
For example if n=3,
g(3)=(f(3)f(0)+f(2)f(1)+f(1)f(2)+f(0)f(3))%10007.
n can be as large as 10^9. I can find the value of f(n) with matrix exponentiation in O(log n), but I can't figure out how to get g(n).
(I need this to solve a problem from the Amritapuri 2008 regionals.)

Forget about 10007 for a second.
Let F(x) = sum(f(n)*x^n). Then F(x) = (f(0) + (f(1) - 2*f(0))*x) / (1 - 2*x - 2*x^2).
Let G(x)=sum(g(n)*x^n). Then G(x)=F(x)^2.
Thus the problem is reduced to finding the coefficient of a series (modulo 10007).
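As a quick sanity check of the closed form (a sketch; the starting values f(0) = 1 and f(1) = 2 are taken from the tiling answer below and are an assumption here), you can expand the rational function as a power series and compare it with the recurrence. The coefficient of x^n in F(x)^2 is then, by the definition of the Cauchy product, exactly the convolution sum defining g(n).

def series_from_closed_form(terms, f0=1, f1=2):
    """Expand (f0 + (f1 - 2*f0)*x) / (1 - 2x - 2x^2) as a power series."""
    num = [f0, f1 - 2 * f0]
    s = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        # denominator is 1 - 2x - 2x^2, so move the known lower-order terms across
        if n >= 1: c += 2 * s[n - 1]
        if n >= 2: c += 2 * s[n - 2]
        s.append(c)
    return s

def f_from_recurrence(terms, f0=1, f1=2):
    f = [f0, f1]
    while len(f) < terms:
        f.append(2 * f[-1] + 2 * f[-2])
    return f[:terms]

print(series_from_closed_form(8) == f_from_recurrence(8))   # expect True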

BACKGROUND
The original question is about how to tile a 2*n rectangle with 4 types of tiles.
What is unusual is that the tiling must divide into two pieces.
HINT 1
However, you can also think of this as tiling the rectangle with the original 4 tiles coloured red, plus another set of 4 tiles coloured blue, such that the final board has a red side and a blue side.
HINT 2
Let f(n) be the number of ways of tiling a 2*n rectangle with just red tiles, and h(n) be the number of ways of tiling a 2*n rectangle with 0 or more columns of red tiles followed by 1 or more columns of blue tiles.
HINT 3
You can now find a simple matrix multiplication that gives the next values of h and f in terms of their two previous values and use the standard matrix power exponentiation to find the final values.
EXAMPLE CODE
Here is a Python demonstration that this formula gives the same answer as the original summation.
def f(n):
    """Number of ways to tile a 2*n board with red tiles"""
    if n < 0: return 0
    if n == 0: return 1
    return 2*f(n-1) + 2*f(n-2)

def g_orig(n):
    """Number of ways to tile a 2*n board in two halves"""
    return sum(f(k)*f(n-k) for k in range(n+1))

def h(n):
    """Number of ways to tile a 2*n board with red tiles and at least one column of blue tiles"""
    if n < 1: return 0
    # Consider placing one column of blue tiles (either a 2*1 tile or two 1*1 tiles)
    t = 2*(f(n-1) + h(n-1))
    # Also consider placing two columns of blue tiles (either a 2*2 tile, or an L-shaped tile plus a 1*1)
    t += 2*(f(n-2) + h(n-2))
    return t

def g(n):
    return f(n) + h(n)

for n in range(10):
    print(n, g_orig(n), g(n))
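Building on HINT 3, here is one possible sketch of the matrix-power step modulo 10007 (the state vector [f(k), f(k-1), h(k), h(k-1)], the 4x4 matrix layout and the helper names below are my own choices; the matrix entries simply restate the recurrences used in f and h above). The output should agree with g_orig(n) % 10007.

MOD = 10007

def mat_mul(A, B):
    """Multiply two square matrices modulo MOD."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % MOD for j in range(n)]
            for i in range(n)]

def mat_pow(A, e):
    """Fast exponentiation of a square matrix modulo MOD."""
    n = len(A)
    R = [[1 if i == j else 0 for j in range(n)] for i in range(n)]   # identity
    while e > 0:
        if e % 2 == 1:
            R = mat_mul(R, A)
        A = mat_mul(A, A)
        e //= 2
    return R

def g_fast(n):
    """g(n) = f(n) + h(n) mod 10007, computed in O(log n)."""
    if n == 0:
        return 1
    # state vector is [f(k), f(k-1), h(k), h(k-1)]; M advances k by one
    M = [[2, 2, 0, 0],
         [1, 0, 0, 0],
         [2, 2, 2, 2],
         [0, 0, 1, 0]]
    # start at k = 1: f(1) = 2, f(0) = 1, h(1) = 2, h(0) = 0
    v = [2, 1, 2, 0]
    P = mat_pow(M, n - 1)
    fn = sum(P[0][j] * v[j] for j in range(4)) % MOD
    hn = sum(P[2][j] * v[j] for j in range(4)) % MOD
    return (fn + hn) % MOD

for n in range(10):
    print(n, g_fast(n))   # should match g_orig(n) % 10007 above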

The trick is that the sequence f(n) mod 10007 is periodic: the pair (f(n), f(n+1)) mod 10007 can only take finitely many values, so it must repeat, and since the recurrence is invertible modulo the prime 10007 the sequence is purely periodic; call the period P. So all you need to do is simply (1) calculate f(0 .. P - 1) mod 10007, (2) reduce the indices in the products f(n - k)f(k) mod 10007 modulo P, and (3) sum them up according to your equation, grouping the terms by k mod P. You don't even need matrix exponentiation to calculate f(n).
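A sketch of this approach (again assuming f(0) = 1 and f(1) = 2 from the tiling interpretation): rather than hard-coding a period, it detects when the pair (f(k), f(k+1)) mod 10007 returns to its starting value, then groups the terms of g(n) by k mod the period. It assumes the period is small enough to enumerate directly.

MOD = 10007

def g_periodic(n, f0=1, f1=2):
    """Compute g(n) mod 10007 by exploiting the periodicity of f mod 10007."""
    # generate f mod MOD until the pair (f(k), f(k+1)) returns to (f(0), f(1))
    f = [f0 % MOD, f1 % MOD]
    while True:
        f.append((2 * f[-1] + 2 * f[-2]) % MOD)
        if f[-2] == f[0] and f[-1] == f[1]:
            break
    period = len(f) - 2
    # g(n) = sum_{k=0..n} f(k) f(n-k); both factors depend only on k mod period,
    # so group the k with the same residue r and count how many of them there are
    total = 0
    for r in range(min(period, n + 1)):
        count = (n - r) // period + 1
        total = (total + count * f[r] * f[(n - r) % period]) % MOD
    return total

print(g_periodic(3))   # 56 with the assumed starting values, matching the convolution for n = 3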

Related

Total number of ways in which the bricks can be arranged on the wall?

Suppose we have a wall of size n*3 and bricks of size 1*3, 2*3, and 3*3; the bricks can be placed horizontally or vertically. What is the total number of ways to arrange the bricks to fill the wall, and what is the recurrence relation for this problem?
I think it is T(n) = T(n-1) + 2T(n-2) + 7T(n-3), because for T(n-2) the last two columns can be 1x3 + 1x3 or a 2x3, giving 2T(n-2). For the last three columns: 1x3+1x3+1x3, 1x3+2x3 or 2x3+1x3, the same three laid horizontally, plus the 3x3, so we have 7T(n-3). Is this correct?
Thank you!
This is almost right, but it over-counts several terms. For example, a solution S for T(n-2) can have two vertical 1-bricks added after it to become a solution for T(n). If you add one 1-brick after S, it's a solution for T(n-1), so the arrangement S + two 1-bricks is being counted in your T(n-2) and T(n-1) terms.
Instead, think about how a solution S for T(n) ends on the right. You can show that the (n-1) x 3 initial segment of S is valid for T(n-1) if and only if the final block of S is a vertical 1-block.
Otherwise, when is the (n-2) x 3 initial segment of S the longest valid initial segment of S? Exactly when S ends with a vertical 2-block (if it ended with two vertical 1-blocks, the longest valid initial segment has length n-1, which we've already counted).
The final case is n-3: figure out how many configurations of the last 3 x 3 space are possible such that the longest valid initial segment of S has length n-3. As a hint: the answer, call it 'c', is smaller than 7, which, as you showed, is the count of all configurations of a 3 x 3 space. These give you the coefficients for the recursion, T(n) = T(n-1) + T(n-2) + c*T(n-3), with appropriate base cases for n = 1, 2 and 3.
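If you want to check the coefficient c you derive, a small brute-force counter can help. This is only a sketch: it enumerates each tiling exactly once by always placing a brick whose top-left corner is the first uncovered cell, and the last loop solves T(n) = T(n-1) + T(n-2) + c*T(n-3) for c from the brute-force values.

def count_tilings(n):
    """Brute-force count of tilings of an n x 3 wall with 1x3, 2x3 and 3x3 bricks."""
    ROWS = 3
    grid = [[False] * n for _ in range(ROWS)]   # grid[row][col], True = covered

    def first_empty():
        for col in range(n):
            for row in range(ROWS):
                if not grid[row][col]:
                    return row, col
        return None

    def fits(cells):
        return all(0 <= r < ROWS and 0 <= c < n and not grid[r][c] for r, c in cells)

    def set_cells(cells, value):
        for r, c in cells:
            grid[r][c] = value

    def solve():
        pos = first_empty()
        if pos is None:
            return 1                    # wall completely covered
        row, col = pos
        # every brick that covers the first empty cell has its top-left corner there
        shapes = [
            [(row + d, col) for d in range(3)],                              # 1x3 vertical
            [(row, col + d) for d in range(3)],                              # 1x3 horizontal
            [(row + dr, col + dc) for dr in range(3) for dc in range(2)],    # 2x3 vertical
            [(row + dr, col + dc) for dr in range(2) for dc in range(3)],    # 2x3 horizontal
            [(row + dr, col + dc) for dr in range(3) for dc in range(3)],    # 3x3
        ]
        total = 0
        for cells in shapes:
            if fits(cells):
                set_cells(cells, True)
                total += solve()
                set_cells(cells, False)
        return total

    return solve()

T = [count_tilings(n) for n in range(1, 8)]   # T(1) .. T(7)
print(T)
for i in range(3, len(T)):
    # solve T(n) = T(n-1) + T(n-2) + c*T(n-3) for c
    print((T[i] - T[i - 1] - T[i - 2]) // T[i - 3])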

Counting Number of Rectangles in a 2d histogram with area >=K

The problem: in a 2D histogram with N columns, count the number of rectangles with area ≥ K. The columns have width 1, and I know the height h_i (number of unit squares) of the i-th column.
I've come up with the following O(N²) algorithm: when I fix i, j as the columns spanned by the bottom side of the rectangle, the highest possible rectangle has height h = min(h_i, ..., h_j), and I add max(0, h - ceil(K/(j-i+1)) + 1) to the answer.
I heard there is an O(N log N) algorithm, and I tried to derive it by using the fact
sum_{i=1}^{N} N/i ~ N log N
However, that's all I have and I can't make further progress. Can you give a hint on the algorithm?
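For reference, the O(N²) counting described above might look like this (a sketch; it assumes, as the formula suggests, that the rectangles rest on the base line of the histogram):

def count_rectangles(heights, K):
    """O(N^2) baseline: fix the columns [i, j] forming the bottom side,
    take the limiting height, and count the heights giving area >= K."""
    n = len(heights)
    total = 0
    for i in range(n):
        min_h = heights[i]
        for j in range(i, n):
            min_h = min(min_h, heights[j])   # tallest rectangle over columns i..j
            width = j - i + 1
            need = (K + width - 1) // width  # ceil(K / width), smallest valid height
            total += max(0, min_h - need + 1)
    return total

print(count_rectangles([2, 1, 3], 3))        # rectangles of area >= 3 in a small example histogram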

Find the m by m square that contains the most "conflicting pairs"?

There are two types of units on a 2d plane, green units (G) and red units (R).
The plane is represented as an n by n matrix, each unit is represented as an element in the matrix.
A pair of two units is called a "conflicting pair" if the two are of different colours. The goal is to find the m by m submatrix that contains the most "conflicting pairs".
Example
[R R 0 0 0
R R 0 0 0
0 0 R R 0
0 0 0 G G
0 0 0 G G]
In the above 5 by 5 matrix, the "most conflicting" 3 by 3 submatrix is at the lower right corner, where there are two red units and four green units, which amounts to 8 conflicting pairs within the submatrix.
A naive solution takes O(m^2 n^2) time, iterating over every element of every possible submatrix.
I also thought of using dynamic programming like the summed-area table algorithm; the time complexity would then be O(n^2), which looks good since it already takes O(n^2) to scan each element once.
However, the n by n matrix may be large and sparse and given in a sparse format (like CSR), in which case an O(n^2) algorithm may not be efficient. Any suggestions on how to do better for sparse matrices (and dense matrices)?
If you have k non-empty cells (containing R or G), then you can solve it in O(k^2) time (squeeze the matrix, i.e. compress the coordinates), because an optimal submatrix has a non-empty cell on its border.
Or perhaps O(k * (log n)^2) if you use a two-dimensional sparse segment tree to get sums over a rectangle.
The answer is given by
idx = argmax SUM(X_r,m) * SUM(X_g,m)
where SUM(X,m) returns a matrix with the sum of units in each m x m window, X_r and X_g are the matrices with only the red and green units enabled respectively, and idx is the m x m window with the largest number of conflicting pairs.
The question then becomes can SUM(X,m) be more efficiently calculated for sparse matrices. I think the answer is: it really depends on the structure of X and the value of m.
An obvious way to make use of the sparsity of X is to compute SUM(X,m) by using the identity
SUM(X,m) = transpose(SUM1d( transpose(SUM1d(X,m) ), m )) (1)
where SUM1d(X,m) is the results of summing intervals of length m along rows of X. Clearly, SUM1d can be implemented in O(n) time for each row, and O(n^2) for the entire matrix, in a similar fashion to the Sum-Area-Table algorithm. This yields the same complexity O(n^2) for the entire algorithm. But that is rather uninteresting as it's the same runtime as a Sum-Area-Table algorithm.
What is interesting is asking whether SUM1d(X,m) can be implemented to take advantage of any sparsity of X. It's clear that SUM1d can be implemented to take full advantage of the sparsity of the input matrix; however, depending on the structure of X and the size of m the output matrix may not be sparse.
Assuming m is much less than n, implementing SUM1d(X,m) as described in eq (1) above can be done in O(nz_row) time per row, where nz_row is the maximum number of non-zero elements in any row of X. Furthermore, SUM1d(X,m) will produce a sparse matrix, albeit with up to m times as many non-zeros. Since we assume m is much less than n, this is still a sparse matrix and will still translate into efficiency gains.
Therefore, we should expect O(n*nz_row) for the first call to SUM1d in eq (1) and O(n*m*nz_col) for the second call to SUM1d.
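To illustrate the last point, here is a sketch of SUM1d applied to a single sparse row (the dict-of-columns representation and the convention of indexing windows by their left column are choices made for this example, not part of the question):

from collections import defaultdict

def sum1d_sparse_row(row, m, n):
    """Sliding-window sums of length m over one sparse row of length n.
    `row` is a dict {column: value}; the result maps the left column of each
    window to that window's sum, keeping only windows with a non-zero sum."""
    # a non-zero at column c contributes to the windows starting at
    # max(0, c - m + 1) .. min(c, n - m)
    delta = defaultdict(int)
    for c, v in row.items():
        lo, hi = max(0, c - m + 1), min(c, n - m)
        if lo <= hi:
            delta[lo] += v
            delta[hi + 1] -= v
    # sweep the breakpoints; between breakpoints the window sum is constant
    out = {}
    running = 0
    starts = sorted(delta)
    for i, s in enumerate(starts):
        running += delta[s]
        if running:
            nxt = starts[i + 1] if i + 1 < len(starts) else n - m + 1
            for start in range(s, min(nxt, n - m + 1)):
                out[start] = running
    return out

print(sum1d_sparse_row({2: 1, 5: 1}, 3, 8))   # windows 0..5 each contain exactly one unit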

EM algorithm: comprehension and example

I'm studying pattern recognition and I found an interesting algorithm that I'd like to look at more deeply, the Expectation-Maximization algorithm. I don't have much background in probability and statistics, and I've read some articles on how the algorithm works for normal (Gaussian) distributions, but I would like to start with a simple example to understand it better. I hope the following example is suitable.
Assume we have a jar with balls of 3 colours: red, green and blue. The corresponding probabilities of drawing each colour are pr, pg and pb. Now, assume that we have the following parametrized model for the probabilities of drawing the different colours:
pr = 1/4
pg = 1/4 + p/4
pb = 1/2 - p/4
with p an unknown parameter. Now assume that the man doing the experiment is actually colour-blind and cannot tell the red balls from the green ones. He draws N balls, but only sees
m1 = nR + nG red/green balls and m2 = nB blue balls.
The question is: can the man still estimate the parameter p, and with that in hand compute his best guess for the numbers of red and green balls (obviously, he knows the number of blue balls)? I think he obviously can, but what about EM? What do I have to consider?
Well, the general outline of the EM algorithm is that if you know the values of some of the parameters, then computing the MLE for the other parameters is very simple. The commonly-given example is mixture density estimation. If you know the mixture weights, then estimating the parameters for the individual densities is easy (M step). Then you go back a step: if you know the individual densities then you can estimate the mixture weights (E step). There isn't necessarily an EM algorithm for every problem, and even if there is one, it's not necessarily the most efficient algorithm. It is, however, usually simpler and therefore more convenient.
In the problem you stated, you can pretend that you know the numbers of red and green balls and then you can carry out ML estimation for p (M step). Then with the value of p you go back and estimate the numbers of red and green balls (E step). Without thinking about it too much, my guess is that you could reverse the roles of the parameters and still work it as an EM algorithm: you could pretend that you know p and carry out ML estimation for the numbers of balls, then go back and estimate p.
If you are still following, we can work out formulas for all this stuff.
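Here is a minimal sketch of that EM loop for the jar example (the function name, starting value and fixed iteration count are illustrative choices): the E step fills in the expected number of green balls among the m red/green draws, and the M step maximises the complete-data likelihood in p, which for this model works out to p = (2*nG - nB) / (nG + nB).

def em_estimate_p(m_red_green, n_blue, p0=0.5, iters=50):
    """Minimal EM sketch for the jar example: pr = 1/4, pg = (1 + p)/4,
    pb = (2 - p)/4, with only nR + nG and nB observed."""
    p = p0
    for _ in range(iters):
        # E step: expected number of green balls among the red/green draws
        pr, pg = 0.25, 0.25 * (1 + p)
        exp_green = m_red_green * pg / (pr + pg)
        # M step: maximise the complete-data likelihood in p; for this model
        # that works out to p = (2*nG - nB) / (nG + nB), using the expected nG
        p = (2 * exp_green - n_blue) / (exp_green + n_blue)
        p = max(-1.0, min(2.0, p))   # keep p in its valid range [-1, 2]
    return p

print(em_estimate_p(60, 40))   # should converge towards the direct MLE (about 0.4 for these counts)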
When "p" is not known, you can go for maximum likelihood estimation (MLE).
First, from your descriptions, "p" has to be in [-1, 2] or the probabilities will not make sense.
You have two certain observations: nG + nR = m and nB = N - m (m = m1, N = m1 + m2)
The probability of this happening is N! / (m! (N - m)!) * (1 - pb)^m * pb^(N - m).
Ignoring the constant of N choose m, we will maximize the second term:
p* = argmax over p of (1 - pb)^m pb^(N - m)
The easy solution is that p* should make pb = (N - m) / N = 1 - m / N.
So 0.5 - 0.25 p* = 1 - m / N ==> p* = max(-1, -2 + 4 * m / N).
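As a sketch, the same closed form in code, which should agree with the fixed point of the EM iteration in the previous answer:

def mle_p(m, N):
    """Direct MLE: pb = (N - m)/N, i.e. 0.5 - 0.25*p = 1 - m/N, clamped to the valid range."""
    return max(-1.0, 4.0 * m / N - 2.0)

print(mle_p(60, 100))   # about 0.4, matching the EM estimate above for the same counts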

Bijection on the integers below x

I'm working on image processing, and I'm writing a parallel algorithm that iterates over all the pixels in an image and changes the surrounding pixels based on each pixel's value. In this algorithm, a small amount of non-determinism is acceptable, but I'd rather minimize it by only operating on distant pixels simultaneously. Could someone give me an algorithm that bijectively maps the integers below n to the integers below n, in a fast and simple manner, such that two integers that are close to each other before the mapping are likely to be far apart after it?
For simplicity let's say n is a power of two. Could you simply reverse the order of the least significant log2(n) bits of the number?
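For example, a sketch of that bit-reversal mapping (assuming n is a power of two, as stated):

def bit_reverse(i, n):
    """Reverse the log2(n) least significant bits of i; a bijection on 0..n-1 when n is a power of two."""
    bits = n.bit_length() - 1          # log2(n) for a power of two
    out = 0
    for _ in range(bits):
        out = out * 2 + i % 2
        i //= 2
    return out

n = 16
print([bit_reverse(i, n) for i in range(n)])   # a permutation of 0..15; consecutive inputs land far apart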
Considering the pixels to be a one-dimensional array, you could use a hash function j = i*p % n, where n is the zero-based index of the last pixel and p is a prime number chosen to place successive pixels far enough apart. % is the remainder operator in C; mathematically I'd write j(i) = i p (mod n).
So if you want to jump at least 10 rows at each iteration, choose p > 10 * w, where w is the screen width. You'll want a lookup table for p as a function of n and w, of course.
Note that j hits every pixel as i goes from 0 to n, provided p is coprime to the modulus.
CORRECTION: Use (mod (n + 1)), not (mod n). The last index is n, which cannot be reached using mod n, since n (mod n) == 0.
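A sketch of the corrected traversal (the image size and the choice of p below are only for illustration; note that p must also be coprime to the number of pixels for the map to be a bijection):

import math

def traversal_order(width, height, p):
    """Visit pixel indices in the order j(i) = i*p mod (number of pixels),
    i.e. mod (n + 1) in the notation above."""
    n_pixels = width * height
    assert math.gcd(p, n_pixels) == 1, "p must be coprime to the pixel count"
    return [(i * p) % n_pixels for i in range(n_pixels)]

order = traversal_order(640, 480, 6421)   # 6421 is prime and > 10 * 640 (jump of at least 10 rows)
print(len(set(order)) == 640 * 480)       # True: every pixel index appears exactly once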
Apart from reversing the bit order, you can use modular arithmetic. Say N is a prime number (like 521); then for all x = 0..520 you define the function
f(x) = x * fac mod N
which is a bijection on 0..520. fac is an arbitrary number different from 0 and 1. For example, for N = 521 and fac = 122 you get a mapping that, plotted as f(x) against x, is quite uniform, and not many numbers end up near the diagonal; there are some, but only a small proportion.
