Algorithm for modeling expanding gases on a 2D grid - algorithm

I have a simple program; at its heart is a two-dimensional array of floats representing gas concentrations. I have been trying to come up with a simple algorithm that models the gas expanding outwards, like a cloud, eventually ending up with the same concentration of gas everywhere across the grid.
For example a given state progression could be:
(using ints for simplicity)
starting state
00000
00000
00900
00000
00000
state after 1 pass of algorithm
00000
01110
01110
01110
00000
One more pass should give a 5x5 grid with every cell containing the value 0.36 (9/25).
I've tried it out on paper, but no matter how I try, I can't get my head around the algorithm to do this.
So my question is, how should I set about trying to code this algorithm? I've tried a few things, applying a convolution, trying to take each grid cell in turn and distributing it to its neighbours, but they all end up having undesirable effects, such as ending up eventually with less gas than I originally started with, or all of gas movement being in one direction instead of expanding outwards from the centre. I really can't get my head around it at all and would appreciate any help at all.

It's either a diffusion problem if you ignore convection or a fluid dynamics/mass transfer problem if you don't. You would start with equations for conservation of mass and momentum for an Eulerian (fixed control volume) viewpoint if you were solving from scratch.
It's a transient problem, so you need to perform an integration to advance the state from time t(n) to t(n+1). You show a grid, but nothing about how you're solving in time. What integration scheme have you tried? Explicit? Implicit? Crank-Nicolson? If you don't know, you're not approaching the problem correctly.
One book that I really liked on this subject was S.V. Patankar's "Numerical Heat Transfer and Fluid Flow". It's a little dated now, but I liked the treatment. It's still good after 29 years, though better texts may have appeared since I was reading on the subject. I think it's approachable for somebody looking into it for the first time.

In the example you give, your second stage has a core of 1's. Diffusion usually requires a concentration gradient, so most diffusion-related techniques won't change the 1 in the middle on the next iteration (nor would they have reached that state after the first one, but it's a bit easier to see once you've got a block of equal values). But as the commenters on your post say, that's not likely to be the cause of a net movement. Losing gas may be due to edge effects, but it can also be a question of rounding errors: set the CPU to round half to even, total the gas, and apply a correction now and again.

It looks like you're trying to implement a finite difference solver for the heat equation with Neumann boundary conditions (insulation at the edges). There's a lot of literature on this kind of thing. The Wikipedia page on finite difference method describes a simple but stable method, but for Dirichlet boundary conditions (constant density at edges). Modifying the handling of the boundary conditions shouldn't be too difficult.
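A minimal sketch of one explicit finite-difference step with insulated (no-flux) edges, assuming NumPy is available. The name diffuse_step and the rate parameter are illustrative and not taken from any answer here; rate has to stay at or below 0.25 for this explicit scheme to remain stable.

```python
import numpy as np

def diffuse_step(grid, rate=0.2):
    """One explicit diffusion step with no-flux (insulated) edges."""
    # Repeating the edge values makes the flux across the boundary zero.
    padded = np.pad(grid, 1, mode="edge")
    neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                  padded[1:-1, :-2] + padded[1:-1, 2:])
    # Each cell moves towards the mean of its four neighbours.
    return grid + rate * (neighbours - 4.0 * grid)

grid = np.zeros((5, 5))
grid[2, 2] = 9.0
for _ in range(500):
    grid = diffuse_step(grid)   # a new array each step, so reads never see partial updates
```

Every exchange between two cells appears once with each sign, so the total amount of gas stays constant and the grid settles towards 9/25 in every cell.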

It looks like what you want is something like a smoothing algorithm, often used in programs like Photoshop, or old school demo effects, like this simple Flame Effect.
Whatever algorithm you use, it will probably help you to double buffer your array.
A typical smoothing effect will be something like:
begin loop forever
For every x and y
{
b2[x,y] = (b1[x,y] + (b1[x+1,y]+b1[x-1,y]+b1[x,y+1]+b1[x,y-1])/8) / 2
}
swap b1 and b2
end loop forever
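The loop above can be sketched in plain Python as follows. Two things here are assumptions rather than part of the original pseudocode: edge cells use themselves in place of a missing neighbour, and the weights are 1/2 for the cell and 1/8 for each neighbour. The weights in the pseudocode above add up to 3/4, which suits a flame effect that should fade but removes gas on every pass; with 1/2 and 1/8 they add up to 1.

```python
def smooth(grid, passes):
    """Double-buffered smoothing; returns the grid after `passes` passes."""
    h, w = len(grid), len(grid[0])
    b1 = [row[:] for row in grid]          # work on a copy of the input
    b2 = [[0.0] * w for _ in range(h)]     # second buffer
    for _ in range(passes):
        for y in range(h):
            for x in range(w):
                # Clamp indices so an edge cell stands in for its missing neighbour.
                up = b1[max(y - 1, 0)][x]
                down = b1[min(y + 1, h - 1)][x]
                left = b1[y][max(x - 1, 0)]
                right = b1[y][min(x + 1, w - 1)]
                # Weights 1/2 + 4 * 1/8 = 1, so the total is preserved.
                b2[y][x] = b1[y][x] / 2 + (up + down + left + right) / 8
        b1, b2 = b2, b1                    # swap buffers
    return b1

start = [[0.0] * 5 for _ in range(5)]
start[2][2] = 9.0
result = smooth(start, 50)
```

Reading only from b1 and writing only to b2 is what keeps the spread symmetric; updating a single array in place would let already-updated cells leak into their neighbours in scan order.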

See Tom Forsyth's Game Programming Gems article. Looks like it fulfils your requirements, but if not then it should at least give you some ideas.

Here's a solution in 1D for simplicity:
The initial setup is a concentration of 9 at the origin (shown in parentheses), and 0 at all other positive and negative coordinates.
initial state:
0 0 0 0 (9) 0 0 0 0
The algorithm to find next iteration values is to start at the origin and average current concentrations with adjacent neighbors. The origin value is a boundary case and the average is done considering the origin value, and its two neighbors simultaneously, i.e. average among 3 values. All other values are effectively averaged among 2 values.
after iteration 1:
0 0 0 3 (3) 3 0 0 0
after iteration 2:
0 0 1.5 1.5 (3) 1.5 1.5 0 0
after iteration 3:
0 .75 .75 2 (2) 2 .75 .75 0
after iteration 4:
.375 .375 1.375 1.375 (2) 1.375 1.375 .375 .375
You do these iterations in a loop, outputting the state every n iterations. You may introduce a time constant to control how many iterations represent one second of wall-clock time. This also depends on what length units the integer coordinates represent. For a given hardware system, you can tune this value empirically. You may also introduce a steady-state tolerance to control when the program decides that "all neighbor values are within this tolerance" or "no value changed between iterations by more than this tolerance", and so the algorithm has reached a steady-state solution.

The concentration for each iteration given a starting concentration can be obtained by the equation:
concentration = startingConcentration/(2*iter + 1)**2
iter is the iteration number. So for your example:
startingConcentration = 9
iter = 0
concentration = 9/(2*0 + 1)**2 = 9
iter = 1
concentration = 9/(2*1 + 1)**2 = 1
iter = 2
concentration = 9/(2*2 + 1)**2 = 9/25 = .36
you can set the value of the array after each "time step"
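As a small sketch, the formula can be wrapped in a function. The function name is illustrative. Note that this reproduces the idealised progression from the question (the gas spread evenly over a square that grows by one cell per side each step), not a physical diffusion process.

```python
def concentration(starting_concentration, iteration):
    """Uniform concentration inside the square of side 2*iteration + 1."""
    side = 2 * iteration + 1
    return starting_concentration / side ** 2

values = [concentration(9, i) for i in range(3)]
```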


Find Minimum Time to Occupy Grid

Problem:
Consider a patient suffering from a skin infection whose germs are spreading rapidly. Assume the skin surface is modelled as a rectangular grid of size MxN whose cells are marked 0 or 1, where 0 represents an unaffected region of skin and 1 represents an affected region. Germs can move from one cell of the grid to another in 4 possible directions (right, left, up, down), but can move to only one cell at a time in one direction, affecting that cell in 1 sec. The doctor currently treating the patient sees the status and wants to know how much time is left to save the patient before the germs spread all over the skin and the patient dies. Can you help estimate the minimum time taken for the germs to completely occupy the skin surface?
Input: Current status of skin. (A matrix of size MxN with 1's and 0's which represent affected and unaffected areas)
Output: Min time in sec to cover the whole grid.
Example:
Input:
[1 1 0 0 1]
[0 1 1 0 0]
[0 0 0 0 1]
[0 1 0 0 0]
Output: 2 seconds
Explanation:
After 1 sec from input, matrix could be as below
[1 1 1 0 1]
[1 1 1 0 1]
[0 1 1 0 1]
[0 1 1 0 1]
In next sec, matrix is completely filled by 1's
I will not present a detailed solution here, but some thoughts that hopefully may help you to write your own program.
First step is to determine the kind of algorithm to implement. The optimal way would be to find a simple and fast ad hoc solution for this problem. In the absence of such a solution, for this kind of problem, classical candidates are DFS, BFS, A* ...
As the goal is to find the shortest solution, it seems natural to consider BFS first, as once BFS finds a solution, we know that it is the shortest one and we can stop the search. However, we then have to avoid inflating the number of nodes, as that would lead not only to a huge calculation time, but also to huge memory use.
A first idea to avoid node inflation is to note that some 1 cells can only be expanded into one other cell. In the posted diagram, for example, the cell (0, 0) (top left) can only be expanded to cell (1, 0). Then, after this expansion, cell (1, 1) can only move to cell (2, 1). Therefore, we know it would be suboptimal to move cell (1,1) to cell (1,0). Therefore: move such cells first.
In a similar way, once an infected cell is surrounded by other infected cells only, it is no longer necessary to consider it for next moves.
At the end, it would be convenient to have a list of infected cells, together with the number of non-infected cells that each such cell can move to.
Another idea to limit the number of nodes is to detect duplicates, as it is likely here that many of them will exist. For that, we have to define a kind of hashing. The hash function used does not need to be 100% efficient, but it needs to be calculated rapidly, and if possible in a recursive manner. If we obtain diagram B from diagram A by adding a 1-cell at position (i, j), then I propose something like
H(B) = H(A)^f(i, j)
f(i, j) = a*(1024*i+j)%b
Here, I used the fact that N and M are less than 1000.
Each time a new diagram is considered, we have to calculate the corresponding H value and check whether it already exists in the set of past diagrams.
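A minimal sketch of this incremental hash. The constants A_MULT and B_MOD are hypothetical choices for a and b, which the answer leaves unspecified. Because XOR is commutative and its own inverse, the hash does not depend on the order in which cells were infected.

```python
A_MULT = 2654435761       # hypothetical value for a
B_MOD = (1 << 61) - 1     # hypothetical value for b

def cell_hash(i, j):
    """f(i, j) = a * (1024 * i + j) % b; 1024 * i + j is unique while j < 1024."""
    return A_MULT * (1024 * i + j) % B_MOD

def add_cell(h, i, j):
    """H(B) = H(A) ^ f(i, j): hash of the diagram after infecting cell (i, j)."""
    return h ^ cell_hash(i, j)

def board_hash(cells):
    """Hash of a whole diagram, given the coordinates of its 1-cells."""
    h = 0
    for i, j in cells:
        h = add_cell(h, i, j)
    return h

seen = set()
h = board_hash([(0, 1), (1, 1)])
seen.add(h)
is_duplicate = add_cell(board_hash([(1, 1)]), 0, 1) in seen   # same diagram, other order
```

Two different diagrams can still share a hash, so a hit in the set only marks a candidate duplicate. With this particular f, cell (0, 0) hashes to 0 and never changes H; adding 1 to the argument would avoid that.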
I'm not sure how far I would get with this in an interview situation. After some thought, rather than considering solutions that store more than one full board state, I would rather consider a greedy priority queue since a strong heuristic for the next zero-cell candidates to fill seems to be:
(1) healthy cells that have the least neighbouring infected cells (but at least one, of course),
e.g., choose A over B
1 1 B 0 1
0 1 1 0 0
0 0 A 0 1
0 1 0 0 0
and (2) break ties by choosing first the healthy cells that when infected will block the least infected cells.
e.g., choose A over B
1 1 1 0 1
1 B 1 0 A
0 0 0 0 1
0 1 0 0 0
An interesting observation is that any healthy cell destination can technically be reached in time Manhattan-distance from the nearest infected cell, where the cell leading such a "crawl" continually chooses the single move that brings us closer to the destination. We know that at the same time, though, this same infected-cell "snake" produces new "crawlers" that could reach any equally far or closer neighbours. This makes me wonder if there may be a more efficient way to determine the lower-bound, based on counts of the farthest healthy cells.
This is a variant of the multi-agent pathfinding problem (MAPF). There is a ton of recent work on this topic, but earlier modern work is a good starting point for finding optimal solutions to this problem - for instance the operator decomposition approach.
To do this you would order the agents (germs) 1..k. Then, you would start a search where you generate all possible first moves for germ 1, followed by all possible first moves for germ 2, and so on, where moves for an agent are to stay in place, or to spread to an adjacent unoccupied location. With 4 possible actions for each germ, there are up to 4^k possible actions between complete states. (Partial states occur when you haven't yet assigned actions to all k agents.) The number of actions is exponential, meaning you may run up against resource constraints (time or space) fairly quickly. But, there are only 2^(MxN) states possible. (Since agents don't go away, it's actually 2^(MxN-i) where i is the number of initial germs.)
Every time all (k) germs have considered a possible action, you have a new complete state. (And k then increases for the next iteration.) The minimum time left comes from the shallowest complete state which has the grid filled. A bit of brute-force computation will find the shortest solution. (Quite a bit in the case of large grids.)
You could use a BFS to find the first state that is completely filled. But, A* might do much better. As a heuristic, you could consider that all adjacent locations of all cells were filled in each step, and then compute the number of steps required to fill the grid under that model. That gives a lower bound on the time required to fill the full grid.
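The relaxed model used for the heuristic (every infected cell infects all four neighbours each second) can be computed with a multi-source BFS. A minimal sketch, with the function name chosen here for illustration:

```python
from collections import deque

def spread_lower_bound(grid):
    """Seconds to fill the grid if every infected cell infects all 4 neighbours each second."""
    rows, cols = len(grid), len(grid[0])
    dist = [[-1] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:            # every infected cell is a BFS source
                dist[r][c] = 0
                queue.append((r, c))
    if not queue:
        raise ValueError("no infected cell: the grid can never fill")
    farthest = 0
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] == -1:
                dist[nr][nc] = dist[r][c] + 1
                farthest = max(farthest, dist[nr][nc])
                queue.append((nr, nc))
    return farthest

example = [[1, 1, 0, 0, 1],
           [0, 1, 1, 0, 0],
           [0, 0, 0, 0, 1],
           [0, 1, 0, 0, 0]]
bound = spread_lower_bound(example)
```

For the example grid from the question the bound is 2, which equals the stated answer, so here the relaxation happens to be tight. In general it only says that no schedule can finish sooner.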
But, there are many more optimizations. The reason to do operator decomposition is that you could order the moves to take the best moves first and not consider the weaker possibilities (eg all germs don't spread). You could also use a partial-expansion approach (EPEA*) to avoid generating a lot of clearly suboptimal policies for the germs.
If I were asking this as an interview question, I might be looking to see someone formulate the problem (what are actions, what are states), come up with the lower bound on the solution (every germ expands to every adjacent cell), come up with an algorithm, and perhaps analyze how hard the problem is, in order of increasing difficulty.

Expectation Maximization coin toss examples

I've been self-studying the Expectation Maximization lately, and grabbed myself some simple examples in the process:
http://cs.dartmouth.edu/~cs104/CS104_11.04.22.pdf
There are 3 coins 0, 1 and 2 with P0, P1 and P2 probability landing on Head when tossed. Toss coin 0, if the result is Head, toss coin 1 three times else toss coin 2 three times. The observed data produced by coin 1 and 2 is like this: HHH, TTT, HHH, TTT, HHH. The hidden data is coin 0's result. Estimate P0, P1 and P2.
http://ai.stanford.edu/~chuongdo/papers/em_tutorial.pdf
There are two coins A and B with PA and PB being the probability landing on Head when tossed. Each round, select one coin at random and toss it 10 times then record the results. The observed data is the toss results provided by these two coins. However, we don't know which coin was selected for a particular round. Estimate PA and PB.
While I can get the calculations, I can't relate the ways they are solved to the original EM theory. Specifically, during the M-Step of both examples, I don't see how they're maximizing anything. It just seems they are recalculating the parameters and somehow, the new parameters are better than the old ones. Moreover, the two E-Steps don't even look similar to each other, not to mention the original theory's E-Step.
So how exactly do these examples work?
The second PDF won't download for me, but I also visited the wikipedia page http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm which has more information. http://melodi.ee.washington.edu/people/bilmes/mypapers/em.pdf (which claims to be a gentle introduction) might be worth a look too.
The whole point of the EM algorithm is to find parameters which maximize the likelihood of the observed data. This is the only bullet point on page 8 of the first PDF, the equation for capital Theta subscript ML.
The EM algorithm comes in handy where there is hidden data which would make the problem easy if you knew it. In the three coins example this is the result of tossing coin 0. If you knew the outcome of that you could (of course) produce an estimate for the probability of coin 0 turning up heads. You would also know whether coin 1 or coin 2 was tossed three times in the next stage, which would allow you to make estimates for the probabilities of coin 1 and coin 2 turning up heads. These estimates would be justified by saying that they maximized the likelihood of the observed data, which would include not only the results that you are given, but also the hidden data that you are not - the results from coin 0. For a coin that gets A heads and B tails you find that the maximum likelihood estimate of the probability of heads is A/(A+B) - it might be worth working this out in detail, because it is the building block for the M step.
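For reference, the A/(A+B) result follows from a short calculation. Here p stands for the unknown probability of heads, a symbol introduced just for this derivation:

```latex
\begin{align*}
L(p) &= p^{A}(1-p)^{B}, &
\log L(p) &= A\log p + B\log(1-p), \\
\frac{d}{dp}\log L(p) &= \frac{A}{p} - \frac{B}{1-p} = 0
&\Longrightarrow\quad A(1-p) &= Bp
\quad\Longrightarrow\quad \hat{p} = \frac{A}{A+B}.
\end{align*}
```

The same calculation goes through when A and B are fractional (weighted) counts, which is the form the M step uses.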
In the EM algorithm you say that although you don't know the hidden data, you come in with probability estimates which allow you to write down a probability distribution for it. For each possible value of the hidden data you could find the parameter values which would optimize the log likelihood of the data including the hidden data, and this almost always turns out to mean calculating some sort of weighted average (if it doesn't the EM step may be too difficult to be practical).
What the EM algorithm asks you to do is to find the parameters maximizing the weighted sum of log likelihoods given by all the possible hidden data values, where the weights are given by the probability of the associated hidden data given the observations using the parameters at the start of the EM step. This is what almost everybody, including the Wikipedia article, calls the Q-function. The proof behind the EM algorithm, given in the Wikipedia article, says that if you change the parameters so as to increase the Q-function (which is only a means to an end), you will also have changed them so as to increase the likelihood of the observed data (which you do care about). What you tend to find in practice is that you can maximize the Q-function using a variation of what you would do if you knew the hidden data, but using the probabilities of the hidden data, given the estimates at the start of the EM step, to weight the observations in some way.
In your example it means totting up the number of heads and tails produced by each coin. In the PDF they work out P(Y=H|X=) = 0.6967. This means that you use weight 0.6967 for the case Y=H, which means that you increment the counts for Y=H by 0.6967 and increment the counts for X=H in coin 1 by 3*0.6967, and you increment the counts for Y=T by 0.3033 and increment the counts for X=H in coin 2 by 3*0.3033. If you have a detailed justification for why A/(A+B) is a maximum likelihood of coin probabilities in the standard case, you should be ready to turn it into a justification for why this weighted updating scheme maximizes the Q-function.
Finally, the log likelihood of the observed data (the thing you are maximizing) gives you a very useful check. It should increase with every EM step, at least until you get so close to convergence that rounding error comes in, in which case you may have a very small decrease, signalling convergence. If it decreases dramatically, you have a bug in your program or your maths.
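A minimal sketch of the three-coin EM loop with the log-likelihood check described above. The function name, the fixed number of steps, and the starting values 0.3, 0.3, 0.6 are assumptions made for illustration; lam stands for the probability that coin 0 lands heads.

```python
import math

def em_three_coins(rounds, lam, p1, p2, steps=20):
    """rounds: list of (heads, tails) per round. Returns final parameters and a log-likelihood trace."""
    trace = []
    for _ in range(steps):
        weights = []
        log_lik = 0.0
        for h, t in rounds:
            via1 = lam * p1 ** h * (1 - p1) ** t          # coin 0 = H, then coin 1 tossed
            via2 = (1 - lam) * p2 ** h * (1 - p2) ** t    # coin 0 = T, then coin 2 tossed
            weights.append(via1 / (via1 + via2))          # E step: P(coin 0 = H | this round)
            log_lik += math.log(via1 + via2)
        trace.append(log_lik)                             # likelihood under the current parameters
        # M step: the A / (A + B) estimate, with counts weighted by the E-step probabilities.
        heads1 = sum(w * h for w, (h, t) in zip(weights, rounds))
        tosses1 = sum(w * (h + t) for w, (h, t) in zip(weights, rounds))
        heads2 = sum((1 - w) * h for w, (h, t) in zip(weights, rounds))
        tosses2 = sum((1 - w) * (h + t) for w, (h, t) in zip(weights, rounds))
        lam = sum(weights) / len(rounds)
        p1 = heads1 / tosses1
        p2 = heads2 / tosses2
    return lam, p1, p2, trace

# HHH, TTT, HHH, TTT, HHH as (heads, tails) counts
rounds = [(3, 0), (0, 3), (3, 0), (0, 3), (3, 0)]
lam, p1, p2, trace = em_three_coins(rounds, 0.3, 0.3, 0.6)
never_decreases = all(b >= a - 1e-9 for a, b in zip(trace, trace[1:]))
```

With these assumed starting values, the first E step gives a weight of about 0.6967 to coin 1 for a TTT round. The never_decreases flag is the sanity check from the paragraph above: if it ever comes out False, suspect a bug.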
As luck would have it, I have been struggling with this material recently as well. Here is how I have come to think of it:
Consider a related, but distinct algorithm called the classify-maximize algorithm, which we might use as a solution technique for a mixture model problem. A mixture model problem is one where we have a sequence of data that may be produced by any of N different processes, of which we know the general form (e.g., Gaussian) but we do not know the parameters of the processes (e.g., the means and/or variances) and may not even know the relative likelihood of the processes. (Typically we do at least know the number of the processes. Without that, we are into so-called "non-parametric" territory.) In a sense, the process which generates each data point is the "missing" or "hidden" data of the problem.
Now, what this related classify-maximize algorithm does is start with some arbitrary guesses at the process parameters. Each data point is evaluated according to each one of those parameter processes, and a set of probabilities is generated-- the probability that the data point was generated by the first process, the second process, etc, up to the final Nth process. Then each data point is classified according to the most likely process.
At this point, we have our data separated into N different classes. So, for each class of data, we can, with some relatively simple calculus, optimize the parameters of that cluster with a maximum likelihood technique. (If we tried to do this on the whole data set prior to classifying, it is usually analytically intractable.)
Then we update our parameter guesses, re-classify, update our parameters, re-classify, etc, until convergence.
What the expectation-maximization algorithm does is similar, but more general: Instead of a hard classification of data points into class 1, class 2, ... through class N, we are now using a soft classification, where each data point belongs to each process with some probability. (Obviously, the probabilities for each point need to sum to one, so there is some normalization going on.) I think we might also think of this as each process/guess having a certain amount of "explanatory power" for each of the data points.
So now, instead of optimizing the guesses with respect to points that absolutely belong to each class (ignoring the points that absolutely do not), we re-optimize the guesses in the context of those soft classifications, or those explanatory powers. And it so happens that, if you write the expressions in the correct way, what you're maximizing is a function that is an expectation in its form.
With that said, there are some caveats:
1) This sounds easy. It is not, at least to me. The literature is littered with a hodge-podge of special tricks and techniques-- using likelihood expressions instead of probability expressions, transforming to log-likelihoods, using indicator variables, putting them in basis vector form and putting them in the exponents, etc.
These are probably more helpful once you have the general idea, but they can also obfuscate the core ideas.
2) Whatever constraints you have on the problem can be tricky to incorporate into the framework. In particular, if you know the probabilities of each of the processes, you're probably in good shape. If not, you're also estimating those, and the sum of the probabilities of the processes must be one; they must live on a probability simplex. It is not always obvious how to keep those constraints intact.
3) This is a sufficiently general technique that I don't know how I would go about writing code that is general. The applications go far beyond simple clustering and extend to many situations where you are actually missing data, or where the assumption of missing data may help you. There is a fiendish ingenuity at work here, for many applications.
4) This technique is proven to converge, but the convergence is not necessarily to the global maximum; be wary.
I found the following link helpful in coming up with the interpretation above: Statistical learning slides
And the following write-up goes into great detail of some painful mathematical details: Michael Collins' write-up
I wrote the below code in Python which explains the example given in your second example paper by Do and Batzoglou.
I recommend that you read this link first for a clear explanation of how and why the 'weightA' and 'weightB' in the code below are obtained.
Disclaimer : The code does work but I am certain that it is not coded optimally. I am not a Python coder normally and have started using it two weeks ago.
import numpy as np
import math

#### E-M Coin Toss Example as given in the EM tutorial paper by Do and Batzoglou ####

def get_mn_log_likelihood(obs, probs):
    """Return the log-likelihood of obs, given the probs."""
    # Multinomial distribution log PMF
    # pmf = multinomial coeff * product of probabilities
    # ln[f(x|n, p)] = [ln(n!) - (ln(x1!)+ln(x2!)+...+ln(xk!))] + [x1*ln(p1)+x2*ln(p2)+...+xk*ln(pk)]
    multinomial_coeff_denom = 0
    prod_probs = 0
    for x in range(0, len(obs)):  # loop through state counts in each observation
        multinomial_coeff_denom = multinomial_coeff_denom + math.log(math.factorial(obs[x]))
        prod_probs = prod_probs + obs[x]*math.log(probs[x])
    multinomial_coeff = math.log(math.factorial(sum(obs))) - multinomial_coeff_denom
    likelihood = multinomial_coeff + prod_probs
    return likelihood

# 1st: Coin B, {HTTTHHTHTH}, 5H,5T
# 2nd: Coin A, {HHHHTHHHHH}, 9H,1T
# 3rd: Coin A, {HTHHHHHTHH}, 8H,2T
# 4th: Coin B, {HTHTTTHHTT}, 4H,6T
# 5th: Coin A, {THHHTHHHTH}, 7H,3T
# so, from MLE: pA(heads) = 0.80 and pB(heads)=0.45

# represent the experiments
head_counts = np.array([5,9,8,4,7])
tail_counts = 10-head_counts
# a list of plain-int pairs, so it can be indexed and passed to len()
experiments = list(zip(head_counts.tolist(), tail_counts.tolist()))

# initialise the pA(heads) and pB(heads)
pA_heads = np.zeros(100); pA_heads[0] = 0.60
pB_heads = np.zeros(100); pB_heads[0] = 0.50

# E-M begins!
delta = 0.001
j = 0 # iteration counter
improvement = float('inf')
while (improvement>delta):
    expectation_A = np.zeros((5,2), dtype=float)
    expectation_B = np.zeros((5,2), dtype=float)
    for i in range(0,len(experiments)):
        e = experiments[i] # i'th experiment
        ll_A = get_mn_log_likelihood(e,np.array([pA_heads[j],1-pA_heads[j]])) # loglikelihood of e given coin A
        ll_B = get_mn_log_likelihood(e,np.array([pB_heads[j],1-pB_heads[j]])) # loglikelihood of e given coin B
        weightA = math.exp(ll_A) / ( math.exp(ll_A) + math.exp(ll_B) ) # corresponding weight of A proportional to likelihood of A
        weightB = math.exp(ll_B) / ( math.exp(ll_A) + math.exp(ll_B) ) # corresponding weight of B proportional to likelihood of B
        expectation_A[i] = np.dot(weightA, e)
        expectation_B[i] = np.dot(weightB, e)
    # M step: expected heads divided by expected tosses for each coin
    pA_heads[j+1] = sum(expectation_A)[0] / sum(sum(expectation_A))
    pB_heads[j+1] = sum(expectation_B)[0] / sum(sum(expectation_B))
    improvement = max( abs(np.array([pA_heads[j+1],pB_heads[j+1]]) - np.array([pA_heads[j],pB_heads[j]]) ))
    j = j+1
The key to understanding this is knowing what the auxiliary variables are that make estimation trivial. I will explain the first example quickly, the second follows a similar pattern.
Augment each sequence of heads/tails with two binary variables, which indicate whether coin 1 was used or coin 2. Now our data looks like the following:
c_11 c_12
c_21 c_22
c_31 c_32
...
For each i, either c_i1=1 or c_i2=1, with the other being 0. If we knew the values these variables took in our sample, estimation of parameters would be trivial: p1 would be the proportion of heads in samples where c_i1=1, likewise for c_i2, and \lambda would be the mean of the c_i1s.
However, we don't know the values of these binary variables. So, what we basically do is guess them (in reality, take their expectation), and then update the parameters in our model assuming our guesses were correct. So the E step is to take the expectation of the c_i1s and c_i2s. The M step is to take maximum likelihood estimates of p_1, p_2 and \lambda given these cs.
Does that make a bit more sense? I can write out the updates for the E and M step if you prefer. EM then just guarantees that by following this procedure, likelihood will never decrease as iterations increase.

Minimizing a function of vectors

I need to minimize the following sum:
minimize sum for all i{(i = 1 to n) fi(v(i), v(i - 1), tangent(i))}
v and tangent are vectors.
fi takes the 3 vectors as arguments and returns a cost associated with these 3 vectors. For this function, v(i - 1) is the vector chosen in the previous iteration. tangent(i) is also known. fi calculates the cost of choosing a vector v(i), given the other two vectors v(i - 1) and tangent(i). The v(0) and v(n) vectors are known. tangent(i) values are also known in advance for all i = 0 to n.
My task is to determine all such v(i)s such that the total cost of the function values for i = 1 to n is minimized.
Can you please give me any ideas to solve this?
So far I can think of Branch and Bound or dynamic programming methods.
Thanks!
I think this is a problem in mathematical optimisation, with an objective function built up of dot products and arcCosines, subject to the constraint that your vectors should be unit vectors. You could enforce this either with Lagrange multipliers, or by including a normalising step in the arc-Cosine. If Ti is a unit vector then for Vi calculate cos^-1(Ti.Vi/sqrt(Vi.Vi)). I would have a go at using a conjugate gradient optimiser for this, or perhaps even Newton's method, with my starting point Vi = Ti.
I would hope that this would be reasonably tractable, because the Vi are only related to neighbouring Vi. You might even get somewhere by repeatedly adjusting each Vi in isolation, one by one, to optimise the objective function. It might be worth just seeing what happens if you repeatedly set Vi to be the average of Ti, Vi+1, and Vi-1, and then scaled Vi to be a unit vector again.
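A minimal sketch of that last suggestion, assuming NumPy, unit-length tangents, and the starting point Vi = Ti. The function name and the number of sweeps are illustrative. This is only the averaging heuristic: it never evaluates the cost function fi, so whether it actually lowers the total cost has to be checked separately.

```python
import numpy as np

def relax(tangents, v_first, v_last, sweeps=100):
    """Repeatedly set each interior v[i] to the normalised mean of T[i], v[i-1] and v[i+1]."""
    t = np.asarray(tangents, dtype=float)
    v = t.copy()                         # starting point: v[i] = T[i]
    v[0], v[-1] = v_first, v_last        # the end vectors are known and stay fixed
    for _ in range(sweeps):
        for i in range(1, len(v) - 1):
            mean = (t[i] + v[i - 1] + v[i + 1]) / 3.0
            norm = np.linalg.norm(mean)
            if norm > 0:                 # leave v[i] alone in the degenerate zero-mean case
                v[i] = mean / norm       # scale back to a unit vector
    return v

tangents = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
v = relax(tangents, [1.0, 0.0], [1.0, 0.0])
```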

Programming problem - Game of Blocks

maybe you would have an idea on how to solve the following problem.
John decided to buy his son Johnny some mathematical toys. One of his most favorite toy is blocks of different colors. John has decided to buy blocks of C different colors. For each color he will buy googol (10^100) blocks. All blocks of same color are of same length. But blocks of different color may vary in length.
Johnny has decided to use these blocks to make a large 1 x n block. He wonders how many ways he can do this. Two ways are considered different if there is a position where the color differs. The example shows a red block of size 5, blue block of size 3 and green block of size 3. It shows there are 12 ways of making a large block of length 11.
Each test case starts with an integer 1 ≤ C ≤ 100. The next line consists of C integers; the ith integer 1 ≤ leni ≤ 750 denotes the length of the ith color. The next line is a positive integer N ≤ 10^15.
This problem should be solved in 20 seconds for T <= 25 test cases. The answer should be calculated MOD 100000007 (prime number).
It can be reduced to a matrix exponentiation problem, which can be solved relatively efficiently in O(L^2.376*log(N)), with L = max(leni), using the Coppersmith-Winograd algorithm and fast exponentiation. But it seems that a more efficient algorithm is required, as Coppersmith-Winograd implies a large constant factor. Do you have any other ideas? It could possibly be a Number Theory or Divide and Conquer problem.
Firstly note the number of blocks of each colour you have is a complete red herring, since 10^100 > N always. So the number of blocks of each colour is practically infinite.
Now notice that at each position p (if there is a valid configuration that leaves no gaps, etc.) there must be a block of some colour c. There are len[c] ways for this block to lie so that it still covers position p.
My idea is to try all possible colours and placements at a fixed position (N/2, since it halves the range); for each case, there are b cells before the fixed block and a cells after it. If we define a function ways(i) that returns the number of ways to tile i cells (with ways(0)=1), then the number of ways to tile the cells with a fixed coloured block at a given placement is ways(b)*ways(a). Adding up all possible configurations yields the answer for ways(i).
Now I chose the fixed position to be N/2 since that halves the range, and you can halve a range at most ceil(log(N)) times. Since you are moving a block around N/2, you will have to calculate from N/2-750 to N/2+750, where 750 is the maximum length a block can have. So you will have to calculate about 750*ceil(log(N)) (a bit more because of the variance) lengths to get the final answer.
So in order to get good performance you have to throw in memoisation, since this is inherently a recursive algorithm.
So using Python (since I was lazy and didn't want to write a big number class):
T = int(input())
for case in range(T):
    # read in the data
    C = int(input())
    lengths = list(map(int, input().split()))
    minlength = min(lengths)
    n = int(input())
    # set up memoisation; note all lengths less than the minimum length are
    # set to 0 as the algorithm needs this
    memoise = {}
    memoise[0] = 1
    for length in range(1, minlength):
        memoise[length] = 0

    def solve(n):
        global memoise
        if n in memoise:
            return memoise[n]
        ans = 0
        for i in range(C):
            if lengths[i] > n:
                continue
            if lengths[i] == n:
                ans += 1
                ans %= 100000007
                continue
            # slide block i across the fixed cell near n//2
            for j in range(0, lengths[i]):
                b = n//2 - lengths[i] + j   # cells before the block
                a = n - (n//2 + j)          # cells after the block
                if b < 0 or a < 0:
                    continue
                ans += solve(b)*solve(a)
                ans %= 100000007
        memoise[n] = ans
        return memoise[n]

    solve(n)
    print("Case %d: %d" % (case+1, memoise[n]))
Note I haven't exhaustively tested this, but I'm quite sure it will meet the 20 second time limit if you translate this algorithm to C++ or some such.
EDIT: Running a test with N = 10^15 and a block of length 750, I get that memoise contains about 60000 elements, which means the non-lookup part of solve(n) is called about the same number of times.
A word of caution: in the case C=2, len1=1, len2=2, the answer will be the N'th Fibonacci number, and the Fibonacci numbers grow (approximately) exponentially with a growth factor of the golden ratio, phi ~ 1.61803399. For the huge value N=10^15, the answer will be about phi^(10^15), an enormous number. The answer will have storage requirements on the order of (ln(phi^(10^15))/ln(2)) / (8 * 2^40) ~ 79 terabytes. Since you can't even access 79 terabytes in 20 seconds, it's unlikely you can meet the speed requirements in this special case.
Your best hope occurs when C is not too large, and len_i is large for all i. In such cases, the answer will still grow exponentially with N, but the growth factor may be much smaller.
I recommend that you first construct the integer matrix M which will compute the (i+1, ..., i+k) terms in your sequence based on the (i, ..., i+k-1) terms, where k is the largest block length. (Only the last row of this matrix is interesting; the other rows just shift the terms along.) Compute the first k entries "by hand", then calculate M^(10^15) using the repeated squaring trick, and apply it to terms (0, ..., k-1).
The (integer) entries of the matrix will grow exponentially, perhaps too fast to handle. If this is the case, do the very same calculation, but modulo p, for several moderate-sized prime numbers p. This will allow you to obtain your answer modulo p, for various p, without using a matrix of bigints. After using enough primes so that you know their product is larger than your answer, you can use the so-called "Chinese remainder theorem" to recover your answer from your mod-p answers.
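To make the matrix idea concrete, here is a rough Python 3 sketch for a single modulus. It assumes the sequence is f(n) = sum of f(n - len_i) with f(0) = 1, and the function names are my own. It skips the Chinese remainder step entirely, and the naive O(k^3) multiplication is only reasonable for small k; a 750x750 matrix in pure Python would be far too slow.

```python
def mat_mul(a, b, mod):
    """Multiply two square matrices modulo mod."""
    size = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(size)) % mod
             for j in range(size)]
            for i in range(size)]


def mat_pow(m, exponent, mod):
    """Raise a square matrix to a non-negative power by repeated squaring."""
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    while exponent:
        if exponent & 1:
            result = mat_mul(result, m, mod)
        m = mat_mul(m, m, mod)
        exponent >>= 1
    return result


def count_fillings_matrix(n, lengths, mod):
    """f(n) mod `mod`, where f(n) = sum of f(n - length) and f(0) = 1."""
    k = max(lengths)
    # First k terms "by hand", using the plain recurrence.
    first = [1] + [0] * (k - 1)
    for i in range(1, k):
        first[i] = sum(first[i - length] for length in lengths if length <= i) % mod
    if n < k:
        return first[n] % mod
    # M maps (f(i), ..., f(i+k-1)) to (f(i+1), ..., f(i+k)).
    m = [[0] * k for _ in range(k)]
    for row in range(k - 1):
        m[row][row + 1] = 1          # shift rows
    for length in lengths:
        m[k - 1][k - length] += 1    # last row holds the recurrence
    power = mat_pow(m, n, mod)
    # The first row of M^n applied to the initial terms yields f(n).
    return sum(power[0][col] * first[col] for col in range(k)) % mod
```

For lengths [1, 2] the matrix is the usual Fibonacci matrix, so count_fillings_matrix(10, [1, 2], 100000007) comes out as 89.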
I'd like to build on the earlier solution by JPvdMerwe with some improvements. In his answer, JPvdMerwe uses a Dynamic Programming / memoisation approach, which I agree is the way to go on this problem. Dividing the problem recursively into two smaller problems and remembering previously computed results is quite efficient.
I'd like to suggest several improvements that would speed things up even further:
Instead of going over all the ways the block in the middle can be positioned, you only need to go over the first half, and multiply the solution by 2. This is because the second half of the cases are symmetrical. For odd-length blocks you would still need to take the centered position as a separate case.
In general, iterative implementations can be significantly faster than recursive ones. This is because a recursive implementation incurs bookkeeping overhead for each function call. It can be a challenge to convert a solution to its iterative cousin, but it is usually possible. The JPvdMerwe solution can be made iterative by using a stack to store intermediate values.
Modulo operations are expensive, as are multiplications to a lesser extent. The number of multiplications and modulos can be decreased by approximately a factor C=100 by swapping the color loop with the position loop. This allows you to add the return values of several calls to solve() before doing a multiplication and modulo.
A good way to test the performance of a solution is with a pathological case. The following could be especially daunting: length 10^15, C=100, prime block sizes.
Hope this helps.
In the above answer,
    ans += 1
    ans %= 100000007
could be much faster without the general modulo:
    ans += 1
    if ans == 100000007: ans = 0
Please see TopCoder thread for a solution. No one was close enough to find the answer in this thread.

What's the algorithm behind minesweeper generation

Well, I have been through many sites that teach how to solve it, but I was wondering how to create it. I am not much interested in the coding aspects, but want to know more about the algorithms behind it. For example, when the grid is generated with 10 mines or so, I would use any random function to distribute them across the grid, but then how do I set the numbers associated with them and decide which box is to be opened? I couldn't frame any generic algorithm for how I would go about doing that.
Perhaps something along the lines of:
grid = [n,m]   // initialize all cells to 0
for k = 1 to number_of_mines
   get random mine_x and mine_y where grid[mine_x, mine_y] is not a mine
   for x = -1 to 1
      for y = -1 to 1
         if x = 0 and y = 0 then
            // negative value = mine; it stays negative because a mine is
            // incremented at most number_of_mines - 1 times by later mines
            grid[mine_x, mine_y] = -number_of_mines
         else if (mine_x + x, mine_y + y) is inside the grid then
            increment grid[mine_x + x, mine_y + y] by 1
That's pretty much it...
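As a rough illustration, here is how that could look in Python 3. This is a sketch, not the code behind any demo: generate_board and the MINE marker are names I made up, mines are marked with -1 rather than -number_of_mines, and the "get a random cell that is not a mine" retry step is swapped for random.sample over cell indices, so asking for more mines than cells raises an error instead of looping forever.

```python
import random

MINE = -1  # marker for a mine cell (a choice made for this sketch)


def generate_board(rows, cols, mine_count, rng=random):
    """Return a rows x cols grid: MINE for mines, else the adjacent-mine count."""
    if not 0 <= mine_count <= rows * cols:
        raise ValueError("mine_count must fit on the board")
    grid = [[0] * cols for _ in range(rows)]
    # Sample distinct cell indices, so no retry loop is needed.
    for index in rng.sample(range(rows * cols), mine_count):
        r, c = divmod(index, cols)
        grid[r][c] = MINE
    # Every mine increments the counters of its in-bounds, non-mine neighbours.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != MINE:
                continue
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != MINE:
                        grid[nr][nc] += 1
    return grid


if __name__ == "__main__":
    for row in generate_board(5, 8, 6):
        print(" ".join("*" if cell == MINE else str(cell) for cell in row))
```

Passing a seeded random.Random instance as rng makes a board reproducible, which is handy when testing.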
** EDIT **
Because this algorithm could lead to a board with some mines grouped too closely together, or worse, very dispersed (thus boring to solve), you can add extra validation when generating mine_x and mine_y. For example, you could ensure that at least 3 neighboring cells are not mines, or perhaps even limit the number of mines that are too far from each other, etc.
** UPDATE **
I've taken the liberty of playing a little with JS bin here and came up with a functional Minesweeper game demo. This is simply to demonstrate the algorithm described in this answer. I did not optimize the randomness of the generated mine positions, therefore some games could be impossible or too easy. Also, there is no validation of how many mines there are in the grid, so you can actually create a 2 by 2 grid with 1000 mines... but that will only lead to an infinite loop :) Enjoy!
You just seed the mines and after that, you traverse every cell and count the neighbouring mines.
Or you set every counter to 0 and, with each seeded mine, you increment the counters of all neighbouring cells.
If you want to place m mines on N squares, and you have access to a random number generator, you just walk through the squares one by one and for each square compute (# mines remaining)/(# squares remaining), then place a mine if a uniform random number between 0 and 1 is equal to or below that value.
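A small Python 3 sketch of that walk, in case it helps. The function name place_mines is made up, and squares are just numbered 0 to N-1. One deliberate detail: random() returns a value in [0, 1), so the sketch uses a strict "<" comparison, which guarantees that no mine is placed once none remain and that every square gets one when mines and squares run out together.

```python
import random


def place_mines(total_squares, mine_count, rng=random):
    """Pick exactly mine_count of the squares 0..total_squares-1, uniformly."""
    if not 0 <= mine_count <= total_squares:
        raise ValueError("mine_count must be between 0 and total_squares")
    mines = []
    remaining_mines = mine_count
    for square in range(total_squares):
        remaining_squares = total_squares - square
        # Place a mine with probability (mines remaining) / (squares remaining).
        if rng.random() < remaining_mines / remaining_squares:
            mines.append(square)
            remaining_mines -= 1
    return mines


if __name__ == "__main__":
    print(place_mines(25, 5))
```

To map a square number back to grid coordinates on a board with a given number of columns, divmod(square, cols) gives (row, column).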
Now, if you want to label every square with the number of adjacent mines, you can just do it directly:
count(x,y) = sum(
   for i = -1 to 1
      for j = -1 to 1
         1 if (x+i,y+j) contains a mine
         0 otherwise
)
or if you prefer you can start off with an array of zeros and increment each by one in the 3x3 square that has a mine in the center. (It doesn't hurt to number the squares with mines.)
This produces a purely random and correctly annotated minesweeper game. Some random games may not be fun games, however; selecting random-but-fun games is a much more challenging task.
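The direct count above could be sketched in Python 3 like this. It assumes the mines are kept as a set of (row, column) pairs, which is my choice and not something from the answer. Like the pseudocode, it also counts the square itself, so squares holding a mine get a number too.

```python
def count_adjacent(mines, x, y):
    """Number of mines in the 3x3 square centred on (x, y), centre included."""
    return sum(
        1
        for i in (-1, 0, 1)
        for j in (-1, 0, 1)
        if (x + i, y + j) in mines
    )


def annotate(mines, rows, cols):
    """Label every square of a rows x cols board with its 3x3 mine count."""
    return [[count_adjacent(mines, x, y) for y in range(cols)]
            for x in range(rows)]


if __name__ == "__main__":
    for row in annotate({(0, 0), (1, 1)}, 3, 3):
        print(row)
```

Because off-board coordinates are never in the set, no explicit bounds check is needed here.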
