How to solve this programming problem using Clojure in a functional manner?

I have a programming problem that I know how I might tackle in Ruby, but don’t know the best way in Clojure (and figure there may be an elegant way to do this taking a functional mindset).
The problem can be simplified thus:
I have a 3 litre bucket, filled with water. There is a hole in the bottom of the bucket, that is leaking 10 mL/s (i.e. it will take 300 seconds / 5 minutes to empty). I have a glass of water with a 100 mL capacity that I can use to pour in new water to the bucket.
I can only pour the entire contents of the glass into the bucket, no partial pours. The pour occurs instantaneously.
Project out a set of time steps where I can pour glasses of water into the bucket.
I know there is a pretty obvious way to do this using algebra, but the actual problem involves a "leak rate" that changes with time and "new glass volumes" that don't always equal 100 mL, so it isn't simple to solve in closed form.
The Ruby way to solve this would be to keep track of the bucket volume using a "Bucket" instance, and test at numerous time steps to see if the bucket has 100 mL of room. If so, dump the glass and add to the water in the bucket instance. Continue the time steps, watching the bucket volume.
I hope what I have described is clear.

One of the most important concepts of functional programming is that any mutation without external side effects can be offloaded onto immutable function parameter bindings.
Here the time of the simulation and the level of the bucket are the primary function parameters, and they are updated for each recursive call. The other parameters are modeled as functions of time. We could picture each of these functions actually being a lazy sequence based on the time deltas, just as the fill-times function itself is. Or piecewise linear equations modeled with lookups in a vector, or what have you.
user>
(defn fill-times
  [time level
   {:keys [sample-rate calc-bucket-size calc-leak-rate calc-glass-size]
    :as params}]
  (loop [t time, l level]
    (let [input-capacity  (calc-glass-size t)    ; glass volume at time t
          bucket-capacity (calc-bucket-size t)   ; bucket volume at time t
          has-room        (> (- bucket-capacity l) input-capacity)
          leak-delta      (* (calc-leak-rate t) sample-rate -1)]
      (if has-room
        ;; pour: record this time, then lazily continue from the new level
        (lazy-seq (cons t (fill-times t (+ l input-capacity) params)))
        ;; no room: advance one sample and let the bucket leak
        (recur (+ t sample-rate) (+ l leak-delta))))))
#'user/fill-times
user> (take 50 (fill-times 0 0 {:sample-rate 1
                                :calc-bucket-size (constantly 3000)
                                :calc-leak-rate (constantly 10)
                                :calc-glass-size (constantly 100)}))
(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 11 21 31 41 51 61 71 81 91 101 111 121 131 141 151 161 171 181 191 201)
When there is sufficient room, the glass is dumped (and of course is filled instantly) and we make a note of the time, calling the function again to get the following times. When there is not room, we recur, updating the time and the bucket level. The result is a (hypothetically infinite) lazy sequence of times at which the glass can be emptied (assuming that the glass is filled instantly, and dumped instantly).

I don't have a lot of experience with Clojure, but one way to think of it is a lazy seq of state values at time steps. Lazily compute each state value from the previous state value.
This is a recurrence equation, also known as a difference equation. It computes new values as a function of previous values without overwriting them.
A state value could be just the bucket level, or a tuple holding (time, bucket_level, pour_in_count).
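A minimal sketch of that idea (the step function, state keys, and the hard-coded 3000/100/10 constants are my assumptions, mirroring the example; at most one pour per step, for simplicity):

(defn step [{:keys [t level pours]}]
  (let [level (max 0 (- level 10))]   ; leak 10 mL this second
    (if (>= (- 3000 level) 100)       ; room for a full glass?
      {:t (inc t) :level (+ level 100) :pours (inc pours)}
      {:t (inc t) :level level :pours pours})))

(def states (iterate step {:t 0 :level 0 :pours 0}))

;; recover the pour times by watching the pour counter increase:
(->> (partition 2 1 states)
     (filter (fn [[a b]] (> (:pours b) (:pours a))))
     (map (fn [[a _]] (:t a)))
     (take 10))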

Re-sort a vector after a small number of elements have been modified

If we have a vector of size N that was previously sorted, and replace up to M elements with arbitrary values (where M is much smaller than N), is there an easy way to re-sort it at lower cost (i.e. with a sorting network of reduced depth) than a full sort?
For example if N=10 and M=2 the input might be
10 20 30 40 999 60 70 80 90 -1
Note: the indices of the modified elements are not known (until we compare them with the surrounding elements.)
Here is an example where I know the solution because the input size is small and I was able to find it with a brute-force search:
if N = 5 and M is 1, these would be valid inputs:
0 0 0 0 0   0 0 1 0 0   0 1 0 0 0   0 1 1 1 0   1 0 0 1 1   1 1 1 1 0
0 0 0 0 1   0 0 1 0 1   0 1 0 0 1   0 1 1 1 1   1 0 1 1 1   1 1 1 1 1
0 0 0 1 0   0 0 1 1 0   0 1 0 1 1   1 0 0 0 0   1 1 0 1 1
0 0 0 1 1   0 0 1 1 1   0 1 1 0 1   1 0 0 0 1   1 1 1 0 1
For example the input may be 0 1 1 0 1 if the previously sorted vector was 0 1 1 1 1 and the 4th element was modified, but there is no way to form 0 1 0 1 0 as a valid input, because it differs in at least 2 elements from any sorted vector.
This would be a valid sorting network for re-sorting these inputs:
>--*---*-----*-------->
   |   |     |
>--*---|-----|-*---*-->
       |     | |   |
>--*---|-*---*-|---*-->
   |   | |     |
>--*---*-|-----*---*-->
         |         |
>--------*---------*-->
We do not care that this network fails to sort some invalid inputs (e.g. 0 1 0 1 0.)
And this network has depth 4, a saving of 1 compared with the general case (a depth of 5 is generally necessary to sort a 5-element vector).
Unfortunately the brute-force approach is not feasible for larger input sizes.
Is there a known method for constructing a network to re-sort a larger vector?
My N values will be on the order of a few hundred, with M not much more than √N.
Ok, I'm posting this as an answer since the comment restriction on length drives me nuts :)
You should try this out:
implement a simple sequential sort working on local memory (insertion sort or something similar - a reference sketch follows at the end of this answer). If you don't know how, I can help with that.
have only a single work-item perform the sorting on the chunk of N elements
calculate the maximum size of local memory per work-group (call clGetDeviceInfo with CL_DEVICE_LOCAL_MEM_SIZE) and derive the maximum number of work-items per work-group,
because with this approach your number of work-items will most likely be limited by the amount of local memory.
This will probably work rather well I suspect, because:
a simple sort may be perfectly fine, especially since the array is already sorted to a large degree
parallelizing for such a small number of items is not worth the trouble (using local memory however is!)
since you're processing billions of such small arrays, you will achieve a great occupancy even if only single work-items process such arrays
Let me know if you have problems with my ideas.
EDIT 1:
I just realized I used a technique that may be confusing to others:
My proposal for using local memory is not for synchronization or using multiple work items for a single input vector/array. I simply use it to get a low read/write memory latency. Since we use rather large chunks of memory I fear that using private memory may cause swapping to slow global memory without us realizing it. This also means you have to allocate local memory for each work-item. Each work-item will access its own chunk of local memory and use it for sorting (exclusively).
I'm not sure how good this idea is, but I've read that using too much private memory may cause swapping to global memory and the only way to notice is by looking at the performance (not sure if I'm right about this).
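For reference, the sequential insertion sort from the first step might look like this - sketched in Clojure only to pin down the algorithm; the real version would be a plain in-place loop over local memory inside an OpenCL kernel:

(defn insert-sorted [v x]            ; insert x into the sorted vector v
  (let [[smaller larger] (split-with #(<= % x) v)]
    (vec (concat smaller [x] larger))))

(defn insertion-sort [coll]
  (reduce insert-sorted [] coll))

(insertion-sort [10 20 30 40 999 60 70 80 90 -1])
;=> [-1 10 20 30 40 60 70 80 90 999]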
Here is an algorithm which should yield very good sorting networks. Probably not the absolute best network for all input sizes, but hopefully good enough for practical purposes.
store (or have available) pre-computed networks for n < 16
sort the largest 2^k elements with an optimal network, e.g. a bitonic network for the largest power of 2 less than or equal to n (a sketch of this step follows below).
for the remaining elements, repeat #2 until m < 16, where m is the number of unsorted elements
use a known optimal network from #1 to sort any remaining elements
merge sort the smallest and second-smallest sub-lists using a merge sorting network
repeat #5 until only one sorted list remains
All of these steps can be done "on paper", with the comparisons stored into a master network instead of acting on the data.
It is worth pointing out that the (bitonic) networks from #2 can be run in parallel, and the smaller ones will finish first. This is good, because as they finish, the networks from #5-6 can begin to execute.
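As a sketch of step 2, here is one way to generate the comparator list of a bitonic network for n a power of two (the representation is mine: a pair [lo hi] means the smaller value should end up at index lo; building the list without touching data is exactly the "store the comparisons into a master network" part):

(defn bitonic-network [n]
  (for [k (take-while #(<= % n) (iterate #(* 2 %) 2))
        j (take-while pos? (iterate #(quot % 2) (quot k 2)))
        i (range n)
        :let [l (bit-xor i j)]
        :when (> l i)]
    (if (zero? (bit-and i k)) [i l] [l i])))

;; run a network over a vector, e.g. to test it:
(defn apply-network [net v]
  (reduce (fn [v [lo hi]]
            (if (> (v lo) (v hi))
              (assoc v lo (v hi) hi (v lo))
              v))
          (vec v) net))

(apply-network (bitonic-network 8) [5 3 8 1 9 2 7 4])
;=> [1 2 3 4 5 7 8 9]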

Minimum Tile Ordering

Minimizing Tile Re-ordering Problem:
Suppose I have the following symmetric 9x9 matrix of N^2 interactions between N particles:
(1,2) (2,9) (4,5) (4,6) (5,8) (7,8),
These are symmetric interactions, so they imply that there also exist:
(2,1) (9,2) (5,4) (6,4) (8,5) (8,7),
In my problem, suppose they are arranged in matrix form, where only the upper triangle is shown:
t          0     1     2   (tiles)
#        1 2 3 4 5 6 7 8 9
     1 [ 0 1 0 0 0 0 0 0 0 ]
0    2 [ x 0 0 0 0 0 0 0 1 ]
     3 [ x x 0 0 0 0 0 0 0 ]
     4 [ x x x 0 1 1 0 0 0 ]
1    5 [ x x x x 0 0 0 1 0 ]
     6 [ x x x x x 0 0 0 0 ]
     7 [ x x x x x x 0 1 0 ]
2    8 [ x x x x x x x 0 0 ]
     9 [ x x x x x x x x 0 ]   (x's denote symmetric pairs)
I have some operation that's computed in 3x3 tiles, and any 3x3 that contains at least a single 1 must be computed entirely. The above example requires at least 5 tiles: (0,0), (0,2), (1,1), (1,2), (2,2)
However, suppose I swap the 3rd and 9th columns (along with the corresponding rows, since the matrix is symmetric) by permuting my input:
t          0     1     2   (tiles)
#        1 2 9 4 5 6 7 8 3
     1 [ 0 1 0 0 0 0 0 0 0 ]
0    2 [ x 0 1 0 0 0 0 0 0 ]
     9 [ x x 0 0 0 0 0 0 0 ]
     4 [ x x x 0 1 1 0 0 0 ]
1    5 [ x x x x 0 0 0 1 0 ]
     6 [ x x x x x 0 0 0 0 ]
     7 [ x x x x x x 0 1 0 ]
2    8 [ x x x x x x x 0 0 ]
     3 [ x x x x x x x x 0 ]   (x's denote symmetric pairs)
Now I only need to compute 4 tiles: (0,0), (1,1), (1,2), (2,2).
The General Problem:
Given an NxN sparse matrix, find a re-ordering that minimizes the number of TxT tiles that must be computed. Suppose that N is a multiple of T. An optimal, but infeasible, solution can be found by trying out the N! permutations of the input ordering.
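For experimenting with heuristics, it helps to have the cost function explicit. A sketch (the representation is mine: interactions as 1-based pairs, an ordering as a map from particle to 0-based index):

(defn tile-count [interactions perm T]
  (->> interactions
       (mapcat (fn [[i j]] [[i j] [j i]]))              ; symmetric pairs
       (map (fn [[i j]] [(quot (perm i) T) (quot (perm j) T)]))
       (filter (fn [[ti tj]] (<= ti tj)))               ; upper triangle only
       distinct
       count))

;; the example above: 5 tiles with the identity ordering, 4 after swapping 3 and 9
(def pairs [[1 2] [2 9] [4 5] [4 6] [5 8] [7 8]])
(def ident (zipmap (range 1 10) (range 9)))
(tile-count pairs ident 3)                   ;=> 5
(tile-count pairs (assoc ident 3 8, 9 2) 3)  ;=> 4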
For heuristics, I've tried bandwidth minimization routines (such as Reverse Cuthill-McKee) and Tim Davis' AMD routines, so far to no avail. I don't think diagonalization is the right approach here.
Here's a sample starting matrix:
http://proteneer.com/misc/out2.dat
(Figures: the sample matrix reordered with a Hilbert curve, RCM, and a Morton curve.)
There are several well-known options you can try (some of them you have already, but still):
(Reverse) Cuthill-McKee - reduces the matrix bandwidth, keeping the entries close to the diagonal.
Approximate Minimum Degree - a light-weight fill-reducing reordering.
fill-reducing reordering for sparse LU/LL' decomposition (METIS, SCOTCH) - quite computationally heavy.
space filling curve reordering (something along these lines)
quad-trees for 2D or oct-trees for 3D problems - you assign the particles to quads/octants and later number them according to the quad/octant id, similar to space filling curves in a sense.
A Self-Avoiding Walk is used on structured grids to traverse the grid points in such an order that each point is visited only once
a lot of research in blocking of the sparse matrix entries has been done in the context of Sparse Matrix-Vector multiplication. Many of the researchers have tried to find good reordering for that purpose (I do not have the perfect overview on that subject, but have a look at e.g. this paper)
All of those tend to find structure in your matrix and in some sense group the non-zero entries. Since you say you deal with particles, it means that your connectivity graph is in some sense 'local' because of spatial locality of the particle interactions. In this case these methods should be of good use.
Of course, they do not provide the exact solution to the problem :) But they are commonly used in exactly such cases because they yield very good reorderings in practice. I wonder what you mean by saying the methods you tried failed? Do you expect to find the optimum solution? Surely, they improve the situation compared to a random matrix ordering.
Edit: Let me briefly go through a few pictures. I have created a 3D structured cartesian mesh composed of 20-node brick elements. I matched the size of the mesh so that it is similar to yours (~1000 nodes). Also, the number of non-zero entries per row is not too far off (51-81 in my case, 59-81 in your case; both however have very different distributions). The pictures below show RCM and METIS reorderings for a non-periodic mesh (left), and for a mesh with complete x-y-z periodicity (right):
The next picture shows the same matrix reordered using METIS, a fill-reducing reordering.
The difference is striking - the bad impact of periodicity is clear. Now here is your matrix reordered with RCM and METIS:
WOW. You have a problem :) First of all, I think there is something wrong with your RCM, because mine looks different ;) Also, I am certain that you cannot conclude anything general and meaningful about any reordering based on this particular matrix. This is because your system size is very small (less than roughly 10x10x10 points), and you seem to have relatively long-range interactions between your particles. Hence, introducing periodicity into such a small system has a much stronger bad effect on reordering than is seen in my structured case.
I would start the search for a good reordering by turning off periodicity. Once you have a reordering that satisfies you, introduce periodic interactions. In the system you showed there is almost nothing but periodicity: because it is very small, and because your interactions are fairly long-range, at least compared to my mesh. In much larger systems periodicity will have a smaller effect on the center of the model.
Smaller, but still negative. Maybe you could change your approach to periodicity? Instead of including periodic connectivities explicitly in the matrix, construct and reorder a matrix without those and introduce explicit equations binding the periodic particles together, e.g.:
V_particle1 = V_particle100
or in other words
V_particle1 - V_particle100 = 0
and add those equations at the end of your matrix. This method is called Lagrange multipliers. Here is how it looks for my system:
You keep the reordering of the non-periodic system and the periodic connectivities are localized in a block at the end of the matrix. Of course, you can use it for any other reorderings.
The next idea is to start with a reordered non-periodic system and explicitly eliminate matrix rows for the periodic nodes by adding them into the rows they are mapped onto. Of course, you should also eliminate the columns.
Whether you can use these depends on what you do with your matrix. Lagrange multipliers, for example, introduce zeros on the diagonal - not all solvers like that.
Anyway, this is very interesting research. I think that the specifics of your problem (as I understand it - irregularly placed particles in 3D, with fairly long-range interactions) make it very difficult to group the matrix entries. But I am very curious what you end up doing. Please let me know!
You can look at a data structure like a kd-tree, R-tree, quadtree or a space filling curve. A space filling curve can especially help, because it reduces the dimension and also reorders the tiles, and thus can add some new information to the grid. With a 9x9 grid it's probably good to look into Peano curves. The z-order (Morton) curve is better for power-of-2 grids.
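A minimal sketch of the z-order idea: interleave the bits of the coordinates so that sorting by the resulting key tends to group spatially nearby particles (my assumptions: 2D and 16 bits per coordinate, for brevity):

(defn morton-key [x y]
  (reduce (fn [acc b]
            (bit-or acc
                    (bit-shift-left (bit-and (bit-shift-right x b) 1) (* 2 b))
                    (bit-shift-left (bit-and (bit-shift-right y b) 1) (inc (* 2 b)))))
          0
          (range 16)))

;; reorder particles along the curve:
(sort-by (fn [[x y]] (morton-key x y)) [[3 1] [0 0] [1 2] [2 2]])
;=> ([0 0] [3 1] [1 2] [2 2])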

Dominoes - Competition

http://ecoocs.org/contests/ecoo_2007.pdf
I'm studying for the upcoming ecoo regionals for my area and I'm stumped on this one question. I really have no idea where to start.
It is in the "regional" "west and central" section "problem 3 - domino chains".
I keep going over the problem manually and I keep thinking of breadth first search or depth first search, but the dominoes having two sides is seriously throwing my thinking off.
Does anyone have any advice, or maybe some resources that might set me in the right direction?
It looks like this problem calls for a recursive backtracking approach. Keep a 7 by 7 symmetric matrix showing which numbers are attached to which. For example, given tiles 00 12 63 51 you would have the following matrix:
  0 1 2 3 4 5 6
  -------------
0|1 0 0 0 0 0 0
1|0 0 1 0 0 1 0
2|0 1 0 0 0 0 0
3|0 0 0 0 0 0 1
4|0 0 0 0 0 0 0
5|0 1 0 0 0 0 0
6|0 0 0 1 0 0 0
When you use up a tile by placing it in a potential chain, delete it from the matrix, and put it back in the matrix after you unplace the tile by backtracking. For example, if the chain currently contains 51 12, the matrix looks like this:
  0 1 2 3 4 5 6
  -------------
0|1 0 0 0 0 0 0
1|0 0 0 0 0 0 0
2|0 0 0 0 0 0 0
3|0 0 0 0 0 0 1
4|0 0 0 0 0 0 0
5|0 0 0 0 0 0 0
6|0 0 0 1 0 0 0
Given that the chain currently ends in 2, you would look along row 2 for any numbers that 2 can connect to. Not finding any, you would mark down 51 12 as a potential longest chain, and then backtrack to the state where the chain contained only 51.
Maintain a set of all the longest chains you have found, and check a new chain for the existence of itself or its reverse in the set before inserting it.
If you find a longer chain, start a new set. Once you have exhaustively searched through all the possible chains, the size of your set should be the number of variations that are of the longest length.
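A sketch of that search (the representation is mine: a tile is a vector [a b], a chain a vector of oriented tiles; instead of the adjacency matrix, the unused tiles travel in a set, which is "deleted from" on placement and restored on backtrack simply by returning from the recursive call):

(defn maximal-chains [chain tiles]
  (let [end   (second (peek chain))          ; open end, nil for an empty chain
        moves (for [t tiles
                    oriented [[(t 0) (t 1)] [(t 1) (t 0)]]
                    :when (or (nil? end) (= (first oriented) end))]
                [oriented t])]
    (if (empty? moves)
      [chain]                                ; dead end: a maximal chain
      (mapcat (fn [[oriented t]]
                (maximal-chains (conj chain oriented) (disj tiles t)))
              moves))))

(defn longest-chains [tiles]
  (let [all (maximal-chains [] (set tiles))
        n   (apply max (map count all))]
    ;; reversed duplicates would still need collapsing, as described above
    (distinct (filter #(= n (count %)) all))))

(longest-chains [[4 5] [3 6] [4 6] [5 6]])
;=> the length-4 chains, e.g. [[3 6] [6 4] [4 5] [5 6]]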
Personally, when working out these sorts of problems, a great way to solve them is to do them in one case (taking into account that you will increase complexity later) and then increase the complexity. That way you aren't overwhelmed by the complexity and the almost endless "whatifs" that a problem like this can cause.
Also, in the programming competitions I've participated in, 60-70% of the credit was awarded for a solution that got the basic problem correct, and the final percentages depended on whether you handled certain cases correctly. The case that sticks in my mind is a mapping problem with a variation of the travelling salesman: if they supplied a graph with a loop, did my solution loop endlessly, or did it do something about it?
Thus, with this approach, what I would do is try to solve the problem as directly as possible: take the input as stated by the documentation and just generate the longest chain with the pieces you have. Then increment the complexity by allowing pieces to be rotated etc.
While this is a personal approach to the problem, it has served me well in the past. I hope it serves you well also.
This is a dynamic programming problem, so you can solve it using dynamic programming techniques.
So, if we have these pieces:
45 36 46 56
What is the longest chain that can be made from 4 bones?
Obviously, the longest chain that can be made from 3 bones and 1 more bone.
What is the longest chain that can be made from 3 bones?
Obviously, the longest chain that can be made from 2 bones and 1 more bone.
What is the longest chain that can be made from 2 bones?
Obviously, the longest chain that can be made from 1 bone and 1 more bone.
What is the longest chain that can be made from 1 bone?
Obviously, 1 bone is the longest possible chain.
I think you can see the pattern here: we need to use recursion.
So if we have:
45 36 46 56
Suppose we have a function longest_chain(set_of_pieces). Then we need to check:
longest_chain({36 46 56}) (+ 1 if we can append 45 or 54 else discard this chain)
longest_chain({45 46 56}) (+ 1 if we can append 36 or 63 else discard this chain)
longest_chain({45 36 56}) (+ 1 if we can append 46 or 64 else discard this chain)
longest_chain({45 36 46}) (+ 1 if we can append 56 or 65 else discard this chain)
what is longest_chain({36 46 56})?
longest_chain({46 56}) (+ 1 if we can append 36 or 63 else discard this chain)
longest_chain({36 56}) (+ 1 if we can append 46 or 64 else discard this chain)
longest_chain({36 46}) (+ 1 if we can append 56 or 65 else discard this chain)
what is longest_chain({46 56})?
longest_chain({46}) (+ 1 if we can append 56 or 65 else discard this chain)
longest_chain({56}) (+ 1 if we can append 46 or 64 else discard this chain)
what is longest_chain({46})? Two possibilities: {46} {64}
Can we append 56 or 65 to any of these? Yes, we can make this chain {46, 65} and we discard {64}.
Do the same with longest_chain({56}) and we get: {56, 64}.
Therefore, we now know that longest_chain({46 56}) are {46, 65}, {56, 64}
Continue doing this until you get all answers.
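One way to express that recurrence in code, top-down with memoization so each (pieces, open end) subproblem is solved once (the representation is mine; this version returns the length rather than the chains themselves):

(def longest-length
  (memoize
   (fn [pieces end]       ; end = number the chain must start with, nil = any
     (let [options (for [p pieces
                         [a b] [[(p 0) (p 1)] [(p 1) (p 0)]]
                         :when (or (nil? end) (= a end))]
                     (inc (longest-length (disj pieces p) b)))]
       (if (empty? options) 0 (apply max options))))))

(longest-length #{[4 5] [3 6] [4 6] [5 6]} nil)
;=> 4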
Hope this helps.
Here's how I'd start.
Label the n dominoes D1..Dn.
Let C(m) be the set of chains formed using the subset of dominoes D1..Dm (and C(0) = {}).
C(m+1) is formed by trying to insert D(m+1) into all possible places in chains in C(m), plus using D(m+1) where possible to concatenate pairs of disjoint chains from C(m), plus the singleton chain consisting of D(m+1) by itself.
You can probably do some optimisation (e.g., ordering the dominoes), but I'd be inclined to just try this as is before getting too clever.

Hungarian algorithm - assign systematically

I'm implementing the Hungarian algorithm in a project. I managed to get it working up to what is called step 4 on Wikipedia. I do manage to have the computer create enough zeroes so that the minimal number of covering lines equals the number of rows/columns, but I'm stuck when it comes to actually assigning the right agent to the right job. I can see how I might do the assignment by hand, but that's more trial and error - i.e., I do not see the systematic method that is essential for the computer to get it to work.
Say we have this matrix in the end:
     a   b   c   d
0   30   0   0   0
1    0  35   5   0
2   60   5   0   0
3    0  50  35  40
The zeroes we have to take to have each agent assigned to a job are (a,3), (b,0), (c,2) and (d,1). What is the system behind choosing these? My code now picks (b,0) first, and ignores row 0 and column b from then on. However, it then picks (a,1), but with this value picked there is no assignment possible for row 3 anymore.
Any hints are appreciated.
Well, I did manage to solve it in the end. The method I used was to check whether there is any column/row with only one zero. In such a case, that agent must take that job, and that column and row have to be ignored from then on. Then repeat, so as to get a job for every agent.
In my example, (b, 0) would be the first choice. After that we have:
     a   b   c   d
0    x   x   x   x
1    0   x   5   0
2   60   x   0   0
3    0   x  35  40
Using the method again, we can do (a, 3), etc. I'm not sure whether it has been proven that this is always correct, but it seems it is.
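A sketch of that rule (the zero positions as [row col] pairs; the arbitrary fall-back at the end is exactly where the method is on shaky ground - a full implementation would backtrack or use augmenting paths there):

(defn assign-zeros [zeros]
  (loop [zs zeros, chosen []]
    (if (empty? zs)
      chosen
      (let [row-of (fn [[r _]] r)
            col-of (fn [[_ c]] c)
            alone? (fn [key-fn z]              ; only zero in its row/column?
                     (= 1 (count (filter #(= (key-fn z) (key-fn %)) zs))))
            [r c]  (or (first (filter #(alone? row-of %) zs))
                       (first (filter #(alone? col-of %) zs))
                       (first zs))]            ; nothing forced: arbitrary pick
        (recur (set (remove (fn [[r2 c2]] (or (= r r2) (= c c2))) zs))
               (conj chosen [r c]))))))

;; zeros of the example matrix (columns a-d numbered 0-3):
(assign-zeros #{[0 1] [0 2] [0 3] [1 0] [1 3] [2 2] [2 3] [3 0]})
;=> [[3 0] [1 3] [2 2] [0 1]]   i.e. (a,3), (d,1), (c,2), (b,0)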

Special scheduling Algorithm (pattern expansion)

Question
Do you think genetic algorithms are worth trying for the problem below, or will I hit local-minima issues?
I think aspects of the problem are a great fit for a generator / fitness-function style setup. (If you've botched a similar project, I would love to hear from you, so I can avoid doing something similar.)
Thank you for any tips on how to structure things and nail this right.
The problem
I'm searching for a good scheduling algorithm to use for the following real-world problem.
I have a sequence with 15 slots like this (each digit may vary from 0 to 20):
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
(And there are in total 10 different sequences of this type.)
Each sequence needs to expand into a matrix with one column per slot, for example:
1 1 0 0 1 1 1 0 0 0 1 1 1 0 0
1 1 0 0 1 1 1 0 0 0 1 1 1 0 0
0 0 1 1 0 0 0 1 1 1 0 0 0 1 1
0 0 1 1 0 0 0 1 1 1 0 0 0 1 1
The constraints on the matrix are:
[row-wise, i.e. horizontally] The ones must be placed in runs of 11 or 111.
[row-wise] Two runs of ones must be separated by at least 00 (two zeros).
The sum of each column should match the corresponding slot of the original array.
The number of rows in the matrix should be minimized.
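Stated as code, the first three constraints on a single candidate matrix could be checked like this (a sketch; the helper names and my reading of the run/gap rules are assumptions, and the row count is the objective rather than a check):

(defn valid-row? [row]
  (let [groups (partition-by identity row)
        runs   (filter #(= 1 (first %)) groups)
        gaps   (->> row
                    (drop-while zero?) reverse       ; trim leading and
                    (drop-while zero?) reverse       ; trailing zeros
                    (partition-by identity)
                    (filter #(= 0 (first %))))]
    (and (every? #(#{2 3} (count %)) runs)           ; runs of 11 or 111
         (every? #(>= (count %) 2) gaps))))          ; interior gaps >= 00

(defn valid-matrix? [rows sequence]
  (and (seq rows)
       (every? valid-row? rows)
       (= (vec sequence) (apply mapv + rows))))      ; column sums match

(valid-matrix? [[1 1 0 0 1 1 1 0 0 0 1 1 1 0 0]
                [1 1 0 0 1 1 1 0 0 0 1 1 1 0 0]
                [0 0 1 1 0 0 0 1 1 1 0 0 0 1 1]
                [0 0 1 1 0 0 0 1 1 1 0 0 0 1 1]]
               [2 2 2 2 2 2 2 2 2 2 2 2 2 2 2])
;=> true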
The matrix then needs to be allocated to one of 4 different matrices, which may have different numbers of rows:
A, B, C, D
A, B, C and D are real-world departments. The load needs to be placed reasonably fair during the course of a 10-day period, not to interfere with other department goals.
Each of the matrix is compared with expansion of 10 different original sequences so you have:
A1, A2, A3, A4, A5, A6, A7, A8, A9, A10
B1, B2, B3, B4, B5, B6, B7, B8, B9, B10
C1, C2, C3, C4, C5, C6, C7, C8, C9, C10
D1, D2, D3, D4, D5, D6, D7, D8, D9, D10
Certain spots in these may be reserved (not sure if I should make it just reserved/not-reserved or function-based). The reserved spots might be meetings and other events.
The sum of each row group (for instance all the A's) should be approximately the same, within 2%; i.e. sum(A1 through A10) should be approximately the same as sum(B1 through B10), etc.
The number of rows can vary, so you have for instance:
A1: 5 rows
A2: 5 rows
A3: 1 row, where that single row could for instance be:
0 0 1 1 1 0 0 0 0 0 0 0 0 0 0
etc..
Sub-problem
I'd be very happy to solve only part of the problem. For instance, being able to input:
1 1 2 3 4 2 2 3 4 2 2 3 3 2 3
and get an appropriate matrix of 1-and-0 rows, minimizing the number of rows, following the constraints above.
Sub-problem solution attempt
Well, here's an idea. This solution is not based on a genetic algorithm, but some of its ideas could be reused when going in that direction.
Basis vectors
First of all, you should generate what I think of as the basis vectors. For instance, if your sequence were 3 numbers long rather than 15, the basis vectors would be:
v1 = [1 1 0]
v2 = [0 1 1]
v3 = [1 1 1]
Any solution for sequence length 3 would be a linear combination of these three vectors using only non-negative integer coefficients. In other words, the general solution would be
a*v1 + b*v2 + c*v3
where a, b and c are non-negative integers. For the sequence [1 2 1], the solution is a = 1, b = 1, c = 0. What you first want to do is find all of the possible basis vectors of length 15. From my rough calculations I think there are somewhere between 300-400 basis vectors of length 15. I can give you some tips towards generating them if you want.
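One way to enumerate them (my assumptions, from the constraints stated earlier: a basis vector is any 0/1 row whose 1-runs have length 2 or 3, with at least two 0s between consecutive runs; gap is the number of 0s still owed before the next run may start):

(defn rows-of [n gap]
  (if (zero? n)
    [[]]
    (concat
     ;; place a 0; it pays off part of any owed gap
     (map #(into [0] %) (rows-of (dec n) (max 0 (dec gap))))
     ;; or start a run of 2 or 3 ones, if no gap is owed and it fits
     (when (zero? gap)
       (for [run [2 3]
             :when (<= run n)
             tail (rows-of (- n run) 2)]
         (into (vec (repeat run 1)) tail))))))

(def basis-vectors
  (remove #(every? zero? %) (rows-of 15 0)))

(count basis-vectors)   ;=> a few hundred, in line with the estimate above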
Finding solutions
Now, what you want to do is sort these basis vectors by their sums/magnitudes. Then, in searching for your solution, you start with the basis vectors that have the largest sums, because they lead to fewer total rows. We also have an array, veccoefs, which contains an entry for the linear coefficient of each basis vector. At the beginning of the search, all the veccoefs are 0.
So we take the first basis vector (the one with the largest sum/magnitude) and subtract this vector from the sequence until we either create an unsolvable result ( having a 0 1 0 in it for instance) or any of the numbers in the result is negative. We store the number of times we subtract the vector in veccoefs. We use the result after subtracting the basis vector from the sequence as the sequence for the next basis vector. If there are only zeros left in the result, then we stop the loop.
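A sketch of that loop (simplified: the "unsolvable result" test is reduced to keeping the remainder non-negative; veccoefs becomes a map from basis vector to its coefficient):

(defn greedy-solve [sequence vectors]
  (loop [remainder (vec sequence)
         vs        (sort-by #(- (reduce + %)) vectors)   ; largest sums first
         veccoefs  {}]
    (if-let [v (first vs)]
      (let [diff (mapv - remainder v)]
        (if (every? #(>= % 0) diff)
          (recur diff vs (update veccoefs v (fnil inc 0)))   ; subtract again
          (recur remainder (rest vs) veccoefs)))             ; next vector
      [remainder veccoefs])))   ; solved iff remainder is all zeros

;; e.g. (greedy-solve [1 1 2 3 4 2 2 3 4 2 2 3 3 2 3] basis-vectors)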
I'm not sure of the efficiency/accuracy of this method, but it might at least give you some ideas.
Other possible solutions
Another idea for solving this is to use the basis vectors and form the problem as an optimization/least squares problem. You form a matrix of the basis vectors such that the basic problem will be minimizing Sum[(Ax - b)^2] where A is the matrix of basis vectors, b is the input sequence, and x are the basis vector coefficients. However, you also want to minimize the number of rows, so you can add a term like x^T*x to the minimization function where x^T is the transpose of x. The hard part in my opinion is finding differentiable terms to add that will encourage integer vector coefficients. If you can think of a way to do that, then optimization could very well be a good way to do this.
Also, you might consider a Metropolis-type Monte Carlo solution. You would choose randomly whether to add a vector, remove a vector, or substitute a vector at each step. The vector to be added/removed/substituted would be chosen randomly. The probability of the change being accepted would be a ratio of the suitabilities of the solutions before and after the change. The suitability could be equal to the difference between the current solution and the sequence, squared and summed, minus the number of rows/basis vectors involved in the solution. You would need to put in appropriate constants for the various terms to try to get the acceptance rate around 50%. I kind of doubt that this will work very well, but I thought that you should still consider it when looking for possible solutions.
GA can be applied to this problem, but it won't be a 5-minute task. You need to put several things together, without knowing which implementation of each of them is best.
So:
Solution representation - how will you represent a possible solution? Using a matrix seems most straightforward; using a collection of one-dimensional arrays is also possible.
But you have some constraints, so maybe the SuperGene concept is worth considering?
You must use proper mutation/crossover operators for the given gene representation.
How will you enforce the constraints on solutions? Destroying those that are not proper? What if they contain valuable information? Maybe let them stay in the population but add some penalty to fitness, so they still contribute to offspring but won't make it into later generations?
Anyway, I think that GA can be applied to this problem. Is it worth it? Usually GAs are not the best algorithm, but they are a decent algorithm if others fail. I would go with GA just because it would be the most fun, but I would also look for an alternative solution (just in case).
P.S. Personal insight: I was solving the N-Queens Problem for 70 < N < 100 (an NxN board with N queens). The algorithm worked fine for lower N (maybe it was trying all combinations?), but with N in this range I couldn't find a proper solution. Fitness quickly jumped to about 90% of max, but in the end there were always two queens conflicting. But it was a very naive implementation.
