Minimum Tile Ordering - algorithm
Minimizing Tile Re-ordering Problem:
Suppose I had the following symmetric 9x9 matrix, N^2 interactions between N particles:
(1,2) (2,9) (4,5) (4,6) (5,8) (7,8),
These are symmetric interactions, so they imply that the following also exist:
(2,1) (9,2) (5,4) (6,4) (8,5) (8,7),
In my problem, suppose they are arranged in matrix form, where only the upper triangle is shown:
t 0 1 2 (tiles)
# 1 2 3 4 5 6 7 8 9
1 [ 0 1 0 0 0 0 0 0 0 ]
0 2 [ x 0 0 0 0 0 0 0 1 ]
3 [ x x 0 0 0 0 0 0 0 ]
4 [ x x x 0 1 1 0 0 0 ]
1 5 [ x x x x 0 0 0 1 0 ]
6 [ x x x x x 0 0 0 0 ]
7 [ x x x x x x 0 1 0 ]
2 8 [ x x x x x x x 0 0 ]
9 [ x x x x x x x x 0 ] (x's denote symmetric pair)
I have some operation that's computed in 3x3 tiles, and any 3x3 tile that contains at least one 1 must be computed entirely. The above example requires at least 5 tiles: (0,0), (0,2), (1,1), (1,2), (2,2)
However, if I swap the 3rd and 9th columns (along with the rows, since it's a symmetric matrix) by permuting my input:
t 0 1 2
# 1 2 9 4 5 6 7 8 3
1 [ 0 1 0 0 0 0 0 0 0 ]
0 2 [ x 0 1 0 0 0 0 0 0 ]
9 [ x x 0 0 0 0 0 0 0 ]
4 [ x x x 0 1 1 0 0 0 ]
1 5 [ x x x x 0 0 0 1 0 ]
6 [ x x x x x 0 0 0 0 ]
7 [ x x x x x x 0 1 0 ]
2 8 [ x x x x x x x 0 0 ]
3 [ x x x x x x x x 0 ] (x's denote symmetric pair)
Now I only need to compute 4 tiles: (0,0), (1,1), (1,2), (2,2).
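The tile counts in the two examples above can be checked with a short sketch. The function name count_tiles and the representation of the ordering as a list of particle labels are assumptions made for illustration, not part of the original question.

```python
def count_tiles(pairs, order, tile=3):
    """Count the upper-triangle tiles touched by the given interaction pairs.

    pairs: iterable of (i, j) particle labels that interact
    order: particle labels in the order they appear along the matrix axes
    tile:  tile edge length T
    """
    position = {label: idx for idx, label in enumerate(order)}
    touched = set()
    for a, b in pairs:
        ta, tb = position[a] // tile, position[b] // tile
        # The matrix is symmetric, so (ta, tb) and (tb, ta) are the same tile.
        touched.add((min(ta, tb), max(ta, tb)))
    return len(touched)


PAIRS = [(1, 2), (2, 9), (4, 5), (4, 6), (5, 8), (7, 8)]
print(count_tiles(PAIRS, [1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 5
print(count_tiles(PAIRS, [1, 2, 9, 4, 5, 6, 7, 8, 3]))  # 4
```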
The General Problem:
Given an NxN sparse matrix, find a reordering that minimizes the number of TxT tiles that must be computed. Suppose that N is a multiple of T. An optimal, but infeasible, solution can be found by trying out all N! permutations of the input ordering.
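For very small N the exhaustive search can actually be run. The sketch below is only meant to make the objective concrete; the helper names are hypothetical, and the loop over all N! orderings is exactly the part that does not scale.

```python
from itertools import permutations


def count_tiles(pairs, order, tile):
    """Number of distinct upper-triangle tiles touched by the pairs."""
    position = {label: idx for idx, label in enumerate(order)}
    return len({
        tuple(sorted((position[a] // tile, position[b] // tile)))
        for a, b in pairs
    })


def brute_force_order(labels, pairs, tile):
    """Try every permutation of the labels; feasible only for tiny N."""
    best_order, best_count = None, None
    for order in permutations(labels):
        n_tiles = count_tiles(pairs, order, tile)
        if best_count is None or n_tiles < best_count:
            best_order, best_count = order, n_tiles
    return best_order, best_count


PAIRS = [(1, 2), (2, 9), (4, 5), (4, 6), (5, 8), (7, 8)]
order, n_tiles = brute_force_order(range(1, 10), PAIRS, tile=3)
print(order, n_tiles)
```

On the 9x9 example this search reports 2 tiles: for instance, the ordering 1 2 9 | 6 5 7 | 4 8 3 puts every interaction into tiles (0,0) and (1,2). So the optimum does not have to look banded or diagonal.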
For heuristics, I've tried bandwidth minimization routines (such as Reverse Cuthill-McKee) and Tim Davis' AMD routines, so far to no avail. I don't think diagonalization is the right approach here.
Here's a sample starting matrix:
http://proteneer.com/misc/out2.dat
[Images not reproduced: three sparsity plots labeled "Hilbert Curve", "RCM", and "Morton Curve"]
There are several well-known options you can try (some of them you have, but still):
(Reverse) Cuthill-McKee reduces the matrix bandwidth, keeping the entries close to the diagonal.
Approximate Minimum Degree - a light-weight fill-reducing reordering.
fill-reducing reordering for sparse LU/LL' decomposition (METIS, SCOTCH) - quite computationally heavy.
space filling curve reordering (something along these lines)
quad-trees for 2D or oct-trees for 3D problems - you assign the particles to quads/octants and later number them according to the quad/octant id, similar to space filling curves in a sense.
Self Avoiding Walk is used on structured grids to traverse the grid points in such order that all points are only visited once
a lot of research in blocking of the sparse matrix entries has been done in the context of Sparse Matrix-Vector multiplication. Many of the researchers have tried to find good reordering for that purpose (I do not have the perfect overview on that subject, but have a look at e.g. this paper)
All of those tend to find structure in your matrix and in some sense group the non-zero entries. Since you say you deal with particles, it means that your connectivity graph is in some sense 'local' because of spatial locality of the particle interactions. In this case these methods should be of good use.
Of course, they do not provide the exact solution to the problem :) But they are commonly used in exactly such cases because they yield very good reorderings in practice. I wonder what you mean by saying the methods you tried failed? Do you expect to find the optimum solution? Surely, they improve the situation compared to a random matrix ordering.
Edit: Let me briefly go through a few pictures. I have created a 3D structured Cartesian mesh composed of 20-node brick elements. I matched the size of the mesh so that it is similar to yours (~1000 nodes). Also, the number of non-zero entries per row is not too far off (51-81 in my case, 59-81 in your case; the two have very different distributions, however). The pictures below show RCM and METIS reorderings for a non-periodic mesh (left), and for a mesh with complete x-y-z periodicity (right):
Next picture shows the same matrix reordered using METIS and fill-reducing reordering
The difference is striking - bad impact of periodicity is clear. Now your matrix reordered with RCM and METIS
WOW. You have a problem :) First of all, I think there is something wrong with your rcm, because mine looks different ;) Also, I am certain that you can not conclude anything general and meaningful about any reordering based on this particular matrix. This is because your system size is very small (less than roughly 10x10x10 points), and you seem to have relatively long-range interactions between your particles. Hence, introducing periodicity into such small system has a much stronger bad effect on reordering than is seen in my structured case.
I would start the search for a good reordering by turning off periodicity. Once you have a reordering that satisfies you, introduce periodic interactions. In the system you showed there is almost nothing but periodicity, because it is very small and because your interactions are fairly long-range, at least compared to my mesh. In much larger systems periodicity will have a smaller effect on the center of the model.
Smaller, but still negative. Maybe you could change your approach to periodicity? Instead of including periodic connectivities explicitly in the matrix, construct and reorder a matrix without those and introduce explicit equations binding the periodic particles together, e.g.:
V_particle1 = V_particle100
or in other words
V_particle1 - V_particle100 = 0
and add those equations at the end of your matrix. This is the method of Lagrange multipliers. Here is how it looks for my system
You keep the reordering of the non-periodic system and the periodic connectivities are localized in a block at the end of the matrix. Of course, you can use it for any other reorderings.
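In matrix terms, and assuming the matrix is used in a linear solve, the constraint rows described above lead to the usual bordered system. The symbols K (the reordered non-periodic matrix), f (its right-hand side), B (one row per periodic pair, with +1 and -1 in the two bound columns) and \lambda (the multipliers) are notation introduced here for illustration.

```latex
\begin{pmatrix} K & B^{T} \\ B & 0 \end{pmatrix}
\begin{pmatrix} v \\ \lambda \end{pmatrix}
=
\begin{pmatrix} f \\ 0 \end{pmatrix},
\qquad
B_{k,\cdot}\, v = v_{1} - v_{100} = 0 \ \text{for the pair } (1, 100).
```

The zero block in the lower right corner is what places zeros on the diagonal of the augmented matrix.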
The next idea is you start with a reordered non-periodic system and explicitly eliminate matrix rows for the periodic nodes by adding them into the rows they are mapped onto. You should of course also eliminate the columns.
Whether you can use these depends on what you do with your matrix. Lagrange multipliers, for example, introduce zeros on the diagonal, and not all solvers like that.
Anyway, this is very interesting research. I think that the specifics of your problem (as I understand it: irregularly placed particles in 3D, with fairly long-range interactions) make it very difficult to group the matrix entries. But I am very curious what you end up doing. Please let me know!
You can look into a data structure like a kd-tree, R-tree, or quadtree, or into a space filling curve. A space filling curve in particular can help because it reduces the dimension and also reorders the tiles, and thus can add some new information to the grid. With a 9x9 grid it's probably good to look into Peano curves. The Z-order (Morton) curve is better for power-of-2 grids.
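A minimal sketch of a Z-order (Morton) reordering for particles, assuming each particle has a 3D position inside a cubic box [0, box) on every axis. The function names and the 10-bit quantization are illustrative choices, not something taken from the question.

```python
def interleave_bits(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def morton_order(positions, box, bits=10):
    """Return particle indices sorted along the Z-order curve.

    positions: list of (x, y, z) tuples with 0 <= coordinate < box (assumed)
    """
    cells = 1 << bits

    def key(index):
        # Quantize each coordinate to an integer grid cell, clamping the edge.
        cell = [min(int(c / box * cells), cells - 1) for c in positions[index]]
        return interleave_bits(*cell, bits=bits)

    return sorted(range(len(positions)), key=key)


points = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.9, 0.1, 0.1)]
print(morton_order(points, box=1.0))  # [1, 2, 0]
```

The returned index list can be used as the permutation of the matrix rows and columns. Spatially close particles tend to receive nearby indices, though with periodic or long-range interactions the grouping will be imperfect.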
Related
Why are matrices used in computer graphics?
I understand how to apply matrices in computer graphics, but I don't quite understand why this is done. For example in translation: to translate vector (x, y, z) by vector (diffX, diffY, diffZ) you could simply just add the vectors together instead of creating a translation matrix:
[1 0 0 diffX]
[0 1 0 diffY]
[0 0 1 diffZ]
[0 0 0 1    ]
and then multiplying the vector by the matrix to get (x+diffX, y+diffY, z+diffZ). Surely applying matrices like this would be wasteful of performance and memory?
Can I always assume that an mvp matrix with corner value !=1 is performing scaling?
Assume I have a modelview projection matrix, mvp and I know that mvp[3][3] !=1 and mvp[3][3] > 0 Can I assume that the model matrix performed the scaling or since the projection matrix itself performs scaling this number is not useful without the original matrices?
No, this value alone does not tell you much. Consider a diagonal matrix like the following:
d 0 0 0
0 d 0 0
0 0 d 0
0 0 0 d
Here d is an arbitrary nonzero number. This matrix is essentially the homogeneous equivalent of the identity matrix and does not perform any transformation at all. The uniform scaling part in the upper left 3x3 block is cancelled out by the perspective divide. You can always multiply the matrix by the inverse of the m33 entry to somewhat normalize it (this will preserve the transformation). For the above matrix, you would then get:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
In this form, you can easily see that it is the identity. Moreover, you can examine the upper left 3x3 block to find out if there is a scaling (depending on your definition of scaling, calculating the determinant of the 3x3 block and checking for 1 is one option, as Robert mentioned in the comments).
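The normalization and the determinant check described above can be sketched as follows, assuming the matrix is stored as a 4x4 list of rows. The function names are made up for this illustration.

```python
def normalize_homogeneous(m):
    """Divide every entry by m[3][3]; the represented transform is unchanged."""
    w = m[3][3]
    if w == 0:
        raise ValueError("m[3][3] is zero, cannot normalize this way")
    return [[value / w for value in row] for row in m]


def det3(m):
    """Determinant of the upper-left 3x3 block."""
    (a, b, c), (d, e, f), (g, h, i) = (row[:3] for row in m[:3])
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


mvp = [[4, 0, 0, 0],
       [0, 4, 0, 0],
       [0, 0, 4, 0],
       [0, 0, 0, 4]]
normalized = normalize_homogeneous(mvp)
print(normalized[0][0], det3(normalized))  # 1.0 1.0
```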
Algorithm for reading matrices
I need an algorithm to process an n x m matrix that is scalable. E.g. I have a time series of 3 seconds containing the values: 2,1,4. I need to decompose it into a 3 x 4 matrix, where 3 is the number of elements of the time series and 4 is the maximum value. The resulting matrix would look like this: 1 1 1 1 0 1 0 0 1 0 0 1 Is this a bad solution, or is it only considered a data entry problem? The question is, do I need to distribute information from each row of the matrix to various elements without losing the values?
Converting points into another coordinate system
There are 3 points in 3D space. There are 2 orthogonal coordinate systems with the same origin. I know coordinates of those 3 points in both coordinate systems. Given a new point with its coordinates in the first coordinate system, how can I find its coordinates in the second coordinate system? I think it's possible to get a rotation matrix using given points which does this, but I did not succeed doing this.
You can do it using matrix inverses. Three matrix-vector multiplications (e.g. transforming three 3D vectors by a 3x3 matrix) are equivalent to multiplying two 3x3 matrices together. So, you can put your first set of points in one matrix, call it A:
0 0 1 < vector 1
0 1 0 < vector 2
2 0 0 < vector 3
Then put your second set of points in a second matrix, call it C. As an example, imagine a transform that scales by a factor of 2 around the origin and flips the Y and Z axes:
0 2 0 < vector 1
0 0 2 < vector 2
4 0 0 < vector 3
So, if A x B = C, we need to find the matrix B, which we can do by finding A^-1. Inverse of A:
0 0 0.5
0 1 0
1 0 0
Then multiply A^-1 x C (in that order):
2 0 0
0 0 2
0 2 0
This is a transform matrix B that you can apply to new points. Dot-product multiply the vector by the first column to get the transformed X, by the second column to get the transformed Y, etc.
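The worked example above can be reproduced with a small standard-library sketch. The helpers invert, matmul and transform are written here for illustration; a numerical library such as NumPy would normally be used instead. Exact fractions are used so the result is not affected by rounding.

```python
from fractions import Fraction


def invert(m):
    """Gauss-Jordan inverse of a square matrix (fails if m is singular)."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Find a row with a nonzero pivot; next() raises if none exists.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transform(point, b):
    """Apply B to a row vector, as in A x B = C."""
    return matmul([point], b)[0]


A = [[0, 0, 1], [0, 1, 0], [2, 0, 0]]  # points in the first system (rows)
C = [[0, 2, 0], [0, 0, 2], [4, 0, 0]]  # same points in the second system
B = matmul(invert(A), C)
print([[int(v) for v in row] for row in B])      # [[2, 0, 0], [0, 0, 2], [0, 2, 0]]
print([int(v) for v in transform([1, 2, 3], B)]) # [2, 6, 4]
```

This only works when the three known points are linearly independent, since otherwise A has no inverse.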
Special scheduling Algorithm (pattern expansion)
Question

Do you think genetic algorithms are worth trying out for the problem below, or will I hit local-minima issues? I think some aspects of the problem may be a great fit for a generator / fitness-function style setup. (If you've botched a similar project, I would love to hear from you, so that I don't do something similar.) Thank you for any tips on how to structure things and nail this right.

The problem

I'm searching for a good scheduling algorithm to use for the following real-world problem. I have a sequence with 15 slots like this (the digits may vary from 0 to 20):

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

(There are 10 different sequences of this type in total.) Each sequence needs to expand into an array, where each slot can take 1 position:

1 1 0 0 1 1 1 0 0 0 1 1 1 0 0
1 1 0 0 1 1 1 0 0 0 1 1 1 0 0
0 0 1 1 0 0 0 1 1 1 0 0 0 1 1
0 0 1 1 0 0 0 1 1 1 0 0 0 1 1

The constraints on the matrix are:

[row-wise, i.e. horizontally] Ones must be placed in runs of either 11 or 111.
[row-wise] The distance between two runs of ones must be at least 00.
The sum of each column should match the original array.
The number of rows in the matrix should be optimized.

The array then needs to allocate one of 4 different matrices, which may have different numbers of rows: A, B, C, D. A, B, C and D are real-world departments. The load needs to be placed reasonably fairly over the course of a 10-day period, so as not to interfere with other department goals. Each of the matrices is compared with the expansion of the 10 different original sequences, so you have:

A1, A2, A3, A4, A5, A6, A7, A8, A9, A10
B1, B2, B3, B4, B5, B6, B7, B8, B9, B10
C1, C2, C3, C4, C5, C6, C7, C8, C9, C10
D1, D2, D3, D4, D5, D6, D7, D8, D9, D10

Certain spots on these may be reserved (not sure if I should make it just reserved/not reserved or function-based). The reserved spots might be meetings and other events. The sum of each row (for instance all the A's) should be approximately the same, within 2%, i.e. sum(A1 through A10) should be approximately the same as sum(B1 through B10) etc. The number of rows can vary, so you have for instance:

A1: 5 rows
A2: 5 rows
A3: 1 row, where that single row could for instance be: 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0
etc.

Sub-problem

I'd be very happy to solve only part of the problem. For instance, being able to input:

1 1 2 3 4 2 2 3 4 2 2 3 3 2 3

and get an appropriate array of sequences of 1's and 0's, with the number of rows minimized, following the constraints above.
Sub-problem solution attempt

Well, here's an idea. This solution is not based on a genetic algorithm, but some of the ideas could be used in going in that direction.

Basis vectors

First of all, you should generate what I think of as the basis vectors. For instance, if your sequence were 3 numbers long rather than 15, the basis vectors would be:

v1 = [1 1 0]
v2 = [0 1 1]
v3 = [1 1 1]

Any solution for sequence length 3 would be a linear combination of these three vectors using only non-negative integer coefficients. In other words, the general solution would be a*v1 + b*v2 + c*v3 where a, b and c are non-negative integers. For the sequence [1 2 1], the solution is a = 1, b = 1, c = 0. What you first want to do is find all of the possible basis vectors of length 15. From my rough calculations I think that there are somewhere between 300 and 400 basis vectors of length 15. I can give you some tips towards generating them if you want.

Finding solutions

Now, what you want to do is sort these basis vectors by their sums/magnitudes. Then, in searching for your solution, you start with the basis vectors which have the largest sums, because they lead to fewer total rows. We also have an array, veccoefs, which contains an entry for the linear coefficient of each basis vector. At the beginning of the search, all the veccoefs are 0. So we take the first basis vector (the one with the largest sum/magnitude) and subtract this vector from the sequence until we either create an unsolvable result (having a 0 1 0 in it, for instance) or any of the numbers in the result is negative. We store the number of times we subtracted the vector in veccoefs. We use the result after subtracting the basis vector from the sequence as the sequence for the next basis vector. If there are only zeros left in the result, then we stop the loop. I'm not sure of the efficiency/accuracy of this method, but it might at least give you some ideas.
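A rough sketch of the basis-vector generation and the greedy subtraction described above. Several things here are assumptions made for the sketch: a valid row is taken to be runs of two or three ones separated by at least two zeros, the "unsolvable" test is reduced to a cheap check for an isolated nonzero entry, and the coefficients are kept in a dict rather than the veccoefs array. Like the method itself, the greedy pass is not guaranteed to reach an all-zero residual.

```python
import re
from itertools import product

# Runs of 2 or 3 ones, separated by at least 2 zeros, at least one run.
ROW_PATTERN = re.compile(r"0*1{2,3}(?:0{2,}1{2,3})*0*")


def basis_vectors(n):
    """All valid 0/1 rows of length n (checks all 2**n candidates)."""
    return [bits for bits in product((0, 1), repeat=n)
            if ROW_PATTERN.fullmatch("".join(map(str, bits)))]


def has_isolated_nonzero(seq):
    """True if some nonzero entry has zero neighbours on both sides."""
    padded = [0, *seq, 0]
    return any(padded[i] and not padded[i - 1] and not padded[i + 1]
               for i in range(1, len(padded) - 1))


def greedy_decompose(target):
    """Subtract basis vectors, largest sum first; return (coefs, residual)."""
    residual = list(target)
    coefs = {}
    for vec in sorted(basis_vectors(len(target)), key=sum, reverse=True):
        count = 0
        while True:
            trial = [r - v for r, v in zip(residual, vec)]
            if min(trial) < 0 or has_isolated_nonzero(trial):
                break
            residual, count = trial, count + 1
        if count:
            coefs[vec] = count
        if not any(residual):
            break
    return coefs, residual


print(basis_vectors(3))             # [(0, 1, 1), (1, 1, 0), (1, 1, 1)]
print(greedy_decompose([1, 2, 1]))  # ({(0, 1, 1): 1, (1, 1, 0): 1}, [0, 0, 0])
```

For [1 2 1] the pass skips [1 1 1], because subtracting it would leave 0 1 0, and then uses [0 1 1] and [1 1 0] once each.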
Other possible solutions

Another idea for solving this is to use the basis vectors and form the problem as an optimization/least squares problem. You form a matrix of the basis vectors such that the basic problem will be minimizing Sum[(Ax - b)^2], where A is the matrix of basis vectors, b is the input sequence, and x holds the basis vector coefficients. However, you also want to minimize the number of rows, so you can add a term like x^T*x to the minimization function, where x^T is the transpose of x. The hard part in my opinion is finding differentiable terms to add that will encourage integer vector coefficients. If you can think of a way to do that, then optimization could very well be a good way to do this.

Also, you might consider a Metropolis-type Monte Carlo solution. You would choose randomly whether to add a vector, remove a vector, or substitute a vector at each step. The vector to be added/removed/substituted would be chosen randomly. The probability of this change being accepted would be a ratio of the suitabilities of the solutions before and after the change. The suitability could be equal to the difference between the current solution and the sequence, squared and summed, minus the number of rows/basis vectors involved in the solution. You would need to put in appropriate constants for the various terms to try to get the acceptance rate around 50%. I kind of doubt that this will work very well, but I thought that you should still consider it when looking for possible solutions.
A GA can be applied to this problem, but it won't be a 5-minute task. You need to put several things together, without knowing which implementation of each of them is best. So:

Solution representation - how will you represent a possible solution? Using a matrix seems to be the most straightforward. Using a collection of one-dimensional arrays is also possible. But you have some constraints, so maybe the SuperGene concept is worth considering?
You must use proper mutation/crossover operators for the given gene representation.
How will you enforce constraints on solutions? Destroying those that are not proper? What if they contain valuable information? Maybe let them stay in the population but add some penalty to fitness, so they will contribute to offspring, but won't go into the next generations?

Anyway, I think that a GA can be applied to this problem. Is it worth it? Usually GAs are not the best algorithm, but they are a decent algorithm if others fail. I would go with a GA, just because it would be the most fun, but I would look for an alternative solution (just in case).

P.S. Personal insight: I was solving the N Queens Problem for 70 < N < 100 (NxN board, N queens). The algorithm was working fine for lower N (maybe it was trying all combinations?), but with N in this range I couldn't find a proper solution. Fitness quickly jumped to about 90% of max, but in the end there were always two queens conflicting. But it was a very naive implementation.