Related
I was trying out the "Knight Probability in Chessboard" problem from LeetCode:
Given n, k, row and column, we have to find the probability that a knight initially placed at the cell indexed by [row, column] will still be on the n x n chessboard after k moves.
I wanted to do it by counting: maintain the number of ways we can get to the cell at index [x,y] in the k-th step at the dynamic programming memory location indexed [x,y,k], then sum the counts over all cells at the k-th index and divide by 8^k. That is, if I start at index [0,0], with n=4, the values at successive k-th indices will be:
After step 1:
0 0 0 0
0 0 1 0
0 1 0 0
0 0 0 0
After step 2:
4 0 2 0
0 0 0 2
2 0 0 0
0 2 0 4
After step 3:
0 6 0 0
6 0 11 0
0 11 0 6
0 0 6 0
Only the first step's output seems to be correct. After the second step, the sum is 2+2+2+2+4+4 = 16 and the probability is 16/8^2 = 0.25. However, the actual answer is 0.125. After the third step, the sum becomes 6+6+6+6+11+11 = 46 and the probability is 46/8^3 = 0.0898. But the actual answer is 0.039. Where does this dynamic programming approach go wrong?
Sample calculation for step 2 (diagram omitted)
Bottom up approach:
Start by filling P(x_start, y_start, 0) = 1 and setting (x_start, y_start) in a map (from positions to booleans) previous_layer_map. Also, set the counter current_layer to 1.
Iterate through each of the n^2 positions of the board. For each of them, check in O(1) if it reaches a square in previous_layer_map. If so:
If (x, y) was never seen before in the current layer (current_layer_map[x][y] == false), fill
P(x, y, current_layer) = P(x_reached, y_reached, current_layer-1)/8
and set (x, y) in current_layer_map.
Else, set
P(x, y, current_layer) += P(x_reached, y_reached, current_layer-1)/8
After you finish iterating through each of the n^2 positions of the board, empty previous_layer_map, fill it with the elements of current_layer_map, and empty current_layer_map. Also, increase the counter current_layer. Then start a new iteration. Go on like this until you reach the k-th layer.
Total time complexity: O(k * n^2).
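A minimal MATLAB sketch of this layer-by-layer idea (my own illustration; the function and variable names are made up, and row/col are 1-based here):
function prob = knightProbability(n, k, row, col)
% layer-by-layer DP: a cell's probability at layer t is the sum over the 8
% squares the knight could have come from, each divided by 8
moves = [1 2; 2 1; -1 2; -2 1; 1 -2; 2 -1; -1 -2; -2 -1];
P = zeros(n, n);
P(row, col) = 1;                       % layer 0: knight starts here with probability 1
for t = 1:k
    Pnext = zeros(n, n);
    for x = 1:n
        for y = 1:n
            for m = 1:8
                px = x - moves(m, 1);  % square the knight could have come from
                py = y - moves(m, 2);
                if px >= 1 && px <= n && py >= 1 && py <= n
                    Pnext(x, y) = Pnext(x, y) + P(px, py) / 8;
                end
            end
        end
    end
    P = Pnext;                         % current layer becomes the previous layer
end
prob = sum(P(:));                      % probability of still being on the board after k moves
end
Each layer costs O(n^2) work (8 lookups per cell), which matches the O(k * n^2) total above.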
Top down approach:
Let P(x, y, k) be the probability that the knight is at the square (x, y) at the k-th step. Look at all squares that the knight could have come from (you can get them in O(1), just look at the board with a pen and paper and get the formulas from the different cases, like knight in the corner, knight in the border, knight in a central region etc). Let them be (x1, y1), ... (xj, yj). For each of these squares, what is the probability that the knight jumps to (x, y) ? Considering that it can go out of the border, it's always 1/8. So:
P(x, y, k) = (P(x1, y1, k-1) + ... + P(xj, yj, k-1))/8
The base case is k = 0.
P(x, y ,0) = 1 if (x, y) = (x_start, y_start) and P(x, y, 0) = 0 otherwise.
That is your recurrence formula. You can use dynamic programming to calculate it.
Open question: how to analyze the time complexity of this solution? Is it equivalent to the bottom-up approach described in my other answer?
I was incorrectly incrementing the numbers. For example, in the diagram shown at the end of the original question, the red arrows increment from 1 to 2. That shouldn't be the case: going from one cell to the next represents the same single path continuing into the next cell, it does not create two different paths to that cell. The same holds for the blue arrow. So, the corrected steps are:
After step 1
0 0 0 0
0 0 1 0
0 1 0 0
0 0 0 0
After step 2
2 0 1 0
0 0 0 1
1 0 0 0
0 1 0 2
After step 3
0 2 0 0
2 0 6 0
0 6 0 2
0 0 2 0
and (2+2+2+2+6+6)/8^3 = 20/8^3 = 0.039
which is the correct answer!
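For reference, a small MATLAB sketch of this corrected counting approach (my own illustration, with 1-based indices); for n=4, start cell (1,1) and k=3 it reproduces 20/8^3 = 0.039:
n = 4; k = 3;
moves = [1 2; 2 1; -1 2; -2 1; 1 -2; 2 -1; -1 -2; -2 -1];
W = zeros(n, n); W(1, 1) = 1;          % number of paths ending at each cell after 0 steps
for t = 1:k
    Wnext = zeros(n, n);
    for x = 1:n
        for y = 1:n
            if W(x, y) > 0
                for m = 1:8
                    nx = x + moves(m, 1); ny = y + moves(m, 2);
                    if nx >= 1 && nx <= n && ny >= 1 && ny <= n
                        % a path into (x,y) continues as the same single path into (nx,ny)
                        Wnext(nx, ny) = Wnext(nx, ny) + W(x, y);
                    end
                end
            end
        end
    end
    W = Wnext;
end
probability = sum(W(:)) / 8^k          % 20/512 = 0.0391 here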
Assume I have a modelview-projection matrix mvp, and I know that mvp[3][3] != 1 and mvp[3][3] > 0.
Can I assume that the model matrix performed the scaling, or, since the projection matrix itself performs scaling, is this number not useful without the original matrices?
No, this value alone does not tell you much. Consider a diagonal matrix like the following:
d 0 0 0
0 d 0 0
0 0 d 0
0 0 0 d
d is an arbitrary number.
This matrix is essentially the homogeneous equivalent of the identity matrix and does not perform any transformation at all. The uniform scaling part in the upper left 3x3 block is cancelled out by the perspective divide. You can always multiply the matrix by the inverse of the m33 entry to somewhat normalize it (this will preserve the transformation). For the above matrix, you would then get:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
And in this form, you can easily see that it is the identity. Moreover, you can examine the upper left 3x3 block to find out if there is a scaling (depending on your definition of scaling, calculating the determinant of the 3x3 block and checking for 1 is one option as Robert mentioned in the comments).
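As a small illustration of that check (my own sketch, in MATLAB-style 1-based indexing, so mvp(4,4) corresponds to the mvp[3][3] entry from the question):
mvp = 2 * eye(4);                        % example: homogeneous "identity" with d = 2 on the diagonal
mvp_n = mvp / mvp(4, 4);                 % normalize by the m33 entry; preserves the transformation
upper = mvp_n(1:3, 1:3);                 % upper-left 3x3 block
hasScaling = abs(det(upper) - 1) > 1e-9  % false here: after normalization this is the identity
Note that the determinant check only detects volume-changing scaling; a non-uniform scale that happens to have determinant 1 would need a closer look, e.g. at the singular values of the 3x3 block.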
Say we have a matrix of zeros and ones
0 1 1 1 0 0 0
1 1 1 1 0 1 1
0 0 1 0 0 1 0
0 1 1 0 1 1 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1
and we want to find all the submatrices (we just need the row indices and column indices of the corners) with these properties:
contain at least L ones and L zeros
contain max H elements
e.g. take the previous matrix with L=1 and H=5: the submatrix 1 2 1 4 (row indices 1 to 2 and column indices 1 to 4)
0 1 1 1
1 1 1 1
satisfies property 1 but has 8 elements (more than 5), so it is not good;
the matrix 4 5 1 2
0 1
0 0
is good because it satisfies both properties.
The objective is then to find all the submatrices with min area 2*L, max area H and containing at least L ones and L zeros.
If we consider a matrix as a rectangle it is easy to find all the possible subrectangles with max area H and min area 2*L by looking at the divisors of all the numbers from H to 2*L.
For example, with H=5 and L=1 all the possible subrectangles/submatrices are given by the divisors of
H=5 -> divisors [1 5] -> possible rectangles of area 5 are 1x5 and 5x1
4 -> divisors [1 2 4] -> possible rectangles of area 4 are 1x4, 4x1 and 2x2
3 -> divisors [1 3] -> possible rectangles of area 3 are 3x1 and 1x3
2*L=2 -> divisors [1 2] -> possible rectangles of area 2 are 2x1 and 1x2
I wrote the code below, which, for each area, finds its divisors and cycles over them to find the submatrices. It works like this: take, for example, a 1x5 submatrix. The code fixes the first row of the matrix and slides the submatrix step by step along all the columns, from the left edge to the right edge; then it fixes the second row and slides the submatrix along all the columns again, and so on until it reaches the last row.
It does this for all the 1x5 submatrices, then it considers the 5x1 submatrices, then the 1x4, then the 4x1, then the 2x2, etc.
The code does the job in 2 seconds (it finds all the submatrices), but for big matrices, e.g. 200x200, it takes many minutes to find all the submatrices. So I wonder whether there are more efficient ways to do the job, and if so, which is the most efficient.
This is my code:
clc;clear all;close all
%% INPUT
P= [0 1 1 1 0 0 0 ;
1 1 1 1 0 1 1 ;
0 0 1 0 0 1 0 ;
0 1 1 0 1 1 1 ;
0 0 0 0 0 0 1 ;
0 0 0 0 0 0 1];
L=1; % a submatrix has to contain at least L ones and L zeros
H=5; % max area of a submatrix
[R,C]=size(P); % rows and columns of P
sub=zeros(1,6); % initializing the matrix containing the indexes of each submatrix (columns 1-4), their area (5) and the counter (6)
counter=1; % no. of submatrices found
%% FIND ALL RECTANGLES OF AREA >= 2*L & <= H
%
% idea: all rectangles of a certain area can be found using the area's divisors
% e.g. divisors(6)=[1 2 3 6] -> rectangles: 1x6 6x1 2x3 and 3x2
tic
for sH = H:-1:2*L % find rectangles of area H, H-1, ..., 2*L
div_sH=divisors(sH); % find all divisors of sH
disp(['_______AREA ', num2str(sH), '_______'])
for i = 1:round(length(div_sH)/2) % cycle over all couples of divisors
div_small=div_sH(i);
div_big=div_sH(end-i+1);
if div_small <= R && div_big <= C % rectangle with long side <= C and short side <= R
for j = 1:R-div_small+1 % cycle over all possible rows
for k = 1:C-div_big+1 % cycle over all possible columns
no_of_ones=length(find(P(j:j-1+div_small,k:k-1+div_big))); % no. of ones in the current submatrix
if no_of_ones >= L && no_of_ones <= sH-L % if the submatrix contains at least L ones AND L zeros
% row indexes columns indexes area position
sub(counter,:)=[j,j-1+div_small , k,k-1+div_big , div_small*div_big , counter]; % save the submatrix
counter=counter+1;
end
end
end
disp([' [', num2str(div_small), 'x', num2str(div_big), '] submatrices: ', num2str(size(sub,1))])
end
if div_small~=div_big % if the submatrix is a square, skip this part (otherwise there will be duplicates in sub)
if div_small <= C && div_big <= R % rectangle with long side <= R and short side <= C
for j = 1:C-div_small+1 % cycle over all possible columns
for k = 1:R-div_big+1 % cycle over all possible rows
no_of_ones=length(find(P(k:k-1+div_big,j:j-1+div_small)));
if no_of_ones >= L && no_of_ones <= sH-L
sub(counter,:)=[k,k-1+div_big,j,j-1+div_small , div_big*div_small, counter];
counter=counter+1;
end
end
end
disp([' [', num2str(div_big), 'x', num2str(div_small), '] submatrices: ', num2str(size(sub,1))])
end
end
end
end
fprintf('\ntime: %2.2fs\n\n',toc)
Here is a solution centered around 2D matrix convolution. The rough idea is to convolve P, for each submatrix shape, with a second matrix such that each element of the resulting matrix indicates how many ones are in the submatrix having its top left corner at said element. This way you get all solutions for a single shape in one go, without having to loop over rows/columns, greatly speeding things up (it takes less than a second for a 200x200 matrix on my 8-year-old laptop).
P= [0 1 1 1 0 0 0
1 1 1 1 0 1 1
0 0 1 0 0 1 0
0 1 1 0 1 1 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1];
L=1; % a submatrix has to contain at least L ones and L zeros
H=5; % max area of a submatrix
submats = [];
for sH = H:-1:2*L
div_sH=divisors(sH); % find all divisors of sH
for i = 1:length(div_sH) % cycle over all couples of divisors
%number of rows of the current submatrix
nrows=div_sH(i);
% number of columns of the current submatrix
ncols=div_sH(end-i+1);
% prepare matrix to convolve P with
m = zeros(nrows*2-1,ncols*2-1);
m(1:nrows,1:ncols) = 1;
% get the number of ones in the submatrix whose top left corner is at each element
submatsums = conv2(P,m,'same');
% mark positions where the submatrix would extend outside P as invalid
validsums = zeros(size(P))-1;
validsums(1:(end-nrows+1),1:(end-ncols+1)) = submatsums(1:(end-nrows+1),1:(end-ncols+1));
% get the indexes where the number of ones and zeros is >= L
topLeftIdx = find(validsums >= L & validsums<=sH-L);
% save submatrixes in following format: [index, nrows, ncols]
% You can of course use something different, but it seemed the simplest way to me
submats = [submats ; [topLeftIdx bsxfun(@times,[nrows ncols],ones(length(topLeftIdx),1))]];
end
end
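If you prefer the [rowStart rowEnd colStart colEnd] corner format from the question, the linear indices can be converted back afterwards, e.g. (untested sketch):
% convert each linear index to the (row, col) of the top-left corner,
% then derive the bottom-right corner from the stored submatrix shape
[r1, c1] = ind2sub(size(P), submats(:, 1));
r2 = r1 + submats(:, 2) - 1;           % bottom row   = top row  + nrows - 1
c2 = c1 + submats(:, 3) - 1;           % right column = left col + ncols - 1
corners = [r1 r2 c1 c2];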
First, I suggest that you combine finding the allowable sub-matrix sizes.
for smaller = 1:floor(sqrt(H))
    for larger = max(smaller, ceil(2*L/smaller)):floor(H/smaller)   % keep the area between 2*L and H
        % add smaller x larger and larger x smaller to your shapes list
    end
end
Next, start with the smallest rectangles in the shapes. Note that any solution to a small rectangle can be extended in any direction, to the area limit of H, and the added elements will not invalidate the solution you found. This will identify many solutions without bothering to check the populations within.
Keep track of the solutions you've found. As you work your way toward larger rectangles, you can avoid checking anything already in your solutions set. If you keep that in a hash table, checking membership is O(1). All you'll need to check thereafter will be larger blocks of mostly-1 adjacent to mostly-0. This should speed up the processing somewhat.
Is that enough of a nudge to help?
I would like to calculate a couple of texture features (namely: small/ large number emphasis, number non-uniformity, second moment and entropy). Those can be computed from Neighboring gray-level dependence matrix. I'm struggling with understanding/implementation of this. There is very little info on this method (publicly available).
According to this paper:
This matrix takes the form of a two-dimensional array Q, where Q(i,j) can be considered as frequency counts of grayness variation of a processed image. It has a similar meaning as histogram of an image. This array is Ng×Nr where Ng is the number of possible gray levels and Nr is the number of possible neighbours to a pixel in an image.
If the image function f(i,j) is discrete, then it is easy to compute the Q matrix (for positive integer d, a) by counting the number of times the difference between each element in f(i,j) and its neighbours is equal to or less than a at a certain distance d.
Here is the example from the same paper (d = 1, a = 0):
Input (image) matrix and output matrix Q (shown as images in the original; the input matrix is reproduced as A in the answer below):
I've been looking at this example for hours now and still can't figure out how they got that Q matrix. Anyone?
The method was originally created by C. Sun and W. Wee and was described in a paper called: "Neighboring gray level dependence matrix for texture classification" to which I got access, but can't download (after pressing download the page reloads and that's it).
In the example that you have provided, d=1 and a=0. When d=1, we consider pixels in an 8-pixel neighbourhood. When a=0, this means that we look for pixels that have the same value as the centre of the neighbourhood.
The basic algorithm is the following:
Initialize your NGLDM matrix to all zeroes. The total number of rows corresponds to the total number of possible intensities / values in your image. The total number of columns corresponds to how many pixels are in your neighbourhood plus 1. As such for d=1, we have an 8-pixel neighbourhood and so 8 + 1 = 9. Because there are 4 possible intensities (0,1,2,3), we thus have a 4 x 9 matrix. Let's call this matrix M.
For each pixel in your matrix, take note of its value. This determines the Ng row.
Write out how many valid neighbours there are that surround this pixel.
Count how many of those neighbouring pixels have the same value as the pixel you noted. This is your Nr column.
Once you have Ng and Nr, increment M(Ng, Nr) by 1.
Here's a slight gotcha: They ignore the border locations. As such, you don't do this procedure for the first row, last row, first column or last column. My guess is that they want to be sure that you have an 8-pixel neighbourhood all the time. This is also dictated by the distance d=1. You must be able to grab every valid pixel given a centre location at d=1. If d=2, then you would have to make sure that every pixel in the centre of the neighbourhood has a 25 pixel neighbourhood and so on.
Let's start from the second row, second column location of this matrix. Let's go through the steps:
Ng = 1 as the location is 1.
Valid neighbours - starting from the top left pixel in this neighbourhood, scanning left to right and omitting the centre, we have: 1, 1, 2, 0, 1, 0, 0, 2.
How many values are equal to 1? Three. Therefore Nr = 3.
M(Ng,Nr) += 1. Access row Ng = 1 and column Nr = 3, and increment this spot by 1.
Want to know how I figured out they don't count the borders? Let's do the bottom left pixel. That location is 0, so Ng = 0. If you repeat the algorithm that I just said, you would expect Ng = 0, Nr = 1, and so you would expect at least one entry in that location in your matrix... but you don't! If you do similar checks around the border of the image, you'll see that entries that are supposed to be there... aren't. Take a look at the third row, fifth column. You would think that Ng = 1 and Nr = 1, but we don't see that in the matrix.
One more example. Why is M(Ng,Nr) = 4 for Ng = 2, Nr = 4? Well, take a look at every pixel that has a 2 in it. The only valid locations where we can capture an 8-pixel neighbourhood successfully are row=2, col=4; row=3, col=3; row=3, col=4; row=4, col=3; and row=4, col=4. By applying the same algorithm that we have seen, you'll see that for each of those locations except row=3, col=4, Nr = 4. As such, we see the combination Ng = 2, Nr = 4 four times, and that's why that location is set to 4. In row=3, col=4, Nr is actually 5, as there are five 2s in that neighbourhood around that centre; that's why you see Ng = 2, Nr = 5, M(Ng,Nr) = 1.
As an example, let's do one of the locations. Let's do the 2 smack dab in the middle of the matrix (row=3, col=3):
Ng = 2
What are the valid neighbouring pixels? 1, 1, 2, 0, 2, 3, 2, 2 (omit the centre)
Count how many pixels equal to 2. There are four of them, so Nr = 4
M(Ng,Nr) += 1. Take Ng = 2, Nr = 4 and increment this spot by 1.
If you do this with the other valid locations that have 2, you'll see that Nr = 4 each time with the exception of the third row and fourth column, where Nr = 5.
So how would we implement this in MATLAB? What you can do is use im2col to transform each valid neighbourhood into a column. What I'm also going to do is extract the centre of each neighbourhood, which is the middle row of this column matrix. We then figure out how many pixels in each neighbourhood equal the centre, sum them up, and this determines our Nr values. The Ng values are the middle row values themselves. Once we have these, we can compute a histogram based on them, just as the algorithm does, to get our matrix. In other words, try doing this:
% // Your example
A = [1 1 2 3 1; 0 1 1 2 2; 0 0 2 2 1; 3 3 2 2 1; 0 0 2 0 1];
B = im2col(A, [3 3]); %//Convert neighbourhoods to columns - 3 x 3 means d = 1
C = bsxfun(@eq, B, B(5,:)); %// Figure out a logical matrix where each column tells
%// you which elements equal the one in that column's centre
D = sum(C, 1) - 1; %// Must subtract by 1 to discount centre pixel
Ng = B(5,:).' + 1; % // We must make this into a column vector, and we also must
% // offset by 1 as MATLAB starts indexing by 1.
%// Column vector is for accumarray input
Nr = D.' + 1; %// Do the same for Nr. We could have simply left out the + 1 here and
%// took out the subtraction of -1 for D, but I want to explicitly show
%// the steps
Q = accumarray([Ng Nr], 1, [4 9]); %// 4 unique intensities, 9 possible locations (0-8)
... and here is our matrix:
Q =
0 0 1 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0
0 0 0 0 4 1 0 0 0
0 1 0 0 0 0 0 0 0
If you check this, you'll see this matches with Q.
Bonus
If you want to be able to accommodate the algorithm in general, where you specify d and a, we can simply follow the guidelines of your text. For each neighbourhood, you find the difference between the centre pixel and all of the other pixels, and count how many of those differences are <= a, for any positive integer d. Note that this will create a (2*d + 1) x (2*d + 1) neighbourhood we need to examine. We can also make this into a function. Without further ado:
%// Set A up yourself, then use a and d as inputs
%// Precondition - a and d are both integers. a can be 0 and d is positive!
function [Q] = calculateGrayDepMatrix(A, a, d)
neigh = 2*d + 1; % //Calculate rows/columns of neighbourhood
numTotalNeigh = neigh*neigh; % //Calculate total number of pixels in neighbourhood
middleRow = ceil(numTotalNeigh / 2); %// Figure out which index the middle row is
B = im2col(A, [neigh neigh]); %// Make into columns
Cdiff = abs(bsxfun(@minus, B, B(middleRow,:))); %// For each neighbourhood, subtract with its centre
C = Cdiff <= a; %// For each neighbourhood, figure out which differences are <= a
D = sum(C, 1) - 1; % //For each neighbourhood, add them up
Ng = B(middleRow,:).' + 1; % // Determine Ng and Nr, and find Q
Nr = D.' + 1;
Q = accumarray([Ng Nr], 1, [max(Ng) numTotalNeigh]);
end
We can recreate the scenario we showed above with the example matrix by:
A = [1 1 2 3 1; 0 1 1 2 2; 0 0 2 2 1; 3 3 2 2 1; 0 0 2 0 1];
Q = calculateGrayDepMatrix(A, 0, 1);
Q is thus:
Q =
0 0 1 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0
0 0 0 0 4 1 0 0 0
0 1 0 0 0 0 0 0 0
Hope this helps!
Minimizing Tile Re-ordering Problem:
Suppose I have the following symmetric interactions among N = 9 particles (out of the N^2 possible pairs):
(1,2) (2,9) (4,5) (4,6) (5,8) (7,8),
These interactions are symmetric, so they implicitly imply that there also exist:
(2,1) (9,2) (5,4) (6,4) (8,5) (8,7),
In my problem, suppose they are arranged in matrix form, where only the upper triangle is shown:
t 0 1 2 (tiles)
# 1 2 3 4 5 6 7 8 9
1 [ 0 1 0 0 0 0 0 0 0 ]
0 2 [ x 0 0 0 0 0 0 0 1 ]
3 [ x x 0 0 0 0 0 0 0 ]
4 [ x x x 0 1 1 0 0 0 ]
1 5 [ x x x x 0 0 0 1 0 ]
6 [ x x x x x 0 0 0 0 ]
7 [ x x x x x x 0 1 0 ]
2 8 [ x x x x x x x 0 0 ]
9 [ x x x x x x x x 0 ] (x's denote symmetric pair)
I have some operation that's computed in 3x3 tiles, and any 3x3 that contains at least a single 1 must be computed entirely. The above example requires at least 5 tiles: (0,0), (0,2), (1,1), (1,2), (2,2)
However, if I swap the 3rd and 9th columns (along with the rows, since it's a symmetric matrix) by permuting my input:
t 0 1 2
# 1 2 9 4 5 6 7 8 3
1 [ 0 1 0 0 0 0 0 0 0 ]
0 2 [ x 0 1 0 0 0 0 0 0 ]
9 [ x x 0 0 0 0 0 0 0 ]
4 [ x x x 0 1 1 0 0 0 ]
1 5 [ x x x x 0 0 0 1 0 ]
6 [ x x x x x 0 0 0 0 ]
7 [ x x x x x x 0 1 0 ]
2 8 [ x x x x x x x 0 0 ]
3 [ x x x x x x x x 0 ] (x's denote symmetric pair)
Now I only need to compute 4 tiles: (0,0), (1,1), (1,2), (2,2).
The General Problem:
Given an NxN sparse matrix, find a re-ordering that minimizes the number of TxT tiles that must be computed. Suppose that N is a multiple of T. An optimal, but infeasible, solution can be found by trying out all N! permutations of the input ordering.
For heuristics, I've tried bandwidth minimization routines (such as Reverse CutHill McKee), Tim Davis' AMD routines, so far to no avail. I don't think diagonalization is the right approach here.
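For reference, this is the objective I am evaluating - a small MATLAB sketch that counts how many TxT tiles a given ordering touches (names are only illustrative):
function ntiles = tileCount(A, perm, T)
% A    : NxN symmetric (sparse) interaction matrix
% perm : candidate ordering of the N particles
% T    : tile size (N is assumed to be a multiple of T)
B = A(perm, perm);                     % apply the reordering to rows and columns
nb = size(B, 1) / T;
blockHasEntry = zeros(nb, nb);
for bi = 1:nb
    for bj = 1:nb
        blk = B((bi-1)*T+1 : bi*T, (bj-1)*T+1 : bj*T);
        blockHasEntry(bi, bj) = any(blk(:));
    end
end
ntiles = nnz(triu(blockHasEntry));     % non-empty tiles in the upper triangle (incl. diagonal)
end
With the 9x9 example above and T = 3, the identity ordering gives 5 tiles and the column/row swap gives 4, matching the counts above.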
Here's a sample starting matrix:
http://proteneer.com/misc/out2.dat
Hilbert curve, RCM, and Morton curve orderings (images omitted).
There are several well-known options you can try (some of them you have, but still):
(Reverse) Cuthill-McKee - reduces the matrix bandwidth, keeping the entries close to the diagonal.
Approximate Minimum Degree - a light-weight fill-reducing reordering.
fill-reducing reordering for sparse LU/LL' decomposition (METIS, SCOTCH) - quite computationally heavy.
space filling curve reordering (something along these lines)
quad-trees for 2D or oct-trees for 3D problems - you assign the particles to quads/octants and later number them according to the quad/octant id, similar to space filling curves in a sense.
Self Avoiding Walk is used on structured grids to traverse the grid points in such an order that all points are visited only once
a lot of research into blocking of the sparse matrix entries has been done in the context of Sparse Matrix-Vector multiplication. Many researchers have tried to find good reorderings for that purpose (I do not have a complete overview of that subject, but have a look at e.g. this paper)
All of those tend to find structure in your matrix and in some sense group the non-zero entries. Since you say you deal with particles, it means that your connectivity graph is in some sense 'local' because of spatial locality of the particle interactions. In this case these methods should be of good use.
Of course, they do not provide the exact solution to the problem :) But they are commonly used in exactly such cases because they yield very good reorderings in practice. I wonder what you mean when you say the methods you tried failed. Do you expect to find the optimum solution? Surely, they improve the situation compared to a random matrix ordering.
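In MATLAB, for example, the first two are available out of the box (symrcm and symamd), so comparing the resulting sparsity patterns takes only a few lines (a quick sketch, assuming A already holds your symmetric sparse interaction matrix):
p_rcm = symrcm(A);                        % Reverse Cuthill-McKee ordering
p_amd = symamd(A);                        % approximate minimum degree ordering
subplot(1, 3, 1); spy(A);                 title('original');
subplot(1, 3, 2); spy(A(p_rcm, p_rcm));   title('RCM');
subplot(1, 3, 3); spy(A(p_amd, p_amd));   title('AMD');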
Edit: Let me briefly go through a few pictures. I have created a 3D structured Cartesian mesh composed of 20-node brick elements. I matched the size of the mesh so that it is similar to yours (~1000 nodes). Also, the number of non-zero entries per row is not too far off (51-81 in my case, 59-81 in your case; the distributions, however, are very different). The pictures below show RCM and METIS reorderings for a non-periodic mesh (left), and for a mesh with complete x-y-z periodicity (right):
The next picture shows the same matrix reordered using the METIS fill-reducing reordering:
The difference is striking - the bad impact of periodicity is clear. Now your matrix reordered with RCM and METIS:
WOW. You have a problem :) First of all, I think there is something wrong with your RCM, because mine looks different ;) Also, I am certain that you cannot conclude anything general and meaningful about any reordering based on this particular matrix. This is because your system size is very small (less than roughly 10x10x10 points), and you seem to have relatively long-range interactions between your particles. Hence, introducing periodicity into such a small system has a much stronger bad effect on reordering than is seen in my structured case.
I would start the search for a good reordering by turning off periodicity. Once you have a reordering that satisfies you, introduce periodic interactions. In the system you showed there is almost nothing but periodicity: because it is very small and because your interactions are fairly long-range, at least compared to my mesh. In much larger systems periodicity will have a smaller effect on the center of the model.
Smaller, but still negative. Maybe you could change your approach to periodicity? Instead of including periodic connectivities explicitly in the matrix, construct and reorder a matrix without those and introduce explicit equations binding the periodic particles together, e.g.:
V_particle1 = V_particle100
or in other words
V_particle1 - V_particle100 = 0
and add those equations at the end of your matrix. This method is called Lagrange multipliers. Here is how it looks for my system:
You keep the reordering of the non-periodic system and the periodic connectivities are localized in a block at the end of the matrix. Of course, you can use it for any other reorderings.
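A rough sketch of what appending those constraint equations looks like (my own illustration; here A is the n-by-n non-periodic matrix and pairs is an m-by-2 list of periodic particle pairs, which is an assumed format):
m = size(pairs, 1);
rows = [(1:m)'; (1:m)'];
cols = [pairs(:, 1); pairs(:, 2)];
vals = [ones(m, 1); -ones(m, 1)];
C = sparse(rows, cols, vals, m, n);    % each constraint row encodes V_i - V_j = 0
Aaug = [A, C'; C, sparse(m, m)];       % constraints appended as a block at the end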
The next idea is to start with a reordered non-periodic system and explicitly eliminate the matrix rows for the periodic nodes by adding them into the rows they are mapped onto. You should of course also eliminate the columns.
Whether you can use these depends on what you do with your matrix. Lagrange multipliers, for example, introduce zeros on the diagonal - not all solvers like that.
Anyway, this is very interesting research. I think that because of the specifics of your problem (as I understand it - irregularly placed particles in 3D, with fairly long-range interactions) make it very difficult to group the matrix entries. But I am very curious what you end up doing. Please let me know!
You can look into a data structure like a kd-tree, R-tree, quadtree, or a space-filling curve. A space-filling curve can especially help, because it reduces the dimension and also reorders the tiles, and thus can add some new information to the grid. With a 9x9 grid it's probably good to look into Peano curves. The Z-order (Morton) curve is better for power-of-2 grids.
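As a toy illustration of the Z-order idea (my own sketch, assuming xy is an N-by-2 array of non-negative integer particle coordinates):
nbits = 16;                                % enough bits for the coordinate range assumed here
code = zeros(size(xy, 1), 1, 'uint64');
for b = 0:nbits-1
    code = bitor(code, bitshift(uint64(bitget(xy(:, 1), b + 1)), 2*b));      % interleave x bits
    code = bitor(code, bitshift(uint64(bitget(xy(:, 2), b + 1)), 2*b + 1));  % interleave y bits
end
[~, order] = sort(code);                   % 'order' is the Morton-curve permutation of the particles
Particles that are close in space then end up close in the ordering, which tends to cluster their interactions into nearby tiles.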