Given a grid (or table) with x*y cells, where each cell contains a value. Most of these cells have a value of 0, but there may be a "hot spot" somewhere on the grid: a cell with a high value. The neighbours of this cell then also have values > 0, and the farther away a cell is from the hot spot, the lower its value.
So the hot spot can be seen as the top of a hill, with values decreasing the farther we move away from it. At a certain distance the values drop to 0 again.
Now I need to determine the cell within the grid that represents the grid's center of gravity. In the simple example above this centroid would simply be the one cell with the highest value. However, it's not always that simple:
the decreasing values of the neighbour cells around the hot spot may not be equally distributed, or one "side of the hill" may fall to 0 sooner than another side.
there may be another hot spot/hill with values > 0 elsewhere within the grid.
I imagine this is a fairly typical problem. Unfortunately I am no math expert, so I don't know what to search for (at least I have not found an answer on Google).
Any ideas how I can solve this problem?
Thanks in advance.
You are looking for the "weighted mean" of the cell values. Assuming each cell has a value z(x,y), then you can do the following
zx(x) = sum( z(x, y) ) over all values of y
zy(y) = sum( z(x, y) ) over all values of x
meanX = sum( x * zx(x)) / sum ( zx(x) )
meanY = sum( y * zy(y)) / sum ( zy(y) )
I trust you can convert this into a language of your choice...
Example: if you know Matlab, then the above would be written as follows
zx = sum( Z, 1 ); % sum all the rows
zy = sum( Z, 2 ); % sum all the columns
[ny nx] = size(Z); % find out the dimensions of Z
meanX = sum((1:nx).*zx) / sum(zx);
meanY = sum((1:ny).*zy') / sum(zy);  % note the transpose: zy is a column vector
This would give you the meanX in the range 1 .. nx : if it's right in the middle, the value would be (nx+1)/2. You can obviously scale this to your needs.
EDIT: one more time, in "almost real" code:
// array Z(N, M) contains values on an evenly spaced grid
// rows are indexed ii = 1..N (the y direction), columns jj = 1..M (the x direction)
// assume base 1 arrays
zx = zeros(M);
zy = zeros(N);
// create X profile (sum each column over all rows):
for jj = 1 to M
    for ii = 1 to N
        zx(jj) = zx(jj) + Z(ii, jj);
    next ii
next jj
// create Y profile (sum each row over all columns):
for ii = 1 to N
    for jj = 1 to M
        zy(ii) = zy(ii) + Z(ii, jj);
    next jj
next ii
xsum = 0;
zxsum = 0;
for jj = 1 to M
    zxsum += zx(jj);
    xsum += jj * zx(jj);
next jj
xmean = xsum / zxsum;
ysum = 0;
zysum = 0;
for ii = 1 to N
    zysum += zy(ii);
    ysum += ii * zy(ii);
next ii
ymean = ysum / zysum;
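For reference, here is a minimal NumPy sketch of the same weighted-mean calculation (1-based indices are used only to mirror the Matlab example above; the function name is my own):
import numpy as np

def center_of_gravity(Z):
    # weighted mean (centroid) of a 2-D grid of non-negative values
    Z = np.asarray(Z, dtype=float)
    ny, nx = Z.shape
    zx = Z.sum(axis=0)  # profile along x (sum over rows)
    zy = Z.sum(axis=1)  # profile along y (sum over columns)
    mean_x = np.average(np.arange(1, nx + 1), weights=zx)
    mean_y = np.average(np.arange(1, ny + 1), weights=zy)
    return mean_x, mean_y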
This Wikipedia entry may help; the section entitled "A system of particles" is all you need. Just understand that you need to do the calculation once for each dimension, of which you apparently have two.
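For reference, the formula in that section is the standard center-of-mass expression R = sum( m_i * r_i ) / sum( m_i ); here the cell values play the role of the masses m_i, and you evaluate it separately for the x coordinates and for the y coordinates.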
And here is a complete Scala 2.10 program to generate a grid full of random integers (using dimensions specified on the command line) and find the center of gravity (where rows and columns are numbered starting at 1):
object Ctr extends App {
val Array( nRows, nCols ) = args map (_.toInt)
val grid = Array.fill( nRows, nCols )( util.Random.nextInt(10) )
grid foreach ( row => println( row mkString "," ) )
val sum = grid.map(_.sum).sum
val xCtr = ( ( for ( i <- 0 until nRows; j <- 0 until nCols )
yield (j+1) * grid(i)(j) ).sum :Float ) / sum
val yCtr = ( ( for ( i <- 0 until nRows; j <- 0 until nCols )
yield (i+1) * grid(i)(j) ).sum :Float ) / sum
println( s"Center is ( $xCtr, $yCtr )" )
}
You could def a function to keep the calculations DRYer, but I wanted to keep it as obvious as possible. Anyway, here we run it a couple of times:
$ scala Ctr 3 3
4,1,9
3,5,1
9,5,0
Center is ( 1.8378378, 2.0 )
$ scala Ctr 6 9
5,1,1,0,0,4,5,4,6
9,1,0,7,2,7,5,6,7
1,2,6,6,1,8,2,4,6
1,3,9,8,2,9,3,6,7
0,7,1,7,6,6,2,6,1
3,9,6,4,3,2,5,7,1
Center is ( 5.2956524, 3.626087 )
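As a quick sanity check of the first run: the grid total is 37, the column-weighted sum is 1*(4+3+9) + 2*(1+5+5) + 3*(9+1+0) = 16 + 22 + 30 = 68, so xCtr = 68/37 ≈ 1.8378, and the row-weighted sum is 1*14 + 2*9 + 3*14 = 74, so yCtr = 74/37 = 2.0, matching the printed output.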
I am trying to calculate the partial first derivatives with respect to each of the two dimensions of a 2D matrix, i.e. dF/dx and dF/dy, using the five-point method. I have managed to do this successfully by looping over the points:
dF_dx = zeros(size(F));
dF_dy = zeros(size(F));
% Derivative with respect to y for each x value (Apply to all columns simultaneously)
dF_dy(2,1:(Nx-1)) = ( F(3,1:(Nx-1)) - F(1,1:(Nx-1)) )/(2*dy);
for m = 3:(Ny-2)
dF_dy(m,1:(Nx-1)) = ( F(m-2,1:(Nx-1)) - F(m+2,1:(Nx-1)) + 8*F(m+1,1:(Nx-1)) - 8*F(m-1,1:(Nx-1)) )/(12*dy);
end
dF_dy(Ny-1,1:(Nx-1)) = ( F(Ny,1:(Nx-1)) - F(Ny-2,1:(Nx-1)) )/(2*dy);
% Derivative with respect to x for each y value (Apply to all rows simultaneously)
dF_dx(2:(Ny-1),2) = ( F(2:(Ny-1),3) - F(2:(Ny-1),1) )/(2*dx);
for n = 3:(Nx-2)
dF_dx(2:(Ny-1),n) = ( F(2:(Ny-1),n-2) - F(2:(Ny-1),n+2) + 8*F(2:(Ny-1),n+1) - 8*F(2:(Ny-1),n-1) )/(12*dx);
end
dF_dx(2:(Ny-1),(Nx-1)) = ( F(2:(Ny-1),Nx) - F(2:(Ny-1),Nx-2) )/(2*dx);
Is there a clever Matlab way to vectorize this to make it much faster to execute? I have read about several functions which seem like they might do the trick (such as diff(), circshift(), or maybe even kron()), but am not sure how they might be used to solve this problem.
(My plan is to implement this as a gpuArray later, if that is relevant to the solution.)
Thank you!
1st Edit
By looking at the source code for gradient(), I was able to make the following version which is vectorized (i.e without the loop):
F = rand(5,8);
Nx = size(F,2);
Ny = size(F,1);
dx = 2;
dy = 3;
dF_dx = zeros(size(F));
dF_dy = zeros(size(F));
dF_dx(2:(Ny-1),3:Nx-2) = (F(2:(Ny-1),1:Nx-4) - F(2:(Ny-1),5:Nx) + 8*F(2:(Ny-1),4:Nx-1) - 8*F(2:(Ny-1),2:Nx-3))/(12*dx);
dF_dx(2:(Ny-1),2) = ( F(2:(Ny-1),3) - F(2:(Ny-1),1) )/(2*dx);
dF_dx(2:(Ny-1),(Nx-1)) = ( F(2:(Ny-1),Nx) - F(2:(Ny-1),Nx-2) )/(2*dx);
dF_dy(3:Ny-2,1:(Nx-1)) = (F(1:Ny-4,1:(Nx-1)) - F(5:Ny,1:(Nx-1)) + 8*F(4:Ny-1,1:(Nx-1)) - 8*F(2:Ny-3,1:(Nx-1)))/(12*dy);
dF_dy(2,1:(Nx-1)) = ( F(3,1:(Nx-1)) - F(1,1:(Nx-1)) )/(2*dy);
dF_dy((Ny-1),1:(Nx-1)) = ( F(Ny,1:(Nx-1)) - F(Ny-2,1:(Nx-1)) )/(2*dy);
Is there a way to do it without all the indexing (which I suspect is what is taking the most time)?
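For comparison, here is a minimal NumPy sketch of the same slicing approach (five-point stencil in the interior, two-point stencil on the second and penultimate rows/columns, outermost rows/columns simply left at zero), in case seeing it in another array language helps:
import numpy as np

def five_point_partials(F, dx, dy):
    # five-point stencil in the interior, two-point stencil next to the edges
    dF_dx = np.zeros_like(F)
    dF_dy = np.zeros_like(F)
    # x direction (along columns)
    dF_dx[:, 2:-2] = (F[:, :-4] - F[:, 4:] + 8*F[:, 3:-1] - 8*F[:, 1:-3]) / (12*dx)
    dF_dx[:, 1]  = (F[:, 2] - F[:, 0]) / (2*dx)
    dF_dx[:, -2] = (F[:, -1] - F[:, -3]) / (2*dx)
    # y direction (along rows)
    dF_dy[2:-2, :] = (F[:-4, :] - F[4:, :] + 8*F[3:-1, :] - 8*F[1:-3, :]) / (12*dy)
    dF_dy[1, :]  = (F[2, :] - F[0, :]) / (2*dy)
    dF_dy[-2, :] = (F[-1, :] - F[-3, :]) / (2*dy)
    return dF_dx, dF_dy

dF_dx, dF_dy = five_point_partials(np.random.rand(5, 8), dx=2.0, dy=3.0)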
2nd Edit
I have now implemented Cris and chtz's suggestion and achieved this using a convolution with conv2(), as follows:
F = rand(500,600);
Nx = size(F,2);
Ny = size(F,1);
dx = 2;
dy = 3;
[dF_dx, dF_dy] = partial_derivatives(F, dx, dy, Nx, Ny)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [dF_dx, dF_dy] = partial_derivatives(F, dx, dy, Nx, Ny)
kernx = [-1 8 0 -8 1]/(12*dx); % Convolution kernel for x dimension
kerny = [-1;8;0;-8;1]/(12*dy); % Convolution kernel for y dimension
%%%%%%%% Partial derivative across x dimension %%%%%%%%
dF_dx = conv2(F, kernx, 'same') ; % Internal mesh points (five-point method)
% Second and penultimate mesh points (two-point method)
dF_dx(2:(Ny-1),2) = ( F(2:(Ny-1),3) - F(2:(Ny-1),1) )/(2*dx);
dF_dx(2:(Ny-1),(Nx-1)) = ( F(2:(Ny-1),Nx) - F(2:(Ny-1),Nx-2) )/(2*dx);
dF_dx(:,[1 Nx]) = 0; % Set boundary conditions
dF_dx([1 Ny],:) = 0; %
%%%%%%%% Partial derivative across y dimension %%%%%%%%
dF_dy = conv2(F, kerny, 'same') ; % Internal mesh points (five-point method)
% Second and penultimate mesh points (two-point method)
dF_dy(2,1:(Nx-1)) = ( F(3,1:(Nx-1)) - F(1,1:(Nx-1)) )/(2*dy);
dF_dy((Ny-1),1:(Nx-1)) = ( F(Ny,1:(Nx-1)) - F(Ny-2,1:(Nx-1)) )/(2*dy);
dF_dy(:,Nx) = 0; % Set boundary conditions
dF_dy([1 Ny],:) = 0; %
end
As a gpuArray, this does not provide any noticeable improvement over the vectorized version in the 1st Edit. Is there an obvious way to improve it? Thanks.
Consider the following 2D matrix:
0,0,0,0,0
0,0,1,0,0
0,0,0,1,0
0,0,0,0,0
I can only make horizontal or vertical cuts stretching from one edge to the other.
What algorithm could I use to find out how many times I can divide the matrix into 2 parts such that each of the 2 parts gets an equal number of cells containing 1?
I assume you can do only one horizontal or one vertical cut.
One approach for "one single matrix" is as follows ( you need to apply this repeatedly to each partition you get ):
Compute the number of ones in each row and the number of ones in each column, and store them in two arrays like
OnesInRow[num_rows] , OnesInCol[num_cols]
Also compute the total number of 1s in the matrix, which is simply the sum of all elements in either of the above arrays:
total = Sum( All elements in OnesInRow )
For example, you can get the number of ones in row number 2 as OnesInRow[1] ( assuming row indices start from 0 ). Similarly, the number of ones in column number 3 is OnesInCol[2].
Now consider a horizontal cut like this:
0,0,0,0,0
---------
0,0,1,0,0
0,0,0,1,0
0,0,0,0,0
The number of ones that you get in each partition is: OnesInRow[0] and Total - OnesInRow[0]
For this:
0,0,0,0,0
0,0,1,0,0
---------
0,0,0,1,0
0,0,0,0,0
it is: Total - ( OnesInRow[0] + OnesInRow[1] ) and OnesInRow[0] + OnesInRow[1]
For this:
0,0, | 0,0,0
0,0, | 1,0,0
0,0, | 0,1,0
0,0, | 0,0,0
it is: Total - (OnesInCol[0] + OnesInCol[1] ), OnesInCol[0] + OnesInCol[1]
So you just need to consider all row cuts and column cuts and count which of those cuts lead to two equal partitions of ones.
int count = 0;
int prevOnes = 0;
int onesInRowAboveThisCut = 0;
for ( int i = 1; i < rows; i++ ) {
    onesInRowAboveThisCut = prevOnes + OnesInRow[i-1];
    if ( 2 * onesInRowAboveThisCut == total ) count++; // equal split of the ones
    prevOnes = onesInRowAboveThisCut;
}
prevOnes = 0;
int onesInColBeforeThisCut = 0;
for ( int i = 1; i < cols; i++ ) {
    onesInColBeforeThisCut = prevOnes + OnesInCol[i-1];
    if ( 2 * onesInColBeforeThisCut == total ) count++; // equal split of the ones
    prevOnes = onesInColBeforeThisCut;
}
return count;
Then for each sub-matrix you get from such a partition you could repeat the process recursively, until you can't cut any more, i.e. there is only one element left.
At each recursion you maintain a count variable and update it.
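For illustration, here is a minimal Python sketch of the single-cut counting step described above (comparing 2*count against the total sidesteps the integer division for odd totals, as in the snippet above); the recursive partitioning is not included:
def count_equal_cuts(matrix):
    # count horizontal and vertical edge-to-edge cuts that split the 1s equally
    rows, cols = len(matrix), len(matrix[0])
    ones_in_row = [sum(r) for r in matrix]
    ones_in_col = [sum(matrix[i][j] for i in range(rows)) for j in range(cols)]
    total = sum(ones_in_row)
    count = 0
    above = 0
    for i in range(1, rows):       # cut between row i-1 and row i
        above += ones_in_row[i - 1]
        if 2 * above == total:
            count += 1
    before = 0
    for j in range(1, cols):       # cut between column j-1 and column j
        before += ones_in_col[j - 1]
        if 2 * before == total:
            count += 1
    return count

grid = [[0,0,0,0,0],
        [0,0,1,0,0],
        [0,0,0,1,0],
        [0,0,0,0,0]]
print(count_equal_cuts(grid))  # 2: one horizontal and one vertical cut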
I have matrices A and B. I want to take the sum of squared errors between them, ss = sum(sum( (A-B).^2 )), but I only want to include an element if NEITHER matrix element at that position is identically zero. For now, I am going through each matrix as follows:
for i = 1:N
    for j = 1:M
        if( A(i,j) == 0 )
            B(i,j) = 0;
        elseif( B(i,j) == 0 )
            A(i,j) = 0;
        end
    end
end
and then taking the sum of squares after that. Is there a way to vectorize the comparison and reassigning of values?
If you were just trying to achieve what the listed code is doing, but in a vectorized fashion, you can use this approach -
%// Create mask to set elements in both A and B to zeros
mask = A==0 | B==0
%// Set A and B to zeros at places where mask has TRUE values
A(mask) = 0
B(mask) = 0
If you also want to compute the sum of squared errors after the listed code, you can do it in one go like this -
df = A - B;
df(A==0 | B==0) = 0;
ss_vectorized = sum(df(:).^2);
Or as @carandraug commented, you can use the built-in sumsq for the sum of squares calculation at the last step -
ss_vectorized = sumsq(df(:));
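For reference, the same masking idea in NumPy (a sketch assuming A and B are arrays of equal shape):
import numpy as np

def masked_sse(A, B):
    # sum of squared errors, ignoring positions where either A or B is exactly zero
    mask = (A == 0) | (B == 0)
    d = A - B
    d[mask] = 0
    return np.sum(d ** 2)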
I would like to split a rectangle into cells. In each cell a random coordinate (y, z) should be created.
The width and height of the rectangle are known (initialW / initialH).
The size of the cells is calculated (dy / dz).
The number of cells the rectangle is to be split into is known (numberCellsY / numberCellsZ).
Here is my Fortran code to split the rectangle into cells:
yRVEMin = 0.0
yRVEMax = initialW
dy = ( yRVEMax - yRVEMin ) / numberCellsY
zRVEMin = 0.0
zRVEMax = initialH
dz = ( zRVEMax - zRVEMin ) / numberCellsZ
do i = 1, numberCellsY
  yMin(i) = (i-1)*dy
  yMax(i) = i*dy
end do
do j = 1, numberCellsZ
  zMin(j) = (j-1)*dz
  zMax(j) = j*dz
end do
Now I would like to produce a random coordinate in each cell. The problem for me is storing the coordinates in arrays. They do not all have to be stored in one array, but in as few arrays as possible.
To fill the cells with coordinates, it should start at the bottom-left cell, go through the row (y-direction), and after the last cell (numberCellsY) jump one column higher (z-direction) and start again at the first cell on the left side of the new row. This should continue until a prescribed number of coordinates (nfibers) is reached.
Here is my (admittedly poor) attempt to do it:
call random_seed
l = 0
do k = 1 , nfibers
  if (l < numberCellsY) then
    l = l + 1
  else
    l = 1
  end if
  call random_number(y)
  fiberCoordY(k) = yMin(l) + y * (yMax(l) - yMin(l))
end do
n = 0
do m = 1 , nfibers
  if (n < numberCellsZ) then
    n = n + 1
  else
    n = 1
  end if
  call random_number(z)
  fiberCoordZ(m) = zMin(n) + z * (zMax(n) - zMin(n))
end do
The output is not what I want! fiberCoordZ should stay within zMin(1) / zMax(1) until numberCellsY steps have been taken.
The output for following settings:
nfibers = 9
numberCellsY = 3
numberCellsZ = 3
initialW = 9.0
initialH = 9.0
My random output for fiberCoordY is:
1.768946 3.362770 8.667685 1.898700 5.796713 8.770239 2.463412 3.546694 7.074708
and for fiberCoordZ is:
2.234807 5.213032 6.762228 2.948657 5.937295 8.649946 0.6795220 4.340364 8.352566
In this case the first 3 numbers of fiberCoordZ should have a value between 0.0 and 3.0, numbers 4 - 6 a value between 3.0 and 6.0, and numbers 7 - 9 a value between 6.0 and 9.0.
How can I solve this? If somebody has a solution with a better approach, please post it!
Thanks
Looking at
n = 0
do m = 1 , nfibers
  if (n < numberCellsZ) then
    n = n + 1
  else
    n = 1
  end if
  call random_number(z)
  fiberCoordZ(m) = zMin(n) + z * (zMax(n) - zMin(n))
end do
we see that the z coordinate offset (the bottom cell boundary of interest) is being incremented inappropriately: for each consecutive nfibers/numberCellsZ coordinates n should be constant.
n should be incremented only every numberCellsY iterations, so perhaps a condition like
if (MOD(m, numberCellsY).eq.1) n=n+1
would be better.
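For the example in the question (nfibers = 9, numberCellsY = 3), MOD(m, numberCellsY) equals 1 at m = 1, 4 and 7, so n is incremented to 1, 2 and 3 at exactly those steps and each zMin(n)/zMax(n) band is reused for three consecutive fibers, as required.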
Thanks francescalus! It works fine.
I added a little more for the case that nfibers > numberCellsY*numberCellsZ
n=0
do m = 1 , nfibers
  if (MOD(m, numberCellsY).eq.1 .and. (n < numberCellsY)) then
    n = n + 1
  end if
  if (MOD(m, numberCellsY*numberCellsZ).eq.1 ) then
    n = 1
  end if
  call random_number(z)
  fiberCoordZ(m) = zMin(n) + z * (zMax(n) - zMin(n))
end do
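For what it's worth, here is a small Python sketch of the intended fill order; the variable names mirror the Fortran ones, and the values are just the example settings from the question:
import random

numberCellsY, numberCellsZ, nfibers = 3, 3, 9
initialW, initialH = 9.0, 9.0
dy, dz = initialW / numberCellsY, initialH / numberCellsZ

fiberCoordY, fiberCoordZ = [], []
for k in range(nfibers):
    iy = k % numberCellsY                    # y cell cycles fastest
    iz = (k // numberCellsY) % numberCellsZ  # z cell advances every numberCellsY fibers, wraps after a full pass
    fiberCoordY.append(iy * dy + random.random() * dy)
    fiberCoordZ.append(iz * dz + random.random() * dz)

print(fiberCoordY)
print(fiberCoordZ)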
Problem: We have x checkboxes and we want to check y of them evenly.
Example 1: select 50 checkboxes of 100 total.
[-]
[x]
[-]
[x]
...
Example 2: select 33 checkboxes of 100 total.
[-]
[-]
[x]
[-]
[-]
[x]
...
Example 3: select 66 checkboxes of 100 total:
[-]
[x]
[x]
[-]
[x]
[x]
...
But we're having trouble coming up with a formula to check them in code, especially once you get to something like 11 out of 111. Does anyone have an idea?
Let's first assume y is divisible by x (note that here y denotes the total number of boxes and x the number to check, the reverse of the question's notation). Then we denote p = y/x and the solution is simple: go through the list and, every p elements, mark 1 of them.
Now, let's say r = y%x is non-zero. Still p = y/x, where / is integer division. So you need to:
In the first p-r elements, mark 1 element
In the last r elements, mark 2 elements
Note: This depends on how you define "evenly distributed". You might want to spread the r sections with x+1 elements in between the p-r sections with x elements, which is again the same problem and could be solved recursively.
Alright so it wasn't actually correct. I think this would do though:
Regardless of divisibility:
if y > 2*x, then mark 1 element every p = y/x elements, x times.
if y < 2*x, then mark all, and do the previous step unmarking y-x out of y checkboxes (so like in the previous case, but x is replaced by y-x)
Note: This depends on how you define evenly distributed. You might want to change between p and p+1 elements for example to distribute them better.
Here's a straightforward solution using integer arithmetic:
void check(char boxes[], int total_count, int check_count)
{
    int i;

    for (i = 0; i < total_count; i++)
        boxes[i] = '-';

    for (i = 0; i < check_count; i++)
        boxes[i * total_count / check_count] = 'x';
}
total_count is the total number of boxes, and check_count is the number of boxes to check.
First, it sets every box to unchecked. Then, it checks check_count boxes, scaling the counter to the number of boxes.
Caveat: this is left-biased rather than right-biased like in your examples. That is, it prints x--x-- rather than --x--x. You can turn it around by replacing
boxes[i * total_count / check_count] = 'x';
with:
boxes[total_count - (i * total_count / check_count) - 1] = 'x';
Correctness
Assuming 0 <= check_count <= total_count, and that boxes has space for at least total_count items, we can prove that:
No check marks will overlap. i * total_count / check_count increments by at least one on every iteration, because total_count >= check_count.
This will not overflow the buffer. The subscript i * total_count / check_count:
will be >= 0, since i, total_count, and check_count are all >= 0;
will be < total_count, because when n > 0 and d > 0:
(n * d - 1) / d < n
In other words, if we take n * d / d and nudge the numerator down, the quotient goes down, too.
Therefore, (check_count - 1) * total_count / check_count will be less than total_count, with the assumptions made above. A division by zero won't happen because if check_count is 0, the loop in question will have zero iterations.
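For example, with total_count = 7 and check_count = 3 the checked subscripts are i * 7 / 3 for i = 0, 1, 2, i.e. 0, 2 and 4, giving the pattern x-x-x--.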
Say number of checkboxes is C and the number of Xes is N.
Your example states that having C=111 and N=11 is your most troublesome case.
Try this: divide C by N and call it D. Keep the index into the array as a double number I, and have another variable as a counter, M.
double D = (double)C / (double)N;
double I = 0.0;
int M = N;
while (M > 0) {
    if (checkboxes[Round(I)].Checked) { // if we selected it, skip to next
        I += 1.0;
        continue;
    }
    checkboxes[Round(I)].Checked = true;
    M--;
    I += D;
    if (Round(I) >= C) { // wrap around the end
        I -= C;
    }
}
Please note that Round(x) should return nearest integer value for x.
This one could work for you.
I think the key is to keep count of how many boxes you expect to have per check.
Say you want 33 checks in 100 boxes. 100 / 33 = 3.030303..., so you expect to have one check every 3.030303... boxes. That means every 3.030303... boxes, you need to add a check. 66 checks in 100 boxes would mean one check every 1.51515... boxes, 11 checks in 111 boxes would mean one check every 10.090909... boxes, and so on.
double count = 0;
double interval = boxes / (double) checks; // one check every `interval` boxes
for (int i = 0; i < boxes; i++) {
    count += 1;
    if (count >= interval) {
        checkboxes[i] = true;
        count -= interval; // e.g. 4.0 becomes 0.97 - resetting the count but keeping the fractional part to track "partial boxes" so far
    }
}
You might rather use decimal as opposed to double for count, otherwise there's a slight chance the last box will get skipped due to rounding errors.
A Bresenham-like algorithm is suitable for distributing the checkboxes evenly. An output of 'x' corresponds to a Y-coordinate change. It is possible to choose the initial err as a random value in the range [0..places) to avoid biasing.
def Distribute(places, stars):
    err = places // 2
    res = ''
    for i in range(0, places):
        err = err - stars
        if err < 0:
            res = res + 'x'
            err = err + places
        else:
            res = res + '-'
    print(res)
Distribute(24,17)
Distribute(24,12)
Distribute(24,5)
output:
x-xxx-xx-xx-xxx-xx-xxx-x
-x-x-x-x-x-x-x-x-x-x-x-x
--x----x----x---x----x--
Quick html/javascript solution:
<html>
<body>
<div id='container'></div>
<script>
var cbCount = 111;
var cbCheckCount = 11;
var cbRatio = cbCount / cbCheckCount;
var buildCheckCount = 0;
var c = document.getElementById('container');
for (var i = 1; i <= cbCount; i++) {
    // make a checkbox
    var cb = document.createElement('input');
    cb.type = 'checkbox';
    var test = i / cbRatio - buildCheckCount;
    if (test >= 1) {
        // check the checkbox we just made
        cb.checked = 'checked';
        buildCheckCount++;
    }
    c.appendChild(cb);
    c.appendChild(document.createElement('br'));
}
</script>
</body></html>
Adapt code from one question's answer or another answer from earlier this month. Set N = x = number of checkboxes and M = y = number to be checked and apply formula (N*i+N)/M - (N*i)/M for section sizes. (Also see Joey Adams' answer.)
In python, the adapted code is:
N = 100; M = 33; p = 0
for i in range(M):
    k = (N + N*i) / M
    for j in range(p, k-1): print "-",
    print "x",
    p = k
which produces
- - x - - x - - x - - x - - [...] x - - x - - - x where [...] represents 25 --x repetitions.
With M=66 the code gives
x - x x - x x - x x - x x - [...] x x - x x - x - x where [...] represents mostly xx- repetitions, with one x- in the middle.
Note, in C or java: Substitute for (i=0; i<M; ++i) in place of for i in range(M):. Substitute for (j=p; j<k-1; ++j) in place of for j in range(p,k-1):.
Correctness: Note that M = x boxes get checked because print "x", is executed M times.
What about using a Fisher–Yates shuffle?
Make an array, shuffle it, and pick the first n elements. You do not need to shuffle all of it, just the first n elements of the array. Shuffle implementations can be found in most language libraries.
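A minimal Python sketch of that partial-shuffle idea (only the first n positions are shuffled, then taken as the checked indices):
import random

def pick_random_boxes(total, n):
    # partial Fisher-Yates shuffle: return n distinct random indices out of `total`
    idx = list(range(total))
    for i in range(n):                  # only the first n positions need shuffling
        j = random.randrange(i, total)  # pick from the not-yet-fixed tail
        idx[i], idx[j] = idx[j], idx[i]
    return idx[:n]

print(sorted(pick_random_boxes(100, 33)))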