efficiently calculate locations for rectangles in a unit grid - algorithm

I'm working on a specific layout algorithm to display photos in a unit based grid. The desired behaviour is to have every photo placed in the next available space line by line.
Since there could easily be a thousand photos whose positions need to be calculated at once, efficiency is very important.
Has this problem maybe been solved with an existing algorithm already?
If not, how can I approach it to be as efficient as possible?
Edit
Regarding the positioning:
What I'm basically doing right now is iterating every line of the grid cell by cell until I find room to fit the element. That's why 4 is placed next to 2.

How about keeping a list of next available row by width? Initially the next-available-row list looks like:
(0,0,0,0,0)
When you've added the first photo, it looks like
(0,0,0,0,1)
Then
(0,0,0,2,2)
Then
(0,0,0,3,3)
Then
(1,1,1,4,4)
And the final photo doesn't change the list.
This could be efficient because you're only maintaining a small list, updating a little bit at each iteration (versus searching the entire space every time. It gets a little complicated - there could be a situation (with a tall photo) where the nominal next available row doesn't work, and then you could default to the existing approach. But overall I think this should save a fair amount of time, at the cost of a little added complexity.
Update
In response to #matteok's request for a coordinateForPhoto(width, height) method:
Let's say I called that array "nextAvailableRowByWidth".
public Coordinate coordinateForPhoto(width, height) {
int rowIndex = nextAvailableRowByWidth[width + 1]; // because arrays are zero-indexed
int[] row = space[rowIndex]
int column = findConsecutiveEmptySpace(width, row);
for (int i = 1; i < height; i++) {
if (!consecutiveEmptySpaceExists(width, space[i], column)) {
return null;
// return and fall back on the slow method, starting at rowIndex
}
}
// now either you broke out and are solving some other way,
// or your starting point is rowIndex, column. Done.
return new Coordinate(rowIndex, column);
}
Update #2
In response to #matteok's request for how to update the nextAvailableRowByWidth array:
OK, so you've just placed a new photo of height H and width W at row R. Any elements in the array which are less than R don't change (because this change didn't affect their row, so if there were 3 consecutive spaces available in the row before placing the photo, there are still 3 consecutive spaces available in it after). Every element which is in the range (R, R+H) needs to be checked, because it might have been affected. Let's postulate a method maxConsecutiveBlocksInRow() - because that's easy to write, right?
public void updateAvailableAfterPlacing(int W, int H, int R) {
for (int i = 0; i < nextAvailableRowByWidth.length; i++) {
if (nextAvailableRowByWidth[i] < R) {
continue;
}
int r = R;
while (maxConsecutiveBlocksInRow(r) < i + 1) {
r++;
}
nextAvailableRowByWidth[i] = r;
}
}
I think that should do it.

How about a matrix (your example would be 5x9) where each cell has a value representing the distance from the top left corner (for instance (row+1)*(column+1) [+1 is only necessary if your first row and value are 0]). In this matrix you look for the area which has the lowest value (when summing up the values of empty cells).
A 2nd matrix (or a 3rd dimension of the first matrix) stores the status of each cell.
edit:
int[][] grid = new int[9][5];
int[] filledRows = new int [9];
int photowidth = 2;
int photoheight = 1;
int emptyRowCounter = 0;
boolean photoFits = true;
for(int i = 0; i < grid.length; i++){
for(int m = 0; m < filledRows.length; m++){
if(filledRows[m]-(photoHeight-1) > i || filledRows[m]+(photoHeight-1) < i){
for(int j = 0; j < grid[i].length; j++){
if(grid[i][j] == 0){
for(int k = 0; k < photowidth; k++){
for(int l = 0; k < photoheight){
if(grid[i+l][j+k]!=0){
photoFits = false;
}
}
}
} else{
emptyRowCounter++;
}
}
if(photoFits){
//place Photo at i,j
}
if(emptyRowCounter == 5){
filledRows[i] = 1;
}
}
}
}

In the gif you have above, it turned out nicely that there was a photo (5) that could fit into the gap under (1) and to the left of (2). My intuition suggests we want to avoid creating gaps like that. Here is an idea that should avoid these gaps.
Maintain a list of "open regions", where an open region has a int leftBoundary, an int topBoundary, and an optional int bottomBoundary. The first open region is just the whole grid (leftBoundary:0, topBoundary: 0, bottom: null).
Sort the photos by height, breaking ties by width.
Until you have placed all photos:
Choose the tallest photo (in case of ties, choose the widest of the tallest photos). Find the first open region it can fit in (such that grid.Width - region.leftBoundary >= photo.Width). Place the photo at the top left of this region. When you place this photo, it may span the entire width or height of the region.
If it spans both the width and the height of the region, the region is filled! Remove this region from the list of open regions.
If it spans the width, but not the height, add the photo's height to the topBoundary of the region.
If it spans the height, but not the width, add the photo's width to the leftBoundary of the region.
If it does not span the height or width of the boundary, we are going to conceptually divide this region into two: one region will cover the space directly to the right of this photo (call it rightRegion), and the other region will cover the space below this region (call it belowRegion).
rightRegion = {
leftBoundary = parentRegion.leftBoundary + photo.width,
topBoundary = parentRegion.topBoundary,
bottomBoundary = parentRegion.topBoundary + photo.height
}
belowRegion = {
leftBoundary = 0,
topBoundary = parentRegion.topBoundary + photo.height,
bottomBoundary = parentRegion.bottomBoundary
}
Replace the current region in the list of open regions with rightRegion, and insert belowRegion directly after rightRegion.
You can visualize how this algorithm would work on your example: First, it would sort the photos: (2,3,4,1,5).
It considers 2, which fits into the first region (the whole grid). When it places 2 at the top left, it splits that region into the space directly to the right of 2, and the space below 2.
Then, it considers 3. It considers the open regions in turn. The first open region is to the right of 2. 3 fits there, so that's where it goes. It spans the width of the region, so the region's topBoundary gets adjusted downward.
Then, it considers 4. It again fits in the first open region, so it places 4 there. 4 spans the height of the region, so the region's leftBoundary gets adjusted rightward.
Then, 1 gets put in the 1x1 gap to the right of 4, filling its region. Finally, 5 gets put just below 2.

Related

Storing motion vectors from calculated optical flow in a practical way which enables reconstruction of subsequent frames from initial keyframes

I am trying to store the motion detected from optical flow for frames in a video sequence and then use these stored motion vectors in order to predict the already known frames using just the first frame as a reference. I am currently using two processing sketches - the first sketch draws a motion vector for every pixel grid (each of width and height 10 pixels). This is done for every frame in the video sequence. The vector is only drawn in a grid if there is sufficient motion detected. The second sketch aims to reconstruct the video frames crudely from just the initial frame of the video sequence combined with information about the motion vectors got from the first sketch.
My approach so far is as follows: I am able to determine the size, position and direction of each motion vector drawn in the first sketch from four variables. By creating four arrays (two for the motion vector's x and y coordinate and another two for its length in the x and y direction), every time a motion vector is drawn I can append each of the four variables to the arrays mentioned above. This is done for each pixel grid throughout an entire frame where the vector is drawn and for each frame in the sequence - via for loops. Once the arrays are full, I can then save them to a text file as a list of strings. I then load these strings from the text file into the second sketch, along with the first frame of the video sequence. I load the strings into variables within a while loop in the draw function and convert them back into floats. I increment a variable by one each time the draw function is called - this moves on to the next frame (I used a specific number as a separator in my text-files which appears at the end of every frame - the loop searches for this number and then increments the variable by one, thus breaking the while loop and the draw function is called again for the subsequent frame). For each frame, I can draw 10 by 10 pixel boxes and move then by the parameters got from the text files in the first sketch. My problem is simply this: How do I draw the motion of a particular frame without letting what I've have blitted to the screen in the previous frame affect what will be drawn for the next frame. My only way of getting my 10 by 10 pixel box is by using the get() function which gets pixels that are already drawn to the screen.
Apologies for the length and complexity of my question. Any tips would be very much appreciated! I will add the code for the second sketch. I can also add the first sketch if required, but it's rather long and a lot of it is not my own. Here is the second sketch:
import processing.video.*;
Movie video;
PImage [] naturalMovie = new PImage [0];
String xlengths [];
String ylengths [];
String xpositions [];
String ypositions [];
int a = 0;
int c = 0;
int d = 0;
int p;
int gs = 10;
void setup(){
size(640, 480, JAVA2D);
xlengths = loadStrings("xlengths.txt");
ylengths = loadStrings("ylengths.txt");
xpositions = loadStrings("xpositions.txt");
ypositions = loadStrings("ypositions.txt");
video = new Movie(this, "sample1.mov");
video.play();
rectMode(CENTER);
}
void movieEvent(Movie m) {
m.read();
PImage f = createImage(m.width, m.height, ARGB);
f.set(0, 0, m);
f.resize(width, height);
naturalMovie = (PImage []) append(naturalMovie, f);
println("naturalMovie length: " + naturalMovie.length);
p = naturalMovie.length - 1;
}
void draw() {
if(naturalMovie.length >= p && p > 0){
if (c == 0){
image(naturalMovie[0], 0, 0);
}
d = c;
while (c == d && c < xlengths.length){
float u, v, x0, y0;
u = float(xlengths[a]);
v = float(ylengths[a]);
x0 = float(xpositions[a]);
y0 = float(ypositions[a]);
if (u != 1.0E-19){
//stroke(255,255,255);
//line(x0,y0,x0+u,y0+v);
PImage box;
box = get(int(x0-gs/2), int(y0 - gs/2), gs, gs);
image(box, x0-gs/2 +u, y0 - gs/2 +v, gs, gs);
if (a < xlengths.length - 1){
a += 1;
}
}
else if (u == 1.0E-19){
if (a < xlengths.length - 1){
c += 1;
a += 1;
}
}
}
}
}
Word to the wise: most people aren't going to read that wall of text. Try to "dumb down" your posts so they get to the details right away, without any extra information. You'll also be better off if you post an MCVE instead of only giving us half your code. Note that this does not mean posting your entire project. Instead, start over with a blank sketch and only create the most basic code required to show the problem. Don't include any of your movie logic, and hardcode as much as possible. We should be able to copy and paste your code onto our own machines to run it and see the problem.
All of that being said, I think I understand what you're asking.
How do I draw the motion of a particular frame without letting what I've have blitted to the screen in the previous frame affect what will be drawn for the next frame. My only way of getting my 10 by 10 pixel box is by using the get() function which gets pixels that are already drawn to the screen.
Separate your program into a view and a model. Right now you're using the screen (the view) to store all of your information, which is going to cause you headaches. Instead, store the state of your program into a set of variables (the model). For you, this might just be a bunch of PVector instances.
Let's say I have an ArrayList<PVector> that holds the current position of all of my vectors:
ArrayList<PVector> currentPositions = new ArrayList<PVector>();
void setup() {
size(500, 500);
for (int i = 0; i < 100; i++) {
currentPositions.add(new PVector(random(width), random(height)));
}
}
void draw(){
background(0);
for(PVector vector : currentPositions){
ellipse(vector.x, vector.y, 10, 10);
}
}
Notice that I'm just hardcoding their positions to be random. This is what your MCVE should do as well. And then in the draw() function, I'm simply drawing each vector. This is like drawing a single frame for you.
Now that we have that, we can create a nextFrame() function that moves the vectors based on the ArrayList (our model) and not what's drawn on the screen!
void nextFrame(){
for(PVector vector : currentPositions){
vector.x += random(-2, 2);
vector.y += random(-2, 2);
}
}
Again, I'm just hardcoding a random movement, but you would be reading these from your file. Then we just call the nextFrame() function as the last line in the draw() function:
If you're still having trouble, I highly recommend posting an MCVE similar to mine and posting a new question. Good luck.

Infinite Blue Noise

I am looking for an algorithm which produces a point placement result similar to blue noise.
However, it needs to work for an infinite plane. Where a bounding box is given, and it returns the locations of all points which fall inside. Any help would be appreciated. I've done a lot of research, and found nothing that suits my needs.
Finally I've managed to get results.
One way of generating point distributions with blue noise properties
is by means of a Poisson-Disk Distribution
Following the algorithm from the paper Fast Poisson disk sampling in
arbitrary dimensions, Robert Bridson I've got:
The steps mentioned in the paper are:
Step 0. Initialize an n-dimensional background grid for storing
samples and accelerating spatial searches. We pick the cell size to be
bounded by r/sqrt(n), so that each grid cell will contain at most one
sample, and thus the grid can be implemented as a simple n-dimensional
array of integers: the default −1 indicates no sample, a non-negative
integer gives the index of the sample located in a cell
Step 1. Select the initial sample, x0, randomly chosen uniformly from
the domain. Insert it into the background grid, and initialize the
“active list” (an array of sample indices) with this index (zero).
Step 2. While the active list is not empty, choose a random index from
it (say i). Generate up to k points chosen uniformly from the
spherical annulus between radius r and 2r around xi. For each point in
turn, check if it is within distance r of existing samples (using the
background grid to only test nearby samples). If a point is adequately
far from existing samples, emit it as the next sample and add it to
the active list. If after k attempts no such point is found, instead
remove i from the active list.
Note that for simplicity I've skipped step 0. Despite that the run-time is still reasonable. It's < .5s. Implementing this step would most definitely increase the performance.
Here's a sample code in Processing. It's a language built on top of Java so the syntax is very similar. Translating it for your purposes shouldn't be hard.
import java.util.List;
import java.util.Collections;
List<PVector> poisson_disk_sampling(int k, int r, int size)
{
List<PVector> samples = new ArrayList<PVector>();
List<PVector> active_list = new ArrayList<PVector>();
active_list.add(new PVector(random(size), random(size)));
int len;
while ((len = active_list.size()) > 0) {
// picks random index uniformly at random from the active list
int index = int(random(len));
Collections.swap(active_list, len-1, index);
PVector sample = active_list.get(len-1);
boolean found = false;
for (int i = 0; i < k; ++i) {
// generates a point uniformly at random in the sample's
// disk situated at a distance from r to 2*r
float angle = 2*PI*random(1);
float radius = random(r) + r;
PVector dv = new PVector(radius*cos(angle), radius*sin(angle));
PVector new_sample = dv.add(sample);
boolean ok = true;
for (int j = 0; j < samples.size(); ++j) {
if (dist(new_sample.x, new_sample.y,
samples.get(j).x, samples.get(j).y) <= r)
{
ok = false;
break;
}
}
if (ok) {
if (0 <= new_sample.x && new_sample.x < size &&
0 <= new_sample.y && new_sample.y < size)
{
samples.add(new_sample);
active_list.add(new_sample);
len++;
found = true;
}
}
}
if (!found)
active_list.remove(active_list.size()-1);
}
return samples;
}
List<PVector> samples;
void setup() {
int SIZE = 500;
size(500, 500);
background(255);
strokeWeight(4);
noLoop();
samples = poisson_disk_sampling(30, 10, SIZE);
}
void draw() {
for (PVector sample : samples)
point(sample.x, sample.y);
}
However, it needs to work for an infinite plane.
You control the size of the box with the parameter size. r controls the relative distance between the points. k controls how many new sample should you try before rejecting the current. The paper suggests k=30.

Algorithm for finding a bounded image

what would be the most effective and efficient algorithm for finding a solid-color bounded image (an image within a border, for example) given a one-dimensional array of pixel values and a threshold?
I thought of a couple.
For example:
Start at the halfway point of the image dimensions e.g. width / 2 height / 2.
loop through pixels until you hit a pixel not in your threshold. Do this for all four sides and extract dimensions from the indexes.
The problem with this algorithm is if you are given an image that is, for example, only bounded on the right side, and its width is less than half of the containing image... then this wouldn't work.
public static Rect GetBounded(this WriteableBitmap wb, int aRGBThreshold)
{
int[] pixels = wb.Pixels;
int width = wb.PixelWidth;
int height = wb.PixelHeight;
int leftIndex = (height / 2) * width;
int topIndex = width / 2;
int rightIndex = (width * (height / 2 + 1)) - 1;
int bottomIndex = width * height - (width / 2);
int left = 0, top = 0, right = 0, bottom = 0;
int i;
for (i = leftIndex; i <= rightIndex; i++)
{
if (pixels[i] < aRGBThreshold)
break;
left++;
}
for (i = topIndex; i <= bottomIndex; i += width)
{
if (pixels[i] < aRGBThreshold)
break;
top++;
}
for (i = rightIndex; i >= leftIndex; i--)
{
if (pixels[i] < aRGBThreshold)
break;
right++;
}
for (i = bottomIndex; i >= topIndex; i -= width)
{
if (pixels[i] < aRGBThreshold)
break;
bottom++;
}
return new Rect(left, top, width - right - left, height - bottom - top);
}
public static Rect GetBounded(this WriteableBitmap wb, int aThreshold, int rThreshold, int gThreshold, int bThreshold)
{
int argbthreshold = (aThreshold << 24) + (rThreshold << 16) + (gThreshold << 8) + bThreshold;
return wb.GetBounded(argbthreshold);
}
In the case you are looking for a rectangle (as your approach and code suggest), your approach is good. You could improve it by doing a binary search instead of a linear one to find the first and last object points in a row or column. This is similar to the c++ functions std::lower_bound and std::upper_bound (see http://en.cppreference.com/w/cpp/algorithm). This should be faster if your rectangles are far away from the image boundaries.
If the object can have any shape but its components are connected, probably it would be better to find a single pixel that lies in the object and do flood fill later.
If the object can have any shape and does not need to be connected, you have to traverse the whole image and keep the minimum and maximum row and column where the pixel exceeds the threshold. I think it would be enough to scan rows only, from left until you find an object pixel and from right later. If the image is stored in row-major order, it is more efficient to scan rows. If it is in column-major order, scan columns.

Is there an algorithm to determine if one or two images can be presented on screen? (singe page vs double page)

Given a known number of images with the same width to height ratio and dimensions, is there an algorithm that determines the best way to present them in a screen whose resolution may vary? Aka arrange them single page or double page.
For example, to determine if you can present two images on the screen or, given the width/height, to "detect" that only one would look better. To see if it's better to have them fit on width, with some blank space above and/or below them, or to fit on height, with blank space on the left/right.
I have made some attempts to determine such an algorithm of my own, but I'm not completely satisfied and I was hoping there might be a better solution or maybe some advices.
Unfortunately, it's not as simple as "width > height => two images otherwise one image", as I found out.
Summed up, right now I do all my calculations based on the image height first and then check if the screen width's is larger than 1.5 x my image width. Larger means I decrease the height to have two images fit, smaller means I only present one image. Still, I keep getting some undesired results for certain combinations of image sizes / screen resolutions.
If you stumbled upon the same problem and have some code or tips to spare, it would be greatly appreciated.
P.S. As you may figure out, this is about presenting a magazine.
[edit] I forgot to mention that I'm using javascript (vanilla, no plugins/frameworks) for the coding
You can use a kd-tree algorithm. A kd-tree hierarchical subdivide the screen into rectangles and uses a data structure similar to a binary tree to store the information. It's uses in many application for example to show the disk space. The Jquery plugin Masonry can do the same. I would also take a look at the Jquery plugin treemap.
To answer my own question, it seems there isn't such an algorithm or at least I wasn't able to find one. The kd-tree suggested is a waaaaaaaaaaaaay too complicated overkill solution to be practical.
In the end, I settled for the one I came up with myself:
first I check the window height, to see if the largest image fits. If it doesn't, I check if a certain percentage (75%) does. If it's still to large, I downsize the heights
once I determine a height that fits, I check if two images fit in the total width. If they do, this means I have a double-page mode, otherwise is single page mode
Here is a vanilla javascript implementation
var selCase = 0; //an "ok" var
//pageH = array with page heights
//pageW = array with page widths
//currentWH = window height
//currentWW = windowWidth
for(i = pageH.length - 1; i >= 0; i--) {
if(!selCase) {
if(pageH[i] <= currentWH) { //if the page height is lower than the total height
if(pageW[i] * 2 <= currentWW) { //if we have room for double pages
currentMode = 'L';
currentPW = pageW[i];
currentPH = pageH[i];
selCase = 11;
}else {
if(currentWW > currentWH) { //if the width is bigger, double-page mode is prefered
if((pageW[i] * 1.5) >= (currentWW - 20)) { //there wouldn't be much space left for double-pages
currentMode = 'P';
currentPW = pageW[i];
currentPH = pageH[i];
selCase = 12;
} // else: the height fits, but not the width so a smaller height is required
}else { //standard portrait mode
if(pageW[i] <= (currentWW - 20)) {
currentMode = 'P';
currentPW = pageW[i];
currentPH = pageH[i];
selCase = 13;
}
}
}
}else { //we check to see if maybe we can shrink the page a little to fit in the total height
var sPerc; //scale percentage
sPerc = currentWH * 100 / pageH[i];
if(sPerc >= perc) {
var newW = pageW[i] * sPerc / 100;
if(newW * 2 <= (currentWW - 20)) { //if we have room for two also-scaled pages
currentMode = 'L';
currentPW = Math.floor(newW);
currentPH = currentWH;
selCase = 21;
}else {
if(currentWW > currentWH) { //same as before
if((newW * 1.5) >= (currentWW - 20)) { //only one scaled page fits
currentMode = 'P';
currentPW = Math.floor(newW);
currentPH = currentWH;
selCase = 22;
}
}else {
if(newW < (currentWW - 20)) {
currentMode = 'P';
currentPW = Math.floor(newW);
currentPH = currentWH;
selCase = 23;
}
}
}
}
}
}
}

Finding closest non-black pixel in an image fast

I have a 2D image randomly and sparsely scattered with pixels.
given a point on the image, I need to find the distance to the closest pixel that is not in the background color (black).
What is the fastest way to do this?
The only method I could come up with is building a kd-tree for the pixels. but I would really want to avoid such expensive preprocessing. also, it seems that a kd-tree gives me more than I need. I only need the distance to something and I don't care about what this something is.
Personally, I'd ignore MusiGenesis' suggestion of a lookup table.
Calculating the distance between pixels is not expensive, particularly as for this initial test you don't need the actual distance so there's no need to take the square root. You can work with distance^2, i.e:
r^2 = dx^2 + dy^2
Also, if you're going outwards one pixel at a time remember that:
(n + 1)^2 = n^2 + 2n + 1
or if nx is the current value and ox is the previous value:
nx^2 = ox^2 + 2ox + 1
= ox^2 + 2(nx - 1) + 1
= ox^2 + 2nx - 1
=> nx^2 += 2nx - 1
It's easy to see how this works:
1^2 = 0 + 2*1 - 1 = 1
2^2 = 1 + 2*2 - 1 = 4
3^2 = 4 + 2*3 - 1 = 9
4^2 = 9 + 2*4 - 1 = 16
5^2 = 16 + 2*5 - 1 = 25
etc...
So, in each iteration you therefore need only retain some intermediate variables thus:
int dx2 = 0, dy2, r2;
for (dx = 1; dx < w; ++dx) { // ignoring bounds checks
dx2 += (dx << 1) - 1;
dy2 = 0;
for (dy = 1; dy < h; ++dy) {
dy2 += (dy << 1) - 1;
r2 = dx2 + dy2;
// do tests here
}
}
Tada! r^2 calculation with only bit shifts, adds and subtracts :)
Of course, on any decent modern CPU calculating r^2 = dx*dx + dy*dy might be just as fast as this...
As Pyro says, search the perimeter of a square that you keep moving out one pixel at a time from your original point (i.e. increasing the width and height by two pixels at a time). When you hit a non-black pixel, you calculate the distance (this is your first expensive calculation) and then continue searching outwards until the width of your box is twice the distance to the first found point (any points beyond this cannot possibly be closer than your original found pixel). Save any non-black points you find during this part, and then calculate each of their distances to see if any of them are closer than your original point.
In an ideal find, you only have to make one expensive distance calculation.
Update: Because you're calculating pixel-to-pixel distances here (instead of arbitrary precision floating point locations), you can speed up this algorithm substantially by using a pre-calculated lookup table (just a height-by-width array) to give you distance as a function of x and y. A 100x100 array costs you essentially 40K of memory and covers a 200x200 square around the original point, and spares you the cost of doing an expensive distance calculation (whether Pythagorean or matrix algebra) for every colored pixel you find. This array could even be pre-calculated and embedded in your app as a resource, to spare you the initial calculation time (this is probably serious overkill).
Update 2: Also, there are ways to optimize searching the square perimeter. Your search should start at the four points that intersect the axes and move one pixel at a time towards the corners (you have 8 moving search points, which could easily make this more trouble than it's worth, depending on your application's requirements). As soon as you locate a colored pixel, there is no need to continue towards the corners, as the remaining points are all further from the origin.
After the first found pixel, you can further restrict the additional search area required to the minimum by using the lookup table to ensure that each searched point is closer than the found point (again starting at the axes, and stopping when the distance limit is reached). This second optimization would probably be much too expensive to employ if you had to calculate each distance on the fly.
If the nearest pixel is within the 200x200 box (or whatever size works for your data), you will only search within a circle bounded by the pixel, doing only lookups and <>comparisons.
You didn't specify how you want to measure distance. I'll assume L1 (rectilinear) because it's easier; possibly these ideas could be modified for L2 (Euclidean).
If you're only doing this for relatively few pixels, then just search outward from the source pixel in a spiral until you hit a nonblack one.
If you're doing this for many/all of them, how about this: Build a 2-D array the size of the image, where each cell stores the distance to the nearest nonblack pixel (and if necessary, the coordinates of that pixel). Do four line sweeps: left to right, right to left, bottom to top, and top to bottom. Consider the left to right sweep; as you sweep, keep a 1-D column containing the last nonblack pixel seen in each row, and mark each cell in the 2-D array with the distance to and/or coordinates of that pixel. O(n^2).
Alternatively, a k-d tree is overkill; you could use a quadtree. Only a little more difficult to code than my line sweep, a little more memory (but less than twice as much), and possibly faster.
Search "Nearest neighbor search", first two links in Google should help you.
If you are only doing this for 1 pixel per image, I think your best bet is just a linear search, 1 pixel width box at time outwards. You can't take the first point you find, if your search box is square. You have to be careful
Yes, the Nearest neighbor search is good, but does not guarantee you'll find the 'nearest'. Moving one pixel out each time will produce a square search - the diagonals will be farther away than the horizontal / vertical. If this is important, you'll want to verify - continue expanding until the absolute horizontal has a distance greater than the 'found' pixel, and then calculate distances on all non-black pixels that were located.
Ok, this sounds interesting.
I made a c++ version of a soulution, I don't know if this helps you. I think it works fast enough as it's almost instant on a 800*600 matrix. If you have any questions just ask.
Sorry for any mistakes I've made, it's a 10min code...
This is a iterative version (I was planing on making a recursive one too, but I've changed my mind).
The algorithm could be improved by not adding any point to the points array that is to a larger distance from the starting point then the min_dist, but this involves calculating for each pixel (despite it's color) the distance from the starting point.
Hope that helps
//(c++ version)
#include<iostream>
#include<cmath>
#include<ctime>
using namespace std;
//ITERATIVE VERSION
//picture witdh&height
#define width 800
#define height 600
//indexex
int i,j;
//initial point coordinates
int x,y;
//variables to work with the array
int p,u;
//minimum dist
double min_dist=2000000000;
//array for memorising the points added
struct point{
int x;
int y;
} points[width*height];
double dist;
bool viz[width][height];
// direction vectors, used for adding adjacent points in the "points" array.
int dx[8]={1,1,0,-1,-1,-1,0,1};
int dy[8]={0,1,1,1,0,-1,-1,-1};
int k,nX,nY;
//we will generate an image with white&black pixels (0&1)
bool image[width-1][height-1];
int main(){
srand(time(0));
//generate the random pic
for(i=1;i<=width-1;i++)
for(j=1;j<=height-1;j++)
if(rand()%10001<=9999) //9999/10000 chances of generating a black pixel
image[i][j]=0;
else image[i][j]=1;
//random coordinates for starting x&y
x=rand()%width;
y=rand()%height;
p=1;u=1;
points[1].x=x;
points[1].y=y;
while(p<=u){
for(k=0;k<=7;k++){
nX=points[p].x+dx[k];
nY=points[p].y+dy[k];
//nX&nY are the coordinates for the next point
//if we haven't added the point yet
//also check if the point is valid
if(nX>0&&nY>0&&nX<width&&nY<height)
if(viz[nX][nY] == 0 ){
//mark it as added
viz[nX][nY]=1;
//add it in the array
u++;
points[u].x=nX;
points[u].y=nY;
//if it's not black
if(image[nX][nY]!=0){
//calculate the distance
dist=(x-nX)*(x-nX) + (y-nY)*(y-nY);
dist=sqrt(dist);
//if the dist is shorter than the minimum, we save it
if(dist<min_dist)
min_dist=dist;
//you could save the coordinates of the point that has
//the minimum distance too, like sX=nX;, sY=nY;
}
}
}
p++;
}
cout<<"Minimum dist:"<<min_dist<<"\n";
return 0;
}
I'm sure this could be done better but here's some code that searches the perimeter of a square around the centre pixel, examining the centre first and moving toward the corners. If a pixel isn't found the perimeter (radius) is expanded until either the radius limit is reached or a pixel is found. The first implementation was a loop doing a simple spiral around the centre point but as noted that doesn't find the absolute closest pixel. SomeBigObjCStruct's creation inside the loop was very slow - removing it from the loop made it good enough and the spiral approach is what got used. But here's this implementation anyway - beware, little to no testing done.
It is all done with integer addition and subtraction.
- (SomeBigObjCStruct *)nearestWalkablePoint:(SomeBigObjCStruct)point {
typedef struct _testPoint { // using the IYMapPoint object here is very slow
int x;
int y;
} testPoint;
// see if the point supplied is walkable
testPoint centre;
centre.x = point.x;
centre.y = point.y;
NSMutableData *map = [self getWalkingMapDataForLevelId:point.levelId];
// check point for walkable (case radius = 0)
if(testThePoint(centre.x, centre.y, map) != 0) // bullseye
return point;
// radius is the distance from the location of point. A square is checked on each iteration, radius units from point.
// The point with y=0 or x=0 distance is checked first, i.e. the centre of the side of the square. A cursor variable
// is used to move along the side of the square looking for a walkable point. This proceeds until a walkable point
// is found or the side is exhausted. Sides are checked until radius is exhausted at which point the search fails.
int radius = 1;
BOOL leftWithinMap = YES, rightWithinMap = YES, upWithinMap = YES, downWithinMap = YES;
testPoint leftCentre, upCentre, rightCentre, downCentre;
testPoint leftUp, leftDown, rightUp, rightDown;
testPoint upLeft, upRight, downLeft, downRight;
leftCentre = rightCentre = upCentre = downCentre = centre;
int foundX = -1;
int foundY = -1;
while(radius < 1000) {
// radius increases. move centres outward
if(leftWithinMap == YES) {
leftCentre.x -= 1; // move left
if(leftCentre.x < 0) {
leftWithinMap = NO;
}
}
if(rightWithinMap == YES) {
rightCentre.x += 1; // move right
if(!(rightCentre.x < kIYMapWidth)) {
rightWithinMap = NO;
}
}
if(upWithinMap == YES) {
upCentre.y -= 1; // move up
if(upCentre.y < 0) {
upWithinMap = NO;
}
}
if(downWithinMap == YES) {
downCentre.y += 1; // move down
if(!(downCentre.y < kIYMapHeight)) {
downWithinMap = NO;
}
}
// set up cursor values for checking along the sides of the square
leftUp = leftDown = leftCentre;
leftUp.y -= 1;
leftDown.y += 1;
rightUp = rightDown = rightCentre;
rightUp.y -= 1;
rightDown.y += 1;
upRight = upLeft = upCentre;
upRight.x += 1;
upLeft.x -= 1;
downRight = downLeft = downCentre;
downRight.x += 1;
downLeft.x -= 1;
// check centres
if(testThePoint(leftCentre.x, leftCentre.y, map) != 0) {
foundX = leftCentre.x;
foundY = leftCentre.y;
break;
}
if(testThePoint(rightCentre.x, rightCentre.y, map) != 0) {
foundX = rightCentre.x;
foundY = rightCentre.y;
break;
}
if(testThePoint(upCentre.x, upCentre.y, map) != 0) {
foundX = upCentre.x;
foundY = upCentre.y;
break;
}
if(testThePoint(downCentre.x, downCentre.y, map) != 0) {
foundX = downCentre.x;
foundY = downCentre.y;
break;
}
int i;
for(i = 0; i < radius; i++) {
if(leftWithinMap == YES) {
// LEFT Side - stop short of top/bottom rows because up/down horizontal cursors check that line
// if cursor position is within map
if(i < radius - 1) {
if(leftUp.y > 0) {
// check it
if(testThePoint(leftUp.x, leftUp.y, map) != 0) {
foundX = leftUp.x;
foundY = leftUp.y;
break;
}
leftUp.y -= 1; // moving up
}
if(leftDown.y < kIYMapHeight) {
// check it
if(testThePoint(leftDown.x, leftDown.y, map) != 0) {
foundX = leftDown.x;
foundY = leftDown.y;
break;
}
leftDown.y += 1; // moving down
}
}
}
if(rightWithinMap == YES) {
// RIGHT Side
if(i < radius - 1) {
if(rightUp.y > 0) {
if(testThePoint(rightUp.x, rightUp.y, map) != 0) {
foundX = rightUp.x;
foundY = rightUp.y;
break;
}
rightUp.y -= 1; // moving up
}
if(rightDown.y < kIYMapHeight) {
if(testThePoint(rightDown.x, rightDown.y, map) != 0) {
foundX = rightDown.x;
foundY = rightDown.y;
break;
}
rightDown.y += 1; // moving down
}
}
}
if(upWithinMap == YES) {
// UP Side
if(upRight.x < kIYMapWidth) {
if(testThePoint(upRight.x, upRight.y, map) != 0) {
foundX = upRight.x;
foundY = upRight.y;
break;
}
upRight.x += 1; // moving right
}
if(upLeft.x > 0) {
if(testThePoint(upLeft.x, upLeft.y, map) != 0) {
foundX = upLeft.x;
foundY = upLeft.y;
break;
}
upLeft.y -= 1; // moving left
}
}
if(downWithinMap == YES) {
// DOWN Side
if(downRight.x < kIYMapWidth) {
if(testThePoint(downRight.x, downRight.y, map) != 0) {
foundX = downRight.x;
foundY = downRight.y;
break;
}
downRight.x += 1; // moving right
}
if(downLeft.x > 0) {
if(testThePoint(upLeft.x, upLeft.y, map) != 0) {
foundX = downLeft.x;
foundY = downLeft.y;
break;
}
downLeft.y -= 1; // moving left
}
}
}
if(foundX != -1 && foundY != -1) {
break;
}
radius++;
}
// build the return object
if(foundX != -1 && foundY != -1) {
SomeBigObjCStruct *foundPoint = [SomeBigObjCStruct mapPointWithX:foundX Y:foundY levelId:point.levelId];
foundPoint.z = [self zWithLevelId:point.levelId];
return foundPoint;
}
return nil;
}
You can combine many ways to speed it up.
A way to accelerate the pixel lookup is to use what I call a spatial lookup map. It is basically a downsampled map (say of 8x8 pixels, but its a tradeoff) of the pixels in that block. Values can be "no pixels set" "partial pixels set" "all pixels set". This way one read can tell if a block/cell is either full, partially full or empty.
scanning a box/rectangle around the center may not be ideal because there are many pixels/cells which are far far away. I use a circle drawing algorithm (Bresenham) to reduce the overhead.
reading the raw pixel values can happen in horizontal batches, for example a byte (for a cell size of 8x8 or multiples of it), dword or long. This should give you a serious speedup again.
you can also use multiple levels of "spatial lookup maps", its again a tradeoff.
For the distance calculatation the mentioned lookup table can be used, but its a (cache)bandwidth vs calculation speed tradeoff (I dunno how it performs on a GPU for example).
Another approach I have investigated and likely will stick to: Utilizing the Bresenham circle algorithm.
It is surprisingly fast as it saves you any sort of distance comparisons!
You effectively just draw bigger and bigger circles around your target point so that when the first time you encounter a non-black pixel you automatically know it is the closest, saving any further checks.
What I have not verified yet is whether the bresenham circle will catch every single pixel but that wasn't a concern for my case as my pixels will occur in blobs of some sort.
I would do a simple lookup table - for every pixel, precalculate distance to the closest non-black pixel and store the value in the same offset as the corresponding pixel. Of course, this way you will need more memory.

Resources