All paint programs, independent of how simple or complex they are, come with a fill tool. This basically replaces the color of a closed region with another color. I know that there are different APIs to do this, but I am interested in the algorithm. What would be an efficient algorithm to implement this tool?
A couple of things I can think of quickly are:
Convert image into a binary map, where pixels in the color to be replaced are 1 and all other colors are 0.
Find a closed region around the point you want to change such that all the pixels inside are 1 and all the neighbouring pixels are 0.
Many implementations are written as a recursive divide-and-conquer algorithm. A quick Google search for "flood fill algorithm" turns up plenty of resources, including the excellent Wikipedia page on the topic.
The flood fill algorithm is what's most commonly used. The following is a naive version of it, straight out of my old university textbook, "Computer Graphics with OpenGL" by Hearn and Baker, 3rd ed.:
void floodFill4 (int x, int y, int fillColor, int interiorColor)
{
    int color;
    /* Set current color to fillColor, then perform the following operations */
    getPixel(x, y, color);
    if (color == interiorColor)
    {
        setPixel(x,y); // Set color of pixel to fillColor.
        floodFill4(x + 1, y, fillColor, interiorColor);
        floodFill4(x - 1, y, fillColor, interiorColor);
        floodFill4(x, y + 1, fillColor, interiorColor);
        floodFill4(x, y - 1, fillColor, interiorColor);
    }
}
For large images, however, the above will probably overflow the stack, because it recurses once for every pixel. Often, this algorithm is modified so that it uses iteration when filling a row of pixels, then recursively fills the rows above and below. As @kasperjj stated, Wikipedia has a good article about this.
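The row-based variant described above might look roughly like the sketch below. It is not from the textbook: the Image type and the name scanlineFill are assumptions made for illustration, and an explicit stack stands in for the recursion into the rows above and below.

```cpp
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical image type: img[y][x] holds a color value.
using Image = std::vector<std::vector<int>>;

// Fills the 4-connected region of interiorColor that contains (x, y).
// Each row run is filled with a loop; the rows above and below are
// queued on an explicit stack instead of being reached by recursion.
void scanlineFill(Image& img, int x, int y, int fillColor, int interiorColor)
{
    if (img.empty() || img[0].empty()) return;
    const int h = static_cast<int>(img.size());
    const int w = static_cast<int>(img[0].size());
    if (x < 0 || x >= w || y < 0 || y >= h) return;
    if (fillColor == interiorColor || img[y][x] != interiorColor) return;

    std::vector<std::pair<int, int>> stack;
    stack.push_back({x, y});
    while (!stack.empty()) {
        const int sx = stack.back().first;
        const int sy = stack.back().second;
        stack.pop_back();
        if (img[sy][sx] != interiorColor) continue;  // filled in the meantime

        // Walk left to the start of the run, then fill it rightwards.
        int left = sx;
        while (left > 0 && img[sy][left - 1] == interiorColor) --left;
        int right = left;
        while (right < w && img[sy][right] == interiorColor) {
            img[sy][right] = fillColor;
            ++right;
        }
        // Push one seed for every run of interior pixels in the adjacent rows.
        for (int ny : {sy - 1, sy + 1}) {
            if (ny < 0 || ny >= h) continue;
            for (int cx = left; cx < right; ++cx) {
                const bool startsRun =
                    cx == left || img[ny][cx - 1] != interiorColor;
                if (img[ny][cx] == interiorColor && startsRun)
                    stack.push_back({cx, ny});
            }
        }
    }
}
```

The stack holds at most one entry per run of pixels rather than one call frame per pixel, which is what keeps large images from blowing the call stack.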
These kinds of algorithms are discussed in detail in Computer Graphics: Principles and Practice. I highly recommend this book if you're interested in understanding how to rasterize lines, fill polygons, and write 3D code without the benefit of the DirectX or OpenGL APIs. Of course, for real-world applications you'll probably want to use existing libraries, but if you're curious about how these libraries work, this is an awesome read.
If you want a time-efficient algorithm that doesn't care much about memory efficiency, you can do it by:
1) keeping a boolean record of which cells you have already visited: Vis[]
2) keeping a list of points you have already visited but whose neighbours you have not yet processed: Busy[]
3) starting both of those as empty
4) marking your start point as visited in Vis, updating its colour, and adding it to Busy
5) looping as follows:
while you have a point P in Busy:
{
    for each neighbour N of the point P for which Vis[N] is still false
    {
        if appropriate (not crossing the boundary of the fill region)
        {
            set Vis[N] to true
            update the colour of N in the bitmap
            add N to the end of Busy[]
        }
    }
    remove P from Busy[]
}
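The steps above could be sketched in C++ as follows. The names Image and queueFill are assumptions made for illustration, "appropriate" is taken to mean "has the same colour as the start pixel", and the start point is marked and coloured before the loop begins.

```cpp
#include <deque>
#include <utility>
#include <vector>

// Hypothetical image type: img[y][x] holds a colour value.
using Image = std::vector<std::vector<int>>;

// Breadth-first fill with a visited map (Vis) and a work queue (Busy).
// Returns the number of pixels that were recoloured.
int queueFill(Image& img, int x, int y, int fillColor)
{
    if (img.empty() || img[0].empty()) return 0;
    const int h = static_cast<int>(img.size());
    const int w = static_cast<int>(img[0].size());
    if (x < 0 || x >= w || y < 0 || y >= h) return 0;

    const int target = img[y][x];  // colour being replaced
    std::vector<std::vector<bool>> vis(h, std::vector<bool>(w, false));
    std::deque<std::pair<int, int>> busy;

    vis[y][x] = true;
    img[y][x] = fillColor;
    busy.push_back({x, y});
    int filled = 1;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!busy.empty()) {
        const std::pair<int, int> p = busy.front();
        busy.pop_front();  // remove P once, after taking it from the queue
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k];
            const int ny = p.second + dy[k];
            if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
            if (vis[ny][nx] || img[ny][nx] != target) continue;
            vis[ny][nx] = true;
            img[ny][nx] = fillColor;
            busy.push_back({nx, ny});
            ++filled;
        }
    }
    return filled;
}
```

Because Vis is kept separately from the bitmap, this also behaves sensibly when the fill colour equals the colour being replaced.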
Also read about connected component labelling (see the Wikipedia article). It is an efficient way to find connected pixels while visiting every pixel only twice.
The advantage of this is that the pixel values don't necessarily have to be the same: the function that decides whether pixels are connected could be based on something other than the raw value, such as the gradient.
The general idea is known as the flood fill algorithm, and there are various modifications of it. A common one is scanline fill. See the related question "How Scanline based 2d rendering engines works?"
What I'm looking for
I have 300 or fewer discs of equal radius on a plane. At time 0 each disc is at a position. At time 1 each disc is at a potentially different position. I'm looking to generate a 2D path for each disc for times between 0 and 1 such that the discs do not intersect and the paths are relatively efficient (short) and of low curvature if possible. (for example, straight lines are preferable to squiggly lines)
Lower computation time is generally more important than exactness of solution. (for example, a little intersection is okay, and I don't necessarily need an optimal result)
However, discs shouldn't teleport through each other, stop or slow abruptly, or change direction abruptly -- the "smoother" the better. The only exceptions are times 0 and 1.
Paths can be expressed in a sampled form or piecewise linear nature (or better) -- I'm not worried about having truly smooth paths via splines. (I can approximate that if I so need.)
What I've tried
You can see a demo of my best attempt (via Javascript + WebGL). Be warned, it will load slowly on older computers due to the computations involved. It appears to work in Firefox/Chrome/IE11 under Windows.
In this demo I've represented each disc as an "elastic band" in 3D (that is, each disc has a position at each time) and ran a simple game-style physics engine that resolves constraints and treats each point in time like a mass with springs to the previous/next time. ('Time' in this case is just the third dimension.)
This actually works pretty well for small N (<20), but in common test cases (for example, start with discs arranged in a circle and move each disc to the opposite point on the circle) it fails to generate convincing paths, since the constraints and elasticity propagate slowly through the springs. (For example, if I slice time into 100 discrete levels, tension in the elastic bands only propagates one level per simulation cycle.) This means good solutions require many (>10000) iterations, and that is tediously slow for my application. It also fails to reasonably resolve many N>40 cases, but this may be simply because I can't feasibly run enough iterations.
What else I've tried
My initial attempt was a hill-climber that started with straight-line paths which were gradually mutated. Solutions which measured better than the currently best solution replaced the currently best solution. Better measurements resulted from the amount of intersection (that is, completely overlapping measured worse than just grazing) and the length of the paths (shorter paths were better).
This produced some surprisingly good results, but unreliably, likely getting stuck in local minima very often. It was extremely slow for N>20. I tried applying a few techniques (simulated annealing, a genetic algorithms approach, etc) in an attempt to get around the local minima issue, but I never had much success.
What I'm trying
I'm optimizing the "elastic band" model so that tension and constraints propagate much more quickly in the time dimension. This would save a good number of iterations in many cases; however, in highly constrained scenarios (for example, many discs trying to cross the same location) an untenable number of iterations would still be required. I'm no expert on how to solve constraints or propagate springs more quickly (I've tried reading a few papers on non-stretchable cloth simulation, but I haven't been able to figure out whether they apply), so I'd be interested to know whether there's a good way to go about this.
Ideas on the table
Spektre has implemented a very fast RTS-style unit movement algorithm that works admirably well. It's fast and elegant, however it suffers from RTS-movement style problems: sudden direction changes, units can stop abruptly to resolve collisions. Additionally, units do not all arrive at their destination at the same time, which is essentially an abrupt stop. This may be a good heuristic to make viable non-smooth paths after which the paths could be resampled in time and a "smoothing" algorithm could be run (much like the one used in my demo.)
Ashkan Kzme has suggested that the problem may be related to network flows. It would appear that the minimum cost flow problem could work, as long as space and time can be discretized in a reasonable manner and the running times can be kept down. The advantage here is that it's a well-studied set of problems, but sudden velocity changes would still be an issue and some sort of "smoothing" post-step may be desirable. The stumbling block I currently have is deciding on a network representation of space-time that wouldn't result in discs teleporting through each other.
Jay Kominek posted an answer that uses a nonlinear optimizer to optimize quadratic Bezier curves with some promising results.
I played with this a bit for fun, and here is the result:
Algorithm:
process each disc:
    set the speed to a*(destination - position), where a is a multiplicative constant
    then limit the speed to a constant v
    test whether the new iterated position conflicts with any other disc
    if it does, rotate the speed in one direction by some angle step ang
    loop until a free direction is found or the full circle is covered
    if no free direction is found, mark the disc as stuck
This is how it looks for the circle to inverse circle path:
This is how it looks for the random to random path:
Stuck discs are yellow (none in these cases), and discs that are not moving are already at their destination. This can also get stuck if there is no path, for example if discs already at their destinations encircle another disc's destination. To avoid that, you also need to move the colliding disc. You can play with the ang, a and v constants to get a different appearance, and you could also try a random direction of angle rotation to avoid the swirling/twister movement.
Here the source code I used (C++):
//---------------------------------------------------------------------------
const int discs =23; // number of discs
const double disc_r=5; // disc radius
const double disc_dd=4.0*disc_r*disc_r;
struct _disc
{
double x,y,vx,vy; // actual position and velocity
double x1,y1; // destination
bool _stuck; // is currently stuck?
};
_disc disc[discs]; // discs array
//---------------------------------------------------------------------------
void disc_generate0(double x,double y,double r) // circle position to inverse circle destination
{
int i;
_disc *p;
double a,da;
for (p=disc,a=0,da=2.0*M_PI/double(discs),i=0;i<discs;a+=da,i++,p++)
{
p->x =x+(r*cos(a));
p->y =y+(r*sin(a));
p->x1=x-(r*cos(a));
p->y1=y-(r*sin(a));
p->vx=0.0;
p->vy=0.0;
p->_stuck=false;
}
}
//---------------------------------------------------------------------------
void disc_generate1(double x,double y,double r) // random position to random destination
{
int i,j;
_disc *p,*q;
double a,da;
Randomize();
for (p=disc,a=0,da=2.0*M_PI/double(discs),i=0;i<discs;a+=da,i++,p++)
{
for (j=-1;j<0;)
{
p->x=x+(2.0*Random(r))-r;
p->y=y+(2.0*Random(r))-r;
for (q=disc,j=0;j<discs;j++,q++)
if (i!=j)
if (((q->x-p->x)*(q->x-p->x))+((q->y-p->y)*(q->y-p->y))<disc_dd)
{ j=-1; break; }
}
for (j=-1;j<0;)
{
p->x1=x+(2.0*Random(r))-r;
p->y1=y+(2.0*Random(r))-r;
for (q=disc,j=0;j<discs;j++,q++)
if (i!=j)
if (((q->x1-p->x1)*(q->x1-p->x1))+((q->y1-p->y1)*(q->y1-p->y1))<disc_dd)
{ j=-1; break; }
}
p->vx=0.0;
p->vy=0.0;
p->_stuck=false;
}
}
//---------------------------------------------------------------------------
void disc_iterate(double dt) // iterate positions
{
int i,j,k;
_disc *p,*q;
double v=25.0,a=10.0,x,y;
const double ang=10.0*M_PI/180.0,ca=cos(ang),sa=sin(ang);
const int n=double(2.0*M_PI/ang);
for (p=disc,i=0;i<discs;i++,p++)
{
p->vx=a*(p->x1-p->x); if (p->vx>+v) p->vx=+v; if (p->vx<-v) p->vx=-v;
p->vy=a*(p->y1-p->y); if (p->vy>+v) p->vy=+v; if (p->vy<-v) p->vy=-v;
x=p->x; p->x+=(p->vx*dt);
y=p->y; p->y+=(p->vy*dt);
p->_stuck=false;
for (k=0,q=disc,j=0;j<discs;j++,q++)
if (i!=j)
if (((q->x-p->x)*(q->x-p->x))+((q->y-p->y)*(q->y-p->y))<disc_dd)
{
k++; if (k>=n) { p->x=x; p->y=y; p->_stuck=true; break; }
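// rotate the speed by ang; p->x,p->y are only temporaries here, so both components use the old speed
p->x=+(p->vx*ca)+(p->vy*sa);
p->y=-(p->vx*sa)+(p->vy*ca);
p->vx=p->x; p->vy=p->y;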
p->x=x+(p->vx*dt);
p->y=y+(p->vy*dt);
j=-1; q=disc-1;
}
}
}
//---------------------------------------------------------------------------
Usage is simple:
call disc_generate0 or disc_generate1 with the center and radius of the area of the plane where the discs will be placed
call disc_iterate (dt is the time elapsed in seconds)
draw the scene
If you want to change this to use t=<0,1>:
loop disc_iterate until all discs are at their destinations or a timeout occurs
record every change in speed for each disc in a list
(you need the position or speed vector and the time at which it occurs)
after the loop, rescale all the discs' lists to the range <0,1>
render/animate the rescaled lists
[Notes]
My test runs in real time, but I did not apply the <0,1> range and did not use very many discs, so you need to test whether this is fast enough for your setup.
To speed it up you can:
enlarge the angle step
after a rotation, test for collision against the last collided disc first, and test the rest only when that one is free
segment the discs into regions (overlapping by the radius) and handle each region separately
I also think some field approach could speed things up here, such as creating a field map once in a while to better determine the obstacle avoidance direction
[edit1] some tweaks to avoid infinite oscillations around an obstacle
With more discs, some of them get stuck bouncing around an already stopped disc. To avoid that, just change the direction of the ang step once in a while. This is the result:
You can see the oscillating bouncing before the finish.
This is the changed source:
void disc_iterate(double dt) // iterate positions
{
int i,j,k;
static int cnt=0;
_disc *p,*q;
double v=25.0,a=10.0,x,y;
const double ang=10.0*M_PI/180.0,ca=cos(ang),sa=sin(ang);
const int n=double(2.0*M_PI/ang);
// process discs
for (p=disc,i=0;i<discs;i++,p++)
{
// compute and limit speed
p->vx=a*(p->x1-p->x); if (p->vx>+v) p->vx=+v; if (p->vx<-v) p->vx=-v;
p->vy=a*(p->y1-p->y); if (p->vy>+v) p->vy=+v; if (p->vy<-v) p->vy=-v;
// store old and compute new position
x=p->x; p->x+=(p->vx*dt);
y=p->y; p->y+=(p->vy*dt);
p->_stuck=false;
// test if colliding
for (k=0,q=disc,j=0;j<discs;j++,q++)
if (i!=j)
if (((q->x-p->x)*(q->x-p->x))+((q->y-p->y)*(q->y-p->y))<disc_dd)
{
k++; if (k>=n) { p->x=x; p->y=y; p->_stuck=true; break; } // if full circle covered? stop
if (int(cnt&128)) // change the rotation direction every 128 iterations
{
// rotate +ang (p->x,p->y are only temporaries, so both components use the old speed)
p->x=+(p->vx*ca)+(p->vy*sa);
p->y=-(p->vx*sa)+(p->vy*ca);
p->vx=p->x; p->vy=p->y;
}
else{
// rotate -ang
p->x=+(p->vx*ca)-(p->vy*sa);
p->y=+(p->vx*sa)+(p->vy*ca);
p->vx=p->x; p->vy=p->y;
}
// update new position and test from the start again
p->x=x+(p->vx*dt);
p->y=y+(p->vy*dt);
j=-1; q=disc-1;
}
}
cnt++;
}
It isn't perfect, but my best idea has been to move the discs along quadratic Bezier curves. That means you've got just 2 free variables per disc that you're trying to find values for.
At that point, you can "plug" an error function into a nonlinear optimizer. The longer you're willing to wait, the better your solution will be, in terms of discs avoiding each other.
Only one actual hit:
Doesn't bother displaying hits, the discs actually start overlapped:
I've produced a full example, but the key is the error function to be minimized, which I reproduce here:
double errorf(unsigned n, const double *pts, double *grad,
              void *data)
{
    problem_t *setup = (problem_t *)data;
    double error = 0.0;
    for(int step=0; step<setup->steps; step++) {
        double t = (1.0+step) / (1.0+setup->steps);
        for(int i=0; i<setup->N; i++)
            quadbezier(&setup->starts[2*i],
                       &pts[2*i],
                       &setup->stops[2*i],
                       t,
                       &setup->scratch[2*i]);
        for(int i=0; i<setup->N; i++)
            for(int j=i+1; j<setup->N; j++) {
                double d = distance(&setup->scratch[2*i],
                                    &setup->scratch[2*j]);
                d /= RADIUS;
                error += (1.0/d) * (1.0/d);
            }
    }
    return error / setup->steps;
}
Ignore n, grad and data. setup describes the specific problem being optimized, number of discs, and where they start and stop. quadbezier does the Bezier curve interpolation, placing its answer into ->scratch. We check ->steps points part way along the path, and measure how close the discs are to one another at each step. To make the optimization problem smoother, it doesn't have a hard switch when the discs start touching, it just tries to keep them all as far apart from one another as possible.
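For readers who don't want to open the repository, here is a guess at what the two helpers compute. The real quadbezier and distance are in the linked code; the bodies below are assumptions based only on how they are called above, and pointDistance is a hypothetical stand-in name for distance.

```cpp
#include <cmath>

// Quadratic Bezier interpolation in 2D:
//   B(t) = (1-t)^2 * start + 2(1-t)t * control + t^2 * stop
// Each pointer refers to an (x, y) pair; the result is written to out.
void quadbezier(const double* start, const double* control,
                const double* stop, double t, double* out)
{
    const double u = 1.0 - t;
    for (int k = 0; k < 2; ++k)
        out[k] = u * u * start[k] + 2.0 * u * t * control[k] + t * t * stop[k];
}

// Euclidean distance between two (x, y) pairs.
double pointDistance(const double* a, const double* b)
{
    return std::hypot(a[0] - b[0], a[1] - b[1]);
}
```

The control point is the only thing the optimizer moves, which is where the "2 free variables per disc" come from.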
Completely compilable code, Makefile and some Python for turning a bunch of quadratic bezier curves into a series of images is available at https://github.com/jkominek/discs
Performance is a bit sluggish on huge numbers of points, but there are a number of options for improvement.
If the user is making minor tweaks to the starting and finishing positions, then after every tweak, rerun the optimization in the background, using the previous solution as the new starting point. Fixing up a close solution should be faster than recreating it from scratch every time.
Parallelize the n^2 loop over all points.
Check to see if other optimization algorithms will do better on this data. Right now it starts with a global optimization pass, and then does a local optimization pass. There are algorithms which already "know" how to do that sort of thing, and are probably smarter about it.
If you can figure out how to compute the gradient function for free or close to, I'm sure it would be worth it to do so, and switch to algorithms that can make use of the gradient information. It might be worth it even if the gradient isn't cheap.
Replace the whole steps thing with a suboptimization that finds the t at which the two discs are closest, and then uses that distance for the error. Figuring out the gradient for that suboptimization should be much easier.
Better data structures for the intermediate points, so you don't perform a bunch of unnecessary distance calculations for discs that are very far apart.
Probably more?
A common solution for this kind of problem is to use what is called a "heat map" (or "influence map"). For every point in the field, you compute a "heat" value. The disks move towards hot values and away from cold values. Heat maps are good for your type of problem because they are very simple to program, yet can generate sophisticated, AI-like behavior.
For example, imagine just two disks. If your heat map rule is equi-radial, then the disks will just move towards each other, then back away, oscillating back and forth. If your rule randomizes intensity on different radials, then the behavior will be chaotic. You can also make the rule depend on velocity, in which case disks will accelerate and decelerate as they move around.
Generally speaking, the heat map rule should make areas "hotter" as they approach some optimal distance from a disk. Places that are too near a disk or too far away get "colder". By changing this optimal distance, you can determine how closely the disks congregate.
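As a rough illustration of such a rule, the sketch below uses a Gaussian bump centred on the optimal distance. The bump shape, the width parameter and the names heatFromDisk and heatAt are all assumptions chosen for the example, not something taken from the linked articles.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Disk { double x, y; };

// Hypothetical rule: heat is 1 at distance `optimal` from a disk and
// falls off smoothly for points that are nearer or farther away.
double heatFromDisk(double distance, double optimal, double width)
{
    const double z = (distance - optimal) / width;
    return std::exp(-z * z);
}

// Heat felt at (px, py): the sum over every disk except the one at
// index `self`, so a disk is not attracted by its own field.
double heatAt(double px, double py, const std::vector<Disk>& disks,
              std::size_t self, double optimal, double width)
{
    double heat = 0.0;
    for (std::size_t i = 0; i < disks.size(); ++i) {
        if (i == self) continue;
        const double d = std::hypot(px - disks[i].x, py - disks[i].y);
        heat += heatFromDisk(d, optimal, width);
    }
    return heat;
}
```

A disk would then sample heatAt at a few candidate positions around itself and step towards the hottest one.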
Here are a couple of articles with example code showing how to use heat maps:
http://haufler.org/2012/05/26/beating-the-scribd-ai-challenge-implementing-traits-through-heuristics-part-1/
http://www.gamedev.net/page/resources/_/technical/artificial-intelligence/the-core-mechanics-of-influence-mapping-r2799
Game AI Pro, Volume 2, chapter on Heat Maps
I don't have enough rep to comment yet, so sorry for the non-answer.
But on the RTS angle: RTS games generally use the A* algorithm for pathfinding. Is there a reason you're insisting on using a physics-based model?
Secondly, the attempt you linked, which runs rather smoothly but accelerates in the middle, behaves the way I initially expected. Since your model treats each path as a rubber band, it is basically looking for which way to rotate to get the shortest path to the desired location.
If you aren't worried about a physical approach, I would attempt the following:
Try to move directly toward the target. If a disc collides, it should attempt to roll clockwise around its most recent collision until it is in a position on the vector at 90 degrees to the vector from the current location to the target location.
If we assume a test case of 5 in a row at the top of a box and five in a row at the bottom, they will move directly toward each other until they collide. The entire top row will slide to the right until they fall over the edge of the bottom row as it moves to the left and floats over the edge of the top row. (Think of what the whiskey and water shot glass trick looks like when it starts)
Since the motion is not determined by a potential energy stored in the spring which will accelerate the object during a rotation, you have complete control over how the speed changes during the simulation.
In a circular test like you have above, if all disks are initialized with the same speed, the entire clump will go to the middle, collide and twist as a unit for approximately a quarter turn at which point they will break away and head for their goal.
If the timing is lightly randomized, I think you'll get the behavior you're looking for.
I hope this helps.
I want to identify Lego bricks in order to build a Lego sorting machine (I use C++ with OpenCV).
That means I have to distinguish between objects that look very similar.
The bricks arrive at my camera individually on a flat conveyor. But they might lie in any possible way: upside down, on their side or "normal".
My approach is to teach the sorting machine the bricks by recording them with the camera in lots of different positions and rotations. Features of each and every view are calculated with the SURF algorithm.
void calculateFeatures(const cv::Mat& image,
                       std::vector<cv::KeyPoint>& keypoints,
                       cv::Mat& descriptors)
{
    // detector == cv::SurfFeatureDetector(10)
    detector->detect(image,keypoints);
    // extractor == cv::SurfDescriptorExtractor()
    extractor->compute(image,keypoints,descriptors);
}
If there is an unknown brick (the brick that I want to sort), its features are also calculated and matched against the known ones.
To find wrongly matched features I proceed as described in the book OpenCV 2 Cookbook:
With the matcher (=cv::BFMatcher(cv::NORM_L2)), the two nearest neighbours are searched in both directions:
matcher.knnMatch(descriptorsImage1, descriptorsImage2,
                 matches1,
                 2);
matcher.knnMatch(descriptorsImage2, descriptorsImage1,
                 matches2,
                 2);
I check the ratio between the distances of the two nearest neighbours found. If the two distances are very similar, the match is likely to be wrong.
// loop for matches1 and matches2
for(iterator matchIterator over all matches)
if( ((*matchIterator)[0].distance / (*matchIterator)[1].distance) > 0.65 )
throw away
Finally, only symmetrical match pairs are accepted. These are matches in which not only is n1 the nearest neighbour of feature f1, but f1 is also the nearest neighbour of n1.
for(iterator matchIterator1 over all matches)
for(iterator matchIterator2 over all matches)
if ((*matchIterator1)[0].queryIdx == (*matchIterator2)[0].trainIdx &&
(*matchIterator2)[0].queryIdx == (*matchIterator1)[0].trainIdx)
// good Match
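The ratio test and symmetry test in the two pseudocode fragments above could be written out roughly as below. To keep the sketch compilable without OpenCV, Match is a stand-in struct that carries the same three fields as cv::DMatch (queryIdx, trainIdx, distance); ratioTest and symmetryTest are names made up for this example.

```cpp
#include <vector>

// Stand-in for cv::DMatch so the sketch builds without OpenCV.
struct Match { int queryIdx; int trainIdx; float distance; };

// Same shape as the output of knnMatch with k = 2.
using KnnMatches = std::vector<std::vector<Match>>;

// Ratio test: keep an entry only when its best neighbour is clearly
// closer than the second best (ratio <= maxRatio, e.g. 0.65).
KnnMatches ratioTest(const KnnMatches& matches, float maxRatio)
{
    KnnMatches kept;
    for (const std::vector<Match>& m : matches) {
        if (m.size() < 2 || m[1].distance <= 0.0f) continue;
        if (m[0].distance / m[1].distance <= maxRatio) kept.push_back(m);
    }
    return kept;
}

// Symmetry test: accept a 1->2 match only when the 2->1 matches
// contain the same pair in the opposite direction.
std::vector<Match> symmetryTest(const KnnMatches& m12, const KnnMatches& m21)
{
    std::vector<Match> good;
    for (const std::vector<Match>& a : m12) {
        if (a.empty()) continue;
        for (const std::vector<Match>& b : m21) {
            if (b.empty()) continue;
            if (a[0].queryIdx == b[0].trainIdx &&
                b[0].queryIdx == a[0].trainIdx) {
                good.push_back(a[0]);
                break;
            }
        }
    }
    return good;
}
```

With real OpenCV types the same loops should work on std::vector<std::vector<cv::DMatch>> directly.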
Now only pretty good matches remain. To filter out some more bad matches I check which matches fit the projection of img1 on img2 using the fundamental matrix.
std::vector<uchar> inliers(points1.size(),0);
cv::findFundamentalMat(
    cv::Mat(points1),cv::Mat(points2), // matching points
    inliers,      // output mask: non-zero marks an inlier
    CV_FM_RANSAC, // RANSAC method
    3,            // maximum distance to the epipolar line, in pixels
    0.99);        // confidence level
std::vector<cv::DMatch> goodMatches;
// extract the surviving (inliers) matches
std::vector<uchar>::const_iterator itIn= inliers.begin();
std::vector<cv::DMatch>::const_iterator itM= allMatches.begin();
// for all matches
for ( ;itIn!= inliers.end(); ++itIn, ++itM)
    if (*itIn)
        // it is a valid match
        goodMatches.push_back(*itM);
The result is pretty good. But when bricks look extremely alike, mismatches still occur.
In the picture above you can see that a similar brick is recognized well.
However in the second picture a wrong brick is recognized just as well.
Now the question is how I could improve the matching.
I had two different ideas:
The matches in the second picture trace back to features that really do fit, but only if the view is changed drastically. To recognize a brick I have to compare it in many different positions anyway (at least as shown in figure three). This means I know that the view is only allowed to change minimally. The information about how strongly the view has changed should be hidden in the fundamental matrix. How can I read out of this matrix how far the position in space has changed? Rotation and strong scaling in particular should be of interest; if the brick happens to be recorded further to the left, this shouldn't matter.
Second idea:
I calculated the fundamental matrix from 2 pictures and filtered out features that don't fit the projections. Shouldn't there be a way to do the same using three or more pictures (keyword: trifocal tensor)? This way the matching should become more stable. But I neither know how to do this using OpenCV nor could I find any information on it on Google.
I don't have a complete answer, but I have a few suggestions.
On the image analysis side:
It looks like your camera setup is pretty constant, so it should be easy to separate the brick from the background. I also see your system finding features in the background. This is unnecessary. Set all non-brick pixels to black to remove them from the analysis.
When you have located just the brick, your first step should be to filter likely candidates based on the size (i.e. the number of pixels) of the brick. That way the example faulty match you show is already less likely.
You can take other features into account, such as the aspect ratio of the bounding box of the brick, the major and minor axes of the brick (eigenvectors of the covariance matrix of the central moments), etc.
These simpler features will give you a reasonable first filter to limit your search space.
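To make these cheap features concrete, here is a minimal sketch that computes them from a binary brick mask. The mask representation and the names ShapeFeatures and shapeFeatures are assumptions for illustration; in an OpenCV pipeline you would more likely get the same numbers from the library's moments and bounding-box routines.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct ShapeFeatures {
    int area = 0;              // number of brick pixels
    double aspectRatio = 0.0;  // bounding-box width / height
    double majorVar = 0.0;     // variance along the major axis
    double minorVar = 0.0;     // variance along the minor axis
    double angle = 0.0;        // major-axis orientation in radians
};

// mask[y][x] is true for brick pixels, false for background.
ShapeFeatures shapeFeatures(const std::vector<std::vector<bool>>& mask)
{
    ShapeFeatures f;
    double sx = 0.0, sy = 0.0;
    int minX = 0, maxX = 0, minY = 0, maxY = 0;
    for (int y = 0; y < static_cast<int>(mask.size()); ++y)
        for (int x = 0; x < static_cast<int>(mask[y].size()); ++x) {
            if (!mask[y][x]) continue;
            if (f.area == 0) { minX = maxX = x; minY = maxY = y; }
            minX = std::min(minX, x); maxX = std::max(maxX, x);
            minY = std::min(minY, y); maxY = std::max(maxY, y);
            sx += x; sy += y; ++f.area;
        }
    if (f.area == 0) return f;

    // Second-order central moments, normalised by the area.
    const double mx = sx / f.area, my = sy / f.area;
    double cxx = 0.0, cyy = 0.0, cxy = 0.0;
    for (int y = minY; y <= maxY; ++y)
        for (int x = 0; x < static_cast<int>(mask[y].size()); ++x) {
            if (!mask[y][x]) continue;
            cxx += (x - mx) * (x - mx);
            cyy += (y - my) * (y - my);
            cxy += (x - mx) * (y - my);
        }
    cxx /= f.area; cyy /= f.area; cxy /= f.area;

    // Eigenvalues of the 2x2 covariance matrix [[cxx, cxy], [cxy, cyy]].
    const double half = 0.5 * (cxx + cyy);
    const double root =
        std::sqrt(0.25 * (cxx - cyy) * (cxx - cyy) + cxy * cxy);
    f.majorVar = half + root;
    f.minorVar = half - root;
    f.angle = 0.5 * std::atan2(2.0 * cxy, cxx - cyy);
    f.aspectRatio = double(maxX - minX + 1) / double(maxY - minY + 1);
    return f;
}
```

Comparing area and the ratio majorVar/minorVar against the stored values for each known brick is one way to discard most candidates before any feature matching runs.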
On the mechanical side:
If bricks are actually coming down a conveyor you should be able to "straighten" the bricks along a straight edge using something like a rod that lies at an angle to the direction of the conveyor across the belt so that the bricks arrive more uniformly at your camera like so.
Similar to the previous point, you could use something like a very loose brush suspended across the belt to topple bricks standing up as they pass.
Again both these points will limit your search space.
So as the title says, I am trying to detect boundaries of patterns. In the images attached, you can basically see three different patterns.
Close stripe lines
One thick L shaped line
The area between 1 & 2
I am trying to separate these three into, say, 3 separate images. Depending on where the answers go, I will upload more images if needed. Either ideas or code will be helpful.
You can solve (for some values of "solve") this problem using morphology. First, to make the image more uniform, remove irrelevant minima. One way to do this is to use the h-dome transform for regional minima, which suppresses minima of height < h. Next, we want to join the thin lines. That is accomplished by a morphological opening with a horizontal line of length l. If the lines were merged, then the regional minima of the current image are the background, so we can fill holes to obtain the relevant components. The following code summarizes these tasks:
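h = 30; % example value: minima of height < h are suppressed
l = 15; % example value: length of the horizontal line used for the opening
f = rgb2gray(imread('http://i.stack.imgur.com/02X9Z.jpg'));
hm = imhmin(f, h);
o = imopen(hm, strel('line', l, 0));
result = imfill(~imregionalmin(o), 'holes');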
Now you need to determine h and l. The parameter h is expected to be the easier one, since it is not related to the scale of the input; in your example, values in the range [10, 30] work fine. To determine l, a granulometry analysis might help. Another way is to check whether the result contains two significant connected components, corresponding to the bigger L shape and the region of the thin lines. There is no need to increase l one step at a time; you could perform something that resembles a binary search.
Here are the hm, o and result images with h = 30 and l = 15 (l in [13, 19] works equally well here). This approach gives flexibility in choosing parameters, making it easier to find good values.
To calculate the area in the space between the two largest components, we could merge them and simply count the black pixels inside the new connected component.
You can pass a window (10x10 pixels?) and collect features for that window. The features could be something as simple as the cumulative gradients (edges) within that window. This would distinguish the various areas as long as the window is big enough.
Then, using each window as a data point, you can do some clustering or, if the patterns don't vary that much, apply some simple thresholds to determine which data points belong to which pattern: the largest gradient sums belong to the small lines (more edges), the smallest gradient sums belong to the thickest lines (only one edge), and those in between belong to the other "in-between" pattern.
Once you have this classification, you can create separate images if need be.
Just throwing out ideas. You can binarize the image and do connected component labelling. Then perform some analysis on the connected components such as width to discriminate between the regions.
I have some map files consisting of 'polylines' (each line is just a list of vertices) representing tunnels, and I want to try and find the tunnel 'center line' (shown, roughly, in red below).
I've had some success in the past using Delaunay triangulation but I'd like to avoid that method as it does not (in general) allow for easy/frequent modification of my map data.
Any ideas on how I might be able to do this?
An "algorithm" that works well with localized data changes.
The critic's view
The Good
The nice part is that it uses a mixture of image processing and graph operations available in most libraries, may be parallelized easily, is reasonably fast, may be tuned to use a relatively small memory footprint, and doesn't have to be recalculated outside the modified area if you store the intermediate results.
The Bad
I wrote "algorithm" in quotes because I developed it myself and it is surely not robust enough to cope with pathological cases. If your graph has a lot of cycles, you may end up with some phantom lines. More on this and examples later.
And The Ugly
The ugly part is that you need to be able to flood fill the map, which is not always possible. I posted a comment a few days ago asking if your graphs can be flood filled, but didn't receive an answer. So I decided to post it anyway.
The Sketch
The idea is:
Use image processing to get a fine line of pixels representing the center path
Partition the image into chunks commensurate with the tunnel's thinnest passages
At each partition, place a point at the "center of mass" of the contained pixels
Use those points as the Vertices of a Graph
Add Edges to the Graph based on a "near neighbour" policy
Remove spurious small cycles in the induced Graph
End- The remaining Edges represent your desired path
The parallelization opportunity arises from the fact that the partitions may be computed in standalone processes, and the resulting graph may be partitioned to find the small cycles that need to be removed. These factors also make it possible to reduce the memory needed by processing serially instead of in parallel, but I didn't go through this.
The Plot
I'll not provide pseudocode, as the difficult part is just that not covered by your libraries. Instead of pseudocode, I'll post the images resulting from the successive steps.
I wrote the program in Mathematica, and I can post it if it is of some service to you.
A- Start with a nice flood filled tunnel image
B- Apply a Distance Transformation
The Distance Transformation gives the distance transform of the image, where the value of each pixel is replaced by its distance to the nearest background pixel.
You can see that our desired path is the Local Maxima within the tunnel
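If you are not working in Mathematica, a distance transform is easy to approximate yourself. The sketch below is a simplification and not what Mathematica does internally: it measures city-block (4-neighbour) distance with a multi-source breadth-first search instead of Euclidean distance, and the name distanceTransform is just chosen for this example.

```cpp
#include <deque>
#include <utility>
#include <vector>

// tunnel[y][x] is true inside the filled tunnel, false for background.
// Returns, for every pixel, the number of 4-neighbour steps to the
// nearest background pixel (0 for background, -1 if none is reachable).
std::vector<std::vector<int>> distanceTransform(
    const std::vector<std::vector<bool>>& tunnel)
{
    const int h = static_cast<int>(tunnel.size());
    const int w = h > 0 ? static_cast<int>(tunnel[0].size()) : 0;
    std::vector<std::vector<int>> dist(h, std::vector<int>(w, -1));
    std::deque<std::pair<int, int>> queue;

    // Every background pixel is a source at distance 0.
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (!tunnel[y][x]) {
                dist[y][x] = 0;
                queue.push_back({x, y});
            }

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!queue.empty()) {
        const std::pair<int, int> p = queue.front();
        queue.pop_front();
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k];
            const int ny = p.second + dy[k];
            if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
            if (dist[ny][nx] != -1) continue;  // already reached
            dist[ny][nx] = dist[p.second][p.first] + 1;
            queue.push_back({nx, ny});
        }
    }
    return dist;
}
```

The ridge of largest values in the result runs along the middle of the tunnel, which is the line the later steps try to extract.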
C- Convolve the image with an appropriate kernel
The selected kernel is a Laplacian-of-Gaussian kernel of pixel radius 2. It has the magic property of enhancing the gray level edges, as you can see below.
D- Cutoff gray levels and Binarize the image
To get a nice view of the center line!
Comment
Perhaps that is enough for you, as you may know how to transform a thin line into an approximate sequence of piecewise segments. As that is not the case for me, I continued along this path to get the desired segments.
E- Image Partition
Here is when some advantages of the algorithm show up: you may start using parallel processing or decide to process each segment at a time. You may also compare the resulting segments with the previous run and re-use the previous results
F- Center of Mass detection
All the white points in each sub-image are replaced by only one point at the center of mass
XCM = (Σ i∈Points Xi)/NumPoints
YCM = (Σ i∈Points Yi)/NumPoints
The white pixels are difficult to see (asymptotically difficult with param "a" age), but there they are.
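Steps E and F together can be sketched in a few lines of Python. This assumes the binarized image is a list of rows holding 0/1 values and that the partitions are square tiles of side `tile`; both the function name and the tile layout are my choices for the example, not taken from the Mathematica program.

```python
def centers_of_mass(binary, tile):
    """Return one (XCM, YCM) point per tile that contains at least one white pixel."""
    h, w = len(binary), len(binary[0])
    centers = []
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            # Collect the white pixels that fall inside this tile.
            pts = [(x, y)
                   for y in range(ty, min(ty + tile, h))
                   for x in range(tx, min(tx + tile, w))
                   if binary[y][x]]
            if pts:
                xcm = sum(x for x, _ in pts) / len(pts)
                ycm = sum(y for _, y in pts) / len(pts)
                centers.append((xcm, ycm))
    return centers

line = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
centers = centers_of_mass(line, tile=4)
print(centers)  # [(1.5, 1.0), (5.5, 2.0)]
```

Each tile is independent of the others, which is what makes this step easy to run in parallel.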
G- Graph setup from Vertices
Form a Graph using the selected points as Vertices. Still no Edges.
H- select Candidate Edges
Using the Euclidean Distance between points, select candidate edges. A cutoff is used to select an appropriate set of Edges. Here we are using 1.5 times the sub-image size.
As you can see, the resulting Graph has a few small cycles that we are going to remove in the next step.
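The candidate-edge selection is simple enough to sketch directly. The snippet below assumes the vertices are the (x, y) centers of mass from the previous step and uses the 1.5 factor mentioned above; `candidate_edges` is a made-up name, and `math.dist` needs Python 3.8 or later.

```python
import math
from itertools import combinations

def candidate_edges(points, tile, factor=1.5):
    """Connect every pair of vertices closer than factor * tile (Euclidean distance)."""
    cutoff = factor * tile
    return [(i, j)
            for (i, p), (j, q) in combinations(enumerate(points), 2)
            if math.dist(p, q) <= cutoff]

points = [(0, 0), (4, 0), (8, 0), (4, 4)]
edges = candidate_edges(points, tile=4)
print(edges)  # [(0, 1), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Even this tiny example produces two triangles (0-1-3 and 1-2-3), which are the kind of small cycles mentioned above. The pairwise loop is quadratic in the number of vertices, which should be acceptable as long as there is only one vertex per partition.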
H- Remove Small Cycles
Using a Cycle detection routine, we remove the small cycles up to a certain length. The cutoff length depends on a few parameters, and you should determine it empirically for your family of graphs.
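The answer does not say which cycle detection routine was used, so the following is only one possible way to do it, not the original method. The sketch visits the edges from longest to shortest and drops an edge whenever its endpoints stay connected by a path of at most `max_cycle_len - 1` other edges, which means the edge closed a cycle of length `max_cycle_len` or less. All names are hypothetical.

```python
import math
from collections import deque

def remove_small_cycles(points, edges, max_cycle_len):
    adj = {i: set() for i in range(len(points))}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def reachable_within(src, dst, limit):
        """Depth-limited BFS: is dst at most `limit` edges away from src?"""
        seen = {src}
        frontier = deque([(src, 0)])
        while frontier:
            node, depth = frontier.popleft()
            if node == dst:
                return True
            if depth < limit:
                for nxt in adj[node]:
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, depth + 1))
        return False

    def length(edge):
        return math.dist(points[edge[0]], points[edge[1]])

    for u, v in sorted(edges, key=length, reverse=True):
        # Tentatively remove the edge, then check whether it closed a small cycle.
        adj[u].discard(v)
        adj[v].discard(u)
        if not reachable_within(u, v, max_cycle_len - 1):
            adj[u].add(v)  # not on a small cycle: put the edge back
            adj[v].add(u)
    return sorted((u, v) for u in adj for v in adj[u] if u < v)

points = [(0, 0), (4, 0), (8, 0), (4, 4)]
edges = [(0, 1), (0, 3), (1, 2), (1, 3), (2, 3)]
pruned = remove_small_cycles(points, edges, max_cycle_len=3)
print(pruned)  # [(0, 1), (1, 2), (1, 3)]
```

Removing the longest edge of each small cycle first is a heuristic of mine; it tends to keep the short links that follow the center line.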
I- That's it!
You can see that the resulting center line is shifted a little bit upwards. The reason is that I'm superimposing images of different type in Mathematica ... and I gave up trying to convince the program to do what I want :)
A Few Shots
As I did the testing, I collected a few images. They are probably the most un-tunnelish things in the world, but my Tunnels-101 went astray.
Anyway, here they are. Remember that I have a displacement of a few pixels upwards ...
HTH !
Update
Just in case you have access to Mathematica 8 (I got it today) there is a new function Thinning. Just look:
This is a pretty classic skeletonization problem; there are lots of algorithms available. Some algorithms work in principle on outline contours, but since almost everyone uses them on images, I'm not sure how available such things will be. Anyway, if you can just plot and fill the sewer outlines and then use a skeletonization algorithm, you could get something close to the midline (within pixel resolution).
Then you could walk along those lines and do a binary search with circles until you hit at least two separate line segments (three if you're at a branch point). The midpoint of the two spots you first hit, or the center of a circle touching the three points you first hit, is a good estimate of the center.
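Well, in Python this is an easy task using the skimage package, as follows.
import matplotlib.pyplot as plt
from skimage import morphology as mp

# Load your tunnel image (first channel) and invert it so dark pixels become foreground.
tun = 1 - plt.imread('tunnel.png')[..., 0]
skl = mp.medial_axis(tun)  # skeleton (medial axis) of the foreground

# Show the tunnel and its skeleton side by side.
plt.subplot(121)
plt.imshow(tun, cmap=plt.cm.gray)
plt.subplot(122)
plt.imshow(skl, cmap=plt.cm.gray)
plt.show()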
How do I segment a 2D image into blobs of similar values efficiently? The given input is an array of integers, which holds the hue for non-gray pixels and the brightness for gray pixels.
I am writing a virtual mobile robot in Java, and I am using segmentation to analyze the map and also the image from the camera. This is a well-known problem in Computer Vision, but on a robot performance matters, so I wanted some input. The algorithm is what matters, so you can post code in any language.
Wikipedia article: Segmentation (image processing)
[PPT] Stanford CS-223-B Lecture 11 Segmentation and Grouping (which says Mean Shift is perhaps the best technique to date)
Mean Shift Pictures (paper is also available from Dorin Comaniciu)
I would downsample, both in colour space and in number of pixels, use a vision method (probably mean shift), and upscale the result.
This is good because downsampling also increases the robustness to noise, and makes it more likely that you get meaningful segments.
You could use floodfill to smooth edges afterwards if you need smoothness.
Some more thoughts (in response to your comment).
1) Did you blend as you downsampled? y[i] = (x[2i] + x[2i+1]) / 2. This should eliminate noise.
2) How fast do you want it to be?
3) Have you tried dynamic mean shift? (Also google for "dynamic x" for all algorithms x.)
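To make point 1 concrete, here is a small sketch of blended downsampling on a single row of values, a direct transcription of y[i] = (x[2i] + x[2i+1]) / 2. The function name is made up, and a trailing odd sample is simply dropped.

```python
def downsample_blend(x):
    """Halve a 1-D signal by averaging adjacent pairs."""
    return [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]

row = [10, 12, 200, 14, 50, 52]
print(downsample_blend(row))  # [11.0, 107.0, 51.0]
```

For an image you would apply the same idea along rows and then columns. One caveat: plain averaging treats the values as linear, so for hue (which wraps around) it can give misleading results near the wrap point.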
I'm not sure how efficient it is, but you could try using a Kohonen neural network (or self-organizing map; SOM) to group the similar values, where each pixel contains the original color and position and only the color is used for the Kohonen grouping.
You should read up before you implement this though, as my knowledge of the Kohonen network goes as far as that it is used for grouping data - so I don't know what the performance/viability options are for your scenario.
There are also Hopfield Networks. They can be mangled into grouping from what I read.
What I have now:
1. Make a buffer of the same size as the input image, initialized to UNSEGMENTED.
2. For each pixel in the image whose corresponding buffer value is still UNSEGMENTED, flood the buffer using the pixel value.
a. The border check of the flood is done by testing whether a pixel is within EPSILON (currently set to 10) of the originating pixel's value.
b. Flood filling algorithm.
Possible issue:
The border check in 2.a is called many times during the flood fill. I could turn it into a lookup if I precalculated the borders using edge detection, but that may cost more time than the current check.
private boolean isValuesCloseEnough(int a_lhs, int a_rhs) {
return Math.abs(a_lhs - a_rhs) <= EPSILON;
}
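For reference, the whole scheme described above (label buffer plus tolerance-based flood fill) could look roughly like the sketch below. It is written in Python rather than Java to keep it short, uses an explicit queue instead of recursion to avoid stack overflows on large images, and assumes 4-connectivity, which the description does not specify. `segment` and the sample image are invented for the example.

```python
from collections import deque

UNSEGMENTED = -1
EPSILON = 10

def segment(image):
    """Label each pixel with a blob id; returns (labels, number_of_blobs)."""
    h, w = len(image), len(image[0])
    labels = [[UNSEGMENTED] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != UNSEGMENTED:
                continue  # already part of an earlier blob
            seed = image[sy][sx]  # the originating pixel's value
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == UNSEGMENTED
                            and abs(image[ny][nx] - seed) <= EPSILON):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label

image = [
    [10, 12, 100, 101],
    [11, 15,  99, 200],
    [50, 14,  98, 205],
]
labels, count = segment(image)
for row in labels:
    print(row)
```

Every pixel is labelled exactly once, so the total work is linear in the number of pixels; the closeness check runs a handful of times per pixel at most.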
Possible Enhancement:
Instead of checking every single pixel for UNSEGMENTED, I could randomly pick a few seed points. If you are expecting around 10 blobs, picking a number of random points on that order may suffice. The drawback is that you might miss a useful but small blob.
Check out Eyepatch (eyepatch.stanford.edu). It should help you during the investigation phase by providing a variety of possible filters for segmentation.
An alternative to flood-fill is the connected-components algorithm. So,
Cheaply classify your pixels, e.g. divide pixels in colour space.
Run connected components (CC) to find the blobs.
Retain the blobs of significant size.
This approach is widely used in early vision approaches, for example in the seminal paper "Blobworld: A System for Region-Based Image Indexing and Retrieval".
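The three steps above can be sketched as follows. The classification here is a plain quantization of the pixel value into bins of width `bin_width`, which is just one cheap choice; the connected components are found with a union-find over left and upper neighbours (4-connectivity). The names, the bin width, and the size threshold are assumptions for the example.

```python
def significant_blobs(image, bin_width=32, min_size=3):
    """Return the pixel lists of all connected blobs with at least min_size pixels."""
    h, w = len(image), len(image[0])
    # Step 1: cheap classification by quantizing the value.
    classes = [[v // bin_width for v in row] for row in image]
    parent = list(range(h * w))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Step 2: merge each pixel with its left / upper neighbour of the same class.
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x > 0 and classes[y][x - 1] == classes[y][x]:
                parent[find(i)] = find(i - 1)
            if y > 0 and classes[y - 1][x] == classes[y][x]:
                parent[find(i)] = find(i - w)

    blobs = {}
    for i in range(h * w):
        blobs.setdefault(find(i), []).append((i % w, i // w))  # (x, y)
    # Step 3: retain only the blobs of significant size.
    return [pixels for pixels in blobs.values() if len(pixels) >= min_size]

image = [
    [10, 12, 100, 101],
    [11, 15,  99, 200],
    [50, 14,  98, 205],
]
blobs = significant_blobs(image)
print([len(b) for b in blobs])  # [5, 4]
```

Compared with the tolerance-based flood fill, the comparison per pixel is a simple equality test on a precomputed class, but values that straddle a bin boundary (31 and 32 here) end up in different blobs even though they are close.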