I have a set of binary images in which I need to find a cross (examples attached). I use findContours to extract borders from the binary image, but I can't work out how to determine whether a given shape (border) is a cross or not. Maybe OpenCV has some built-in methods which could help to solve this problem. I thought about solving it with machine learning, but I suspect there is a simpler way. Thanks!
Viola-Jones object detection could be a good start. Though the main usage of the algorithm (AFAIK) is face detection, it was actually designed for any object detection, such as your cross.
The algorithm is a machine-learning based algorithm (so you will need a set of classified "crosses" and a set of classified "not crosses"), and you will need to identify the significant "features" (patterns) that will help the algorithm recognize crosses.
The algorithm is implemented in OpenCV as cvHaarDetectObjects().
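If you go this route and train a cascade (e.g. with opencv_traincascade), using it at detection time is straightforward. Here is a minimal sketch with the modern Python API; the file name cross_cascade.xml is just a placeholder for whatever cascade you train:

import cv2

# Hypothetical cascade trained on "cross" / "not cross" samples
cascade = cv2.CascadeClassifier("cross_cascade.xml")

img = cv2.imread("input.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Returns a list of (x, y, w, h) rectangles around detected crosses
crosses = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
for (x, y, w, h) in crosses:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)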
From the original image, let's say you've extracted sets of polygons that could potentially be your cross. Assuming that all of the cross is visible, to the extent that all edges can be distinguished as having a length, you could try the following.
Reject all polygons that do not have exactly the 12 vertices required to form your cross.
Re-order the vertices such that the shortest edge is first.
Create a best-fit perspective transformation that maps your vertices onto a cross of uniform size.
Examine the residuals generated by using this transformation to project your cross back onto the uniform cross, where the residual for any given point is the distance between the projected point and the corresponding uniform point.
If all the residuals are within your defined tolerance, you've found a cross.
Note that this works primarily due to the simplicity of the geometric shape you're searching for. Your contours will also need to have noise removed for this to work, e.g. each line within the cross needs to be converted to a single simple line.
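For the perspective fit and residual check, here is a rough Python/OpenCV sketch under some simplifying assumptions: the contour has already been simplified to exactly 12 vertices (e.g. with cv2.approxPolyDP), winding direction is ignored, and the template cross and tolerance below are made up for illustration.

import cv2
import numpy as np

# Reference cross of "uniform size": 12 vertices of a plus sign on a 3x3 grid
TEMPLATE = np.array([(1, 0), (2, 0), (2, 1), (3, 1), (3, 2), (2, 2),
                     (2, 3), (1, 3), (1, 2), (0, 2), (0, 1), (1, 1)],
                    dtype=np.float32)

def is_cross(poly, tol=0.15):
    """poly: (12, 2) array of contour vertices, ordered around the shape."""
    if len(poly) != 12:
        return False
    poly = np.asarray(poly, dtype=np.float32)

    # Re-order so the shortest edge comes first
    edges = np.linalg.norm(np.roll(poly, -1, axis=0) - poly, axis=1)
    poly = np.roll(poly, -int(np.argmin(edges)), axis=0)

    # Best-fit perspective transform from the observed polygon to the template
    H, _ = cv2.findHomography(poly.reshape(-1, 1, 2), TEMPLATE.reshape(-1, 1, 2))
    if H is None:
        return False

    projected = cv2.perspectiveTransform(poly.reshape(-1, 1, 2), H).reshape(-1, 2)
    residuals = np.linalg.norm(projected - TEMPLATE, axis=1)
    return bool(np.all(residuals < tol))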
Depending on your requirements, you could try some local feature detector like SIFT or SURF. Check OpenSURF which is an interesting implementation of the latter.
After some days of struggle, I came to the conclusion that the only robust way here is to use SVM + HOG. That's all.
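For anyone following the same path, here is a minimal sketch of the HOG + SVM idea using skimage and scikit-learn; the patch size and the way positives/negatives are collected are assumptions, not something from the question:

import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import LinearSVC

PATCH = (64, 64)  # assumed fixed window size for training patches

def describe(patch):
    # HOG descriptor of a single grayscale patch
    return hog(resize(patch, PATCH), orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train(pos_patches, neg_patches):
    # pos_patches / neg_patches: lists of 2D numpy arrays you collected yourself
    X = np.array([describe(p) for p in pos_patches + neg_patches])
    y = np.array([1] * len(pos_patches) + [0] * len(neg_patches))
    return LinearSVC().fit(X, y)

# At detection time: clf.predict([describe(candidate_patch)]) == 1 means "cross"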
You could erode each blob and watch how its number of pixels goes down. No matter the rotation or scaling of the crosses, the counts should always go down with the same ratio, except when you're closing in on the remaining center. Also, when the blob is small enough you should expect it to be in the center of the original blob. You won't need any machine learning algorithm or training data to solve this.
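A quick sketch of that idea in Python/OpenCV; the structuring-element size and the iteration cap are arbitrary choices for illustration:

import cv2
import numpy as np

def erosion_profile(blob_mask, max_iters=50):
    """Return the pixel counts of a binary blob under repeated erosion."""
    kernel = np.ones((3, 3), np.uint8)
    counts = [int(np.count_nonzero(blob_mask))]
    img = blob_mask.copy()
    for _ in range(max_iters):
        img = cv2.erode(img, kernel)
        n = int(np.count_nonzero(img))
        if n == 0:
            break
        counts.append(n)
    # counts[i+1] / counts[i] gives the decay ratio to compare across blobs
    return counts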
How can I convert a polygon shape to a curve in JS/SVG?
I have seen this solution: http://jsdraw2d.jsfiction.com/ but this seems to be dealing with VML and not SVG.
Is there something out-of-the-box that can be used to accurately convert a polygon to a path without ANY loss of quality?
When I say path I don't mean a path with >4000 nodes. I mean a path with curves instead of many nodes. Which in turn means reducing the node count since the polygons would be converted into curves.
I assume that, while polygonizing, you sampled points on the curve and joined them with straight lines.
The reverse process is curve fitting.
You want to do a "Hermite fitting of curve through a set of points". A little search will help you out.
There are more such fitting algorithms. This is the maths-based, under-the-hood solution to what you want, and it is also how most such problems are solved.
If you want a quick solution, you would have to find a library that does it for you, i.e. takes a set of points and fits a curve through them.
Note: I assume that fitting a curve through more than 4000 nodes is going to be costly. You could try it and see the performance for yourself, as I am not sure how costly it would be. But I would suggest that, if you needed to maintain the accuracy of your boolean operations, you should not have polygonized them in the first place. It is just a redundancy of effort to lose accuracy only to gain it back. Boolean set operations can be done, and are done, without polygonizing the curve data.
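The question is about JS/SVG, but just to make the fitting idea concrete, here is a small Python/SciPy sketch that fits a smoothing spline through a closed sequence of polygon vertices; the example points and the smoothing factor s are made up and would need tuning:

import numpy as np
from scipy.interpolate import splprep, splev

# x, y: coordinates of the polygon vertices (closed outline, last point == first)
x = np.array([0, 2, 3, 2, 0, -1, 0], dtype=float)
y = np.array([0, -1, 1, 3, 2, 1, 0], dtype=float)

# per=True treats the points as a closed curve; s controls smoothing
tck, _ = splprep([x, y], s=0.5, per=True)

# Evaluate the fitted curve at far fewer parameter values than input nodes
u_new = np.linspace(0, 1, 50)
xs, ys = splev(u_new, tck)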
Links for reference, and demos
http://en.wikipedia.org/wiki/Spline_interpolation
http://www.math.ucla.edu/~baker/java/hoefer/Spline.htm
http://www.math.ucla.edu/~baker/java/hoefer/Lagrange.htm
I'm doing a personal project of trying to find a person's look-alike given a database of photographs of other people all taken in a consistent manner - people looking directly into the camera, neutral expression and no tilt to the head (think passport photo).
I have a system for placing markers for 2d coordinates on the faces and I was wondering if there are any known approaches for finding a look alike of that face given this approach?
I found the following facial recognition algorithms:
http://www.face-rec.org/algorithms/
But none deal with the specific task of finding a look-alike.
Thanks for your time.
I believe you can also try searching for "Face Verification" rather than just "Face Recognition". This might give you more relevant results.
Strictly speaking, the 2 are actually different things in scientific literature but are sometimes lumped under face recognition. For details on their differences and some sample code, take a look here: http://www.idiap.ch/~marcel/labs/faceverif.php
However, for your purposes, what others such as Edvard and Ari have kindly suggested would work too. Basically they are suggesting a K-nearest-neighbor style face recognition classifier.
As a start, you can probably try that. First, compute a feature vector for each of your face images in your database. One possible feature to use is the Local Binary Pattern (LBP). You can find the code by googling it. Do the same for your query image. Now, loop through all the feature vectors and compare them to that of your query image using euclidean distance and return the K nearest ones.
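A rough sketch of that baseline in Python, using scikit-image's local_binary_pattern; the LBP parameters and the use of a single global histogram (rather than the usual grid of patches) are simplifications:

import numpy as np
from skimage.feature import local_binary_pattern

P, R = 8, 1  # LBP neighbourhood: 8 samples on a circle of radius 1

def lbp_histogram(gray_face):
    lbp = local_binary_pattern(gray_face, P, R, method="uniform")
    # "uniform" LBP with P=8 yields integer codes in [0, P+1]
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def k_nearest(query_face, database_faces, k=5):
    """database_faces: list of aligned grayscale face images of equal size."""
    q = lbp_histogram(query_face)
    dists = [np.linalg.norm(q - lbp_histogram(f)) for f in database_faces]
    return np.argsort(dists)[:k]  # indices of the k most similar faces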
While the above method is easy to code, it will generally not be as robust as some of the more sophisticated ones, because they generally fail badly when faces are not aligned (known as unconstrained pose; search for "Labeled Faces in the Wild" to see state-of-the-art results for this problem) or taken under different environmental conditions. But if the faces in your database are aligned and taken under similar conditions as you mentioned, then it might just work. If they are not aligned, you can use the face key points, which you mentioned you are able to compute, to align the faces. In general, comparing faces which are not aligned is a very difficult problem in computer vision and is still a very active area of research. But, if you only consider faces that look alike and are in the same pose to be similar (i.e. similar in pose as well as looks), then this shouldn't be a problem.
The website you gave has links to the code for Eigenfaces and Fisherfaces. These are essentially 2 methods for computing feature vectors for your face images. Faces are identified by doing a K nearest neighbor search for faces in the database with feature vectors (computed using PCA and LDA respectively) closest to that of the query image.
I should probably also mention that in the Fisherfaces method, you will need to have "labels" for the faces in your database to identify the faces. This is because Linear Discriminant Analysis (LDA), the classification method used in Fisherfaces, needs this information to compute a projection matrix that will project feature vectors for similar faces close together and dissimilar ones far apart. Comparison is then performed on these projected vectors. Here lies the difference between Face Recognition and Face Verification: for recognition, you need to have "labels" for your training images in your database, i.e. you need to identify them.
For verification, you are only trying to tell whether any 2 given faces are of the same person. Often, you don't need the "labelled" data in the traditional sense (although some methods might make use of auxiliary training data to help in the face verification).
The code for computing Eigenfaces and Fisherfaces is available in OpenCV in case you use it.
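For reference, a minimal sketch of the OpenCV route (in current versions the face module ships in opencv-contrib-python; the variable names and data loading are placeholders):

import cv2
import numpy as np

# faces: list of equal-sized grayscale face images (numpy arrays)
# labels: integer id per image, identifying the person
model = cv2.face.EigenFaceRecognizer_create()
model.train(faces, np.array(labels))

# For a query face, returns the closest label and a distance-like confidence;
# smaller confidence means a closer match
label, confidence = model.predict(query_face)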
As a side note:
A feature vector is actually just a vector in your linear algebra sense. It is simply n numbers packed together. The word "feature" refers to something like a "statistic", i.e. a feature vector is a vector containing statistics that characterizes the object it represents. For example, for the task of face recognition, the simplest feature vector would be the intensity values of the grayscale image of the face. In that case, I just reshape the 2D array of numbers into an n-rows-by-1-column vector, each entry containing the value of one pixel. The pixel value here is the "feature", and the n x 1 vector of pixel values is the feature vector. In the LBP case, roughly speaking, it computes a histogram at small patches of pixels in the image and joins these histograms together into one histogram, which is then used as the feature vector. So the Local Binary Pattern is the statistic and the histograms joined together form the feature vector. Together they describe the "texture" and facial patterns of your face.
Hope this helps.
These would seem to me like equivalent problems, but I do not work in the field. You essentially have the following two problems:
Face recognition: Take a face and try to match it to a person.
Find similar faces: Take a face and try to find similar faces.
Aren't these equivalent? In (1) you start with a picture that you want to match to the owner and you compare it to a database of reference pictures for each person you know. In (2) you pick a picture in your reference database and run (1) for that picture against the other pictures in the database.
Since the algorithms seem to give you a measure of how likely two pictures belong to the same person, in (2) you just sort the measures in decreasing order and pick the top hits.
I assume you should first analyze all the pictures in your database with whatever approach you are using. You should then have a set of metrics for each picture with which you can compare a specific picture and statistically find the closest match.
For example, if you can measure the distance between the eyes, you can find faces that have the same distance. You can then find the face that has the overall closest match and return that.
Imagine we have a simple 2D drawing, filled with lots of non-overlapping circles and only a few stars.
If we are to find all the stars among all these circles, I can think of very few methods. Brute force is one of them. Another is possibly to reduce the image size (to the optimal point where you can still tell the objects apart), then apply brute force and map back to the original image. The drawback of brute force is, of course, that it is very time consuming. I am looking for faster methods, possibly the fastest one.
What is the fastest image processing method to search for the specified item on a simple 2D image?
One typical way of looking for an object in an image is through cross correlation. Basically, you look for the position where the cross-correlation between a mask (the object you're attempting to find) and the image is the highest. That position is the likely location of the object you're trying to find.
For the sake of simplicity, I will refer to the object you're attempting to find as a star, but in general it can be any shape.
Some problems with the above approach:
The size of the mask has to match the size of the star. If you don't know the size of the star, then you will have to try different size masks. Image pyramids are more effective than just iteratively trying different size masks, but still require extra effort.
Similarly, the orientations of the mask and the star have to match. If they don't, the cross-correlation won't work.
For these reasons, the more you know about your problem, the simpler it becomes. This is the reason why people have asked you for more information in the comments. A general purpose solution doesn't really exist, to the best of my knowledge. Maybe someone more knowledgeable can correct me on this.
As you've mentioned, reducing the size of the image will help you reduce the computational time of your approach. In my opinion, it's hardly the core element of a solution -- it's just an optional optimization step.
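To make the cross-correlation approach concrete, here is a small OpenCV sketch using normalized cross-correlation via matchTemplate; handling unknown scale and rotation (the problems listed above) would still require extra work, such as trying several scaled/rotated masks:

import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
mask = cv2.imread("star_mask.png", cv2.IMREAD_GRAYSCALE)  # the shape to find

# Normalized cross-correlation between the mask and every image position
response = cv2.matchTemplate(img, mask, cv2.TM_CCOEFF_NORMED)

# The maximum of the response map is the most likely location of the object
_, max_val, _, max_loc = cv2.minMaxLoc(response)
if max_val > 0.8:  # arbitrary acceptance threshold
    x, y = max_loc  # top-left corner of the best match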
If the shapes are easy to segment from the background, you might be able to compute distinguishing shape/color descriptors. Depending on your problem you could choose descriptors that are invariant to scale, translation or rotation (e.g. compactness, if it is unique to each shape). I do not know if this will be faster, though.
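For instance, compactness can be computed directly from a contour; a small sketch (the threshold for deciding which shape class a value corresponds to would have to come from your data):

import cv2
import numpy as np

def compactness(contour):
    """4*pi*area / perimeter^2: 1.0 for a perfect circle, lower for a star."""
    area = cv2.contourArea(contour)
    perimeter = cv2.arcLength(contour, closed=True)
    return 4 * np.pi * area / (perimeter ** 2) if perimeter > 0 else 0.0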
If you already know the exact shape and have an idea about the size, you might want to have a look at the Generalized Hough Transform, which is basically a formalized description of your "brute force algorithm".
Since you state that the shapes are not overlapping, I assume an efficient algorithm would be able to:
cut out all the shapes by scanning the image in some way (I can imagine a relatively efficient and simple algorithm for convex shapes)
when you are left with the cut-out shapes, use the cross-correlation misha mentioned
You should describe the problem a bit better:
can the shapes be rotated or scaled (or some other transform)?
is the background a uniform colour?
are the shapes a uniform colour?
are the shapes filled?
Depending on the answers to the above questions you might have more or less simple solutions.
Also, maybe this article might be interesting.
If the shapes are very regular, maybe turning them into vectors could fit your needs nicely, but it might be overkill; it really depends on what you want to do later.
Step 1: Thresholding - reduce the image to 1 bit (black or white) if the general image set permits it. [For the type of example you cite, my guess is thresholding would work nicely - leaving enough details to find objects].
Step 2: Optionally do some smoothing/noise removal.
Step 3: Use some clustering approach to gather the foreground objects.
Step 4: Use an appropriate heuristic to identify the objects.
The parameters in steps 1/2 will depend a lot on the type of images as well as experimentation/observation. 3 is usually straightforward if you have worked out 1/2 correctly. 4 will depend very much on the problem (for example, in your case identifying stars - which would depend on what is the actual shape of the stars expected in the images).
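A minimal Python/OpenCV sketch of steps 1-3 (the kernel size and area cutoff are placeholders you would tune for your images; the step-4 heuristic is left as a stub):

import cv2
import numpy as np

img = cv2.imread("drawing.png", cv2.IMREAD_GRAYSCALE)

# Step 1: threshold to a binary image (Otsu picks the threshold automatically)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Step 2: light noise removal
binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))

# Step 3: cluster foreground pixels into objects
n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)

# Step 4: apply your heuristic to each object (here just a size filter stub)
for i in range(1, n):  # label 0 is the background
    if stats[i, cv2.CC_STAT_AREA] < 20:
        continue
    # ... classify the object (star vs circle) here ...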
I am thinking of implementing an image-processing-based solution for an industrial problem.
The image consists of a red rectangle. Inside it I will see a matrix of circles. The requirement is to count the number of circles under the following constraints. (Real application: count the number of bottles in a bottle casing. Any missing bottles?)
The time taken for the operation should be very low.
I need to detect the red rectangle as well. My objective is to count the items in the package, and there is no mechanism (sensors) to trigger the camera. So the camera will need to capture photos continuously, but the program should have a way to discard the unnecessary images.
Processing should be realtime.
There may be "noise" in the image capturing. You may see ovals instead of circles.
My questions are as follows,
What is the best edge detection algorithm that matches the given scenario?
Are there any other mechanisms that I can use other than edge detection?
Does the language I use have a big impact on the performance of the system?
AHH - YOU HAVE NOW TOLD US THE BOTTLES ARE IN FIXED LOCATIONS!
THAT MAKES THE PROBLEM VASTLY EASIER.
All you have to do is look at each of the 12 spots and see if there is a black area there or not. Nothing could be easier.
You do not have to do any edge or shape detection AT ALL.
It's that easy.
You then pointed out that the box might be rotated, things could be jiggled. That the box might be rotated a little (or even a lot, 0 to 360 each time) is very easily dealt with. The fact that the bottles are in "slots" (even if jiggled) massively changes the nature of the problem. Your main problem (which is easy) is waiting until each new red square (crate) is centered under the camera. I just realised you meant "matrix" literally and specifically in the sentence in your original question. That changes everything totally, compared to finding a disordered jumble of circles. Finding whether or not a blob is "on" at one of 12 points is a wildly different problem to "identifying circles in an image". Perhaps you could post an image to wrap up the question.
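A sketch of the slot-check idea in Python/OpenCV, assuming the crate has already been centered/deskewed under the camera so the 12 slot positions are known in advance (the coordinates and thresholds below are placeholders):

import cv2
import numpy as np

# Known (x, y) centers of the 12 slots, measured once from a reference image
SLOT_CENTERS = [(80 + col * 90, 60 + row * 90) for row in range(3) for col in range(4)]
SLOT_RADIUS = 30      # half-size of the patch inspected at each slot
DARK_THRESHOLD = 60   # mean intensity below this counts as a "black area"

def count_bottles(gray_crate_image):
    present = 0
    for (x, y) in SLOT_CENTERS:
        patch = gray_crate_image[y - SLOT_RADIUS:y + SLOT_RADIUS,
                                 x - SLOT_RADIUS:x + SLOT_RADIUS]
        # Flip the comparison if a missing bottle shows up bright instead of dark
        if patch.mean() > DARK_THRESHOLD:
            present += 1
    return present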
Finally I believe Kenny below has identified the best solution: blob analysis.
"Count the number of bottles in a bottle casing"...
Do the individual bottles sit in "slots"? ie, there are 4x3 = 12 holes, one for each bottle.
In other words, you "only" have to determine if there is, or is not, a bottle in each of the 12 holes.
Is that correct?
If so, your problem is incredibly easier than the more general problem of a pile of bottles "anywhere".
Quite simply, where do we see the bottles from? The top, sides, bottom, or? Do we always see the tops/bottoms, or are they mixed (ie, packed top-to-tail). These issues make huge, huge differences.
SURF/SIFT is overkill in this case; you certainly don't need it.
If you want real-time speed (about 20fps+ on an 800x600 image) I recommend using CUDA to implement edge detection using a standard filter scheme like Sobel, then implement binarization + image closing to make sure the edges of circles are not segmented apart.
The hardest part will be fitting circles. This is assuming you already got to the step where you have taken edges and made sure they are connected using image closing (morphology). At this point I would proceed as follows:
run blob analysis/connected components to segment out circles that do not touch. If circles can touch, the next step will be trickier
for each connected component/blob, fit a circle or rectangle using RANSAC, which can run in real time (as opposed to the Hough Transform, which I believe is very hard to run in real time)
Step 2 will be much harder if you cannot segment the connected components that form circles separately, so some additional thought should be invested in how to guarantee that condition.
Good luck.
Edit
Having thought about it some more, I feel like RANSAC is ideal for the case where the circle connected components do touch. RANSAC should hypothetically fit the circle to only a part of the connected component (due to its ability to perform well in the case of mostly outlier points.) This means that you could add an extra check to see if the fitted circle encompasses the entire connected component and if it does not then rerun RANSAC on the portion of the connected component that was left out. Rinse and repeat as many times as necessary.
Also I realize that I say circle but you could just as easily fit an ellipse instead of circles using RANSAC.
Also, I'd like to comment that when I say CUDA is a good choice, I mean CUDA is a good choice to implement the Sobel filter + binarization + image closing on. Connected components and RANSAC are probably best left to the CPU, but you can try pushing them onto CUDA, though I don't know how much of an advantage a GPU will give you for those 2 over a CPU.
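A CPU-only Python/OpenCV sketch of the pipeline up to the connected-components step (kernel sizes and thresholds are illustrative; the RANSAC circle/ellipse fit per component is not shown):

import cv2
import numpy as np

gray = cv2.imread("crate.png", cv2.IMREAD_GRAYSCALE)

# Sobel gradient magnitude as the edge response
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
mag = cv2.magnitude(gx, gy)

# Binarize the edge map, then close small gaps so circle outlines stay connected
edges = (mag > 60).astype(np.uint8) * 255
closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))

# Connected components: each blob is a candidate circle outline
n, labels, stats, centroids = cv2.connectedComponentsWithStats(closed)
for i in range(1, n):
    blob_points = np.column_stack(np.where(labels == i))  # (row, col) pixels
    # ... fit a circle/ellipse to blob_points, e.g. with a RANSAC loop ...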
For the circles, try the Hough transform.
other mechanisms: dunno
Compiled languages will possibly be faster.
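Following up on the Hough transform suggestion, OpenCV has a ready-made circle Hough; the parameters below (expected radii, accumulator thresholds) are guesses you would tune for the bottle images:

import cv2

gray = cv2.imread("crate.png", cv2.IMREAD_GRAYSCALE)
gray = cv2.medianBlur(gray, 5)  # Hough circles are sensitive to noise

circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=40,
                           param1=100, param2=30, minRadius=15, maxRadius=40)

count = 0 if circles is None else circles.shape[1]  # circles: 1 x N x (x, y, r)
print("bottles detected:", count)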
SIFT should have a very good response to circular objects - it is patented, though. GLOH is a similar algorithm, but I do not know if there are any implementations readily available.
Actually, doing some more research, SURF is an improved version of SIFT with quite a few implementations available, check out the links on the wikipedia page.
Sum of colors + convex hull to detect the boundary. You need, mostly, the 4 corners of a rectangle, and not its sides?
No motion, no second camera, a little choice - lot of math methods against a little input (color histograms, color distribution matrix). Dunno.
Java == high memory consumption, Lisp == high brain consumption, C++ == memory/cpu/speed/brain use optimum.
If the contrast is good, blob analysis is the algorithm for the job.
I have some map files consisting of 'polylines' (each line is just a list of vertices) representing tunnels, and I want to try and find the tunnel 'center line' (shown, roughly, in red below).
I've had some success in the past using Delaunay triangulation but I'd like to avoid that method as it does not (in general) allow for easy/frequent modification of my map data.
Any ideas on how I might be able to do this?
An "algorithm" that works well with localized data changes.
The critic's view
The Good
The nice part is that it uses a mixture of image processing and graph operations available in most libraries, may be parallelized easily, is reasonable fast, may be tuned to use a relatively small memory footprint and doesn't have to be recalculated outside the modified area if you store the intermediate results.
The Bad
I wrote "algorithm", in quotes, just because I developed it and it is surely not robust enough to cope with pathological cases. If your graph has a lot of cycles you may end up with some phantom lines. More on this and examples later.
And The Ugly
The ugly part is that you need to be able to flood fill the map, which is not always possible. I posted a comment a few days ago asking if your graphs can be flood filled, but didn't receive an answer. So I decided to post it anyway.
The Sketch
The idea is:
Use image processing to get a fine line of pixels representing the center path
Partition the image into chunks commensurate with the tunnel's thinnest passages
At each partition, represent a point at the "center of mass" of the contained pixels
Use those pixels to represent the Vertices of a Graph
Add Edges to the Graph based on a "near neighbour" policy
Remove spurious small cycles in the induced Graph
End - the remaining Edges represent your desired path
The parallelization opportunity arises from the fact that the partitions may be computed in standalone processes, and the resulting graph may be partitioned to find the small cycles that need to be removed. These factors also allow you to reduce the memory needed by serializing instead of doing calcs in parallel, but I didn't go through this.
The Plot
I'll not provide pseudocode, as the difficult part is just the bits not covered by your libraries. Instead of pseudocode I'll post the images resulting from the successive steps.
I wrote the program in Mathematica, and I can post it if it is of some service to you.
A- Start with a nice flood filled tunnel image
B- Apply a Distance Transformation
The Distance Transformation gives the distance transform of the image, where the value of each pixel is replaced by its distance to the nearest background pixel.
You can see that our desired path is the Local Maxima within the tunnel
C- Convolve the image with an appropriate kernel
The selected kernel is a Laplacian-of-Gaussian kernel of pixel radius 2. It has the magic property of enhancing the gray level edges, as you can see below.
D- Cutoff gray levels and Binarize the image
To get a nice view of the center line!
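A rough Python sketch of steps A-D, assuming you already have the flood-filled tunnel as a binary image (the sigma, the cutoff level and the channel indexing are the knobs to adjust for your images):

import pylab as pl
from scipy import ndimage

# A: flood-filled tunnel as a binary mask (adjust channel indexing to your image)
tunnel = pl.imread('tunnel_filled.png')[..., 0] > 0.5

# B: distance of every tunnel pixel to the nearest background pixel
dist = ndimage.distance_transform_edt(tunnel)

# C: Laplacian-of-Gaussian to enhance the ridge of local maxima
log = ndimage.gaussian_laplace(dist, sigma=2)

# D: cut off and binarize; the ridge (center line) shows up as strongly negative LoG
center_line = log < -0.1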
Comment
Perhaps that is enough for you, as you may know how to transform a thin line into an approximate sequence of piecewise segments. As that is not the case for me, I continued along this path to get the desired segments.
E- Image Partition
Here is when some advantages of the algorithm show up: you may start using parallel processing or decide to process each segment at a time. You may also compare the resulting segments with the previous run and re-use the previous results.
F- Center of Mass detection
All the white points in each sub-image are replaced by only one point at the center of mass
X_CM = (Σ_{i ∈ Points} X_i) / NumPoints
Y_CM = (Σ_{i ∈ Points} Y_i) / NumPoints
The white pixels are difficult to see, but there they are.
G- Graph setup from Vertices
Form a Graph using the selected points as Vertex. Still no Edges.
H- select Candidate Edges
Using the Euclidean distance between points, select candidate edges. A cutoff is used to select an appropriate set of Edges. Here we are using 1.5 times the sub-image size.
As you can see the resulting Graph have a few small cycles that we are going to remove in the next step.
H- Remove Small Cycles
Using a cycle-detection routine we remove the small cycles up to a certain length. The cutoff length depends on a few parameters and you should determine it empirically for your family of graphs.
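A loose Python sketch of steps F-H with networkx; the centroid computation, the 1.5x distance cutoff and the "drop the longest edge of each small cycle" rule are straightforward choices for illustration, not necessarily what was used in the original Mathematica program:

import numpy as np
import networkx as nx

def path_graph(chunks, chunk_size, max_cycle_len=4):
    """chunks: list of (row, col, mask) sub-images laid out on a grid."""
    # F: one vertex per chunk, at the center of mass of its white pixels
    centroids = {}
    for idx, (row, col, mask) in enumerate(chunks):
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            continue
        centroids[idx] = (col * chunk_size + xs.mean(), row * chunk_size + ys.mean())

    # G/H: connect vertices closer than 1.5x the chunk size
    G = nx.Graph()
    G.add_nodes_from(centroids)
    ids = list(centroids)
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            d = np.hypot(*np.subtract(centroids[ids[a]], centroids[ids[b]]))
            if d < 1.5 * chunk_size:
                G.add_edge(ids[a], ids[b], weight=d)

    # H: break small cycles by dropping their longest edge
    for cycle in nx.cycle_basis(G):
        if len(cycle) <= max_cycle_len:
            edges = [e for e in zip(cycle, cycle[1:] + cycle[:1]) if G.has_edge(*e)]
            if edges:
                G.remove_edge(*max(edges, key=lambda e: G.edges[e]["weight"]))
    return G, centroids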
I- That's it!
You can see that the resulting center line is shifted a little bit upwards. The reason is that I'm superimposing images of different type in Mathematica ... and I gave up trying to convince the program to do what I want :)
A Few Shots
As I did the testing, I collected a few images. They are probably the most un-tunnelish things in the world, but my Tunnels-101 went astray.
Anyway, here they are. Remember that I have a displacement of a few pixels upwards ...
HTH !
Update
Just in case you have access to Mathematica 8 (I got it today) there is a new function Thinning. Just look:
This is a pretty classic skeletonization problem; there are lots of algorithms available. Some algorithms work in principle on outline contours, but since almost everyone uses them on images, I'm not sure how available such things will be. Anyway, if you can just plot and fill the sewer outlines and then use a skeletonization algorithm, you could get something close to the midline (within pixel resolution).
Then you could walk along those lines and do a binary search with circles until you hit at least two separate line segments (three if you're at a branch point). The midpoint of the two spots you first hit, or the center of a circle touching the three points you first hit, is a good estimate of the center.
Well, in Python, using the skimage package, it is an easy task, as follows.
import pylab as pl
from skimage import morphology as mp
tun = 1-pl.imread('tunnel.png')[...,0] #your tunnel image
skl = mp.medial_axis(tun) #skeleton
pl.subplot(121)
pl.imshow(tun,cmap=pl.cm.gray)
pl.subplot(122)
pl.imshow(skl,cmap=pl.cm.gray)
pl.show()