3D surface reconstruction from a sparse point cloud - point-clouds

I have an equipment which performs radial scan. It scans an object along the green lines showing in the image(three lines in the image, but the equipment can perform more).
Thus, I can get a point cloud, which contains points from the upper and lower surfaces of the object. And I want to make surface reconstruction using the point cloud.
I load the point cloud file(.txt format, xyz coordinates of each point) into Meshlab and it shows like this(yellow points are the points in the point cloud):
I then followed a blog teaching simple usage of Meshlab and clicked "Fiter——>Normals,Curtavures and Oreientation——>Smooths normals on a point set" and "Fiter——>Remeshing Simplication and Reconstruction——>Surface Reconstruction:Ball Pivoting". (both default setting)
However, the results was not I want:
It connects points within a scanned image, but the surface should be reconstructed by connecting points between adjacent scanned images.
I can think about two possible reasons:(1) I did not choose right setting in Meshlab. If so, which setting parameters can make reconstruction for the point cloud. (2)My point cloud is too sparse and I need to interpolate the point cloud to make it have more points and which interpolation method should I use?
————————————————————EDIT———————————————————
The normals computed in this image are computed using Neighbour num 10 and smooth iteration 8.
And, the normals computed in this image are computed using Neighbour num 60 and smooth iteration 8. When Neighbour num is greater than 20, normals are similar to the image below.

It is possible that you are having problems due to inequality directional sampling of the surface. You have high density of points in one direction, then a big gap in the other direction. This is a problem due to 'Compute normals for point sets' filter uses a parameter with the number of Neighbours to each point, so the computer normal will be byassed because there is unlikely to find neighbours in a different scanline.
So my proposal is (I will reuse parts from this tutorial )
Point Cloud Simplification and Normals Computation
Start by increasing the number of orientations in the scan. You want to fill those gaps.
If you need to reduce the number of point samples in the center of the object to reduce noise, go to Filters -> Point Set -> Point Cloud Simplification. Make sure Best Sample Heuristic is checked.
After point cloud simplification, make sure to select Simplified point cloud in the Show Layer Dialog on the right hand side. If not visible, it can be opened by navigating to View -> Show Layer Dialog. Now we need to compute normals for point set.
So go to Filters -> Point Set -> Compute normals for point sets . Enter Neighbour num between 10 - 100. Initially try with 10 and try to get a mesh and later see if this can be improved by increasing the neighbour number. For Smooth Iteration initially try with 0 value and may be later it can be tried with values between 5 - 10. I mostly use value 8.
Make sure if your normals are properly computed by going to Render -> Show Normal.
Meshing / Poisson Surface Reconstruction
Next we are going to use Poisson Surface reconstruction to do meshing.
So go to Filters ->Remeshing, Simplification and Reconstruction -> Screened Poisson Surface Reconstruction. Initially try with default parameters then later one can play around with reconstruction depth, number of samples and interpolation weight values.
This will create another mesh layer called Poisson in the Show layer Dialog which has surfaces now. Make sure to select that to peform further operations.
One can observe that it has also created some extra surfaces. To remove them go to Filters -> Selection -> Select Faces with edges longer than .... By default the value is automatically computed, just click on apply. Then click on delete face button (triangle face and three vertex with a cross over it). This will remove extra surfaces.
After this operation, still some noise faces can be seen. To remove them go to Filters -> Cleaning and Repairing -> Remove isolated pieces (wrt Face Num.). Use the default value and make sure Remove unreferenced vertices is checked. This will remove some noise faces.
Even after the above operation some noise faces are seen. To remove them go to Filters -> Selection -> Select non Manifold Vertices. Click apply. Then click on delete face button (triangle and threwe vertex with a cross over it). This will remove remaining extra faces.

Related

Future prospects for improvement of depth data on Project Tango tablet

I am interested in using the Project Tango tablet for 3D reconstruction using arbitrary point features. In the current SDK version, we seem to have access to the following data.
A 1280 x 720 RGB image.
A point cloud with 0-~10,000 points, depending on the environment. This seems to average between 3,000 and 6,000 in most environments.
What I really want is to be able to identify a 3D point for key points within an image. Therefore, it makes sense to project depth into the image plane. I have done this, and I get something like this:
The problem with this process is that the depth points are sparse compared to the RGB pixels. So I took it a step further and performed interpolation between the depth points. First, I did Delaunay triangulation, and once I got a good triangulation, I interpolated between the 3 points on each facet and got a decent, fairly uniform depth image. Here are the zones where the interpolated depth is valid, imposed upon the RGB iamge.
Now, given the camera model, it's possible to project depth back into Cartesian coordinates at any point on the depth image (since the depth image was made such that each pixel corresponds to a point on the original RGB image, and we have the camera parameters of the RGB camera). However, if you look at the triangulation image and compare it to the original RGB image, you can see that depth is valid for all of the uninteresting points in the image: blank, featureless planes mostly. This isn't just true for this single set of images; it's a trend I'm seeing for the sensor. If a person stands in front of the sensor, for example, there are very few depth points within their silhouette.
As a result of this characteristic of the sensor, if I perform visual feature extraction on the image, most of the areas with corners or interesting textures fall in areas without associated depth information. Just an example: I detected 1000 SIFT keypoints from an an RGB image from an Xtion sensor, and 960 of those had valid depth values. If I do the same thing to this system, I get around 80 keypoints with valid depth. At the moment, this level of performance is unacceptable for my purposes.
I can guess at the underlying reasons for this: it seems like some sort of plane extraction algorithm is being used to get depth points, whereas Primesense/DepthSense sensors are using something more sophisticated.
So anyway, my main question here is: can we expect any improvement in the depth data at a later point in time, through improved RGB-IR image processing algorithms? Or is this an inherent limit of the current sensor?
I am from the Project Tango team at Google. I am sorry you are experiencing trouble with depth on the device. Just so that we are sure your device is in good working condition, can you please test the depth performance against a flat wall. Instructions are as below:
https://developers.google.com/project-tango/hardware/depth-test
Even with a device in good working condition, the depth library is known to return sparse depth points on scenes with low IR reflectance objects, small sized objects, high dynamic range scenes, surfaces at certain angles and objects at distances larger than ~4m. While some of these are inherent limitations in the depth solution, we are working with the depth solution provider to bring improvements wherever possible.
Attached an image of a typical conference room scene and the corresponding point cloud. As you can see, 1) no depth points are returned from the laptop screen (low reflectance), the table top objects such as post-its, pencil holder etc (small object sizes), large portions of the table (surface at an angles), room corner at the far right (distance >4m).
But as you move around the device, you will start getting depth point returns. Accumulating depth points is a must to get denser point clouds.
Please also keep us posted on your findings at project-tango-hardware-support#google.com
In my very basic initial experiments, you are correct with respect to depth information returned from the visual field, however, the return of surface points is anything but constant. I find as I move the device I can get major shifts in where depth information is returned, i.e. there's a lot of transitory opacity in the image with respect to depth data, probably due to the characteristics of the surfaces.
So while no return frame is enough, the real question seems to be the construction of a larger model (point cloud to open, possibly voxel spaces as one scales up) to bring successive scans into a common model. It's reminiscent of synthetic aperture algorithms in spirit, but the letters in the equations are from a whole different set of laws.
In short, I think a more interesting approach is to synthesize a more complete model by successive accumulation of point cloud data - now, for this to work, the device team has to have their dead reckoning on the money for whatever scale this is done. Also this addresses an issue that no sensor improvements can address - if your visual sensor is perfect, it still does nothing to help you relate the sides of an object at least be in the close neighborhood of the front of the object.

distinguishing objects with opencv

I want to identify lego bricks for building a lego sorting machine (I use c++ with opencv).
That means I have to distinguish between objects which look very similar.
The bricks are coming to my camera individually on a flat conveyer. But they might lay in any possible way: upside down, on the side or "normal".
My approach is to teach the sorting machine the bricks by taping them with the camera in lots of different positions and rotations. Features of each and every view are calculated by surf-algorythm.
void calculateFeatures(const cv::Mat& image,
std::vector<cv::KeyPoint>& keypoints,
cv::Mat& descriptors)
{
// detector == cv::SurfFeatureDetector(10)
detector->detect(image,keypoints);
// extractor == cv::SurfDescriptorExtractor()
extractor->compute(image,keypoints,descriptors);
}
If there is an unknown brick (the brick that i want to sort) its features also get calculated and matched with known ones.
To find wrongly matched features I proceed as described in the book OpenCV 2 Cookbook:
with the matcher (=cv::BFMatcher(cv::NORM_L2)) the two nearest neighbours in both directions are searched
matcher.knnMatch(descriptorsImage1, descriptorsImage2,
matches1,
2);
matcher.knnMatch(descriptorsImage2, descriptorsImage1,
matches2,
2);
I check the ratio between the distances of the found nearest neighbours. If the two distances are very similar it's likely that a false value is used.
// loop for matches1 and matches2
for(iterator matchIterator over all matches)
if( ((*matchIterator)[0].distance / (*matchIterator)[1].distance) > 0.65 )
throw away
Finally only symmatrical match-pairs are accepted. These are matches in which not only n1 is the nearest neighbour to feature f1, but also f1 is the nearest neighbour to n1.
for(iterator matchIterator1 over all matches)
for(iterator matchIterator2 over all matches)
if ((*matchIterator1)[0].queryIdx == (*matchIterator2)[0].trainIdx &&
(*matchIterator2)[0].queryIdx == (*matchIterator1)[0].trainIdx)
// good Match
Now only pretty good matches remain. To filter out some more bad matches I check which matches fit the projection of img1 on img2 using the fundamental matrix.
std::vector<uchar> inliers(points1.size(),0);
cv::findFundamentalMat(
cv::Mat(points1),cv::Mat(points2), // matching points
inliers,
CV_FM_RANSAC,
3,
0.99);
std::vector<cv::DMatch> goodMatches
// extract the surviving (inliers) matches
std::vector<uchar>::const_iterator itIn= inliers.begin();
std::vector<cv::DMatch>::const_iterator itM= allMatches.begin();
// for all matches
for ( ;itIn!= inliers.end(); ++itIn, ++itM)
if (*itIn)
// it is a valid match
The result is pretty good. But in cases of extreme alikeness faults still occur.
In the picture above you can see that a similar brick is recognized well.
However in the second picture a wrong brick is recognized just as well.
Now the question is how I could improve the matching.
I had two different ideas:
The matches in the second picture trace back to the features really fitting, but only if the visual field is intensely changed. To recognize a brick I have to compare it in many different positions anyway (at least as shown in figure three). This means I know that I am only allowed to minimally change the visual field. The information how intensely the visual field is changed should be hidden in the fundamental matrix. How can I read out of this matrix how far the position in the room has changed? Especially the rotation and strong scaling should be of interest; if the brick once is taped farer on the left side this shouldn't matter.
Second idea:
I calculated the fundamental matrix out of 2 pictures and filtered out features that don't fit the projections - shouldn't there be a way to do the same using three or more pictures? (keyword Trifocal tensor). This way the matching should become more stable. But I neither know how to do this using OpenCV nor could I find any information on this on google.
I don't have a complete answer, but I have a few suggestions.
On the image analysis side:
It looks like your camera setup is pretty constant. Easy to just separate the brick from the background. I also see your system finding features in the background. This is unnecessary. Set all non-brick pixels to black to remove them from the analysis.
When you have located just the brick, your first step should be to just filter likely candidates based on the size (i.e. number of pixels) in the brick. That way the example faulty match you show is already less likely.
You can take other features into account such as the aspect ratio of the bounding box of the brick, the major and minor axes (eigevectors of the covariance matrix of the central moments) of the brick etc.
These simpler features will give you a reasonable first filter to limit your search space.
On the mechanical side:
If bricks are actually coming down a conveyor you should be able to "straighten" the bricks along a straight edge using something like a rod that lies at an angle to the direction of the conveyor across the belt so that the bricks arrive more uniformly at your camera like so.
Similar to the previous point, you could use something like a very loose brush suspended across the belt to topple bricks standing up as they pass.
Again both these points will limit your search space.

Tracking user defined points with OpenCV

I'm working on a project where I need to track two points in an image. So far, the best way I have of identifying these points is to get the user to click on them when the program is first run. I'm using the Lucas-Kanade Pyramid method built into OpenCV (documented here, but as is to be expected, this doesn't work too well. Is there a better alternative algorithm for tracking points in OpenCV, or alternatively some other way of verifying the points I already have?
I'm currently considering using GoodFeaturesToTrack, and getting the distance from each point to the one that I want to track, and maybe some sort of vector pointing out the relationship between the two points, and using this information to determine my new point.
I'm looking for suggestions of ways to go about this, not necessarily code samples.
Thanks
EDIT: I'm tracking small movements, if that helps
If you look for a solution that is implemented in opencv the pyramidal Lucas Kanade (PLK) method is quit good, else I would prefer a Particle Filter based tracker.
To improve your tracking performance with the PLK be sure that you have set up the parameters correctly. E.g. for large motion you need a level at ca. 3 or 4. The window should not be to small ( I prefer 17x17 to 27x27). Also keep in mind that the methods needs textured areas to be able to track the points. That means corner like image content (aperture problem).
I would propose to seed a set of points (ps) in a grid around the points (P) you want to track. And than use a foreward - backward threshold to reject falsly tracked points. The motion of your points (P) will be computed by the mean motion of the particular residual point sets (ps).
The foreward backward confidence is computes by estimating the motion from frame 1 to frame 2. (ptList1 -> ptList2). And that from frame 2 to frame 1 with the points of ptList2 (ptList2 -> ptListRef). Motion vectors will be rejected if (|| ptRef - pt1 || > fb_threshold).

Best approach for specific Object/Image Recognition task?

I'm searching for an certain object in my photograph:
Object: Outline of a rectangle with an X in the middle. It looks like a rectangular checkbox. That's all. So, no fill, just lines. The rectangle will have the same ratios of length to width but it could be any size or any rotation in the photograph.
I've looked a whole bunch of image recognition approaches. But I'm trying to determine the best for this specific task. Most importantly, the object is made of lines and is not a filled shape. Also, there is no perspective distortion, so the rectangular object will always have right angles in the photograph.
Any ideas? I'm hoping for something that I can implement fairly easily.
Thanks all.
You could try using a corner detector (e.g. Harris) to find the corners of the box, the ends and the intersection of the X. That simplifies the problem to finding points in the right configuration.
Edit (response to comment):
I'm assuming you can find the corner points in your image, the 4 corners of the rectangle, the 4 line endings of the X and the center of the X, plus a few other corners in the image due to noise or objects in the background. That simplifies the problem to finding a set of 9 points in the right configuration, out of a given set of points.
My first try would be to look at each corner point A. Then I'd iterate over the points B close to A. Now if I assume that (e.g.) A is the upper left corner of the rectangle and B is the lower right corner, I can easily calculate, where I would expect the other corner points to be in the image. I'd use some nearest-neighbor search (or a library like FLANN) to see if there are corners where I'd expect them. If I can find a set of points that matches these expected positions, I know where the symbol would be, if it is present in the image.
You have to try if that is good enough for your application. If you have too many false positives (sets of corners of other objects that accidentially form a rectangle + X), you could check if there are lines (i.e. high contrast in the right direction) where you would expect them. And you could check if there is low contrast where there are no lines in the pattern. This should be relatively straightforward once you know the points in the image that correspond to the corners/line endings in the object you're looking for.
I'd suggest the Generalized Hough Transform. It seems you have a fairly simple, fixed shape. The generalized Hough transform should be able to detect that shape at any rotation or scale in the image. You many need to threshold the original image, or pre-process it in some way for this method to be useful though.
You can use local features to identify the object in image. Feature detection wiki
For example, you can calculate features on some referent image which contains only the object you're looking for and save the results, let's say, to a plain text file. After that you can search for the object just by comparing newly calculated features (on images with some complex scenes containing the object) with the referent ones.
Here's some good resource on local features:
Local Invariant Feature Detectors: A Survey

Find tunnel 'center line'?

I have some map files consisting of 'polylines' (each line is just a list of vertices) representing tunnels, and I want to try and find the tunnel 'center line' (shown, roughly, in red below).
I've had some success in the past using Delaunay triangulation but I'd like to avoid that method as it does not (in general) allow for easy/frequent modification of my map data.
Any ideas on how I might be able to do this?
An "algorithm" that works well with localized data changes.
The critic's view
The Good
The nice part is that it uses a mixture of image processing and graph operations available in most libraries, may be parallelized easily, is reasonable fast, may be tuned to use a relatively small memory footprint and doesn't have to be recalculated outside the modified area if you store the intermediate results.
The Bad
I wrote "algorithm", in quotes, just because I developed it and surely is not robust enough to cope with pathological cases. If your graph has a lot of cycles you may end up with some phantom lines. More on this and examples later.
And The Ugly
The ugly part is that you need to be able to flood fill the map, which is not always possible. I posted a comment a few days ago asking if your graphs can be flood filled, but didn't receive an answer. So I decided to post it anyway.
The Sketch
The idea is:
Use image processing to get a fine line of pixels representing the center path
Partition the image in chunks commensurated to the tunnel thinnest passages
At each partition, represent a point at the "center of mass" of the contained pixels
Use those pixels to represent the Vertices of a Graph
Add Edges to the Graph based on a "near neighbour" policy
Remove spurious small cycles in the induced Graph
End- The remaining Edges represent your desired path
The parallelization opportunity arises from the fact that the partitions may be computed in standalone processes, and the resulting graph may be partitioned to find the small cycles that need to be removed. These factors also allow to reduce the memory needed by serializing instead of doing calcs in parallel, but I didn't go trough this.
The Plot
I'll no provide pseudocode, as the difficult part is just that not covered by your libraries. Instead of pseudocode I'll post the images resulting from the successive steps.
I wrote the program in Mathematica, and I can post it if is of some service to you.
A- Start with a nice flood filled tunnel image
B- Apply a Distance Transformation
The Distance Transformation gives the distance transform of image, where the value of each pixel is replaced by its distance to the nearest background pixel.
You can see that our desired path is the Local Maxima within the tunnel
C- Convolve the image with an appropriate kernel
The selected kernel is a Laplacian-of-Gaussian kernel of pixel radius 2. It has the magic property of enhancing the gray level edges, as you can see below.
D- Cutoff gray levels and Binarize the image
To get a nice view of the center line!
Comment
Perhaps that is enough for you, as you ay know how to transform a thin line to an approximate piecewise segments sequence. As that is not the case for me, I continued this path to get the desired segments.
E- Image Partition
Here is when some advantages of the algorithm show up: you may start using parallel processing or decide to process each segment at a time. You may also compare the resulting segments with the previous run and re-use the previous results
F- Center of Mass detection
All the white points in each sub-image are replaced by only one point at the center of mass
XCM = (Σ i∈Points Xi)/NumPoints
YCM = (Σ i∈Points Yi)/NumPoints
The white pixels are difficult to see (asymptotically difficult with param "a" age), but there they are.
G- Graph setup from Vertices
Form a Graph using the selected points as Vertex. Still no Edges.
H- select Candidate Edges
Using the Euclidean Distance between points, select candidate edges. A cutoff is used to select an appropriate set of Edges. Here we are using 1.5 the subimagesize.
As you can see the resulting Graph have a few small cycles that we are going to remove in the next step.
H- Remove Small Cycles
Using a Cycle detection routine we remove the small cycles up to a certain length. The cutoff length depends on a few parms and you should figure it empirically for your graphs family
I- That's it!
You can see that the resulting center line is shifted a little bit upwards. The reason is that I'm superimposing images of different type in Mathematica ... and I gave up trying to convince the program to do what I want :)
A Few Shots
As I did the testing, I collected a few images. They are probably the most un-tunnelish things in the world, but my Tunnels-101 went astray.
Anyway, here they are. Remember that I have a displacement of a few pixels upwards ...
HTH !
.
Update
Just in case you have access to Mathematica 8 (I got it today) there is a new function Thinning. Just look:
This is a pretty classic skeletonization problem; there are lots of algorithms available. Some algorithms work in principle on outline contours, but since almost everyone uses them on images, I'm not sure how available such things will be. Anyway, if you can just plot and fill the sewer outlines and then use a skeletonization algorithm, you could get something close to the midline (within pixel resolution).
Then you could walk along those lines and do a binary search with circles until you hit at least two separate line segments (three if you're at a branch point). The midpoint of the two spots you first hit, or the center of a circle touching the three points you first hit, is a good estimate of the center.
Well in Python using package skimage it is an easy task as follows.
import pylab as pl
from skimage import morphology as mp
tun = 1-pl.imread('tunnel.png')[...,0] #your tunnel image
skl = mp.medial_axis(tun) #skeleton
pl.subplot(121)
pl.imshow(tun,cmap=pl.cm.gray)
pl.subplot(122)
pl.imshow(skl,cmap=pl.cm.gray)
pl.show()

Resources