Planes and other artifacts In Tango Point Cloud - google-project-tango

In my app I update mPointCloudManager on every onXyzIjAvailable callback. When a button is pressed, I save the TangoXyzIjData and the corresponding TangoMatrixTransformData.
Here's the code:
TangoXyzIjData xyzIjData = mPointCloudManager.getLatestXyzIj();
TangoSupport.TangoMatrixTransformData matrix =
        TangoSupport.getMatrixTransformAtTime(
                xyzIjData.timestamp,
                TangoPoseData.COORDINATE_FRAME_AREA_DESCRIPTION,
                TangoPoseData.COORDINATE_FRAME_CAMERA_DEPTH,
                TangoSupport.TANGO_SUPPORT_ENGINE_OPENGL,
                TangoSupport.TANGO_SUPPORT_ENGINE_TANGO);
Then I apply the matrix transformation to the points inside TangoXyzIjData and save them to a file.
This results in a reconstruction with decent precision, but it also comes with some strange artifacts floating in the air.
Here's a screenshot of a point cloud of the front of my bin. You can see that the bin is well formed, but there are some planes floating in front of it.
Point cloud screenshot
Where do these artifacts come from?
Am I doing something wrong somewhere, or is it just a Tango limitation?
Is there any technique to get rid of those planes?
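For reference, the transform step described in the question could look roughly like the sketch below. It is not the asker's actual code. It assumes the points have already been copied into a float[] of packed XYZ triples and that the transform is available as 16 floats in column-major order (the layout an OpenGL-style matrix uses). The class and method names are made up for illustration.

```java
/** Sketch: applies a 4x4 column-major (OpenGL-style) transform to packed XYZ points. */
final class PointCloudTransform {

    /**
     * @param m   16 floats in column-major order (assumed layout)
     * @param xyz packed points: x0, y0, z0, x1, y1, z1, ...
     * @return a new array with every point transformed by m
     */
    static float[] transformPoints(float[] m, float[] xyz) {
        if (m.length != 16 || xyz.length % 3 != 0) {
            throw new IllegalArgumentException("expected 16 matrix floats and packed XYZ triples");
        }
        float[] out = new float[xyz.length];
        for (int i = 0; i < xyz.length; i += 3) {
            float x = xyz[i];
            float y = xyz[i + 1];
            float z = xyz[i + 2];
            // Column-major: element (row r, column c) is m[c * 4 + r]; w is taken as 1.
            out[i]     = m[0] * x + m[4] * y + m[8]  * z + m[12];
            out[i + 1] = m[1] * x + m[5] * y + m[9]  * z + m[13];
            out[i + 2] = m[2] * x + m[6] * y + m[10] * z + m[14];
        }
        return out;
    }
}
```

If the matrix were row-major instead, the same code would apply the transposed transform, which is one thing worth ruling out when the saved points look wrong.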

Related

Rotation and translation in 3D reconstruction using 2D images

I am trying to do 3D model reconstruction using 2D images from different views. I am following this example code from Matlab to get the desired results:
Structure From Motion From Two Views.
Following are the test images taken from the camera:
Manually taken images of 1st and 2nd image with translation of 1cm:
Overlay with matched features of first and second image:
Manually taken images of 1st and 2nd image with translation of 2cm:
Overlay with matched features of first and second image:
These are the translation vectors and rotation matrices I get for each case:
1cm translation:
translation vector:[0.0245537412606279 -0.855696925927505 -0.516894461905255]
rotation matrix:
[0.999958322438693 0.00879926762261436 0.00243439415451741;
-0.00887800587357739 0.999365801035702 0.0344844418829408;
-0.00212941243132160 -0.0345046172211024 0.999402269855899]
2cm translation:
translation vector:[-0.215835469166982 -0.228607603749042 -0.949291111175908]
rotation matrix:
[0.999989695803078 -0.00104036790630347 -0.00441881457943975;
0.00149220346018613 0.994626852476622 0.103514238930121;
0.00428737874479779 -0.103519766069424 0.994618156086259]
The documentation says these are the relative rotation and translation between the two images.
But I am unable to understand what these numbers mean and what units the above values are in.
Can anyone at least tell me what units the translation and rotation are in, or how to extract a rotation and translation that are comparable to real-world values such as cm/mm and radians/degrees respectively?
You can convert the rotation matrix into an axis-angle representation, which gives you the angle in radians. This can be done using the vrrotmat2vec function, or by implementing the conversion yourself by following this if you don't have access to the package.
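If you want to implement the conversion yourself, a minimal sketch could look like this. It uses the standard relation trace(R) = 1 + 2*cos(angle). The class name is made up, and the axis formula assumes the matrix rotates column vectors (v' = R*v). If your toolbox multiplies row vectors instead, the axis direction comes out flipped, but the angle is the same.

```java
/** Sketch: converts a 3x3 rotation matrix to axis-angle form. */
final class AxisAngle {

    /** Rotation angle in radians, from trace(R) = 1 + 2*cos(angle). */
    static double angle(double[][] r) {
        double cos = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0;
        // Clamp to guard against rounding pushing the value outside [-1, 1].
        return Math.acos(Math.max(-1.0, Math.min(1.0, cos)));
    }

    /** Unit rotation axis; only meaningful when the angle is not close to 0 or pi. */
    static double[] axis(double[][] r) {
        double x = r[2][1] - r[1][2];
        double y = r[0][2] - r[2][0];
        double z = r[1][0] - r[0][1];
        double n = Math.sqrt(x * x + y * y + z * z);
        return new double[] {x / n, y / n, z / n};
    }
}
```

Applied to the matrices in the question, the angle comes out at roughly 0.036 rad (about 2 degrees) for the 1cm case and roughly 0.10 rad (about 6 degrees) for the 2cm case. Use Math.toDegrees on the result if you prefer degrees.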
When it comes to translation, however, you won't get it in a unit that makes sense in the real world, since you don't know the scale. Both translation vectors you posted have length 1, so they only give the direction of the motion. This is unfortunately a problem with structure from motion in general: it is impossible to know whether you took an image close to something small or far away from something large.
When using structure from motion to construct a 3D model this is fortunately not a problem, since you still get relative distances correctly. Therefore you will be able to capture the scene (by following the rest of the tutorial), but you won't be able to say if something is 2cm or 2km tall unless you have something in the image whose real-life size you know.
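To make the scale point concrete, here is a small sketch of how one known real-world length can be used to rescale an up-to-scale result. The class and method names are hypothetical. The reference length has to come from you, for example the measured size of an object visible in the scene.

```java
/** Sketch: rescales an up-to-scale reconstruction using one known real-world length. */
final class MetricScale {

    /** Euclidean length of a 3-vector. */
    static double norm(double[] v) {
        return Math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
    }

    /** Factor that converts reconstruction units into real-world units. */
    static double scaleFactor(double knownRealLength, double reconstructedLength) {
        return knownRealLength / reconstructedLength;
    }

    /** Returns v multiplied by the given factor. */
    static double[] scaled(double[] v, double factor) {
        return new double[] {v[0] * factor, v[1] * factor, v[2] * factor};
    }
}
```

Calling norm on either translation vector from the question returns a value very close to 1, which is why those numbers cannot be read as centimetres. If an object that is 10 cm long in reality measures 2.5 units in the reconstruction, scaleFactor gives 4, and multiplying the translation and all 3D points by 4 puts them in centimetres.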
Hope it helps :)

Transforming and registering point clouds

I’m starting to develop with Project Tango API.
I need to save PointCloud data that I get in the event OnXyzIjAvailable;
to do this, I started from your example "PointCloudJava" and wrote PointCloud coordinates in single files (an AsyncTask is started for this purpose).
So I have one file with xyz for each event. On the same event I get the corresponding transformation matrix (mRenderer.getModelMatCalculator().getPointCloudModelMatrixCopy()).
Point clouds
Then I imported all this data (the xyz point clouds with their corresponding transformation matrices, with each matrix applied to its point cloud), but the point clouds don't match exactly; they are close to each other but do not overlap exactly.
My questions are:
- Why don't the individual point clouds match?
- What should I do to make them match?
I have also noticed the following, which is probably related to the problem above: when I use the Project Tango Explorer application (Area Learning), I can see my position, but it is constantly in motion even if I don't move.
What is the problem? Is a calibration necessary?
Device Information
Poses delivered by Tango have a non-negligible amount of drift. Here is a sample graph of pose position when my tablet was in its stand observing a static scene (ideally the traces should be flat):
Coupling this drift with tracking errors while the device is actually moving produces noticeable registration issues. I see this especially when the device is rolled, i.e. rotated about the view axis. The raw pose quality may be sufficient for some applications (e.g. location) but causes problems for others (e.g. 3D scanning, seamless augmented reality).
I was disappointed when I saw this. But if Tango is attempting to measure motion by using the fisheye camera to correct inertial motion prediction - and not by using stereo vision between the fisheye and color cameras - then that is a really hard problem. And the reason for doing that would be to stay within CPU/GPU/RAM/latency/battery budgets to leave something for applications. So after consideration, while I remain disappointed, I can understand it.
I am hopeful that Tango will improve their pose algorithm over time, but I suspect that applications that depend on precise tracking will still have to add their own corrections, e.g. via stereo, structure from motion, point cloud correlation, etc.
Point clouds should be viewed as statistically accurate, not exactly accurate. There is a distance estimation error range that is a function of distance and surface characteristics, so a Tango fixed in a specific location will not return a constant point cloud. Rotating the device can cause apparent drift, but it isn't really drift: the error is simply rotating along with the Tango.

Getting closest target image in Vuforia

I'm using the video playback Vuforia example for building an app. When the app recognises multiple image targets I would like to know which is the closest one to the centre of the screen (which is supposed to be my camera view). In the source code I have found this line:
const QCAR::Matrix34F & QCAR::TrackableResult::getPose()
which gives me a 3x4 pose matrix of the target. How can I use this matrix to extract this information?
thanks
This Vuforia Knowledge database article explains the meaning of the pose matrix in detail; you should probably have a look at it.
To make it short, the pose matrix is a 3x4 matrix whose last column is the translation vector <x,y,z> from the camera to the detected target. The "closest target to the center of the screen" should thus be the one with the smallest <x,y> vector.
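The selection described above could be sketched like this. It is written in Java for illustration rather than against the QCAR C++ API. It assumes each pose has been copied into a 12-element array stored row by row, so that the translation column sits at indices 3, 7 and 11. Check that layout against your SDK version. The class and method names are made up.

```java
/** Sketch: picks the target whose pose translation has the smallest (x, y) offset. */
final class ClosestTarget {

    /**
     * @param poses each entry is a 3x4 pose matrix stored row-major in 12 floats
     *              (assumed layout), so the translation is at indices 3, 7 and 11
     * @return index of the pose closest to the camera axis, or -1 if there are none
     */
    static int closestToCenter(float[][] poses) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < poses.length; i++) {
            double x = poses[i][3];
            double y = poses[i][7];
            double dist = x * x + y * y; // squared length is enough for comparison
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }
}
```

Because of perspective, a far-away target with a given (x, y) offset appears nearer to the screen centre than a close one with the same offset. If that matters for your targets, comparing x/z and y/z instead of x and y may match what you see on screen more closely.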
Hope this helps!

3d model construction using multiple images from multiple points (kinect)

Is it possible to construct a 3D model of a still object if various images, along with depth data, are gathered from various angles? What I was thinking of is a sort of circular conveyor belt with a Kinect placed on it, while the real object that is to be reconstructed in 3D space sits in the middle. The conveyor belt then rotates around the object in a circle and lots of images are captured (perhaps 10 images per second), which would allow the Kinect to capture an image from every angle, including the depth data. Theoretically this is possible. The model would also have to be recreated with its textures.
What I would like to know is:
Whether there are any similar projects/software already available (any links would be appreciated)
Whether this is possible within perhaps 6 months
How I would proceed to do this, e.g. any similar algorithm you could point me to
Thanks,
MilindaD
It is definitely possible, and there are a lot of working 3D scanners out there that use more or less the same principle of stereoscopy.
You probably know this, but just to contextualize: the idea is to get two images of the same point and use triangulation to compute the 3D coordinates of that point in your scene. Although this part is quite easy, the big issue is finding the correspondence between the points in your two images, and this is where you need good software to extract and recognize similar points.
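As a minimal illustration of the triangulation step, here is a sketch for the simplest case: two identical cameras side by side with parallel optical axes (a rectified stereo pair). For a matched point, the depth is the focal length times the baseline divided by the disparity. This setup is an assumption for the example, not something the Kinect requires you to do yourself. The class name is made up.

```java
/** Sketch: depth of a matched point from a rectified stereo pair. */
final class StereoDepth {

    /**
     * @param focalLengthPx focal length in pixels
     * @param baseline      distance between the two camera centres (any length unit)
     * @param disparityPx   horizontal shift of the point between the two images, in pixels
     * @return depth in the same unit as the baseline
     */
    static double depth(double focalLengthPx, double baseline, double disparityPx) {
        if (disparityPx <= 0) {
            throw new IllegalArgumentException("disparity must be positive");
        }
        return focalLengthPx * baseline / disparityPx;
    }
}
```

The formula also shows why correspondence is the hard part: a matching error of a single pixel changes the disparity, and for distant points (small disparity) that changes the computed depth a lot.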
There is an open-source project called MeshLab for 3D vision, which includes 3D reconstruction* algorithms. I don't know the details of the algorithms, but the software is definitely a good entry point if you want to play with 3D.
I used to know some other ones, I will try to find them and add them here:
Insight3d
(*Wiki page has no content, redirects to login for editing)
Check out https://bitbucket.org/tobin/kinect-point-cloud-demo/overview which is a code sample for the Kinect for Windows SDK that does specifically this. Currently it uses the bitmaps captured by the depth sensor and iterates through the byte array to create a point cloud in PLY format that can be read by MeshLab. The next stage for us is to apply/refine a Delaunay triangulation algorithm to form a mesh instead of points, to which a texture can be applied. A third stage would then be a mesh merging formula to combine multiple captures from the Kinect to form a full 3D object mesh.
This is based on some work I did in June using Kinect for the purposes of 3D printing capture.
The .NET code in this source code repository will however get you started with what you want to achieve.
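To give an idea of what the PLY output mentioned above looks like, here is a small sketch that writes packed XYZ points as an ASCII PLY document. It is in Java and is not taken from the linked repository. It only covers vertex positions (no colours or faces), and the class name is made up.

```java
/** Sketch: serialises packed XYZ points as an ASCII PLY document. */
final class PlyWriter {

    /**
     * @param xyz packed points: x0, y0, z0, x1, y1, z1, ...
     * @return the PLY file contents as a string
     */
    static String toAsciiPly(float[] xyz) {
        if (xyz.length % 3 != 0) {
            throw new IllegalArgumentException("expected packed XYZ triples");
        }
        int count = xyz.length / 3;
        StringBuilder sb = new StringBuilder();
        // Header: declares the format, the number of vertices and their properties.
        sb.append("ply\n")
          .append("format ascii 1.0\n")
          .append("element vertex ").append(count).append('\n')
          .append("property float x\n")
          .append("property float y\n")
          .append("property float z\n")
          .append("end_header\n");
        // Body: one vertex per line.
        for (int i = 0; i < xyz.length; i += 3) {
            sb.append(xyz[i]).append(' ')
              .append(xyz[i + 1]).append(' ')
              .append(xyz[i + 2]).append('\n');
        }
        return sb.toString();
    }
}
```

Saving the returned string to a file with a .ply extension should give something MeshLab can open as a point cloud.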
Autodesk has a piece of software that will do what you are asking for; it is called "Photofly". It is currently in the labs section. Using a series of images taken from multiple angles, the 3D geometry is created and then photo-mapped with your images to create the scene.
If you are more interested in the theoretical part of this problem (I mean, if you want to know how it works), here is a document from Microsoft Research about a moving depth camera and 3D reconstruction.
Try out VisualSfM (http://ccwu.me/vsfm/) by Changchang Wu (http://ccwu.me/)
It takes multiple images from different angles of the scene and outputs a 3D point cloud.
The algorithm is called "Structure from Motion".
Brief idea of the algorithm: it involves extracting feature points in each image, finding correspondences between them across images, building feature tracks, and estimating the camera matrices and thereby the 3D coordinates of the feature points.

Get POINT CLOUD through 360 Degree Rotation and Image Processing

My question is in two parts, below.
QUESTION (IN SHORT):
• I want to generate a point cloud of a real-world object,
• by rotating it through 360 degrees on a rotating table,
• capturing 360 images, one image at each degree (1° to 360°).
• I know how to process an image and get its pixel values.
• See the sample image below. The image is black and white because I have to deal with very shiny (glittery) objects: diamonds. I have set up the background so that the shiny object (the diamond) appears as a black-and-white object, so I can easily scan its outer edge.
• One thing to consider is that I am not using any laser. I am just using one rotating table and one camera for taking images. You can see one sample project over here, but there MATLAB hides everything, because that author uses MATLAB's built-in functionality.
• I am looking for a math routine, algorithm, or any technique that helps me get a point cloud in the way I have described.
MORE ELABORATION:
I need a point cloud of a real-world object so that I can display it on a computer screen.
For that I am using a rotating table. I will put my object on it, rotate the table through a complete 360° rotation, and take 360 images, one image at each degree (1° to 360°).
The camera used for taking the images is well calibrated. I have given one sample image below. I also know how to scan an image and get its pixel values.
Also take into consideration that my images are silhouettes, meaning just black and white, not color images.
Where I am stuck is getting the point cloud of the object from the data I obtain by processing the images.
I found a similar project over here.
But it just uses built-in MATLAB functions. I am using Microsoft Visual C#.NET, so I have to build the entire algorithm myself, because MATLAB hides all the things I want to know.
Is there any expert who knows this entire thing well and can get me out of this trap?
Thanks.
I have no experience with this, but if I wanted to do something like this I would try the following:
1. Use a single-color light source.
2. If possible, create a light source that falls on a thin vertical slice of the object.
3. Take 360 B/W images; each will be an image of a vertical line with varying intensity. If you use MATLAB, your matrix will have one or a few columns with some values.
4. Now assume a vertical line (your axis of rotation).
5. Plot or convert (imageno, rownoOfMatrix, ValueInPopulatedColumnInSameRow)... [assuming images are numbered from 0 to 360]
6. Under ideal conditions, a crude way to get X and Y is to use K1 * cos(imgNo) * ValInCol and K1 * sin(imgNo) * ValInCol, and Z will be some K2 * rowNum. K1 and K2 can be calibrated by knowing the actual size of the object.
I mean something like this:
http://fab.cba.mit.edu/content/processes/structured_light/
but using a single vertical light instead of structured light.
http://www.geom.uiuc.edu/~samuelp/del_project.html This link might help with triangulation...
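The conversion in the last step of the list above could be sketched as follows. It is only an illustration of that crude formula. It assumes the image number equals the turntable angle in degrees and that the measured value can be treated as a distance from the rotation axis, which is an interpretation rather than something the answer states. K1 and K2 are the calibration factors from the answer; the class and method names are made up.

```java
/** Sketch: maps one turntable measurement to a 3D point using the crude formula. */
final class TurntablePoint {

    /**
     * @param imageNo index of the image, taken as the turntable angle in degrees
     * @param rowNo   pixel row of the measurement
     * @param value   measured value in that row, treated as distance from the rotation axis
     * @param k1      calibration factor for the radial distance
     * @param k2      calibration factor for the height
     * @return the point as {x, y, z}
     */
    static double[] toXyz(int imageNo, int rowNo, double value, double k1, double k2) {
        double angle = Math.toRadians(imageNo); // Math.cos and Math.sin expect radians
        double x = k1 * Math.cos(angle) * value;
        double y = k1 * Math.sin(angle) * value;
        double z = k2 * rowNo;
        return new double[] {x, y, z};
    }
}
```

Running this for every row of every image and collecting the results gives the point cloud. The same loop is straightforward to port to C#.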
