Does Project Tango extract any visual features per frame (such as ORB or SIFT/SURF), or is the entire point cloud just 3D points extracted from the depth camera? If so, is it possible to know which algorithm they are using? Is it just corners?
I would like to dump the 3D point cloud along with the corresponding features, and I'm wondering whether that is possible in real time.
Unfortunately, they don't expose which features they use. All you get is XYZ + Confidence. Here's the realtime point cloud callback, from the C API:
TangoErrorType TangoService_connectOnPointCloudAvailable(
    void (*TangoService_onPointCloudAvailable)(void *context,
                                               const TangoPointCloud *cloud),
    ...
);
See:
https://developers.google.com/tango/apis/c/reference/group/depth
https://developers.google.com/tango/apis/java/reference/TangoPointCloudData
https://developers.google.com/tango/apis/unity/reference/class/tango/tango-unity-depth
TangoPointCloud is defined here:
https://developers.google.com/tango/apis/c/reference/struct/tango-point-cloud#struct_tango_point_cloud
https://developers.google.com/tango/apis/java/reference/TangoPointCloudData
https://developers.google.com/tango/apis/unity/reference/class/tango/tango-point-cloud-data
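For reference, here is a minimal sketch (my own, not from the official samples) of an onPointCloudAvailable callback that dumps XYZ + confidence for every point, using the TangoPointCloud layout from the struct reference above. It's written as C++, assumes the tango_client_api.h header from the C client library, and omits error handling:

#include <cstdint>
#include <cstdio>
#include <tango_client_api.h>  // Tango C client API header

// Invoked by the Tango service whenever a new point cloud is ready.
void OnPointCloudAvailable(void* /*context*/, const TangoPointCloud* cloud) {
  // Each point is four floats: X, Y, Z in meters plus a confidence value C.
  for (uint32_t i = 0; i < cloud->num_points; ++i) {
    std::printf("%.4f %.4f %.4f %.2f\n",
                cloud->points[i][0],   // X
                cloud->points[i][1],   // Y
                cloud->points[i][2],   // Z
                cloud->points[i][3]);  // confidence
  }
}

// During setup, pass OnPointCloudAvailable to
// TangoService_connectOnPointCloudAvailable() (signature shown earlier).

Printing to stdout on every frame is only for illustration; for a real-time dump you would copy the points into a buffer inside the callback and write them out from another thread.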
As an aside, if you regard Tango's objective as being a portable API that sits atop various different sensors and hardware platforms, then it makes sense that they wouldn't expose the details of the underlying depth estimation method. It might change, from one device to the next.
BTW, they also keep the internals of their ADF (Area Description File) format secret.
I was trying to perform 3D reconstruction from the point cloud obtained from ARCore. However, the point cloud I was able to obtain was not accurate or dense enough for 3D reconstruction. Specifically, points that were supposed to lie on the same surface were off by a few millimetres, which made the surface look as if it had a thickness.
Am I isolating the point cloud correctly, or is this a limitation of ARCore?
What should be the standard approach to isolating the point cloud and performing 3D reconstruction?
I am attaching below the point cloud obtained from a laptop.
(The file is in the .PLY format. It can be viewed on https://sketchfab.com/, http://www.meshlab.net/, or any other software capable of rendering 3D models.)
The keyboard and the screen of the laptop look as if they have a thickness, although all the points were supposed to be at the same depth.
Please have a look at the point cloud and guide me as to what is going wrong here, since I have been stuck at it for quite some time now.
Thank you
https://drive.google.com/file/d/18KMchFgYd8KOcyi8hPpB5yEfbnJ6DxmR/view?usp=sharing
As far as I know, the main features of Project Tango SDK are:
motion tracking
area learning
depth perception
But what about object recognition & tracking?
I didn't see anything about that in the SDK, though I assume that Tango hardware would be really good at it (compared with traditional smartphones). Why?
Update 2017/06/05
A marker detection API has been introduced in the Hopak release
There are already good libraries for object tracking in 2D images, and the extra capabilities of Project Tango would likely add only a marginal improvement to existing functions, while requiring major overhauls of those libraries to support a small set of evolving hardware.
How do you think project tango could improve on existing object recognition & tracking?
With a 3d model of the object to be tracked, and knowledge of the motion and pose of the camera, one could predict what the next image of the tracked object 'should' look like. If the next image is different than predicted, it could be assumed that the tracked object has moved from its prior position. And the actual new 3D image could indicate the tracked object's vectors. That certainly has uses in navigating a live environment.
But that sounds like the sort of solution a self driving car might use. And that would be a valuable piece of tech worth keeping away from competitors despite its value to the community.
This is all just speculation. I have no first hand knowledge.
I'm not really sure what you're expecting for an "open question", but I can tell you one common way that people exploit Tango's capabilities to aid object recognition & tracking. Tango's point cloud, image callbacks, and pose data can be used as input for a library like PCL (http://pointclouds.org/).
Simply browsing the documentation & tutorials will give you a good idea of what's possible and how it can be achieved.
http://pointclouds.org/documentation/
Beyond that, you might browse the pcl-users mail archives:
http://www.pcl-users.org/
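As a concrete illustration of that route, here is a small sketch (mine, not from any official sample) that copies a TangoPointCloud into a pcl::PointCloud<pcl::PointXYZ> and voxel-downsamples it; it assumes PCL and the Tango C API header are both available to the build:

#include <cstdint>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/filters/voxel_grid.h>
#include <tango_client_api.h>

// Convert a Tango point cloud (x, y, z, confidence per point) to a PCL cloud.
pcl::PointCloud<pcl::PointXYZ>::Ptr ToPclCloud(const TangoPointCloud* cloud) {
  pcl::PointCloud<pcl::PointXYZ>::Ptr out(new pcl::PointCloud<pcl::PointXYZ>);
  out->points.reserve(cloud->num_points);
  for (uint32_t i = 0; i < cloud->num_points; ++i) {
    out->push_back(pcl::PointXYZ(cloud->points[i][0],
                                 cloud->points[i][1],
                                 cloud->points[i][2]));
  }
  return out;
}

// Downsample with a 1 cm voxel grid before handing the cloud to a
// recognition/tracking pipeline (segmentation, clustering, features, ...).
pcl::PointCloud<pcl::PointXYZ>::Ptr Downsample(
    const pcl::PointCloud<pcl::PointXYZ>::Ptr& in) {
  pcl::PointCloud<pcl::PointXYZ>::Ptr out(new pcl::PointCloud<pcl::PointXYZ>);
  pcl::VoxelGrid<pcl::PointXYZ> grid;
  grid.setInputCloud(in);
  grid.setLeafSize(0.01f, 0.01f, 0.01f);
  grid.filter(*out);
  return out;
}

From there, the usual PCL building blocks (plane segmentation, Euclidean clustering, feature estimation) are what people typically use for the recognition and tracking itself.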
Is it possible to train the area-learning module in a Project Tango device with data other than what is automatically captured through the sensors?
I am asking because I want to teach the area algorithm a preexisting 3D model, thereby making object recognition possible.
I am not asking for a high-level ability to convert any 3D model to an ADF. If I have to generate several point clouds and color buffers myself based on the 3D model, that would also work.
I am also not asking to know about any Google secrets of the internal format of ADFs. Only to have some way to put data in there.
Currently, there's no way of doing that through the Tango public APIs. The whole pipeline, learning and relocalization, has to run on the device.
So I know about setSurface, and have no problem using it as an overlay or whatever - it's on a surface control. That said, I am stumped about getting pixel data.
1) I've tried everything I can think of (the control, the root, etc.) to use the drawing-cache functions to get the bits for the camera surface. Yeah, no. The cached bitmap is always zeroed out.
2) I've used both SurfaceView and GLSurfaceView successfully as a setSurface target. I cannot use any other class, such as TextureView.
3) I've investigated the C API and I see the camera exposes connectOnFrameAvailable, which will give me access to the pixels.
My guess is that the internal Tango logic is just using the surface in Java to gain access to the underlying bit-transfer channel. In the C API it requires a texture ID, which makes me suspect that the camera data is shipped to the GPU pretty quickly, and I bet a CUDA lib operates on it there. Given the state of things, I can't see how to get the bits on the Java side without rooting the device - just because I have a texture or a simple surface view rendering raw bits on the screen doesn't mean I can get to them.
I don't want to peel the image data back out of the GPU. I'd need to switch my busy animation from a watch to a calendar for that.
Before I dive down into the C API, is there any way I can get the camera bits in Java? I really want to be able to associate them with a specific pose, but right now I can't even figure out how to get them at all. I really want to know the location and color of a 3D point. Camera intrinsics, the point cloud, and the 2D image that generated the point cloud are all I need. But I can't do anything if I can't get the pixels, and the more questionable the relationship between an image and a (pose and a point cloud), the sketchier any effort will become.
If I do dive into C, will connectOnFrameAvailable give me what I need? How well synced is it with the point cloud generation? Oh, and have I got this right: the color camera is used for depth, and the fisheye is used for pose?
Can I mix Java and C, i.e. create a Tango instance in Java and then just use C for the image issue? Or am I going to have to re-implement everything in C and stop using the Tango Java jar?
will the connectOnFrameAvailable give me what I need?
Yes, it indeed returns the YUV byte buffer.
How well synced is it with the point cloud generation?
The Tango API itself doesn't provide synchronization between the color image and the depth point cloud; however, it does provide the timestamp, which allows you to sync the two at the application level (see the sketch at the end of this answer).
Color camera is used for depth, fisheye is used for pose?
Yes, you are right.
Can I mix Java and C (i.e. create a Tango instance in Java and then just use C for the image issue)
Starting two Tango instances is really not something Tango supports; even if it works, it will be extremely hacky.
As a temporary workaround, you could probably try using the drawing cache of the view?
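To illustrate the timestamp-based sync mentioned above, here is a rough, untested sketch (C++ against the C API): keep a few recent color frames and, when a point cloud arrives, pick the frame whose timestamp is closest. Field names follow the TangoImageBuffer/TangoPointCloud reference docs; the buffer-size arithmetic assumes an NV21-style YUV format:

#include <cmath>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>
#include <tango_client_api.h>

struct ColorFrame {
  double timestamp;
  std::vector<uint8_t> yuv;  // copy of TangoImageBuffer::data
};

std::mutex g_mutex;
std::deque<ColorFrame> g_recent_frames;  // small ring of recent color frames

// Registered via TangoService_connectOnFrameAvailable for TANGO_CAMERA_COLOR.
void OnFrameAvailable(void*, TangoCameraId, const TangoImageBuffer* buffer) {
  ColorFrame frame;
  frame.timestamp = buffer->timestamp;
  // Roughly 1.5 bytes per pixel for NV21; exact size depends on stride and format.
  frame.yuv.assign(buffer->data,
                   buffer->data + buffer->stride * buffer->height * 3 / 2);
  std::lock_guard<std::mutex> lock(g_mutex);
  g_recent_frames.push_back(std::move(frame));
  if (g_recent_frames.size() > 5) g_recent_frames.pop_front();
}

// Registered via TangoService_connectOnPointCloudAvailable.
void OnPointCloudAvailable(void*, const TangoPointCloud* cloud) {
  std::lock_guard<std::mutex> lock(g_mutex);
  const ColorFrame* best = nullptr;
  for (const ColorFrame& f : g_recent_frames) {
    if (!best || std::fabs(f.timestamp - cloud->timestamp) <
                 std::fabs(best->timestamp - cloud->timestamp)) {
      best = &f;
    }
  }
  // 'best', if non-null, is the color frame closest in time to this cloud;
  // pair the two here and hand them off to your processing code.
}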
I'm working on a project in which images/video will need to be processed in order to detect certain objects within the frame. For this project, I'm in charge of selecting the hardware (image sensor, lens, processor, etc.), and for cost reasons I'd like to select the lowest-cost image sensor that will work. For this application, the camera will adhere to the following two requirements:
The camera will be at a fixed position
The distance to the object of interest is known
So, I'm wondering if there is any data/recommendations like the following:
"Based on algorithm X (e.g. face detection), select an image sensor / lens combination so that the object of interest is covered by a pixel-density of at least 40 pixels-per-foot."
For a few use cases, such as face detection or face recognition, I've found some materials available online that recommend minimum PPF requirements to ensure acceptable data is fed to the algorithm (Pixel Density Reqs for Face Detection). Obviously, any detection requirements are heavily algorithm-dependent, but I was hoping someone could provide some insight or resources they've come across to address this.
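For what it's worth, the geometry side of that decision can be sanity-checked with a simple pinhole-model calculation (no lens distortion, object facing the camera); the numbers below are made-up placeholders, not recommendations:

#include <cmath>
#include <cstdio>

int main() {
  // Hypothetical candidate sensor/lens and a known working distance.
  const double sensor_width_mm = 4.8;     // active sensor width
  const double image_width_px  = 1280.0;  // horizontal resolution
  const double focal_length_mm = 6.0;     // lens focal length
  const double distance_ft     = 10.0;    // known distance to the object

  // Horizontal field of view of a pinhole camera.
  const double hfov_rad = 2.0 * std::atan(sensor_width_mm / (2.0 * focal_length_mm));

  // Horizontal extent of the scene covered at that distance, in feet.
  const double scene_width_ft = 2.0 * distance_ft * std::tan(hfov_rad / 2.0);

  // Pixel density across a surface facing the camera at that distance.
  const double ppf = image_width_px / scene_width_ft;

  const double kPi = 3.14159265358979323846;
  std::printf("HFOV %.1f deg, scene width %.1f ft, density %.0f px/ft\n",
              hfov_rad * 180.0 / kPi, scene_width_ft, ppf);
  return 0;
}

With these placeholder numbers the result is roughly 160 px/ft, which could then be compared against a guideline such as the ~40 PPF face-detection figure; whether that is actually sufficient remains an algorithm-level question.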
Some (slightly) more specific information - if possible, I'd like to be able to perform the following:
Perform facial detection (not necessarily recognition)
Discern/detect a human from the front, back, and top
Also, I'm aware that there are image-processing libraries available for computer/machine vision (such as OpenCV); could such a library (or similar ones) contain this information? I'll look into it, but if someone has a specific library to reference, that would be very helpful.
Thanks!