I'm looking to develop a high precision scanning app and can't seem to find the specs on the precision of the point-cloud.
Does anyone know or has anyone tested the precision of individual points?
In my experience with a Tango device, you can achieve on average a 10 fps temporal resolution with the depth camera. Point-cloud density depends on a lot of factors:
Lighting conditions in the room: since structured-light sensors project an IR pattern onto the scene, the presence of sunlight will degrade the depth camera badly.
Reflective surfaces/glass: the IR pattern of the structured light gets reflected off glass and at times passes through glass surfaces, distorting the depth map.
Distance from the camera: the optimal operational range of the Tango depth camera is 0.5 m-4 m, so it's best to keep the object to be scanned at a distance of about 1 m.
In my experience, when the Tango device is at the optimal distance from surfaces you get around 15K points; in other cases around 6K points. The more points, the better the scan density.
The motion-tracking functionality of the Tango device uses a SLAM algorithm; strong features in the scene help SLAM and thus yield more precise motion tracking.
It's advisable to use standard registration algorithms when you are scanning structures from different poses.
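For example, here is a minimal rigid-registration sketch in Python (a small ICP-style loop using brute-force nearest neighbours and an SVD-based fit; the arrays and parameters are hypothetical, and a real scan of 15K points would want a KD-tree instead of the brute-force matching):

```python
import numpy as np

def best_fit_transform(A, B):
    """Least-squares rigid transform (R, t) mapping points A onto points B (both Nx3)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:           # guard against a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = cb - R @ ca
    return R, t

def icp(source, target, iterations=20):
    """Tiny ICP loop: match each source point to its nearest target point,
    re-estimate the rigid transform, apply it, repeat."""
    src = source.copy()
    for _ in range(iterations):
        # brute-force nearest neighbours: fine for a sketch, use a KD-tree for real clouds
        d = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        matched = target[d.argmin(axis=1)]
        R, t = best_fit_transform(src, matched)
        src = src @ R.T + t
    return best_fit_transform(source, src)  # overall transform: source -> aligned

# usage: R, t = icp(points_from_pose_A, points_from_pose_B)  # two Nx3 arrays
```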
As far as I know, all the techniques mentioned in the title are rendering algorithms that seem quite similar. All ray-based techniques seem to revolve around casting rays through each pixel of an image, which are supposed to represent rays of real light. This makes it possible to render very realistic images.
In fact, I am writing a simple program that renders such images myself, based on Ray Tracing in One Weekend.
Now the thing is that I wanted to give this program a name. I used the term “ray tracer”, as that is the one used in the book.
I have heard a lot of different terms, however, and I would be interested to know what exactly the difference is between ray tracing, ray marching, ray casting, path tracing and potentially any other common ray-related algorithms. I was able to find some comparisons of these techniques online, but they all compared only two of them and some definitions overlapped, so I wanted to ask this question about all four techniques.
My understanding of this is:
ray cast
uses a raster image to hold the scene, usually stops on the first hit (no reflections or ray splitting), and does not necessarily cast a ray per pixel (usually one per row or column of the screen). The 3D version of this is called voxel space ray casting; however, the map is not a voxel space, instead two raster images (RGB and height) are used.
For more info see:
ray cast
Voxel space ray casting
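As a toy illustration of that per-column idea, here is a minimal 2D grid ray cast in Python (Wolfenstein-style: one ray per screen column, fixed-step march, stop on the first wall; the map, player pose and step size are made up for the example):

```python
import math

# 0 = empty, 1 = wall; a made-up 8x8 map for the example
MAP = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 1, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

def cast_ray(px, py, angle, max_dist=20.0, step=0.01):
    """March one ray from (px, py) and stop on the first wall cell.
    No reflections, no splitting -- just the distance to the first hit."""
    dx, dy = math.cos(angle), math.sin(angle)
    dist = 0.0
    while dist < max_dist:
        x, y = px + dx * dist, py + dy * dist
        if MAP[int(y)][int(x)] == 1:
            return dist               # distance -> wall-slice height on screen
        dist += step
    return max_dist

# one ray per screen *column* (not per pixel), as in classic 2.5D ray casting
screen_width, fov = 80, math.radians(60)
player_x, player_y, player_angle = 1.5, 1.5, 0.3
columns = [cast_ray(player_x, player_y,
                    player_angle - fov / 2 + fov * c / screen_width)
           for c in range(screen_width)]
```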
(back) ray trace
This usually follows the physical properties of light, so rays split into reflected and refracted parts, and we usually stop after some number of hits. The scene is represented either with BR meshes, with analytical equations, or both.
For more info see:
GLSL 3D Mesh back raytracer
GLSL 3D Volumetric back raytracer
The "back" means we cast the rays from the camera into the scene (on a per-pixel basis) instead of from the light source in all directions ... this speeds up the process a lot at the cost of wrong lighting (but that can be remedied with additional methods on top of this)...
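A rough sketch of that camera-to-scene recursion in Python (a single hypothetical analytical sphere, each hit splitting into a reflected and a refracted branch, capped at a fixed depth; the 50/50 branch weights stand in for a proper Fresnel term):

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def hit_sphere(origin, direction, center, radius):
    """Nearest positive hit distance of a normalized ray with a sphere, or None."""
    oc = origin - center
    b = 2.0 * np.dot(oc, direction)
    c = np.dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    for t in ((-b - np.sqrt(disc)) / 2.0, (-b + np.sqrt(disc)) / 2.0):
        if t > 1e-4:
            return t
    return None

def refract(d, n, ior=1.5):
    """Snell's law; flips the normal when the ray is leaving the object.
    Returns None on total internal reflection."""
    cos_i = np.dot(d, n)
    if cos_i < 0.0:                        # entering the surface
        cos_i, eta = -cos_i, 1.0 / ior
    else:                                  # exiting it
        n, eta = -n, ior
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if k < 0.0:
        return None
    return normalize(eta * d + (eta * cos_i - np.sqrt(k)) * n)

SKY = np.array([0.6, 0.7, 1.0])                          # background colour
SPHERE_C, SPHERE_R = np.array([0.0, 0.0, -3.0]), 1.0     # one analytical object

def trace(origin, direction, depth):
    """Follow a ray from the camera into the scene; split each hit into a
    reflected and a refracted branch; stop after `depth` bounces."""
    if depth == 0:
        return SKY
    t = hit_sphere(origin, direction, SPHERE_C, SPHERE_R)
    if t is None:
        return SKY
    p = origin + t * direction
    n = normalize(p - SPHERE_C)
    refl = normalize(direction - 2.0 * np.dot(direction, n) * n)
    color = 0.5 * trace(p, refl, depth - 1)
    refr = refract(direction, n)
    if refr is not None:                   # no refracted branch on total internal reflection
        color = color + 0.5 * trace(p, refr, depth - 1)
    return color

# the "back" part: one ray per pixel, from the eye through the image plane
WIDTH, HEIGHT, MAX_DEPTH = 64, 48, 4       # tiny resolution -- this is a sketch, not a renderer
image = np.zeros((HEIGHT, WIDTH, 3))
for yp in range(HEIGHT):
    for xp in range(WIDTH):
        u = (xp + 0.5) / WIDTH * 2.0 - 1.0
        v = 1.0 - (yp + 0.5) / HEIGHT * 2.0
        d = normalize(np.array([u, v * HEIGHT / WIDTH, -1.0]))
        image[yp, xp] = trace(np.zeros(3), d, MAX_DEPTH)
```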
The other terms I am not so sure about, as I do not use those techniques (at least not knowingly):
path tracing
is an optimization technique to avoid the recursive ray splitting of ray tracing by using a Monte Carlo (stochastic) approach. So it does not actually split the ray but chooses randomly between the two options (similar to how photons behave in the real world), and multiple rendered frames are then blended together.
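For instance, reusing normalize, hit_sphere, refract, SKY and SPHERE_C from the back ray trace sketch above, the deterministic split would be replaced by a single random choice per bounce (the 0.5 probability is an arbitrary placeholder, not a physically derived Fresnel weight):

```python
import random
import numpy as np

def trace_path(origin, direction, depth):
    """Like trace() above, but never splits: one randomly chosen branch per bounce."""
    if depth == 0:
        return SKY
    t = hit_sphere(origin, direction, SPHERE_C, SPHERE_R)
    if t is None:
        return SKY
    p = origin + t * direction
    n = normalize(p - SPHERE_C)
    refr = refract(direction, n)
    if refr is None or random.random() < 0.5:        # pick ONE branch, like a single photon
        nxt = normalize(direction - 2.0 * np.dot(direction, n) * n)
    else:
        nxt = refr
    return trace_path(p, nxt, depth - 1)

# each frame rendered this way is noisy; averaging many frames converges
# toward the result of the fully split ray trace
```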
ray marching
is an optimization technique to speed up ray tracing by using an SDF (signed distance function) to determine a safe advance along the ray such that it does not hit anything. But it is confined to analytically described scenes.
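A minimal sphere-tracing loop over an analytic SDF might look like this (a single unit sphere as the scene; the step and distance limits are arbitrary):

```python
import numpy as np

def sdf(p):
    """Signed distance to a unit sphere at the origin -- the 'analytical scene'."""
    return np.linalg.norm(p) - 1.0

def ray_march(origin, direction, max_steps=128, max_dist=100.0, eps=1e-4):
    """Advance along the ray by the SDF value: that distance is guaranteed to be
    free of geometry, so the ray can never step through a surface."""
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:
            return t           # close enough to the surface: hit
        t += d                 # safe step
        if t > max_dist:
            break
    return None                # miss

# example: a ray from (0, 0, -3) along +z hits the unit sphere at t = 2
print(ray_march(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0])))
```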
My use case is only concerned with localization, in fact only 2D localization, so a lot of the cool capabilities in Tango are probably not useful to me. I'm trying to see if I could implement the location algorithm myself.
From teardown reports it seems the 9-DOF sensors are pretty much commodity hardware, and the basic integration-based location algorithm (even with magnetic-field calibration) is mature knowledge. What algorithm does Tango use?
From the description it seems that Tango tries to aid navigation by using the images it sees as a reference, sort of like the "terrain-following" mode in cruise missiles. Is this right? That would be too complex for me to implement.
You may easily get 2D position using the TangoPoseData with the correct coordinate system:
Project Tango uses a right-handed, local-level frame for the START_OF_SERVICE and AREA_DESCRIPTION coordinate frames. This convention sets the Z-axis aligned with gravity, with Z+ pointed upwards, and the X-Y plane is perpendicular to gravity and locally level with the ground plane. This local-level convention is based on the local east-north-up (ENU) earth-based coordinate system. Instead of true north, Project Tango uses the direction the back of the device is pointed when the service started as the Y axis, and the X axis is pointed to the right. The START_OF_SERVICE and AREA_DESCRIPTION base coordinate frames of the API will use this local-level frame convention.
Said more simply, use the pose data y/x coordinates for your space as you would latitude/longitude for the earth.
Heading data is also derived from the TangoPoseData and can be converted from a quaternion to Euler angles. Euler angles may be easier for you to use in your 2D location app.
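For a 2D app you really only need x, y and the yaw (rotation about the gravity-aligned Z axis). A small sketch of that extraction, assuming the TangoPoseData quaternion is ordered (x, y, z, w) as in the docs (worth double-checking for your API version):

```python
import math

def pose_to_2d(translation, orientation):
    """Collapse a Tango pose to (x, y, heading) for a 2D location app.
    translation: [x, y, z] in the start-of-service frame (Z is gravity-aligned).
    orientation: quaternion [x, y, z, w]."""
    x, y, _ = translation                    # drop the vertical component
    qx, qy, qz, qw = orientation
    # yaw about the Z axis, standard quaternion -> Euler (ZYX) extraction
    heading = math.atan2(2.0 * (qw * qz + qx * qy),
                         1.0 - 2.0 * (qy * qy + qz * qz))
    return x, y, heading                     # heading in radians

# e.g. x, y, heading = pose_to_2d(pose.translation, pose.orientation)
```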
Tango uses 3D to increase the confidence of its position within the space...even if you don't need 3D. I would let Tango do the hard stuff and extract the 2D position so you can focus on your app.
Tango uses the camera images to detect any change in position and uses the IMU for device rotation and acceleration. Try blocking the camera while using the Motion Tracking app; it will fail.
I recently came across a product called Kolibree on Kickstarter, which is a smart toothbrush. From what they say on their website, it seems that Kolibree can detect each tooth. I have some exposure to gesture recognition and flight dynamics (roll angle, pitch angle, heading angle, ...), the technologies I believe are used in this product, but I'm confused how it can accurately detect EACH tooth. I think we can detect the left, right, up, and down regions using the roll and pitch angles, maybe a little more precisely by using the heading angle, but accuracy down to each tooth is beyond my understanding. Could someone shed light on this?
thanks,
Ted
From the Kickstarter video it has:
Accelerometers
Gyroscopes
Magnetometers
These provide relative position and absolute direction of the device
So how to detect teeth? I would start with this:
tooth shape
by brushing you can collect surface data in close proximity to the brush
but only when no significant surface movement is detected
this can differentiate tooth types by curvature shape/size
so you have an idea of what part of the jaw you are in
vibrations
the spinning brush creates noise pulses in the accelerometer readings
these should depend on the movement and the surface shape
when linear movement is detected (you move the brush from side to side)
then the gaps between teeth will create measurable acceleration readings
this can be used to recognize relative tooth position
angular constraints
when we brush teeth on the left/right side or the upper/lower part of the mouth
we hold the brush differently
this can also be measured
if the overall angular position is within certain bounds
then we can assume which side of the mouth we are actually brushing (see the sketch after this list)
when you put all this data together
then we can improve the accuracy of the tooth scan
also, if some kind of calibration is used, that can improve it further
for example hold/click some button to start calibration
and move around the mouth with a specific calibration movement ...
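As a very rough sketch of the angular-constraint idea above, one could threshold the estimated roll/pitch of the brush to guess the mouth region; every threshold and sign convention here is made up and would need per-user calibration:

```python
def mouth_region(roll_deg, pitch_deg):
    """Guess which part of the mouth is being brushed from brush orientation alone.
    All thresholds and sign conventions are assumptions for illustration."""
    if pitch_deg > 10.0:
        jaw = "upper"
    elif pitch_deg < -10.0:
        jaw = "lower"
    else:
        jaw = "unknown"
    if roll_deg > 25.0:
        side = "left"
    elif roll_deg < -25.0:
        side = "right"
    else:
        side = "front"
    return jaw, side

# e.g. mouth_region(30.0, 20.0) -> ("upper", "left") with these made-up thresholds
```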
[notes]
some things that have to be kept in mind
left/right-handed people hold the brush differently
this also goes for motor dysfunctions (disabled people)
missing or crooked tooth anomalies (these can later be used as landmark points)
my guess is that adding camera info (for example from the linked device)
for head/jaw position detection could improve detection even more
I've got a fairly simple implementation of normal map lighting working for 2D sprites in webgl (GLSL shaders) which I was able to adapt & optimize from an example. It uses just one directional light and works fine for my purposes. Sprites are rendered flat (2D), only the light direction and normals are 3D vectors. Vertex rotation only happens around the z axis, so it's fairly easy-peasy.
I was hoping to add a bump (height) map to cast shadows. There are 3D bump map shadow casting examples and papers available online, but they're more complex than I need and the math goes over my head; I haven't found an example or explanation of how one might do a simple 2D case.
My first inclination is as follows: for the current pixel in the fragment shader, trace back along the direction of the light and check the altitude of the neighbouring bump map pixel. If it's higher than the light direction vector at that point, then that pixel is in the shade. However since "tall" pixels on the bump map may cast shadow across > 1 pixel distance, I'd have to keep testing pixel by pixel in that direction until I find one tall enough to cast a shadow (or reach the edge of the texture, or reach some arbitrary limit.)
This doesn't sound very optimal, especially for larger textures. I've read that if statements in shaders aren't so fast. Is there a faster/better method?
What you are looking for is called parallax (occlusion) mapping.
It's a technique that does exactly what you described, and it can be understood as on-bumpmap ray tracing in tangent space.
Here are some articles:
nVidia - Per-Pixel displacement (w/ sphere tracing)
nVidia - Cone Tracing for PM
AMD - POM
The ways to optimize the search are similar to ordinary ray tracing and include sphere tracing, cone tracing, binary search and the like, instead of a constant stepping function.
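To make the basic idea concrete, here is the naive constant-step version of the shadow test described in the question, written as a CPU-side Python sketch over a 2D height array (the per-pixel equivalent of what the fragment shader would do; the step count and bias are arbitrary):

```python
import numpy as np

def in_shadow(height_map, px, py, light_dir, steps=32, bias=0.01):
    """March from texel (px, py) toward the light over a 2D height map.
    If any sample along the way rises above the light ray, the texel is shadowed.
    light_dir: normalized 3D vector pointing *toward* the light; z is 'up'."""
    h, w = height_map.shape
    horiz = np.hypot(light_dir[0], light_dir[1]) + 1e-6   # horizontal length of the light dir
    rise_per_step = light_dir[2] / horiz                  # height the ray gains per texel of travel
    x, y, ray_h = float(px), float(py), float(height_map[py, px])
    for _ in range(steps):
        x += light_dir[0] / horiz                         # advance one texel toward the light
        y += light_dir[1] / horiz
        ray_h += rise_per_step
        ix, iy = int(round(x)), int(round(y))
        if not (0 <= ix < w and 0 <= iy < h):
            return False                                  # left the texture: treat as lit
        if height_map[iy, ix] > ray_h + bias:
            return True                                   # found a taller texel blocking the light
    return False

# usage: shadow_mask[y, x] = in_shadow(bump, x, y, light_dir) for each pixel
```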
P.S. If you know the name of some rendering technique, it's generally a good idea to Google it with 'nVidia', 'crytek' or 'gpu' added in front of the name; it will show you much more relevant results.
Hope this helps.
So I have a picture (not the best one).
I want to detect where the lights come from and what types of lights they are. What algorithm/framework can do such things with static images?
I mentioned shadows because, in general, if you can separate a shadow from a surface then you can probably determine the light type and its other parameters.
I mean a general shadow search, not only for the presented image.
With the image that you presented, there are so many sources of error that I'd be surprised if a trained human, let alone an algorithm, could do better than ±20% on any calculations. Here are the problems:
There isn't a known straight line anywhere, since everything is hand hewn. The best bet would be the I-beam above the doorway, but you don't know its orientation.
There's heavy barrel distortion at the edges of the image, which is introduced by the lens and is characteristic of that lens at that zoom and focus. Without precise calibration of that, you can only guess at the degree of distortion.
The image is skewed with regard to the wall it is facing but none of the walls appear to be all that planar anyway.
You want to know the sources of light. Well, the obvious primary light is the sun, but latitude, longitude, time, and date all affect that. Then there are the diffuse reflections, but unless you have the albedo of the materials you can only guess.
What are you hoping to derive from this image? Usually when doing lighting analysis, someone will put known reference targets of different, known reflectivity in the space to be analyzed. Working from a pocket snapshot camera on an unknown scene really limits what you can extrapolate.