I have a square table with four cameras (Xtion pro), one at each angle.
I'm trying to reconstruct the complete point-cloud of an arbitrary object that is on the table.
I've calibrated the cameras: intrinsic parameters with a chessboard, and extrinsic parameters with a tag (ARToolKit-style).
The problem is that when I transform the point clouds from each camera's frame to the tag-defined frame, the result contains quite large errors.
How can I correct this error? I tried registration with ICP, with poor results.
How can I use the transformed clouds as the initial guess for a fine registration?
Any suggestion is appreciated!
Edit after D.J.Duff comment:
I'm using the Xtion PRO version without the RGB camera, so I'm calibrating the IR camera. To do so I covered the IR projector and performed the calibration on the IR stream: the ready-to-use ROS calibration tool for the intrinsic parameters, and PiTag markers for the extrinsic parameters.
I should manually align the cloud with a known object, but can this be automated? That is: if I use something like an L-shaped object with no orientation ambiguities, can I automate the registration process to obtain a better transform matrix?
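For the fine-registration part, here is a minimal sketch, assuming the Open3D library, of feeding the tag-derived transform to ICP as the initial guess; the file names, voxel size, and correspondence distance are placeholders:

    # A minimal sketch (assuming Open3D): refine the tag-derived transform with
    # point-to-plane ICP instead of running ICP from scratch.
    import numpy as np
    import open3d as o3d

    source = o3d.io.read_point_cloud("camera_1.pcd")   # cloud in camera frame
    target = o3d.io.read_point_cloud("camera_2.pcd")   # cloud already in tag frame

    # Transform from the tag-based extrinsic calibration (placeholder).
    T_init = np.eye(4)

    # Downsample and estimate normals (needed for point-to-plane ICP).
    voxel = 0.005
    source = source.voxel_down_sample(voxel)
    target = target.voxel_down_sample(voxel)
    for pc in (source, target):
        pc.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=30))

    result = o3d.pipelines.registration.registration_icp(
        source, target,
        max_correspondence_distance=0.02,   # tolerate ~2 cm of calibration error
        init=T_init,
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())

    print("refined transform:\n", result.transformation)

With a reasonable initial guess the correspondence distance can be kept small, which is usually what keeps ICP from converging to a wrong local minimum.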
Related
My aim is to calibrate a pair of cameras and use them for simple measurement purposes. For this purpose, I have already calibrated them using HALCON and have all the necessary intrinsic and extrinsic camera parameters. The next step for me is to measure known lengths to verify my calibration accuracy. So far I have been using the method intersect_lines_of_sight to achieve this. This has given me unfavourable results, as the lengths are off by a couple of centimeters. Is there any other method in HALCON which triangulates and gives me the 3D coordinates of a point? Or are there any leads as to how this can be done? Any help will be greatly appreciated.
Kindly let me know in case this post needs to be updated with code samples.
In HALCON there is also the operator reconstruct_points_stereo, with which you can reconstruct 3D points given the row and column coordinates of corresponding pixels. For this you will need to generate a StereoModel from your calibration data, which is then passed to reconstruct_points_stereo.
In your HALCON installation there is a standard HDevelop example that shows the use of this operator. The example is called reconstruct_points_stereo.hdev and can be found in the example browser of HDevelop.
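If you want a quick cross-check outside HALCON, the same triangulation can be sketched with OpenCV, assuming you can export the two calibrated 3x4 projection matrices (the matrices and pixel coordinates below are placeholders):

    # A rough sketch (not HALCON): triangulate a corresponding pixel pair with
    # OpenCV, given projection matrices P = K [R | t] from your calibration.
    import cv2
    import numpy as np

    P1 = np.eye(3, 4)                                               # camera 1 (placeholder)
    P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])   # camera 2 (placeholder)

    pts1 = np.array([[320.0], [240.0]])    # pixel in image 1 (2x1)
    pts2 = np.array([[300.0], [240.0]])    # corresponding pixel in image 2 (2x1)

    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4x1 homogeneous point
    X = (X_h[:3] / X_h[3]).ravel()                    # Euclidean 3D coordinates
    print("3D point:", X)

Measuring a known length then reduces to reconstructing its two endpoints and taking the distance between them.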
I have a radial profile of a point spread function (PSF) that I want to draw in GalSim, so that I can call FindAdaptiveMom on the resulting image. The profile is in units of normalized intensity vs. angular position in arcseconds. I have looked at the instructions for building a custom object, but am wondering if it's possible to render a GalSim Image without building an object? If not, would it be possible to build an object simply by reading in the profile?
Unfortunately, it's not currently very easy to roll your own custom profile in GalSim. The instructions you pointed to would still require the output to be generated in terms of existing GalSim types, so they're not really what you're looking for.
I think you have two options:
If all you care about is the FindAdaptiveMom bit and you don't want to do anything complicated with the rendering, you can lay down the radial profile yourself. An image is mostly just a wrapper around a NumPy array and a bounds (defining where the origin of the array sits in image coordinates). So you could build that array yourself, make an image from it with im = galsim.Image(array), and call FindAdaptiveMom on that (a sketch of both options follows below).
If you want your radial profile to be the true surface brightness profile on the sky (rather than as seen on an image) and then properly render it, including integration over the pixels, then that's a little trickier. You can coerce GalSim into doing it by doing the above rendering first and then making a galsim.InterpolatedImage object out of the result; this treats the drawn image as a surface brightness profile, which can then be drawn in the usual way (drawImage).
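A minimal sketch of both options, assuming the measured radial profile is available as a Python function profile(r) returning normalized intensity at radius r in arcsec; the pixel scale, stamp size, and the placeholder exponential profile are assumptions:

    import numpy as np
    import galsim

    pixel_scale = 0.05      # arcsec / pixel (assumption)
    n = 128                 # stamp size in pixels (assumption)
    profile = lambda r: np.exp(-r / 0.2)    # placeholder for your measured profile

    # Option 1: lay the profile down on a pixel grid and wrap it in an Image.
    x = (np.arange(n) - (n - 1) / 2.0) * pixel_scale
    xx, yy = np.meshgrid(x, x)
    array = profile(np.hypot(xx, yy))
    im = galsim.Image(array, scale=pixel_scale)
    res1 = im.FindAdaptiveMom()

    # Option 2: treat that image as the true surface brightness profile and let
    # GalSim integrate over pixels when drawing.
    psf = galsim.InterpolatedImage(im)
    im2 = psf.drawImage(nx=n, ny=n, scale=pixel_scale)
    res2 = im2.FindAdaptiveMom()

    print(res1.moments_sigma, res2.moments_sigma)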
What algorithms are used for augmented reality apps like Zookazam?
I think it analyzes the image and finds planes by contrast, but I don't know how.
What topics should I read about before starting on an app like this?
[Prologue]
This is an extremely broad topic and mostly off-topic in its current state. I re-edited your question, but to make it answerable within the rules/possibilities of this site
you should specify more closely what your augmented reality:
should do
adding 2D/3D objects with known mesh ...
changing light conditions
adding/removing body parts/clothes/hairs ...
it is a good idea to provide an example image (sketch) of the input/output you want to achieve.
what input it has
video, static image, 2D, stereo, 3D. For pure 2D input, specify what conditions/markers/illumination/laser patterns you have to help the reconstruction.
what will be in the input image? An empty room, people, specific objects, etc.
specify target platform
many algorithms are limited by memory size/bandwidth, CPU power, special HW capabilities, etc., so it is a good idea to add a tag for your platform. The OS and language are also good to mention.
[How augmented reality works]
acquire input image
if you are connecting to a device like a camera, you need to use its driver/framework (or some common API it supports) to obtain the image. This task is OS-dependent. My favorite way on Windows is the VFW (Video for Windows) API.
I would start with some static file(s) instead, to ease debugging and incremental building (you do not need to wait for the camera on every build). When your app is ready for live video, switch back to the camera...
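A tiny sketch of that workflow, using OpenCV here as a stand-in for VFW or a device framework (the file name is a placeholder):

    # Switch between a static test image and a live camera while debugging.
    import cv2

    USE_CAMERA = False            # flip to True once the pipeline works on files

    if USE_CAMERA:
        cap = cv2.VideoCapture(0)              # first attached camera
        ok, frame = cap.read()
        cap.release()
    else:
        frame = cv2.imread("test_scene.png")   # placeholder test image
        ok = frame is not None

    if ok:
        cv2.imshow("input", frame)
        cv2.waitKey(0)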
reconstruct the scene into 3D mesh
if you use 3D cameras like Kinect then this step is not necessary. Otherwise you need to distinguish the objects by some segmentation process, usually based on edge detection or color homogeneity.
The quality of the 3D mesh depends on what you want to achieve and what your input is. For example, if you want realistic shadows and lighting then you need a very good mesh. If the camera is fixed in some room you can predefine the mesh manually (hard-code it) and compute just the objects in view. The object detection/segmentation can also be done very simply by subtracting the empty-room image from the current view image, so pixels with a big difference are the objects (a sketch of this follows after the links below).
you can also use planes instead of a real 3D mesh, as you suggested in the OP, but then you can forget about more realistic effects like lighting, shadows, intersections... If you assume the objects are standing upright then you can use room metrics to obtain the distance from the camera. see:
selection criteria for different projections
estimate measure of photographed things
For pure 2D input you can also use the illumination to estimate the 3D mesh; see:
Turn any 2D image into 3D printable sculpture with code
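The empty-room subtraction mentioned above could look roughly like this, assuming OpenCV; the threshold value is just a tuning placeholder:

    # Pixels that differ strongly from the empty-room reference are treated as objects.
    import cv2

    empty_room = cv2.imread("empty_room.png", cv2.IMREAD_GRAYSCALE)    # reference
    current    = cv2.imread("current_view.png", cv2.IMREAD_GRAYSCALE)  # live view

    diff = cv2.absdiff(current, empty_room)
    _, object_mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)   # 30 = tuning value

    # Optional: clean up noise before segmenting connected components.
    object_mask = cv2.morphologyEx(object_mask, cv2.MORPH_OPEN,
                                   cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    cv2.imwrite("objects.png", object_mask)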
render
Just render the scene back to some image/video/screen... with the added/removed features. If you are not changing the lighting conditions too much you can also use the original image and render directly onto it. Shadows can be achieved by darkening the pixels (a small sketch follows the links below). For better results the illumination/shadows/spots/etc. are usually filtered out of the original image first and then added back directly by rendering instead. see
White balance (Color Suppression) Formula?
Enhancing dynamic range and normalizing illumination
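The pixel-darkening idea for fake shadows can be sketched like this, assuming OpenCV/NumPy and a precomputed shadow mask (file names and the darkening factor are placeholders):

    # Multiply the original image by a factor < 1 inside the shadow mask.
    import cv2
    import numpy as np

    image = cv2.imread("scene.png").astype(np.float32)
    shadow_mask = cv2.imread("shadow_mask.png", cv2.IMREAD_GRAYSCALE) / 255.0  # 1 = shadow

    darkening = 0.6   # how dark the shadow is (tuning value)
    factor = 1.0 - (1.0 - darkening) * shadow_mask[..., None]
    out = np.clip(image * factor, 0, 255).astype(np.uint8)
    cv2.imwrite("scene_with_shadow.png", out)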
The rendering process itself is also platform-dependent (unless you are doing it with low-level graphics in memory). You can use things like GDI, DX, OpenGL, ... see:
Graphics rendering
You also need camera parameters for rendering like:
Transformation of 3D objects related to vanishing points and horizon line
[Basic topics to google/read]
2D
DIP digital image processing
Image Segmentation
3D
Vector math
Homogeneous coordinates
3D scene reconstruction
3D graphics
normal shading
platform dependent
image acquisition
rendering
So I know about setSurface, and have no problem using it as an overlay or whatever - it's on a surface control. That said, I am stumped about getting the pixel data.
1) I've tried everything I can think of (the control, the root, etc.) to use the drawing cache functions to get the bits for the camera surface. Yah, no. The cached bitmap is always zeroed out.
2) I've used both SurfaceView and GLSurfaceView successfully as a setSurface target. I cannot use any other class, such as TextureView.
3) I've investigated the C API and I see the camera exposes connectOnFrameAvailable, which will give me access to the pixels.
My guess is that the internal Tango logic is just using the surface in Java to gain access to the underlying bit-transfer channel - in the C API it requires a texture ID, which makes me suspect that at the end of the day the camera data is shipped into the GPU pretty quickly, and I bet that CUDA lib operates on it. Given the state of things, I can't see how to get the bits on the Java side without rooting the device - just because I have a texture or a simple surface view rendering raw bits on the screen doesn't mean I can get to them.
I don't want to peel the image data back out of the GPU. I'd need to switch my busy animation from a watch to a calendar for that.
Before I dive down into the C API, is there any way I can get the camera bits in Java? I really want to be able to associate them with a specific pose, but right now I can't even figure out how to get them at all. I really want to know the location and color of a 3D point. The camera intrinsics, the point cloud, and the 2D image that generated the point cloud are all I need. But I can't do anything if I can't get the pixels, and the more questionable the relationship between an image and a (pose and point cloud), the sketchier any efforts will become.
If I do dive into C, will connectOnFrameAvailable give me what I need? How well synced is it with the point cloud generation? Oh, and have I got this right: the color camera is used for depth, the fisheye is used for pose?
Can I mix Java and C, i.e. create a Tango instance in Java and then just use C for the image issue? Or am I going to have to re-implement everything in C and stop using the Tango Java JAR?
will the connectOnFrameAvailable give me what I need?
Yes, it indeed returns the YUV byte buffer.
How well synced is it with the point cloud generation?
The Tango API itself doesn't provide synchronization between the color image and the depth point cloud; however, it does provide the timestamp, which allows you to sync at the application level.
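In practice that usually means keeping a small buffer of recent color frames and picking the one whose timestamp is closest to the point cloud's timestamp. A language-agnostic sketch (shown in Python, not the Tango API itself):

    def closest_frame(frames, cloud_timestamp):
        """frames: list of (timestamp, image) pairs kept in a small ring buffer."""
        return min(frames, key=lambda f: abs(f[0] - cloud_timestamp))

    # Usage: keep the last N color frames with their timestamps; when a point
    # cloud arrives with timestamp t, call closest_frame(buffer, t).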
Color camera is used for depth, fisheye is used for pose?
Yes, you are right.
Can I mix Java and C (i.e. create a Tango instance in Java and then just use C for the image issue)?
Starting two Tango instances is really not the way Tango is meant to be used; even though it works, it will be extremely hacky.
As a temporary workaround, you could probably try to use the drawing cache of the view?
I am working on a Kinect game where I am supposed to "dress" the player in a kind of garment.
As the player should always stand directly in front of the device, I am using a simple JPG file for this "dressing".
My problem starts when the user, while still standing in a frontal position, bends the knees or leans right or left. I want to apply an appropriate transform to this "dress" image so that it still covers the player's body more or less correctly.
From the Kinect sensors I can get current information about the positions of the following body parts of the player:
Is there any library (C++, C#, Java) or a known algorithm that can perform such a transformation?
Complex task but possible.
I would split the 'dress' into arms, torso/upper body, and lower body. You could then use (from memory) AffineTransform in Java, though most languages have algorithms for matrix transforms against images.
The reason I suggest splitting the image is that a single transform would distort the top part of the image; splitting allows you to do some rotation (for when people lean) and wrap the arms as they move as well.
EDIT:
I would also NOT transform the image on each frame (CPU intensive); I would precompute a lookup table of the possible angles and just look up the image (see the sketch below).
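A sketch of that lookup-table idea, using OpenCV in Python rather than Java's AffineTransform (the sprite file, angle range, and step size are placeholders):

    # Precompute the dress segment rotated in fixed steps; at runtime just pick
    # the entry nearest to the measured joint angle.
    import cv2

    segment = cv2.imread("dress_arm.png", cv2.IMREAD_UNCHANGED)   # placeholder sprite
    h, w = segment.shape[:2]
    center = (w / 2, h / 2)

    step = 5                                    # degrees per table entry
    table = {}
    for angle in range(-90, 91, step):
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        table[angle] = cv2.warpAffine(segment, M, (w, h))

    def lookup(angle_deg):
        # Snap the measured joint angle to the nearest precomputed entry.
        key = int(round(angle_deg / step) * step)
        key = max(-90, min(90, key))
        return table[key]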