Programmatic correction of camera tilt in a positioning system - image

A quick introduction:
We're developing a positioning system that works as follows. Our camera is mounted on a robot and points upwards (looking at the ceiling). On the ceiling we have landmarks, from which we can compute the position of the robot. It looks like this:
Our problem:
The camera is tilted a bit (0-4 degrees, I think) because the surface of the robot is not perfectly even. That means that when the robot turns on the spot but stays at the same coordinates, the camera looks at a different position on the ceiling, and our positioning program therefore yields a different position for the robot, even though it only turned and did not move at all.
Our current (hardcoded) solution:
We've taken some test photos with the camera, turning it around the lens axis. From the pictures we've deduced that it's tilted ca. 4 degrees in the "up direction" of the picture. Using some simple geometrical transformations we've managed to reduce the tilt effect and find the real camera position. In the following pictures the grey dot marks the center of the picture, and the black dot is the real place on the ceiling under which the camera is situated. The black dot was derived from the grey dot (its position was computed by correcting the grey dot's position). As you can see, the grey dots form a circle on the ceiling and the black dot is the center of this circle.
The problem with our solution:
Our approach is not portable at all. If we moved the camera to a new robot, the angle and direction of tilt would have to be recalibrated from scratch. We would therefore have to leave the calibration phase to the user, which would mean taking some pictures, estimating the tilt parameters by hand, and then setting them in the program. My question to you is: can you think of a better (more automatic) way to compute the tilt parameters or to correct the tilt in the pictures?

Nice work. Automatic calibration is a nice challenge.
An idea would be to use the parallel lines of the ceiling tiles:
If the camera is perfectly level, all lines will be parallel in the picture too.
If the camera is tilted, the lines will converge (they intersect at the vanishing point).
Now, this is probably very hard to implement. With the camera you're using, lens distortion needs to be corrected first so that the lines are actually straight.
Your practical approach is probably simpler and more robust. As you describe it, it seems it can be automated to become user friendly: make the robot turn on the spot and identify programmatically which point stays in the same place in the picture.
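The "turn on the spot" calibration can be sketched as a circle fit. This is only a minimal sketch under stated assumptions, not the asker's actual transformation: it assumes the ceiling coordinate seen at the picture center (the grey dots in the question) is logged while the robot rotates in place, and that the camera-to-ceiling distance is known. The function names and the `ceiling_height` parameter are hypothetical. The fitted center corresponds to the black dot, and the radius gives the tilt through r = h * tan(tilt).

```python
import math

def fit_circle(points):
    """Algebraic least-squares circle fit; returns (cx, cy, radius)."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points")
    # Work relative to the centroid to keep the sums well conditioned.
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    suu = suv = svv = suuu = svvv = suvv = svuu = 0.0
    for x, y in points:
        u, v = x - mx, y - my
        suu += u * u
        suv += u * v
        svv += v * v
        suuu += u * u * u
        svvv += v * v * v
        suvv += u * v * v
        svuu += v * u * u
    det = suu * svv - suv * suv
    if abs(det) <= 1e-12 * (suu + svv) ** 2:
        raise ValueError("points are (nearly) collinear")
    # Solve the 2x2 normal equations for the center offset (uc, vc).
    bu = 0.5 * (suuu + suvv)
    bv = 0.5 * (svvv + svuu)
    uc = (bu * svv - bv * suv) / det
    vc = (bv * suu - bu * suv) / det
    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mx + uc, my + vc, radius

def tilt_from_radius(radius, ceiling_height):
    """Tilt angle in degrees, from radius = ceiling_height * tan(tilt)."""
    return math.degrees(math.atan2(radius, ceiling_height))

# Synthetic data: camera under (3.0, 4.0), ceiling 2.0 units above, 4 degree tilt.
true_r = 2.0 * math.tan(math.radians(4.0))
samples = [(3.0 + true_r * math.cos(math.radians(a)),
            4.0 + true_r * math.sin(math.radians(a)))
           for a in (0, 30, 100, 170, 250)]
cx, cy, r = fit_circle(samples)
tilt = tilt_from_radius(r, 2.0)
```

With real measurements the grey dots will be noisy, so more rotation steps should give a more stable center; the direction of tilt follows from the vector between a grey dot and the fitted center at a known heading.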

Related

Non rectangular camera matrix

My project combines a projection screen with a head tracking device, where the screen should act as a window through which I could see my virtual "world". Basically, this.
Initially, I thought this would be easy: Map the camera position to the head tracking, have it point towards my window in the virtual world, adjust camera parameters to fit its frustum to the window, and voilà!
Except it doesn't work, because I'm viewing the window (both real and virtual) at an angle, so the regular perspective camera doesn't do the trick. If I understand correctly, that camera's 'input' is always rectangular, but I need to 'fit' it into a trapezoid instead.
I think I should be able to achieve that by making my own projection matrix, but I'm a bit lost on how to do that: I have played a bit with basic matrix transforms (translate, scale, rotate), but I have zero experience with more complex stuff (i.e. perspective).
My best guess for now is trying to deduce the projection matrix from known transformed points (the corners of my window => the corners of the screen) but I feel like it's going to be quite expensive to do that each frame, and that doesn't account for the perspective inside the "window".
Thanks for any help!
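One commonly used way to get this "window" effect is an off-axis (asymmetric) frustum: the camera is translated to the head position but kept parallel to the screen instead of being rotated towards it, and the frustum edges are made to pass through the screen corners. The following is a minimal sketch in plain Python, assuming an OpenGL-style clip space and a screen rectangle centred at the origin in the z = 0 plane; the function names and the example dimensions are hypothetical.

```python
def off_axis_projection(eye, width, height, near, far):
    """Row-major OpenGL-style projection for a screen of the given size,
    centred at the origin in the z = 0 plane, seen from eye = (ex, ey, ez)."""
    ex, ey, ez = eye
    if ez <= 0:
        raise ValueError("eye must be in front of the screen (ez > 0)")
    # Scale the screen edges (relative to the eye) back onto the near plane.
    scale = near / ez
    left = (-width / 2 - ex) * scale
    right = (width / 2 - ex) * scale
    bottom = (-height / 2 - ey) * scale
    top = (height / 2 - ey) * scale
    return [
        [2 * near / (right - left), 0.0, (right + left) / (right - left), 0.0],
        [0.0, 2 * near / (top - bottom), (top + bottom) / (top - bottom), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def project(matrix, point_eye_space):
    """Apply the projection and the perspective divide; returns NDC (x, y, z)."""
    x, y, z = point_eye_space
    v = (x, y, z, 1.0)
    clip = [sum(m * c for m, c in zip(row, v)) for row in matrix]
    return tuple(c / clip[3] for c in clip[:3])

# Head 0.6 units in front of a 1.0 x 0.6 screen, off to the right and below center.
eye = (0.3, -0.1, 0.6)
proj = off_axis_projection(eye, width=1.0, height=0.6, near=0.1, far=100.0)

# The view transform is a pure translation by -eye (no rotation).
top_right = tuple(c - e for c, e in zip((0.5, 0.3, 0.0), eye))
bottom_left = tuple(c - e for c, e in zip((-0.5, -0.3, 0.0), eye))
ndc_top_right = project(proj, top_right)
ndc_bottom_left = project(proj, bottom_left)
```

In this sketch the screen corners land on the corners of the NDC square wherever the head is, which is the property the trapezoid idea is after, and the matrix is cheap to rebuild every frame.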

Can points or meshes be drawn at infinite distance?

I'm interested in drawing a stardome in THREE.js using either mesh points or a particle system.
I don't want the camera to be able to move any closer to any part of the stardome, since the stars are effectively at infinite distance.
I can think of a couple of ways to do this:
A very large mesh (or very large point/particle distances)
Camera and stardome have their movement exactly linked.
Is there any way to specify that a mesh, point, or particle system should automatically be rendered at infinite distance, so that it is always drawn behind any foreground objects?
I haven't used three.js, but my guess is no. OpenGL cameras need a "near clipping plane" and a "far clipping plane", which denote the minimum and maximum distances at which things are rendered. If you've played video games where you move too close to a wall and start to see through it, or see things in the distance suddenly vanish as you move away, those were probably the clipping planes at work.
The workaround is usually one of 2 ways:
1) Set the far clipping plane distance as high as it'll let you go. I don't know what data type three.js would use for this, but my guess is a 32-bit float.
2) Render it in "layers". Render all the stars first before anything else in the scene.
Option 2 is the one I usually use.
Even if you used option 1, you would still synchronize the position of the camera and skybox.
If you do not depth cull, draw the skybox first and match its position, but not rotation, to the camera.
Also disable lighting on the skybox. Instead, bake an ambience directly into its texture.
You don't want things infinitely far away; you just want them not to move with respect to the viewer and not to appear in front of other things. The best way to achieve the first is to prevent the viewer from getting any closer to them, which produces the illusion of the object being far away. The second is achieved by modifying your depth culling function so that the skybox is always considered further away than whatever you are currently drawing.
If you create a very large mesh object, you'll have to set your camera's far plane far enough out to include the mesh, which means you'll end up drawing things that you really do want to cull.
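Matching the skybox position (but not its rotation) to the camera is equivalent to drawing it with a view matrix whose translation has been removed. A minimal sketch of that idea in plain Python follows; it is not three.js API, and the helper names are hypothetical.

```python
def view_without_translation(view):
    """Copy of a row-major 4x4 view matrix with its translation column zeroed."""
    stripped = [list(row) for row in view]
    for i in range(3):
        stripped[i][3] = 0.0
    return stripped

def transform(matrix, point):
    """Apply the upper three rows of a row-major 4x4 matrix to a 3D point."""
    x, y, z = point
    v = (x, y, z, 1.0)
    return tuple(sum(m * c for m, c in zip(row, v)) for row in matrix[:3])

# View matrix of an unrotated camera at (10, 0, 5): translate by -position.
view = [
    [1.0, 0.0, 0.0, -10.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -5.0],
    [0.0, 0.0, 0.0, 1.0],
]
star = (0.0, 0.0, -1.0)  # a point on a unit-radius star dome

seen_as_scene_object = transform(view, star)                   # shifts with the camera
seen_as_dome = transform(view_without_translation(view), star)  # unaffected by movement
```

Because only the rotation reaches the dome, its size no longer matters for parallax, so it can stay small and be drawn first with depth writes turned off.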

Unity and Infrared

I would like to make a game that uses a camera with infrared tracking, so that I can track people's heads (from a top view). For example, each player would get a helmet so that the camera or infrared sensor can track him/her.
After that I need to know the exact position of each person in Unity, to place a 3D GameObject at the player's position.
Maybe there is another way to get people's positions into Unity. I know I could use a Kinect, but I need to track at least 10 people at the same time.
Thanks
Note: This is not really a complete answer, just a collection of my thoughts on how to transfer recorded positions into Unity.
If you really need full 3D positions, I believe you won't be happy using only one sensor. To obtain depth information, which can then be used to calculate 3D positions in a reference coordinate system, you would have to use at least 2 sensors.
Another thing you could do is fix the camera position and assume that all persons are moving in the same plane (e.g. a fixed y-component). That would allow you to determine 3D positions using the projection formula, given the camera parameters (so the camera has to be calibrated).
What also comes to mind: you could try to simulate your real camera with a virtual camera in Unity. This way you can use the virtual camera to project image coordinates (coming from the real camera) into Unity's 3D world. I haven't tried this myself, but someone else has; you can have a look at that: https://community.unity.com/t5/Editor/How-to-simulate-Unity-Pinhole-Camera-from-its-intrinsic/td-p/1922835
Edit given your comment:
Okay, sticking to your soccer example, you could proceed as follows:
Setup: Say you define your playing area to be rectangular with its origin in the bottom left corner (think of UVs). You set these points in the real world (and in Unity's representation of it) as (0,0) (bottom left) and (width, height) (top right), choosing whichever unit you like (e.g. meters, as this is Unity's default unit). As your camera is stationary, you can assign the corresponding corner points in image coordinates (pixel coordinates) as well. To make things easier, work with normalized coordinates instead of pixels, so that bottom left is (0,0) and top right is (1,1).
Tracking: When tracking persons in the image, you can calculate their normalized position (x,y) (with x and y in [0,1]). These normalized positions can be transferred into Unity's 3D space (in Unity you will have a playable area of the same width and height) by simply calculating a Vector3 as (x*width, 0, y*height) (in Unity x points right, y points up and z points forward).
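The setup and tracking steps above can be sketched as two small functions. This is a language-neutral sketch in Python rather than Unity C#, and it assumes the play area appears as an axis-aligned rectangle in the image (no perspective or lens correction); the function names and the example pixel values are hypothetical.

```python
def pixel_to_normalized(px, py, bottom_left_px, top_right_px):
    """Map a pixel position to [0, 1] x [0, 1] using the calibrated corner pixels.

    Image rows usually grow downwards, so the bottom-left corner typically has
    the larger pixel y; dividing by the signed extent handles that flip.
    """
    x0, y0 = bottom_left_px
    x1, y1 = top_right_px
    return (px - x0) / (x1 - x0), (py - y0) / (y1 - y0)

def normalized_to_world(nx, ny, width, height):
    """Return (x, y, z) for a Unity-style world with y up and the floor at y = 0."""
    return (nx * width, 0.0, ny * height)

# Play area of 20 x 10 units, seen in a 640 x 480 image.
bottom_left_px = (40, 440)
top_right_px = (600, 40)

nx, ny = pixel_to_normalized(320, 240, bottom_left_px, top_right_px)
world = normalized_to_world(nx, ny, width=20.0, height=10.0)
```

If the camera is not looking straight down, the rectangle becomes a general quadrilateral in the image and a homography would be needed in place of the linear mapping.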
Edit on Tracking:
For top-view tracking in a game, I would say you are on the right track with using some sort of helmet, which enables marker-based tracking (in my opinion markerless multi-target tracking is not reliable enough for use in a video game). If you want to learn more about object tracking, there are lots of resources in the field of computer vision.
Independent of the sensor you are using (IR or camera), you would create a unique marker for each helmet, which lets you identify each helmet (and therefore the player). A marker in this case is a unique pattern that an algorithm can recognize in each recorded frame. For IR you can arrange square IR markers to form a specific pattern, and for normal cameras you can use markers like QR codes. There are also augmented reality libraries that offer functionality for creating and recognizing markers, e.g. ArUco or ARToolkit, although I don't know whether they offer C# bindings; I have only used ArUco with C++, a while ago.
When you have your markers of choice, the tracking procedure is then pretty straightforward, for each recorded image:
- detect all markers in the current image (these correspond to all players currently visible)
- follow the steps from my last edit using the detected positions
I hope that helps, feel free to contact me again.

Remove lens distortion from images captured by a wide-angle (180°) camera

I have some images captured with a wide-angle (approx. 180 degree) camera.
I am using OpenCV 2.4.8, whose calibration gives me the camera matrix and the distortion coefficients:
MatK = [537.43775285, 0, 327.61133999], [0, 536.95118778, 248.89561998], [0, 0, 1]
MatD = [-0.29741743, 0.14930169, 0, 0, 0]
And this info I have used further to remove the distortion.
But the result is not as expected.
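For reference, the two non-zero entries of MatD are the radial coefficients k1 and k2 of OpenCV's polynomial distortion model (the remaining entries, p1, p2 and k3, are zero here). The sketch below applies that model by hand with the values from the question; the function name is hypothetical. It also prints how fast the underlying pinhole coordinate tan(angle) grows towards 90 degrees off-axis, which may be one reason a lens close to 180 degrees is hard to describe with this model.

```python
import math

# Values copied from MatK and MatD in the question.
FX, FY = 537.43775285, 536.95118778
CX, CY = 327.61133999, 248.89561998
K1, K2 = -0.29741743, 0.14930169

def project_with_distortion(x, y):
    """Map a normalized pinhole coordinate (X/Z, Y/Z) to a distorted pixel.

    Radial part only: scale by (1 + k1*r^2 + k2*r^4), then apply the camera matrix.
    """
    r2 = x * x + y * y
    radial = 1.0 + K1 * r2 + K2 * r2 * r2
    return FX * x * radial + CX, FY * y * radial + CY

# The optical axis maps to the principal point.
center = project_with_distortion(0.0, 0.0)

# Pinhole coordinate of a ray at increasing angles from the optical axis.
pinhole = {deg: math.tan(math.radians(deg)) for deg in (10, 45, 80, 89)}
for deg, x in pinhole.items():
    print(deg, round(x, 3))
```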
I have attached some of the chessboard images that I used for calibration.
Alternatively, is there any other tool or library that can remove the distortion?
input images
from a Normal Camera or even captured by my smart phone
This is not an answer to the question, but something about the "discussion" of distortion and planarness.
In reality you have some straight lines on a pattern:
With (nearly any) lens you'll get some kind of distortion so that those straight lines aren't straight anymore after projection to your image. This effect is much stronger for wide angle lenses. You could expect something like this (for wide angle stronger but similar):
But the images you provided look more like this, which may be because your pattern wasn't really planar on the ground, or because the lens has some additional "hills" on its surface.
The whole point of the calibration process is to tell OpenCV what a straight line looks like under distortion. A chess board is used to present a number of straight lines that are easy for OpenCV to detect. In your image, these lines are simply not straight. I'm moderately sure that OpenCV also needs square boxes.
So, use a real chess board pattern. Print it out, glue it to a piece of wood or hard plastic or whatever. But make sure it's a regular chessboard pattern on a level plane.
The most common method (used by the Oculus Rift Runtime for example) draws a fine enough textured grid for which the texture coordinates or the grid node positions are chosen to compensate the distortion. To obtain the grid normally one fits a polynomial or a spline to some reference picture. For example the checkerboard in your camera is a common calibration target.

three.js - Overlapping layers flickering

When several objects overlap on the same plane, they start to flicker. How do I tell the renderer to put one of the objects in front?
I tried to use .renderDepth, but it only works partly -
see example here: http://liveweave.com/ahTdFQ
Both boxes have the same size and it works as intended. I can change which of the boxes is visible by setting .renderDepth. But if one of the boxes is a bit smaller (say 40,50,50) the contacting layers are flickering and the render depth doesn't work anymore.
How to fix that issue?
When .renderDepth doesn't work, you have to set the depths yourself.
Moving whole meshes around is indeed not really efficient.
What you are looking for are offsets bound to materials:
material.polygonOffset = true;
material.polygonOffsetFactor = -0.1;
should solve your issue. See update here: http://liveweave.com/syC0L4
Use negative factors to display and positive factors to hide.
For starters, try reducing the far range of your camera. Try 1000. Generally speaking, you shouldn't have overlapping faces in your 3D scene unless they are treated in a VERY specific way (look up the term 'decal textures'/'decals'). So basically, you have to create depth offsets, and perhaps even pre-sort the objects when doing this, all of which requires pretty low-level tinkering.
If reducing the far range helps, then you're experiencing a lack of depth precision (which depends on the device). Also look up 'z-fighting'.
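To get a feeling for the precision involved, the sketch below estimates the smallest depth separation that can be resolved at a given distance. It assumes a standard perspective depth mapping and a 24-bit integer depth buffer; both are assumptions about the device, and the function name and the sample near/far values are hypothetical.

```python
def depth_step(z, near, far, bits=24):
    """Approximate smallest resolvable separation at eye distance z.

    Window depth: d(z) = far * (z - near) / (z * (far - near)), in [0, 1].
    One buffer step is 1 / 2**bits; dividing by the slope dd/dz converts it
    into a distance along the view direction.
    """
    slope = far * near / ((far - near) * z * z)
    return (1.0 / 2 ** bits) / slope

# Resolvable separation 500 units from the camera for a few near/far choices.
steps = {
    (near, far): depth_step(500.0, near, far)
    for near, far in ((0.1, 10000.0), (0.1, 1000.0), (1.0, 10000.0))
}
for (near, far), step in steps.items():
    print(f"near={near}, far={far}: {step:.4f}")
```

Under these assumptions, pulling the far plane in from 10000 to 1000 changes the step only slightly, while pushing the near plane out from 0.1 to 1 improves it roughly tenfold, so it may be worth adjusting both.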
UPDATE
Don't overlap planes.
How do I tell the renderer to put one of the objects in front?
You put one object in front of the other :)
For example if you have a camera at 0,0,0 looking at an object at 0,0,10, if you want another object to be behind the first object put it at 0,0,11 it should work.
UPDATE2
What is z-buffering:
http://en.wikipedia.org/wiki/Z-buffering
http://msdn.microsoft.com/en-us/library/bb976071.aspx
Take note of "floating point in range of 0.0 - 1.0".
What is z-fighting:
http://en.wikipedia.org/wiki/Z-fighting
...have similar values in the z-buffer. It is particularly prevalent with
coplanar polygons, where two faces occupy essentially the same space,
with neither in front. Affected pixels are rendered with fragments
from one polygon or the other arbitrarily, in a manner determined by
the precision of the z-buffer.
"The renderer cannot reposition anything."
I think this is completely untrue. The renderer can reposition everything, and probably does, unless it's Shadertoy, a video filter or something similar. Every time you move your camera the renderer repositions everything (the camera is actually the only thing that DOES NOT MOVE).
It seems that you are missing some crucial concepts here. I'd start with this:
http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/
About the depth offset mentioned:
Here is how this would work. Say you want to draw a decal on a surface. You can 'draw' another mesh on this surface, for example by projecting a quad onto it. You want to draw a bullet hole on a concrete wall and end up with two coplanar surfaces: the wall and the bullet hole. You can figure out the depth buffer precision, find the smallest representable step, and then move the bullet hole mesh by that amount towards the camera. The object does not get scaled (you're doing this in NDC, which you can visualize as a cube in which planes are moved back and forth in the smallest possible increment), but it does translate in the depth direction, ending up in front of the other surface.
I don't see any flicker. The cube movement in 3D seems to be super smooth. Can you try on a different computer (maybe a faster one)? I used Chrome on a MacBook Pro.

Resources