Ceres Solver: Bundle adjustment - camera-calibration

I have a calibrated camera. I took a video with it while moving it in a circular motion.
I want to find the camera extrinsics for each frame with bundle adjustment.
The matching uses cv::findHomography (RANSAC) to remove the outliers, and the result is almost perfect.
After finding the matching points, I use Google's Ceres Solver to do the bundle adjustment.
If there are only two frames, the result is good: I've reprojected the point cloud and it looks correct.
However, I failed to do BA on multiple frames. I've tried several strategies:
(I must keep the scaling factor constant.)
Initialize all points at (0, 0, 100) in world coordinates, initialize all cameras at (0, 0, 0) with zero rotation, fix the first view's extrinsics, and start doing BA.
Run BA several times: the first iteration on frames 1 and 2, the second iteration on frames 2 and 3, and so on. In each iteration I put newly observed points at (0, 0, 100) in camera space. (The reprojection function can't handle points behind the camera.)
The first approach is bad.
The second one is better, but it doesn't allow me to keep the scaling factor constant.
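For reference, a minimal Ceres sketch of this kind of setup (illustrative names; the cost functor and the scale handling are assumptions, not the actual code used here): each camera is a 6-parameter angle-axis + translation block, the intrinsics are fixed since the camera is calibrated, and the first view is held constant with SetParameterBlockConstant to pin the gauge. The scale still needs one extra constraint, e.g. also fixing the second camera's block or a point of known depth.

#include <array>
#include <vector>
#include <ceres/ceres.h>
#include <ceres/rotation.h>

// Hypothetical residual: camera = [angle-axis rx ry rz, tx ty tz], fixed intrinsics.
struct ReprojError {
  ReprojError(double u, double v, double fx, double fy, double cx, double cy)
      : u(u), v(v), fx(fx), fy(fy), cx(cx), cy(cy) {}
  template <typename T>
  bool operator()(const T* cam, const T* pt, T* res) const {
    T p[3];
    ceres::AngleAxisRotatePoint(cam, pt, p);           // world -> camera rotation
    p[0] += cam[3]; p[1] += cam[4]; p[2] += cam[5];    // translation
    res[0] = T(fx) * p[0] / p[2] + T(cx) - T(u);       // reprojection error in pixels
    res[1] = T(fy) * p[1] / p[2] + T(cy) - T(v);
    return true;
  }
  double u, v, fx, fy, cx, cy;
};

struct Obs { int cam, pt; double u, v; };

void SolveBA(std::vector<std::array<double, 6>>& cams,
             std::vector<std::array<double, 3>>& pts,
             const std::vector<Obs>& obs,
             double fx, double fy, double cx, double cy) {
  ceres::Problem problem;
  for (const Obs& o : obs) {
    problem.AddResidualBlock(
        new ceres::AutoDiffCostFunction<ReprojError, 2, 6, 3>(
            new ReprojError(o.u, o.v, fx, fy, cx, cy)),
        new ceres::HuberLoss(1.0),                     // robust loss against residual outliers
        cams[o.cam].data(), pts[o.pt].data());
  }
  // Pin the gauge: keep the first view's extrinsics fixed.
  problem.SetParameterBlockConstant(cams[0].data());
  // To keep the scale constant as well, one additional quantity must be held,
  // e.g. the second camera's block or a point with known depth:
  // problem.SetParameterBlockConstant(cams[1].data());

  ceres::Solver::Options options;
  options.linear_solver_type = ceres::DENSE_SCHUR;
  ceres::Solver::Summary summary;
  ceres::Solve(options, &problem, &summary);
}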


Algorithm for evenly arranging steps in 2 directions

I am currently programming the controller for a CNC machine, and therefore I need to get the number of stepper motor steps in each direction when moving from point A to B.
For example, point A's coordinates are x=0 and y=0 and B's coordinates are x=15 and y=3. So I have to go 15 steps on the x axis and 3 on the y axis.
But how do I get those two values mixed together in a way that is smooth (i.e. not first all of x and then all of y, which results in really ugly lines)?
In my example with x=15 and y=3 I want it arranged like that:
for 3 times do:
x:4 steps y:0 steps
x:1 step y:1 step
But how can I get these numbers from an algorithm?
I hope you get what my problem is, thanks for your time,
Luca
There are 2 major issues here:
trajectory
This can be handled by any interpolation/rasterization algorithm, like:
DDA
Bresenham
DDA is your best option, as it can easily handle any number of dimensions and can be computed in both integer and floating-point arithmetic. It's also faster (that was not true back in the x386 days, but CPU architecture has changed a lot since then).
And even if you have just a 2D machine, the interpolation itself will most likely be multidimensional, as you will probably add other stuff like holding force, tool RPM, pressures for whatever, etc., which has to be interpolated along your line in the same way.
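For illustration, a minimal integer DDA sketch (not taken from any real interpolator) that interleaves the 15/3 example from the question:

#include <cstdio>
#include <cstdlib>

// Integer DDA: distribute |dy| minor-axis steps evenly among |dx| major-axis
// steps (assumes |dx| >= |dy| here just to keep the sketch short).
void dda_steps(int dx, int dy) {
  int sx = (dx >= 0) ? 1 : -1;
  int sy = (dy >= 0) ? 1 : -1;
  dx = std::abs(dx); dy = std::abs(dy);
  int acc = 0;                       // error accumulator for the minor axis
  for (int i = 0; i < dx; ++i) {
    int stepx = sx, stepy = 0;
    acc += dy;
    if (acc >= dx) {                 // time for a minor-axis step
      acc -= dx;
      stepy = sy;
    }
    std::printf("x:%d y:%d\n", stepx, stepy);  // pulse the motors here
  }
}

int main() { dda_steps(15, 3); }     // prints 4 x-only steps, then x+y, three times over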
speed
This one is much, much more complicated. You need to drive your motors smoothly from the start position to the end while taking these into account:
line start/end speeds so you can smoothly connect more lines together
top speed (depends on the manufacturing process, usually constant for each tool)
motor/mechanics resonance
motor speed limits: start/stop and top
When writing about speed I mean the frequency [Hz] of the motor steps or the physical speed of the tool [m/s] or [mm/s].
Linear interpolation is not good for this; I am using cubics instead, as they can be smoothly connected and provide a good shape for the speed change. See:
How can i produce multi point linear interpolation?
The interpolation cubic (a form of Catmull-Rom) is exactly what I use for tasks like this (and I derived it for this very purpose).
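For reference, the standard uniform Catmull-Rom evaluation looks like this (a generic sketch, not necessarily the exact cubic derived above):

// Uniform Catmull-Rom interpolation between p1 and p2, with p0 and p3 as the
// neighbouring control points; t runs from 0 to 1.
double catmull_rom(double p0, double p1, double p2, double p3, double t) {
  double t2 = t * t, t3 = t2 * t;
  return 0.5 * ((2.0 * p1) +
                (-p0 + p2) * t +
                (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2 +
                (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t3);
}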
The main problem is the startup of the motor. You need to drive from 0 Hz to some frequency, but the usual stepper motor has resonances at lower frequencies, and as these cannot be avoided for multidimensional machines, you need to spend as little time at such frequencies as possible. There are also other means of handling this: shifting the resonance of the kinematics by adding weights or changing their shape, and adding inertial dampeners on the motors themselves (rotary motors only).
So the usual speed control for a single start/stop line is a smooth ramp up to the top speed followed by a smooth ramp down.
You should have 2 cubics, one for starting up and one for stopping, dividing your line into 2 joined segments. You have to do it so that the start and stop frequencies are configurable ...
Now how to merge speed and time? I am using discrete non-linear time for this:
Find start point (time) of each cycle in a sine wave
It's the same process, but instead of time there is angle. The frequency of the sine wave changes linearly there, so that is the part you need to replace with the cubic. Also, you do not have a sine wave, so instead use the resulting time as the interpolation parameter for the DDA ... or compare it with the time of the next step, and if it is bigger or equal, do the step and compute the next one ...
Here is another example of this technique:
how to control the speed of animation, using a Bezier curve?
This one actually does exactly what you should be doing: interpolate with DDA, with the speed controlled by a cubic curve.
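A rough sketch of combining the two (illustrative only; a simple cubic ease stands in for the Catmull-Rom speed cubic): integrate the speed profile in small time increments and emit one major-axis step, i.e. one DDA advance, each time the travelled distance crosses the next whole step.

#include <algorithm>
#include <cstdio>
#include <vector>

// Speed in steps/second as a function of the distance s already travelled along
// a line of n steps: cubic ramp from v_start to v_top near both ends.
double speed_at(double s, double n, double v_start, double v_top, double ramp_len) {
  double u = std::min(s, n - s) / ramp_len;     // distance to the nearer end, scaled
  if (u > 1.0) u = 1.0;
  if (u < 0.0) u = 0.0;
  double k = u * u * (3.0 - 2.0 * u);           // cubic ease-in/ease-out
  return v_start + (v_top - v_start) * k;
}

// Firing time [s] of each major-axis step; v_start must be > 0 so the move starts.
std::vector<double> step_times(int n, double v_start, double v_top, double ramp_len) {
  std::vector<double> times;
  const double dt = 1e-5;                       // discrete (non-linear) time base
  double t = 0.0, s = 0.0;
  int emitted = 0;
  while (emitted < n) {
    s += speed_at(s, (double)n, v_start, v_top, ramp_len) * dt;
    t += dt;
    while (emitted < n && s >= emitted + 1) {   // crossed the next whole step
      times.push_back(t);                       // here you would also advance the DDA
      ++emitted;
    }
  }
  return times;
}

int main() {
  for (double t : step_times(15, 200.0, 1000.0, 5.0))
    std::printf("%f\n", t);
}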
When done, you need to build another layer on top of this that configures the speeds for each line of the trajectory so the result is as fast as possible while matching your machine's speed limits and, if possible, the tool speed. This part is the most complicated one...
To show you what is ahead of you: when I put all this together, my CNC interpolator came to ~166 KByte of pure C++ code, not counting dependent libs like vector math, dynamic lists, communication, etc... The whole control code is ~2.2 MByte.
If your controller can issue commands faster than the steppers can actually turn, you probably want to use some kind of event-driven, timer-based system. You need to calculate when to trigger each of the motors so that the motion is distributed evenly on both axes.
The longer motion should be programmed as fast as it can go (that is, if the motor can do 100 steps per second, pulse it every 1/100th of a second) and the other motion at longer intervals.
Edit: the paragraph above assumes that you want to move the tool as fast as possible. This is not normally the case. Usually, the tool speed is given, so you need to calculate the speed along X and Y (and maybe also Z) axes separately from that. You also should know what tool travel distance corresponds to one step of the motor. So you can calculate the number of steps you need to do per time unit, and also duration of the entire movement, and thus time intervals between successive stepper pulses along each axis.
So you program your timer to fire after the smallest of the calculated time intervals, pulse the corresponding motor, program the timer for the next pulse, and so on.
This is a simplification because motors, like all physical objects, have inertia and need time to accelerate/decelerate. So you need to take this into account if you want to produce smooth movement. There are more considerations to be taken into account. But this is more about physics than programming. The programming model stays the same. You model your machine as a physical object that reacts to known stimuli (stepper pulses) in some known way. Your program calculates timings for stepper pulses from the model, and sits in an event loop, waiting for the next time event to occur.
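A sketch of that calculation (steps-per-mm and feed rate are hypothetical values; constant tool speed, acceleration ignored):

#include <cmath>
#include <cstdio>

// Given a straight move of (dx_mm, dy_mm) at a constant tool feed rate, compute
// the number of steps and the pulse interval for one axis.
struct AxisPlan { long steps; double interval_s; };

AxisPlan plan_axis(double d_mm, double move_time_s, double steps_per_mm) {
  AxisPlan p;
  p.steps = std::lround(std::fabs(d_mm) * steps_per_mm);
  p.interval_s = (p.steps > 0) ? move_time_s / p.steps : 0.0;
  return p;
}

int main() {
  const double steps_per_mm = 80.0;             // hypothetical machine constant
  const double feed_mm_s = 20.0;                // commanded tool speed
  const double dx = 15.0, dy = 3.0;             // move from the question, in mm
  const double length = std::hypot(dx, dy);     // tool travel distance
  const double move_time = length / feed_mm_s;  // duration of the whole move

  AxisPlan x = plan_axis(dx, move_time, steps_per_mm);
  AxisPlan y = plan_axis(dy, move_time, steps_per_mm);
  std::printf("X: %ld steps every %.6f s\n", x.steps, x.interval_s);
  std::printf("Y: %ld steps every %.6f s\n", y.steps, y.interval_s);
  // A timer/event loop would now fire whichever axis' next pulse is due first.
}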
Consider Bresenham's line drawing algorithm - he invented it for plotters many years ago. (Also the DDA one.)
In your case the X/Y displacements have a common divisor GCD = 3 > 1, so the steps alternate evenly, but in the general case they won't be distributed so uniformly.
You should take the ratio between the distances along each of the coordinates, and then alternate runs of steps along the coordinate with the longest distance with steps that move a single unit on both coordinates.
Here is an implementation in JavaScript -- using only the simplest of its syntax:
function steps(a, b) {
    const dx = Math.abs(b.x - a.x);
    const dy = Math.abs(b.y - a.y);
    const sx = Math.sign(b.x - a.x); // sign = -1, 0, or 1
    const sy = Math.sign(b.y - a.y);
    const longest = Math.max(dx, dy);
    const shortest = Math.min(dx, dy);
    const ratio = shortest / longest;
    const series = [];
    let longDone = 0;
    let remainder = 0;
    for (let shortStep = 0; shortStep < shortest; shortStep++) {
        // how many long-axis steps to take before the next combined step
        const count = Math.ceil((0.5 - remainder) / ratio);
        if (count > 1) {
            if (dy === longest) {
                series.push( {x: 0, y: (count-1)*sy} );
            } else {
                series.push( {x: (count-1)*sx, y: 0} );
            }
        }
        // one step that moves both axes at once
        series.push( {x: sx, y: sy} );
        longDone += count;
        remainder += count*ratio - 1;
    }
    // finish any remaining steps along the long axis (signed, like the entries above)
    if (longest > longDone) {
        if (dy === longest) {
            series.push( {x: 0, y: (longest-longDone)*sy} );
        } else {
            series.push( {x: (longest-longDone)*sx, y: 0} );
        }
    }
    return series;
}
// Demo
console.log(steps({x: 0, y: 0}, {x: 3, y: 15}));
Note that the first segment is shorter than all the others, so that it is more symmetrical with how the sequence ends near the second point. If you don't like that, then replace the occurrence of 0.5 in the code with either 0 or 1.

High RMS error while "online" cv:stereoCalibration

I have two cameras set up horizontally (close to each other): a left camera cam1 and a right camera cam2.
First I calibrate the cameras (I want to calibrate using 50 pairs of images):
I calibrate both cameras separately using cv::calibrateCamera()
I calibrate stereo using cv::stereoCalibrate()
My questions:
In stereoCalibrate - I assumed that the order of the camera data is important. Should the data from the left camera be imagePoints1 and the data from the right camera imagePoints2, or vice versa, or doesn't it matter as long as the order of the cameras is the same at every point of the program?
In stereoCalibrate - I get an RMS error around 15.9319 and an average reprojection error around 8.4536. I get those values if I use all images from the cameras. In the other case, where I first save the images and select only the pairs in which the whole chessboard is visible (all of the chessboard's squares are in the camera view and every square is visible in its entirety), I get an RMS around 0.7. Does that mean that only offline calibration is good, and that if I want to calibrate the cameras I should select good images manually? Or is there some way to do the calibration online? By online I mean that I start capturing views from the cameras, find the chessboard corners in every view, and after I stop capturing I calibrate the cameras.
I need only four distortion values, but I get five of them (with k3). In the old API version (cvStereoCalibrate2) I got only four values, but in cv::stereoCalibrate I don't know how to do this. Is it even possible, or is the only way to get 5 values and use only four of them later?
My code:
Mat cameraMatrix[2], distCoeffs[2];
distCoeffs[0] = Mat(4, 1, CV_64F);
distCoeffs[1] = Mat(4, 1, CV_64F);
vector<Mat> rvec1, rvec2, tvec1, tvec2;
double rms1 = cv::calibrateCamera(objectPoints, imagePoints[0], imageSize, cameraMatrix[0], distCoeffs[0],rvec1, tvec1, CALIB_FIX_K3, TermCriteria(
TermCriteria::COUNT+TermCriteria::EPS, 30, DBL_EPSILON));
double rms2 = cv::calibrateCamera(objectPoints, imagePoints[1], imageSize, cameraMatrix[1], distCoeffs[1],rvec2, tvec2, CALIB_FIX_K3, TermCriteria(
TermCriteria::COUNT+TermCriteria::EPS, 30, DBL_EPSILON));
qDebug()<<"Rms1: "<<rms1;
qDebug()<<"Rms2: "<<rms2;
Mat R, T, E, F;
double rms = cv::stereoCalibrate(objectPoints, imagePoints[0], imagePoints[1],
cameraMatrix[0], distCoeffs[0],
cameraMatrix[1], distCoeffs[1],
imageSize, R, T, E, F,
TermCriteria(CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 100, 1e-5),
CV_CALIB_FIX_INTRINSIC+
CV_CALIB_SAME_FOCAL_LENGTH);
I had a similar problem. My problem was that I was reading the left images and the right images by assuming that both lists were sorted. Here is the relevant part of the code in Python (the chessboard size and image path below are placeholder values); the fix was to iterate over sorted(images):
import glob
import cv2
import numpy as np

n, m = 9, 6                 # chessboard inner-corner counts (placeholder values)
path_left = 'left/*.png'    # placeholder path pattern for the left camera images
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
objp = np.zeros((n * m, 3), np.float32)              # chessboard model points
objp[:, :2] = np.mgrid[0:n, 0:m].T.reshape(-1, 2)
objpoints1, imgpoints1 = [], []
i = 0

images = glob.glob(path_left)
for fname in sorted(images):
    img = cv2.imread(fname)
    gray1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Find the chess board corners
    ret, corners1 = cv2.findChessboardCorners(gray1, (n, m), None)
    # If found, add object points, image points (after refining them)
    if ret == True:
        i = i + 1
        print("Cam1. Chess pattern was detected")
        objpoints1.append(objp)
        cv2.cornerSubPix(gray1, corners1, (5, 5), (-1, -1), criteria)
        imgpoints1.append(corners1)
        cv2.drawChessboardCorners(img, (n, m), corners1, ret)
        cv2.imshow('img', img)
        cv2.waitKey(100)
The only reason the order of the cameras/image sets is important is the rotation and translation you get from the stereoCalibrate function. The image set you pass to the function first is taken as the base, so the rotation and translation you get describe how the second camera is translated and rotated with respect to the first camera. Of course, you can just invert the result, which is the same as switching the image sets. This only holds if the images in both sets correspond to each other (their order).
This is a bit tricky, but there are a few reasons why you are getting such a big RMS error.
First, I'm not sure how you detect your chessboard corners, but if the whole chessboard is not visible and you provide a valid chessboard model, findChessboardCorners should return false, as it does not detect the chessboard. So you are able to automatically (= online) omit these "chessless" images. Of course, you also have to throw away the image from the second camera, even if that one is valid, to preserve the correct order in both sets.
The second option is to back-project all corners for each image and calculate the reprojection error for each image separately (not only for the whole calibration). Then you can select, for example, the best 3/4 of the images by this error and recalculate the calibration without the outliers.
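A minimal sketch of that per-view check with OpenCV (function and variable names are illustrative, matching the outputs of the calibrateCamera call from the question):

#include <cmath>
#include <vector>
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>

// Per-view reprojection error after cv::calibrateCamera; rvecs/tvecs are the
// per-view extrinsics that calibrateCamera already returned.
std::vector<double> perViewError(const std::vector<std::vector<cv::Point3f>>& objectPoints,
                                 const std::vector<std::vector<cv::Point2f>>& imagePoints,
                                 const std::vector<cv::Mat>& rvecs,
                                 const std::vector<cv::Mat>& tvecs,
                                 const cv::Mat& cameraMatrix,
                                 const cv::Mat& distCoeffs) {
  std::vector<double> errors(objectPoints.size());
  for (size_t i = 0; i < objectPoints.size(); ++i) {
    std::vector<cv::Point2f> projected;
    cv::projectPoints(objectPoints[i], rvecs[i], tvecs[i],
                      cameraMatrix, distCoeffs, projected);
    // RMS distance between detected and back-projected corners for this view
    double err = cv::norm(imagePoints[i], projected, cv::NORM_L2);
    errors[i] = std::sqrt(err * err / projected.size());
  }
  return errors;
}
// Sort the views by error, drop the worst quarter (both cameras' images of that
// pair!), then run calibrateCamera / stereoCalibrate again on the survivors.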
Another reason could be the time sync between snapping images from the 2 cameras. If the delay is big and you move the chessboard continuously, you're actually trying to match projections of a slightly translated chessboard.
If you want a robust online version, I'm afraid you will end up with the second option, as it also helps you get rid of blurred images, wrong detections due to lighting conditions and so on. You just need to set the threshold (how many images you cut off as outliers) carefully so as not to throw away valid data.
I'm not that sure in this field, but I would say you can calculate 5 of them and use only four, because it looks like you are just cutting off a higher order of the Taylor series. But I cannot guarantee it's true.

How to use raw gyroscope data in °/s for calculating 3D rotation?

My question may seem trivial, but the more I read about it - the more confused I get... I have started a little project where I want to roughly track the movements of a rotating object. (A basketball to be precise)
I have a 3-axis accelerometer (low-pass-filtered) and a 3-axis gyroscope measuring °/s.
I know about the issues of a gyro, but as the measurements will only last several seconds and the angles tend to be huge, I don't care about drift and gimbal lock right now.
My gyro gives me the rotation speed around all 3 axes. As I want to integrate the acceleration twice to get the position at each timestep, I wanted to convert the sensor's coordinate system into an earthbound system.
For the first try, I want to keep things simple, so I decided to go with the big standard rotation matrix.
But as my results are horrible, I wonder if this is the right way to do it. If I understood correctly, the matrix is simply 3 matrices multiplied in a certain order. As the rotation of a basketball doesn't have any "natural" order, this may not be a good idea. My sensor measures 3 angular velocities at once. If I throw them into my system "step by step", it will not be correct, since my second matrix calculates the rotation around the "new y-axis", but my sensor actually measured an angular velocity around the "old y-axis". Is that correct so far?
So how can I correctly calculate the 3D rotation?
Do I need to go for quaternions? But how do I get one from 3 different rotations? And don't I have the same issue here again?
I start with an identity matrix ((1, 0, 0), (0, 1, 0), (0, 0, 1)) multiplied with the acceleration vector to give me the first movement.
Then I want to use the rotation matrix to find out where the next acceleration is really heading, so I can simply add the accelerations together.
But right now I am just too confused to find a proper way.
Any suggestions?
btw. sorry for my poor english, I am tired and (obviously) not a native speaker ;)
Thanks,
Alex
Short answer
Yes, go for quaternions and use a first order linearization of the rotation to calculate how orientation changes. This reduces to the following pseudocode:
float pose_initial[4]; // quaternion describing original orientation
float g_x, g_y, g_z; // gyro rates
float dt; // time step. The smaller the better.
// quaternion with "pose increment", calculated from the first-order
// linearization of continuous rotation formula
delta_quat = {1, 0.5*dt*g_x, 0.5*dt*g_y, 0.5*dt*g_z};
// final orientation at start time + dt
pose_final = quaternion_hamilton_product(pose_initial, delta_quat);
This solution is used in PixHawk's EKF navigation filter (it is open source, check out formulation here). It is simple, cheap, stable and accurate enough.
Unit matrix (describing a "null" rotation) is equivalent to quaternion [1 0 0 0]. You can get the quaternion describing other poses using a suitable conversion formula (for example, if you have Euler angles you can go for this one).
Notes:
Quaternions following [w, i, j, k] notation.
These equations assume angular speeds in SI units, that is, radians per second.
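A concrete version of the pseudocode above as a minimal C++ sketch, with the Hamilton product written out ([w, x, y, z] convention, gyro rates in rad/s):

#include <cmath>

struct Quat { double w, x, y, z; };

// Hamilton product a*b (both quaternions in [w, x, y, z] convention).
Quat qmul(const Quat& a, const Quat& b) {
  return { a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z,
           a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y,
           a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x,
           a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w };
}

Quat qnormalize(Quat q) {
  double n = std::sqrt(q.w*q.w + q.x*q.x + q.y*q.y + q.z*q.z);
  return { q.w/n, q.x/n, q.y/n, q.z/n };
}

// First-order update: gyro rates gx, gy, gz in rad/s, time step dt in seconds.
Quat integrate_gyro(const Quat& pose, double gx, double gy, double gz, double dt) {
  Quat delta = { 1.0, 0.5*dt*gx, 0.5*dt*gy, 0.5*dt*gz };
  return qnormalize(qmul(pose, delta));   // renormalize to fight numeric drift
}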
Long answer
A gyroscope describes the rotational speed of an object as a decomposition into three rotational speeds around the orthogonal local axes XYZ. However, you could equivalently describe the rotational speed as a single rate around a certain axis, either in a reference system that is local to the rotated body or in a global one.
The three rotational speeds affect the body simultaneously, continuously changing the rotation axis.
Here we have the problem of switching from the continuous-time real world to a simpler discrete-time formulation that can be easily solved using a computer. When discretizing, we are always going to introduce errors. Some approaches will lead to bigger errors, while others will be notably more accurate.
Your approach of concatenating three simultaneous rotations around orthogonal axes works reasonably well with small integration steps (let's say smaller than 1/1000 s, although it depends on the application), so that you are simulating the continuous change of the rotation axis. However, this is computationally expensive, and the error grows as you make the time steps bigger.
As an alternative to first-order linearization, you can calculate pose increments as a small delta of angular speed gradient (also using quaternion representation):
quat_gyro = {0, g_x, g_y, g_z};
q_grad = 0.5 * quaternion_product(pose_initial, quat_gyro);
// Important to normalize result to get unit quaternion!
pose_final = quaternion_normalize(pose_initial + q_grad*dt);
This technique is used in the Madgwick orientation filter (here is an implementation), and it works pretty well for me.

XNA 2D Camera losing precision

I have created a 2D camera (code below) for a top-down game. Everything works fine when the player's position is close to 0.0x and 0.0y.
Unfortunately, as the distance increases the transform seems to have problems. At around 0.0x, 30e7y (yup, that's 30 million y) the camera starts to shudder when the player moves (the camera gets updated with the player position at the end of each update). At really big distances, a billion plus, the camera won't even track the player, as I'm guessing whatever error is in the matrix is amplified too much.
My question is: Is there either a problem in the matrix, or is this standard behavior for such extreme numbers?
Camera Transform Method:
public Matrix getTransform()
{
Matrix transform;
transform = (Matrix.CreateTranslation(new Vector3(-position.X, -position.Y, 0)) *
Matrix.CreateRotationZ(rotation) * Matrix.CreateScale(new Vector3(zoom, zoom, 1.0f)) *
Matrix.CreateTranslation(new Vector3((viewport.Width / 2.0f), (viewport.Height / 2.0f), 0)));
return transform;
}
Camera Update Method:
This requests the objects position given it's ID, it returns a basic Vector2 which is then set as the cameras position.
if (camera.CameraMode == Camera2D.Mode.Track && cameraTrackObject != Guid.Empty)
{
camera.setFocus(quadTree.getObjectPosition(cameraTrackObject));
}
If anyone can see an error or enlighten me as to why the matrix struggles, I would be most grateful.
I have actually found the reason for this; it was something I should have thought of.
I'm using single-precision floating-point numbers, which only have about 7 digits of precision. That's fine for smaller numbers (up to around the 2.5 million mark, I have found). Anything over this and the multiplication functions in the matrix start to accumulate precision errors as the floats start to truncate.
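A quick standalone illustration of the effect (plain C++, but the same IEEE 754 single-precision behaviour applies to XNA's float-based vectors):

#include <cstdio>

int main() {
  float position = 30e7f;            // a camera position far from the origin
  float moved = position + 0.5f;     // move the player by half a unit
  // At this magnitude the spacing between representable floats is about 32,
  // so the half-unit movement is lost entirely:
  std::printf("%.1f -> %.1f (delta = %.1f)\n", position, moved, moved - position);
}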
The best solution for my particular problem is to introduce some artificial scaling (I need the very large numbers as the simulation is set in space). I have limited my worlds to 5 million units squared (+/- 2.5 million units) and will come up with another way of granulating the world.
I also found a good answer about this here:
Vertices shaking with large camera position values
And a good article that discusses floating points in more detail:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Thank you for the views and comments!!

Finding the angle of stripeline/ Angle of rotation

So I’m trying to find the rotational angle for stripe lines in images like the attached photo.
The only assumption is that the lines are parallel, and their orientation is approximately 90 degrees, more or less [say 5 degrees of tolerance].
I have to make sure the stripe lines in the result image will be 100% vertical. The quality of the images varies, as do their histogram/greyscale values, so methods based on non-adaptive thresholding have already failed for my cases [I'm not interested in thresholding-based methods if I cannot make them adaptive]. Also, there are sometimes some random black clusters on top of the stripe lines.
What I did so far:
1) Of course HoughLines is the first option, but I couldn't make it work for all my images; I had some partial success, though, following this great article:
http://felix.abecassis.me/2011/09/opencv-detect-skew-angle/.
The main reason for the failure, to my understanding, was that I needed to fine-tune the parameters for different images: parameters for Canny/BW/morphological edge detection (if needed), parameters like minLineLength/maxLineGap/etc. For sure there's a way to hack at this and make it work, but to me this is a fragile solution!
2) What I'm working on right now is to divide the image into a top slice and a bottom slice, then find the peaks and valleys of each slice, and then basically find the angle using the width of the image and the translation of the peaks. I'm currently working on finding which peak of the top slice belongs to which peak of the bottom slice, since there will be some false-positive peaks in my computation due to the existence of black/white clusters on top of the stripe lines.
Example: Location of peaks for slices:
Top slice = {1, 33, 67, 90, 110}
Bottom slice = {3, 14, 35, 63, 90, 104}
I am actually getting similar vectors when extracting the peaks. As can be seen, the length of the vectors might vary; any idea how I can get a grouping like:
{{1,3},{33,35},{67,63},{90,90},{110,104}}
I’m open to any idea about improving any of these algorithms or a completely new approach. If needed, I can upload more images.
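For illustration, a greedy nearest-neighbour pairing with a distance threshold already reproduces the desired grouping for the example vectors above, assuming matching peaks shift by only a few pixels between the slices; a rough sketch:

#include <cstdio>
#include <cstdlib>
#include <utility>
#include <vector>

// Pair each top-slice peak with the nearest unused bottom-slice peak, rejecting
// matches farther apart than max_shift (tolerance for the expected skew).
std::vector<std::pair<int,int>> match_peaks(const std::vector<int>& top,
                                            const std::vector<int>& bottom,
                                            int max_shift) {
  std::vector<std::pair<int,int>> pairs;
  std::vector<bool> used(bottom.size(), false);
  for (int t : top) {
    int best = -1, best_d = max_shift + 1;
    for (size_t j = 0; j < bottom.size(); ++j) {
      int d = std::abs(bottom[j] - t);
      if (!used[j] && d < best_d) { best = (int)j; best_d = d; }
    }
    if (best >= 0) { used[best] = true; pairs.push_back({t, bottom[best]}); }
    // peaks with no close partner (false positives) are simply dropped
  }
  return pairs;
}

int main() {
  std::vector<int> top = {1, 33, 67, 90, 110};
  std::vector<int> bottom = {3, 14, 35, 63, 90, 104};
  for (auto& p : match_peaks(top, bottom, 8))
    std::printf("{%d,%d} ", p.first, p.second);   // {1,3} {33,35} {67,63} {90,90} {110,104}
}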
If you can get a list of points for a single line, a linear regression will give you a formula for the straight line that best fits the points. A simple trig operation will convert the line formula to an angle.
You can probably use some line thinning operation to turn the stripes into a list of points.
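A minimal sketch of that idea (illustrative; since the stripes are near-vertical it regresses x on y, so the slope stays finite and 0 degrees means already vertical):

#include <cmath>
#include <cstdio>
#include <vector>

static const double PI = 3.14159265358979323846;

struct Pt { double x, y; };

// Least-squares fit of x = a*y + b over the points of one (thinned) stripe,
// then convert the slope into the rotation needed to make the stripe vertical.
double stripe_angle_deg(const std::vector<Pt>& pts) {
  double sy = 0, sx = 0, syy = 0, syx = 0;
  for (const Pt& p : pts) { sy += p.y; sx += p.x; syy += p.y * p.y; syx += p.y * p.x; }
  const double n = (double)pts.size();
  const double a = (n * syx - sy * sx) / (n * syy - sy * sy);   // slope dx/dy
  return std::atan(a) * 180.0 / PI;
}

int main() {
  // A synthetic stripe tilted by ~2 degrees, standing in for real thinned points.
  std::vector<Pt> pts;
  for (int y = 0; y < 100; ++y)
    pts.push_back({100.0 + std::tan(2.0 * PI / 180.0) * y, (double)y});
  std::printf("angle = %.2f degrees\n", stripe_angle_deg(pts));
}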
You can run an accumulator of spatial derivatives along different angles. If you want a half-degree precision and a sample of 5 lines, you have a maximum 10*5*1500 = 7.5m iterations. You can safely reduce the sampling rate along the line tenfold, which will give you a sample size of 150 points per sample, reducing the number of iterations to less than a million. Somewhere around that point the operation of straightening the image ought to become the bottleneck.
