Robust global motion estimation algorithm (image processing)

I am working on a video stabilization project and need a robust algorithm to estimate the transformation matrix between adjacent frames.
My present approach is to detect Harris corners in one frame and apply optical flow to find the corresponding points in the next. I then use RANSAC to estimate the rigid transform between the two point sets, which represents the global motion between the frames. The problem is that when the images are low quality, few or even no keypoints are detected, which makes the estimation of the motion matrix fail.
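For context, here is roughly what my current pipeline looks like (a minimal Python/OpenCV sketch; cv2.estimateAffinePartial2D stands in for the RANSAC rigid-transform estimation, and the detector parameters are just example values):

```python
import cv2

def estimate_motion(prev_gray, curr_gray):
    """Harris corners -> pyramidal LK optical flow -> RANSAC rigid transform."""
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01,
                                 minDistance=10, useHarrisDetector=True)
    if p0 is None:  # this is exactly where low-quality frames fail
        return None
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    good = status.ravel() == 1
    if good.sum() < 3:  # too few correspondences for a reliable estimate
        return None
    m, _inliers = cv2.estimateAffinePartial2D(p0[good], p1[good],
                                              method=cv2.RANSAC)
    return m  # 2x3 motion matrix (rotation + translation), or None
```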
Other methods, such as block-based matching (using phase correlation), have not performed better in my experience. The most accurate method is image registration based on a global motion optimization algorithm, but it's infeasible for my application in terms of computational load.
Could anyone please give me an idea?
Thanks in advance.

Related

Determine distance to vehicle in image (Matlab, OpenCV)

I have 2 images:
1. The initial image, in which I detected the car.
2. The IPM image, after transforming the image to another plane.
I don't have any information about the camera parameters. I know the location of the car in the original image and in the IPM image, and I'd like to know how I can determine the distance between the car and the camera. You can assume the height of the camera is 1 m.
Is there any formula or algorithm for that?
From your question, it appears that you are in the monocular case, where depth/scale information is naturally unavailable. I'm afraid there is no easy way to achieve what you want. Off the top of my head, here are a few options:
1- Using neural networks. This is the least expensive option (from a material and development-effort point of view). Plus, it will work if you do not have a video stream but only single images (although I'm guessing that is not your case). If performance is not an issue, you can take more or less any convolutional neural network and train it on depth data. Otherwise, a quick search will lead you to faster state-of-the-art networks which you can tailor to your needs. However, databases which contain depth-map ground truth are a bit scarce, and you usually have to build the depth maps for your training data yourself. In that case many of the open-source methods listed here come to mind. Once you have the depth maps you can train for monocular depth estimation.
2- You can use stereo cameras. Those naturally give you the depth, via for example simple triangulation (see the sketch after this list).
3- You have a video stream and can use IMUs or car odometry. In that case, you can use any of many (multisensor) Simultaneous Localization And Mapping (SLAM) methods. The literature on the subject is rich, but IMU calibration is usually a bit nightmarish.
4- You can use a cheap GPS receiver (the ublox EVK* series come to mind). In that case, you still need to use some SLAM variant (e.g. constrained bundle adjustment or any Kalman-based method). Assuming low GPS bias (since you show only peri-urban images, which are not affected by multipath), this will give you a decent approximation of the scale.
Note that methods 3 and 4 will give you an estimate of the reconstruction, so if you use sparse (i.e. feature-based) SLAM methods, you will end up having to densify the region where your car is detected (unless crude estimates are OK).
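For completeness, here is a minimal sketch of option 2 with OpenCV's semi-global block matcher; the focal length and baseline values are placeholders (in practice they come from calibration):

```python
import cv2
import numpy as np

f = 700.0        # focal length in pixels (placeholder, from calibration)
baseline = 0.12  # distance between the two cameras in metres (placeholder)

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# semi-global matching; SGBM returns fixed-point disparities (scaled by 16)
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# classic triangulation for a rectified pair: depth = f * B / disparity
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = f * baseline / disparity[valid]
```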

Warping an image to another image and calculating a similarity measure

I have a query on calculating the best match of one image to another through intensity-based registration. I'd like some comments on my algorithm:
1. Compute the warp matrix at this iteration.
2. For every point of image A:
2a. Warp the image A pixel coordinates into image B with the warp matrix.
2b. Perform interpolation to get the corresponding intensity from image B, if the warped coordinate lies inside image B.
2c. Accumulate the similarity-measure value between the image A pixel intensity and the interpolated image B intensity.
3. Cycle through every pixel in image A.
4. Cycle through every possible rotation and translation.
Would this be okay? Is there any relevant OpenCV code we can reference?
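In code terms, here is roughly what I mean (a minimal Python/OpenCV sketch; cv2.warpAffine covers steps 2a and 2b, and I use the sum of squared differences over the overlap as an example similarity measure):

```python
import cv2
import numpy as np

def register_exhaustive(a, b, angles, shifts):
    """Brute-force search over rotations/translations; a, b: float32 grayscale."""
    h, w = a.shape
    best_score, best_params = np.inf, None
    for angle in angles:
        for tx in shifts:
            for ty in shifts:
                m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
                m[0, 2] += tx
                m[1, 2] += ty
                warped = cv2.warpAffine(a, m, (w, h))               # steps 2a + 2b
                mask = cv2.warpAffine(np.ones_like(a), m, (w, h)) > 0.5
                if not mask.any():
                    continue
                score = np.mean((warped[mask] - b[mask]) ** 2)      # step 2c (SSD)
                if score < best_score:
                    best_score, best_params = score, (angle, tx, ty)
    return best_params, best_score
```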
Comments on algorithm
Your algorithm appears sound, although you will have to be careful about:
Edge effects: You need to make sure that the algorithm does not favour matches where most of image A does not overlap image B. e.g. you may wish to compute the average similarity measure and constrain the transformation to make sure that at least 50% of pixels overlap.
Computational complexity. There may be a lot of possible translations and rotations to consider and this algorithm may be too slow in practice.
Type of warp. Depending on your application you may also need to consider perspective/lighting changes as well as translation and rotation.
Acceleration
A similar algorithm is commonly used in video encoders, although most will ignore rotations/perspective changes and just search for translations.
One approach that is quite commonly used is to do a gradient search for the best match. In other words, try tweaking the translation/rotation in a few different ways (e.g. left/right/up/down by 16 pixels) and pick the best match as your new starting point. Then repeat this process several times.
Once you are unable to improve the match, reduce the size of your tweaks and try again.
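A minimal sketch of that search, assuming a score(tx, ty) function that returns a similarity value where higher is better (e.g. the negated SSD from the question):

```python
def hill_climb(score, tx=0, ty=0, step=16):
    """Greedy local search over translations, halving the step when stuck."""
    best = (tx, ty)
    best_score = score(*best)
    while step >= 1:
        moved = True
        while moved:  # keep tweaking while any neighbour improves the match
            moved = False
            for dx, dy in ((-step, 0), (step, 0), (0, -step), (0, step)):
                cand = (best[0] + dx, best[1] + dy)
                s = score(*cand)
                if s > best_score:
                    best, best_score, moved = cand, s, True
        step //= 2  # no improvement at this scale: reduce the tweak size
    return best, best_score
```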
Alternative algorithms
Depending on your application you may want to consider some alternative methods:
Stereo matching. If your 2 images come from a stereo camera then you only really need to search in one direction (and OpenCV provides useful methods to do this)
Known patterns. If you are able to place a known pattern (e.g. a chessboard) in both your images then it becomes a lot easier to register them (and OpenCV provides methods to find and register certain types of pattern)
Feature point matching. A common approach to image registration is to search for distinctive points (e.g. types of corner, or more general places of interest) and then try to find matching distinctive points in the two images. For example, OpenCV contains functions to detect SURF features. Google has published a great paper on using this kind of approach to remove rolling-shutter noise, which I recommend reading.

Image registration (non-rigid / nonlinear)

I'm looking for an algorithm (preferably with source code available) for image registration.
The image deformation can't be described by a homography matrix (because I think the distortion is neither symmetrical nor homogeneous); more specifically, the deformations are like barrel distortion and trapezoid distortion, maybe with some rotation of the image.
I want to obtain pairs of corresponding pixels between the two images, so that I can build a representation of the "deformation field".
I googled a lot and found out that there are some algorithms based on physics ideas, but it seems they can converge to local maxima rather than the global one.
I can afford the program to be semi-automatic, meaning some simple user interaction.
Maybe some algorithms like SIFT would be appropriate? But I think they can't provide a deformation field of sufficiently regular density.
If it's important: there are no scale changes.
Example of a complicated field:
http://www.math.ucla.edu/~yanovsky/Research/ImageRegistration/2DMRI/2DMRI_lambda400_grid_only1.png
What you are looking for is "optical flow". Searching for these terms will yield you numerous results.
In OpenCV, there is a function called calcOpticalFlowFarneback() (in the video module) that does what you want.
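For example (a minimal sketch; the numeric parameters are common defaults, not tuned values):

```python
import cv2

prev = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# flow[y, x] = (dx, dy): dense per-pixel displacement from prev to curr
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2,
                                    flags=0)
```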
The C API still has an implementation of the classic paper by Horn & Schunck (1981), "Determining Optical Flow".
You can also have a look at this work I've done, along with some code (but be careful: there are still some mysterious bugs in the OpenCL memory code; I will release a corrected version later this year): http://lts2www.epfl.ch/people/dangelo/opticalflow
Besides OpenCV's optical flow (and mine ;-), you can have a look at ITK (itk.org) for complete image registration chains (mostly aimed at medical imaging).
There's also a lot of optical flow code (Matlab, C/C++, ...) that can be found via Google, for example cs.brown.edu/~dqsun/research/software.html, gpu4vision, etc.
-- EDIT : about optical flow --
Optical flow algorithms are divided into two families: the dense ones, and the others.
Dense algorithms give one motion vector per pixel; non-dense ones give one vector per tracked feature.
Examples of the dense family include Horn-Schunck and Farneback (to stay with OpenCV), and more generally any algorithm that minimizes some cost function over the whole images (the various TV-L1 flows, etc.).
An example from the non-dense family is the KLT tracker, which is called Lucas-Kanade in OpenCV.
In the dense family, since the motion of each pixel is almost unconstrained, these methods can deal with scale changes. Keep in mind, however, that they can fail in the case of large motions / scale changes because they usually rely on linearizations (Taylor expansions of the motion and image changes). Furthermore, in the variational approach, each pixel contributes to the overall result. Hence, parts that are invisible in one image are likely to pull the algorithm away from the actual solution.
Anyway, techniques such as coarse-to-fine implementations are employed to work around these limits, and these problems usually have only a small impact. Abrupt illumination changes, or large occluded / unoccluded areas, can also be explicitly dealt with by some algorithms; see for example this paper that computes a sparse image of "innovation" alongside the optical flow field.
I found some medical-specific software; it's complicated and doesn't work with simple image formats, but it seems to do what I need.
http://www.csd.uoc.gr/~komod/FastPD/index.html
Drop - Deformable Registration using Discrete Optimization

Image Warp Filter - Algorithm and Rasterization

I'd like to implement a filter that allows resampling of an image by moving a number of control points that mark edges and tangent directions. The goal is to be able to freely transform an image, as seen in Photoshop when you use "Free Transform" and choose the Warp mode "Custom". The image is fitted into some kind of spline patch (if that is a valid name) that can be manipulated.
I understand how simple splines (paths) work but how do you connect them to form a patch?
And how can you sample such a patch to render the morphed image? For each pixel in the target I'd need to know which pixel in the source image it corresponds to. I don't even know where to start searching...
Any helpful info (keywords, links, papers, reference implementations) is greatly appreciated!
This document will give you a good insight into warping: http://www.gson.org/thesis/warping-thesis.pdf
However, it also covers filtering out high frequencies, which makes the implementation a lot more complicated but gives a better result.
An easy way to accomplish what you want is to loop through every pixel in your final image, plug the coordinates into your splines, and retrieve the corresponding pixel in your original image. This pixel might have coordinates 0.4/1.2, so you could bilinearly interpolate between 0/1, 1/1, 0/2 and 1/2.
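For illustration, a minimal sketch of that loop (inverse_warp is a placeholder for your spline evaluation, returning source coordinates for a target pixel):

```python
import numpy as np

def bilinear(img, x, y):
    """Sample img at fractional coordinates (x, y) by bilinear interpolation."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x0 + 1]
    bottom = (1 - fx) * img[y0 + 1, x0] + fx * img[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def warp(src, inverse_warp, out_shape):
    """Fill the target image by pulling each pixel from the source."""
    out = np.zeros(out_shape, dtype=src.dtype)
    h, w = src.shape[:2]
    for ty in range(out_shape[0]):
        for tx in range(out_shape[1]):
            sx, sy = inverse_warp(tx, ty)  # evaluate the spline patch here
            if 0 <= sx < w - 1 and 0 <= sy < h - 1:
                out[ty, tx] = bilinear(src, sx, sy)
    return out
```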
As for splines: there are many resources and solutions online for the 1D case; for 2D it gets a bit trickier to find helpful resources.
A simple example for the 1D case: http://www-users.cselabs.umn.edu/classes/Spring-2009/csci2031/quad_spline.pdf
Here's a great guide for the 2D case: http://en.wikipedia.org/wiki/Bicubic_interpolation
Based upon this you could derive your own spline scheme for the 2D case: define a bivariate polynomial (in x and y) and set up your constraints to solve for the coefficients of the polynomial.
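As a toy illustration of that constraint-solving idea, here is the simplest bivariate case, a bilinear patch f(x, y) = a + bx + cy + dxy fitted to four corner values (a real bicubic patch adds derivative constraints and has 16 coefficients):

```python
import numpy as np

# corners of a unit patch and example values at those corners
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([10.0, 20.0, 30.0, 60.0])  # example data

# one constraint row per corner: [1, x, y, x*y]
A = np.column_stack([np.ones(4), corners[:, 0], corners[:, 1],
                     corners[:, 0] * corners[:, 1]])
a, b, c, d = np.linalg.solve(A, values)  # coefficients of the patch
```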
Just keep in mind that the borders of the spline patches have to be consistent (both in value and derivative) to avoid ugly jumps.
Good luck!

Practical Uses of Fractals in Programming

Fractals have always been a bit of a mystery for me.
What practical uses (beyond rendering to beautiful images) are there for fractals in the various programming problem domains? And please, don't just list areas that use them. I'm interested in specific algorithms and how fractals are used with those algorithms to solve something in practice. Please at least give a short description of the algorithm.
Absolutely computer graphics. It's not about generating beautiful abstract images, but about realistic, non-repeating landscapes. Read about fractal landscapes.
Perlin noise, which might be considered a simple fractal, is used everywhere in computer graphics. Its author joked that if he had patented it, he'd be a millionaire now. Fractals are also used in animation and lossy image compression.
A Peano curve is a space-filling fractal, which allows you to cover a 2-D area (or higher-dimensional region) uniformly with a 1-D path. If you are doing local operations on a multidimensional array, storing and/or accessing the array data in space-filling curve order can increase your cache coherence, for all levels of cache.
Fractal image compression. There are some more applications, though not all in programming, here.
Error diffusion along a Hilbert curve.
It's a simple idea: suppose you convert an image to a 0-1 black-and-white bitmap. Converting a 55%-brightness pixel to white yields a +45% error. Instead of just forgetting it, you keep the 45% to take into account when processing the next pixel. Suppose its value is 80%. Normally it would be converted to white, but a neighboring pixel came out too bright, so taking the +45% error into account, you convert it to black (80% - 45% = 35%), keeping a -35% error to be spread into the next pixels.
This way a 75% gray area will have a white/black pixel ratio close to 75/25, which is good. But if you process the pixels left to right, the error only spreads in one direction, which yields worse-looking images. Enter space-filling curves: processing the pixels along a Hilbert curve gives good locality of the error spread. More here, with pictures.
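A minimal sketch of the idea, using the standard conversion from distance along the Hilbert curve to (x, y) coordinates (input: a square grayscale array with values in [0, 1] and a power-of-two side):

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Map a distance d along a Hilbert curve of the given order to (x, y)."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_dither(gray):
    """1-D error diffusion along a Hilbert curve over a square float image."""
    n = gray.shape[0]
    order = n.bit_length() - 1
    out = np.zeros_like(gray)
    err = 0.0
    for d in range(n * n):
        x, y = hilbert_d2xy(order, d)
        v = gray[y, x] + err                # carry the running error
        out[y, x] = 1.0 if v >= 0.5 else 0.0
        err = v - out[y, x]                 # pass the residual to the next pixel
    return out
```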
Fractals are used in finance for analyzing stock prices. They are also used in the study of complex systems (complexity theory) and in art.
One can compute the fractal dimension, or Hausdorff dimension, of black-and-white images.
It is not that difficult to implement (see the sketch below).
It turns out this is used in biology and medicine to analyze cell samples, for example to analyze how aggressive a cancer cell is, or how far a disease has progressed. A cell is in general healthier the higher its dimension is, meaning you would look for a low fractal dimension in cancer samples.
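A minimal box-counting sketch for a binary image; the estimated dimension is the slope of log(box count) against log(1 / box size):

```python
import numpy as np

def box_counting_dimension(bw):
    """Estimate the fractal (box-counting) dimension of a 2-D boolean array."""
    n = min(bw.shape)
    sizes = [2 ** k for k in range(1, int(np.log2(n)))]
    counts = []
    for s in sizes:
        count = 0
        # count boxes of side s containing at least one foreground pixel
        for i in range(0, bw.shape[0], s):
            for j in range(0, bw.shape[1], s):
                if bw[i:i + s, j:j + s].any():
                    count += 1
        counts.append(count)
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope
```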
Another use of fractal theory is fractal image interpolation. For example, Perfect Resize 7 uses fractals to resize images with very good quality. It is most likely using partitioned iterated function systems (PIFS), which assume that different parts of an image are self-similar to each other. The algorithm is based on searching for self-similar parts of an image and describing the transformations between them.
Fractals are used in image compression; in mobile phones, where antenna chips are designed as fractals for maximum surface area; in texture generation and mountain generation; and in understanding trees, cliffs, and jellyfish, i.e. emulating any natural phenomenon where there is a degree of recursion and self-similarity at different scales. A lot of scientific applications.