I am new to optical flow in image space, and I am confused about whether the optical flow computed in OpenCV by the Lucas-Kanade method is a distance, a displacement, or a velocity. Perhaps I sound foolish, but I am really confused.
I feel it is a velocity, but I just want to confirm.
I assume you are referring to the OpenCV function calcOpticalFlowPyrLK.
This function tracks the positions of interest points found in the old frame and returns their positions in the new frame.
The Lucas-Kanade method estimates the local image flow (velocity) vector at point p.
This method computes the displacement of a set of interest points between two successive frames. The output vector contains the calculated new positions of the input features in the second image, as stated in the documentation: http://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html
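To make the distinction concrete, here is a minimal Python sketch (the file names and the 30 fps frame rate are assumptions, not from the question). calcOpticalFlowPyrLK returns the new point positions; subtracting the old positions gives a displacement in pixels, and dividing by the frame interval turns that into a velocity.

    import cv2
    import numpy as np

    # Two consecutive frames (hypothetical file names), converted to grayscale.
    old_gray = cv2.cvtColor(cv2.imread("frame1.png"), cv2.COLOR_BGR2GRAY)
    new_gray = cv2.cvtColor(cv2.imread("frame2.png"), cv2.COLOR_BGR2GRAY)

    # Pick interest points in the old frame.
    p0 = cv2.goodFeaturesToTrack(old_gray, maxCorners=100, qualityLevel=0.3, minDistance=7)

    # Track them into the new frame; p1 holds the new positions.
    p1, status, err = cv2.calcOpticalFlowPyrLK(old_gray, new_gray, p0, None)

    good_old = p0[status == 1]
    good_new = p1[status == 1]

    displacement = good_new - good_old   # pixels, between the two frames
    dt = 1.0 / 30.0                      # assumed frame interval (30 fps)
    velocity = displacement / dt         # pixels per second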
My aim is to calibrate a pair of cameras and use them for simple measurement purposes. I have already calibrated them using HALCON and have all the necessary intrinsic and extrinsic camera parameters. The next step is to measure known lengths to verify my calibration accuracy. So far I have been using the method intersect_lines_of_sight to achieve this, which has given me unfavourable results: the lengths are off by a couple of centimeters. Is there any other method in HALCON that triangulates and gives me the 3D coordinates of a point? Or are there any leads as to how this can be done? Any help will be greatly appreciated.
Kindly let me know in case this post needs to be updated with code samples.
In HALCON there is also the operator reconstruct_points_stereo, with which you can reconstruct 3D points given the row and column coordinates of corresponding pixels. For this you will need to generate a StereoModel from your calibration data, which is then used in the operator reconstruct_points_stereo.
In your HALCON installation there is a standard HDevelop example that shows the use of this operator. The example is called reconstruct_points_stereo.hdev and can be found in the example browser of HDevelop.
Specifically, I'd ideally want images with point correspondences, a 'gold standard' computed value of F, and the left and right epipoles. I could work with an essential matrix plus intrinsic and extrinsic camera properties too.
I know that I can construct F from two projection matrices, generate left and right projected point coordinates from actual 3D points, and apply Gaussian noise, but I'd really like to work with someone else's reference data, since I'm trying to test the efficacy of my code, and writing more code to test the first batch of (possibly bad) code doesn't seem smart.
Thanks for any help
Regards
Dave
You should work with ground-truth datasets for multi-view reconstruction. I recommend the Middlebury Multi-View Stereo datasets. Besides the image data in a lossless format, they provide camera parameters, such as the camera pose and the intrinsic camera calibration, as well as the possibility to evaluate your own multi-view reconstruction system.
The results may not be computed by "the" gold standard algorithm proposed in the book by Hartley and Zisserman, but you can use the data to compute the fundamental matrices you require between two views.
To compute the fundamental matrix F from two projection matrices P1 and P2, refer to the code Andrew Zisserman provides.
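If you would rather not depend on that code, the same result follows from the standard relation F = [e2]_x P2 P1^+, where e2 = P2 C and C is the first camera's centre (the null vector of P1). A minimal NumPy sketch of that formula (the function name is my own):

    import numpy as np

    def fundamental_from_projections(P1, P2):
        """F maps points in image 1 to epipolar lines in image 2: l2 = F @ x1."""
        # Camera centre of P1: the right null vector of P1 (P1 @ C = 0).
        _, _, Vt = np.linalg.svd(P1)
        C = Vt[-1]                      # homogeneous 4-vector

        # Epipole in image 2: projection of the first camera's centre.
        e2 = P2 @ C

        # Skew-symmetric matrix so that e2x @ v == cross(e2, v).
        e2x = np.array([[0, -e2[2], e2[1]],
                        [e2[2], 0, -e2[0]],
                        [-e2[1], e2[0], 0]])

        F = e2x @ P2 @ np.linalg.pinv(P1)
        return F / F[2, 2]              # fix the arbitrary scale (assumes F[2,2] != 0)

For corresponding homogeneous points x1 and x2, x2.T @ F @ x1 should then be close to zero, which is a handy sanity check against the dataset's correspondences.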
I'd like to implement a filter that allows resampling of an image by moving a number of control points that mark edges and tangent directions. The goal is to be able to freely transform an image as seen in Photoshop when you use "Free Transform" and choose the warp mode "Custom". The image is fitted into some kind of spline patch (if that is a valid name) that can be manipulated.
I understand how simple splines (paths) work, but how do you connect them to form a patch?
And how can you sample such a patch to render the morphed image? For each pixel in the target I'd need to know which pixel in the source image it corresponds to. I don't even know where to start searching...
Any helpful info (keywords, links, papers, reference implementations) is greatly appreciated!
This document will get you a good insight into warping: http://www.gson.org/thesis/warping-thesis.pdf
However, the approach described there also involves filtering out high frequencies, which makes the implementation a lot more complicated but gives a better result.
An easy way to accomplish what you want would be to loop through every pixel in your final image, plug the coordinates into your splines, and retrieve the corresponding pixel in your original image; a sketch of this loop follows below. That pixel might have coordinates 0.4/1.2, so you could bilinearly interpolate between 0/1, 1/1, 0/2 and 1/2.
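A rough sketch of that backward-mapping loop in Python (the spline evaluation is abstracted into a hypothetical inverse_map(x, y) callback that you would supply):

    import numpy as np

    def bilinear_sample(img, x, y):
        """Sample a grayscale image at fractional (x, y); caller keeps (x, y) in bounds."""
        x0, y0 = int(x), int(y)
        fx, fy = x - x0, y - y0
        top    = (1 - fx) * img[y0, x0]     + fx * img[y0, x0 + 1]
        bottom = (1 - fx) * img[y0 + 1, x0] + fx * img[y0 + 1, x0 + 1]
        return (1 - fy) * top + fy * bottom

    def warp(src, inverse_map):
        """inverse_map(x, y) -> (sx, sy): source coords for each target pixel (hypothetical)."""
        h, w = src.shape
        dst = np.zeros_like(src)
        for y in range(h):
            for x in range(w):
                sx, sy = inverse_map(x, y)
                if 0 <= sx < w - 1 and 0 <= sy < h - 1:
                    dst[y, x] = bilinear_sample(src, sx, sy)
        return dst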
As for splines: there are many resources and solutions online for the 1D case. As for 2D it gets a bit trickier to find helpful resources.
A simple example for the 1D case: http://www-users.cselabs.umn.edu/classes/Spring-2009/csci2031/quad_spline.pdf
Here's a great guide for the 2D case: http://en.wikipedia.org/wiki/Bicubic_interpolation
Based on this you could derive your own spline scheme for the 2D case. Define a bivariate polynomial (in x and y) and set up your constraints to solve for the coefficients of the polynomial.
Just keep in mind that the borders of the spline patches have to be consistent (both in value and derivative) to avoid ugly jumps.
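For instance, a single bicubic patch is the bivariate polynomial p(x, y) = sum over i, j = 0..3 of a_ij * x^i * y^j; once the 16 coefficients have been solved for from your constraints (corner values and derivatives), evaluating it is just two small matrix products. A sketch, assuming the coefficients are already available as a 4x4 array:

    import numpy as np

    def eval_bicubic(a, x, y):
        """Evaluate p(x, y) = sum_{i,j} a[i, j] * x**i * y**j for a 4x4 coefficient array a."""
        xs = np.array([1.0, x, x**2, x**3])
        ys = np.array([1.0, y, y**2, y**3])
        return xs @ a @ ys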
Good luck!
I'm trying to build something like the Liquify filter in Photoshop. I've been reading through image-distortion code, but I'm struggling to find out what will create similar effects. The closest reference I could find was the iWarp filter in GIMP, but the code for that isn't commented at all.
I've also looked at places like ImageMagick, but they don't have anything in this area.
Any pointers or a description of algorithms would be greatly appreciated.
Excuse me if I make this sound a little simplistic; I'm not sure how much you know about graphics programming or even what techniques you're using (I'd do it with HLSL myself).
The way I would approach this problem is to generate a texture which contains offsets of x/y coordinates in the r/g channels. Then the output colour of a pixel would be:
Texture inputImage
Texture distortionMap
colour(x,y) = inputImage(x + distortionMap(x, y).R, y + distortionMap(x, y).G)
(To tell the truth, this isn't quite right: using the colours as offsets directly means you can only represent positive vectors, but it's simple enough to subtract 0.5 so that you can represent negative vectors.)
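Outside of a shader, the same idea might look like this NumPy sketch (the map is assumed to store offsets in [0, 1], re-centred around 0.5 and scaled by a made-up strength parameter in pixels):

    import numpy as np

    def apply_distortion_map(src, distortion_map, strength=20.0):
        """src: (H, W) grayscale image; distortion_map: (H, W, 2) with x/y offsets in [0, 1]."""
        h, w = src.shape
        ys, xs = np.mgrid[0:h, 0:w]
        # Re-centre the stored offsets around zero and scale them to pixels.
        dx = (distortion_map[..., 0] - 0.5) * strength
        dy = (distortion_map[..., 1] - 0.5) * strength
        sample_x = np.clip(np.round(xs + dx), 0, w - 1).astype(int)
        sample_y = np.clip(np.round(ys + dy), 0, h - 1).astype(int)
        return src[sample_y, sample_x]   # nearest-neighbour lookup, for brevity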
Now the only problem that remains is how to generate this distortion map, which is a different question altogether (any image would generate a distortion of some kind, obviously; working out a proper liquify effect is quite complex, and I'll leave it to someone more qualified).
I think Liquify works by altering a grid.
Imagine each pixel is defined by its location on the grid.
Now, when the user clicks on a location and moves the mouse, they are changing that grid location.
The new grid is then projected back into the user's 2D viewable space.
Check this tutorial about a way to implement the liquify filter with JavaScript. Basically, in the tutorial, the effect is achieved by transforming the pixel's Cartesian coordinates (x, y) to polar coordinates (r, α) and then applying Math.sqrt to r.
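In code, that radial remapping is just a per-pixel backward map; a rough NumPy sketch of the idea (the centre and radius parameters are my own assumptions, not taken from the tutorial):

    import numpy as np

    def bulge(src, cx, cy, radius):
        """Remap pixels inside a circle by replacing the normalised radius with its sqrt."""
        h, w = src.shape
        ys, xs = np.mgrid[0:h, 0:w]
        dx, dy = xs - cx, ys - cy
        r = np.sqrt(dx**2 + dy**2)
        angle = np.arctan2(dy, dx)

        # Inside the circle, replace r/radius with sqrt(r/radius); outside, leave r alone.
        r_norm = np.clip(r / radius, 0, 1)
        new_r = np.where(r < radius, np.sqrt(r_norm) * radius, r)

        sample_x = np.clip(np.round(cx + new_r * np.cos(angle)), 0, w - 1).astype(int)
        sample_y = np.clip(np.round(cy + new_r * np.sin(angle)), 0, h - 1).astype(int)
        return src[sample_y, sample_x]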
I saw a question on reverse-projecting four 2D points to derive the corners of a rectangle in 3D space. I have a somewhat more general version of the same problem:
Given either a focal length (which can be converted to arcseconds per pixel) or the intrinsic camera matrix (a 3x3 matrix that defines the properties of the pinhole camera model being used; it is directly related to the focal length), compute the camera ray that goes through each pixel.
I'd like to take a series of frames, derive the candidate light rays from each frame, and use some sort of iterative solving approach to derive the camera pose from each frame (given a sufficiently large sample, of course)... All of that is really just a massively parallel implementation of a generalized Hough algorithm... it's getting the candidate rays in the first place that I'm having trouble with...
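For reference, the per-pixel ray directions in the camera frame follow directly from the pinhole model: d = K^-1 [u, v, 1]^T, normalised. A minimal sketch with a made-up 3x3 intrinsic matrix:

    import numpy as np

    # Hypothetical 3x3 intrinsic matrix: focal lengths fx, fy and principal point (cx, cy).
    K = np.array([[800.0,   0.0, 320.0],
                  [  0.0, 800.0, 240.0],
                  [  0.0,   0.0,   1.0]])
    K_inv = np.linalg.inv(K)

    def pixel_ray(u, v):
        """Unit direction (in the camera frame) of the ray through pixel (u, v)."""
        d = K_inv @ np.array([u, v, 1.0])
        return d / np.linalg.norm(d)

Rotating these directions by the camera's extrinsic rotation (and anchoring them at the camera centre) gives the rays in world space, which is what the pose search described above would iterate over.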
A friend of mine found university source code for the camera matching in Photosynth. I'd Google around for it if I were you.
That's a good suggestion, and I will definitely look into it (Photosynth kind of re-sparked my interest in this subject, but I've been working on it for months for robochamps). However, it's a sparse implementation: it looks for "good" features (points in the image that should be easily identifiable in other views of the same scene), and while I certainly plan to score each match based on how good the matched feature is, I want the full dense algorithm that derives every pixel... or should I say voxel, lol?
After a little poking around, isn't it the extrinsic matrix that tells you where the camera actually is in 3-space?
I worked at a company that did a lot of this, but I always used the tools that the algorithm guys wrote. :)