I have a picture that was captured from a fixed position [X Y Z] and angle [Pitch Yaw Roll] with a focal length of F (I think this information is called the camera matrix).
I want to transform the captured picture so that it looks as if it had been taken from a different position, e.g. from directly above.
The result image should look like this:
In fact I have a picture taken from this position:
and I want to change my picture so that it looks as if it had been taken from this position:
I hope I have managed to express my problem.
Thanks in advance
It can be done accurately only for the (green) plane itself. The 3D objects standing on the plane will be deformed after remapping, but the deformation may be acceptable if their height is small relative to the camera distance.
If the camera is never moving, all you need to do is identify on the perspective image four points that are the four vertices of a rectangle of known size (e.g. the soccer field itself), then compute the homography that maps those four points to that rectangle, and apply it to the whole image.
For details and code, see the OpenCV links at the bottom of that Wikipedia article.
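For illustration, here is a minimal Python/OpenCV sketch of that idea; the corner pixel coordinates, the output size and the file names are made-up values, and in practice the four source points would come from clicking or detecting the field corners in your own image:

import cv2
import numpy as np

# Pixel coordinates of the four field corners in the perspective image
# (made-up values; click or detect them in your own picture).
src = np.float32([[102, 334], [598, 321], [725, 478], [35, 489]])

# Where those corners should land in the top-down view. A soccer field is
# about 105 m x 68 m; at 1 pixel = 0.1 m the target is 1050 x 680 pixels.
dst = np.float32([[0, 0], [1050, 0], [1050, 680], [0, 680]])

H = cv2.getPerspectiveTransform(src, dst)      # 3x3 homography

img = cv2.imread("field.jpg")                  # placeholder file name
topdown = cv2.warpPerspective(img, H, (1050, 680))
cv2.imwrite("field_topdown.jpg", topdown)

The same H can also be applied to individual pixel coordinates with cv2.perspectiveTransform if you only need to remap point positions rather than the whole image.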
In my program (using MATLAB), I specified (by dragging) the pedestrian lane as my Region Of Interest (ROI) with the coordinates [7, 178, 620, 190] (xmin, ymin, width, and height respectively) using the getrect, roipoly and insertShape functions. Refer to the image below.
The video from which this snapshot was taken has a resolution of 640x480 pixels (480p).
Defining a real-world space as my ROI by mouse dragging is barbaric. That's why the ROI coordinates must be derived mathematically.
What I'm aiming at is to use real-world measurements from the video-capture site and apply the Pythagorean theorem from where the camera is positioned:
How do I obtain the equivalent pixel coordinates and parameters using the real-world measurements?
I'll try to split your question into 2 smaller questions.
A) How do I obtain the equivalent pixel coordinates of an interesting point? (practical question)
Your program should be able to detect/recognise a feature or marker that you placed at the "real-world" point of interest. The output is a coordinate in pixels. This can be done quite easily (think of QR codes, for example).
B) What is the analytical relationship between a point in 3D space and its pixel coordinates in the image? (theoretical question)
This is the projection equation based on the pinhole camera model; the 3D coordinates X, Y, Z are related to the pixel coordinates x, y by
s · [x, y, 1]ᵀ = K · [R|t] · [X, Y, Z, 1]ᵀ
Cool, but some details have to be explained (and there is no "automatic short formula"):
s represents a scale factor. A single pixel in an image could be the projection of infinitely many different points, because of perspective. In your video, a pixel showing a piece of a car (while the car is there) is the same pixel that shows a piece of the street underneath (once the car has passed).
So there is no unique relationship starting from pixel coordinates.
The matrix on the left, K, contains the camera parameters (focal length, etc.), which are called the intrinsic parameters. They have to be known to build the relationship between 3D coordinates and pixel coordinates.
The matrix on the right looks trivial here: it is the combination of an identity matrix, representing rotation, and a column of zeros, representing translation, i.e. something like T = [R|t].
Which rotation, which translation? You have to consider that every set of coordinates is implicitly expressed in its own reference system, so you have to determine the relationship between the reference system of your measurements and the camera reference system: not only the position of the camera in your 3D space (Euclidean geometry), but also the orientation of the camera (its angles).
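As a rough numerical illustration of that projection equation (not part of the original answer), here is a short NumPy sketch; the intrinsic values, the camera pose and the 3D point are all assumed numbers:

import numpy as np

# Intrinsic matrix K (assumed values: focal length in pixels, principal point).
fx = fy = 800.0
cx, cy = 320.0, 240.0
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

# Extrinsics [R|t]: here the camera axes coincide with the world axes and the
# camera sits 10 m away along Z (purely illustrative numbers).
R = np.eye(3)
t = np.array([[0.0], [0.0], [10.0]])
Rt = np.hstack([R, t])                         # 3x4 matrix [R|t]

# A 3D point in world coordinates, in homogeneous form [X, Y, Z, 1]^T.
X = np.array([[1.5], [0.5], [0.0], [1.0]])

p = K @ Rt @ X                                 # equals s * [x, y, 1]^T
s = p[2, 0]                                    # the scale factor discussed above
x, y = p[0, 0] / s, p[1, 0] / s
print(f"pixel coordinates: ({x:.1f}, {y:.1f}), scale s = {s:.1f}")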
Been reading this paper:
http://photon07.pd.infn.it:5210/users/dazzi/Thesis_doctorate/Info/Chapter_6/Stereoscopy_(Mrovlje).pdf
to figure out how to use two parallel cameras to find the depth of an object. It seems that we somehow need the field of view of the cameras at the exact plane of the object (which is the very depth the cameras are trying to measure) to get the depth.
Am I interpreting this wrong? Or does anyone else know how to use a pair of cameras to measure the distance of an object from the camera pair?
Kelvin
The camera sensors either have to lie in the same plane or their images have to be rectified so that 'virtually' they lie in the same plane. This is the only requirement, and it simplifies the search for matches between the left and right images: whatever you have in the left image will be located in the right image on the same row, so you don't need to check other rows. You can skip this requirement, but then your search will be more extensive. Once you are done finding correspondences you can figure out the depth from them.
With rectified cameras, the depth is determined from the shift: for example, if the left image has a feature at row 4, column 11 and the right image has this feature at row 4 (same row, since the cameras are rectified), column 1, then we say that the disparity is 11 - 1 = 10. The disparity D is inversely proportional to the depth Z:
Z = f·B/D, where f is the focal length and B is the distance between the cameras (the baseline).
In the end you will have depth estimates everywhere you found correspondences. So-called dense stereo aims to cover more than 90% of the image area, whereas sparse stereo recovers only a few depth measurements.
Note that it is hard to find correspondences if there is little texture on the surface of the object, in other words if it is uniformly coloured. Some cameras, such as the Kinect, project their own pattern onto the objects to compensate for the absence of features.
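If it helps, here is a hedged Python/OpenCV sketch of that pipeline on an already-rectified pair; the file names, focal length, baseline and block-matching parameters are placeholders to be replaced with your own calibration values:

import cv2
import numpy as np

# Rectified left/right images (grayscale); file names are placeholders.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching searches for correspondences along each row of the rectified pair.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0   # fixed point -> pixels

# Depth from disparity: Z = f*B/D (f in pixels, B in metres; assumed values).
f = 700.0          # focal length in pixels
B = 0.12           # baseline: distance between the two cameras, in metres
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = f * B / disparity[valid]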
I am writing a program in Matlab to detect a circle.
I've already managed to detect shapes such as the square, rectangle and the triangle, basically by searching for corners, and determining what shape it is based on the distance between them. The images are black and white, with black being the background and white the shape, so for me to find the corners I just have to search each pixel in the image until I find a white pixel.
However, I just can't figure out how to identify the circle.
Here is an example of what a circle input would look like:
It is difficult to say what the best method is without more information: for example, whether more than one circle may be present, whether it is always centred in the image, and how resilient the algorithm needs to be to distortions. Also whether you need to determine the location and dimensions of the shape or simply a 'yes'/'no' output.
However a really simple approach, assuming only one circle is present, is as follows:
Scan the image from top to bottom until you find the first white pixel at (x1,y1)
Scan the image from bottom to top until you find the first white pixel at (x2,y2)
Derive the diameter of the suspected circle as y2 - y1
Derive the centre of the suspected circle as ((x1+x2)/2, y1+(y2-y1)/2)
Now you are able to score each pixel in the image according to whether it matches this hypothetical circle. For example, if a pixel is inside the suspected circle, score 0 if it is white and 1 if it is black, and vice versa if it is outside the suspected circle.
Sum the pixel scores. If the result is zero then the image contains a perfect circle. A higher score indicates an increasing level of distortion.
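Here is one possible NumPy sketch of that scan-and-score idea, assuming a single white shape on a black background; the details (for instance normalising the score by the circle area) are left to tune:

import numpy as np

def circle_score(img):
    # img: 2D boolean array, True = white shape, False = black background.
    # Returns (is_perfect_circle, score); score 0 means every pixel matches
    # the hypothetical circle, larger scores mean more distortion.
    ys, xs = np.nonzero(img)
    if ys.size == 0:
        return False, None

    # Steps 1-2: topmost and bottommost white pixels.
    x1, y1 = xs[ys.argmin()], ys.min()
    x2, y2 = xs[ys.argmax()], ys.max()

    # Steps 3-4: hypothetical circle derived from those two pixels.
    radius = (y2 - y1) / 2.0
    cx, cy = (x1 + x2) / 2.0, y1 + radius

    # Score every pixel: inside the circle it should be white, outside black.
    yy, xx = np.indices(img.shape)
    inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
    score = int(np.count_nonzero(inside != img))
    return score == 0, score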
I think you should read about these two topics:
Theoretical:
Binary images
Hough transform
Matlab:
Circle Detection via Standard Hough Transform
Hough native in matlab
Binary images
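The question is MATLAB-based, but as a language-neutral illustration of the Hough-transform approach those links describe, here is a small OpenCV (Python) sketch; the file name and parameter values are illustrative and usually need tuning:

import cv2
import numpy as np

img = cv2.imread("shape.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name

# Hough circle transform; dp, minDist, the accumulator threshold (param2)
# and the radius range are illustrative and normally need tuning.
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                           param1=100, param2=30, minRadius=10, maxRadius=0)

if circles is None:
    print("no circle found")
else:
    for cx, cy, r in np.round(circles[0]).astype(int):
        print(f"circle at ({cx}, {cy}) with radius {r}")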
As a followup to my previous question about determining camera parameters I have formulated a new problem.
I have two pictures of the same rectangle:
The first is an image without any transformations and shows the rectangle as it is.
The second image shows the rectangle after some 3D transformation (XYZ-rotation, scaling, XY-translation) is applied. This has caused the rectangle to look like a trapezoid.
I hope the following picture describes my problem:
(image: http://wilco.menge.nl/application.data/cms/upload/transformation%20matrix.png)
How do I determine what transformations (more specifically: what transformation matrix) have caused this transformation?
I know the pixel locations of the corners in both images, hence I also know the distances between the corners.
I'm confused. Is this a 2d or a 3d problem?
The way I understand it, you have a flat rectangle embedded in 3d space, and you're looking at two 2d "pictures" of it - one of the original version and one based on the transformed version. Is this correct?
If this is correct, then there is not enough information to solve the problem. For example, suppose the two pictures look exactly the same. This could be because the transformation is the identity, or it could be because the transformation moves the rectangle twice as far away from the camera and doubles its size (thus making it look exactly the same).
This is a math problem, not programming: you need to define a set of equations (your transformation matrix; my guess is 3 equations) and then solve it for the transformations of the 4 corner points.
I've only ever described this using German terminology, so the above may sound strange.
Based on the information you have, this is not that easy. I will give you some ideas to play with, however. If you had the 3D coordinates of the corners, you'd have an easier time. Here's the basic idea.
Move a corner to the origin. Thereafter, rotations will take place about the origin.
Determine vectors of the axes. Do this by subtracting the adjacent corners from the origin point. These will be a local x and y axis for your world.
Determine angles using the vectors. You can use the dot and cross products to determine the angle between the local x axis and the global x axis (1, 0, 0).
Rotate by the angle in step 3. This will give you a new x axis which should match the global x axis and a new local y axis. You can then determine another rotation about the x axis which will bring the y axis into alignment with the global y axis.
Without the z coordinates, you can see that this will be difficult, but this is the general process. I hope this helps.
The solution will not be unique, as Alex319 points out.
If the second image is really a trapezoid as you say, then this won't be too hard. It is a trapezoid (not a parallelogram) because of perspective, so it must be an isosceles trapezoid.
Draw the two diagonals. They intersect at the center of the rectangle, so that takes care of the translation.
Rotate the trapezoid until its parallel sides are parallel to two sides of the original rectangle. (Which two? It doesn't matter.)
Draw a third parallel through the center. Scale this to the sides of the rectangle you chose.
Now for the rotation out of the plane. Measure the distance from the center to one of the parallel sides and use the law of sines.
If it's not a trapezoid, just a general quadrilateral, then it'll be harder; you'll have to use the angles between the diagonals to find the axis of rotation.
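As a practical complement (not part of either answer above): since the corner pixel locations in both images are known, the 2D mapping between them can be estimated directly as a homography with OpenCV and, if the camera intrinsics are known, decomposed into candidate 3D rotations/translations. The corner values and the intrinsic matrix below are made-up placeholders, and, as Alex319 points out, the decomposition is not unique without extra information:

import cv2
import numpy as np

# Corner pixel locations in the two pictures (illustrative values; use the
# corners you already know). The two arrays must correspond point by point.
rect_corners = np.float32([[100, 100], [400, 100], [400, 300], [100, 300]])
trap_corners = np.float32([[120, 110], [380, 130], [360, 310], [140, 290]])

# 3x3 homography mapping the original rectangle onto the transformed one.
H, _ = cv2.findHomography(rect_corners, trap_corners)
print(H)

# With known camera intrinsics K, the homography can be decomposed into
# candidate rotation/translation/plane-normal triplets; several solutions
# come out, and extra knowledge is needed to pick the physically right one.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
num, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
print(num, "candidate decompositions")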
Where can I find algorithms for image distortions? There is so much information on blur and other classic algorithms, but so little on more complex ones. In particular, I am interested in the swirl-effect image distortion algorithm.
I can't find any references, but I can give a basic idea of how distortion effects work.
The key to the distortion is a function which takes two coordinates (x,y) in the distorted image and transforms them to coordinates (u,v) in the original image. This specifies the inverse function of the distortion, since it takes the distorted image back to the original image.
To generate the distorted image, one loops over x and y, calculates the point (u,v) from (x,y) using the inverse distortion function, and sets the colour components at (x,y) to be the same as those at (u,v) in the original image. One usually uses interpolation (e.g. http://en.wikipedia.org/wiki/Bilinear_interpolation ) to determine the colour at (u,v), since (u,v) usually does not lie exactly on the centre of a pixel, but rather at some fractional point between pixels.
A swirl is essentially a rotation, where the angle of rotation is dependent on the distance from the centre of the image. An example would be:
a = amount of rotation
b = size of effect
angle = a*exp(-(x*x+y*y)/(b*b))
u = cos(angle)*x + sin(angle)*y
v = -sin(angle)*x + cos(angle)*y
Here, I assume for simplicity that the centre of the swirl is at (0,0). The swirl can be put anywhere by subtracting the swirl position coordinates from x and y before the distortion function, and adding them to u and v after it.
There are various swirl effects around: some (like the above) swirl only a localised area, and have the amount of swirl decreasing towards the edge of the image. Others increase the swirling towards the edge of the image. This sort of thing can be done by playing about with the angle= line, e.g.
angle = a*(x*x+y*y)
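Putting the above together, here is a small Python/NumPy sketch of the inverse-mapping approach with the localised swirl formula; the parameters a and b and the use of scipy's map_coordinates for bilinear interpolation are illustrative choices:

import numpy as np
from scipy.ndimage import map_coordinates

def swirl(image, a=2.0, b=100.0):
    # Swirl by inverse mapping. image: 2D (grayscale) array;
    # a = amount of rotation, b = size of the effect; swirl centred mid-image.
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0

    # Coordinates of every output pixel, relative to the swirl centre.
    yy, xx = np.indices((h, w), dtype=np.float64)
    x, y = xx - cx, yy - cy

    # Rotation angle falls off with distance from the centre (localised swirl).
    angle = a * np.exp(-(x * x + y * y) / (b * b))

    # Inverse mapping: where in the source image each output pixel comes from.
    u = np.cos(angle) * x + np.sin(angle) * y + cx
    v = -np.sin(angle) * x + np.cos(angle) * y + cy

    # Bilinear interpolation at the fractional source positions (order=1).
    return map_coordinates(image, [v, u], order=1, mode="nearest")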
There is a Java implementation of a lot of image filters/effects at Jerry's Java Image Filters. Maybe you can take inspiration from there.
The swirl and other effects like it are a matrix transformation on the pixel locations: you make a new image and take the colour from the position in the original image that you get by multiplying the current position by a matrix. The matrix depends on the current position.
Here is a good CodeProject article showing how to do it:
http://www.codeproject.com/KB/GDI-plus/displacementfilters.aspx
There is a new graphics library with many features:
http://code.google.com/p/picasso-graphic/
Take a look at ImageMagick. It's an image conversion and editing toolkit and has interfaces for all popular languages.
The -displace operator can create swirls with the correct displacement map.
If you are for some reason not satisfied with the ImageMagick interface, you can always take a look at the source code of the filters and go from there.