Looking for an algorithm to map an image to a 4-sided polygon

This is more a math question than a programming question, besides the fact that I must implement it in Delphi inside a graphics application.
Assume I have a picture of a sheet of paper. The actual sheet of paper is of course a rectangular area. When the picture is shown on a computer screen, the rectangular area is no longer rectangular, because when the picture was taken the camera was not perfectly positioned above the sheet of paper. There are all kinds of perspective effects which result in deformations.
My application needs to tweak the image so that the original rectangular area is displayed as a rectangular area on screen.
Most photo processing software has an interactive tool to do this: the user draws a rectangular area on screen around the rectangular object and then drags each corner to deform the displayed area until the real area appears rectangular. What I'm looking for is the algorithm to do that computation.

You need to split the problem into two steps: find the edges or corners of the sheet, then remap the pixels.
Finding the corners or edges is a really hard problem in general, since they might be invisible, outside of the picture, obstructed, bent or deformed. Assuming you have a very simple setup (black uniform background, white paper, very little distortion), you could run an edge detection kernel over the image and then find the 4 outer edges. If you find the edges you can intersect them to find the corners, and the other way around.
Once you find the corners, run an interpolation over the image to map the pixels onto the rectangle you want. You should be able to get the graphics engine to do this for you if you provide the coordinates of the corners as texture coordinates for the rectangle and map the image as a texture.
I made it sound simple, but you will encounter many parameters to set and experiment with.
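For the simple setup described above, a rough sketch of the edge-then-corner step might look like the following. This is an illustration, not part of the original answer; it assumes OpenCV (version 4's findContours signature) and numpy are available, the function name is mine, and the thresholds are guesses you would need to tune:

    import cv2
    import numpy as np

    def find_paper_corners(image_bgr):
        """Return the 4 corners of the largest quad-like contour, or None."""
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        blurred = cv2.GaussianBlur(gray, (5, 5), 0)
        edges = cv2.Canny(blurred, 50, 150)                 # edge detection kernel
        contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        sheet = max(contours, key=cv2.contourArea)          # assume the sheet is the largest blob
        approx = cv2.approxPolyDP(sheet, 0.02 * cv2.arcLength(sheet, True), True)
        return approx.reshape(-1, 2) if len(approx) == 4 else None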

It seems (because you mentioned bilinear interpolation) that you need perspective transformations.
There is an implementation of perspective transformations (mapping of an arbitrary convex quad to a rectangle and vice versa) in the Anti-Grain Geometry library (see the exe example); there is also a Delphi port.
With agg_trans_perspective one can calculate the matrix of the perspective transformation and then apply it to map coordinates from one quad to another.
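The matrix itself is small enough to compute directly. As a language-neutral illustration (numpy rather than AGG or Delphi, and not the library's actual API), here is a sketch of solving the eight-unknown system that maps one quad onto another:

    import numpy as np

    def perspective_matrix(src_quad, dst_quad):
        """3x3 matrix mapping src_quad onto dst_quad (four (x, y) pairs, same order)."""
        A, b = [], []
        for (x, y), (u, v) in zip(src_quad, dst_quad):
            A.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
            A.append([0, 0, 0, x, y, 1, -x * v, -y * v])
            b.extend([u, v])
        h = np.linalg.solve(np.array(A, float), np.array(b, float))
        return np.append(h, 1.0).reshape(3, 3)

    def apply_perspective(M, x, y):
        u, v, w = M @ np.array([x, y, 1.0])
        return u / w, v / w                                 # homogeneous divide

In practice you would map the four dragged corners to the corners of the target rectangle and then resample every pixel through this matrix (or through its inverse, scanning the destination image).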

Related

Camera Geometry: Algorithm for "object area correction"

A project I've been working on for the past few months calculates the top area of an object captured with a 3D depth camera from a top view.
Workflow of my project:
Capture an image of a group of objects (RGB and depth data) from the top view
Run instance segmentation on the RGB image
Calculate the real area of the segmented mask using the depth data
Some problems with the project:
All given objects have different shapes
The side of an object, not just the top, starts to become visible as the object moves toward the outside of the image.
Because of this, the mask area to be segmented gradually increases.
As a result, the calculated area of an object located toward the outside of the image comes out larger than that of an object located in the center.
In the example image, object 1 is located in the middle of the field of view, so only its top is visible, but object 2 is located toward the edge of the field of view, so part of its top is lost and its side is visible.
Because of this, the mask area to be segmented is larger for objects located on the periphery than for objects located in the center.
I only want to find the area of the top of an object.
Example of what I want:
Is there a way to geometrically correct the area of an object located toward the outside of the image?
I tried to calibrate by multiplying the calculated area by a factor based on the angle between Vector 1, connecting the center of the camera lens to the center point of the floor, and Vector 2, connecting the center of the lens to the center of gravity of the target object.
However, I gave up because I couldn't logically explain how much correction was needed.
What I would do is convert your RGB and depth images to a 3D mesh (a surface with bumps) using your camera settings (FOVs, focal length), something like this:
Align already captured rgb and depth images
and then project it onto the ground plane (perpendicular to the camera view direction in the middle of the screen). To obtain the ground plane, simply take three 3D positions on the ground, p0, p1, p2 (forming a triangle), and use the cross product to compute the ground normal:
n = normalize(cross(p1-p0,p2-p1))
Now your plane is defined by p0 and n, so convert each 3D coordinate by removing its component along the normal (its signed distance to the ground plane). If I see it right, something like this:
p' = p - n * dot(p-p0,n)
That should eliminate the problem with visible sides at the edges of the FOV. However, you should also take into account that when a side is showing, some part of the top is also hidden; to remedy that, you might find the axis of symmetry, use just the half of the top that is not partially hidden, and multiply the measured half-area by 2.
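A minimal numpy sketch of that plane projection (the function name and the assumption that points is an N x 3 array are mine, not the answer's):

    import numpy as np

    def project_to_ground(points, p0, p1, p2):
        """Project N x 3 points onto the ground plane defined by p0, p1, p2."""
        n = np.cross(p1 - p0, p2 - p1)
        n = n / np.linalg.norm(n)            # unit ground normal
        d = (points - p0) @ n                # signed distance of each point to the plane
        return points - np.outer(d, n)       # drop the normal component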
Accurate computation is virtually hopeless, because you don't see all sides.
Assuming your depth information is available as a range image, you can consider the points inside the segmentation mask of a single chicken, estimate the vertical direction from them, then rotate and project the points to obtain the silhouette.
But as a part of the surface is occluded, you may have to reconstruct it using symmetry.
There is no way to do this accurately for arbitrary objects, since there can be parts of the object that contribute to the "top area", but which the camera cannot see. Since the camera cannot see these parts, you can't tell how big they are.
Since all your objects are known to be chickens, though, you could get a pretty accurate estimate like this:
Use Principal Component Analysis to determine the orientation of each chicken.
Using many objects in many images, find a best-fit polynomial that estimates apparent chicken size as a function of distance from the image center and orientation relative to the distance vector.
For any given chicken, then, you can divide its apparent size by the estimated average apparent size for its distance and orientation, to get a normalized chicken size measurement.
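As a hedged sketch of that last idea (the quadratic model and the helper names are illustrative, not a prescribed form), a least-squares fit over many measured chickens might look like:

    import numpy as np

    def fit_size_model(distances, orientations, areas):
        """Fit apparent area ~ quadratic in (distance from center, orientation)."""
        d, o = np.asarray(distances), np.asarray(orientations)
        X = np.column_stack([np.ones_like(d), d, o, d * d, d * o, o * o])
        coeffs, *_ = np.linalg.lstsq(X, np.asarray(areas), rcond=None)
        return coeffs

    def normalized_size(area, distance, orientation, coeffs):
        """Divide a chicken's apparent area by the model's expected area at its pose."""
        features = np.array([1, distance, orientation,
                             distance ** 2, distance * orientation, orientation ** 2])
        return area / (coeffs @ features)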

Isometric Sprites

This might be a stupid question but I'm stuck and can't get past it. I'm making an isometric game and I have my map built using tiles; I just followed this tutorial to build the map: http://www.binpress.com/tutorial/creating-a-city-building-game-with-sfml/137. But now I don't know how to add character sprites. Do I have to add these sprites using tiles as well, or do I just draw the sprites at a position on the screen? Any help would be much appreciated.
As far as I can tell from the engine, just follow the "Textures and Animations" guide and draw the Animation to the screen after you have drawn the tiles. This isn't a complicated engine, so you are only working with 2D sprites being drawn to the screen (the 3D effect is merely a trick of the painter's algorithm to make it work... there is no z-axis, from what the tutorial indicates).
The depth is handled by the order of tile rendering.
The same goes for objects, players, etc. Let's assume the XY plane is parallel with the ground and the Z axis is the altitude. Then your grid would be something like this (assuming a diamond-shaped layout):
Order of rendering
You have to handle object, player and item sprites in the same way as tiles (and at the same time), so you should render all cells in a specific order dependent on your grid layout and sprite combining equation. If your sprites can overwrite already rendered content, then you should render from the most distant tiles to those closest to the "camera". In that case the blue direction arrow in the image above is correct, and the Z axis should increase in the innermost loop.
So if you have any object, player or item placed in cell (x,y,z), render it directly after cell (x,y,z) has been rendered and before rendering any other cell.
To speed things up it is a good idea to store objects and players in your tile map as cell contents, but for that your tiles must be organized appropriately and your map representation must be capable of it.
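A small Python-style sketch of that rendering order (the tile map layout and the helper names here are assumptions, not the tutorial engine's API):

    def render_map(tiles, objects_at, draw_sprite, size_x, size_y):
        """tiles[x][y] is a stack of tile sprites (index = altitude z);
        objects_at(x, y) yields object/player sprites standing on that cell."""
        for x in range(size_x):                              # most distant rows first
            for y in range(size_y):                          # then columns toward the camera
                for z, tile in enumerate(tiles[x][y]):       # altitude in the innermost loop
                    draw_sprite(tile, x, y, z)
                for obj in objects_at(x, y):                 # cell contents right after the cell
                    draw_sprite(obj, x, y, len(tiles[x][y]))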

Best Elliptical Fit for irregular shapes in an image

I have an image with arbitrarily shaped regions (say, objects). Let's assume the background pixels are labeled as zeros, whereas each object has a unique label (pixels of object 1 are labeled 1, pixels of object 2 are labeled 2, ...). Now for every object I need to find the best elliptical fit of its pixels. This requires finding the center of the object, the major and minor axes, and the rotation angle. How can I find these?
Thank you.
Principal Component Analysis (PCA) is one way to go. See Wikipedia here.
The centroid is easy enough to find if your shapes are convex - just a weighted average of intensities over the xy positions - and PCA will give you the major and minor axes, hence the orientation.
Once you have the centre and axes, you have the basis for a set of ellipses that cover your shape. Extending the axes - in proportion - and testing each pixel for in/out, you can find the ellipse that just covers your shape. Or if you prefer, you can project each pixel position onto the major and minor axes and find the rough limits in one pass and then test in/out on "corner" cases.
It may help if you post an example image.
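For illustration, a minimal numpy version of the PCA step described above (assuming the labeled image from the question; the axis scaling is approximate and would need adjusting to fully cover the shape):

    import numpy as np

    def ellipse_params(label_image, label):
        """Centroid, axis lengths and orientation of one labeled region via PCA."""
        ys, xs = np.nonzero(label_image == label)
        pts = np.column_stack([xs, ys]).astype(float)
        centroid = pts.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov((pts - centroid).T))   # ascending order
        angle = np.arctan2(eigvecs[1, 1], eigvecs[0, 1])   # orientation of the major axis
        major, minor = 2 * np.sqrt(eigvals[::-1])          # ~1-sigma axes; scale up to cover
        return centroid, major, minor, angle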
As you seem to be using Matlab, you can simply use the regionprops command, given that you have the Image Processing Toolbox.
It can extract all the information you need (and many more properties of image regions), and it will do the PCA for you if the PCA-based approach suits your needs.
The doc is here; look for the 'Centroid', 'Orientation', 'MajorAxisLength' and 'MinorAxisLength' properties specifically.

Map image/texture to a predefined uneven surface (t-shirt with folds, mug, etc.)

Basically I was trying to achieve this: impose an arbitrary image onto a pre-defined uneven surface (see examples below).
I do not have a lot of experience with image processing or 3D algorithms, so here is the best method I can think of so far:
Predefine a set of coordinates (say, for a 10x10 grid of points, we have 100 coordinates starting with (0,0), (0,10), (0,20), etc.). There will be 9x9 = 81 cells.
Record the transformation of each individual coordinate on the t-shirt image, e.g. (0,0) becomes (51,31), (0,10) becomes (51,35), etc.
Triangulate the original image into 81x2 = 162 triangles (2 triangles for each cell). Transform each triangle of the image based on the coordinate transformations obtained in step 2 and draw it on the t-shirt image.
Problems/questions I have:
I don't know how to smooth out each triangle so that the image on the t-shirt does not look ragged.
Is there a better way to do it? I want to make sure I'm not reinventing the wheel here before I proceed with an implementation.
Thanks!
This is called digital image warping. There was a popular graphics text on it in the 1990s (probably from somebody's thesis). You can also find an article on it from Dr. Dobb's Journal.
Your process is essentially correct. If you work pixel by pixel, rather than trying to use triangles, you'll avoid some of the problems you're facing. Scan across the pixels in the target bitmap, apply the local transformation based on the cell you're in to determine the coordinates of the corresponding pixel in the source bitmap, and copy that pixel over.
For a smoother result, you do your coordinate transformations in floating point and interpolate the pixel values from the source image using something like bilinear interpolation.
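A hedged sketch of that inverse-mapping loop (pure Python/numpy, slow but explicit; inverse_map stands for whatever per-cell transformation you recorded in step 2 and is not a function from the answer):

    import numpy as np

    def bilinear_sample(src, x, y):
        """Bilinearly interpolate the source image at floating-point (x, y)."""
        h, w = src.shape[:2]
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
        fx, fy = x - x0, y - y0
        top = (1 - fx) * src[y0, x0] + fx * src[y0, x1]
        bot = (1 - fx) * src[y1, x0] + fx * src[y1, x1]
        return (1 - fy) * top + fy * bot

    def warp(src, dst_shape, inverse_map):
        """Scan the target bitmap; inverse_map(x, y) -> corresponding source (x, y)."""
        dst = np.zeros(dst_shape, dtype=src.dtype)
        for y in range(dst_shape[0]):
            for x in range(dst_shape[1]):
                sx, sy = inverse_map(x, y)
                if 0 <= sx < src.shape[1] - 1 and 0 <= sy < src.shape[0] - 1:
                    dst[y, x] = bilinear_sample(src, sx, sy)
        return dst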
It's not really a solution to the problem, just a workaround:
If you have a 3D model that represents the t-shirt,
you can use DirectX/OpenGL and apply your image as a texture to the t-shirt.
Then you can render the picture you want from any point of view.

How do I add an outline to a 2d concave polygon?

I'm successfully drawing the convex polys which make up the following white concave shape.
The orange color is my attempt to add a uniform outline around the white shape. As you can see it's not so uniform. On some edges the orange doesn't show at all.
Evidently using...
glScalef(1.1, 1.1, 0.0);
... to draw a slightly larger orange shape before I drew the white shape wasn't the way to go.
I just have a nagging feeling I'm missing a more simple way to do this.
Note that the white part is going to be mapped with a texture which has areas of transparency, so the orange part needs to be behind the white shapes too, not just surrounding them.
Also, I'm using a parallel projection matrix, which is why glScalef's z is set to 0.0 - a reminder that there is no perspective scaling.
Any ideas? Thanks!
Nope, you won't be going anywhere with glScale in this case. Possible options are
a) construct an extruded polygon from the original one (possibly rounding sharp corners)
b) draw the polygon with GL_LINES and set glLineWidth to your desired outline width (in fact you might want to draw the outline with 2x width first)
The first approach will generate CPU load, the second one might slow down rendering significantly AFAIK.
You can displace your polygon in the 8 directions of the compass.
You can have a look at this link: http://simonschreibt.de/gat/cell-shading/
It's a nice trick, and might do the job
Unfortunately there is no simple way to get an outline of consistent width - you just have to do the maths:
For each edge: calculate the normal, scale to the desired width, and add to the edge vertices to get a line segment on the new expanded edge
Calculate the intersection of the lines through two adjacent segments to find the expanded vertex positions
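A small numpy sketch of those two steps (not from the answer itself; counter-clockwise winding is assumed so the outward edge normal is (dy, -dx), and parallel adjacent edges or very sharp concave corners are not handled):

    import numpy as np

    def expand_polygon(vertices, width):
        """Offset a closed polygon outward by `width`, returning the expanded vertices."""
        v = np.asarray(vertices, dtype=float)
        n = len(v)
        dirs = np.roll(v, -1, axis=0) - v                    # edge i runs v[i] -> v[i+1]
        normals = np.column_stack([dirs[:, 1], -dirs[:, 0]])
        normals /= np.linalg.norm(normals, axis=1, keepdims=True)
        offset_pts = v + width * normals                     # a point on each offset edge line
        expanded = []
        for i in range(n):
            # intersect the offset lines of the two edges adjacent to vertex v[i]
            p1, d1 = offset_pts[i - 1], dirs[i - 1]
            p2, d2 = offset_pts[i], dirs[i]
            t, _ = np.linalg.solve(np.column_stack([d1, -d2]), p2 - p1)
            expanded.append(p1 + t * d1)
        return np.array(expanded)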
A distinct answer from those offered to date, posted just for interest: if you're on GLES 2.0 and have access to shaders, then you could render the source polygon to a framebuffer with a texture bound as the colour renderbuffer, then do a second pass to write to the screen (so you're using the image of the white polygon as the input texture and running a post-processing pixel shader over every pixel on the screen) with a shader that obeys the following logic for an outline of thickness q:
if the input is white then output a white pixel
if the input pixel is black then sample every pixel within a radius of q from the current pixel; if any one of them is white then output an orange pixel, otherwise output a black pixel
In practice you'd spend an awful lot of time on texture sampling and probably turn that into the bottleneck. And they'd be mostly dependent reads, which are bad for the pipeline on lots of GPUs - including the PowerVR SGX that powers the overwhelming majority of OpenGL ES 2.0 devices.
EDIT: actually, you could speed this up substantially. If your radius is q, have the hardware generate mipmaps for your framebuffer object and take the first level whose output pixels cover at least q by q pixels of the source image. You've then essentially got a set of bins that'll be pure black if there were no bits of the polygon in that region and pure white if that area was entirely internal to the polygon. For each output fragment that you think might be on the border, you can quite possibly jump straight to a conclusion of definitely in, or definitely out and beyond the border, based on four samples of the mipmap.
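For what it's worth, the per-pixel test in the two bullet points above can be prototyped CPU-side before writing the shader. A numpy/scipy sketch (a dilation by a radius-q disc, which is equivalent to the sampling loop; this is my illustration, not the shader itself):

    import numpy as np
    from scipy import ndimage

    def outline_mask(white_mask, q):
        """white_mask: 2D boolean image of the polygon.
        Returns the outline: black pixels with a white pixel within radius q."""
        ys, xs = np.mgrid[-q:q + 1, -q:q + 1]
        disk = (xs ** 2 + ys ** 2) <= q * q                  # radius-q sampling neighbourhood
        dilated = ndimage.binary_dilation(white_mask, structure=disk)
        return dilated & ~white_mask                         # the pixels to paint orange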
