Transform point position in trapezoid to rectangle position - matrix

I am trying to find out how I can transform a coordinate Pxy within the green trapezoid below into the equivalent coordinate on the real ground plane.
I have the exact measurements of the room, meaning I can say exactly how long A, B, C and D are in the room shown below.
I also know how long A, B, C and D are in the green trapezoid (coordinate-wise).
I have already been reading about homography and matrix transformations, but can't really wrap my head around it. Any input steering me in the right direction would be appreciated.
Thanks!

Here is code that computes the perspective transformation matrix using the OpenCV library (it shows how to transform your trapezoid to a rectangle and how to find the transformation matrix for further calculations):
//example from book
// Learning OpenCV: Computer Vision with the OpenCV Library
// by Gary Bradski and Adrian Kaehler
// Published by O'Reilly Media, October 3, 2008
#include <cv.h>
#include <highgui.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char* argv[])
{
IplImage *src=0, *dst=0;
// absolute or relative path to image should be in argv[1]
char* filename = argc == 2 ? argv[1] : "Image0.jpg";
// get the picture
src = cvLoadImage(filename,1);
printf("[i] image: %s\n", filename);
assert( src != 0 );
// corner points of the source and destination quadrilaterals
CvPoint2D32f srcQuad[4], dstQuad[4];
// transformation matrix
CvMat* warp_matrix = cvCreateMat(3,3,CV_32FC1);
// clone image
dst = cvCloneImage(src);
// define all the points
//here the coordinates of corners of your trapezoid
srcQuad[0].x = ??; //src Top left
srcQuad[0].y = ??;
srcQuad[1].x = ??; //src Top right
srcQuad[1].y = ??;
srcQuad[2].x = ??; //src Bottom left
srcQuad[2].y = ??;
srcQuad[3].x = ??; //src Bot right
srcQuad[3].y = ??;
//- - - - - - - - - - - - - -//
//corresponding corners of the rectangle in the destination image
dstQuad[0].x = 0; //dst Top left
dstQuad[0].y = 0;
dstQuad[1].x = src->width-1; //dst Top right
dstQuad[1].y = 0;
dstQuad[2].x = 0; //dst Bottom left
dstQuad[2].y = src->height-1;
dstQuad[3].x = src->width-1; //dst Bot right
dstQuad[3].y = src->height-1;
// get transformation matrix that you can use to calculate
//coordinates of point Pxy
cvGetPerspectiveTransform(srcQuad,dstQuad,warp_matrix);
// perspective transformation
cvWarpPerspective(src,dst,warp_matrix);
cvNamedWindow( "cvWarpPerspective", 1 );
cvShowImage( "cvWarpPerspective", dst );
cvWaitKey(0);
cvReleaseMat(&warp_matrix);
cvReleaseImage(&src);
cvReleaseImage(&dst);
cvDestroyAllWindows();
return 0;
}
Hope it will be helpful!
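If, as the comments in the code suggest, you only need the transformed coordinates of a single point Pxy rather than a warped image, here is a minimal sketch using the newer C++ API; the corner values are placeholders for your measured trapezoid corners and the real room dimensions:
#include <opencv2/imgproc.hpp>
#include <vector>

// Maps a point measured inside the trapezoid to its position on the
// rectangular ground plane. The corner values below are placeholders --
// fill in your measured trapezoid corners and the real room dimensions.
cv::Point2f mapPointToGround(const cv::Point2f& pxy)
{
    std::vector<cv::Point2f> srcQuad = { {0, 0}, {640, 0}, {0, 480}, {640, 480} };
    std::vector<cv::Point2f> dstQuad = { {0, 0}, {500, 0}, {0, 300}, {500, 300} };

    // Same matrix as cvGetPerspectiveTransform above, just the C++ API.
    cv::Mat H = cv::getPerspectiveTransform(srcQuad, dstQuad);

    // perspectiveTransform applies the homography, including the perspective
    // divide, to a list of points -- no need to warp the whole image.
    std::vector<cv::Point2f> in = { pxy }, out;
    cv::perspectiveTransform(in, out, H);
    return out[0];
}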

If I understand your question correctly, you are looking for the transform matrix that expresses the position and orientation (aka the "pose") of your camera in relation to the world. If you have this matrix - let's call it M - you can map any point from your camera coordinate frame to the world coordinate frame and vice versa. In your case you'll want to transform a rectangle onto the ground plane (the plane through the origin with normal (0, 1, 0)^T) in world coordinates.
There are several ways to derive this pose matrix. First of all you'll need to know another matrix - K - which describes the internal camera parameters used to convert positions in the camera coordinate frame to actual pixel positions. This involves a standard pinhole projection as well as radial distortion and a few other things.
To determine both K and M you have to calibrate your camera. This is usually done with a calibration pattern (e.g. a chessboard pattern) for which the positions of the chessboard squares are known. Then you can establish so-called point correspondences between the known positions on the pattern and the observed pixel positions. Once you have enough of these point pairs you can solve for a matrix H = K*M. This is the homography matrix you've mentioned already. Once you have that, you can recover K and M from it.
So much for the theory. For the practical part, I would suggest having a look at the OpenCV documentation (e.g. you could start here: OpenCV Camera calibration, and here: OpenCV Pose estimation).
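As a rough sketch of the pose-estimation step (assuming K and the distortion coefficients have already been obtained from cv::calibrateCamera, and that the four room corners A, B, C, D are used as the known world points; all coordinate values are placeholders):
#include <opencv2/calib3d.hpp>
#include <vector>

// Sketch of the pose-estimation step: K and distCoeffs come from a prior
// cv::calibrateCamera run; the corner coordinates are placeholder values
// standing in for the measured room corners A, B, C, D.
void estimatePose(const cv::Mat& K, const cv::Mat& distCoeffs)
{
    std::vector<cv::Point3f> worldPts = { {0, 0, 0}, {4, 0, 0}, {4, 3, 0}, {0, 3, 0} };
    std::vector<cv::Point2f> imagePts = { {120, 400}, {520, 410}, {460, 250}, {180, 240} };

    cv::Mat rvec, tvec; // rotation (as a Rodrigues vector) and translation: the pose M
    cv::solvePnP(worldPts, imagePts, K, distCoeffs, rvec, tvec);

    cv::Mat R;
    cv::Rodrigues(rvec, R); // expand to a full 3x3 rotation matrix if needed
}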
I hope this will point you in the right direction ;)

Just for the sake of completeness: I ended up looking at the thread suggested by @mmgp and implemented a solution that is equivalent to the one presented by Christopher R. Wren:
Perspective Transform Estimation
This turned out to work really well for my case, although there was some distortion from the camera.


How to convert a screen coordinate into a translation for a projection matrix?

(More info at the end.)
I am trying to render a small picture-in-picture display over my scene. The PiP is just a smaller texture, but it is intended to reveal secret objects in the scene when it is placed over them.
To do this, I want to render my scene, then render the SAME scene on the smaller texture, but with the exact same positioning as the main scene. The intended result would be something like this:
My problem is... I cannot get the scene on the smaller texture to match up 1:1. I keep trying various kludges, but ultimately I suspect that I need to do something to the projection matrix to pan it over to the location of the frame. I can get it to zoom correctly...just can't get it to pan.
Can anyone suggest what I need to do to my projection matrix to render my scene 1:1 (but panned by x,y) onto a smaller texture?
The data I have:
Resolution of the full-screen framebuffer
Resolution of the smaller texture
XY coordinate where I want to draw the smaller texture as an overlay sprite
The world/view/projection matrices from the original full-screen scene
The viewport from the original full-screen scene
(Edit)
Here is the function I use to produce the 3D camera:
void Make3DCamera(Vector theCameraPos, Vector theLookAt, Vector theUpVector, float theFOV, Point theRez, Matrix& theViewMatrix,Matrix& theProjectionMatrix)
{
Matrix aCombinedViewMatrix;
Matrix aViewMatrix;
aCombinedViewMatrix.Scale(1,1,-1);
theCameraPos.mZ*=-1;
theLookAt.mZ*=-1;
theUpVector.mZ*=-1;
aCombinedViewMatrix.Translate(-theCameraPos);
Vector aLookAtVector=theLookAt-theCameraPos;
Vector aSideVector=theUpVector.Cross(aLookAtVector);
theUpVector=aLookAtVector.Cross(aSideVector);
aLookAtVector.Normalize();
aSideVector.Normalize();
theUpVector.Normalize();
aViewMatrix.mData.m[0][0] = -aSideVector.mX;
aViewMatrix.mData.m[1][0] = -aSideVector.mY;
aViewMatrix.mData.m[2][0] = -aSideVector.mZ;
aViewMatrix.mData.m[3][0] = 0;
aViewMatrix.mData.m[0][1] = -theUpVector.mX;
aViewMatrix.mData.m[1][1] = -theUpVector.mY;
aViewMatrix.mData.m[2][1] = -theUpVector.mZ;
aViewMatrix.mData.m[3][1] = 0;
aViewMatrix.mData.m[0][2] = aLookAtVector.mX;
aViewMatrix.mData.m[1][2] = aLookAtVector.mY;
aViewMatrix.mData.m[2][2] = aLookAtVector.mZ;
aViewMatrix.mData.m[3][2] = 0;
aViewMatrix.mData.m[0][3] = 0;
aViewMatrix.mData.m[1][3] = 0;
aViewMatrix.mData.m[2][3] = 0;
aViewMatrix.mData.m[3][3] = 1;
if (gG.mRenderToSprite) aViewMatrix.Scale(1,-1,1);
aCombinedViewMatrix*=aViewMatrix;
// Projection Matrix
float aAspect = (float) theRez.mX / (float) theRez.mY;
float aNear = gG.mZRange.mData1;
float aFar = gG.mZRange.mData2;
float aWidth = gMath.Cos(theFOV / 2.0f);
float aHeight = gMath.Cos(theFOV / 2.0f);
if (aAspect > 1.0) aWidth /= aAspect;
else aHeight *= aAspect;
float s = gMath.Sin(theFOV / 2.0f);
float d = 1.0f - aNear / aFar;
Matrix aPerspectiveMatrix;
aPerspectiveMatrix.mData.m[0][0] = aWidth;
aPerspectiveMatrix.mData.m[1][0] = 0;
aPerspectiveMatrix.mData.m[2][0] = gG.m3DOffset.mX/theRez.mX/2;
aPerspectiveMatrix.mData.m[3][0] = 0;
aPerspectiveMatrix.mData.m[0][1] = 0;
aPerspectiveMatrix.mData.m[1][1] = aHeight;
aPerspectiveMatrix.mData.m[2][1] = gG.m3DOffset.mY/theRez.mY/2;
aPerspectiveMatrix.mData.m[3][1] = 0;
aPerspectiveMatrix.mData.m[0][2] = 0;
aPerspectiveMatrix.mData.m[1][2] = 0;
aPerspectiveMatrix.mData.m[2][2] = s / d;
aPerspectiveMatrix.mData.m[3][2] = -(s * aNear / d);
aPerspectiveMatrix.mData.m[0][3] = 0;
aPerspectiveMatrix.mData.m[1][3] = 0;
aPerspectiveMatrix.mData.m[2][3] = s;
aPerspectiveMatrix.mData.m[3][3] = 0;
theViewMatrix=aCombinedViewMatrix;
theProjectionMatrix=aPerspectiveMatrix;
}
Edit to add more information:
Just playing and tweaking numbers, I have come to a "close" result. However, the "close" result requires multiplying by some kludge numbers that I don't understand.
Here's what I'm doing to the perspective matrix to produce my close result:
//Before calling Make3DCamera, adjusting FOV:
aFOV*=smallerTexture.HeightF()/normalRenderSize.HeightF(); // Zoom it
aFOV*=1.02f; // <- WTH is this?
//Then, to pan the camera over to the x/y position I want, I do:
Matrix aPM=GetCurrentProjectionMatrix();
float aX=(screenX-normalRenderSize.WidthF()/2.0f)/2.0f;
float aY=(screenY-normalRenderSize.HeightF()/2.0f)/2.0f;
aX*=1.07f; // <- WTH is this?
aY*=1.07f; // <- WTH is this?
aPM.mData.m[2][0]=-aX/normalRenderSize.HeightF();
aPM.mData.m[2][1]=-aY/normalRenderSize.HeightF();
SetCurrentProjectionMatrix(aPM);
When I do this, my new picture is VERY close... but not exactly perfect-- the small render tends to drift away from "center" the further the "magic window" is from the center. Without the kludge number, the drift away from center with the magic window is very pronounced.
The kludge numbers 1.02f for zoom and 1.07 for pan reduce the inaccuracies and drift to a fraction of a pixel, but those numbers must be a ratio from somewhere, right? They work at ANY RESOLUTION, though -- so I have a 1280x800 screen and a 256x256 magic window texture... if I change the screen to 1024x768, it all still works.
Where the heck are these numbers coming from?
If you don't care about sub-optimal performance (i.e., drawing the whole scene twice) and if you don't need the smaller scene in a texture, an easy way to obtain the overlay with pixel perfect precision is:
Set up main scene (model/view/projection matrices, etc.) and draw it as you are now.
Use glScissor to set the rectangle for the overlay. glScissor takes the screen-space x, y, width, and height and discards anything outside that rectangle. It looks like you have those four data items already, so you should be good to go.
Call glEnable(GL_SCISSOR_TEST) to actually turn on the test.
Set the shader variables (if you're using shaders) for drawing the greyscale scene/hidden objects/etc. You still use the same view and projection matrices that you used for the main scene.
Draw the greyscale scene/hidden objects/etc.
Call glDisable(GL_SCISSOR_TEST) so you won't be scissoring at the start of the next frame.
Draw the red overlay border, if desired.
Now, if you actually need the overlay in its own texture for some reason, this probably won't be adequate...it could be made to work either with framebuffer objects and/or pixel readback, but this would be less efficient.
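As a minimal sketch of that sequence (plain OpenGL calls; drawMainScene, drawHiddenScene and drawOverlayBorder are stand-ins for your own render functions):
#include <GL/gl.h>   // or your loader's header (glad, GLEW, ...)

void drawMainScene();
void drawHiddenScene();
void drawOverlayBorder(int x, int y, int w, int h);

// x, y: bottom-left corner of the overlay in window pixels; w, h: its size.
// Both scene passes use the SAME view/projection matrices.
void drawFrameWithOverlay(int x, int y, int w, int h)
{
    drawMainScene();                 // normal full-screen pass

    glScissor(x, y, w, h);           // discard anything outside the overlay rect
    glEnable(GL_SCISSOR_TEST);
    drawHiddenScene();               // greyscale / secret-object pass
    glDisable(GL_SCISSOR_TEST);      // don't scissor the next frame

    drawOverlayBorder(x, y, w, h);   // optional red frame on top
}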
Most people completely overcomplicate such issues. There is absolutely no magic to applying transformations after applying the projection matrix.
If you have a projection matrix P (and I'm assuming default OpenGL conventions here where P is constructed in a way that the vector is post-multiplied to the matrix, so for an eye space vector v_eye, we get v_clip = P * v_eye), you can simply pre-multiply some other translate and scale transforms to cut out any region of interest.
Assume you have a viewport of size w_view * h_view pixels, and you want to find a projection matrix which renders only a tile of w_tile * h_tile pixels, beginning at pixel location (x_tile, y_tile) (again, assuming default GL conventions here: window space origin is bottom left, so y_tile is measured from the bottom). Also note that the _tile coordinates are to be interpreted relative to the viewport; in the typical case, that would start at (0,0) and have the size of your full framebuffer, but this is by no means required nor assumed here.
Since after applying the projection matrix we are in clip space, we need to transform our coordinates from window space pixels to clip space. Note that clip space is a 4D homogeneous space, but we can use any w value we like (except 0) to represent any point (as a point in the 3D space we care about forms a line in the 4D space we work in), so let's just use w=1 for simplicity's sake.
The view volume in clip space is denoted by the [-w,w] range, so in the w=1 hyperplane, it is [-1,1]. Converting our tile into this space yields:
x_clip = 2 * (x_tile / w_view) - 1
y_clip = 2 * (y_tile / h_view) - 1
w_clip = 2 * (w_tile / w_view)
h_clip = 2 * (h_tile / h_view)
(x_clip and y_clip are positions, so they get the -1 offset; w_clip and h_clip are sizes, i.e. coordinate differences, so the offset cancels and they do not.)
We now just need to translate the objects such that the center of the tile is moved to the center of the view volume, which by definition is the origin, and scale the w_clip * h_clip sized region to the full [-1,1] extent in each dimension.
That means:
T = translate(-(x_clip + 0.5*w_clip), -(y_clip + 0.5 *h_clip), 0)
S = scale(2.0/w_clip, 2.0/h_clip, 1.0)
We can now create the modified projection matrix P' as P' = S * T * P, and that's all there is. Rendering with P' instead of P will render exactly the region of your tile to whatever viewport you are using, so for it to be pixel-exact with respect to your original viewport, you must now render with a viewport which is also w_tile * h_tile pixels big.
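As a minimal sketch of that construction (using GLM here; any matrix library with the same post-multiply convention would do, and the function name is just for illustration):
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Builds P' = S * T * P for a tile of wTile x hTile pixels starting at
// (xTile, yTile), relative to a wView x hView viewport.
glm::mat4 tileProjection(const glm::mat4& P,
                         float xTile, float yTile, float wTile, float hTile,
                         float wView, float hView)
{
    // Tile rectangle converted to clip space (on the w = 1 hyperplane).
    const float xClip = 2.0f * (xTile / wView) - 1.0f;
    const float yClip = 2.0f * (yTile / hView) - 1.0f;
    const float wClip = 2.0f * (wTile / wView);
    const float hClip = 2.0f * (hTile / hView);

    const glm::mat4 T = glm::translate(glm::mat4(1.0f),
                                       glm::vec3(-(xClip + 0.5f * wClip),
                                                 -(yClip + 0.5f * hClip), 0.0f));
    const glm::mat4 S = glm::scale(glm::mat4(1.0f),
                                   glm::vec3(2.0f / wClip, 2.0f / hClip, 1.0f));
    return S * T * P;
}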
Note that there is also another approach: the viewport is not clamped against the framebuffer you're rendering to. It is actually valid to provide negative values for x and y. So if your framebuffer for rendering the tile into is exactly w_tile * h_tile pixels, you could simply set glViewport(-x_tile, -y_tile, w_view, h_view) (the same viewport size as the original, just shifted) and render with the unmodified projection matrix P instead.
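A rough sketch of that viewport variant (drawScene stands in for your normal draw path, which keeps using the unmodified projection P):
#include <GL/gl.h>

void drawScene();   // stand-in for your normal draw path with the unmodified P

// Assumes a framebuffer of exactly wTile x hTile pixels is currently bound.
// The viewport keeps the original size but is shifted so the tile region of
// the full view lands at the framebuffer origin.
void renderTileViaViewport(int xTile, int yTile, int wView, int hView)
{
    glViewport(-xTile, -yTile, wView, hView);
    drawScene();
}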

Calculating Normal of Bbox/Cube

I am working on ray tracing and decided to use bounding boxes (axis-aligned bboxes) as objects (cubes) and shade them. I am able to find the correct t value and intersection point; however, I cannot find a way to calculate the surface normal, since I only have the ray direction, ray origin, intersection point, t value, and the min/max values of the bbox.
Is there a way to calculate the normal at the intersection point (or to decide which face of the cube the ray intersected) with the information I have?
I am using "An Efficient and Robust Ray–Box Intersection Algorithm" by Williams et al.
If you have the intersection point and the AABB (bounding box) center, you can do a quick calculation to obtain an index corresponding to the face you hit.
Then, with an array that stores the normals, you can look up your data.
Vector3 ComputeNormal(Vector3 inter, Vector3 aabbCenter)
{
static const Vector3 normals[] = { // A cube has 3 possible orientations
Vector3(1,0,0),
Vector3(0,1,0),
Vector3(0,0,1)
};
const Vector3 interRelative = inter - aabbCenter;
const float xyCoef = interRelative.y / interRelative.x;
const float zyCoef = interRelative.y / interRelative.z;
const int coef = (isBetweenInclusive<1,-1>(xyCoef) ? 1 :
(isBetweenExclusive<1,-1>(zyCoef) ? 2 : 0));
// Here it's exclusive to avoid coef being 3
return normals[coef] * SIGN(interRelative); // the sign gives the direction of the normal
}
I have not tested it so don't be surprised if it does not work directly ;) but it should do the trick.
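An alternative that also handles non-cubic boxes: measure the hit point from the box centre, normalise each component by the box half-extent, and the axis with the largest magnitude is the face you hit. A minimal, equally untested sketch (it uses its own small Vec3 rather than the Vector3 class above):
#include <cmath>

// A sketch, assuming a simple vector type with public x/y/z members.
struct Vec3 { float x, y, z; };

// Returns the outward unit normal of the AABB face containing the hit point.
Vec3 ComputeAabbNormal(const Vec3& hit, const Vec3& boxMin, const Vec3& boxMax)
{
    // Hit point relative to the box centre, scaled so every face lies at +/-1.
    const float px = (hit.x - 0.5f * (boxMin.x + boxMax.x)) / (0.5f * (boxMax.x - boxMin.x));
    const float py = (hit.y - 0.5f * (boxMin.y + boxMax.y)) / (0.5f * (boxMax.y - boxMin.y));
    const float pz = (hit.z - 0.5f * (boxMin.z + boxMax.z)) / (0.5f * (boxMax.z - boxMin.z));

    // The face that was hit is the axis whose scaled component is largest.
    if (std::fabs(px) >= std::fabs(py) && std::fabs(px) >= std::fabs(pz))
        return Vec3{ px > 0 ? 1.0f : -1.0f, 0.0f, 0.0f };
    if (std::fabs(py) >= std::fabs(pz))
        return Vec3{ 0.0f, py > 0 ? 1.0f : -1.0f, 0.0f };
    return Vec3{ 0.0f, 0.0f, pz > 0 ? 1.0f : -1.0f };
}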

How to obtain the bounding ellipse of a target connected component

Suppose we have a connected component in the image, as the following image illustrates: http://dl.dropbox.com/u/92688392/ellipse.jpg
My question is how I can calculate the bounding ellipse of the connected component (the red ellipse in the image). I have checked the MATLAB function regionprops and understand how MATLAB does it. I also noticed that OpenCV has a similar function for this, CBlob::GetEllipse(). However, although I understand how they obtain the result by reading the code, the fundamental theory behind it is still unclear to me. I am therefore wondering whether there are some standard algorithms to do the job. Thanks!
EDIT:
Based on the comments, I reorganized my question: in the Wikipedia article on image moments, the formula for the angle of the longest axis is Θ = 1/2 * arctan(2*μ'11 / (μ'20 − μ'02)).
However, in the MATLAB function regionprops, the codes are as follows:
% Calculate orientation.
if (uyy > uxx)
num = uyy - uxx + sqrt((uyy - uxx)^2 + 4*uxy^2);
den = 2*uxy;
else
num = 2*uxy;
den = uxx - uyy + sqrt((uxx - uyy)^2 + 4*uxy^2);
end
This implementation is inconsistent with the formula in Wikipedia. I was wondering which one is correct.
If you're looking for an OpenCV implementation, I can give you one. The algorithm is the following:
Convert the image to 1-bit (black and white)
Find all contours
Create one contour that contains all points from the found contours
Calculate the convex hull of this contour
Find the rotated ellipse (rectangle) of minimal area that contains the hull calculated in the previous step
Here's code:
Mat src = imread("ellipse.jpg"), tmp;
vector<Vec4i> hierarchy;
vector<vector<Point> > contours;
vector<Point> bigContour, hull;
RotatedRect ell;
//step 1
cvtColor(src, tmp, CV_BGR2GRAY);
threshold(tmp, tmp, 100, 255, THRESH_BINARY);
//step 2
findContours(tmp, contours, hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
//step 3
for (size_t i=0; i<contours.size(); i++)
{
for (size_t j=0; j<contours[i].size(); j++)
{
bigContour.push_back(contours[i][j]);
}
}
//step 4
convexHull(bigContour, hull);
//step 5
ell = fitEllipse(hull);
//drawing result
ellipse(src, ell, Scalar(0,0,255), 2);
imshow("result", src);
waitKey();
This is the input:
And here's a result:
I was trying to find out what the algorithm behind it is as well, so I could write my own implementation of it. I found it in a blog post on MathWorks. In one of the comments, the author says:
regionprops calculates the 2nd-order moments of the object in question and then returns measurements of the ellipse with the same 2nd-order moments.
and later:
The equations used are from Haralick and Shapiro, Computer and Robot Vision vol. 1, Appendix A, Addison-Wesley 1992. I did a sanity check by constructing an image containing an ellipse with major axis length = 100 and minor axis length = 50, and regionprops returned the correct measurements.
I don't have that book, but it seems I'll need to get a copy of it.
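For reference, here is a small OpenCV sketch that computes the same orientation directly from the second-order central moments; the atan2 half-angle form is equivalent to the Wikipedia arctan expression but avoids the two-case branching (sign conventions may differ from regionprops because the image y axis points down):
#include <opencv2/imgproc.hpp>
#include <cmath>

// Orientation (radians, measured from the x axis) of the ellipse with the
// same second-order moments as the component in a binary mask.
double componentOrientation(const cv::Mat& binaryMask)
{
    const cv::Moments m = cv::moments(binaryMask, /*binaryImage=*/true);

    // Normalised central moments, i.e. the mu'20 / mu'02 / mu'11 of the formula.
    const double uxx = m.mu20 / m.m00;
    const double uyy = m.mu02 / m.m00;
    const double uxy = m.mu11 / m.m00;

    // Half-angle atan2 form: covers the uxx == uyy case without branching.
    return 0.5 * std::atan2(2.0 * uxy, uxx - uyy);
}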
I'm not sure how MATLAB or OpenCV calculates the ellipse. But if you are interested in the math behind it, there is a very nice optimization approach called the Löwner-John ellipsoid. You can find more information about this method in the Stanford Convex Optimization course. I hope it helps...

Detection of coins (and fit ellipses) on an image

I am currently working on a project where I am trying to detect a few coins lying on a flat surface (i.e. a desk). The coins do not overlap and are not hidden by other objects. But there might be other objects visible and the lighting conditions may not be perfect... Basically, consider yourself filming your desk which has some coins on it.
So each coin should be visible as an ellipse. Since I don't know the position of the camera, the shape of the ellipses may vary from a circle (viewed from the top) to flatter ellipses, depending on the angle the coins are filmed from.
My problem is that I am not sure how to extract the coins and finally fit ellipses to them (which I need for further calculations).
For now, I have just made the first attempt by setting a threshold value in OpenCV, using findContours() to get the contour lines and fitting an ellipse. Unfortunately, the contour lines only rarely give me the shape of the coins (reflections, bad lighting, ...) and this way is also not preferred since I don't want the user to set any threshold.
Another idea was to use a template matching method of an ellipse on that image, but since I don't know the angle of the camera nor the size of the ellipses I don't think this would work well...
So I wanted to ask if anybody could tell me a method that would work in my case.
Is there a fast way to extract the three coins from the image? The calculations should be made in real time on mobile devices, and the method should not be too sensitive to different or changing lighting or to the color of the background.
Would be great if anybody could give me any tips on which method could work for me.
Here's some C99 source implementing the traditional approach (based on OpenCV doco):
#include "cv.h"
#include "highgui.h"
#include <stdio.h>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
//
// We need this to be high enough to get rid of things that are too small to
// have a definite shape. Otherwise, they will end up as ellipse false positives.
//
#define MIN_AREA 100.00
//
// One way to tell if an object is an ellipse is to look at the relationship
// of its area to its dimensions. If its actual occupied area can be estimated
// using the well-known area formula Area = PI*A*B, then it has a good chance of
// being an ellipse.
//
// This value is the maximum permissible error between actual and estimated area.
//
#define MAX_TOL 100.00
int main( int argc, char** argv )
{
IplImage* src;
// the first command line parameter must be file name of binary (black-n-white) image
if( argc == 2 && (src=cvLoadImage(argv[1], 0))!= 0)
{
IplImage* dst = cvCreateImage( cvGetSize(src), 8, 3 );
CvMemStorage* storage = cvCreateMemStorage(0);
CvSeq* contour = 0;
cvThreshold( src, src, 1, 255, CV_THRESH_BINARY );
//
// Invert the image such that white is foreground, black is background.
// Dilate to get rid of noise.
//
cvXorS(src, cvScalar(255, 0, 0, 0), src, NULL);
cvDilate(src, src, NULL, 2);
cvFindContours( src, storage, &contour, sizeof(CvContour), CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, cvPoint(0,0));
cvZero( dst );
for( ; contour != 0; contour = contour->h_next )
{
double actual_area = fabs(cvContourArea(contour, CV_WHOLE_SEQ, 0));
if (actual_area < MIN_AREA)
continue;
//
// FIXME:
// Assuming the axes of the ellipse are horizontal/vertical (axis-aligned).
//
CvRect rect = ((CvContour *)contour)->rect;
int A = rect.width / 2;
int B = rect.height / 2;
double estimated_area = M_PI * A * B;
double error = fabs(actual_area - estimated_area);
if (error > MAX_TOL)
continue;
printf
(
"center x: %d y: %d A: %d B: %d\n",
rect.x + A,
rect.y + B,
A,
B
);
CvScalar color = CV_RGB( rand() % 255, rand() % 255, rand() % 255 );
cvDrawContours( dst, contour, color, color, -1, CV_FILLED, 8, cvPoint(0,0));
}
cvSaveImage("coins.png", dst, 0);
}
}
Given the binary image that Carnieri provided, this is the output:
./opencv-contour.out coin-ohtsu.pbm
center x: 291 y: 328 A: 54 B: 42
center x: 286 y: 225 A: 46 B: 32
center x: 471 y: 221 A: 48 B: 33
center x: 140 y: 210 A: 42 B: 28
center x: 419 y: 116 A: 32 B: 19
And this is the output image:
What you could improve on:
Handle different ellipse orientations (currently, I assume the axes are horizontal/vertical). This would not be hard to do using image moments.
Check for object convexity (have a look at cvConvexityDefects)
Your best way of distinguishing coins from other objects is probably going to be by shape. I can't think of any other low-level image features (color is obviously out). So, I can think of two approaches:
Traditional object detection
Your first task is to separate the objects (coins and non-coins) from the background. Otsu's method, as suggested by Carnieri, will work well here. You seem to be worried about whether the histogram will be bimodal, but I don't think this will be a problem. As long as there is a significant amount of desk visible, you're guaranteed to have one peak in your histogram. And as long as there are a couple of visually distinguishable objects on the desk, you are guaranteed your second peak.
Dilate your binary image a couple of times to get rid of noise left by thresholding. The coins are relatively big so they should survive this morphological operation.
Group the white pixels into objects using region growing -- just iteratively connect adjacent foreground pixels. At the end of this operation you will have a list of disjoint objects, and you will know which pixels each object occupies.
From the previous step, you know the width and the height of each object. So now you can estimate the size of the ellipse that would surround the object, and then see how well this particular object matches that ellipse. It may be easier to just use the width-to-height ratio.
Alternatively, you can then use moments to determine the shape of the object in a more precise way.
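A sketch of these steps with the newer C++ API follows; cv::connectedComponentsWithStats does the region growing, and the area check mirrors the MAX_TOL idea from the C99 code above (the area threshold and the 20% fit tolerance are arbitrary placeholders):
#include <opencv2/imgproc.hpp>
#include <cmath>
#include <vector>

// 'binary' is white objects on a black background (e.g. an Otsu output).
std::vector<cv::RotatedRect> coinCandidates(const cv::Mat& binary)
{
    cv::Mat clean, labels, stats, centroids;
    cv::dilate(binary, clean, cv::Mat(), cv::Point(-1, -1), 2);   // clean up noise

    // Region growing: every 8-connected blob gets a label and a bounding box.
    const int n = cv::connectedComponentsWithStats(clean, labels, stats, centroids);

    std::vector<cv::RotatedRect> out;
    for (int i = 1; i < n; ++i)   // label 0 is the background
    {
        const int w    = stats.at<int>(i, cv::CC_STAT_WIDTH);
        const int h    = stats.at<int>(i, cv::CC_STAT_HEIGHT);
        const int area = stats.at<int>(i, cv::CC_STAT_AREA);
        if (area < 100) continue;                                 // too small for a coin

        // Ellipse with the same bounding box: compare its area with the blob's.
        const double estimated = CV_PI * (w / 2.0) * (h / 2.0);
        if (std::abs(estimated - area) > 0.2 * estimated) continue;

        const cv::Point2f c((float)centroids.at<double>(i, 0),
                            (float)centroids.at<double>(i, 1));
        out.push_back(cv::RotatedRect(c, cv::Size2f((float)w, (float)h), 0.0f));
    }
    return out;
}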
I don't know what the best method for your problem is. About thresholding specifically, however, you can use Otsu's method, which automatically finds the optimal threshold value based on an analysis of the image histogram. Use OpenCV's threshold method with the parameter ThresholdType equal to THRESH_OTSU.
Be aware, though, that Otsu's method works well only on images with bimodal histograms (for instance, images with bright objects on a dark background).
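As a minimal sketch of that call (C++ API; the wrapper function name is just for illustration):
#include <opencv2/imgproc.hpp>

// 'gray' must be a single-channel 8-bit image. The threshold value passed in
// (0 here) is ignored: THRESH_OTSU derives the value from the histogram.
cv::Mat otsuBinarize(const cv::Mat& gray)
{
    cv::Mat binary;
    cv::threshold(gray, binary, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
    return binary;
}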
You've probably seen this, but there is also a method for fitting an ellipse around a set of 2D points (for instance, a connected component).
EDIT: Otsu's method applied to a sample image:
Grayscale image:
Result of applying Otsu's method:
If anyone else comes along with this problem in the future as I did, but using C++:
Once you have used findContours to find the contours (as in Misha's answer above), you can easily fit ellipses using fitEllipse, eg
vector<vector<Point> > contours;
findContours(img, contours, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0,0));
vector<RotatedRect> rotRecs(contours.size());
for (size_t i = 0; i < contours.size(); i++) {
    rotRecs[i] = fitEllipse(contours[i]);
}

About: Extracting Region From Bitmap

I am trying to extract an outline path from a given bitmap. I created a (for me) fast algorithm in AS3, and here it is:
//#v source bitmap's vector data
//#x x at which to start extraction
//#y y at which to start extraction
//#w extraction width
//#h extraction height
//#mw source bitmap width
//#mh source bitmap height
private function extractRects(v:Vector.<uint>, x:int, y:int,
w:int, h:int, mw:int, mh:int):Array
{
var ary:Array = [], yy:int=y, vStart:int, _xx:int, xx:int;
var lcold:int = 0, lc:int;
//first line to last line
while(yy < h)
{
//first index of current vector
vStart = yy * mw + x;
xx = x;
lc = 0;
//first vector to last on current scan
while(xx < w)
{
/*
if current vector value (color) is
empty (transparent) then
check next
*/
while(xx < w && !v[vStart])
{
vStart++;
xx++;
}
//it is not empty so copy first index
_xx = xx;
//check till value is not empty
while(xx < w && v[vStart])
{
xx++;
vStart++;
}
//create rectangle
var rr:Rectangle = new Rectangle(_xx, yy, (xx-_xx), 1);
//if previous line has the same rectangle index
if(lc < lcold)
{
var oldrr:Rectangle = ary[ary.length - lcold];
//if previous neighbour rect on top
//has same horizontal position then
//resize its height
if(oldrr.left == rr.left && oldrr.width == rr.width)
oldrr.height++;
else
ary.push(rr);
}
else
ary.push(rr);
lc++;
xx++;
}
lcold = lc;
yy++;
}
return ary;
}
With the above method, I extract the region and create the shape by drawing rectangles.
Drawing rectangles does not seem to be a good solution because of the non-smooth result.
In order to have a smoother result, I must use lines or curves, but the point-neighbourhood technique is a really big headache for me right now.
Could anyone please recommend a better solution?
AS3, C++, C#, VB.NET, VB6, Delphi, Java or similar languages are all fine for answers.
EDIT FOR CLARIFICATION
I am trying to extract the x, y coordinates of the non-transparent pixels from a bitmap (32-bit ARGB) in order to turn them into separate path data (creating a shape).
For drawing, I could use lineTo, curveTo, moveTo operations.
moveTo(x, y)
lineTo(x, y)
curveTo(cx, cy, ax, ay)
In my code, the idea was to extract rectangles covering the non-transparent blocks and then reuse those same rectangles with moveTo and lineTo operations in further graphics methods.
The problem is that this method gives a non-smooth look on edges that are neither horizontal nor vertical.
So, the solution would be to create a point map of the edges, detect the point neighbourhoods, and either use the lineTo operation (because it generates antialiased lines) between neighbouring points on rows, or calculate the points' placement on the nearest circular arc and use the curveTo method.
Question: could anyone recommend some algorithms or methods for this extraction job?
Thanks in advance
What you're looking for is bitmap/raster-image-to-vector software. To get a good quality result, there are many non-trivial steps that must be performed. There is an open-source project called Potrace which does this (see its technical description of how it works). If you'd like to try its algorithm in a GUI program, you can use Inkscape with its Trace Bitmap function.
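If you do want to roll your own tracing before handing points to lineTo/curveTo, the first step is simply collecting the edge pixels. A minimal sketch (C++ here, but it maps one-to-one onto a loop over an AS3 Vector.<uint>; the Pt struct and the alpha test are assumptions about your pixel layout):
#include <cstdint>
#include <vector>

struct Pt { int x, y; };

// pixels: w*h 32-bit ARGB values, row-major; a pixel is "solid" when its
// alpha byte is non-zero. Returns every solid pixel that touches a
// transparent 4-neighbour, i.e. the raw edge points. They still have to be
// chained into an ordered path before feeding lineTo/curveTo.
std::vector<Pt> edgePoints(const std::vector<uint32_t>& pixels, int w, int h)
{
    auto solid = [&](int x, int y) {
        return x >= 0 && x < w && y >= 0 && y < h &&
               (pixels[y * w + x] >> 24) != 0;   // alpha channel of ARGB
    };

    std::vector<Pt> out;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (solid(x, y) &&
                (!solid(x - 1, y) || !solid(x + 1, y) ||
                 !solid(x, y - 1) || !solid(x, y + 1)))
                out.push_back({x, y});
    return out;
}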
