I am thinking a way to transform a distorted rectangle back to normal but I have no idea how. I cannot find any information about it as well. Could anyone provide some information about how to do the trick only knowing the x, y position of the 4 junctions? Known the rectangle was 640 x 480 before distorted.
I think the problem can be solved by finding the transformation matrix in linear algebra. Since there is information about the original and the final matrix.
You have coordinates of 4 corners of distorted rectangle and want to transform it to some rectangle. In this case you need perspective transformation. Just define points for resulting rectangle as you want (for example, unit square in coordinate origin).
Now you have 8 pairs of corresponding parameters (x and y for every point), and need to calculate 8 parameters of matrix using 8 equations
//four pairs of such equations:
x' = (A * x + B * y + C) / (G * x + H * y + 1.0)
y' = (D * x + E * y + F) / (G * x + H * y + 1.0)
Theory of finding perspective transformation matrix is described in Paul Heckbert article.
C++ implementation could be found in antigrain library (file agg_trans_perspective.h)
Use OpenCV.
sample tip
Point lt = Point(200, 40) ;
Point rt = Point(500, 44) ;
Point rb = Point(740, 355) ;
Point lb = Point(30, 200) ;
vector<Point> rect ;
rect.push_back(lt) ;
rect.push_back(rt) ;
rect.push_back(rb) ;
rect.push_back(lb) ;
double w1 = sqrt(pow(rb.x-lb.x, 2)+pow(rb.y-lb.y, 2)) ;
double w2 = sqrt(pow(rt.x-lt.x, 2)+pow(rt.y-lt.y, 2)) ;
double h1 = sqrt(pow(rb.x-rt.x, 2)+pow(rb.y-rt.y, 2)) ;
double h2 = sqrt(pow(lb.x-lt.x, 2)+pow(lb.y-lt.y, 2)) ;
double maxW = (w1<w2)? w2 : w1 ;
double maxH = (h1<h2)? h2 : h1 ;
Point2f src[4], dst[4] ;
src[0]=Point2f(lt.x, lt.y) ;
src[1]=Point2f(rt.x, rt.y) ;
src[2]=Point2f(rb.x, rb.y) ;
src[4]=Point2f(lb.x, lb.y) ;
dst[0]=Point2f(0,0) ;
dst[1]=Point2f(maxW, 0) ;
dst[2]=Point2f(maxW, maxH) ;
dst[3]=Point2f(0, maxH) ;
tfMat = getPerspectiveTransform(src, dst) ;
wrapPerspective(srcImage, dstImage, trMat, Size(maxW, maxH)) ;
Related
I am working on a Spherical Map Script that generates a linear gradient along the y-axis and an angular gradient around the y-axis in the green and red channels, respectively. I have been unable to find any documentation suited to this online, but have experimented using examples in Javascript and C#. The linear gradient has worked out fine, but the angular gradient (one describing a 360 degree arc around the y-axis) continues to elude me.
My working script is as follows.
-- 3d sphere
-- produces rgb gradudient mapped to a 3d sphere, but not correctly.
-- this is basically missing an angle gradient around the y-axis...
function prepare()
-- tilt & rotation precalc
toRad = 180/math.pi
-- toCir = 360/math.p -- may or may not work for circumference...
radius = get_slider_input(RADIUS)
angle = get_angle_input(ROTATION)/toRad
cosa = math.cos(angle)
sina = math.sin(angle)
tilt = get_angle_input(TILT)/toRad
cosa2 = math.cos(tilt)
sina2 = math.sin(tilt)
end;
function get_sample(x, y)
local r, g, b, a = get_sample_map(x, y, SOURCE)
-- color gradient example
-- local r = x
-- local g = y
-- local b = (x + y) / 2
-- local a = 1
-- spherical mapping formulae (polar to cartesian, apparently)
-- local x = x * math.pi -- * aspect
-- local y = y * math.pi
-- local nx = math.cos(x) * math.sin(y)
-- local ny = math.sin(x) * math.sin(y)
-- local nz = math.cos(y)
-- cartesian to polar (reference)
-- example 1
-- r = math.sqrt(((x * x) + (y * y) + (z * z)))
-- long = math.acos(x / math.sqrt((x * x) + (y * y))) * (y < 0 ? -1 : 1)
-- lat = math.acos(z / radius) * (z < 0 ? -1 : 1)
-- example 2
-- r = math.sqrt((x * x) + (y * y) + (z * z))
-- long = math.acos(x / math.sqrt((x * x) + (y * y))) * (y < 0 ? -1 : 1)
-- lat = math.acos(z / r)
-- equations cannot accept boolean comparison
-- boolean syntax may not be valid in lua
-- image generation
-- shift origin to center and set radius limits
local px = (x*2.0) - 1.0
local py = (y*2.0) - 1.0
px = px/radius
py = py/radius
local len = math.sqrt((px*px)+(py*py))
if len > 1.0 then return 0,0,0,0 end
local z = -math.sqrt(1.0 - ((px*px)+(py*py)))
-- cartesian to polar
-- r = math.sqrt((x * x) + (y * y) + (z * z))
-- lat = math.acos(z / r)
-- long = math.acos(x / math.sqrt((x * x) + (y * y))) * (y < 0) ? -1 : 1)
-- apply rotaqtion and tilt (order is important)
local tz = (cosa2 * z) - (sina2 * py)
local ty = (sina2 * z) + (cosa2 * py) -- gradient along y-axis is correct
z = tz
py = ty
local tx = (cosa * px) - (sina * z) -- gradient needs to go around y-axis
local tz = (sina * px) + (cosa * z)
px = tx
z = tz
-- r = math.sqrt((x * x) + (y * y) + (z * z))
-- lat = math.acos(z / r) -- invalid z for this; what is correct source?
-- long = math.acos(x / math.sqrt((x * x) + (y * y))) -- map -1 : 1 needed
-- long = (sina * px) + (cosa * z) -- ok; 2 full rotations
-- return r, g, b, a
-- return px,py,z,a
return px/2+.5,py/2+.5,z/2+.5,a
-- return px/2+.5,px/2+.5,px/2+.5,a
-- return long,long,long,a
-- return px/2+.5,py/2+.5,long,a
end;
I found a way to construct a spherical angular gradient from the hemispherical axis gradients I started with, by using three as the RBG channels in an RBG to HSL conversion. I've used this to map textures to spheres in Filter Forge in map scripts like this.
I thought I'd share the solution for anyone who's followed my question here.
In this article I've found some examples of using MutableImage with readPixel and writePixel functions, but I think it's too complicated, I mean, can I do that without ST Monad?
Let's say I have this
eDynamicImage <- readImage
eImage <- convertRGBA8 <$> eDynamicImage
so I can apply pixelMap based filters to that image like this negative <$> eImage, but how do I work with pixels and it's coordinates? Can someone explain me what is Mutable Image and how can I get this from DynamicImage or Image?
Is there any clean code that can rotate image using this function?
UPD 2.0: I've made totally working rotation function using built-in JuicyPixels generateImage function:
rotate :: Double -> Image PixelRGBA8 -> Image PixelRGBA8
rotate n img#Image {..} = generateImage rotater newW newH
where rotater x y = if srcX x y < imageWidth && srcX x y >= 0 && srcY x y < imageHeight && srcY x y >= 0
then pixelAt img (srcX x y) (srcY x y)
else PixelRGBA8 255 255 255 255
srcX x y = getX center + rounding (fromIntegral (x - getX newCenter) * cos' + fromIntegral (y - getY newCenter) * sin')
srcY x y = getY center + rounding (fromIntegral (y - getY newCenter) * cos' - fromIntegral (x - getX newCenter) * sin')
center = (imageWidth `div` 2, imageHeight `div` 2)
newCenter = (newW `div` 2, newH `div` 2)
newW = rounding $ abs (fromIntegral imageHeight * sin') + abs (fromIntegral imageWidth * cos')
newH = rounding $ abs (fromIntegral imageHeight * cos') + abs (fromIntegral imageWidth * sin')
sin' = sin $ toRad n
cos' = cos $ toRad n
where
rounding a = floor (a + 0.5)
getX = fst
getY = snd
toRad deg = deg * (pi/180)
That's it if someone need it. Thanks #Carsten for advice!
An MutableImage is one you can mutate (change in place) - Images are immutable by default. You'll need some kind of monad that allows that though (see the documentation - there are a few including ST and IO).
To get an MutableImage you can use thawImage - then you can work (get/set) pixels with readPixel and writePixel - after you can freezeImage again to get back an immutable Image
If you want to know how you can rotate images you can check the source code of rotateLeft :
rotateLeft90 :: Pixel a => Image a -> Image a
rotateLeft90 img#Image {..} =
generateImage gen imageHeight imageWidth
where
gen x y = pixelAt img (imageWidth - 1 - y) x
as you can see it does not use a mutable image but generates a new one from the olds pixels (using pixelAt instead of readPixel)
just in case - this is using the record-wildcards extension to extract all the fields from the Image a - parameter (that is what the Image {..} does - imageHeight,imageWidth is pulled from there)
Bresenham's line drawing algorithm is well known and quite simple to implement.
While there are more advanced ways to draw anti-ailesed lines, Im interested in writing a function which draws a single pixel width non anti-aliased line, based on floating point coordinates.
This means while the first and last pixels will remain the same, the pixels drawn between them will have a bias based on the sub-pixel position of both end-points.
In principle this shouldn't be all that complicated, since I assume its possible to use the sub-pixel offsets to calculate an initial error value to use when plotting the line, and all other parts of the algorithm remain the same.
No sub pixel offset:
X###
###X
Assuming the right hand point has a sub-pixel position close to the top, the line could look like this:
With sub pixel offset for example:
X######
X
Is there a tried & true method of drawing a line that takes sub-pixel coordinates into account?
Note:
This seems like a common operation, I've seen OpenGL drivers take this into account for example - using GL_LINE, though from a quick search I didn't find any answers online - maybe used wrong search terms?
At a glance this question looks like it might be a duplicate of: Precise subpixel line drawing algorithm (rasterization algorithm)However that is asking about drawing a wide line, this is asking about offsetting a single pixel line.
If there isn't some standard method, I'll try write this up to post as an answer.
Having just encountered the same challenge, I can confirm that this is possible as you expected.
First, return to the simplest form of the algorithm: (ignore the fractions; they'll disappear later)
x = x0
y = y0
dx = x1 - x0
dy = y1 - y0
error = -0.5
while x < x1:
if error > 0:
y += 1
error -= 1
paint(x, y)
x += 1
error += dy/dx
This means that for integer coordinates, we start half a pixel above the pixel boundary (error = -0.5), and for each pixel we advance in x, we increase the ideal y coordinate (and therefore the current error) by dy/dx.
First let's see what happens if we stop forcing x0, y0, x1 and y1 to be integers: (this will also assume that instead of using pixel centres, the coordinates are relative to the bottom-left of each pixel1, since once you support sub-pixel positions you can simply add half the pixel width to the x and y to return to pixel-centred logic)
x = x0
y = y0
dx = x1 - x0
dy = y1 - y0
error = (0.5 - (x0 % 1)) * dy/dx + (y0 % 1) - 1
while x < x1:
if error > 0:
y += 1
error -= 1
paint(x, y)
x += 1
error += dy/dx
The only change was the initial error calculation. The new value comes from simple trig to calculate the y coordinate when x is at the pixel centre. It's worth noting that you can use the same idea to clip the line's start position to be within some bound, which is another challenge you'll likely face when you want to start optimising things.
Now we just need to convert this into integer-only arithmetic. We'll need some fixed multiplier for the fractional inputs (scale), and the divisions can be handled by multiplying them out, just as the standard algorithm does.
# assumes x0, y0, x1 and y1 are pre-multiplied by scale
x = x0
y = y0
dx = x1 - x0
dy = y1 - y0
error = (scale - 2 * (x0 % scale)) * dy + 2 * (y0 % scale) * dx - 2 * dx * scale
while x < x1:
if error > 0:
y += scale
error -= 2 * dx * scale
paint(x / scale, y / scale)
x += scale
error += 2 * dy * scale
Note that x, y, dx and dy keep the same scaling factor as the input variables (scale), whereas error has a more complex scaling factor: 2 * dx * scale. This allows it to absorb the division and fraction in its original formulation, but means we need to apply the same scale everywhere we use it.
Obviously there's a lot of room to optimise here, but that's the basic algorithm. If we assume scale is a power-of-two (2^n), we can start to make things a little more efficient:
dx = x1 - x0
dy = y1 - y0
mask = (1 << n) - 1
error = (2 * (y0 & mask) - (2 << n)) * dx - (2 * (x0 & mask) - (1 << n)) * dy
x = x0 >> n
y = y0 >> n
while x < (x1 >> n):
if error > 0:
y += 1
error -= 2 * dx << n
paint(x, y)
x += 1
error += 2 * dy << n
As with the original, this only works in the (x >= y, x > 0, y >= 0) octant. The usual rules apply for extending it to all cases, but note that there are a few extra gotchyas due to the coordinates no-longer being centred in the pixel (i.e. reflections become more complex).
You'll also need to watch out for integer overflows: error has twice the precision of the input variables, and a range of up to twice the length of the line. Plan your inputs, precision, and variable types accordingly!
1: Coordinates are relative to the corner which is closest to 0,0. For an OpenGL-style coordinate system that's the bottom left, but it could be the top-left depending on your particular scenario.
I had a similar problem, with the addition of needing sub-pixel endpoints, I also needed to make sure all pixels which intersect the line are drawn.
I'm not sure that my solution will be helpful to OP, both because its been 4+ years, and because of the sentence "This means while the first and last pixels will remain the same..." For me, that is actually a problem (More on that later). Hopefully this may be helpful to others.
I don't know if this can be considered to be Bresenham's algorithm, but it is awful similar. I'll explain it for the (+,+) quadrant. Lets say you wish to draw a line from point (Px,Py) to (Qx,Qy) over a grid of pixels with width W. Having a grid width W > 1 allows for sub-pixel endpoints.
For a line going in the (+,+) quadrant, the starting point is easy to calculate, just take the floor of (Px,Py). As you will see later, this only works if Qx >= Px & Qy >= Py.
Now you need to find which pixel to go to next. There are 3 possibilities: (x+1,y), (x,y+1), & (x+1,y+1). To make this decision, I use the 2D cross product defined as:
If this value is negative, vector b is right/clockwise of vector a.
If this value is positive, vector b is left/anti-clockwise of vector a.
If this value is zero vector b points in the same direction as vector a.
To make the decision on which pixel is next, compare the cross product between the line P-Q [red in image below] and a line between the point P and the top-right pixel (x+1,y+1) [blue in image below].
The vector between P & the top-right pixel can be calculated as:
So, we will use the value from the 2D cross product:
If this value is negative, the next pixel will be (x,y+1).
If this value is positive, the next pixel will be (x+1,y).
If this value is exactly zero, the next pixel will be (x+1,y+1).
That works fine for the starting pixel, but the rest of the pixels will not have a point that lies inside them. Luckily, after the initial point, you don't need a point to be inside the pixel for the blue vector. You can keep extending it like so:
The blue vector starts at the starting point of the line, and is updated to the (x+1,y+1) for every pixel. The rule for which pixel to take is the same. As you can see, the red vector is right of the blue vector. So, the next pixel will be the one right of the green pixel.
The value for the cross product needs updated for every pixel, depending on which pixel you took.
Add dx if the next pixel was (x+1), add dy if the pixel was (y+1). Add both if the pixel went to (x+1,y+1).
This process is repeated until it reaches the ending pixel, (Qx / W, Qy / W).
All combined this leads to the following code:
int dx = x2 - x2;
int dy = y2 - y1;
int local_x = x1 % width;
int local_y = y1 % width;
int cross_product = dx*(width-local_y) - dy*(width-local_x);
int dx_cross = -dy*width;
int dy_cross = dx*width;
int x = x1 / width;
int y = y1 / width;
int end_x = x2 / width;
int end_y = y2 / width;
while (x != end_x || y != end_y) {
SetPixel(x,y,color);
int old_cross = cross_product;
if (old_cross >= 0) {
x++;
cross_product += dx_cross;
}
if (old_cross <= 0) {
y++;
cross_product += dy_cross;
}
}
Making it work for all quadrants is a matter of reversing the local coordinates and some absolute values. Heres the code which works for all quadrants:
int dx = x2 - x1;
int dy = y2 - y1;
int dx_x = (dx >= 0) ? 1 : -1;
int dy_y = (dy >= 0) ? 1 : -1;
int local_x = x1 % square_width;
int local_y = y1 % square_width;
int x_dist = (dx >= 0) ? (square_width - local_x) : (local_x);
int y_dist = (dy >= 0) ? (square_width - local_y) : (local_y);
int cross_product = abs(dx) * abs(y_dist) - abs(dy) * abs(x_dist);
dx_cross = -abs(dy) * square_width;
dy_cross = abs(dx) * square_width;
int x = x1 / square_width;
int y = y1 / square_width;
int end_x = x2 / square_width;
int end_y = y2 / square_width;
while (x != end_x || y != end_y) {
SetPixel(x,y,color);
int old_cross = cross_product;
if (old_cross >= 0) {
x += dx_x;
cross_product += dx_cross;
}
if (old_cross <= 0) {
y += dy_y;
cross_product += dy_cross;
}
}
However there is a problem! This code will not stop in some cases. To understand why, you need to really look into exactly what conditions count as the intersection between a line and a pixel.
When exactly is a pixel drawn?
I said I need to make that all pixels which intersect a line need to be drawn. But there's some ambiguity in the edge cases.
Here is a list of all possible intersections in which a pixel will be drawn for a line where Qx >= Px & Qy >= Py:
A - If a line intersects the pixel completely, the pixel will be drawn.
B - If a vertical line intersects the pixel completely, the pixel will be drawn.
C - If a horizontal line intersects the pixel completely, the pixel will be drawn.
D - If a vertical line perfectly touches the left of the pixel, the pixel will be drawn.
E - If a horizontal line perfectly touches the bottom of the pixel, the pixel will be drawn.
F - If a line endpoint starts inside of a pixel going (+,+), the pixel will be drawn.
G - If a line endpoint starts exactly on the left side of a pixel going (+,+), the pixel will be drawn.
H - If a line endpoint starts exactly on the bottom side of a pixel going (+,+), the pixel will be drawn.
I - If a line endpoint starts exactly on the bottom left corner of a pixel going (+,+), the pixel will be drawn.
And here are some pixels which do NOT intersect the line:
A' - If a line obviously doesn't intersect a pixel, the pixel will NOT be drawn.
B' - If a vertical line obviously doesn't intersect a pixel, the pixel will NOT be drawn.
C' - If a horizontal line obviously doesn't intersect a pixel, the pixel will NOT be drawn.
D' - If a vertical line exactly touches the right side of a pixel, the pixel will NOT be drawn.
E' - If a horizontal line exactly touches the top side of a pixel, the pixel will NOT be drawn.
F' - If a line endpoint starts exactly on the top right corner of a pixel going in the (+,+) direction, the pixel will NOT be drawn.
G' - If a line endpoint starts exactly on the top side of a pixel going in the (+,+) direction, the pixel will NOT be drawn.
H' - If a line endpoint starts exactly on the right side of a pixel going in the (+,+) direction, the pixel will NOT be drawn.
I' - If a line exactly touches a corner of the pixel, the pixel will NOT be drawn. This applies to all corners.
Those rules apply as you would expect (just flip the image) for the other quadrants. The problem I need to highlight is when an endpoint lies exactly on the edge of a pixel. Take a look at this case:
This is like image G' above, except the y-axis is flipped because the Qy < Py. There are 4x4 red dots because W is 4, making the pixel dimensions 4x4. Each of the 4 dots are the ONLY endpoints a line can touch. The line drawn goes from (1.25, 1.0) to (somewhere).
This shows why it's incorrect (at least how I defined pixel-line intersections) to say the pixel endpoints can be calculated as the floor of the line endpoints. The floored pixel coordinate for that endpoint seems to be (1,1), but it is clear that the line never really intersects that pixel. It just touches it, so I don't want to draw it.
Instead of flooring the line endpoints, you need to floor the minimal endpoints, and ceil the maximal endpoints minus 1 across both x & y dimensions.
So finally here is the complete code which does this flooring/ceiling:
int dx = x2 - x1;
int dy = y2 - y1;
int dx_x = (dx >= 0) ? 1 : -1;
int dy_y = (dy >= 0) ? 1 : -1;
int local_x = x1 % square_width;
int local_y = y1 % square_width;
int x_dist = (dx >= 0) ? (square_width - local_x) : (local_x);
int y_dist = (dy >= 0) ? (square_width - local_y) : (local_y);
int cross_product = abs(dx) * abs(y_dist) - abs(dy) * abs(x_dist);
dx_cross = -abs(dy) * square_width;
dy_cross = abs(dx) * square_width;
int x = x1 / square_width;
int y = y1 / square_width;
int end_x = x2 / square_width;
int end_y = y2 / square_width;
// Perform ceiling/flooring of the pixel endpoints
if (dy < 0)
{
if ((y1 % square_width) == 0)
{
y--;
cross_product += dy_cross;
}
}
else if (dy > 0)
{
if ((y2 % square_width) == 0)
end_y--;
}
if (dx < 0)
{
if ((x1 % square_width) == 0)
{
x--;
cross_product += dx_cross;
}
}
else if (dx > 0)
{
if ((x2 % square_width) == 0)
end_x--;
}
while (x != end_x || y != end_y) {
SetPixel(x,y,color);
int old_cross = cross_product;
if (old_cross >= 0) {
x += dx_x;
cross_product += dx_cross;
}
if (old_cross <= 0) {
y += dy_y;
cross_product += dy_cross;
}
}
This code itself hasn't been tested, but it comes slightly modified from my GitHub project where it has been tested.
Let's assume you want to draw a line from P1 = (x1, y1) to P2 = (x2, y2) where all the numbers are floating point pixel coordinates.
Calculate the true pixel coordinates of P1 and P2 and paint them: P* = (round(x), round(y)).
If abs(x1* - x2*) <= 1 && abs(y1* - y2*) <= 1 then you are finished.
Decide whether it is a horizontal (true) or a vertical line (false): abs(x1 - x2) >= abs(y1 - y2).
If it is a horizontal line and x1 > x2 or if it is a vertical line and y1 > y2: swap P1 with P2 (and also P1* with P2*).
If it is a horizontal line you can get the y-coordinates for all the x-coordinates between x1* and x2* with the following formula:
y(x) = round(y1 + (x - x1) / (x2 - x1) * (y2 - y1))
If you have a vertical line you can get the x-coordinates for all the y-coordinates between y1* and y2* with this formula:
x(y) = round(x1 + (y - y1) / (y2 - y1) * (x2 - x1))
Here is a demo you can play around with, you can try different points on line 12.
Say that I have a scalar information assigned to every pixel of a 2D rectangle R, e.g. a grayscale image, or a depth-map/bump-map.
Such a scalar information is canonically encoded in a 8 bit image, which allows 2^8=256 different tones. Conveniently, tones here have a pretty intuitive meaning, e.g. white=0, black=1, gray=somewhere between 0 and 1.
Once the image is saved, e.g. in .png, the tone t, 0 <= t <= 255, is encoded in the RGB color [t,t,t] (which wastes 16bit per pixel).
Question:
Say, that the resolution provided by the 8 bit grayscale is not enough for my purpose.
Are there established ways to losslessly encode a 24bit (1D) information to the RGB color space preserving some intuitive meaning of colors?
You might want to consider a Hilbert curve. This is an embedding of a one-dimensional curve into a higher dimensional (2, 3 or more) space.
Here's what it might look like in the case of mapping a 1d curve into a two-dimensional colour space. The white curve has 2^16 = 65,536 points, and is embedded into a 2^8 x 2^8 = 256 x 256 dimensional colour space. Any two neighbouring points on the curve are very similar.
It's possible to generalize this to embed a curve into three dimensions, though I haven't got the code to hand. I can make the Matlab code that generates this plot available if you like, though I'm not convinced it will be very helpful...
This is the color scale you end up with by following the Hilbert curve through the image. Not super intuitive, but it does cover all 65,536 colors.
Edit - here's the code
function [x,y] = d2xy(n,d)
# D2XY Embeds a point d into an n*n square (assuming n is a power of 2). For
# example, if n = 8 then we can embed the points d = 0:63 into it.
x = uint32(0);
y = uint32(0);
t = uint32(d);
n = uint32(n);
s = uint32(1);
while s < n
rx = bitand(1, idivide(t, 2));
ry = bitand(1, bitxor(t,rx));
[x,y] = rot(s,x,y,rx,ry);
x = x + s * rx;
y = y + s * ry;
t = idivide(t, 4);
s = s * 2;
end
end
function [x,y] = rot(n,x,y,rx,ry)
if ry == 0
if rx == 1
x = n-1-x;
y = n-1-y;
end
# Swap x and y
t = x; x = y; y = t;
end
end
b = zeros(65536, 2);
for d = 0:65535
[x,y] = d2xy(256, d);
b(d+1,1) = x;
b(d+1,2) = y;
end
plot(b(:,1), b(:,2)), xlim([-1,256]), ylim([-1,256]), axis square
HSV or HSI color space is the exactly what you are looking for:
http://www.mathworks.com/help/matlab/ref/rgb2hsv.html
HSI means hue, saturation and illumination or intensity. Hue gives the color frequency and it is quite intuitive to represent colors
I think you could quantize to 8, i.e. distribute intensity / 8 to as equal to 3 colors (you get a grey) and add the remaining bits (intensity % 8) to LSB of separed R,G,B. We can do with a local addition. Something like (untested but compiled)
int convert(int N) {
int Q = N / 8, // Q + Z == N
Z = N % 8;
int R = (Q >> 0) & 0xFF,
G = (Q >> 8) & 0xFF,
B = (Q >> 16) & 0xFF,
S = R + ((Z >> 0) & 1),
H = G + ((Z >> 1) & 1),
C = B + ((Z >> 2) & 1);
return (S << 0) + (H << 8) + (C << 16);
}
I'm trying to find the best way to calculate the biggest (in area) rectangle which can be contained inside a rotated rectangle.
Some pictures should help (I hope) in visualizing what I mean:
The width and height of the input rectangle is given and so is the angle to rotate it. The output rectangle is not rotated or skewed.
I'm going down the longwinded route which I'm not even sure if it will handle the corner cases (no pun intended). I'm certain there is an elegant solution to this. Any tips?
EDIT: The output rectangle points don't necessarily have to touch the input rectangles edges. (Thanks to Mr E)
I just came here looking for the same answer. After shuddering at the thought of so much math involved, I thought I would resort to a semi-educated guess. Doodling a bit I came to the (intuitive and probably not entirely exact) conclusion that the largest rectangle is proportional to the outer resulting rectangle, and its two opposing corners lie at the intersection of the diagonals of the outer rectangle with the longest side of the rotated rectangle. For squares, any of the diagonals and sides would do... I guess I am happy enough with this and will now start brushing the cobwebs off my rusty trig skills (pathetic, I know).
Minor update... Managed to do some trig calculations. This is for the case when the Height of the image is larger than the Width.
Update. Got the whole thing working. Here is some js code. It is connected to a larger program, and most variables are outside the scope of the functions, and are modified directly from within the functions. I know this is not good, but I am using this in an isolated situation, where there will be no confusion with other scripts: redacted
I took the liberty of cleaning the code and extracting it to a function:
function getCropCoordinates(angleInRadians, imageDimensions) {
var ang = angleInRadians;
var img = imageDimensions;
var quadrant = Math.floor(ang / (Math.PI / 2)) & 3;
var sign_alpha = (quadrant & 1) === 0 ? ang : Math.PI - ang;
var alpha = (sign_alpha % Math.PI + Math.PI) % Math.PI;
var bb = {
w: img.w * Math.cos(alpha) + img.h * Math.sin(alpha),
h: img.w * Math.sin(alpha) + img.h * Math.cos(alpha)
};
var gamma = img.w < img.h ? Math.atan2(bb.w, bb.h) : Math.atan2(bb.h, bb.w);
var delta = Math.PI - alpha - gamma;
var length = img.w < img.h ? img.h : img.w;
var d = length * Math.cos(alpha);
var a = d * Math.sin(alpha) / Math.sin(delta);
var y = a * Math.cos(gamma);
var x = y * Math.tan(gamma);
return {
x: x,
y: y,
w: bb.w - 2 * x,
h: bb.h - 2 * y
};
}
I encountered some problems with the gamma-calculation, and modified it to take into account in which direction the original box is the longest.
-- Magnus Hoff
Trying not to break tradition putting the solution of the problem as a picture:)
Edit:
Third equations is wrong. The correct one is:
3.w * cos(α) * X + w * sin(α) * Y - w * w * sin(α) * cos(α) - w * h = 0
To solve the system of linear equations you can use Cramer rule, or Gauss method.
First, we take care of the trivial case where the angle is zero or a multiple of pi/2. Then the largest rectangle is the same as the original rectangle.
In general, the inner rectangle will have 3 points on the boundaries of the outer rectangle. If it does not, then it can be moved so that one vertex will be on the bottom, and one vertex will be on the left. You can then enlarge the inner rectangle until one of the two remaining vertices hits a boundary.
We call the sides of the outer rectangle R1 and R2. Without loss of generality, we can assume that R1 <= R2. If we call the sides of the inner rectangle H and W, then we have that
H cos a + W sin a <= R1
H sin a + W cos a <= R2
Since we have at least 3 points on the boundaries, at least one of these inequality must actually be an equality. Let's use the first one. It is easy to see that:
W = (R1 - H cos a) / sin a
and so the area is
A = H W = H (R1 - H cos a) / sin a
We can take the derivative wrt. H and require it to equal 0:
dA/dH = ((R1 - H cos a) - H cos a) / sin a
Solving for H and using the expression for W above, we find that:
H = R1 / (2 cos a)
W = R1 / (2 sin a)
Substituting this in the second inequality becomes, after some manipulation,
R1 (tan a + 1/tan a) / 2 <= R2
The factor on the left-hand side is always at least 1. If the inequality is satisfied, then we have the solution. If it isn't satisfied, then the solution is the one that satisfies both inequalities as equalities. In other words: it is the rectangle which touches all four sides of the outer rectangle. This is a linear system with 2 unknowns which is readily solved:
H = (R2 cos a - R1 sin a) / cos 2a
W = (R1 cos a - R2 sin a) / cos 2a
In terms of the original coordinates, we get:
x1 = x4 = W sin a cos a
y1 = y2 = R2 sin a - W sin^2 a
x2 = x3 = x1 + H
y3 = y4 = y2 + W
Edit: My Mathematica answer below is wrong - I was solving a slightly different problem than what I think you are really asking.
To solve the problem you are really asking, I would use the following algorithm(s):
On the Maximum Empty Rectangle Problem
Using this algorithm, denote a finite amount of points that form the boundary of the rotated rectangle (perhaps a 100 or so, and make sure to include the corners) - these would be the set S decribed in the paper.
.
.
.
.
.
For posterity's sake I have left my original post below:
The inside rectangle with the largest area will always be the rectangle where the lower mid corner of the rectangle (the corner near the alpha on your diagram) is equal to half of the width of the outer rectangle.
I kind of cheated and used Mathematica to solve the algebra for me:
From this you can see that the maximum area of the inner rectangle is equal to 1/4 width^2 * cosecant of the angle times the secant of the angle.
Now I need to figure out what is the x value of the bottom corner for this optimal condition. Using the Solve function in mathematica on my area formula, I get the following:
Which shows that the x coordinate of the bottom corner equals half of the width.
Now just to make sure, I'll going to test our answer empirically. With the results below you can see that indeed the highest area of all of my tests (definately not exhaustive but you get the point) is when the bottom corner's x value = half of the outer rectangle's width.
#Andri is not working correctly for image where width > height as I tested.
So, I fixed and optimized his code by such way (with only two trigonometric functions):
calculateLargestRect = function(angle, origWidth, origHeight) {
var w0, h0;
if (origWidth <= origHeight) {
w0 = origWidth;
h0 = origHeight;
}
else {
w0 = origHeight;
h0 = origWidth;
}
// Angle normalization in range [-PI..PI)
var ang = angle - Math.floor((angle + Math.PI) / (2*Math.PI)) * 2*Math.PI;
ang = Math.abs(ang);
if (ang > Math.PI / 2)
ang = Math.PI - ang;
var sina = Math.sin(ang);
var cosa = Math.cos(ang);
var sinAcosA = sina * cosa;
var w1 = w0 * cosa + h0 * sina;
var h1 = w0 * sina + h0 * cosa;
var c = h0 * sinAcosA / (2 * h0 * sinAcosA + w0);
var x = w1 * c;
var y = h1 * c;
var w, h;
if (origWidth <= origHeight) {
w = w1 - 2 * x;
h = h1 - 2 * y;
}
else {
w = h1 - 2 * y;
h = w1 - 2 * x;
}
return {
w: w,
h: h
}
}
UPDATE
Also I decided to post the following function for proportional rectange calculating:
calculateLargestProportionalRect = function(angle, origWidth, origHeight) {
var w0, h0;
if (origWidth <= origHeight) {
w0 = origWidth;
h0 = origHeight;
}
else {
w0 = origHeight;
h0 = origWidth;
}
// Angle normalization in range [-PI..PI)
var ang = angle - Math.floor((angle + Math.PI) / (2*Math.PI)) * 2*Math.PI;
ang = Math.abs(ang);
if (ang > Math.PI / 2)
ang = Math.PI - ang;
var c = w0 / (h0 * Math.sin(ang) + w0 * Math.cos(ang));
var w, h;
if (origWidth <= origHeight) {
w = w0 * c;
h = h0 * c;
}
else {
w = h0 * c;
h = w0 * c;
}
return {
w: w,
h: h
}
}
Coproc solved this problem on another thread (https://stackoverflow.com/a/16778797) in a simple and efficient way. Also, he gave a very good explanation and python code there.
Below there is my Matlab implementation of his solution:
function [ CI, T ] = rotateAndCrop( I, ang )
%ROTATEANDCROP Rotate an image 'I' by 'ang' degrees, and crop its biggest
% inner rectangle.
[h,w,~] = size(I);
ang = deg2rad(ang);
% Affine rotation
R = [cos(ang) -sin(ang) 0; sin(ang) cos(ang) 0; 0 0 1];
T = affine2d(R);
B = imwarp(I,T);
% Largest rectangle
% solution from https://stackoverflow.com/a/16778797
wb = w >= h;
sl = w*wb + h*~wb;
ss = h*wb + w*~wb;
cosa = abs(cos(ang));
sina = abs(sin(ang));
if ss <= 2*sina*cosa*sl
x = .5*min([w h]);
wh = wb*[x/sina x/cosa] + ~wb*[x/cosa x/sina];
else
cos2a = (cosa^2) - (sina^2);
wh = [(w*cosa - h*sina)/cos2a (h*cosa - w*sina)/cos2a];
end
hw = flip(wh);
% Top-left corner
tl = round(max(size(B)/2 - hw/2,1));
% Bottom-right corner
br = tl + round(hw);
% Cropped image
CI = B(tl(1):br(1),tl(2):br(2),:);
sorry for not giving a derivation here, but I solved this problem in Mathematica a few days ago and came up with the following procedure, which non-Mathematica folks should be able to read. If in doubt, please consult http://reference.wolfram.com/mathematica/guide/Mathematica.html
The procedure below returns the width and height for a rectangle with maximum area that fits into another rectangle of width w and height h that has been rotated by alpha.
CropRotatedDimensionsForMaxArea[{w_, h_}, alpha_] :=
With[
{phi = Abs#Mod[alpha, Pi, -Pi/2]},
Which[
w == h, {w,h} Csc[phi + Pi/4]/Sqrt[2],
w > h,
If[ Cos[2 phi]^2 < 1 - (h/w)^2,
h/2 {Csc[phi], Sec[phi]},
Sec[2 phi] {w Cos[phi] - h Sin[phi], h Cos[phi] - w Sin[phi]}],
w < h,
If[ Cos[2 phi]^2 < 1 - (w/h)^2,
w/2 {Sec[phi], Csc[phi]},
Sec[2 phi] {w Cos[phi] - h Sin[phi], h Cos[phi] - w Sin[phi]}]
]
]
Here is the easiest way to do this... :)
Step 1
//Before Rotation
int originalWidth = 640;
int originalHeight = 480;
Step 2
//After Rotation
int newWidth = 701; //int newWidth = 654; //int newWidth = 513;
int newHeight = 564; //int newHeight = 757; //int newHeight = 664;
Step 3
//Difference in height and width
int widthDiff ;
int heightDiff;
int ASPECT_RATIO = originalWidth/originalHeight; //Double check the Aspect Ratio
if (newHeight > newWidth) {
int ratioDiff = newHeight - newWidth;
if (newWidth < Constant.camWidth) {
widthDiff = (int) Math.floor(newWidth / ASPECT_RATIO);
heightDiff = (int) Math.floor((originalHeight - (newHeight - originalHeight)) / ASPECT_RATIO);
}
else {
widthDiff = (int) Math.floor((originalWidth - (newWidth - originalWidth) - ratioDiff) / ASPECT_RATIO);
heightDiff = originalHeight - (newHeight - originalHeight);
}
} else {
widthDiff = originalWidth - (originalWidth);
heightDiff = originalHeight - (newHeight - originalHeight);
}
Step 4
//Calculation
int targetRectanleWidth = originalWidth - widthDiff;
int targetRectanleHeight = originalHeight - heightDiff;
Step 5
int centerPointX = newWidth/2;
int centerPointY = newHeight/2;
Step 6
int x1 = centerPointX - (targetRectanleWidth / 2);
int y1 = centerPointY - (targetRectanleHeight / 2);
int x2 = centerPointX + (targetRectanleWidth / 2);
int y2 = centerPointY + (targetRectanleHeight / 2);
Step 7
x1 = (x1 < 0 ? 0 : x1);
y1 = (y1 < 0 ? 0 : y1);
This is just an illustration of Jeffrey Sax's solution above, for my future reference.
With reference to the diagram above, the solution is:
(I used the identity tan(t) + cot(t) = 2/sin(2t))