MatLab - Shifting an image using FFT - image

I want to shift an image (represented by a 2D matrix) using the multiplication of its fft by exp(-j*2*pi*x*F), where x is the displacement. I have:
input=peaks(200);
H=fftshift(fft2(fftshift(input)));
x=19;
H=H*exp(-1i*x*2*pi*F);
IF_image=fftshift(ifft2(fftshift(H)));
imshow(IF_image)
But I'm having troubles identifying/representing the F in H[F] since my input is a 2 dimensional array. How could I do this?
The desired output will be my original image shifted in the horizontal axis (by x units) in the same frame so it would start at x+1. As an example:
If input=
1 2 3 4 5
6 7 8 9 0
and x=2, I want:
4 5 1 2 3
9 0 6 7 8

You identified the property for translation / shifting in 1D. For 2D, it's slightly different but based on the same principle. To achieve the translation in 2D, this is the translation / shift property, which is defined as:
x0,y0 would be the shift you want to introduce. As such, positive value of x0 would shift your 2D signal to the right, while a negative value would shift to the left. Similarly, a positive value of y0 would shift your 2D image downwards, while a negative value would shift upwards.
Therefore, given your Fourier Transform in 2D, you would need to add an additional term to the exponent. In addition, you must normalize by N or the size of your 2D signal. This is assuming that your 2D signal has the same number of rows and columns. If this isn't the case, then you would have to take u*x0 and you would divide by the number of columns and v*y0 would be divided by the number of rows.
Now, the reason why you're confused about F in your above code is because you aren't sure how to define this in 2D. You must define a frequency value for every point in the 2D grid. Because of your fftshift call, we would define the x and y values between -100 and 99, as your 2D signal is of size 200 x 200 and this would centre our 2D signal to be in the middle. This is actually what fftshift is doing. Similarly, ifftshift undoes the centering done by fftshift. To define these points in 2D, I use meshgrid. Once you define these points, you would take each pair of (x,y) co-ordinates, then create the complex exponential as you see in the above property.
As such, your code would have to be modified in this fashion. Bear in mind that I got rid of redundant fftshift and ifftshift calls in your original code. You would call the fft, then do fftshift to centre the spectrum. I also changed your variable input to in, as input is a function in MATLAB, and we don't want to unintentionally shadow the function with a variable.
I've also defined the x shift to be -35, and the y shift to be -50. This will mean that the resultant signal will shift to the left by 35, and up by 50.
Therefore:
in=peaks(200); %// Define input signal
H=fftshift(fft2(in)); %// Compute 2D Fourier Transform
x0=-35; %// Define shifts
y0=-50;
%// Define shift in frequency domain
[xF,yF] = meshgrid(-100:99,-100:99);
%// Perform the shift
H=H.*exp(-1i*2*pi.*(xF*x0+yF*y0)/200);
%// Find the inverse Fourier Transform
IF_image=ifft2(ifftshift(H));
%// Show the images
figure;
subplot(1,2,1);
imshow(in);
subplot(1,2,2);
imshow(real(IF_image));
Take note that I displayed the real component of the resultant image. This is due to the fact that once you take the inverse Fourier Transform, there may be some numerical imprecision and the complex part of the signal is actually quite small. We can ignore this by just using the real component of the signal.
This is the image I get:
As you can see, the image did shift properly, as verified by the property seen above. If you want to specify different shifts, you just need to change x0 and y0 to suit your tastes. In your case, you would specify y0 = 0, then x0 can be whatever you wish as you want a horizontal translation.

Related

How can you iterate linearly through a 3D grid?

Assume we have a 3D grid that spans some 3D space. This grid is made out of cubes, the cubes need not have integer length, they can have any possible floating point length.
Our goal is, given a point and a direction, to check linearly each cube in our path once and exactly once.
So if this was just a regular 3D array and the direction is say in the X direction, starting at position (1,2,0) the algorithm would be:
for(i in number of cubes)
{
grid[1+i][2][0]
}
But of course the origin and the direction are arbitrary and floating point numbers, so it's not as easy as iterating through only one dimension of a 3D array. And the fact the side lengths of the cubes are also arbitrary floats makes it slightly harder as well.
Assume that your cube side lengths are s = (sx, sy, sz), your ray direction is d = (dx, dy, dz), and your starting point is p = (px, py, pz). Then, the ray that you want to traverse is r(t) = p + t * d, where t is an arbitrary positive number.
Let's focus on a single dimension. If you are currently at the lower boundary of a cube, then the step length dt that you need to make on your ray in order to get to the upper boundary of the cube is: dt = s / d. And we can calculate this step length for each of the three dimensions, i.e. dt is also a 3D vector.
Now, the idea is as follows: Find the cell where the ray's starting point lies in and find the parameter values t where the first intersection with the grid occurs per dimension. Then, you can incrementally find the parameter values where you switch from one cube to the next for each dimension. Sort the changes by the respective t value and just iterate.
Some more details:
cell = floor(p - gridLowerBound) / s <-- the / is component-wise division
I will only cover the case where the direction is positive. There are some minor changes if you go in the negative direction but I am sure that you can do these.
Find the first intersections per dimension (nextIntersection is a 3D vector):
nextIntersection = ((cell + (1, 1, 1)) * s - p) / d
And calculate the step length:
dt = s / d
Now, just iterate:
if(nextIntersection.x < nextIntersection.y && nextIntersection.x < nextIntersection.z)
cell.x++
nextIntersection.x += dt.x
else if(nextIntersection.y < nextIntersection.z)
cell.y++
nextIntersection.y += dt.y
else
cell.z++
nextIntersection.z += dt.z
end if
if cell is outside of grid
terminate
I have omitted the case where two or three cells are changed at the same time. The above code will only change one at a time. If you need this, feel free to adapt the code accordingly.
Well if you are working with floats, you can make the equation for the line in direction specifiedd. Which is parameterized by t. Because in between any two floats there is a finite number of points, you can simply check each of these points which cube they are in easily cause you have point (x,y,z) whose components should be in, a respective interval defining a cube.
The issue gets a little bit harder if you consider intervals that are, dense.
The key here is even with floats this is a discrete problem of searching. The fact that the equation of a line between any two points is a discrete set of points means you merely need to check them all to the cube intervals. What's better is there is a symmetry (a line) allowing you to enumerate each point easily with arithmetic expression, one after another for checking.
Also perhaps consider integer case first as it is same but slightly simpler in determining the discrete points as it is a line in Z_2^8?

Understanding Hough Transform [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
I am trying to understand MATLAB's code for the Hough Transform.
Some items are clear to me in this picture,
binary_image is the monochrome version of input_image.
hough_lines is a vector containing detected lines in the image. I see that, four lines have been detected.
T contain the thetas in the (ϴ, ρ) space of the image.
R contain the rhos in the (ϴ, ρ) space of the image.
I have the following questions,
Why is the image rotated before applying Hough Transform?
What do the entries in H represent?
Why is H(Hough Matrix) of size 45x180? Where does this size come from?
Why is T of size 1x180? Where does this size come from?
Why is R of size 1x45? Where does this size come from?
What do the entries in P represent? Are they (x, y) or (ϴ, ρ) ?
29 162
29 165
28 170
21 5
29 158
Why is the value 5 passed into houghpeaks()?
What is the logic behind ceil(0.3*max(H(:)))?
Relevant source code
% Read image into workspace.
input_image = imread('Untitled.bmp');
%Rotate the image.
rotated_image = imrotate(input_image,33,'crop');
% convert rgb to grascale
rotated_image = rgb2gray(rotated_image);
%Create a binary image.
binary_image = edge(rotated_image,'canny');
%Create the Hough transform using the binary image.
[H,T,R] = hough(binary_image);
%Find peaks in the Hough transform of the image.
P = houghpeaks(H,5,'threshold',ceil(0.3*max(H(:))));
%Find lines
hough_lines = houghlines(binary_image,T,R,P,'FillGap',5,'MinLength',7);
% Plot the detected lines
figure, imshow(rotated_image), hold on
max_len = 0;
for k = 1:length(hough_lines)
xy = [hough_lines(k).point1; hough_lines(k).point2];
plot(xy(:,1),xy(:,2),'LineWidth',2,'Color','green');
% Plot beginnings and ends of lines
plot(xy(1,1),xy(1,2),'x','LineWidth',2,'Color','yellow');
plot(xy(2,1),xy(2,2),'x','LineWidth',2,'Color','red');
% Determine the endpoints of the longest line segment
len = norm(hough_lines(k).point1 - hough_lines(k).point2);
if ( len > max_len)
max_len = len;
xy_long = xy;
end
end
% Highlight the longest line segment by coloring it cyan.
plot(xy_long(:,1),xy_long(:,2),'LineWidth',2,'Color','cyan');
Those are some good questions. Here are my answers for you:
Why is the image rotated before applying Hough Transform?
This I don't believe is MATLAB's "official example". I just took a quick look at the documentation page for the function. I believe you pulled this from another website that we don't have access to. In any case, in general it is not necessary for you to rotate the images prior to using the Hough Transform. The goal of the Hough Transform is to find lines in the image in any orientation. Rotating them should not affect the results. However, if I were to guess the rotation was performed as a preemptive measure because the lines in the "example image" were most likely oriented at a 33 degree angle clockwise. Performing the reverse rotation would make the lines more or less straight.
What do the entries in H represent?
H is what is known as an accumulator matrix. Before we get into what the purpose of H is and how to interpret the matrix, you need to know how the Hough Transform works. With the Hough transform, we first perform an edge detection on the image. This is done using the Canny edge detector in your case. If you recall the Hough Transform, we can parameterize a line using the following relationship:
rho = x*cos(theta) + y*sin(theta)
x and y are points in the image and most customarily they are edge points. theta would be the angle made from the intersection of a line drawn from the origin meeting with the line drawn through the edge point. rho would be the perpendicular distance from the origin to this line drawn through (x, y) at the angle theta.
Note that the equation can yield infinity many lines located at (x, y) so it's common to bin or discretize the total number of possible angles to a predefined amount. MATLAB by default assumes there are 180 possible angles that range from [-90, 90) with a sampling factor of 1. Therefore [-90, -89, -88, ... , 88, 89]. What you generally do is for each edge point, you search over a predefined number of angles, determine what the corresponding rho is. After, we count how many times you see each rho and theta pair. Here's a quick example pulled from Wikipedia:
Source: Wikipedia: Hough Transform
Here we see three black dots that follow a straight line. Ideally, the Hough Transform should determine that these black dots together form a straight line. To give you a sense of the calculations, take a look at the example at 30 degrees. Consulting earlier, when we extend a line where the angle made from the origin to this line is 30 degrees through each point, we find the perpendicular distance from this line to the origin.
Now what's interesting is if you see the perpendicular distance shown at 60 degrees for each point, the distance is more or less the same at about 80 pixels. Seeing this rho and theta pair for each of the three points is the driving force behind the Hough Transform. Also, what's nice about the above formula is that it will implicitly find the perpendicular distance for you.
The process of the Hough Transform is very simple. Suppose we have an edge detected image I and a set of angles theta:
For each point (x, y) in the image:
For each angle A in the angles theta:
Substitute theta into: rho = x*cos(theta) + y*sin(theta)
Solve for rho to find the perpendicular distance
Remember this rho and theta and count up the number of times you see this by 1
So ideally, if we had edge points that follow a straight line, we should see a rho and theta pair where the count of how many times we see this pair is relatively high. This is the purpose of the accumulator matrix H. The rows denote a unique rho value and the columns denote a unique theta value.
An example of this is shown below:
Source: Google Patents
Therefore using an example from this matrix, located at theta between 25 - 30 with a rho of 4 - 4.5, we have found that there are 8 edge points that would be characterized by a line given this rho, theta range pair.
Note that the range of rho is also infinitely many values so you need to not only restrict the range of rho that you have, but you also have to discretize the rho with a sampling interval. The default in MATLAB is 1. Therefore, if you calculate a rho value it will inevitably have floating point values, so you remove the decimal precision to determine the final rho.
For the above example the rho resolution is 0.5, so that means that for example if you calculated a rho value that falls between 2 to 2.5, it falls in the first column. Also note that the theta values are binned in intervals of 5. You traditionally would compute the Hough Transform with a theta sampling interval of 1, then you merge the bins together. However for the defaults of MATLAB, the bin size is 1. This accumulator matrix tells you how many times an edge point fits a particular rho and theta combination. Therefore, if we see many points that get mapped to a particular rho and theta value, this is a great potential for a line to be detected here and that is defined by rho = x*cos(theta) + y*sin(theta).
Why is H(Hough Matrix) of size 45x180? Where does this size come from?
This is a consequence of the previous point. Take note that the largest distance we would expect from the origin to any point in the image is bounded by the diagonal of the image. This makes sense because going from the top left corner to the bottom right corner, or from the bottom left corner to the top right corner would give you the greatest distance expected in the image. In general, this is defined as D = sqrt(rows^2 + cols^2) where rows and cols are the rows and columns of the image.
For the MATLAB defaults, the range of rho is such that it spans from -round(D) to round(D) in steps of 1. Therefore, your rows and columns are both 16, and so D = sqrt(16^2 + 16^2) = 22.45... and so the range of D will span from -22 to 22 and hence this results in 45 unique rho values. Remember that the default resolution of theta goes from [-90, 90) (with steps of 1) resulting in 180 unique angle values. Going with this, we have 45 rows and 180 columns in the accumulator matrix and hence H is 45 x 180.
Why is T of size 1x180? Where does this size come from?
This is an array that tells you all of the angles that were being used in the Hough Transform. This should be an array going from -90 to 89 in steps of 1.
Why is R of size 1x45? Where does this size come from?
This is an array that tells you all of the rho values that were being used in the Hough Transform. This should be an array that spans from -22 to 22 in steps of 1.
What you should take away from this is that each value in H determines how many times we have seen a particular pair of rho and theta such that for R(i) <= rho < R(i + 1) and T(j) <= theta < T(j + 1), where i spans from 1 to 44 and j spans from 1 to 179, this determines how many times we see edge points for a particular range of rho and theta defined previously.
What do the entries in P represent? Are they (x, y) or (ϴ, ρ)?
P is the output of the houghpeaks function. Basically, this determines what the possible lines are by finding where the peaks in the accumulator matrix happen. This gives you the actual physical locations in P where there is a peak. These locations are:
29 162
29 165
28 170
21 5
29 158
Each row gives you a gateway to the rho and theta parameters required to generate the detected line. Specifically, the first line is characterized by rho = R(29) and theta = T(162). The second line is characterized by rho = R(29) and theta = T(165) etc. To answer your question, the values in P are neither (x, y) or (ρ, ϴ). They represent the physical locations in P where cross-referencing R and T, it would give you the parameters to characterize the line that was detected in the image.
Why is the value 5 passed into houghpeaks()?
The extra 5 in houghpeaks returns the total number of lines you'd like to detect ideally. We can see that P is 5 rows, corresponding to 5 lines. If you can't find 5 lines, then MATLAB will return as many lines possible.
What is the logic behind ceil(0.3*max(H(:)))?
The logic behind this is that if you want to determine peaks in the accumulator matrix, you have to define a minimum threshold that would tell you whether the particular rho and theta combination would be considered a valid line. Making this threshold too low would report a lot of false lines and making this threshold too high misses a lot of lines. What they decided to do here was find the largest bin count in the accumulator matrix, take 30% of that, take the mathematical ceiling and any values in the accumulator matrix that are larger than this amount, those would be candidate lines.
Hope this helps!

Vector decomposition in matlab

this is my situation: I have a 30x30 image and I want to calculate the radial and tangent component of the gradient of each point (pixel) along the straight line passing through the centre of the image (15,15) and the same (i,j) point.
[dx, dy] = gradient(img);
for i=1:30
for j=1:30
pt = [dx(i, j), dy(i,j)];
line = [i-15, j-15];
costh = dot(line, pt)/(norm(line)*norm(pt));
par(i,j) = norm(costh*line);
tang(i,j) = norm(sin(acos(costh))*line);
end
end
is this code correct?
I think there is a conceptual error in your code, I tried to get your results with a different approach, see how it compares to yours.
[dy, dx] = gradient(img);
I inverted x and y because the usual convention in matlab is to have the first dimension along the rows of a matrix while gradient does the opposite.
I created an array of the same size as img but with each pixel containing the angle of the vector from the center of the image to this point:
[I,J] = ind2sub(size(img), 1:numel(img));
theta=reshape(atan2d(I-ceil(size(img,1)/2), J-ceil(size(img,2)/2)), size(img))+180;
The function atan2d ensures that the 4 quadrants give distinct angle values.
Now the projection of the x and y components can be obtained with trigonometry:
par=dx.*sind(theta)+dy.*cosd(theta);
tang=dx.*cosd(theta)+dy.*sind(theta);
Note the use of the .* to achieve point-by-point multiplication, this is a big advantage of Matlab's matrix computations which saves you a loop.
Here's an example with a well-defined input image (no gradient along the rows and a constant gradient along the columns):
img=repmat(1:30, [30 1]);
The results:
subplot(1,2,1)
imagesc(par)
subplot(1,2,2)
imagesc(tang)
colorbar

How to filter a set of 2D points moving in a certain way

I have a list of points moving in two dimensions (x- and y-axis) represented as rows in an array. I might have N points - i.e., N rows:
1 t1 x1 y1
2 t2 x2 y2
.
.
.
N tN xN yN
where ti, xi, and yi, is the time-index, x-coordinate, and the y-coordinate for point i. The time index-index ti is an integer from 1 to T. The number of points at each such possible time index can vary from 0 to N (still with only N points in total).
My goal is the filter out all the points that do not move in a certain way; or to keep only those that do. A point must move in a parabolic trajectory - with decreasing x- and y-coordinate (i.e., moving to the left and downwards only). Points with other dynamic behaviour must be removed.
Can I use a simple sorting mechanism on this array - and then analyse the order of the time-index? I have also considered the fact each point having the same time-index ti are physically distinct points, and so should be paired up with other points. The complexity of the problem grew - and now I turn to you.
NOTE: You can assume that the points are confined to a sub-region of the (x,y)-plane between two parabolic curves. These curves intersect only at only at one point: A point close to the origin of motion for any point.
More Information:
I have made some datafiles available:
MATLAB datafile (1.17 kB)
same data as CSV with semicolon as column separator (2.77 kB)
Necessary context:
The datafile hold one uint32 array with 176 rows and 5 columns. The columns are:
pixel x-coordinate in 175-by-175 lattice
pixel y-coordinate in 175-by-175 lattice
discrete theta angle-index
time index (from 1 to T = 10)
row index for this original sorting
The points "live" in a 175-by-175 pixel-lattice - and again inside the upper quadrant of a circle with radius 175. The points travel on the circle circumference in a counterclockwise rotation to a certain angle theta with horizontal, where they are thrown off into something close to a parabolic orbit. Column 3 holds a discrete index into a list with indices 1 to 45 from 0 to 90 degress (one index thus spans 2 degrees). The theta-angle was originally deduces solely from the points by setting up the trivial equations of motions and solving for the angle. This gives rise to a quasi-symmetric quartic which can be solved in close-form. The actual metric radius of the circle is 0.2 m and the pixel coordinate were converted from pixel-coordinate to metric using simple linear interpolation (but what we see here are the points in original pixel-space).
My problem is that some points are not behaving properly and since I need to statistics on the theta angle, I need to remove the points that certainly do NOT move in a parabolic trajoctory. These error are expected and fully natural, but still need to be filtered out.
MATLAB plot code:
% load data and setup variables:
load mat_points.mat;
num_r = 175;
num_T = 10;
num_gridN = 20;
% begin plotting:
figure(1000);
clf;
plot( ...
num_r * cos(0:0.1:pi/2), ...
num_r * sin(0:0.1:pi/2), ...
'Color', 'k', ...
'LineWidth', 2 ...
);
axis equal;
xlim([0 num_r]);
ylim([0 num_r]);
hold all;
% setup grid (yea... went crazy with one):
vec_tickValues = linspace(0, num_r, num_gridN);
cell_tickLabels = repmat({''}, size(vec_tickValues));
cell_tickLabels{1} = sprintf('%u', vec_tickValues(1));
cell_tickLabels{end} = sprintf('%u', vec_tickValues(end));
set(gca, 'XTick', vec_tickValues);
set(gca, 'XTickLabel', cell_tickLabels);
set(gca, 'YTick', vec_tickValues);
set(gca, 'YTickLabel', cell_tickLabels);
set(gca, 'GridLineStyle', '-');
grid on;
% plot points per timeindex (with increasing brightness):
vec_grayIndex = linspace(0,0.9,num_T);
for num_kt = 1:num_T
vec_xCoords = mat_points((mat_points(:,4) == num_kt), 1);
vec_yCoords = mat_points((mat_points(:,4) == num_kt), 2);
plot(vec_xCoords, vec_yCoords, 'o', ...
'MarkerEdgeColor', 'k', ...
'MarkerFaceColor', vec_grayIndex(num_kt) * ones(1,3) ...
);
end
Thanks :)
Why, it looks almost as if you're simulating a radar tracking debris from the collision of two missiles...
Anyway, let's coin a new term: object. Objects are moving along parabolae and at certain times they may emit flashes that appear as points. There are also other points which we are trying to filter out.
We will need some more information:
Can we assume that the objects obey the physics of things falling under gravity?
Must every object emit a point at every timestep during its lifetime?
Speaking of lifetime, do all objects begin at the same time? Can some expire before others?
How precise is the data? Is it exact? Is there a measure of error? To put it another way, do we understand how poorly the points from an object might fit a perfect parabola?
Sort the data with (index,time) as keys and for all locations of a point i see if they follow parabolic trajectory?
Which part are you facing problem? Sorting should be very easy. IMHO, it is the second part (testing if a set of points follow parabolic trajectory) that is difficult.

Measuring the average thickness of traces in an image

Here's the problem: I have a number of binary images composed by traces of different thickness. Below there are two images to illustrate the problem:
First Image - size: 711 x 643 px
Second Image - size: 930 x 951 px
What I need is to measure the average thickness (in pixels) of the traces in the images. In fact, the average thickness of traces in an image is a somewhat subjective measure. So, what I need is a measure that have some correlation with the radius of the trace, as indicated in the figure below:
Notes
Since the measure doesn't need to be very precise, I am willing to trade precision for speed. In other words, speed is an important factor to the solution of this problem.
There might be intersections in the traces.
The trace thickness might not be constant, but an average measure is OK (even the maximum trace thickness is acceptable).
The trace will always be much longer than it is wide.
I'd suggest this algorithm:
Apply a distance transformation to the image, so that all background pixels are set to 0, all foreground pixels are set to the distance from the background
Find the local maxima in the distance transformed image. These are points in the middle of the lines. Put their pixel values (i.e. distances from the background) image into a list
Calculate the median or average of that list
I was impressed by #nikie's answer, and gave it a try ...
I simplified the algorithm for just getting the maximum value, not the mean, so evading the local maxima detection algorithm. I think this is enough if the stroke is well-behaved (although for self intersecting lines it may not be accurate).
The program in Mathematica is:
m = Import["http://imgur.com/3Zs7m.png"] (* Get image from web*)
s = Abs[ImageData[m] - 1]; (* Invert colors to detect background *)
k = DistanceTransform[Image[s]] (* White Pxs converted to distance to black*)
k // ImageAdjust (* Show the image *)
Max[ImageData[k]] (* Get the max stroke width *)
The generated result is
The numerical value (28.46 px X 2) fits pretty well my measurement of 56 px (Although your value is 100px :* )
Edit - Implemented the full algorithm
Well ... sort of ... instead of searching the local maxima, finding the fixed point of the distance transformation. Almost, but not quite completely unlike the same thing :)
m = Import["http://imgur.com/3Zs7m.png"]; (*Get image from web*)
s = Abs[ImageData[m] - 1]; (*Invert colors to detect background*)
k = DistanceTransform[Image[s]]; (*White Pxs converted to distance to black*)
Print["Distance to Background*"]
k // ImageAdjust (*Show the image*)
Print["Local Maxima"]
weights =
Binarize[FixedPoint[ImageAdjust#DistanceTransform[Image[#], .4] &,s]]
Print["Stroke Width =",
2 Mean[Select[Flatten[ImageData[k]] Flatten[ImageData[weights]], # != 0 &]]]
As you may see, the result is very similar to the previous one, obtained with the simplified algorithm.
From Here. A simple method!
3.1 Estimating Pen Width
The pen thickness may be readily estimated from the area A and perimeter length L of the foreground
T = A/(L/2)
In essence, we have reshaped the foreground into a rectangle and measured the length of the longest side. Stronger modelling of the pen, for instance, as a disc yielding circular ends, might allow greater precision, but rasterisation error would compromise the signicance.
While precision is not a major issue, we do need to consider bias and singularities.
We should therefore calculate area A and perimeter length L using functions which take into account "roundedness".
In MATLAB
A = bwarea(.)
L = bwarea(bwperim(.; 8))
Since I don't have MATLAB at hand, I made a small program in Mathematica:
m = Binarize[Import["http://imgur.com/3Zs7m.png"]] (* Get Image *)
k = Binarize[MorphologicalPerimeter[m]] (* Get Perimeter *)
p = N[2 Count[ImageData[m], Except[1], 2]/
Count[ImageData[k], Except[0], 2]] (* Calculate *)
The output is 36 Px ...
Perimeter image follows
HTH!
Its been a 3 years since the question was asked :)
following the procedure of #nikie, here is a matlab implementation of the stroke width.
clc;
clear;
close all;
I = imread('3Zs7m.png');
X = im2bw(I,0.8);
subplottight(2,2,1);
imshow(X);
Dist=bwdist(X);
subplottight(2,2,2);
imshow(Dist,[]);
RegionMax=imregionalmax(Dist);
[x, y] = find(RegionMax ~= 0);
subplottight(2,2,3);
imshow(RegionMax);
List(1:size(x))=0;
for i = 1:size(x)
List(i)=Dist(x(i),y(i));
end
fprintf('Stroke Width = %u \n',mean(List));
Assuming that the trace has constant thickness, is much longer than it is wide, is not too strongly curved and has no intersections / crossings, I suggest an edge detection algorithm which also determines the direction of the edge, then a rise/fall detector with some trigonometry and a minimization algorithm. This gives you the minimal thickness across a relatively straight part of the curve.
I guess the error to be up to 25%.
First use an edge detector that gives us the information where an edge is and which direction (in 45° or PI/4 steps) it has. This is done by filtering with 4 different 3x3 matrices (Example).
Usually I'd say it's enough to scan the image horizontally, though you could also scan vertically or diagonally.
Assuming line-by-line (horizontal) scanning, once we find an edge, we check if it's a rise (going from background to trace color) or a fall (to background). If the edge's direction is at a right angle to the direction of scanning, skip it.
If you found one rise and one fall with the correct directions and without any disturbance in between, measure the distance from the rise to the fall. If the direction is diagonal, multiply by squareroot of 2. Store this measure together with the coordinate data.
The algorithm must then search along an edge (can't find a web resource on that right now) for neighboring (by their coordinates) measurements. If there is a local minimum with a padding of maybe 4 to 5 size units to each side (a value to play with - larger: less information, smaller: more noise), this measure qualifies as a candidate. This is to ensure that the ends of the trail or a section bent too much are not taken into account.
The minimum of that would be the measurement. Plausibility check: If the trace is not too tangled, there should be a lot of values in that area.
Please comment if there are more questions. :-)
Here is an answer that works in any computer language without the need of special functions...
Basic idea: Try to fit a circle into the black areas of the image. If you can, try with a bigger circle.
Algorithm:
set image background = 0 and trace = 1
initialize array result[]
set minimalExpectedWidth
set w = minimalExpectedWidth
loop
set counter = 0
create a matrix of zeros size w x w
within a circle of diameter w in that matrix, put ones
calculate area of the circle (= PI * w)
loop through all pixels of the image
optimization: if current pixel is of background color -> continue loop
multiply the matrix with the image at each pixel (e.g. filtering the image with that matrix)
(you can do this using the current x and y position and a double for loop from 0 to w)
take the sum of the result of each multiplication
if the sum equals the calculated circle's area, increment counter by one
store in result[w - minimalExpectedWidth]
increment w by one
optimization: include algorithm from further down here
while counter is greater zero
Now the result array contains the number of matches for each tested width.
Graph it to have a look at it.
For a width of one this will be equal to the number of pixels of trace color. For a greater width value less circle areas will fit into the trace. The result array will thus steadily decrease until there is a sudden drop. This is because the filter matrix with the circular area of that width now only fits into intersections.
Right before the drop is the width of your trace. If the width is not constant, the drop will not be that sudden.
I don't have MATLAB here for testing and don't know for sure about a function to detect this sudden drop, but we do know that the decrease is continuous, so I'd take the maximum of the second derivative of the (zero-based) result array like this
Algorithm:
set maximum = 0
set widthFound = 0
set minimalExpectedWidth as above
set prevvalue = result[0]
set index = 1
set prevFirstDerivative = result[1] - prevvalue
loop until index is greater result length
firstDerivative = result[index] - prevvalue
set secondDerivative = firstDerivative - prevFirstDerivative
if secondDerivative > maximum or secondDerivative < maximum * -1
maximum = secondDerivative
widthFound = index + minimalExpectedWidth
prevFirstDerivative = firstDerivative
prevvalue = result[index]
increment index by one
return widthFound
Now widthFound is the trace width for which (in relation to width + 1) many more matches were found.
I know that this is in part covered in some of the other answers, but my description is pretty much straightforward and you don't have to have learned image processing to do it.
I have interesting solution:
Do edge detection, for edge pixels extraction.
Do physical simulation - consider edge pixels as positively charged particles.
Now put some number of free positively charged particles in the stroke area.
Calculate electrical force equations for determining movement of these free particles.
Simulate particles movement for some time until particles reach position equilibrium.
(As they will repel from both stoke edges after some time they will stay in the middle line of stoke)
Now stroke thickness/2 would be average distance from edge particle to nearest free particle.

Resources