Best algorithm to find a region of the same values - algorithm

I would like to return a list of the pixels that belong to the same region, after clicking on one of them. The input would be the chosen pixel (seed) and the output would be a list of all pixels that have the same value and belong to the same region (are not separated by any pixel of a different value).
My idea was to create an auxiliary list of seeds and check the neighbours of each of them. If the value of the neighbour is the same as of the seed, it is appended to the region list. My python implementation is below:
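def region_growing(x, y):
    value = image[x,y]
    region = [(x,y),]
    seeds = [(x,y),]
    while seeds:
        seed = seeds.pop()
        x = seed[0]
        y = seed[1]
        # look at the 3x3 neighbourhood of the current seed
        for i in range(x-1, x+2):
            for j in range(y-1, y+2):
                if image[i,j] == value:
                    point = (i,j)
                    if point not in region:
                        # a new pixel of the region also becomes a seed
                        seeds.append(point)
                        region.append(point)
    return region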
It works, but is very slow for bigger regions. What algorithm would you suggest?

The problem is the test "if point not in region", whose execution time grows with the size of the region, because region is a list and the membership test scans it linearly. The overall complexity is thus quadratic.
Another problem is that you visit the same pixels multiple times at the boundary of the region since you only keep track of pixels in the region.
You can avoid this by using a dictionary of visited pixels with the point as key.
def region_growing(x, y):
    value = image[x,y]
    region = [(x,y),]
    seeds = [(x,y),]
    visited = {(x,y): True}
    while seeds:
        seed = seeds.pop()
        x = seed[0]
        y = seed[1]
        for i in range(x-1, x+2):
            for j in range(y-1, y+2):
                point = (i,j)
                # each pixel is examined at most once, whether or not it matches
                if point not in visited:
                    visited[point] = True
                    if image[i,j] == value:
                        region.append(point)
                        seeds.append(point)
    return region
Another method is to use a matrix of booleans instead of the dictionary. This is faster but requires more memory space.
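A minimal sketch of the boolean-matrix variant follows. It assumes the image is a rectangular list of rows indexed as image[y][x] (not the image[x,y] indexing used above); the name region_growing_mask and the bounds checks are additions for illustration.

```python
def region_growing_mask(image, x, y):
    """Return all pixels 8-connected to (x, y) that share its value."""
    height, width = len(image), len(image[0])
    value = image[y][x]
    # visited[row][col] is True once that pixel has been examined.
    visited = [[False] * width for _ in range(height)]
    visited[y][x] = True
    region = [(x, y)]
    seeds = [(x, y)]
    while seeds:
        sx, sy = seeds.pop()
        # Clamp the 3x3 neighbourhood to the image borders.
        for i in range(max(sx - 1, 0), min(sx + 2, width)):
            for j in range(max(sy - 1, 0), min(sy + 2, height)):
                if visited[j][i]:
                    continue
                visited[j][i] = True
                if image[j][i] == value:
                    region.append((i, j))
                    seeds.append((i, j))
    return region
```

The lookup visited[j][i] is a constant-time index operation, at the cost of one boolean per pixel of the whole image rather than one entry per visited pixel.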

I suggest using any region-fill/paint algorithm and patching it to track the pixels of the region instead of painting them. Smith's algorithm is known to be fast and efficient, see Tint Fill Algorithm.
Note that it is inefficient to store all pixels; as the algorithm suggests, horizontal segments are sufficient (thus only two pixels per segment).
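As an illustration of storing segments instead of pixels, here is a simplified span-based fill. It is not Smith's algorithm itself, only a sketch of the idea; it assumes a list-of-rows image indexed as image[y][x] and 4-connectivity, and the name region_spans is made up for this example.

```python
def region_spans(image, x, y):
    """Return the 4-connected region around (x, y) as (row, x_start, x_end) runs."""
    height, width = len(image), len(image[0])
    value = image[y][x]
    visited = [[False] * width for _ in range(height)]
    spans = []
    stack = [(x, y)]
    while stack:
        sx, sy = stack.pop()
        if visited[sy][sx] or image[sy][sx] != value:
            continue
        # Extend the run to the left and right of the seed.
        left = sx
        while left > 0 and image[sy][left - 1] == value:
            left -= 1
        right = sx
        while right < width - 1 and image[sy][right + 1] == value:
            right += 1
        for i in range(left, right + 1):
            visited[sy][i] = True
        spans.append((sy, left, right))
        # Queue matching pixels in the rows above and below (simple, not minimal).
        for ny in (sy - 1, sy + 1):
            if 0 <= ny < height:
                for i in range(left, right + 1):
                    if not visited[ny][i] and image[ny][i] == value:
                        stack.append((i, ny))
    return spans
```

Each run is stored as two end points, so a wide region costs one tuple per row segment rather than one entry per pixel.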


Automatically choose x locations of scatter plot in front of bar graph

I'd like an algorithm to organize a 2D cloud of points in front of a bar graph so that a viewer could easily see the spread of the data. The y location of the point needs to be equal/scaled/proportional to the value of the data, but the x location doesn't matter and would be determined by the algorithm. I imagine a good strategy would be to minimize overlap among the points and center the points.
Here is an example of such a plot without organizing the points:
I generate my bar graphs with points in front of it with MATLAB, but I'm interested just in the best way to choose the x location values of the points.
I have been organizing the points by hand afterwards in Adobe Illustrator, which is time-consuming. Any recommendations? Is this a sub-problem of an already solved problem? What is this kind of plot called?
For high sample sizes, I imagine something like the following would be better than a cloud of points.
I think, mathematically, starting with some array of y-values, it would maximize the sum of the difference between every element from every other element, inversely scaled by the distance between the elements, by rearranging the order of the elements in the array.
Here is the MATLAB code I used to generate the graph:
y = zeros(20,6);
yMean = zeros(1,6);
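for i=1:6
    y(:,i) = 5 + (8-5).*rand(20,1);
    yMean(i) = mean(y(:,i));
end
bar(yMean)                      % assumed call; the bar plot command is not shown in the listing
hold on
for i=1:6
    x = linspace(i-0.3,i+0.3,20);
    plot(x, y(:,i), 'k.')       % assumed call; the point plot command is not shown in the listing
end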
Here is one way that determines x-locations based on grouping into (histogram) bins. The result is similar to e.g. the plot in, but retains the original y-values. For convenience the points are sorted, but they could be displayed in order of appearance using the bin_index. Whether this is "the best way" of choosing the x-coordinates depends on what you are trying to achieve.
% Create some dummy data
dummy_data_y = 1+0.1*randn(10,3);
% Create bar plot (assuming you are interested in the mean)
bar_obj = bar(mean(dummy_data_y));
% Obtain data size info
n = size(dummy_data_y, 2);
% Algorithm that creates an x vector for each data column
sorted_data_y = sort(dummy_data_y, 'ascend'); % for convenience
number_of_bins = 5;
for j=1:n
    % Get histogram information
    [bin_count, ~, bin_index] = histcounts(sorted_data_y(:, j), number_of_bins);
    % Create x-location data for current column
    xj = [];
    for k = 1:number_of_bins
        xj = [xj 0:bin_count(k)-1];
    end
    % Collect x locations per column, scale and translate
    % (the divisor on the continuation line is an assumption; that line is cut off in the listing)
    sorted_data_x(:, j) = j + (xj-(bin_count(bin_index)-1)/2)'/...
        max(bin_count)*bar_obj.BarWidth/2;
end
% Plot the individual data points
line(sorted_data_x, sorted_data_y, 'linestyle', 'none', 'marker', '.', 'color', 'r')
Whether this is a good way to display your data remains open to discussion.
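For readers who are not using MATLAB, the binning idea can be sketched in Python as follows. This is an illustration under assumptions, not a port of the code above: the function name binned_x_offsets, the parameter max_half_width and the equal-width binning are choices made for this example.

```python
def binned_x_offsets(values, number_of_bins, max_half_width=0.3):
    """Return one x offset per value so points in the same y-bin sit side by side."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1 if all values are equal.
    bin_width = (hi - lo) / number_of_bins or 1.0
    # Assign each value to a bin; the maximum goes into the last bin.
    bin_index = [min(int((v - lo) / bin_width), number_of_bins - 1) for v in values]
    counts = [0] * number_of_bins
    for b in bin_index:
        counts[b] += 1
    largest = max(counts)
    # Scale so the fullest bin spans [-max_half_width, +max_half_width].
    scale = max_half_width / ((largest - 1) / 2) if largest > 1 else 0.0
    seen = [0] * number_of_bins
    offsets = []
    for b in bin_index:
        # Position within the bin, centred on zero.
        centred = seen[b] - (counts[b] - 1) / 2
        seen[b] += 1
        offsets.append(centred * scale)
    return offsets
```

The x coordinate of a point in bar j would then be j plus its offset, so every cloud stays centred on its bar.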

How do I programmatically create a grid of pixels given only the pixels neighbors

I'm trying to figure out an algorithm for building a grid, based on a number of pixels and their surrounding pixels. For instance, let's say I have 200 random pixels. I have pixel a, and I can get references to each pixel surrounding it. This holds true for all the pixels. In essence, each pixel is a puzzle piece, and each piece has a reference to all its neighbors. How do I programmatically create the grid of pixels (the finished puzzle) given that information?
Assuming your input is a list of pixels, where each pixel has the attributes top, left, bottom and right (references to the surrounding pixels, empty where there is no neighbor), and your output will be a 2D array grid of pixels, you can do as follows:
def pixel_graph_to_grid(pixels):
    if len(pixels) == 0:
        return [[]]
    # (1) Finding the top left pixel.
    p = pixels[0]
    while p.top:
        p = p.top
    while p.left:
        p = p.left
    # (2) Go row-wise through the image.
    grid = []
    first_of_row = p
    while True:
        p = first_of_row
        row = [p]
        while p.right:
            p = p.right
            row.append(p)
        grid.append(row)
        if first_of_row.bottom:
            first_of_row = first_of_row.bottom
        else:
            # no row below the current one: the grid is complete
            return grid
You can also do some counting similar to (1) to know how much memory you have to allocate for the grid.
This algorithm has linear runtime and requires constant extra space, so it should be optimal.

Best way to find all points of lattice in sphere

Given a bunch of arbitrary vectors (stored in a matrix A) and a radius r, I'd like to find all integer-valued linear combinations of those vectors which land inside a sphere of radius r. The necessary coordinates I would then store in a Matrix V. So, for instance, if the linear combination
K=[0; 1; 0]
lands inside my sphere, i.e. something like
if norm(A*K) <= r then
The vectors in A are sure to be the simplest possible basis for the given lattice and the largest vector will have length 1. Not sure if that restricts the vectors in any useful way but I suspect it might. - They won't have as similar directions as a less ideal basis would have.
I tried a few approaches already but none of them seem particularly satisfying. I can't seem to find a nice pattern to traverse the lattice.
My current approach involves starting in the middle (i.e. with the linear combination of all 0s) and go through the necessary coordinates one by one. It involves storing a bunch of extra vectors to keep track of, so I can go through all the octants (in the 3D case) of the coordinates and find them one by one. This implementation seems awfully complex and not very flexible (in particular it doesn't seem to be easily generalizable to arbitrary numbers of dimension - although that isn't strictly necessary for the current purpose, it'd be a nice-to-have)
Is there a nice* way to find all the required points?
(*Ideally both efficient and elegant**. If REALLY necessary, it wouldn't matter THAT much to have a few extra points outside the sphere but preferably not that many more. I definitely do need all the vectors inside the sphere. - if it makes a large difference, I'm most interested in the 3D case.
**I'm pretty sure my current implementation is neither.)
Similar questions I found:
Find all points in sphere of radius r around arbitrary coordinate - this is actually a much more general case than what I'm looking for. I am only dealing with periodic lattices and my sphere is always centered at 0, coinciding with one point on the lattice.
But I don't have a list of points but rather a matrix of vectors with which I can generate all the points.
How to efficiently enumerate all points of sphere in n-dimensional grid - the case for a completely regular hypercubic lattice and the Manhattan-distance. I'm looking for completely arbitrary lattices and euclidean distance (or, for efficiency purposes, obviously the square of that).
Offhand, without proving any assertions, I think that 1) if the set of vectors is not of maximal rank then the number of solutions is infinite; 2) if the set is of maximal rank, then the image of the linear transformation generated by the vectors is a subspace (e.g., plane) of the target space, which intersects the sphere in a lower-dimensional sphere; 3) it follows that you can reduce the problem to a 1-1 linear transformation (kxk matrix on a k-dimensional space); 4) since the matrix is invertible, you can "pull back" the sphere to an ellipsoid in the space containing the lattice points, and as a bonus you get a nice geometric description of the ellipsoid (principal axis theorem); 5) your problem now becomes exactly one of determining the lattice points inside the ellipsoid.
The latter problem is related to an old problem (counting the lattice points inside an ellipse) which was considered by Gauss, who derived a good approximation. Determining the lattice points inside an ellipse(oid) is probably not such a tidy problem, but it probably can be reduced one dimension at a time (the cross-section of an ellipsoid and a plane is another ellipsoid).
I found a method that makes me a lot happier for now. There may still be possible improvements, so if you have a better method, or find an error in this code, definitely please share. Though here is what I have for now: (all written in SciLab)
Step 1: Figure out the maximal ranges as defined by a bounding n-parallelotope aligned with the axes of the lattice vectors. Thanks for ElKamina's vague suggestion as well as this reply to another of my questions over on by chappers:
function I=findMaxComponents(A,r) //given a matrix A of lattice basis vectors
//and a sphere radius r,
//find the corners of the bounding parallelotope
//built from the lattice, and store it in I.
[dims,vecs]=size(A); //figure out how many vectors there are in A (and, unnecessarily, how long they are)
U=eye(vecs,vecs); //builds matching unit matrix
iATA=pinv(A'*A); //finds the (pseudo-)inverse of A^T A
iAT=pinv(A'); //finds the (pseudo-)inverse of A^T
I=[]; //initializes I as an empty vector
for i=1:vecs //for each lattice vector,
    t=r*(iATA*U(:,i))/norm(iAT*U(:,i)); //find the maximum component such that
    //it fits in the bounding n-parallelotope
    //of a (n-1)-sphere of radius r
    I=[I,t(i)]; //and append it to I
end
I=[-I;I]; //also append the minima (by symmetry, the negative maxima)
endfunction
In my question I only asked for a general basis, i.e, for n dimensions, a set of n arbitrary but linearly independent vectors. The above code, by virtue of using the pseudo-inverse, works for matrices of arbitrary shapes and, similarly, Scilab's "A'" returns the conjugate transpose rather than just the transpose of A so it equally should work for complex matrices.
In the last step I put the corresponding minimal components.
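For a square, invertible A, the expression for t(i) can be checked with a short Lagrange-multiplier argument. This is only a sketch; G (the Gram matrix) and e_i (the i-th unit vector, i.e. U(:,i)) are names introduced here. Maximizing the i-th coefficient over all real coefficient vectors k whose image A k stays inside the sphere gives:

```latex
\max_{k}\; e_i^{T} k
\quad \text{subject to} \quad k^{T} G k \le r^{2},
\qquad G = A^{T} A
\;\Longrightarrow\;
k^{*} = \frac{r\, G^{-1} e_i}{\sqrt{e_i^{T} G^{-1} e_i}},
\qquad
k^{*}_{i} = r \sqrt{\left(G^{-1}\right)_{ii}} .
```

Since norm(iAT*U(:,i)) equals the square root of the i-th diagonal entry of the inverse of G in this case, t is exactly k* and t(i) is the largest coefficient the i-th lattice vector can take.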
For one such A as an example, this gives me the following in Scilab's console:
A =
0.9701425 - 0.2425356 0.
0.2425356 0.4850713 0.7276069
0.2425356 0.7276069 - 0.2425356
I =
- 2.9494438 - 3.4186986 - 4.0826424
2.9494438 3.4186986 4.0826424
I =
- 2. - 3. - 4.
2. 3. 4.
The values found by findMaxComponents are the largest possible coefficients of each lattice vector such that a linear combination with that coefficient exists which still lands on the sphere. Since I'm looking for the largest such combinations with integer coefficients, I can safely drop the part after the decimal point to get the maximal plausible integer ranges. So for the given matrix A, I'll have to go from -2 to 2 in the first component, from -3 to 3 in the second and from -4 to 4 in the third, and I'm sure to land on all the points inside the sphere (plus superfluous extra points, but importantly definitely every valid point inside). Next up:
Step 2: using the above information, generate all the candidate combinations.
function K=findAllCombinations(I) //takes a matrix of the form produced by
    //findMaxComponents() and returns a matrix
    //which lists all the integer linear combinations
    //in the respective ranges.
    v=I(1,:); //starting from the minimal vector
    K=[]; //output starts empty (initialization assumed)
    next=1; //keeps track of what component to advance next
    changed=%F; //keeps track of whether to add the vector to the output
    while or(v~=I(2,:)) //as long as not all components of v match all components of the maximum vector
        if v <= I(2,:) then //if each current component is at most the largest possible component
            if ~changed then
                K=[K;v]; //store the vector and
            end
            v(next)=v(next)+1; //advance the component by 1
            changed=%F; //(assumed) v is a fresh candidate again
            next=1; //also reset next to 1
        else
            v(1:next)=I(1,1:next); //reset all components smaller than or equal to the current one and
            next=next+1; //advance the next larger component next time
            changed=%T; //(assumed) the reset vector must not be stored a second time
        end
    end
    K=[K;I(2,:)]'; //while loop ends a single iteration early so add the maximal vector too
    //also transpose K to fit better with the other functions
endfunction
So now that I have that, all that remains is to check whether a given combination actually does lie inside or outside the sphere. All I gotta do for that is:
Step 3: Filter the combinations to find the actually valid lattice points
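function points=generatePoints(A,K,r)
    possiblePoints=A*K; //explicitly generates all the possible points
    points=[]; //start with an empty result
    for i=possiblePoints //i runs over the columns, i.e. the candidate points
        if i'*i<=r*r then //filter those that are too far from the origin
            points=[points i];
        end
    end
endfunction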
And I get all the combinations that actually do fit inside the sphere of radius r.
For the above example, the output is rather long: Of originally 315 possible points for a sphere of radius 3 I get 163 remaining points.
The first 4 are: (each column is one)
- 0.2425356 0.2425356 1.2126781 - 0.9701425
- 2.4253563 - 2.6678919 - 2.4253563 - 2.4253563
1.6977494 0. 0.2425356 0.4850713
so the remainder of the work is optimization. Presumably some of those loops could be made faster and especially as the number of dimensions goes up, I have to generate an awful lot of points which I have to discard, so maybe there is a better way than taking the bounding n-parallelotope of the n-1-sphere as a starting point.
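Steps 2 and 3 can be written compactly in a language with a Cartesian-product helper. The following Python sketch assumes the integer bounds from Step 1 are already known; the name lattice_points_in_ball and the argument layout (basis vectors as lists) are illustrative choices, not part of the Scilab code.

```python
from itertools import product

def lattice_points_in_ball(basis, bounds, r):
    """basis: list of basis vectors (each a list of coordinates);
    bounds: maximal integer coefficient per basis vector; r: ball radius.
    Returns (coefficients, point) pairs for all points with norm <= r."""
    dim = len(basis[0])
    ranges = [range(-b, b + 1) for b in bounds]
    points = []
    for coeffs in product(*ranges):
        # Integer linear combination of the basis vectors.
        point = [sum(c * vec[d] for c, vec in zip(coeffs, basis)) for d in range(dim)]
        # Compare squared lengths to avoid a square root.
        if sum(p * p for p in point) <= r * r:
            points.append((coeffs, point))
    return points
```

This still visits every candidate in the bounding parallelotope, so it has the same overhead as the approach above; it only removes the manual counter logic.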
Let us just represent K as X.
The problem can be represented as:
(a11x1 + a12x2..)^2 + (a21x1 + a22x2..)^2 ... < r^2
(x1,x2,...) will not form a sphere; in general the solution set is an ellipsoid.
This can be done with recursion on dimension--pick a lattice hyperplane direction and index all such hyperplanes that intersect the r-radius ball. The ball intersection of each such hyperplane itself is a ball, in one lower dimension. Repeat. Here's the calling function code in Octave:
function lat_points(lat_bas_mx,rr)
% **globals for hyperplane lattice point recursive function**
clear global; % this seems necessary/important between runs of this function
global MLB;
global NN_hat;
global NN_len;
global INP; % matrix of interior points, each point(vector) a column vector
global ctr; % integer counter, for keeping track of lattice point vectors added
% in the pre-allocated INP matrix; will finish iteration with actual # of points found
ctr = 0; % counts number of ball-interior lattice points found
MLB = lat_bas_mx;
ndim = size(MLB)(1);
% **create hyperplane normal vectors for recursion step**
% given full-rank lattice basis matrix MLB (each vector in lattice basis a column),
% form set of normal vectors between successive, nested lattice hyperplanes;
% store them as columnar unit normal vectors in NN_hat matrix and their lengths in NN_len vector
NN_hat = [];
for jj=1:ndim-1
    tmp_mx = MLB(:,jj+1:ndim);
    tmp_mx = [NN_hat(:,1:jj-1),tmp_mx];
    NN_hat(:,jj) = null(tmp_mx'); % null space of transpose = orthogonal to columns
    tmp_len = norm(NN_hat(:,jj));
    NN_hat(:,jj) = NN_hat(:,jj)/tmp_len;
    NN_len(jj) = dot(MLB(:,jj),NN_hat(:,jj));
    if (NN_len(jj)<0) % NN_hat(:,jj) and MLB(:,jj) must have positive dot product
        % for cutting hyperplane indexing to work correctly
        NN_hat(:,jj) = -NN_hat(:,jj);
        NN_len(jj) = -NN_len(jj);
    end
end
NN_len(ndim) = norm(MLB(:,ndim));
NN_hat(:,ndim) = MLB(:,ndim)/NN_len(ndim); % the lowest recursion level normal
% is just the last lattice basis vector
% **estimate number of interior lattice points, and pre-allocate memory for INP**
vol_ppl = prod(NN_len); % the volume of the ndim dimensional lattice paralellepiped
% is just the product of the NN_len's (they amount to the nested altitudes
% of hyperplane "paralellepipeds")
vol_bll = exp( (ndim/2)*log(pi) + ndim*log(rr) - gammaln(ndim/2+1) ); % volume of ndim ball, radius rr
est_num_pts = ceil(vol_bll/vol_ppl); % estimated number of lattice points in the ball
err_fac = 1.1; % error factor for memory pre-allocation--assume max of err_fac*est_num_pts columns required in INP
INP = zeros(ndim,ceil(err_fac*est_num_pts));
% **call the (recursive) function**
% for output, global variable INP (matrix of interior points)
% stores each valid lattice point (as a column vector)
clp = zeros(ndim,1); % confirmed lattice point (start at origin)
bpt = zeros(ndim,1); % point at center of ball (initially, at origin)
rd = 1; % initial recursion depth must always be 1
hyp_fun(clp,bpt,rr,ndim,rd); % start the recursion at the outermost level
printf("%i lattice points found\n",ctr);
INP = INP(:,1:ctr); % trim excess zeros from pre-allocation (if any)
Regarding the NN_len(jj)*NN_hat(:,jj) vectors--they can be viewed as successive (nested) altitudes in the ndim-dimensional "parallelepiped" formed by the vectors in the lattice basis, MLB. The volume of the lattice basis parallelepiped is just prod(NN_len)--for a quick estimate of the number of interior lattice points, divide the volume of the ndim-ball of radius rr by prod(NN_len). Here's the recursive function code:
function hyp_fun(clp,bpt,rr,ndim,rd)
% clp = the lattice point we're entering this lattice hyperplane with
% bpt = location of center of ball in this hyperplane
% rr = radius of ball
% rd = recursion depth--from 1 to ndim
global MLB;
global NN_hat;
global NN_len;
global INP;
global ctr;
% hyperplane intersection detection step
nml_hat = NN_hat(:,rd);
nh_comp = dot(clp-bpt,nml_hat);
ix_hi = floor((rr-nh_comp)/NN_len(rd));
ix_lo = ceil((-rr-nh_comp)/NN_len(rd));
if (ix_hi<ix_lo)
    return % no hyperplane intersections detected w/ ball;
    % get out of this recursion level
end
hp_ix = [ix_lo:ix_hi]; % indices are created wrt the received reference point
hp_ln = length(hp_ix);
% loop through detected hyperplanes (updated)
if (rd<ndim)
    bpt_new_mx = bpt*ones(1,hp_ln) + NN_len(rd)*nml_hat*hp_ix; % an ndim by length(hp_ix) matrix
    clp_new_mx = clp*ones(1,hp_ln) + MLB(:,rd)*hp_ix; % an ndim by length(hp_ix) matrix
    dd_vec = nh_comp + NN_len(rd)*hp_ix; % a length(hp_ix) row vector
    rr_new_vec = sqrt(rr^2-dd_vec.^2);
    for jj=1:hp_ln
        % recurse into each intersecting hyperplane, one dimension lower
        hyp_fun(clp_new_mx(:,jj),bpt_new_mx(:,jj),rr_new_vec(jj),ndim,rd+1);
    end
else % rd=ndim--so at deepest level of recursion; record the points on the given 1-dim
    % "lattice line" that are inside the ball
    INP(:,ctr+1:ctr+hp_ln) = clp + MLB(:,rd)*hp_ix;
    ctr += hp_ln;
end
This has some Octave-y/Matlab-y things in it, but most should be easily understandable; M(:,jj) references column jj of matrix M; the tic ' means take transpose; [A B] concatenates matrices A and B; A=[] declares an empty matrix.
Updated / better optimized from original answer:
"vectorized" the code in the recursive function, to avoid most "for" loops (those slowed it down a factor of ~10; the code now is a bit more difficult to understand though)
pre-allocated memory for the INP matrix-of-interior points (this speeded it up by another order of magnitude; before that, Octave was having to resize the INP matrix for every call to the innermost recursion level--for large matrices/arrays that can really slow things down)
Because this routine was part of a project, I also coded it in Python. From informal testing, the Python version is another 2-3 times faster than this (Octave) version.
For reference, here is the old, much slower code in the original posting of this answer:
% (OLD slower code, using for loops, and constantly resizing
% the INP matrix) loop through detected hyperplanes
if (rd<ndim)
    for jj=1:length(hp_ix)
        bpt_new = bpt + hp_ix(jj)*NN_len(rd)*nml_hat;
        clp_new = clp + hp_ix(jj)*MLB(:,rd);
        dd = nh_comp + hp_ix(jj)*NN_len(rd);
        rr_new = sqrt(rr^2-dd^2);
        hyp_fun(clp_new,bpt_new,rr_new,ndim,rd+1); % recurse one dimension lower
    end
else % rd=ndim--so at deepest level of recursion; record the points on the given 1-dim
    % "lattice line" that are inside the ball
    for jj=1:length(hp_ix)
        clp_new = clp + hp_ix(jj)*MLB(:,rd);
        INP = [INP clp_new];
    end
end
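The same recursion on dimension can also be phrased with a Cholesky factor of the Gram matrix, in the spirit of Fincke-Pohst enumeration. The Python sketch below is a different formulation from the Octave code above, not a translation of it; the function name, the eps tolerance and the choice to return coefficient tuples are assumptions made for the example.

```python
import math

def enumerate_lattice_points(basis, r):
    """Return integer coefficient tuples k with |sum_i k[i] * basis[i]| <= r.

    basis: list of n linearly independent vectors in n dimensions.
    """
    n = len(basis)
    # Gram matrix G[i][j] = <b_i, b_j>.
    gram = [[sum(x * y for x, y in zip(bi, bj)) for bj in basis] for bi in basis]
    # Upper triangular Cholesky factor R with G = R^T R, so |A k|^2 = |R k|^2.
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = gram[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = math.sqrt(s) if i == j else s / R[i][i]
    results = []
    coeffs = [0] * n
    eps = 1e-9  # tolerance so points lying exactly on the sphere are kept

    def descend(i, budget):
        # Row i of R k is R[i][i] * k_i + offset, where offset depends
        # only on the coefficients already fixed (indices above i).
        offset = sum(R[i][j] * coeffs[j] for j in range(i + 1, n))
        half = math.sqrt(budget)
        lo = math.ceil((-half - offset) / R[i][i] - eps)
        hi = math.floor((half - offset) / R[i][i] + eps)
        for k in range(lo, hi + 1):
            coeffs[i] = k
            rest = budget - (R[i][i] * k + offset) ** 2
            if i == 0:
                results.append(tuple(coeffs))
            else:
                descend(i - 1, max(rest, 0.0))
        coeffs[i] = 0

    descend(n - 1, r * r)
    return results
```

Each level fixes one coefficient and passes the remaining squared radius down, so no candidate outside the ball is generated except through the small eps tolerance.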

Algorithm for calculating the area of a region in a grid of squares

I'm working on a game which uses a tilemap. Squares on the map can either be walls or they can be empty. The algorithm I'm trying to develop should take a point on the map and return the number of cells that can be reached from that point (which is equal to the area of the sector containing the point).
Let the function which carries out the algorithm take an x-coordinate, a y-coordinate and a map in the form of a 2D array.
function sectorArea(x_coord,y_coord,map) { ... }
Say the map looks like this (where 1's represent walls):
map = [0,0,1,0,0,0],
Then sectorArea(0,0,map) == 4 and sectorArea(4,0,map) == 15.
My naive implementation is recursive. The target cell is passed to the go function, which then recurses on any adjacent cells which are empty - eventually spreading across all empty cells in the sector. It runs too slowly and reaches the call stack limit very quickly:
function sectorArea(x_coord,y_coord,map) {
    // First convert the map into an array of objects of the form:
    // { value: 0 or 1,
    //   visited: false }
    objMap = convertMap(map);
    // The recursive function:
    function go(x,y) {
        if ( outOfBounds(x) || outOfBounds(y) ||
             objMap[y][x].value == 1 || objMap[y][x].visited )
            return 0;
        objMap[y][x].visited = true;
        return 1 + go(x+1,y) + go(x-1,y) + go(x,y+1) + go(x,y-1);
    }
    return go(x_coord,y_coord);
}
Could anyone suggest a better algorithm? A non-deterministic solution would actually be fine if it is sufficiently accurate, as speed is the main issue (the algorithm could be called 3 or 4 times on different points during a single tick).
Maybe you can speed up the algorithm itself. Wikipedia suggests that a scanline approach is efficient.
As for the repeated calls: You can cache the results so that you don't have to run the area calculation again every time.
An approach might be to keep a region map of integers alongside your tiles. This denotes several regions, where a special value, -1 for example, means no region. (This region map also serves as your visited attribute.) In addition to that, keep a (short) array of regions with their areas.
In your example above:
When you calculate the area of (0, 0), you will assign 0 to the four tiles in the northwest corner. You will also append the area, 4, to the area array.
When you calculate the area of (0, 1), you notice that the region map for that coordinate has a value of zero, not -1. That means that the area was already calculated.
When you calculate the area of (4, 4), you find -1 in the region map. That means that the region hasn't been calculated yet. Do that, mark the region with 1 and append the new area, 15, to the area array.
I don't know how often the board changes. When you must recalculate the regions, blank out the region map and empty the array list.
The region map is only created once, it isn't recreated for every tick. (I can see this as a potential bottleneck in your code, when the objMap is frequently recreated instead of just being overwritten.)
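A sketch of the region-map idea, written in Python for brevity rather than in the question's JavaScript. The names make_sector_area, region_of and areas are made up for this example, the map is assumed to be a list of rows with 0 for empty and 1 for wall, and an explicit stack replaces the recursion so that the call stack limit is not an issue.

```python
def make_sector_area(tile_map):
    """Return a function area(x, y) that caches results in a region map."""
    height, width = len(tile_map), len(tile_map[0])
    region_of = [[-1] * width for _ in range(height)]  # -1 means "no region yet"
    areas = []  # areas[region_id] is the number of cells in that region

    def area(x, y):
        if tile_map[y][x] == 1:
            return 0  # walls belong to no sector
        if region_of[y][x] != -1:
            return areas[region_of[y][x]]  # already calculated earlier
        region_id = len(areas)
        region_of[y][x] = region_id
        stack = [(x, y)]
        count = 0
        while stack:
            cx, cy = stack.pop()
            count += 1
            for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if (0 <= nx < width and 0 <= ny < height
                        and tile_map[ny][nx] == 0 and region_of[ny][nx] == -1):
                    # Mark when pushing so that no cell is pushed twice.
                    region_of[ny][nx] = region_id
                    stack.append((nx, ny))
        areas.append(count)
        return count

    return area
```

When the board changes, calling make_sector_area again on the new map discards the cached regions.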

Measuring the average thickness of traces in an image

Here's the problem: I have a number of binary images composed of traces of different thickness. Below there are two images to illustrate the problem:
First Image - size: 711 x 643 px
Second Image - size: 930 x 951 px
What I need is to measure the average thickness (in pixels) of the traces in the images. In fact, the average thickness of traces in an image is a somewhat subjective measure. So, what I need is a measure that has some correlation with the radius of the trace, as indicated in the figure below:
Since the measure doesn't need to be very precise, I am willing to trade precision for speed. In other words, speed is an important factor to the solution of this problem.
There might be intersections in the traces.
The trace thickness might not be constant, but an average measure is OK (even the maximum trace thickness is acceptable).
The trace will always be much longer than it is wide.
I'd suggest this algorithm:
Apply a distance transformation to the image, so that all background pixels are set to 0, all foreground pixels are set to the distance from the background
Find the local maxima in the distance transformed image. These are points in the middle of the lines. Put their pixel values (i.e. distances from the background) into a list
Calculate the median or average of that list
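The three steps above can be sketched without any image library. This Python example makes several assumptions: the image is a list of rows with 0 for background and 1 for trace, it contains at least one background and one trace pixel, and the city-block distance (computed by breadth-first search) stands in for a true Euclidean distance transform. The function name is made up for the example.

```python
from collections import deque
from statistics import median

def median_half_thickness(image):
    """Return the median distance-to-background over the local maxima
    of a city-block distance map of the trace pixels."""
    height, width = len(image), len(image[0])
    dist = [[0 if image[y][x] == 0 else None for x in range(width)]
            for y in range(height)]
    # Step 1: multi-source breadth-first search from all background pixels.
    queue = deque((x, y) for y in range(height) for x in range(width)
                  if image[y][x] == 0)
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((nx, ny))
    # Step 2: collect values that are not exceeded in their 3x3 neighbourhood.
    maxima = []
    for y in range(height):
        for x in range(width):
            d = dist[y][x]
            if not d:
                continue  # background pixel
            neighbours = [dist[j][i]
                          for j in range(max(y - 1, 0), min(y + 2, height))
                          for i in range(max(x - 1, 0), min(x + 2, width))]
            if all(n is not None and n <= d for n in neighbours):
                maxima.append(d)
    # Step 3: summarise.
    return median(maxima)
```

Twice the returned value approximates the thickness; for a band 5 pixels thick the function returns 3, because distances are counted up to the nearest background pixel.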
I was impressed by @nikie's answer, and gave it a try ...
I simplified the algorithm for just getting the maximum value, not the mean, so evading the local maxima detection algorithm. I think this is enough if the stroke is well-behaved (although for self intersecting lines it may not be accurate).
The program in Mathematica is:
m = Import[""] (* Get image from web*)
s = Abs[ImageData[m] - 1]; (* Invert colors to detect background *)
k = DistanceTransform[Image[s]] (* White Pxs converted to distance to black*)
k // ImageAdjust (* Show the image *)
Max[ImageData[k]] (* Get the max stroke width *)
The generated result is
The numerical value (28.46 px X 2) fits pretty well my measurement of 56 px (Although your value is 100px :* )
Edit - Implemented the full algorithm
Well ... sort of ... instead of searching the local maxima, finding the fixed point of the distance transformation. Almost, but not quite completely unlike the same thing :)
m = Import[""]; (*Get image from web*)
s = Abs[ImageData[m] - 1]; (*Invert colors to detect background*)
k = DistanceTransform[Image[s]]; (*White Pxs converted to distance to black*)
Print["Distance to Background*"]
k // ImageAdjust (*Show the image*)
Print["Local Maxima"]
weights =
Binarize[FixedPoint[ImageAdjust@DistanceTransform[Image[#], .4] &,s]]
Print["Stroke Width =",
2 Mean[Select[Flatten[ImageData[k]] Flatten[ImageData[weights]], # != 0 &]]]
As you may see, the result is very similar to the previous one, obtained with the simplified algorithm.
From Here. A simple method!
3.1 Estimating Pen Width
The pen thickness may be readily estimated from the area A and perimeter length L of the foreground
T = A/(L/2)
In essence, we have reshaped the foreground into a rectangle and measured the length of the longest side. Stronger modelling of the pen, for instance, as a disc yielding circular ends, might allow greater precision, but rasterisation error would compromise the significance.
While precision is not a major issue, we do need to consider bias and singularities.
We should therefore calculate area A and perimeter length L using functions which take into account "roundedness".
A = bwarea(.)
L = bwarea(bwperim(.; 8))
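A plain pixel-counting version of T = A/(L/2) might look as follows in Python. It is a rough sketch: it counts whole pixels instead of using the weighted estimates of bwarea, treats a foreground pixel as perimeter when one of its 4-neighbours is background or outside the image, and assumes the image (a list of rows of 0 and 1) contains at least one foreground pixel. The function name is made up for the example.

```python
def pen_width_estimate(image):
    """Return A / (L / 2) for a binary image given as rows of 0 and 1."""
    height, width = len(image), len(image[0])

    def is_background(x, y):
        # Pixels outside the image count as background.
        return not (0 <= x < width and 0 <= y < height) or image[y][x] == 0

    area = 0
    perimeter = 0
    for y in range(height):
        for x in range(width):
            if image[y][x] == 0:
                continue
            area += 1
            # A foreground pixel touching background is part of the perimeter.
            if any(is_background(nx, ny)
                   for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))):
                perimeter += 1
    return area / (perimeter / 2)
```

For a straight band 4 pixels thick and 20 pixels long this gives about 3.6, which shows the downward bias that the rounded measures mentioned above are meant to reduce.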
Since I don't have MATLAB at hand, I made a small program in Mathematica:
m = Binarize[Import[""]] (* Get Image *)
k = Binarize[MorphologicalPerimeter[m]] (* Get Perimeter *)
p = N[2 Count[ImageData[m], Except[1], 2]/
Count[ImageData[k], Except[0], 2]] (* Calculate *)
The output is 36 Px ...
Perimeter image follows
It's been 3 years since the question was asked :)
Following the procedure of @nikie, here is a MATLAB implementation of the stroke width.
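close all;
I = imread('3Zs7m.png');
X = im2bw(I,0.8);
% Assumed steps (the listing never defines RegionMax); dark traces on a light background:
Dist = bwdist(X);                 % distance of each trace pixel to the nearest background pixel
RegionMax = imregionalmax(Dist);  % local maxima lie along the middle of the trace
[x, y] = find(RegionMax ~= 0);
List = zeros(1, numel(x));
for i = 1:numel(x)
    List(i) = 2 * Dist(x(i), y(i)); % assumed loop body: width is twice the centre distance
end
fprintf('Stroke Width = %u \n',mean(List));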
Assuming that the trace has constant thickness, is much longer than it is wide, is not too strongly curved and has no intersections / crossings, I suggest an edge detection algorithm which also determines the direction of the edge, then a rise/fall detector with some trigonometry and a minimization algorithm. This gives you the minimal thickness across a relatively straight part of the curve.
I guess the error to be up to 25%.
First use an edge detector that gives us the information where an edge is and which direction (in 45° or PI/4 steps) it has. This is done by filtering with 4 different 3x3 matrices (Example).
Usually I'd say it's enough to scan the image horizontally, though you could also scan vertically or diagonally.
Assuming line-by-line (horizontal) scanning, once we find an edge, we check if it's a rise (going from background to trace color) or a fall (to background). If the edge's direction is at a right angle to the direction of scanning, skip it.
If you found one rise and one fall with the correct directions and without any disturbance in between, measure the distance from the rise to the fall. If the direction is diagonal, multiply by the square root of 2. Store this measure together with the coordinate data.
The algorithm must then search along an edge (can't find a web resource on that right now) for neighboring (by their coordinates) measurements. If there is a local minimum with a padding of maybe 4 to 5 size units to each side (a value to play with - larger: less information, smaller: more noise), this measure qualifies as a candidate. This is to ensure that the ends of the trail or a section bent too much are not taken into account.
The minimum of that would be the measurement. Plausibility check: If the trace is not too tangled, there should be a lot of values in that area.
Please comment if there are more questions. :-)
Here is an answer that works in any computer language without the need of special functions...
Basic idea: Try to fit a circle into the black areas of the image. If you can, try with a bigger circle.
set image background = 0 and trace = 1
initialize array result[]
set minimalExpectedWidth
set w = minimalExpectedWidth
do
    set counter = 0
    create a matrix of zeros size w x w
    within a circle of diameter w in that matrix, put ones
    calculate area of the circle as the number of ones in the matrix (roughly PI * (w/2)^2)
    loop through all pixels of the image
        optimization: if current pixel is of background color -> continue loop
        multiply the matrix with the image at each pixel (e.g. filtering the image with that matrix)
        (you can do this using the current x and y position and a double for loop from 0 to w)
        take the sum of the result of each multiplication
        if the sum equals the calculated circle's area, increment counter by one
    store counter in result[w - minimalExpectedWidth]
    increment w by one
    optimization: include algorithm from further down here
while counter is greater than zero
Now the result array contains the number of matches for each tested width.
Graph it to have a look at it.
For a width of one this will be equal to the number of pixels of trace color. For a greater width value, fewer circle areas will fit into the trace. The result array will thus steadily decrease until there is a sudden drop. This is because the filter matrix with the circular area of that width now only fits into intersections.
Right before the drop is the width of your trace. If the width is not constant, the drop will not be that sudden.
I don't have MATLAB here for testing and don't know for sure about a function to detect this sudden drop, but we do know that the decrease is continuous, so I'd take the maximum of the second derivative of the (zero-based) result array like this
set maximum = 0
set widthFound = 0
set minimalExpectedWidth as above
set prevvalue = result[0]
set index = 1
set prevFirstDerivative = result[1] - prevvalue
loop while index is less than result length
    set firstDerivative = result[index] - prevvalue
    set secondDerivative = firstDerivative - prevFirstDerivative
    if absolute value of secondDerivative > maximum
        maximum = absolute value of secondDerivative
        widthFound = index + minimalExpectedWidth
    prevFirstDerivative = firstDerivative
    prevvalue = result[index]
    increment index by one
return widthFound
Now widthFound is the trace width for which (in relation to width + 1) many more matches were found.
I know that this is in part covered in some of the other answers, but my description is pretty much straightforward and you don't have to have learned image processing to do it.
I have an interesting solution:
Do edge detection, for edge pixels extraction.
Do physical simulation - consider edge pixels as positively charged particles.
Now put some number of free positively charged particles in the stroke area.
Calculate electrical force equations for determining movement of these free particles.
Simulate particles movement for some time until particles reach position equilibrium.
(As they will repel from both stroke edges, after some time they will stay in the middle line of the stroke)
Now stroke thickness/2 would be average distance from edge particle to nearest free particle.
