Extraction of contour positions and orientations of an image - image

I'm basically following a paper, " Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system".
Here the author has extracted a vector of 9 features from each sliding window. quoting the paper:
The first three features are the weight of the window, its centre of
gravity and the second order moment of the window.
Features four and five define the position of the upper and lower
contour in the window, features six and seven give the orientation of
the upper and lower contour by the gradient of the contour at the
windows position, feature eight gives the number of black to white
transitions in vertical direction, while feature nine gives the number
of black pixels between the upper and lower contour.
I managed to calculate the first three features the paper is talking about, but I seem to have trouble understanding the features 4,5,6,7,8.
I can calculate the contour of an image. Suppose, this is a window of one of the text lines(windows is of length 14 pixels, as suggested by paper):
And this is the extracted contour of the image:
So what exactly is the upper and lower contour here? from where can I consider the limits, if it refers to the top and bottom pixels then I could have extracted those without contour extraction? Similarly the orientation of these contours is equally confusing.
I would really appreciate some guidance here.

I gave a look at the paper, and I am pretty sure that "upper" and "lower" should be read as "uppest" and "lowest". This especially makes sense as the authors have a special focus on the preprocessing of their data that they normalize in both the horizontal and vertical directions. They take care to have a kind of robustness to scale, writing angle,...
I guess that features 4 and 5 can be the extremal ordinates of the contours, which, combined with features 6 & 7 which are the gradients = orientations, give a good idea of the shape of these parts of the contour.
Feature 9, will be mostly useful to make the difference between letters that can have similar vertical shapes I guess, such as "i", "l", "j".
This is my understanding. Hope this helps!

Related

Painting stroke generation algorithm for robot arm

I am writing a code that generate start and end points of strokes of a picture (Raster images) to let robot arm paint.
I have wrote an algorithm but with too many overlapping strokes:
https://github.com/Evrid/Painting-stroke-generation-for-robot-arm-or-CNC-machine
The input of my algorithm:
and the output (which is mirrored and re-assigned to the colors I have) with 50 ThresholdOfError (you can see the strokes are overlapping):
Things to notice are:
*The strokes needs to be none overlapping (if overlapping then have too many strokes)
*Painting have different colors, the same color better draw together
*The stroke size is like rectangles
*Some coloring area are disconnected, like below only yellow from a sun flower:
I am not sure which algorithm should I use, here is some possible ones I have thought about:
Method 1.Generate 50k (or more) random direction and position large size rectangles, if its area overlap the same color area and not overlapping other rectangles, then keep it, then decrease generated rectangle size and after a couple rounds keep decreasing again
Method 2.Extract certain color first then generate random direction and position large size rectangles (we have less area and calculation time)
Method 3.Do edge detection first, then rectangles are generated with direction along the edge, if its area overlap the same color area and not overlapping other rectangles, then keep it, then decrease generated rectangle size and after a couple rounds keep decreasing again
Method 4: Generate random circle, let the pen draw points instead (but may result too many points)
Any suggestions about which algorithm I should use?
I would start with:
Quantize your image to your palette
so reduce colors to your palette first see:
Effective gif/image color quantization?
Converting BMP image to set of instructions for a plotter?
segmentate your image by similar colors
for this you can use flood fill or growth fill to create labels (region index) in form of ROI
see Fracture detection in hand using image proccessing
for each ROI create infill path with thick brush
this is simple hatching you do this by generating zig zag like path with "big" brush width in major direction of ROI so use either AABB or OBB or PCA to detect major direction (direction with biggest size of ROI) and just AND it with polygon ROI
for each ROI create outline path with "thin" brush
IIRC this is also called contour extraction, simply select boundary pixels of selected ROI
then you can use A* on ROI boundary to sort the pixels into 2 halves (or more if complex shape with holes or thin parts) so backtrack the pixels and then reorder them to form a closed loop(s)
this will preserve details on boundary (while using infill with thick brush)
Something like this:
In case your colors are combinable you can use CMY color space and Substractive color mixing and process each C,M,Y channel separately (max 3 overlapping strokes) to have much better color match.
If you want much better colors you can also add dithering however that will slow down the painting a lot as it requires much much more path segments and its not optimal for plotter with tool up/down movement (they are better for printing heads or printing triggered without additional movements ...). To partially overcome this issue you could use partial dithering where you can specify the amount of dithering created (leading to less segments)
there are a lot of things you can improve/add to this like:
remove outline from ROI (to limit the overlaps and prevent details overpaint)
do all infills first and then all outlines
set infill brush width based on ROI size
adjust infill hatching pattern to better match your arm kinematics
order ROIs so they painted faster (variation of Traveling Sailsman problem TSP)
infill with more than just one brush width to preserve details near borders
Suggest you use the flood fill algorithm.
Start at top right pixel.
Flood fill that pixel color. https://en.wikipedia.org/wiki/Flood_fill
Fit rectangles into the filled area.
Move onto the next pixel that is not in the filled area.
When the entire picture has been covered, sort the rectangles by color.

Algorithms: Ellipse matching

I have many images like the following (only white and black):
My final problem is to find well matching ellipses. Unfortunately the real used images are not always that nice like this. They could be deformed a bit, which makes ellipse matching probably harder.
My idea is to find "break points". I markes them in the following picture:
Maybe these points could help to make a matching for the ellipses. The end result should be something like this:
Has someone an idea what algorithm may be used to find these break points? Or even better to make good ellipse matching?
Thank you very much
Sample the circumference points
Just scan your image and select All Black pixels with any White neighbor. You can do this by recoloring the remaining black pixels to any unused color (Blue).
After whole image is done you can recolor the inside back from unused color (Blue) to white.
form a list of ordered circumference points per cluster/ellipse
Just scan your image and find first black pixel. Then use A* to order the circumference points and store the path in some array or list pnt[] and handle it as circular array.
Find the "break points"
They can be detect by peak in the angle between neighbors of found points. something like
float a0=atan2(pnt[i].y-pnt[i-1].y,pnt[i].x-pnt[i-1].x);
float a1=atan2(pnt[i+1].y-pnt[i].y,pnt[i+1].x-pnt[i].x);
float da=fabs(a0-a1); if (da>M_PI) da=2.0*M_PI-da;
if (da>treshold) pnt[i] is break point;
or use the fact that on break point the slope angle delta change sign:
float a1=atan2(pnt[i-1].y-pnt[i-2].y,pnt[i-1].x-pnt[i-2].x);
float a1=atan2(pnt[i ].y-pnt[i-1].y,pnt[i ].x-pnt[i-1].x);
float a2=atan2(pnt[i+1].y-pnt[i ].y,pnt[i+1].x-pnt[i ].x);
float da0=a1-a0; if (da0>M_PI) da0=2.0*M_PI-da0; if (da0<-M_PI) da0=2.0*M_PI+da0;
float da1=a2-a1; if (da1>M_PI) da1=2.0*M_PI-da1; if (da1<-M_PI) da1=2.0*M_PI+da1;
if (da0*da1<0.0) pnt[i] is break point;
fit ellipses
so if no break points found you can fit the entire pnt[] as single ellipse. For example Find bounding box. It's center is center of ellipse and its size gives you semi-axises.
If break points found then first find the bounding box of whole pnt[] to obtain limits for semi-axises and center position area search. Then divide the pnt[] to parts between break points. Handle each part as separate part of ellipse and fit.
After all the pnt[] parts are fitted check if some ellipses are not the same for example if they are overlapped by another ellipse the they would be divided... So merge the identical ones (or average to enhance precision). Then recolor all pnt[i] points to white, clear the pnt[] list and loop #2 until no more black pixel is found.
how to fit ellipse from selection of points?
algebraically
use ellipse equation with "evenly" dispersed known points to form system of equations to compute ellipse parameters (x0,y0,rx,ry,angle).
geometrically
for example if you detect slope 0,90,180 or 270 degrees then you are at semi-axis intersection with circumference. So if you got two such points (one for each semi-axis) that is all you need for fitting (if it is axis-aligned ellipse).
for non-axis-aligned ellipses you need to have big enough portion of the circumference available. You can exploit the fact that center of bounding box is also the center of ellipse. So if you got the whole ellipse you know also the center. The semi-axises intersections with circumference can be detected with biggest and smallest tangent change. If you got center and two points its all you need. In case you got only partial center (only x, or y coordinate) you can combine with more axis points (find 3 or 4)... or approximate the missing info.
Also the half H,V lines axis is intersecting ellipse center so it can be used to detect it if not whole ellipse in the pnt[] list.
approximation search
You can loop through "all" possible combination of ellipse parameters within limits found in #4 and select the one that is closest to your points. That would be insanely slow of coarse so use binary search like approach something like mine approx class. Also see
Curve fitting with y points on repeated x positions (Galaxy Spiral arms)
on how it is used for similar fit to yours.
hybrid
You can combine geometrical and approximation approach. First compute what you can by geometrical approach. And then compute the rest with approximation search. you can also increase precision of the found values.
In rare case when two ellipses are merged without break point the fitted ellipse will not match your points. So if such case detected you have to subdivide the used points into groups until their fits matches ...
This is what I have in mind with this:
You probably need something like this:
https://en.wikipedia.org/wiki/Circle_Hough_Transform
Your edge points are simply black pixels with at least one white 4-neighbor.
Unfortunately, though, you say that your ellipses may be “tilted”. Generic ellipses are described by quadratic equations like
x² + Ay² + Bxy + Cx + Dy + E = 0
with B² < 4A (⇒ A > 0). This means that, compared to the circle problem, you don't have 3 dimensions but 5. This causes the Hough transform to be considerably harder. Luckily, your example suggests that you don't need a high resolution.
See also: algorithm for detecting a circle in an image
EDIT
The above idea for an algorithm was too optimistic, at least if applied in a straightforward way. The good news is that it seems that two smart guys (Yonghong Xie and Qiang Ji) have already done the homework for us:
https://www.ecse.rpi.edu/~cvrl/Publication/pdf/Xie2002.pdf
I'm not sure I would create my own algorithm. Why not leverage the work other teams have done to figure out all that curve fitting of bitmaps?
INKSCAPE (App Link)
Inkscape is an open source tool which specializes in vector graphics editing with some ability to work with raster (bitmap) parts too.
Here is a link to a starting point for Inkscape's API:
http://wiki.inkscape.org/wiki/index.php/Script_extensions
It looks like you can script within Inkscape, or access Inkscape via external scripts.
You also may be able to do something with zero scripting, from the inkscape command line interface:
http://wiki.inkscape.org/wiki/index.php/Frequently_asked_questions#Can_Inkscape_be_used_from_the_command_line.3F
COREL DRAW (App Link)
Corel Draw is recognized as the premier industry solution for vector graphics, and has some great tools for converting rasterized images into vector images.
Here's a link to their API:
https://community.coreldraw.com/sdk/api
Here's a link to Corel Draw batch image processing (non-script solution):
http://howto.corel.com/en/c/Automating_tasks_and_batch-processing_images_in_Corel_PHOTO-PAINT

How to find this kind of geometry in images

Suppose I have an image of a scene as depicted above. A sort of a pole with a blob on it next to possibly similar objects with no blobs.
How can I find the blob marked by the red circle (a binary image indicating which pixels belong to the blob).
Note that the pole together with the blob may be rotated arbitrarily and also size may vary.
Can you try to do it in below 4 steps?
Circle detection like: writing robust (color and size invariant) circle detection with opencv (based on Hough transform or other features)
Line detection, like: Finding location of rectangles in an image with OpenCV
Identify rectangle position by combining neighboring lines (For each line segment you have the start and end point position, you also know the direction of each line segment. So that you can figure out if two connecting line segments (whose endpoints are close) are orthogonal. Your goal is to find 3 such segments for each rectangle.)
Check the relative position of each circle and rectangle to see if any pair can form the knob shape.
One approach could be using Viola-Jones object detection framework.
Though the framework is mostly used for face detection - it is actually designed for generic objects you feed to the algorithm.
The algorithm basic idea is to feed samples of "good object" (what you are looking for) and "bad objects" to a machine learning algorithm - which generates patterns from the images as its features.
During Classification - using a sliding window the algorithm will search for a "match" to the object (the classifier returned a positive answer).
The algorithm uses supervised learning and thus requires a labeled set of examples (both positive and negative ones)
I'm sure there is some boundary-map algorithm in image processing to do this.
Otherwise, here is a quick fix: pick a pixel at the center of the
"undiscovered zone", which initially is the whole image.
trace the horizantal and vertical lines at 4 directions each ending at the
borders of the zone and find the value changes from 0 to 1 or the vice verse.
Trace each such value switch and complete the boundary of each figure (Step-A).
Do the same for the zones
that still are undiscovered: start at some center
point and skim thru the lines connecting the center to the image border or to a
pixel at the boundary of a known zone.
In Step-A, you can also check to see whether the boundary you traced is
a line or a curve. Whenever it is a curve, you need only two points on it--
points at some distance from one another for the accuracy of the calculation.
The lines perpendicular each to these two points of tangency
intersect at the center of the circle red in your figure.
You can segment the image. Then use only the pixels in the segments to contribute to a Hough-transform to find the circles.
Then you will only have segments with circle in them. You can use a modified hough transform to find rectangles. The 'best' rectangle and square combination will then be your match. This is very computationally intentsive.
Another approach, if you already have these binary pictures, is to transform to a (for example 256 bin) sample by taking the distance to the centroid compared to the distance travelled along the edge. If you start at the point furthest away from the centroid you have a fairly rotational robust featurevector.

Algorithm to Calculate Symmetry of Points

Given a set of 2D points, I want to calculate a measure of how horizontally symmetrical and vertically symmetrical those points are.
Alternatively, for each set of points I will also have a rasterised image of the lines between those points, so is there any way to calculate a measure of symmetry for images?
BTW, this is for use in a feature vector that will be presented to a neural network.
Clarification
The image on the left is 'horizontally' symmetrical. If we imagine a vertical line running down the middle of it, the left and right parts are symmetrical. Likewise, the image on the right is 'vertically' symmetrical, if you imagine a horizontal line running across its center.
What I want is a measure of just how horizontally symmetrical they are, and another of just how vertically symmetrical they are.
This is just a guideline / idea, you'll need to work out the details:
To detect symmetry with respect to horizontal reflection:
reflect the image horizontally
pad the original (unreflected) image horizontally on both sides
compute the correlation of the padded and the reflected images
The position of the maximum in the result of the correlation will give you the location of the axis of symmetry. The value of the maximum will give you a measure of the symmetry, provided you do a suitable normalization first.
This will only work if your images are "symmetric enough", and it works for images only, not sets of points. But you can create an image from a set of points too.
Leonidas J. Guibas from Stanford University talked about it in ETVC'08.
Detection of Symmetries and Repeated Patterns in 3D Point Cloud Data.

Drawing a Topographical Map

I've been working on a visualization project for 2-dimensional continuous data. It's the kind of thing you could use to study elevation data or temperature patterns on a 2D map. At its core, it's really a way of flattening 3-dimensions into two-dimensions-plus-color. In my particular field of study, I'm not actually working with geographical elevation data, but it's a good metaphor, so I'll stick with it throughout this post.
Anyhow, at this point, I have a "continuous color" renderer that I'm very pleased with:
The gradient is the standard color-wheel, where red pixels indicate coordinates with high values, and violet pixels indicate low values.
The underlying data structure uses some very clever (if I do say so myself) interpolation algorithms to enable arbitrarily deep zooming into the details of the map.
At this point, I want to draw some topographical contour lines (using quadratic bezier curves), but I haven't been able to find any good literature describing efficient algorithms for finding those curves.
To give you an idea for what I'm thinking about, here's a poor-man's implementation (where the renderer just uses a black RGB value whenever it encounters a pixel that intersects a contour line):
There are several problems with this approach, though:
Areas of the graph with a steeper slope result in thinner (and often broken) topo lines. Ideally, all topo lines should be continuous.
Areas of the graph with a flatter slope result in wider topo lines (and often entire regions of blackness, especially at the outer perimeter of the rendering region).
So I'm looking at a vector-drawing approach for getting those nice, perfect 1-pixel-thick curves. The basic structure of the algorithm will have to include these steps:
At each discrete elevation where I want to draw a topo line, find a set of coordinates where the elevation at that coordinate is extremely close (given an arbitrary epsilon value) to the desired elevation.
Eliminate redundant points. For example, if three points are in a perfectly-straight line, then the center point is redundant, since it can be eliminated without changing the shape of the curve. Likewise, with bezier curves, it is often possible to eliminate cetain anchor points by adjusting the position of adjacent control points.
Assemble the remaining points into a sequence, such that each segment between two points approximates an elevation-neutral trajectory, and such that no two line segments ever cross paths. Each point-sequence must either create a closed polygon, or must intersect the bounding box of the rendering region.
For each vertex, find a pair of control points such that the resultant curve exhibits a minimum error, with respect to the redundant points eliminated in step #2.
Ensure that all features of the topography visible at the current rendering scale are represented by appropriate topo lines. For example, if the data contains a spike with high altitude, but with extremely small diameter, the topo lines should still be drawn. Vertical features should only be ignored if their feature diameter is smaller than the overall rendering granularity of the image.
But even under those constraints, I can still think of several different heuristics for finding the lines:
Find the high-point within the rendering bounding-box. From that high point, travel downhill along several different trajectories. Any time the traversal line crossest an elevation threshold, add that point to an elevation-specific bucket. When the traversal path reaches a local minimum, change course and travel uphill.
Perform a high-resolution traversal along the rectangular bounding-box of the rendering region. At each elevation threshold (and at inflection points, wherever the slope reverses direction), add those points to an elevation-specific bucket. After finishing the boundary traversal, start tracing inward from the boundary points in those buckets.
Scan the entire rendering region, taking an elevation measurement at a sparse regular interval. For each measurement, use it's proximity to an elevation threshold as a mechanism to decide whether or not to take an interpolated measurement of its neighbors. Using this technique would provide better guarantees of coverage across the whole rendering region, but it'd be difficult to assemble the resultant points into a sensible order for constructing paths.
So, those are some of my thoughts...
Before diving deep into an implementation, I wanted to see whether anyone else on StackOverflow has experience with this sort of problem and could provide pointers for an accurate and efficient implementation.
Edit:
I'm especially interested in the "Gradient" suggestion made by ellisbben. And my core data structure (ignoring some of the optimizing interpolation shortcuts) can be represented as the summation of a set of 2D gaussian functions, which is totally differentiable.
I suppose I'll need a data structure to represent a three-dimensional slope, and a function for calculating that slope vector for at arbitrary point. Off the top of my head, I don't know how to do that (though it seems like it ought to be easy), but if you have a link explaining the math, I'd be much obliged!
UPDATE:
Thanks to the excellent contributions by ellisbben and Azim, I can now calculate the contour angle for any arbitrary point in the field. Drawing the real topo lines will follow shortly!
Here are updated renderings, with and without the ghetto raster-based topo-renderer that I've been using. Each image includes a thousand random sample points, represented by red dots. The angle-of-contour at that point is represented by a white line. In certain cases, no slope could be measured at the given point (based on the granularity of interpolation), so the red dot occurs without a corresponding angle-of-contour line.
Enjoy!
(NOTE: These renderings use a different surface topography than the previous renderings -- since I randomly generate the data structures on each iteration, while I'm prototyping -- but the core rendering method is the same, so I'm sure you get the idea.)
Here's a fun fact: over on the right-hand-side of these renderings, you'll see a bunch of weird contour lines at perfect horizontal and vertical angles. These are artifacts of the interpolation process, which uses a grid of interpolators to reduce the number of computations (by about 500%) necessary to perform the core rendering operations. All of those weird contour lines occur on the boundary between two interpolator grid cells.
Luckily, those artifacts don't actually matter. Although the artifacts are detectable during slope calculation, the final renderer won't notice them, since it operates at a different bit depth.
UPDATE AGAIN:
Aaaaaaaand, as one final indulgence before I go to sleep, here's another pair of renderings, one in the old-school "continuous color" style, and one with 20,000 gradient samples. In this set of renderings, I've eliminated the red dot for point-samples, since it unnecessarily clutters the image.
Here, you can really see those interpolation artifacts that I referred to earlier, thanks to the grid-structure of the interpolator collection. I should emphasize that those artifacts will be completely invisible on the final contour rendering (since the difference in magnitude between any two adjacent interpolator cells is less than the bit depth of the rendered image).
Bon appetit!!
The gradient is a mathematical operator that may help you.
If you can turn your interpolation into a differentiable function, the gradient of the height will always point in the direction of steepest ascent. All curves of equal height are perpendicular to the gradient of height evaluated at that point.
Your idea about starting from the highest point is sensible, but might miss features if there is more than one local maximum.
I'd suggest
pick height values at which you will draw lines
create a bunch of points on a fine, regularly spaced grid, then walk each point in small steps in the gradient direction towards the nearest height at which you want to draw a line
create curves by stepping each point perpendicular to the gradient; eliminate excess points by killing a point when another curve comes too close to it-- but to avoid destroying the center of hourglass like figures, you might need to check the angle between the oriented vector perpendicular to the gradient for both of the points. (When I say oriented, I mean make sure that the angle between the gradient and the perpendicular value you calculate is always 90 degrees in the same direction.)
In response to your comment to #erickson and to answer the point about calculating the gradient of your function. Instead of calculating the derivatives of your 300 term function you could do a numeric differentiation as follows.
Given a point [x,y] in your image you could calculate the gradient (direction of steepest decent)
g={ ( f(x+dx,y)-f(x-dx,y) )/(2*dx),
{ ( f(x,y+dy)-f(x,y-dy) )/(2*dy)
where dx and dy could be the spacing in your grid. The contour line will run perpendicular to the gradient. So, to get the contour direction, c, we can multiply g=[v,w] by matrix, A=[0 -1, 1 0] giving
c = [-w,v]
Alternately, there is the marching squares algorithm which seems appropriate to your problem, although you may want to smooth the results if you use a coarse grid.
The topo curves you want to draw are isosurfaces of a scalar field over 2 dimensions. For isosurfaces in 3 dimensions, there is the marching cubes algorithm.
I've wanted something like this myself, but haven't found a vector-based solution.
A raster-based solution isn't that bad, though, especially if your data is raster-based. If your data is vector-based too (in other words, you have a 3D model of your surface), you should be able to do some real math to find the intersection curves with horizontal planes at varying elevations.
For a raster-based approach, I look at each pair of neighboring pixels. If one is above a contour level, and one is below, obviously a contour line runs between them. The trick I used to anti-alias the contour line is to mix the contour line color into both pixels, proportional to their closeness to the idealized contour line.
Maybe some examples will help. Suppose that the current pixel is at an "elevation" of 12 ft, a neighbor is at an elevation of 8 ft, and contour lines are every 10 ft. Then, there is a contour line half way between; paint the current pixel with the contour line color at 50% opacity. Another pixel is at 11 feet and has a neighbor at 6 feet. Color the current pixel at 80% opacity.
alpha = (contour - neighbor) / (current - neighbor)
Unfortunately, I don't have the code handy, and there might have been a bit more to it (I vaguely recall looking at diagonal neighbors too, and adjusting by sqrt(2) / 2). I hope this enough to give you the gist.
It occurred to me that what you're trying to do would be pretty easy to do in MATLAB, using the contour function. Doing things like making low-density approximations to your contours can probably be done with some fairly simple post-processing of the contours.
Fortunately, GNU Octave, a MATLAB clone, has implementations of the various contour plotting functions. You could look at that code for an algorithm and implementation that's almost certainly mathematically sound. Or, you might just be able to offload the processing to Octave. Check out the page on interfacing with other languages to see if that would be easier.
Disclosure: I haven't used Octave very much, and I haven't actually tested it's contour plotting. However, from my experience with MATLAB, I can say that it will give you almost everything you're asking for in just a few lines of code, provided you get your data into MATLAB.
Also, congratulations on making a very VanGough-esque slopefield plot.
I always check places like http://mathworld.wolfram.com before going to deep on my own :)
Maybe their curves section would help? Or maybe the entry on maps.
compare what you have rendered with a real-world topo map - they look identical to me! i wouldn't change a thing...
Write the data out as an HGT file (very simple digital elevation data format used by USGS) and use the free and open-source gdal_contour tool to create contours. That works very well for terrestrial maps, the constraint being that the data points are signed 16-bit numbers, which fits the earthly range of heights in metres very well, but may not be enough for your data, which I assume not to be a map of actual terrain - although you do mention terrain maps.
I recommend the CONREC approach:
Create an empty line segment list
Split your data into regular grid squares
For each grid square, split the square into 4 component triangles:
For each triangle, handle the cases (a through j):
If a line segment crosses one of the cases:
Calculate its endpoints
Store the line segment in the list
Draw each line segment in the line segment list
If the lines are too jagged, use a smaller grid. If the lines are smooth enough and the algorithm is taking too long, use a larger grid.

Resources