How to reduce the number of points in a curve while preserving its overall shape? - algorithm

I have a list of points that make a curve, and I would like to reduce the number of points, but still keep the overall shape of the curve.
Basically, I want to go from this:
To this:
So the algorithm would remove the points that are redundant but preserve those that really define the shape (like the points at the bottom of the curve). Is there any known algorithm to do that? I expect there is but I'm not sure what to search for on Google. Any help would be appreciated.

Consider Douglas–Peucker_algorithm

There are several algorithms for this.
The simplest one is probably to just keep removing the point whose angle between neighboring points is closest to 180 degrees, until some threshold, or until you've reached a desired number of points.
If the curve is smooth as in your picture, you'll probably get better approximations (or fewer points if you so like) by using Bezier curves for instance.

Related

Algorithm to detect curved lines from list of 2D points

I am trying to extract horizontal lines from a set of 2D points generated from the photo of the model of a human torso:
The points "mostly" form horizontal(ish) lines in a more or less regular way, but with possible gaps/missing-points:
There can be regions where the lines deform a bit:
And regions with background noise:
Of course I would need to tune things so I exclude those parts with defects. What I am looking for with this question is a suggested algorithm to find lines where they are well-behaved, filling eventual gaps and avoiding eventual noise, and also terminating the lines properly upon some discontinuity condition.
I believe there could be some optimizing or voting "flood fill" variant that would score line candidates and yield only well-formed lines, but I am not experienced with this and cannot figure anything by myself.
This dataset is in a gist here, and it is important to note that X coordinates are integers, so points are aligned vertically. The Y coordinates though are decimal numbers.
I would start by finding the nearest neighbor of every dot, then the second nearest neighbor on the other side (I mean only considering the dots in the half plane opposite to the first neighbor).
If the distance to the second neighbor exceeds twice the distance to the first, ignore it.
Just doing that, I bet that you will reconstruct a great deal of the curves, with gaps left unfilled.
By estimating the local curvature along the curve (f.i. by computing the circumscribed circle of three dots, taking every other dot, you can discard noisy portions.
Then to fill the gaps, you can detect the curve endpoints and look for the nearest endpoint in an angle around the extrapolated direction.
First step in the processing:
These are integral curves to the vector field representing the direction pattern.
So maybe start by finding for each point the slope vector, the predominant direction, by taking points from the neighborhood and fitting a line with LS or performing a PCA. Increasing the neighborhood radius should allow to deal with the data irregularities thereby picking up a greater-scale slope trend instead of a local noise.
If you decide to do this, could you post here the slope field you find, so instead of points could we see some tangents?

Find the Best fit plane from a list of 3D coordinates

I would like to find the best fit plane from a list of 3D points. It means the plane has the least square distance from all the points. I read the article
Best fit plane by minimizing orthogonal distances
and
3D Least Squares Plane
I fully understand the solutio but it turns other to be impractical in my situation. I need to read a very very large list of 3d points, direcltly impementation would result in ill posed problem. Even I subtract the data with their average,(refere to the document here-> part3 : http://www.geometrictools.com/Documentation/LeastSquaresFitting.pdf) the number is still very large. So what can I do?
Is there an iterative way to implement it ?
I have changed the way to ask the question, I hope may be there are someone can give me more advices on it ?
Given a list of 3D Points
{(x0,y0,z0),
(x1,y1,z1)...
(xn-1,yn-1,zn-1)}
I would like to construct a plane by fitting all the 3D points. In this sense, I mean to find the plane with format (Ax+By+Cy+D = 0), thus its uses four parameters(A,B,C,D) to characterize a plane. The sum of distance between each point and the plane should be minimium.
I do try the menthod provided in the below link
http://www.geometrictools.com/Documentation/LeastSquaresFitting.pdf
But there are two problems:
-During calculation, the above algorithm needs to do summation of all points value, which lead to overflow problem if my number of points increases
-given newly added points, it has to do all the calculation again, is there a way to use the before calculated plane parameter and the newly given points to somehow fine tune the planes parameters?
PS:I am a bit greedy, if we need to involve all the points, it is possible that the plane finally obtained isn't good enough.I am thinking of using random sample consensus(RANSAC), is it the right direction?
If you are expecting a plane then most of the points are not that useful since even a handful should give you a good approximation of the final solution (module a bit more noise).
So here's the solution. Sample down your data set to something that works and run the smaller set through the fitting algorithm.
If you are not expecting that the points are on a plane then sub-sampling should still work, but you must consider error ranges for any solution (since they will likely be fairly big).

How to divide points into two sets - upper and lower part of profile

I have lists of points in 3d (x,y,z)
For each list I want to divide that list into to two lists, one containing points from upper part of profile and second the lower half, just like that:
My question here is how to determine which point should go to upper-part and which one should go to lower part just from having those points with their coordinates (x,y,z).
Since points can be split in 'halves' in a lot of ways, it is good to have more criteria how to split them.
In this case it seems like you are looking for a curve, that splits point cloud, which has a shape similar to that cloud. Fitting curve of type that can cover your shapes can help. Probably polynomial of second or third order are good for these shapes.
Second idea is to create something that goes through 'middle of geometry'. In 2D case you could use medial axis approach. It can be computed for point cloud by Delaunay triangulation. If points are near some plane, you can project them on the plane and use this approach.
First, create "lines" between each adjacent point. Assuming the points are given in order around the loop, this should be easy.
Then, cast a line from 0,0(upper left) to each point. If it intersects another line to get there, it's not on the upper side. If it doesn't, it is.
It's O(n^2), so I'm sure there is a better solution, but for small sets of points, it should be fine. Note that it won't work on extremely concave shapes, but will for all of those shown.
Join adjacent edges to get lines.. Taking anticlockwise angles as positive..
In the upper part of the cloud, successive lines have increasing angles..
while in the lower part the successive lines have decreasing angles..
A lil trial and error should lead you to an appropriate hueristic..

Algorithms to normalize finger touch data (reduce the number of points)

I'm working on an app that lets users select regions by finger painting on top of a map. The points then get converted to a latitude/longitude and get uploaded to a server.
The touch screen is delivering way too many points to be uploaded over 3G. Even small regions can accumulate up to ~500 points.
I would like to smooth this touch data (approximate it within some tolerance). The accuracy of drawing does not really matter much as long as the general area of the region is the same.
Are there any well known algorithms to do this? Is this work for a Kalman filter?
There is the Ramer–Douglas–Peucker algorithm (wikipedia).
The purpose of the algorithm is, given
a curve composed of line segments, to
find a similar curve with fewer
points. The algorithm defines
'dissimilar' based on the maximum
distance between the original curve
and the simplified curve. The
simplified curve consists of a subset
of the points that defined the
original curve.
You probably don't need anything too exotic to dramatically cut down your data.
Consider something as simple as this:
Construct some sort of error metric. An easy one would be a normalized sum of the distances from the omitted points to the line that was approximating them. Decide what a tolerable error using this metric is.
Then starting from the first point construct the longest line segment that falls within the tolerable error range. Repeat this process until you have converted the entire path into a polyline.
This will not give you the globally optimal approximation but it will probably be good enough.
If you want the approximation to be more "curvey" you might consider using splines or bezier curves rather than straight line segments.
You want to subdivide the surface into a grid with a quadtree or a space-filling-curve. A sfc reduce the 2d complexity to a 1d complexity. You want to look for Nick's hilbert curve quadtree spatial index blog.
I was going to do something this in an app, but was intending on generating a path from the points on-the-fly. I was going to use a technique mentioned in this Point Sequence Interpolation thread

Drawing a Topographical Map

I've been working on a visualization project for 2-dimensional continuous data. It's the kind of thing you could use to study elevation data or temperature patterns on a 2D map. At its core, it's really a way of flattening 3-dimensions into two-dimensions-plus-color. In my particular field of study, I'm not actually working with geographical elevation data, but it's a good metaphor, so I'll stick with it throughout this post.
Anyhow, at this point, I have a "continuous color" renderer that I'm very pleased with:
The gradient is the standard color-wheel, where red pixels indicate coordinates with high values, and violet pixels indicate low values.
The underlying data structure uses some very clever (if I do say so myself) interpolation algorithms to enable arbitrarily deep zooming into the details of the map.
At this point, I want to draw some topographical contour lines (using quadratic bezier curves), but I haven't been able to find any good literature describing efficient algorithms for finding those curves.
To give you an idea for what I'm thinking about, here's a poor-man's implementation (where the renderer just uses a black RGB value whenever it encounters a pixel that intersects a contour line):
There are several problems with this approach, though:
Areas of the graph with a steeper slope result in thinner (and often broken) topo lines. Ideally, all topo lines should be continuous.
Areas of the graph with a flatter slope result in wider topo lines (and often entire regions of blackness, especially at the outer perimeter of the rendering region).
So I'm looking at a vector-drawing approach for getting those nice, perfect 1-pixel-thick curves. The basic structure of the algorithm will have to include these steps:
At each discrete elevation where I want to draw a topo line, find a set of coordinates where the elevation at that coordinate is extremely close (given an arbitrary epsilon value) to the desired elevation.
Eliminate redundant points. For example, if three points are in a perfectly-straight line, then the center point is redundant, since it can be eliminated without changing the shape of the curve. Likewise, with bezier curves, it is often possible to eliminate cetain anchor points by adjusting the position of adjacent control points.
Assemble the remaining points into a sequence, such that each segment between two points approximates an elevation-neutral trajectory, and such that no two line segments ever cross paths. Each point-sequence must either create a closed polygon, or must intersect the bounding box of the rendering region.
For each vertex, find a pair of control points such that the resultant curve exhibits a minimum error, with respect to the redundant points eliminated in step #2.
Ensure that all features of the topography visible at the current rendering scale are represented by appropriate topo lines. For example, if the data contains a spike with high altitude, but with extremely small diameter, the topo lines should still be drawn. Vertical features should only be ignored if their feature diameter is smaller than the overall rendering granularity of the image.
But even under those constraints, I can still think of several different heuristics for finding the lines:
Find the high-point within the rendering bounding-box. From that high point, travel downhill along several different trajectories. Any time the traversal line crossest an elevation threshold, add that point to an elevation-specific bucket. When the traversal path reaches a local minimum, change course and travel uphill.
Perform a high-resolution traversal along the rectangular bounding-box of the rendering region. At each elevation threshold (and at inflection points, wherever the slope reverses direction), add those points to an elevation-specific bucket. After finishing the boundary traversal, start tracing inward from the boundary points in those buckets.
Scan the entire rendering region, taking an elevation measurement at a sparse regular interval. For each measurement, use it's proximity to an elevation threshold as a mechanism to decide whether or not to take an interpolated measurement of its neighbors. Using this technique would provide better guarantees of coverage across the whole rendering region, but it'd be difficult to assemble the resultant points into a sensible order for constructing paths.
So, those are some of my thoughts...
Before diving deep into an implementation, I wanted to see whether anyone else on StackOverflow has experience with this sort of problem and could provide pointers for an accurate and efficient implementation.
Edit:
I'm especially interested in the "Gradient" suggestion made by ellisbben. And my core data structure (ignoring some of the optimizing interpolation shortcuts) can be represented as the summation of a set of 2D gaussian functions, which is totally differentiable.
I suppose I'll need a data structure to represent a three-dimensional slope, and a function for calculating that slope vector for at arbitrary point. Off the top of my head, I don't know how to do that (though it seems like it ought to be easy), but if you have a link explaining the math, I'd be much obliged!
UPDATE:
Thanks to the excellent contributions by ellisbben and Azim, I can now calculate the contour angle for any arbitrary point in the field. Drawing the real topo lines will follow shortly!
Here are updated renderings, with and without the ghetto raster-based topo-renderer that I've been using. Each image includes a thousand random sample points, represented by red dots. The angle-of-contour at that point is represented by a white line. In certain cases, no slope could be measured at the given point (based on the granularity of interpolation), so the red dot occurs without a corresponding angle-of-contour line.
Enjoy!
(NOTE: These renderings use a different surface topography than the previous renderings -- since I randomly generate the data structures on each iteration, while I'm prototyping -- but the core rendering method is the same, so I'm sure you get the idea.)
Here's a fun fact: over on the right-hand-side of these renderings, you'll see a bunch of weird contour lines at perfect horizontal and vertical angles. These are artifacts of the interpolation process, which uses a grid of interpolators to reduce the number of computations (by about 500%) necessary to perform the core rendering operations. All of those weird contour lines occur on the boundary between two interpolator grid cells.
Luckily, those artifacts don't actually matter. Although the artifacts are detectable during slope calculation, the final renderer won't notice them, since it operates at a different bit depth.
UPDATE AGAIN:
Aaaaaaaand, as one final indulgence before I go to sleep, here's another pair of renderings, one in the old-school "continuous color" style, and one with 20,000 gradient samples. In this set of renderings, I've eliminated the red dot for point-samples, since it unnecessarily clutters the image.
Here, you can really see those interpolation artifacts that I referred to earlier, thanks to the grid-structure of the interpolator collection. I should emphasize that those artifacts will be completely invisible on the final contour rendering (since the difference in magnitude between any two adjacent interpolator cells is less than the bit depth of the rendered image).
Bon appetit!!
The gradient is a mathematical operator that may help you.
If you can turn your interpolation into a differentiable function, the gradient of the height will always point in the direction of steepest ascent. All curves of equal height are perpendicular to the gradient of height evaluated at that point.
Your idea about starting from the highest point is sensible, but might miss features if there is more than one local maximum.
I'd suggest
pick height values at which you will draw lines
create a bunch of points on a fine, regularly spaced grid, then walk each point in small steps in the gradient direction towards the nearest height at which you want to draw a line
create curves by stepping each point perpendicular to the gradient; eliminate excess points by killing a point when another curve comes too close to it-- but to avoid destroying the center of hourglass like figures, you might need to check the angle between the oriented vector perpendicular to the gradient for both of the points. (When I say oriented, I mean make sure that the angle between the gradient and the perpendicular value you calculate is always 90 degrees in the same direction.)
In response to your comment to #erickson and to answer the point about calculating the gradient of your function. Instead of calculating the derivatives of your 300 term function you could do a numeric differentiation as follows.
Given a point [x,y] in your image you could calculate the gradient (direction of steepest decent)
g={ ( f(x+dx,y)-f(x-dx,y) )/(2*dx),
{ ( f(x,y+dy)-f(x,y-dy) )/(2*dy)
where dx and dy could be the spacing in your grid. The contour line will run perpendicular to the gradient. So, to get the contour direction, c, we can multiply g=[v,w] by matrix, A=[0 -1, 1 0] giving
c = [-w,v]
Alternately, there is the marching squares algorithm which seems appropriate to your problem, although you may want to smooth the results if you use a coarse grid.
The topo curves you want to draw are isosurfaces of a scalar field over 2 dimensions. For isosurfaces in 3 dimensions, there is the marching cubes algorithm.
I've wanted something like this myself, but haven't found a vector-based solution.
A raster-based solution isn't that bad, though, especially if your data is raster-based. If your data is vector-based too (in other words, you have a 3D model of your surface), you should be able to do some real math to find the intersection curves with horizontal planes at varying elevations.
For a raster-based approach, I look at each pair of neighboring pixels. If one is above a contour level, and one is below, obviously a contour line runs between them. The trick I used to anti-alias the contour line is to mix the contour line color into both pixels, proportional to their closeness to the idealized contour line.
Maybe some examples will help. Suppose that the current pixel is at an "elevation" of 12 ft, a neighbor is at an elevation of 8 ft, and contour lines are every 10 ft. Then, there is a contour line half way between; paint the current pixel with the contour line color at 50% opacity. Another pixel is at 11 feet and has a neighbor at 6 feet. Color the current pixel at 80% opacity.
alpha = (contour - neighbor) / (current - neighbor)
Unfortunately, I don't have the code handy, and there might have been a bit more to it (I vaguely recall looking at diagonal neighbors too, and adjusting by sqrt(2) / 2). I hope this enough to give you the gist.
It occurred to me that what you're trying to do would be pretty easy to do in MATLAB, using the contour function. Doing things like making low-density approximations to your contours can probably be done with some fairly simple post-processing of the contours.
Fortunately, GNU Octave, a MATLAB clone, has implementations of the various contour plotting functions. You could look at that code for an algorithm and implementation that's almost certainly mathematically sound. Or, you might just be able to offload the processing to Octave. Check out the page on interfacing with other languages to see if that would be easier.
Disclosure: I haven't used Octave very much, and I haven't actually tested it's contour plotting. However, from my experience with MATLAB, I can say that it will give you almost everything you're asking for in just a few lines of code, provided you get your data into MATLAB.
Also, congratulations on making a very VanGough-esque slopefield plot.
I always check places like http://mathworld.wolfram.com before going to deep on my own :)
Maybe their curves section would help? Or maybe the entry on maps.
compare what you have rendered with a real-world topo map - they look identical to me! i wouldn't change a thing...
Write the data out as an HGT file (very simple digital elevation data format used by USGS) and use the free and open-source gdal_contour tool to create contours. That works very well for terrestrial maps, the constraint being that the data points are signed 16-bit numbers, which fits the earthly range of heights in metres very well, but may not be enough for your data, which I assume not to be a map of actual terrain - although you do mention terrain maps.
I recommend the CONREC approach:
Create an empty line segment list
Split your data into regular grid squares
For each grid square, split the square into 4 component triangles:
For each triangle, handle the cases (a through j):
If a line segment crosses one of the cases:
Calculate its endpoints
Store the line segment in the list
Draw each line segment in the line segment list
If the lines are too jagged, use a smaller grid. If the lines are smooth enough and the algorithm is taking too long, use a larger grid.

Resources