Simulation data fitting - curve-fitting

I have 154 scatter points and I want to fit them.
What is the best way to fit them to a curve? I tried fitting the data to a polynomial curve in an Excel sheet, but I found that the highest available degree is 6. Which program or numerical technique would help in fitting these data?
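A minimal sketch with NumPy/SciPy (the x and y arrays below are placeholders for your 154 points): numpy.polyfit fits polynomials of any degree, and scipy.optimize.curve_fit fits an arbitrary model function, which is usually a better idea than pushing the polynomial degree ever higher.

```python
import numpy as np
from scipy.optimize import curve_fit

# x and y stand in for the 154 measured scatter points
x = np.linspace(0.0, 10.0, 154)
y = np.sin(x) + 0.1 * np.random.default_rng(0).standard_normal(154)

# Polynomial fit of arbitrary degree (not capped at 6 as in Excel)
coeffs = np.polyfit(x, y, deg=9)
y_poly = np.poly1d(coeffs)(x)

# Often better: fit a model function suggested by the physics of the simulation
def model(x, a, b, c):
    return a * np.sin(b * x) + c

params, cov = curve_fit(model, x, y, p0=[1.0, 1.0, 0.0])
y_model = model(x, *params)
```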

Related

K-Means for diagonal clusters

I currently have 2 clusters which essentially lie along 2 lines on a 3D surface:
I've tried a simple k-means algorithm, which yielded the above separation (the big red dots are the means).
I have also tried clustering with soft k-means, with different variances for each mean along the 3 dimensions. However, it also failed to model this, presumably because it cannot fit the shape of a diagonal Gaussian.
Is there another algorithm that can take into account that the data is diagonal? Alternatively is there a way to essentially "rotate" the data such that soft k-means could be made to work?
K-means is not well suited to correlated data.
But these clusters look reasonably Gaussian to me; you should try Gaussian Mixture Modeling instead.
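As a minimal sketch of that suggestion: scikit-learn's GaussianMixture with full covariance matrices can capture elongated, diagonally oriented clusters that k-means cuts across. The synthetic X below stands in for the real 3D points.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the real 3D points: two elongated, diagonally
# oriented clusters that a spherical k-means tends to split the wrong way
rng = np.random.default_rng(0)
direction = np.array([1.0, 1.0, 0.5])
t = rng.normal(scale=3.0, size=(200, 1))
cluster1 = t * direction + rng.normal(scale=0.3, size=(200, 3))
cluster2 = t * direction + np.array([0.0, 2.0, 0.0]) + rng.normal(scale=0.3, size=(200, 3))
X = np.vstack([cluster1, cluster2])

# Full covariance matrices let each component stretch along an arbitrary direction
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
labels = gmm.predict(X)
```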

Surface Reconstruction from Cocone algorithms

I have a 3D point cloud that I obtained by tracing out the outline of a shape with sensors attached to my fingertips. The resulting data has non-uniform density, with large gaps between some of the points.
What are some good surface reconstruction algorithms to use on this kind of data that is recorded by hand and therefore has issues of varying density?
I have been attempting to use the Cocone, Robust Cocone, and Tight Cocone Surface Reconstruction algorithms from Tamal Dey to reconstruct the shape, but I am having difficulty because I believe my data is much less uniform than the example point sets provided with the algorithms. I have read Tamal's literature on each reconstruction algorithm because there are variables that can be altered in the algorithms, but I have been unable to find the right settings to get my data to work with any of the Cocone algorithms.
Does anyone understand the user settings in these algorithms?
What would be the best settings for very non-uniform data points? I can provide the 3D point data of the shape upon request.

von Karman curve fitting to field measured wind spectrum

So for this wind monitoring project I'm getting data from a couple of 3D sonic anemometers, specifically two R.M. Young 81000 units. The data is output digitally at a sampling frequency of 10 Hz for periods of 10 min. After all the pre-processing (coordinate rotation, trend removal, ...) I get 3 orthogonal time series of the turbulent data. Right now I'm using the stationary data of 2 hours of measurements, with windows of 4096 points and 50% overlap, to obtain the frequency spectra in all three directions. After obtaining the spectrum I apply a logarithmic frequency-smoothing algorithm, which averages the obtained spectrum over logarithmically spaced intervals.
I have two questions:
The spectra I obtain from the measurements show a clear downward trend at the highest frequencies, as seen in the attached figure. I wonder if this loss of energy could have anything to do with an internal filter in the sonic anemometer, or something else? Is there a way to compensate for this loss, or is it better just to consider the spectrum up to the "break frequency"?
http://i.stack.imgur.com/B11uP.png
When applying the curve-fitting algorithm to determine the integral length scales according to the von Karman equation, what is the correct procedure: fitting the original data, which gives more weight to the higher-frequency data points, or fitting the logarithmically frequency-smoothed data to the von Karman equation, which gives equal weight to the data on a logarithmic scale? In some cases I obtain very different estimates of the integral length scales with the two approaches (e.g. original: Lu=113.16, Lv=42.68, Lw=9.23; frequency-smoothed: Lu=148.60, Lv=30.91, Lw=14.13).
Curve fitting with logarithmic frequency smoothing and with the original data:
http://i.imgur.com/VL2cf.png
Let me know if something is not clear. I'm relatively new to this field and I might be making some mistakes in my approach, so any advice or tips would be amazing.
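Regarding the second question, here is a minimal sketch of both fitting strategies. It assumes one common normalised form of the von Karman longitudinal spectrum, f*Su(f)/sigma_u^2 = 4n / (1 + 70.8 n^2)^(5/6) with n = f*Lu/U (check this against the reference you are using); the frequencies, spectrum values and mean wind speed are placeholders for your measured data.

```python
import numpy as np
from scipy.optimize import curve_fit

U = 8.0  # placeholder mean wind speed [m/s]; use your measured value

def von_karman_u(f, Lu):
    """Normalised von Karman longitudinal spectrum f*Su(f)/sigma_u^2 (one common form)."""
    n = f * Lu / U
    return 4.0 * n / (1.0 + 70.8 * n ** 2) ** (5.0 / 6.0)

# Placeholder spectrum: FFT bin frequencies for fs = 10 Hz and 4096-point windows,
# with noisy "measurements" generated from Lu = 120 m
rng = np.random.default_rng(0)
f = np.arange(1, 2049) * (10.0 / 4096.0)
fSu = von_karman_u(f, 120.0) * (1.0 + 0.1 * rng.standard_normal(f.size))

# 1) Fit on the raw spectrum: every frequency bin counts equally, so the
#    densely populated high frequencies dominate the fit
Lu_raw, _ = curve_fit(von_karman_u, f, fSu, p0=[50.0])

# 2) Fit on log-spaced bin averages: roughly equal weight per decade,
#    mimicking the logarithmically smoothed spectrum
edges = np.logspace(np.log10(f[0]), np.log10(f[-1]), 40)
idx = np.digitize(f, edges)
f_s = np.array([f[idx == i].mean() for i in np.unique(idx)])
S_s = np.array([fSu[idx == i].mean() for i in np.unique(idx)])
Lu_smooth, _ = curve_fit(von_karman_u, f_s, S_s, p0=[50.0])

print(Lu_raw[0], Lu_smooth[0])
```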

Algorithms to normalize finger touch data (reduce the number of points)

I'm working on an app that lets users select regions by finger painting on top of a map. The points then get converted to a latitude/longitude and get uploaded to a server.
The touch screen is delivering way too many points to be uploaded over 3G. Even small regions can accumulate up to ~500 points.
I would like to smooth this touch data (approximate it within some tolerance). The accuracy of drawing does not really matter much as long as the general area of the region is the same.
Are there any well-known algorithms to do this? Is this a job for a Kalman filter?
There is the Ramer–Douglas–Peucker algorithm (wikipedia).
The purpose of the algorithm is, given a curve composed of line segments, to find a similar curve with fewer points. The algorithm defines 'dissimilar' based on the maximum distance between the original curve and the simplified curve. The simplified curve consists of a subset of the points that defined the original curve.
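For reference, a minimal recursive Python sketch of the algorithm for 2D points (the touch_points name and the epsilon value are just illustrative; libraries such as Shapely expose the same simplification on LineString objects).

```python
import math

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: keep a subset of points whose polyline stays
    within `epsilon` of the original curve."""
    if len(points) < 3:
        return list(points)

    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy) or 1.0

    # Find the interior point farthest from the chord (first point, last point)
    dmax, index = 0.0, 0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        d = abs(dy * (px - x1) - dx * (py - y1)) / norm
        if d > dmax:
            dmax, index = d, i

    if dmax > epsilon:
        # Split at the farthest point and simplify each half recursively
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]

# Example: ~500 touch points reduced to far fewer within a chosen tolerance
# simplified = rdp(touch_points, epsilon=0.001)
```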
You probably don't need anything too exotic to dramatically cut down your data.
Consider something as simple as this:
Construct some sort of error metric. An easy one would be a normalized sum of the distances from the omitted points to the line segment approximating them. Decide what a tolerable error is under this metric.
Then, starting from the first point, construct the longest line segment that stays within the tolerable error. Repeat this process until you have converted the entire path into a polyline.
This will not give you the globally optimal approximation but it will probably be good enough.
If you want the approximation to be more "curvey" you might consider using splines or bezier curves rather than straight line segments.
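A rough Python sketch of that greedy scheme, using the mean distance of the skipped points to the current segment as the error metric (the function names and tolerance value are illustrative):

```python
import math

def point_to_segment_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    norm = math.hypot(dx, dy) or 1.0
    return abs(dy * (px - ax) - dx * (py - ay)) / norm

def greedy_simplify(points, tolerance):
    """Greedily grow each segment until the mean distance of the skipped
    points to the segment exceeds `tolerance`, then start a new segment."""
    if len(points) < 3:
        return list(points)
    result = [points[0]]
    start = 0
    i = 2
    while i < len(points):
        skipped = points[start + 1 : i]
        err = sum(point_to_segment_dist(p, points[start], points[i]) for p in skipped) / len(skipped)
        if err > tolerance:
            result.append(points[i - 1])  # commit the longest acceptable segment
            start = i - 1
        i += 1
    result.append(points[-1])
    return result
```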
You want to subdivide the surface into a grid with a quadtree or a space-filling curve. An SFC reduces the 2D complexity to a 1D complexity. Look for Nick's hilbert curve quadtree spatial index blog.
I was going to do something like this in an app, but was intending to generate a path from the points on the fly. I was going to use a technique mentioned in this Point Sequence Interpolation thread.

writing a similarity function for images for clustering data

I know how to write a similarity function for data points in Euclidean space (by taking the negative mean squared error). Now, if I want to test my clustering algorithms on images, how can I write a similarity function for data points in images? Do I base it on their RGB values, or what? And how?
I think we need to clarify some points:
Are you clustering only on color? Then take the RGB values of the pixels and apply your metric function (minimize the sum of squared errors, or just calculate the SAD, Sum of Absolute Differences). A short sketch of this case follows at the end of this answer.
Are you clustering on a spatial basis (within an image)? In that case you should take position into account, as you specified for Euclidean space, simply treating the image as your samples' domain. It's a 2D space anyway, or 3D if you also consider color information (see next).
Are you looking for 3D information from the image (2D position + 1D color)? This is the most likely case. As a first approach, consider segmentation techniques if your image shows regular or well-defined shapes. If that fails, or you want a less hand-tuned algorithm, consider reducing the 3D information space to 2D or even 1D by running PCA on the data. By analyzing the principal components you can drop uninformative dimensions and/or exploit the intrinsic structure of the data.
This topic needs much more than one post to be covered fully, but I hope this helps a bit.
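A minimal sketch of the color-only case, plus the PCA idea from the last point. It assumes same-sized RGB images stored as NumPy arrays; all names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

def similarity_ssd(img_a, img_b):
    """Negative sum of squared RGB differences: higher means more similar."""
    a = img_a.astype(np.float64).ravel()
    b = img_b.astype(np.float64).ravel()
    return -np.sum((a - b) ** 2)

def similarity_sad(img_a, img_b):
    """Negative Sum of Absolute Differences over RGB values."""
    return -np.sum(np.abs(img_a.astype(np.float64) - img_b.astype(np.float64)))

def pixel_features(img):
    """Per-pixel (x, y, R, G, B) feature vectors for spatial + color clustering."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.column_stack([xs.ravel(), ys.ravel(), img.reshape(-1, 3)])

def reduced_features(img, n_components=2):
    """Project the 5D pixel features down with PCA before clustering."""
    feats = pixel_features(img).astype(np.float64)
    return PCA(n_components=n_components).fit_transform(feats)
```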
