Need Help in Contour Plot in Matlab - algorithm

I am trying to plot the contour of the Shifted Schwefel problem function but keep getting this error: "Z must be size 2x2 or greater." I have searched this forum and the information I found helped a little, but I could not fix the error. That information led me to try this code:
min = -50;
max = 50;
steps = 20;
c = linspace(min, max, steps); % Create the mesh
[x, y] = meshgrid(c, c); % Create the grid
%o=-50+100*rand(1,2);
%c = c - repmat(o,1,10);
for I=1:length(x)
    for J=1:length(y)
        o=-50+100*rand(1,2);
        x=x-repmat(o,20,10);
        f = max(abs(x), [], 2);
    end
end
figure,
contour(x,y,f);
figure,
surfc(x, y,f);
Now I get the error that Z, which is the value of f, must be at least 2x2. I know my f takes only one input and therefore outputs only one value. I tried putting it in nested for loops, but it still gives me an array of vectors rather than a matrix of at least 2x2. If the input were two, there would be no problem, but the input is one. Does anyone know how I can make this f output a matrix of at least 2x2 so that I can plot the Z of the contour?

There are a few things to note:
1.) As Jacob Robbins pointed out correctly in his comment, you should avoid using names of Matlab functions as variable names (in your case min and max). One very easy way to do this is to use only upper-case letters for variable names.
2.) You are correct in saying that your f is only one output (though one output in this case is not a single number, but a vector). That is because you don't assign any indexing to it within the loop.
3.) Yes, both contour and surfc need at least 2x2 - because they plot information on a grid, which is itself at least 2x2 in nature.
4.) In your particular case, two loops may not be necessary. You seem to only be manipulating the x-vector and your grid is regular. Hence you might want to try this loop:
for I=1:length(x)
    o=-50+100*rand(1,2);
    x=x-repmat(o,20,10);
    f(:,I) = max(abs(x), [], 2);
end
Now, f will be of size 20x20, which corresponds to your x- and y-grid. Also, your contour and surfc commands will now produce plots (a self-contained sketch is given after point 5 below).
5.) One last comment: the output of your function and the results of a web search for "Shifted Schwefel function" are very different. But the question of whether your implementation of the Shifted Schwefel function is correct should be asked as a new question.
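For reference, here is a minimal, self-contained sketch that evaluates the function directly on the grid without growing x inside the loop. It assumes a 2-D variant of the shifted problem with a single fixed shift vector o, f(x,y) = max(|x - o1|, |y - o2|), which may not match the exact benchmark definition:
C = linspace(-50, 50, 20);
[X, Y] = meshgrid(C, C);
o = -50 + 100*rand(1, 2);                 % one random shift vector, drawn once
F = max(abs(X - o(1)), abs(Y - o(2)));    % 20x20 matrix, one value per grid point
figure, contour(X, Y, F);
figure, surfc(X, Y, F);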

Related

Automatically choose x locations of scatter plot in front of bar graph

I'd like an algorithm to organize a 2D cloud of points in front of a bar graph so that a viewer could easily see the spread of the data. The y location of the point needs to be equal/scaled/proportional to the value of the data, but the x location doesn't matter and would be determined by the algorithm. I imagine a good strategy would be to minimize overlap among the points and center the points.
Here is an example of such a plot without organizing the points:
I generate my bar graphs with points in front of it with MATLAB, but I'm interested just in the best way to choose the x location values of the points.
I have been organizing the points by hand afterwards in Adobe Illustrator, which is time-consuming. Any recommendations? Is this a sub-problem of an already solved problem? What is this kind of plot called?
For high sample sizes, I imagine something like the following would be better than a cloud of points.
I think that, mathematically, starting with some array of y-values, one could rearrange the order of the elements so as to maximize the sum of the differences between every element and every other element, inversely scaled by the distance between the elements.
Here is the MATLAB code I used to generate the graph:
y = zeros(20,6);
yMean = zeros(1,6);
for i=1:6
    y(:,i) = 5 + (8-5).*rand(20,1);
    yMean(i) = mean(y(:,i));
end
figure
hold on
bar(yMean,0.5)
for i=1:6
    x = linspace(i-0.3,i+0.3,20);
    plot(x,y(:,i),'ro')
end
axis([0,7,0,10])
Here is one way that determines x-locations based on grouping into (histogram) bins. The result is similar to e.g. the plot in https://stackoverflow.com/a/1934882/4720018, but retains the original y-values. For convenience the points are sorted, but they could be displayed in order of appearance using the bin_index. Whether this is "the best way" of choosing the x-coordinates depends on what you are trying to achieve.
% Create some dummy data
dummy_data_y = 1+0.1*randn(10,3);
% Create bar plot (assuming you are interested in the mean)
bar_obj = bar(mean(dummy_data_y));
% Obtain data size info
n = size(dummy_data_y, 2);
% Algorithm that creates an x vector for each data column
sorted_data_y = sort(dummy_data_y, 'ascend'); % for convenience
number_of_bins = 5;
for j=1:n
    % Get histogram information
    [bin_count, ~, bin_index] = histcounts(sorted_data_y(:, j), number_of_bins);
    % Create x-location data for current column
    xj = [];
    for k = 1:number_of_bins
        xj = [xj 0:bin_count(k)-1];
    end
    % Collect x locations per column, scale and translate
    sorted_data_x(:, j) = j + (xj-(bin_count(bin_index)-1)/2)'/...
        max(bin_count)*bar_obj.BarWidth;
end
% Plot the individual data points
line(sorted_data_x, sorted_data_y, 'linestyle', 'none', 'marker', '.', 'color', 'r')
Whether this is a good way to display your data remains open to discussion.
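If a recent MATLAB release is available (R2020b or newer), the built-in swarmchart spreads the x-locations within each group automatically; here is a rough sketch reusing y and yMean from the question's code:
figure; hold on
bar(yMean, 0.5)
x = repmat(1:6, size(y,1), 1);   % group index for every point
swarmchart(x(:), y(:))           % jittered x-locations, original y-values
axis([0 7 0 10])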

How to detect a portion of an image based upon the matrix values?

I have a simple pcolor plot in Matlab (Version R 2016b) which I have uploaded as shown in the image below. I need to get only the blue sloped line which extends from the middle of the leftmost corner to the rightmost corner without hard-coding the matrix values.
For instance, one can see that the desired sloped line has values approximately between 20 and 45 (a rough guess just from looking at the plot).
I'm applying the following code on the matrix named Slant which contains the plotted values.
load('Slant.mat');
Slant(Slant<20|Slant>50)=0;
pcolor(Slant); colormap(jet); shading interp; colorbar;
As one can see I hard-coded the values which I don't want to. Is there any method of detecting certain matrix values while making the rest equal to zero?
I also tried another small approach: taking half the maximum value of the matrix and zeroing everything above it. But this doesn't work for other images.
[maxvalue, row] = max(Slant);
max_m=max(maxvalue);
Slant(Slant>max_m/2)=0;
pcolor(Slant); colormap(jet); shading interp; colorbar;
Here is another suggestion:
Remove all the background.
Assuming this "line" results in a Bimodal distribution of the data (after removing the zeros), find the lower mode.
Assuming the values of the line are always lower than the background, apply a logic mask that set to zeros all values above the minimum + 2nd_mode, as demonstrated in the figure below (in red circle):
Here is how it works:
A = Slant(any(Slant,2),:); % keep only the rows that contain nonzero data
Now we have A that looks like this:
[y,x] = findpeaks(histcounts(A)); % find all the modes in the histogram of A
sorted_x = sortrows([x.' y.'],-2); % sort them by their height in descending order
mA = A<min(A(:))+sorted_x(2,1); % mask all values above the second mode
result = A.*mA; % apply the mask on A
And we get the result:
The resulting line has some holes in it, so you might want to interpolate the whole line from the result. This can be done with simple math on the indices:
[y1,x1] = find(mA,1); % find the first nonzero row
[y2,x2] = find(mA,1,'last'); % find the last nonzero row
m = (y1-y2)/(x1-x2); % the line slope
n = y1-m*x1; % the intercept
f_line = @(x) m.*x+n; % the line function
So we get a line function f_line like this (in red below):
Now we want to make this line thicker, like the line in the data, so we take the mode of the thickness (by counting the values in each column, you might want to take max instead), and 'expand' the line by half of this factor to both sides:
thick = mode(sum(mA)); % mode thickness of the line
tmp = (1:thick)-ceil(thick/2); % helper vector for expanding
rows = bsxfun(@plus,tmp.',floor(f_line(1:size(A,2)))); % all the rows for each column
rows(rows<1) = 1; % make sure to not get out of range
rows(rows>size(A,1)) = size(A,1); % make sure to not get out of range
inds = sub2ind(size(A),rows,repmat(1:size(A,2),thick,1)); % convert to linear indices
mA(inds) = 1; % add the interpolation to the mask
result = A.*mA; % apply the mask on A
And now result looks like this:
Idea: Use the Hough transform:
First of all it is best to create a new matrix with only the rows and columns we are interested in.
In order to apply MATLAB's built-in hough we have to create a binary image: as the line always has lower values than the rest, we could e.g. determine the lowest quartile of the brightnesses present in the picture (using quantile) and set these pixels to white and everything else to black.
Then to find the line, we can use hough directly on that BW image.
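A rough sketch of that idea (this assumes the Image Processing and Statistics toolboxes are available and that the line pixels are lower-valued than the background; the threshold choice is just one possibility):
load('Slant.mat');
thr = quantile(Slant(Slant > 0), 0.25);   % lowest quartile of the nonzero values
BW  = Slant > 0 & Slant <= thr;           % binary image of candidate line pixels
[H, theta, rho] = hough(BW);
P = houghpeaks(H, 1);                     % strongest straight line
seg = houghlines(BW, theta, rho, P);      % endpoints of that line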

Sparse matrix plot matlab

I have a 5000*5000 sparse matrix with 4 different values. I want to visualise the nonzero elements with 4 different colors so that I can recognise the ratio of these values and the relationships between them. I use imagesc, but I cannot distinguish the different values very well, especially the ones with the smaller ratio. I think using a different symbol for each value would work, but I don't know how to do that in MATLAB. Any suggestions? The result of Dan's code is in the figure below.
You could reform the matrix to be a set of [X, Y, F] coordinates (re-using my answer from Resampling Matrix and restoring in one single Matrix):
Assuming your matrix is M
[X, Y] = meshgrid(1:size(M,1), 1:size(M,2));
Mf = M(:); %used again later, hence stored
V = [X(:), Y(:), Mf];
get rid of the zero elements
V(Mf == 0, :) = [];
At this point, if you have access to the statistics toolbox you can just go gscatter(V(:,1), V(:,2), V(:,3)) to get the correct plot otherwise continue with the following if you don't have the toolbox:
Find a list of the unique values in M
Vu = unique(V(:,3));
For each such value, plot the points as an x-y scatter plot; note that hold all makes sure the colour changes each time a new plot is added, i.e. on each new iteration of the loop
hold all;
for g = 1:length(Vu)
    Vg = V(V(:,3)==Vu(g),:);
    plot(Vg(:,1), Vg(:,2), '*');
    a{g}=num2str(Vu(g));
end
legend(a);
Example M:
M = zeros(1000);
M(200,30) = 7;
M(100, 900) = 10;
M(150, 901) = 13;
M(600, 600) = 13;
Result:
Now I can answer the first part of the question. I suppose you need to do something like
sum(histc(A, unique(A)),2)
to count the number of unique values in the matrix.
temp = histc(A, unique(A)) "is a matrix of column histogram counts." So you get the counts of all values of unique(A) as they appear in A columns.
I'm doing stat = sum(temp,2) to get counts of all values of unique(A) in the whole matrix.
Then you can use the code proposed by @Dan to visualize the result.
hold all;
u=unique(A);
for i = 1:length(stat)
    plot(u(i), stat(i)/max(stat), '*');
end
Please clarify what kind of relationship between the values you mean.

Suggested algorithms/methods for laying out labels on an image

Given an image and a set of labels attached to particular points on the image, I'm looking for an algorithm to lay out the labels to the sides of the image with certain constraints (roughly same number of labels on each side, labels roughly equidistant, lines connecting the labels to their respective points with no lines crossing).
Now, an approximate solution can typically be found quite naively by ordering the labels by Y coordinate (of the point they refer to), as in this example (proof of concept only, please ignore accuracy or otherwise of actual data!).
Now to satisfy the condition of no crossings, some ideas that occurred to me:
use a genetic algorithm to find an ordering of labels with no crossovers;
use another method (e.g. dynamic programming algorithm) to search for such an ordering;
use one of the above algorithms, allowing for variations in the spacing as well as ordering, to find the solution that minimises number of crossings and variation from even spacing;
maybe there are criteria I can use to brute-force search through every possible ordering of the labels within certain constraints (do not re-order two labels if their distance is greater than X);
if all else fails, just try millions of random orderings/spacing offsets and take the one that gives the minimum crossings/spacing variation. (Advantage: straightforward to program and will probably find a good enough solution; slight disadvantage, though not a show-stopper: maybe can't then run it on the fly during the application to allow user to change layout/size of the image.)
Before I embark on one of these, I would just welcome some other people's input: has anybody else experience with a similar problem and have any information to report on the success/failure of any of the above methods, or if they have a better/simpler solution that isn't occurring to me? Thanks for your input!
Lucas Bradsheet's honours thesis Labelling Maps using Multi-Objective Evolutionary Algorithms has quite a good discussion of this.
First off, the thesis defines usable metrics for a number of aspects of labelling quality.
For example, clarity (how obvious the mapping between sites and labels is): clarity(s) = rs + rs1/rt,
where rs is the distance between a site and its label and rt is the distance between a site and its closest other label.
It also has useful metrics for the conflicts between labels, sites and borders, as well as for measuring the density and symmetry of labels. Bradsheet then uses a multiple objective genetic algorithm to generate a "Pareto frontier" of feasible solutions. It also includes information about how he mutated the individuals, and some notes on improving the speed of the algorithm.
There's a lot of detail in it, and it should provide some good food for thought.
Let's forget about information design for a moment. This task recalls some memories related to PCB routing algorithms. Actually there are a lot of common requirements, including:
intersections optimization
size optimization
gaps optimization
So, it could be possible to turn the initial task into something similar to PCB routing.
There is a lot of information available, but I would suggest looking through Algorithmic studies on PCB routing by Tan Yan.
It provides a lot of details and dozens of hints.
Adaptation for the current task
The idea is to treat markers on the image and labels as two sets of pins and use escape routing to solve the task. Usually the PCB area is represented as an array of pins. The same can be done for the image, with possible optimizations:
avoid low contrast areas
avoid text boxes if any
etc
So the task can be reduced to "routing in case of unused pins"
Final result can be really close to the requested style:
Algorithmic studies on PCB routing by Tan Yan is a good place to continue.
Additional notes
I can change the style of the drawing a little bit in order to accentuate the similarity.
It should not be a big problem to do some reverse transformation, keeping the good look and readability.
Anyway, adepts of simplicity (like me, for example) can spend several minutes and invent something better (or something different):
As for me, curves do not look like a complete solution, at least at this stage. Anyway, I've just tried to show there is room for enhancements, so the PCB routing approach can be considered an option.
One option is to turn it into an integer programming problem.
Let's say you have n points and n corresponding labels distributed around the outside of the diagram.
The number of possible lines is n^2; if we look at all possible intersections, there are fewer than n^4 intersections (if all possible lines were displayed).
In our integer programming problem we add the following constraints (each 0/1 variable decides whether a line is switched on, i.e. displayed on the screen):
For each point on the diagram, only one of the possible n lines connecting to it is to be switched on.
For each label, only one of the possible n lines connecting to it is to be switched on.
For each pair of intersecting line segments line1 and line2, only zero or one of these lines may be switched on.
Optionally, we can minimize the total distance of all the switched on lines. This enhances aesthetics.
When all of these constraints hold at the same time, we have a solution:
The code below produced the above diagram for 24 random points.
Once you get more than 15 or so points, the run time of the program will start to slow.
I used the PuLP package with its default solver, and PyGame for the display.
Here is the code:
__author__ = 'Robert'
import pygame
pygame.font.init()
import pulp
from random import randint
class Line():
    def __init__(self, p1, p2):
        self.p1 = p1
        self.p2 = p2
        self.length = (p1[0] - p2[0])**2 + (p1[1] - p2[1])**2

    def intersect(self, line2):
        #Copied some equations from wikipedia. Not sure if this is the best way to check intersection.
        x1, y1 = self.p1
        x2, y2 = self.p2
        x3, y3 = line2.p1
        x4, y4 = line2.p2
        xtop = (x1*y2-y1*x2)*(x3-x4)-(x1-x2)*(x3*y4-y3*x4)
        xbottom = (x1-x2)*(y3-y4) - (y1-y2)*(x3-x4)
        ytop = (x1*y2-y1*x2)*(y3-y4)-(y1-y2)*(x3*y4-y3*x4)
        ybottom = xbottom
        if xbottom == 0:
            #lines are parallel. Can only intersect if they are the same line. I'm not checking that however,
            #which means there could be a rare bug that occurs if more than 3 points line up.
            if self.p1 in (line2.p1, line2.p2) or self.p2 in (line2.p1, line2.p2):
                return True
            return False
        x = float(xtop) / xbottom
        y = float(ytop) / ybottom
        if min(x1, x2) <= x <= max(x1, x2) and min(x3, x4) <= x <= max(x3, x4):
            if min(y1, y2) <= y <= max(y1, y2) and min(y3, y4) <= y <= max(y3, y4):
                return True
        return False
def solver(lines):
    #returns best line matching
    lines = list(lines)
    prob = pulp.LpProblem("diagram labelling finder", pulp.LpMinimize)
    label_points = {} #a point at each label
    points = {} #points on the image
    line_variables = {}
    variable_to_line = {}
    for line in lines:
        point, label_point = line.p1, line.p2
        if label_point not in label_points:
            label_points[label_point] = []
        if point not in points:
            points[point] = []
        line_on = pulp.LpVariable("point{0}-point{1}".format(point, label_point),
                                  lowBound=0, upBound=1, cat=pulp.LpInteger) #variable controls if line used or not
        label_points[label_point].append(line_on)
        points[point].append(line_on)
        line_variables[line] = line_on
        variable_to_line[line_on] = line
    for lines_to_point in points.itervalues():
        prob += sum(lines_to_point) == 1 #1 label to each point..
    for lines_to_label in label_points.itervalues():
        prob += sum(lines_to_label) == 1 #1 point for each label.
    for line1 in lines:
        for line2 in lines:
            if line1 > line2 and line1.intersect(line2):
                line1_on = line_variables[line1]
                line2_on = line_variables[line2]
                prob += line1_on + line2_on <= 1 #only switch one on.
    #minimize length of switched on lines:
    prob += sum(i.length * line_variables[i] for i in lines)
    prob.solve()
    print prob.solutionTime
    print pulp.LpStatus[prob.status] #should say "Optimal"
    print len(prob.variables())
    for line_on, line in variable_to_line.iteritems():
        if line_on.varValue > 0:
            yield line #yield the lines that are switched on
class Diagram():
    def __init__(self, num_points=20, width=700, height=800, offset=150):
        assert(num_points % 2 == 0) #if even then labels align nicer (-:
        self.background_colour = (255,255,255)
        self.width, self.height = width, height
        self.screen = pygame.display.set_mode((width, height))
        pygame.display.set_caption('Diagram Labeling')
        self.screen.fill(self.background_colour)
        self.offset = offset
        self.points = list(self.get_points(num_points))
        self.num_points = num_points
        self.font_size = min((self.height - 2 * self.offset)//num_points, self.offset//4)

    def get_points(self, n):
        for i in range(n):
            x = randint(self.offset, self.width - self.offset)
            y = randint(self.offset, self.height - self.offset)
            yield (x, y)

    def display_outline(self):
        w, h = self.width, self.height
        o = self.offset
        outline1 = [(o, o), (w - o, o), (w - o, h - o), (o, h - o)]
        pygame.draw.lines(self.screen, (0, 100, 100), True, outline1, 1)
        o = self.offset - self.offset//4
        outline2 = [(o, o), (w - o, o), (w - o, h - o), (o, h - o)]
        pygame.draw.lines(self.screen, (0, 200, 0), True, outline2, 1)

    def display_points(self, color=(100, 100, 0), radius=3):
        for point in self.points:
            pygame.draw.circle(self.screen, color, point, radius, 2)

    def get_label_heights(self):
        for i in range((self.num_points + 1)//2):
            yield self.offset + 2 * i * self.font_size

    def get_label_endpoints(self):
        for y in self.get_label_heights():
            yield (self.offset, y)
            yield (self.width - self.offset, y)

    def get_all_lines(self):
        for point in self.points:
            for end_point in self.get_label_endpoints():
                yield Line(point, end_point)

    def display_label_lines(self, lines):
        for line in lines:
            pygame.draw.line(self.screen, (255, 0, 0), line.p1, line.p2, 1)

    def display_labels(self):
        myfont = pygame.font.SysFont("Comic Sans MS", self.font_size)
        label = myfont.render("label", True, (155, 155, 155))
        for y in self.get_label_heights():
            self.screen.blit(label, (self.offset//4 - 10, y - self.font_size//2))
            pygame.draw.line(self.screen, (255, 0, 0), (self.offset - self.offset//4, y), (self.offset, y), 1)
        for y in self.get_label_heights():
            self.screen.blit(label, (self.width - 2*self.offset//3, y - self.font_size//2))
            pygame.draw.line(self.screen, (255, 0, 0), (self.width - self.offset + self.offset//4, y), (self.width - self.offset, y), 1)

    def display(self):
        self.display_points()
        self.display_outline()
        self.display_labels()
        #self.display_label_lines(self.get_all_lines())
        self.display_label_lines(solver(self.get_all_lines()))
diagram = Diagram()
diagram.display()
pygame.display.flip()
running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
I think an actual solution to this problem lies on a slightly different layer. It doesn't seem to be a good idea to start solving an algorithmic problem while totally ignoring information design. There is an interesting example found here.
Let's identify some important questions:
How is the data best viewed?
Will it confuse people?
Is it readable?
Does it actually help to better understand the picture?
By the way, chaos is really confusing. We like order and predictability. There is no need to introduce additional informational noise to the initial image.
The readability of a graphical message is determined by the content and its presentation. Readability of a message involves the reader’s ability to understand the style of text and pictures. You have that interesting algorithmic task because of the additional "noisy" approach. Remove the chaos -- find better solution :)
Please note, this is just a PoC. The idea is to use only horizontal lines with clear markers. Label placement is straightforward and deterministic. Several similar ideas can be proposed.
With such approach you can easily balance left-right labels, avoid small vertical gaps between lines, provide optimal vertical density for labels, etc.
EDIT
Ok, let's see how initial process may look.
User story: as a user I want important images to be annotated in order to simplify understanding and increase their explanatory value.
Important assumptions:
initial image is a primary object for the user
readability is a must
So, the best possible solution is to have annotations without having them. (I would really suggest spending some time reading about the theory of inventive problem solving.)
Basically, there should be no obstacles for the user to see the initial picture, but annotations should be right there when needed. It can be slightly confusing, sorry for that.
Do you think the intersections issue is the only problem behind the following image?
Please note, the actual goal behind the developed approach is to provide two information flows (image and annotations) and help the user to understand everything as fast as possible. By the way, visual memory is also very important.
What lies behind human vision:
Selective attention
Familiarity detection
Pattern detection
Do you want to break at least one of these mechanisms? I hope you don't. Because it will make the actual result not very user-friendly.
So what can distract me?
strange lines randomly distributed over the image (random geometric objects are very distracting)
not uniform annotations placement and style
strange complex patterns as a result of final merge of the image and the annotation layer
Why should my proposal be considered?
It has a simple pattern, so pattern detection will let the user stop noticing the annotations and see the picture instead.
It has a uniform design, so familiarity detection will work too.
It does not affect the initial image as much as other solutions because the lines have minimal width.
Lines are horizontal and anti-aliasing is not used, so it preserves more information and provides a clean result.
Finally, it simplifies the routing algorithm a lot.
Some additional comments:
Do not use random points to test your algorithms; use simple yet important cases. You'll see that automated solutions sometimes fail dramatically.
I do not suggest using the approach I proposed as-is. There are a lot of possible enhancements.
What I'm really suggesting is to go one level up and do several iterations on the meta-level.
Grouping can be used to deal with the complex case mentioned by Robert King:
Or I can imagine for a second that some point is located slightly above its default location. But only for a second, because I do not want to break the main processing flow and affect other markers.
Thank you for reading.
You can find the center of your diagram, and then draw the lines from the points radially outward from the center. The only way you could have a crossing is if two of the points lie on the same ray, in which case you just shift one of the lines a bit one way, and shift the other a bit the other way, like so:
With only actual parts showing:
In case there are two or more points collinear with the center, you can shift the lines slightly to the side:
While this doesn't produce very good multi-segment leader lines, it very clearly labels the diagram. Also, to make it more visually appealing, it may be better to pick a point for the center that is actually the center of your object, rather than just the center of the point set.
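A minimal MATLAB sketch of the radial idea (the point set P, the label radius R and the numeric label text are placeholders of my own; the collinear-shift case is not handled):
P = rand(10, 2) * 100;                      % hypothetical marker coordinates
c = mean(P, 1);                             % center of the point set
ang = atan2(P(:,2) - c(2), P(:,1) - c(1));  % angle of each point from the center
R = 80;                                     % label radius, outside the drawing
Lx = c(1) + R * cos(ang);
Ly = c(2) + R * sin(ang);
figure; hold on
plot(P(:,1), P(:,2), 'ko');                 % the labeled points
plot([P(:,1) Lx]', [P(:,2) Ly]', 'r-');     % radial leader lines
text(Lx, Ly, cellstr(num2str((1:10)')));    % placeholder label text
axis equal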
I would add one more thing to your prototype - maybe it will be acceptable after this:
Iterate through every intersection and swap the two labels involved; repeat until there are no intersections left.
This process is finite, because the number of states is finite and every swap reduces the sum of all line lengths, so no cycle is possible.
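A rough MATLAB sketch of that swap loop (the point set P, label anchors L and the crossing test are placeholder assumptions of mine; the test ignores exactly collinear segments):
n = 8;
P = rand(n,2);                                        % marker positions
L = [zeros(n,1), linspace(0,1,n)'];                   % label anchor positions
perm = randperm(n);                                   % perm(i) = label assigned to point i
segcross = @(a,b,c,d) ...                             % do segments ab and cd cross?
    (sign(det([b-a; c-a])) ~= sign(det([b-a; d-a]))) && ...
    (sign(det([d-c; a-c])) ~= sign(det([d-c; b-c])));
changed = true;
while changed
    changed = false;
    for i = 1:n
        for j = i+1:n
            if segcross(P(i,:), L(perm(i),:), P(j,:), L(perm(j),:))
                perm([i j]) = perm([j i]);            % swap the two labels
                changed = true;
            end
        end
    end
end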
This problem can be cast as graph layout.
I recommend you look at e.g. the Graphviz library. I have not done any experiments, but believe that by expressing the points to be labeled and the labels themselves as nodes and the lead lines as edges, you would get good results.
You would have to express areas where labels should not go as "dummy" nodes that must not be overlapped.
Graphviz has bindings for many languages.
Even if Graphviz does not have quite enough flexibility to do exactly what you need, the "Theory" section of that page has references for energy minimization and spring algorithms that can be applied to your problem. The literature on graph layout is enormous.

Fastest way to fit a parabola to set of points?

Given a set of points, what's the fastest way to fit a parabola to them? Is it doing the least squares calculation or is there an iterative way?
Thanks
Edit:
I think gradient descent is the way to go. The least squares calculation would have been a little more taxing (having to do a QR decomposition or something to keep things stable).
If the points have no error associated with them, you may interpolate through three points. Otherwise, least squares or any equivalent formulation is the way to go.
I recently needed to find a parabola that passes through 3 points.
suppose you have (x1,y1), (x2,y2) and (x3,y3) and you want the parabola
y-y0 = a*(x-x0)^2
to pass through them: find y0, x0, and a.
You can do some algebra and get this solution (providing the points aren't all on a line) :
let c = (y1-y2) / (y2-y3)
x0 = ( -x1^2 + x2^2 + c*( x2^2 - x3^2 ) ) / (2.0*( -x1+x2 + c*x2 - c*x3 ))
a = (y1-y2) / ( (x1-x0)^2 - (x2-x0)^2 )
y0 = y1 - a*(x1-x0)^2
Note that in the equation for c, if y2 == y3 you've got a division by zero. So in my algorithm I check for this and swap, say, x1, y1 with x2, y2 and then proceed.
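A quick check of those formulas in MATLAB, using three hypothetical sample points (a case with y2 == y3 would need the swap described above):
x1 = 0; y1 = 1;  x2 = 1; y2 = 3;  x3 = 2; y3 = 9;   % made-up points
c  = (y1 - y2) / (y2 - y3);
x0 = (-x1^2 + x2^2 + c*(x2^2 - x3^2)) / (2*(-x1 + x2 + c*x2 - c*x3));
a  = (y1 - y2) / ((x1 - x0)^2 - (x2 - x0)^2);
y0 = y1 - a*(x1 - x0)^2;
% gives x0 = 0, y0 = 1, a = 2, i.e. y - 1 = 2*(x - 0)^2 through all three points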
hope that helps!
Paul Probert
A calculated solution is almost always faster than an iterative solution. The "exception" would be for low iteration counts and complex calculations.
I would use the least squares method. I've only ever coded it for linear regression fits, but it can be used for parabolas (I had reason to look it up recently - sources included an old edition of "Numerical Recipes", Press et al., and "Engineering Mathematics", Kreyszig).
ALGORITHM FOR PARABOLA
Read the number of data points n and the order of the polynomial Mp.
Read the data values.
If n < Mp
    [ Regression is not possible ]
    stop
else
    continue;
Set M = Mp + 1;
Compute the coefficients of the C-matrix.
Compute the coefficients of the B-matrix.
Solve for the coefficients a1, a2, ..., an.
Write the coefficients.
Estimate the function value at the given values of the independent variables.
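For a parabola specifically, that normal-equations machinery boils down to a single call in MATLAB; a minimal sketch with made-up data:
x = 0:0.5:5;                                  % sample x values (made up)
y = 3*x.^2 - 2*x + 1 + 0.1*randn(size(x));    % noisy parabolic data
p = polyfit(x, y, 2);                         % least-squares fit, p = [a2 a1 a0]
yfit = polyval(p, x);                         % evaluate the fitted parabola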
Using the free arbitrary accuracy math program "PARI" (for Mac or PC):
Here is how I would fit a parabola to a set of 641 points,
and I also show how to find the minimum of that parabola:
Set a high number of digits of precision:
\p 300
Write the data points to a text file, separated by one space for each data point (use ASCII characters in base ten, no space at the file start or end, no returns, and write extremely large or small floating-point numbers as, for example, "9.0E-23" but not "9.0D-23").
make a string to point to that file:
fileone="./desktop/data.txt"
read that file into PARI using the following instructions:
fileopen(fileone,r)
readsplit(file) = my(cmd);cmd="perl -ne \"chomp; print '[' . join(',', split(/ +/)) . ']\n';\"";eval(externstr(Str(cmd," ",file)))
readsplit(fileone)
Label that data with a name:
in = %
V = in[1]
Define a least squares fit function:
lsf(X,Y,n) = my(M=matrix(#X,n+1,i,j,X[i]^(j-1)));fit=Polrev(matsolve(M~*M,M~*Y~))
Apply that lsf function to your 641 data points:
lsf([-320..320],V, 2)
Then if you want to show the minimum of that parabolic fit, enter:
xextreme = solve (x=-1000,1000,eval(deriv(fit)));print (xextreme*(124.5678-123.5678)/640+(124.5678+123.5678)/2);x=xextreme;print(eval(fit))
(I had to adjust for my particular x-axis scaling before the "print" statement in that command line above).
(Note: a sacrifice made to simplify this algorithm causes it to work only when the data set has equally spaced x-axis coordinates.)
I was worried that my last post was too compact to follow and too hard to convert to other environments. I would like to show here how to solve the generalized problem of parabolic data fitting explicitly, without specialized matrix math terminology, and so that each multiplication, division, subtraction and addition can be seen at once. To save ink, this fit reparameterizes the x-axis as evenly spaced points centered on zero, so that odd-powered sums all get eliminated (saving a lot of space and time); the x-coordinates of the N data points are therefore effectively labeled by the points of this vector: X=[-(N-1)/2..(N-1)/2]. For example, "xextreme" will be returned versus those integer indices, and so (if desired) a simple linear transformation (which consumes very little CPU time) must be applied after the algorithm below to get it versus your problem's particular x-axis labels. This is written in the language of the free program "PARI", but all the commands are simple to translate to any language.
Step 1: assign a label to the y-axis data:
? V=[5,2,1,2,5]
"PARI" confirms that entry:
%280 = [5, 2, 1, 2, 5]
Then type in the following processing algorithm, which calculates a best-fit parabola through any y-axis data set with constant x-axis separation:
? g=#V;h=(g-1)*g*(g+1)/3;i=h*(3*g*g-7)/5;\
a=sum(i=1,g,V[i]);b=sum(i=1,g,(2*i-1-g)*V[i]);c=sum(i=1,g,(2*i-1-g)*(2*i-1-g)*V[i]);\
A=matdet([a,c;h,i])/matdet([g,h;h,i]);B=b/h*2;C=matdet([g,h;a,c])/matdet([g,h;h,i])*4;\
xextreme=-B/(2*C);yextreme=-B*B/(4*C)+A;fit=Polrev([A,B,C]);\
print("\n","y of extreme is ",yextreme,"\n","which occurs this many data points from center of data: ",xextreme)
(Note for non-PARI users: the command "matdet([a,c;h,i])" is just another way of entering "a*i-c*h".)
Those commands then produce the following screen output:
y of extreme is 1
which occurs this many data points from center of data: 0
The algorithm stores the polynomial of the fit in the variable "fit":
? fit
%282 = x^2 + 1
?
(Note that to make that algorithm short, the x-axis labels are assigned as X=[-(N-1)/2..(N-1)/2], thus here they are X=[-2,-1,0,1,2]. To correct that for the same polynomial as parameterized by an x-axis coordinate data set of, say, X=[-1,0,1,2,3], just apply a simple linear transform, in this case: "x^2 + 1" --> "(t - 1)^2 + 1".)
