Suggested algorithms/methods for laying out labels on an image - algorithm

Given an image and a set of labels attached to particular points on the image, I'm looking for an algorithm to lay out the labels to the sides of the image with certain constraints (roughly same number of labels on each side, labels roughly equidistant, lines connecting the labels to their respective points with no lines crossing).
Now, an approximate solution can typically be found quite naively by ordering the labels by Y coordinate (of the point they refer to), as in this example (proof of concept only, please ignore accuracy or otherwise of actual data!).
Now to satisfy the condition of no crossings, some ideas that occurred to me:
use a genetic algorithm to find an ordering of labels with no crossovers;
use another method (e.g. dynamic programming algorithm) to search for such an ordering;
use one of the above algorithms, allowing for variations in the spacing as well as ordering, to find the solution that minimises number of crossings and variation from even spacing;
maybe there are criteria I can use to brute search through every possible ordering of the labels within certain criteria (do not re-order two labels if their distance is greater than X);
if all else fails, just try millions of random orderings/spacing offsets and take the one that gives the minimum crossings/spacing variation. (Advantage: straightforward to program and will probably find a good enough solution; slight disadvantage, though not a show-stopper: maybe can't then run it on the fly during the application to allow user to change layout/size of the image.)
Before I embark on one of these, I would just welcome some other people's input: has anybody else experience with a similar problem and have any information to report on the success/failure of any of the above methods, or if they have a better/simpler solution that isn't occurring to me? Thanks for your input!

Lucas Bradsheet's honours thesis Labelling Maps using Multi-Objective Evolutionary Algorithms has quite a good discussion of this.
First off, this paper creates usable metrics for a number of metrics of labelling quality.
For example, clarity (how obvious the mapping between sites and labels was): clarity(s)=rs+rs1/rt
where rs is the distance between a site and its label and rt is the distance between a site and there closest other label).
It also has useful metrics for the conflicts between labels, sites and borders, as well as for measuring the density and symmetry of labels. Bradsheet then uses a multiple objective genetic algorithm to generate a "Pareto frontier" of feasible solutions. It also includes information about how he mutated the individuals, and some notes on improving the speed of the algorithm.
There's a lot of detail in it, and it should provide some good food for thought.

Let's forget about information design for a moment. This tasks recalls some memories related to PCB routing algorithms. Actually there are a lot of common requirements, including:
intersections optimization
size optimization
gaps optimization
So, it could be possible to turn the initial task into something similar to PCB routing.
There are a lot of information available, but I would suggest to look through Algorithmic studies on PCB routing by Tan Yan.
It provides a lot of details and dozens of hints.
Adaptation for the current task
The idea is to treat markers on the image and labels as two sets of pins and use escape routing to solve the task. Usually the PCB area is represented as an array of pins. Same can be done to the image with possible optimizations:
avoid low contrast areas
avoid text boxes if any
etc
So the task can be reduced to "routing in case of unused pins"
Final result can be really close to the requested style:
Algorithmic studies on PCB routing by Tan Yan is a good place to continue.
Additional notes
I chn change the style of the drawing a little bit, in order to accentuate the similarity.
It should not be a big problem to do some reverse transformation, keeping the good look and readability.
Anyway, adepts of simplicity (like me, for example) can spend several minutes and invent something better (or something different):
As for me, curves do not look like a complete solution, at least on this stage. Anyway, I've just tried to show there is room for enhancements, so PCB routing approach can be considered as an option.

One option is to turn it into an integer programming problem.
Lets say you have n points and n corresponding labels distributed around the outside of the diagram.
The number of possible lines is n^2, if we look at all possible intersections, there are less than n^4 intersections (if all possible lines were displayed).
In our integer programming problem we add the following constraints:
(to decide if a line is switched on (i.e. displayed to the screen) )
For each point on the diagram, only one of the possible n lines
connecting to it is to be switched on.
For each label, only one of the possible n lines connecting to it is
to be switched on.
For each pair of intersecting line segments line1 and line2, only
zero or one of these lines may be switched on.
Optionally, we can minimize the total distance of all the switched on lines. This enhances aesthetics.
When all of these constraints hold at the same time, we have a solution:
The code below produced the above diagram for 24 random points.
Once You start to get more than 15 or so points, the run time of the program will start to slow.
I used the PULP package with its default solver. I used PyGame for the display.
Here is the code:
__author__ = 'Robert'
import pygame
pygame.font.init()
import pulp
from random import randint
class Line():
def __init__(self, p1, p2):
self.p1 = p1
self.p2 = p2
self.length = (p1[0] - p2[0])**2 + (p1[1] - p2[1])**2
def intersect(self, line2):
#Copied some equations for wikipedia. Not sure if this is the best way to check intersection.
x1, y1 = self.p1
x2, y2 = self.p2
x3, y3 = line2.p1
x4, y4 = line2.p2
xtop = (x1*y2-y1*x2)*(x3-x4)-(x1-x2)*(x3*y4-y3*x4)
xbottom = (x1-x2)*(y3-y4) - (y1-y2)*(x3-x4)
ytop = (x1*y2-y1*x2)*(y3-y4)-(y1-y2)*(x3*y4-y3*x4)
ybottom = xbottom
if xbottom == 0:
#lines are parallel. Can only intersect if they are the same line. I'm not checking that however,
#which means there could be a rare bug that occurs if more than 3 points line up.
if self.p1 in (line2.p1, line2.p2) or self.p2 in (line2.p1, line2.p2):
return True
return False
x = float(xtop) / xbottom
y = float(ytop) / ybottom
if min(x1, x2) <= x <= max(x1, x2) and min(x3, x4) <= x <= max(x3, x4):
if min(y1, y2) <= y <= max(y1, y2) and min(y3, y4) <= y <= max(y3, y4):
return True
return False
def solver(lines):
#returns best line matching
lines = list(lines)
prob = pulp.LpProblem("diagram labelling finder", pulp.LpMinimize)
label_points = {} #a point at each label
points = {} #points on the image
line_variables = {}
variable_to_line = {}
for line in lines:
point, label_point = line.p1, line.p2
if label_point not in label_points:
label_points[label_point] = []
if point not in points:
points[point] = []
line_on = pulp.LpVariable("point{0}-point{1}".format(point, label_point),
lowBound=0, upBound=1, cat=pulp.LpInteger) #variable controls if line used or not
label_points[label_point].append(line_on)
points[point].append(line_on)
line_variables[line] = line_on
variable_to_line[line_on] = line
for lines_to_point in points.itervalues():
prob += sum(lines_to_point) == 1 #1 label to each point..
for lines_to_label in label_points.itervalues():
prob += sum(lines_to_label) == 1 #1 point for each label.
for line1 in lines:
for line2 in lines:
if line1 > line2 and line1.intersect(line2):
line1_on = line_variables[line1]
line2_on = line_variables[line2]
prob += line1_on + line2_on <= 1 #only switch one on.
#minimize length of switched on lines:
prob += sum(i.length * line_variables[i] for i in lines)
prob.solve()
print prob.solutionTime
print pulp.LpStatus[prob.status] #should say "Optimal"
print len(prob.variables())
for line_on, line in variable_to_line.iteritems():
if line_on.varValue > 0:
yield line #yield the lines that are switched on
class Diagram():
def __init__(self, num_points=20, width=700, height=800, offset=150):
assert(num_points % 2 == 0) #if even then labels align nicer (-:
self.background_colour = (255,255,255)
self.width, self.height = width, height
self.screen = pygame.display.set_mode((width, height))
pygame.display.set_caption('Diagram Labeling')
self.screen.fill(self.background_colour)
self.offset = offset
self.points = list(self.get_points(num_points))
self.num_points = num_points
self.font_size = min((self.height - 2 * self.offset)//num_points, self.offset//4)
def get_points(self, n):
for i in range(n):
x = randint(self.offset, self.width - self.offset)
y = randint(self.offset, self.height - self.offset)
yield (x, y)
def display_outline(self):
w, h = self.width, self.height
o = self.offset
outline1 = [(o, o), (w - o, o), (w - o, h - o), (o, h - o)]
pygame.draw.lines(self.screen, (0, 100, 100), True, outline1, 1)
o = self.offset - self.offset//4
outline2 = [(o, o), (w - o, o), (w - o, h - o), (o, h - o)]
pygame.draw.lines(self.screen, (0, 200, 0), True, outline2, 1)
def display_points(self, color=(100, 100, 0), radius=3):
for point in self.points:
pygame.draw.circle(self.screen, color, point, radius, 2)
def get_label_heights(self):
for i in range((self.num_points + 1)//2):
yield self.offset + 2 * i * self.font_size
def get_label_endpoints(self):
for y in self.get_label_heights():
yield (self.offset, y)
yield (self.width - self.offset, y)
def get_all_lines(self):
for point in self.points:
for end_point in self.get_label_endpoints():
yield Line(point, end_point)
def display_label_lines(self, lines):
for line in lines:
pygame.draw.line(self.screen, (255, 0, 0), line.p1, line.p2, 1)
def display_labels(self):
myfont = pygame.font.SysFont("Comic Sans MS", self.font_size)
label = myfont.render("label", True, (155, 155, 155))
for y in self.get_label_heights():
self.screen.blit(label, (self.offset//4 - 10, y - self.font_size//2))
pygame.draw.line(self.screen, (255, 0, 0), (self.offset - self.offset//4, y), (self.offset, y), 1)
for y in self.get_label_heights():
self.screen.blit(label, (self.width - 2*self.offset//3, y - self.font_size//2))
pygame.draw.line(self.screen, (255, 0, 0), (self.width - self.offset + self.offset//4, y), (self.width - self.offset, y), 1)
def display(self):
self.display_points()
self.display_outline()
self.display_labels()
#self.display_label_lines(self.get_all_lines())
self.display_label_lines(solver(self.get_all_lines()))
diagram = Diagram()
diagram.display()
pygame.display.flip()
running = True
while running:
for event in pygame.event.get():
if event.type == pygame.QUIT:
running = False

I think an actual solution of this problem is on the slightly different layer. It doesn't seem to be right idea to start solving algorithmic problem totally ignoring Information design. There is an interesting example found here
Let's identify some important questions:
How is the data best viewed?
Will it confuse people?
Is it readable?
Does it actually help to better understand the picture?
By the way, chaos is really confusing. We like order and predictability. There is no need to introduce additional informational noise to the initial image.
The readability of a graphical message is determined by the content and its presentation. Readability of a message involves the reader’s ability to understand the style of text and pictures. You have that interesting algorithmic task because of the additional "noisy" approach. Remove the chaos -- find better solution :)
Please note, this is just a PoC. The idea is to use only horizontal lines with clear markers. Labels placement is straightforward and deterministic. Several similar ideas can be proposed.
With such approach you can easily balance left-right labels, avoid small vertical gaps between lines, provide optimal vertical density for labels, etc.
EDIT
Ok, let's see how initial process may look.
User story: as a user I want important images to be annotated in order to simplify understanding and increase it's explanatory value.
Important assumptions:
initial image is a primary object for the user
readability is a must
So, the best possible solution is to have annotations but do not have them. (I would really suggest to spend some time reading about the theory of inventive problem solving).
Basically, there should be no obstacles for the user to see the initial picture, but annotations should be right there when needed. It can be slightly confusing, sorry for that.
Do you think intersections issue is the only one behind the following image?
Please note, the actual goal behind the developed approach is to provide two information flows (image and annotations) and help the user to understand everything as fast as possible. By the way, vision memory is also very important.
What are behind human vision:
Selective attention
Familiarity detection
Pattern detection
Do you want to break at least one of these mechanisms? I hope you don't. Because it will make the actual result not very user-friendly.
So what can distract me?
strange lines randomly distributed over the image (random geometric objects are very distractive)
not uniform annotations placement and style
strange complex patterns as a result of final merge of the image and the annotation layer
Why my proposal should be considered?
It has simple pattern, so pattern detection will let the user stop noticing annotations, but see the picture instead
It has uniform design, so familiarity detection will work too
It does not affect initial image so much as other solutions because lines have minimal width.
Lines are horizontal, anti-aliasing is not used, so it saves more information and provides clean result
Finally, it does simplify routing algorithm a lot.
Some additional comments:
Do not use random points to test your algorithms, use simple but yet important cases. You'll see automated solutions sometimes may fail dramatically.
I do not suggest to use approach proposed by me as is. There are a lot of possible enhancements.
What I'm really suggest is to go one level up and do several iterations on the meta-level.
Grouping can be used to deal with the complex case, mentioned by Robert King:
Or I can imagine for a second some point is located slightly above it's default location. But only for a second, because I do not want to break the main processing flow and affect other markers.
Thank you for reading.

You can find the center of your diagram, and then draw the lines from the points radially outward from the center. The only way you could have a crossing is if two of the points lie on the same ray, in which case you just shift one of the lines a bit one way, and shift the other a bit the other way, like so:
With only actual parts showing:
In case there are two or more points colinear with the center, you can shift the lines slightly to the side:
While this doen't produce very good multisegment line things, it very clearly labels the diagram. Also, to make it more fisually appealing, it may be better to pick a point for the center that is actually the center of your object, rather than just the center of the point set.

I would add one more thing to your prototype - may be it will be acceptable after this:
Iterate through every intersection and swap labels, repeat until there are intersections.
This process is finite, because number of states is finite and every swap reduces sum of all line lengths - so no loop is possible.

This problem can be cast as graph layout.
I recommend you look at e.g. the Graphviz library. I have not done any experiments, but believe that by expressing the points to be labeled and the labels themselves as nodes and the lead lines as edges, you would get good results.
You would have to express areas where labels should not go as "dummy" nodes not to be overlapped.
Graphvis has bindings for many languages.
Even if Graphviz does not have quite enough flexibility to do exactly what you need, the "Theory" section of that page has references for energy minimization and spring algorithms that can be applied to your problem. The literature on graph layout is enormous.

Related

Understanding Support Vector Regression (SVR) [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
Improve this question
I'm working with SVR, and using this resource. Erverything is super clear, with epsilon intensive loss function (from figure). Prediction comes with tube, to cover most training sample, and generalize bounds, using support vectors.
Then we have this explanation. This can be described by introducing (non-negative) slack variables , to measure the deviation of training samples outside -insensitive zone. I understand this error, outside tube, but don't know, how we can use this in optimization. Could somebody explain this?
In local source. I'm trying to achieve very simple optimization solution, without libraries. This what I have for loss function.
import numpy as np
# Kernel func, linear by default
def hypothesis(x, weight, k=None):
k = k if k else lambda z : z
k_x = np.vectorize(k)(x)
return np.dot(k_x, np.transpose(weight))
.......
import math
def boundary_loss(x, y, weight, epsilon):
prediction = hypothesis(x, weight)
scatter = np.absolute(
np.transpose(y) - prediction)
bound = lambda z: z \
if z >= epsilon else 0
return np.sum(np.vectorize(bound)(scatter))
First, let's look at the objective function. The first term, 1/2 * w^2 (wish this site had LaTeX support but this will suffice) correlates with the margin of the SVM. The article you linked doesn't, in my opinion, explain this very well and calls this term describing "the model's complexity", but perhaps this is not the best way of explaining it. Minimizing this term maximizes the margin (while still representing the data well), which is the predominant goal of using SVM's doing regression.
Warning, Math Heavy Explanation: The reason this is the case is that when maximizing the margin, you want to find the "farthest" non-outlier points right on the margin and minimize its distance. Let this farthest point be x_n. We want to find its Euclidean distance d from the plane f(w, x) = 0, which I will rewrite as w^T * x + b = 0 (where w^T is just the transpose of the weights matrix so that we can multiply the two). To find the distance, let us first normalize the plane such that |w^T * x_n + b| = epsilon, which we can do WLOG as w is still able to form all possible planes of the form w^T * x + b= 0. Then, let's note that w is perpendicular to the plane. This is obvious if you have dealt a lot with planes (particularly in vector calculus), but can be proven by choosing two points on the plane x_1 and x_2, then noticing that w^T * x_1 + b = 0, and w^T * x_2 + b = 0. Subtracting the two equations we get w^T(x_1 - x_2) = 0. Since x_1 - x_2 is just any vector strictly on the plane, and its dot product with w is 0, then we know that w is perpendicular to the plane. Finally, to actually calculate the distance between x_n and the plane, we take the vector formed by x_n' and some point on the plane x' (The vectors would then be x_n - x', and projecting it onto the vector w. Doing this, we get d = |w * (x_n - x') / |w||, which we can rewrite as d = (1 / |w|) * | w^T * x_n - w^T x'|, and then add and subtract b to the inside to get d = (1 / |w|) * | w^T * x_n + b - w^T * x' - b|. Notice that w^T * x_n + b is epsilon (from our normalization above), and that w^T * x' + b is 0, as this is just a point on our plane. Thus, d = epsilon / |w|. Notice that maximizing this distance subject to our constraint of finding the x_n and having |w^T * x_n + b| = epsilon is a difficult optimization problem. What we can do is restructure this optimization problem as minimizing 1/2 * w^T * w subject to the first two constraints in the picture you attached, that is, |y_i - f(x_i, w)| <= epsilon. You may think that I have forgotten the slack variables, and this is true, but when just focusing on this term and ignoring the second term, we ignore the slack variables for now, I will bring them back later. The reason these two optimizations are equivalent is not obvious, but the underlying reason lies in discrimination boundaries, which you are free to read more about (it's a lot more math that frankly I don't think this answer needs more of). Then, note that minimizing 1/2 * w^T * w is the same as minimizing 1/2 * |w|^2, which is the desired result we were hoping for. End of the Heavy Math
Now, notice that we want to make the margin big, but not so big that includes noisy outliers like the one in the picture you provided.
Thus, we introduce a second term. To motivate the margin down to a reasonable size the slack variables are introduced, (I will call them p and p* because I don't want to type out "psi" every time). These slack variables will ignore everything in the margin, i.e. those are the points that do not harm the objective and the ones that are "correct" in terms of their regression status. However, the points outside the margin are outliers, they do not reflect well on the regression, so we penalize them simply for existing. The slack error function that is given there is relatively easy to understand, it just adds up the slack error of every point (p_i + p*_i) for i = 1,...,N, and then multiplies by a modulating constant C which determines the relative importance of the two terms. A low value of C means that we are okay with having outliers, so the margin will be thinned and more outliers will be produced. A high value of C indicates that we care a lot about not having slack, so the margin will be made bigger to accommodate these outliers at the expense of representing the overall data less well.
A few things to note about p and p*. First, note that they are both always >= 0. The constraint in your picture shows this, but it also intuitively makes sense as slack should always add to the error, so it is positive. Second, notice that if p > 0, then p* = 0 and vice versa as an outlier can only be on one side of the margin. Last, all points inside the margin will have p and p* be 0, since they are fine where they are and thus do not contribute to the loss.
Notice that with the introduction of the slack variables, if you have any outliers then you won't want the condition from the first term, that is, |w^T * x_n + b| = epsilon as the x_n would be this outlier, and your whole model would be screwed up. What we allow for, then, is to change the constraint to be |w^T * x_n + b| = epsilon + (p + p*). When translated to the new optimization's constraint, we get the full constraint from the picture you attached, that is, |y_i - f(x_i, w)| <= epsilon + p + p*. (I combined the two equations into one here, but you could rewrite them as the picture is and that would be the same thing).
Hopefully after covering all this up, the motivation for the objective function and the corresponding slack variables makes sense to you.
If I understand the question correctly, you also want code to calculate this objective/loss function, which I think isn't too bad. I have not tested this (yet), but I think this should be what you want.
# Function for calculating the error/loss for a SVM. I assume that:
# - 'x' is 2d array representing the vectors of the data points
# - 'y' is an array representing the values each vector actually gives
# - 'weights' is an array of weights that we tune for the regression
# - 'epsilon' is a scalar representing the breadth of our margin.
def optimization_objective(x, y, weights, epsilon):
# Calculates first term of objective (note that norm^2 = dot product)
margin_term = np.dot(weight, weight) / 2
# Now calculate second term of objective. First get the sum of slacks.
slack_sum = 0
for i in range(len(x)): # For each observation
# First find the absolute distance between expected and observed.
diff = abs(hypothesis(x[i]) - y[i])
# Now subtract epsilon
diff -= epsilon
# If diff is still more than 0, then it is an 'outlier' and will have slack.
slack = max(0, diff)
# Add it to the slack sum
slack_sum += slack
# Now we have the slack_sum, so then multiply by C (I picked this as 1 aribtrarily)
C = 1
slack_term = C * slack_sum
# Now, simply return the sum of the two terms, and we are done.
return margin_term + slack_term
I got this function working on my computer with small data, and you may have to change it a little to work with your data if, for example, the arrays are structured differently, but the idea is there. Also, I am not the most proficient with python, so this may not be the most efficient implementation, but my intent was to make it understandable.
Now, note that this just calculates the error/loss (whatever you want to call it). To actually minimize it requires going into Lagrangians and intense quadratic programming which is a much more daunting task. There are libraries available for doing this but if you want to do this library free as you are doing with this, I wish you good luck because doing that is not a walk in the park.
Finally, I would like to note that most of this information I got from notes I took in my ML class I took last year, and the professor (Dr. Abu-Mostafa) was a great help to have me learn the material. The lectures for this class are online (by the same prof), and the pertinent ones for this topic are here and here (although in my very biased opinion you should watch all the lectures, they were a great help). Leave a comment/question if you need anything cleared up or if you think I made a mistake somewhere. If you still don't understand, I can try to edit my answer to make more sense. Hope this helps!

Intersection of axis-aligned rectangular cuboids (MBR) in one dimension

Currently I'm doing benchmarks on time series indexing algorithms. Since most of the time no reference implementations are available, I have to write my own implementations (all in Java). At the moment I am stuck a little at section 6.2 of a paper called Indexing multi-dimensional time-series with support for multiple distance measures available here in PDF : http://hadjieleftheriou.com/papers/vldbj04-2.pdf
A MBR (minimum bounding rectangle) is basically a rectanglular cubiod with some coordinates and directions. As an example P and Q are two MBRs with P.coord={0,0,0} and P.dir={1,1,3} and Q.coords={0.5,0.5,1} and Q.dir={1,1,1} where the first entries represent the time dimension.
Now I would like to calculate the MINDIST(Q,P) between Q and P :
However I am not sure how to implement the "intersection of two MBRs in the time dimension" (Dim 1) since I am not sure what the intersection in the time dimension actually means. It is also not clear what h_Q, l_Q, l_P, h_P mean, since this notation is not explained (my guess is they mean something like highest or lowest value of a dimension in the intersection).
I would highly appreciate it, if someone could explain to me how to calculate the intersection of two MBRs in the first dimension and maybe enlighten me with an interpretation of the notation. Thanks!
Well, Figure 14 in your paper explains the time intersection. And the rectangles are axis-aligned, thus it makes sense to use high and low on each coordinate.
The multiplication sign you see is not a cross product, just a normal multiplication, because on both sides of it you have a scalar, and not vectors.
However I must agree that the discussions on page 14 are rather fuzzy, but they seem to tell us that both types of intersections (complete and partial), when they are have a t subscript, mean the norm of the intersection along the t coordinate.
Thus it seems you could factorize the time intersection to get a formula that would be :
It is worth noting that, maybe counter-intuitively, when your objects don't intersect on the time plane, their MINDIST is defined to be 0.
Hence the following pseudo-code ;
mindist(P, Q)
{
if( Q.coord[0] + Q.dir[0] < P.coord[0] ||
Q.coord[0] > P.coord[0] + P.dir[0] )
return 0;
time = min(Q.coord[0] + Q.dir[0], P.coord[0] + P.dir[0]) - max(Q.coord[0], P.coord[0]);
sum = 0;
for(d=1; d<D; ++d)
{
if( Q.coord[d] + Q.dir[d] < P.coord[d] )
x = Q.coord[d] + Q.dir[d] - P.coord[d];
else if( P.coord[d] + P.dir[d] < Q.coord[d] )
x = P.coord[d] + P.dir[d] - Q.coord[d];
else
x = 0;
sum += x*x;
}
return sqrt(time * sum);
}
Note the absolute values in the paper are unnecessary since we just checked which values where bigger, and we thus know we only add positive numbers.

Multiliteration implementation with inaccurate distance data

I am trying to create an android smartphone application which uses Apples iBeacon technology to determine the current indoor location of itself. I already managed to get all available beacons and calculate the distance to them via the rssi signal.
Currently I face the problem, that I am not able to find any library or implementation of an algorithm, which calculates the estimated location in 2D by using 3 (or more) distances of fixed points with the condition, that these distances are not accurate (which means, that the three "trilateration-circles" do not intersect in one point).
I would be deeply grateful if anybody can post me a link or an implementation of that in any common programming language (Java, C++, Python, PHP, Javascript or whatever). I already read a lot on stackoverflow about that topic, but could not find any answer I were able to convert in code (only some mathematical approaches with matrices and inverting them, calculating with vectors or stuff like that).
EDIT
I thought about an own approach, which works quite well for me, but is not that efficient and scientific. I iterate over every meter (or like in my example 0.1 meter) of the location grid and calculate the possibility of that location to be the actual position of the handset by comparing the distance of that location to all beacons and the distance I calculate with the received rssi signal.
Code example:
public Location trilaterate(ArrayList<Beacon> beacons, double maxX, double maxY)
{
for (double x = 0; x <= maxX; x += .1)
{
for (double y = 0; y <= maxY; y += .1)
{
double currentLocationProbability = 0;
for (Beacon beacon : beacons)
{
// distance difference between calculated distance to beacon transmitter
// (rssi-calculated distance) and current location:
// |sqrt(dX^2 + dY^2) - distanceToTransmitter|
double distanceDifference = Math
.abs(Math.sqrt(Math.pow(beacon.getLocation().x - x, 2)
+ Math.pow(beacon.getLocation().y - y, 2))
- beacon.getCurrentDistanceToTransmitter());
// weight the distance difference with the beacon calculated rssi-distance. The
// smaller the calculated rssi-distance is, the more the distance difference
// will be weighted (it is assumed, that nearer beacons measure the distance
// more accurate)
distanceDifference /= Math.pow(beacon.getCurrentDistanceToTransmitter(), 0.9);
// sum up all weighted distance differences for every beacon in
// "currentLocationProbability"
currentLocationProbability += distanceDifference;
}
addToLocationMap(currentLocationProbability, x, y);
// the previous line is my approach, I create a Set of Locations with the 5 most probable locations in it to estimate the accuracy of the measurement afterwards. If that is not necessary, a simple variable assignment for the most probable location would do the job also
}
}
Location bestLocation = getLocationSet().first().location;
bestLocation.accuracy = calculateLocationAccuracy();
Log.w("TRILATERATION", "Location " + bestLocation + " best with accuracy "
+ bestLocation.accuracy);
return bestLocation;
}
Of course, the downside of that is, that I have on a 300m² floor 30.000 locations I had to iterate over and measure the distance to every single beacon I got a signal from (if that would be 5, I do 150.000 calculations only for determine a single location). That's a lot - so I will let the question open and hope for some further solutions or a good improvement of this existing solution in order to make it more efficient.
Of course it has not to be a Trilateration approach, like the original title of this question was, it is also good to have an algorithm which includes more than three beacons for the location determination (Multilateration).
If the current approach is fine except for being too slow, then you could speed it up by recursively subdividing the plane. This works sort of like finding nearest neighbors in a kd-tree. Suppose that we are given an axis-aligned box and wish to find the approximate best solution in the box. If the box is small enough, then return the center.
Otherwise, divide the box in half, either by x or by y depending on which side is longer. For both halves, compute a bound on the solution quality as follows. Since the objective function is additive, sum lower bounds for each beacon. The lower bound for a beacon is the distance of the circle to the box, times the scaling factor. Recursively find the best solution in the child with the lower lower bound. Examine the other child only if the best solution in the first child is worse than the other child's lower bound.
Most of the implementation work here is the box-to-circle distance computation. Since the box is axis-aligned, we can use interval arithmetic to determine the precise range of distances from box points to the circle center.
P.S.: Math.hypot is a nice function for computing 2D Euclidean distances.
Instead of taking confidence levels of individual beacons into account, I would instead try to assign an overall confidence level for your result after you make the best guess you can with the available data. I don't think the only available metric (perceived power) is a good indication of accuracy. With poor geometry or a misbehaving beacon, you could be trusting poor data highly. It might make better sense to come up with an overall confidence level based on how well the perceived distance to the beacons line up with the calculated point assuming you trust all beacons equally.
I wrote some Python below that comes up with a best guess based on the provided data in the 3-beacon case by calculating the two points of intersection of circles for the first two beacons and then choosing the point that best matches the third. It's meant to get started on the problem and is not a final solution. If beacons don't intersect, it slightly increases the radius of each up until they do meet or a threshold is met. Likewise, it makes sure the third beacon agrees within a settable threshold. For n-beacons, I would pick 3 or 4 of the strongest signals and use those. There are tons of optimizations that could be done and I think this is a trial-by-fire problem due to the unwieldy nature of beaconing.
import math
beacons = [[0.0,0.0,7.0],[0.0,10.0,7.0],[10.0,5.0,16.0]] # x, y, radius
def point_dist(x1,y1,x2,y2):
x = x2-x1
y = y2-y1
return math.sqrt((x*x)+(y*y))
# determines two points of intersection for two circles [x,y,radius]
# returns None if the circles do not intersect
def circle_intersection(beacon1,beacon2):
r1 = beacon1[2]
r2 = beacon2[2]
dist = point_dist(beacon1[0],beacon1[1],beacon2[0],beacon2[1])
heron_root = (dist+r1+r2)*(-dist+r1+r2)*(dist-r1+r2)*(dist+r1-r2)
if ( heron_root > 0 ):
heron = 0.25*math.sqrt(heron_root)
xbase = (0.5)*(beacon1[0]+beacon2[0]) + (0.5)*(beacon2[0]-beacon1[0])*(r1*r1-r2*r2)/(dist*dist)
xdiff = 2*(beacon2[1]-beacon1[1])*heron/(dist*dist)
ybase = (0.5)*(beacon1[1]+beacon2[1]) + (0.5)*(beacon2[1]-beacon1[1])*(r1*r1-r2*r2)/(dist*dist)
ydiff = 2*(beacon2[0]-beacon1[0])*heron/(dist*dist)
return (xbase+xdiff,ybase-ydiff),(xbase-xdiff,ybase+ydiff)
else:
# no intersection, need to pseudo-increase beacon power and try again
return None
# find the two points of intersection between beacon0 and beacon1
# will use beacon2 to determine the better of the two points
failing = True
power_increases = 0
while failing and power_increases < 10:
res = circle_intersection(beacons[0],beacons[1])
if ( res ):
intersection = res
else:
beacons[0][2] *= 1.001
beacons[1][2] *= 1.001
power_increases += 1
continue
failing = False
# make sure the best fit is within x% (10% of the total distance from the 3rd beacon in this case)
# otherwise the results are too far off
THRESHOLD = 0.1
if failing:
print 'Bad Beacon Data (Beacon0 & Beacon1 don\'t intersection after many "power increases")'
else:
# finding best point between beacon1 and beacon2
dist1 = point_dist(beacons[2][0],beacons[2][1],intersection[0][0],intersection[0][1])
dist2 = point_dist(beacons[2][0],beacons[2][1],intersection[1][0],intersection[1][1])
if ( math.fabs(dist1-beacons[2][2]) < math.fabs(dist2-beacons[2][2]) ):
best_point = intersection[0]
best_dist = dist1
else:
best_point = intersection[1]
best_dist = dist2
best_dist_diff = math.fabs(best_dist-beacons[2][2])
if best_dist_diff < THRESHOLD*best_dist:
print best_point
else:
print 'Bad Beacon Data (Beacon2 distance to best point not within threshold)'
If you want to trust closer beacons more, you may want to calculate the intersection points between the two closest beacons and then use the farther beacon to tie-break. Keep in mind that almost anything you do with "confidence levels" for the individual measurements will be a hack at best. Since you will always be working with very bad data, you will defintiely need to loosen up the power_increases limit and threshold percentage.
You have 3 points : A(xA,yA,zA), B(xB,yB,zB) and C(xC,yC,zC), which respectively are approximately at dA, dB and dC from you goal point G(xG,yG,zG).
Let's say cA, cB and cC are the confidence rate ( 0 < cX <= 1 ) of each point.
Basically, you might take something really close to 1, like {0.95,0.97,0.99}.
If you don't know, try different coefficient depending of distance avg. If distance is really big, you're likely to be not very confident about it.
Here is the way i'll do it :
var sum = (cA*dA) + (cB*dB) + (cC*dC);
dA = cA*dA/sum;
dB = cB*dB/sum;
dC = cC*dC/sum;
xG = (xA*dA) + (xB*dB) + (xC*dC);
yG = (yA*dA) + (yB*dB) + (yC*dC);
xG = (zA*dA) + (zB*dB) + (zC*dC);
Basic, and not really smart but will do the job for some simple tasks.
EDIT
You can take any confidence coef you want in [0,inf[, but IMHO, restraining at [0,1] is a good idea to keep a realistic result.

Need Help in Contour Plot in Matlab

I am trying to plot contour of the Shifted Schwefel Problem function but keep having this error: Z must be size 2x2 or greater. I have searched on this forum and the information i have helped a little, but could not fix the above error. The information i got from this forum lead me to trying this code:
min = -50;
max = 50;
steps = 20;
c = linspace(min, max, steps); % Create the mesh
[x, y] = meshgrid(c, c); % Create the grid
%o=-50+100*rand(1,2);
%c = c - repmat(o,1,10);
for I=1:length(x)
for J=1:length(y)
o=-50+100*rand(1,2);
x=x-repmat(o,20,10);
f = max(abs(x), [], 2);
end
end
figure,
contour(x,y,f);
figure,
surfc(x, y,f);
Now i have error that z, which the the value of f most be atleast 2x2 or greater. I know my f is taking only one input and therefore will output only one. I tried having it in a nested for loops, but still giving me a array of vectors not matrix of atleast 2x2. if the input was two, then the problem will be fine, but the problem is, it is one input. Does anyone know how i can make this "f" output a matrix of atleast 2x2 so that i can plot the z of the contour?
There are a few things to note:
1.) As Jacob Robbins pointed out correctly in his comment, you should avoid using names from Matlab functions as variable names (in your case min and max). One very easy way to do this, is to use only upper case letters for variable names.
2.) You are correct in saying that your fis only one output (though one output in this case is not a single number, but a vector). That is, because you don't assign any indexing to it within the loop.
3.) Yes, both contour and surfc need at least 2x2 - because they plot information on a grid, which is itself at least 2x2 in nature.
4.) In your particular case, two loops may not be necessary. You seem to only be manipulating the x-vector and your grid is regular. Hence you might want to try this loop:
for I=1:length(x)
o=-50+100*rand(1,2);
x=x-repmat(o,20,10);
f(:,I) = max(abs(x), [], 2);
end
Now, f will be of size 20x20, which corresponds to your x- and y-grid. Also, now your contour and surfc command will produce plots.
5.) One last comment: The output of your function and the results of a web-search for "Shifted Schwefel function" are very different. But the question if your implementation of the Shifted Schwefel function is correct, should be asked as a new question.

Fastest way to fit a parabola to set of points?

Given a set of points, what's the fastest way to fit a parabola to them? Is it doing the least squares calculation or is there an iterative way?
Thanks
Edit:
I think gradient descent is the way to go. The least squares calculation would have been a little bit more taxing (having to do qr decomposition or something to keep things stable).
If the points have no error associated, you may interpolate by three points. Otherwise least squares or any equivalent formulation is the way to go.
I recently needed to find a parabola that passes through 3 points.
suppose you have (x1,y1), (x2,y2) and (x3,y3) and you want the parabola
y-y0 = a*(x-x0)^2
to pass through them: find y0, x0, and a.
You can do some algebra and get this solution (providing the points aren't all on a line) :
let c = (y1-y2) / (y2-y3)
x0 = ( -x1^2 + x2^2 + c*( x2^2 - x3^2 ) ) / (2.0*( -x1+x2 + c*x2 - c*x3 ))
a = (y1-y2) / ( (x1-x0)^2 - (x2-x0)^2 )
y0 = y1 - a*(x1-x0)^2
Note in the equation for c if y2==y3 then you've got a problem. So in my algorithm I check for this and swap say x1, y1 with x2, y2 and then proceed.
hope that helps!
Paul Probert
A calculated solution is almost always faster than an iterative solution. The "exception" would be for low iteration counts and complex calculations.
I would use the least squares method. I've only every coded it for linear regression fits but it can be used for parabolas (I had reason to look it up recently - sources included an old edition of "Numerical Recipes" Press et al; and "Engineering Mathematics" Kreyzig).
ALGORITHM FOR PARABOLA
Read no. of data points n and order of polynomial Mp .
Read data values .
If n< Mp
[ Regression is not possible ]
stop
else
continue ;
Set M=Mp + 1 ;
Compute co-efficient of C-matrix .
Compute co-efficient of B-matrix .
Solve for the co-efficients
a1,a2,. . . . . . . an .
Write the co-efficient .
Estimate the function value at the glren of independents variables .
Using the free arbitrary accuracy math program "PARI" (for Mac or PC):
Here is how I would fit a parabola to a set of 641 points,
and I also show how to find the minimum of that parabola:
Set a high number of digits of precision:
\p 300
Write the data points to a text file separated by one space
for each data point
(use ASCII characters in base ten, no space at file start or file end, and no returns, write extremely large or small floating points as for example
"9.0E-23" but not "9.0D-23" ).
make a string to point to that file:
fileone="./desktop/data.txt"
read that file into PARI using the following instructions:
fileopen(fileone,r)
readsplit(file) = my(cmd);cmd="perl -ne \"chomp; print '[' . join(',', split(/ +/)) . ']\n';\"";eval(externstr(Str(cmd," ",file)))
readsplit(fileone)
Label that data with a name:
in = %
V = in[1]
Define a least squares fit function:
lsf(X,Y,n) = my(M=matrix(#X,n+1,i,j,X[i]^(j-1)));fit=Polrev(matsolve(M~*M,M~*Y~))
Apply that lsf function to your 641 data points:
lsf([-320..320],V, 2)
Then if you want to show the minimum of that parabolic fit, enter:
xextreme = solve (x=-1000,1000,eval(deriv(fit)));print (xextreme*(124.5678-123.5678)/640+(124.5678+123.5678)/2);x=xextreme;print(eval(fit))
(I had to adjust for my particular x-axis scaling before the "print" statement in that command line above).
(Note: A sacrifice made to simplify this algorithm
causes it to work only
when the data set has equally spaced x-axis coordinates.)
I was worried that my last post
was too compact to follow and
too hard to convert to other environments.
I would like to show here how to solve the
generalized problem of parabolic data fitting explicitly
without specialized matrix math terminology;
and so that each multiplication, division,
subtraction and addition can be seen at once.
To save ink this fit reparameterizes the x-axis as evenly
spaced points centered on zero
so that odd powered sums all get eliminated
(saving a lot of space and time),
so the x-coordinates of the N data points
are effectively labeled by points
of this vector: X=[-(N-1)/2..(N-1)/2].
For example "xextreme" will be returned
versus those integer indices
and so (if desired) a simple (consumes very little CPU time)
linear transformation must be applied after the algorithm below
to get it versus your problem's particular x-axis labels.
This is written in the language of
the free program "PARI" but all the
commands are simple to translate to any language.
Step 1: assign a label to the y-axis data:
? V=[5,2,1,2,5]
"PARI" confirms that entry:
%280 = [5, 2, 1, 2, 5]
Then type in the following processing algorithm
which calculates a best fit parabola
through any y-axis data set with constant x-axis separation:
? g=#V;h=(g-1)*g*(g+1)/3;i=h*(3*g*g-7)/5;\
a=sum(i=1,g,V[i]);b=sum(i=1,g,(2*i-1-g)*V[i]);c=sum(i=1,g,(2*i-1-g)*(2*i-1-g)*V[i]);\
A=matdet([a,c;h,i])/matdet([g,h;h,i]);B=b/h*2;C=matdet([g,h;a,c])/matdet([g,h;h,i])*4;\
xextreme=-B/(2*C);yextreme=-B*B/(4*C)+A;fit=Polrev([A,B,C]);\
print("\n","y of extreme is ",yextreme,"\n","which occurs this many data points from center of data: ",xextreme)
(Note for non-PARI users:
the command "matdet([a,c;h,i])"
is just another way of entering "a*i-c*h")
Those commands then produce the following screen output:
y of extreme is 1
which occurs this many data points from center of data: 0
The algorithm stores the polynomial of the fit in the variable "fit":
? fit
%282 = x^2 + 1
?
(Note that to make that algorithm short
the x-axis labels are assigned as X=[-(N-1)/2..(N-1)/2],
thus they are X=[-2,-1,0,1,2]
To correct that
for the same polynomial as parameterized
by an x-axis coordinate data set of say X=[−1,0,1,2,3]:
just apply a simple linear transform, in this case:
"x^2 + 1" --> "(t - 1)^2 + 1".)

Resources