I'm getting thoroughly confused over matrix definitions. I have a matrix class, which holds a float[16] which I assumed is row-major, based on the following observations:
float matrixA[16] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
float matrixB[4][4] = { { 0, 1, 2, 3 }, { 4, 5, 6, 7 }, { 8, 9, 10, 11 }, { 12, 13, 14, 15 } };
matrixA and matrixB both have the same linear layout in memory (i.e. all numbers are in order). According to http://en.wikipedia.org/wiki/Row-major_order this indicates a row-major layout.
matrixA[0] == matrixB[0][0];
matrixA[3] == matrixB[0][3];
matrixA[4] == matrixB[1][0];
matrixA[7] == matrixB[1][3];
Therefore, matrixB[0] = row 0, matrixB[1] = row 1, etc. Again, this indicates row-major layout.
My problem / confusion comes when I create a translation matrix which looks like:
1, 0, 0, transX
0, 1, 0, transY
0, 0, 1, transZ
0, 0, 0, 1
Which is laid out in memory as, { 1, 0, 0, transX, 0, 1, 0, transY, 0, 0, 1, transZ, 0, 0, 0, 1 }.
Then when I call glUniformMatrix4fv, I need to set the transpose flag to GL_FALSE, indicating that it's column-major, else transforms such as translate / scale etc don't get applied correctly:
If transpose is GL_FALSE, each matrix is assumed to be supplied in
column major order. If transpose is GL_TRUE, each matrix is assumed to
be supplied in row major order.
Why does my matrix, which appears to be row-major, need to be passed to OpenGL as column-major?
The matrix notation used in the OpenGL documentation does not describe the in-memory layout of OpenGL matrices.
I think it'll be easier if you drop/forget about the entire "row/column-major" thing. That's because, besides choosing row/column major, the programmer can also decide how to lay out the matrix in memory (whether adjacent elements form rows or columns), and the notation on top of that adds to the confusion.
OpenGL matrices have the same memory layout as DirectX matrices.
x.x x.y x.z 0
y.x y.y y.z 0
z.x z.y z.z 0
p.x p.y p.z 1
or
{ x.x x.y x.z 0 y.x y.y y.z 0 z.x z.y z.z 0 p.x p.y p.z 1 }
x, y, z are 3-component vectors describing the matrix coordinate system (the local coordinate system relative to the global coordinate system).
p is a 3-component vector describing the origin of the matrix coordinate system.
Which means that the translation matrix should be laid out in memory like this:
{ 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, transX, transY, transZ, 1 }.
Leave it at that, and the rest should be easy.
---citation from old opengl faq--
9.005 Are OpenGL matrices column-major or row-major?
For programming purposes, OpenGL matrices are 16-value arrays with base vectors laid out contiguously in memory. The translation components occupy the 13th, 14th, and 15th elements of the 16-element matrix, where indices are numbered from 1 to 16 as described in section 2.11.2 of the OpenGL 2.1 Specification.
Column-major versus row-major is purely a notational convention. Note that post-multiplying with column-major matrices produces the same result as pre-multiplying with row-major matrices. The OpenGL Specification and the OpenGL Reference Manual both use column-major notation. You can use any notation, as long as it's clearly stated.
Sadly, the use of column-major format in the spec and blue book has resulted in endless confusion in the OpenGL programming community. Column-major notation suggests that matrices are not laid out in memory as a programmer would expect.
I'm going to update this 9-year-old answer.
A mathematical matrix is defined as an m x n matrix, where m is the number of rows and n is the number of columns. For the sake of completeness, rows are horizontal and columns are vertical. When denoting a matrix element in mathematical notation as Mij, the first index (i) is the row index and the second (j) is the column index. When two matrices are multiplied, i.e. A(m x n) * B(m1 x n1), the resulting matrix has the number of rows of the first argument (A) and the number of columns of the second (B), and the number of columns of the first argument (A) must match the number of rows of the second (B), so n == m1. Clear so far, yes?
Now, regarding in-memory layout. You can store a matrix in two ways: row-major and column-major. Row-major means that effectively you have rows laid out one after another, linearly. So, elements go from left to right, row after row. Kinda like English text. Column-major means that effectively you have columns laid out one after another, linearly. So elements start at the top left, and go from top to bottom, column after column.
Example:
//matrix
|a11 a12 a13|
|a21 a22 a23|
|a31 a32 a33|
//row-major
[a11 a12 a13 a21 a22 a23 a31 a32 a33]
//column-major
[a11 a21 a31 a12 a22 a32 a13 a23 a33]
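To make the two layouts concrete, here is a minimal Python sketch that flattens the 3x3 example above both ways. The helper names flatten_row_major and flatten_column_major are made up for illustration; they are not part of any library.

```python
def flatten_row_major(rows):
    """Lay the rows out one after another."""
    return [value for row in rows for value in row]


def flatten_column_major(rows):
    """Lay the columns out one after another."""
    n_cols = len(rows[0])
    return [row[c] for c in range(n_cols) for row in rows]


# The matrix as it is written on paper: a list of rows.
matrix = [
    ["a11", "a12", "a13"],
    ["a21", "a22", "a23"],
    ["a31", "a32", "a33"],
]

row_major = flatten_row_major(matrix)
column_major = flatten_column_major(matrix)
print(row_major)
print(column_major)
```

The written matrix is the same in both cases; only the order in which its elements land in the linear array differs.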
Now, here's the fun part!
There are two ways to store 3d transformation in a matrix.
As I mentioned before, a matrix in 3d essentially stores coordinate system basis vectors and position. So, you can store those vectors in rows or in columns of a matrix. When they're stored as columns, you multiply a matrix with a column vector. Like this.
//convention #1
|vx.x vy.x vz.x pos.x|   |p.x|   |res.x|
|vx.y vy.y vz.y pos.y|   |p.y|   |res.y|
|vx.z vy.z vz.z pos.z| x |p.z| = |res.z|
|   0    0    0     1|   |  1|   |res.w|
However, you can also store those vectors as rows, and then you'll be multiplying a row vector with a matrix:
//convention #2 (uncommon)
                  | vx.x  vx.y  vx.z 0|
                  | vy.x  vy.y  vy.z 0|
|p.x p.y p.z 1| x | vz.x  vz.y  vz.z 0| = |res.x res.y res.z res.w|
                  |pos.x pos.y pos.z 1|
So. Convention #1 often appears in mathematical texts. Convention #2 appeared in DirectX sdk at some point. Both are valid.
As for the question: if you're using convention #1, then your matrices are column-major, and if you're using convention #2, then they're row-major. However, the memory layout is the same in both cases:
[vx.x vx.y vx.z 0 vy.x vy.y vy.z 0 vz.x vz.y vz.z 0 pos.x pos.y pos.z 1]
Which is why I said it is easier to memorize which element is which, 9 years ago.
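A small Python sketch of that last point, under the assumption of a pure translation matrix: the same 16-element array is read once as column-major storage multiplied by a column vector (convention #1) and once as row-major storage multiplied by a row vector (convention #2). The function names are illustrative only.

```python
def translation_flat(tx, ty, tz):
    # 16 values; the translation sits in elements 13, 14, 15 (counting from 1).
    return [1.0, 0.0, 0.0, 0.0,
            0.0, 1.0, 0.0, 0.0,
            0.0, 0.0, 1.0, 0.0,
            tx,  ty,  tz,  1.0]


def transform_convention_1(flat, v):
    """Column-major storage, column vector: res = M * v."""
    # Element (row r, column c) lives at flat[c * 4 + r].
    return [sum(flat[c * 4 + r] * v[c] for c in range(4)) for r in range(4)]


def transform_convention_2(flat, v):
    """Row-major storage, row vector: res = v * M."""
    # Element (row r, column c) lives at flat[r * 4 + c].
    return [sum(v[r] * flat[r * 4 + c] for r in range(4)) for c in range(4)]


flat = translation_flat(10.0, 20.0, 30.0)
point = [1.0, 2.0, 3.0, 1.0]
result_1 = transform_convention_1(flat, point)
result_2 = transform_convention_2(flat, point)
print(result_1, result_2)
```

Both calls return the translated point, because switching the storage order and switching the side of the multiplication cancel each other out.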
To summarize the answers by SigTerm and dsharlet: The usual way to transform a vector in GLSL is to right-multiply the transformation matrix by the vector:
mat4 T; vec4 v; vec4 v_transformed;
v_transformed = T*v;
In order for that to work, OpenGL expects the memory layout of T to be, as described by SigTerm,
{1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, transX, transY, transZ, 1 }
which is also called 'column major'. In your shader code (as indicated by your comments), however, you left-multiplied the transformation matrix by the vector:
v_transformed = v*T;
which only yields the correct result if T is transposed, i.e. has the layout
{ 1, 0, 0, transX, 0, 1, 0, transY, 0, 0, 1, transZ, 0, 0, 0, 1 }
(i.e. 'row major'). Since you already provided the correct layout to your shader, namely row major, it was not necessary to set the transpose flag of glUniformMatrix4fv.
You are dealing with two separate issues.
First, your examples are dealing with the memory layout. Your [4][4] array is row major because you've used the convention established by C multi-dimensional arrays to match your linear array.
The second issue is a matter of convention for how you interpret matrices in your program. glUniformMatrix4fv is used to set a shader parameter. Whether your transform is computed for a row vector or column vector transform is a matter of how you use the matrix in your shader code. Because you say you need to use column vectors, I assume your shader code is using the matrix A and a column vector x to compute x' = A x.
I would argue that the documentation of glUniformMatrix is confusing. The description of the transpose parameter is a really roundabout way of just saying that the matrix is transposed or it isn't. OpenGL itself is just transporting that data to your shader; whether you want to transpose it or not is a matter of convention you should establish for your program.
This link has some good further discussion: http://steve.hollasch.net/cgindex/math/matrix/column-vec.html
I think that the existing answers here are very unhelpful, and I can see from the comments that people are left feeling confused after reading them, so here is another way of looking at this situation.
As a programmer, if I want to store an array in memory, I cannot store a rectangular grid of numbers, because computer memory doesn't work like that. I have to store the numbers in a linear sequence.
Let's say I have a 2x2 matrix and I initialize it in my code like this:
const matrix = [a, b, c, d];
I can successfully use this matrix in other parts of my code provided I know what each of the array elements represents.
The OpenGL specification defines what each index position represents, and this is all you need to know to construct an array and pass it to OpenGL and have it do what you expect.
The row or column major issue only comes into play when I want to write my matrix in a document that describes my code, because mathematicians write matrices as rectangular grids of numbers. However, this is just a convention, a way of writing things down, and has no impact on the code I write or the arrangement of numbers in memory on my computer. You could easily re-write these mathematics papers using some other notation, and it would work just as well.
For the array above, I have two options for writing this array in my documentation as a rectangular grid:
|a b| OR |a c|
|c d| |b d|
Whichever way I choose to write my documentation, this will have no impact on my code or the order of the numbers in memory on my computer, it's just documentation.
In order for people reading my documentation to know the order that I stored the values in the linear array in my program, I can specify that this is a column major or row major representation of the array as a matrix. If it is in column major order then I should traverse the columns to get the linear arrangement of numbers. If this is a row major representation then I should traverse the rows to get the linear arrangement of numbers.
In general, writing documentation in row major order makes life easier for programmers, because if I want to translate this matrix
|a b c|
|d e f|
|g h i|
into code, I can write it like this:
const matrix = [
  a, b, c,
  d, e, f,
  g, h, i
];
For example:
GLM stores matrix values as m[4][4], but it treats matrices as if they were in column-major order. For a two-dimensional array m[x][y] in C, x represents a row and y represents a column, which means that the matrix represented by this array is in fact in row-major order. The trick is to treat m[x][y] as if x represents a column and y represents a row. It is like transposing the matrix without performing any additional operations.
I realise I've written this like a homework question, but that's because it's the simplest way for me to understand and try to relay the problem. It's something I want to solve for a personal project.
I have cards laid out in a grid, I am starting with a simple case of a 2x2 grid but want to be able to extrapolate to larger n×n grids eventually.
The cards are all face down, and printed on the faces of the cards are either a:
positive non-zero integer, representing the card's 'score'
or the black spot.
I am given the information of the sum of the scores of each row, the number of black spots in each row, and the sum of the scores of each column, and the number of black spots in each column.
So the top row must have a sum score of 1, and exactly one of the cards is a black spot.
The rightmost column must have a sum score of 2, and exactly one of the cards is a black spot.
Etc.
Of course we can see the above grid will "solve" to
Now I want to make a function that inputs the given information and produces the grid of cards that satisfies those constraints.
I am thinking I can use tuple-like arguments to the function.
And then every "cell" or card in the grid is itself a tuple, the first element of the tuple will be the score of the card there (or 0 if it is a black spot) and the second element will be a 1 if the card is a black spot or a 0 otherwise.
So the grid will have to resemble that ^^
I can find out what all the a, b, variables are by solving this system of equations:
(Knowing also that all of these numbers are integers which are ≥0).
I wanted to use this problem as a learning exercise in Prolog; it seems like a problem Prolog will solve elegantly.
Have I made a good decision or is Prolog not a good choice?
I wonder how I can implement this in Prolog.
Prolog is very good for this kind of problem. Have a look at clp(fd), that is, Constraint Logic Programming over Finite Domains.
This snippet shows a primitive way to solve your initial 2x2 example in SWI-Prolog:
:- use_module(library(clpfd)).
test(Vars) :-
    Vars = [TopLeft, TopRight, BottomLeft, BottomRight],
    global_cardinality([TopLeft, TopRight], [0-1,1-_,2-_]), TopLeft + TopRight #= 1,
    global_cardinality([TopLeft, BottomLeft], [0-1,1-_,2-_]), TopLeft + BottomLeft #= 1,
    global_cardinality([BottomLeft, BottomRight], [0-1,1-_,2-_]), BottomLeft + BottomRight #= 2,
    global_cardinality([TopRight, BottomRight], [0-1,1-_,2-_]), TopRight + BottomRight #= 2,
    label(Vars).
Query:
?- test(Vars).
Vars = [1, 0, 0, 2].
You can take this as a starting point and generalize. Note that the black dot is represented as 0, because clp(fd) deals only with integers.
Here is the documentation: http://www.swi-prolog.org/man/clpfd.html
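For comparison, the same clues can be checked by plain enumeration. This is only a rough Python sketch, not a translation of the clp(fd) program: it tries every combination of candidate rows, so it will be hopeless for large grids. The function name solve_grid and its argument names are made up, and 0 again stands for the black spot.

```python
from itertools import product


def solve_grid(row_sums, row_spots, col_sums, col_spots):
    """Return every n x n grid matching the clues; 0 stands for a black spot."""
    n = len(row_sums)
    max_score = max(row_sums + col_sums)  # no card can exceed the largest line sum

    def candidate_rows(total, spots):
        # All rows of n cells with the given score sum and number of black spots.
        return [row for row in product(range(max_score + 1), repeat=n)
                if sum(row) == total and row.count(0) == spots]

    solutions = []
    per_row = [candidate_rows(s, k) for s, k in zip(row_sums, row_spots)]
    for rows in product(*per_row):
        columns = list(zip(*rows))
        # Keep the grid only if every column clue holds as well.
        if all(sum(col) == s and col.count(0) == k
               for col, s, k in zip(columns, col_sums, col_spots)):
            solutions.append([list(row) for row in rows])
    return solutions


# Clues of the 2x2 example: rows top to bottom, columns left to right.
grids = solve_grid(row_sums=[1, 2], row_spots=[1, 1],
                   col_sums=[1, 2], col_spots=[1, 1])
print(grids)
```

For the 2x2 example this finds the single grid that the Prolog query reports, read row by row.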
Given an image and a set of labels attached to particular points on the image, I'm looking for an algorithm to lay out the labels to the sides of the image with certain constraints (roughly same number of labels on each side, labels roughly equidistant, lines connecting the labels to their respective points with no lines crossing).
Now, an approximate solution can typically be found quite naively by ordering the labels by Y coordinate (of the point they refer to), as in this example (proof of concept only, please ignore accuracy or otherwise of actual data!).
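A minimal Python sketch of this naive ordering, assuming pixel coordinates with the origin at the top left and distinct points. The function name naive_layout and the margin parameter are invented for the example; nothing here prevents crossings.

```python
def naive_layout(points, width, height, margin):
    """Split points into left/right halves by x, then order each side by y."""
    by_x = sorted(points)                      # sort by x (ties broken by y)
    half = (len(by_x) + 1) // 2
    sides = {"left": by_x[:half], "right": by_x[half:]}
    layout = {}
    for side, side_points in sides.items():
        label_x = margin if side == "left" else width - margin
        ordered = sorted(side_points, key=lambda p: p[1])
        # Spread the labels evenly between the top and bottom margins.
        gap = (height - 2 * margin) / max(len(ordered) - 1, 1)
        for i, point in enumerate(ordered):
            layout[point] = (label_x, margin + i * gap)
    return layout


points = [(120, 300), (400, 80), (150, 90), (380, 310)]
layout = naive_layout(points, width=500, height=400, margin=20)
for point, label in layout.items():
    print(point, "->", label)
```

Each point maps to the anchor of its label; the connecting lines may still cross, which is the part that needs a smarter method.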
Now to satisfy the condition of no crossings, some ideas that occurred to me:
use a genetic algorithm to find an ordering of labels with no crossovers;
use another method (e.g. dynamic programming algorithm) to search for such an ordering;
use one of the above algorithms, allowing for variations in the spacing as well as ordering, to find the solution that minimises number of crossings and variation from even spacing;
maybe there are criteria I can use to brute search through every possible ordering of the labels within certain criteria (do not re-order two labels if their distance is greater than X);
if all else fails, just try millions of random orderings/spacing offsets and take the one that gives the minimum crossings/spacing variation. (Advantage: straightforward to program and will probably find a good enough solution; slight disadvantage, though not a show-stopper: maybe can't then run it on the fly during the application to allow user to change layout/size of the image.)
Before I embark on one of these, I would just welcome some other people's input: has anybody else experience with a similar problem and have any information to report on the success/failure of any of the above methods, or if they have a better/simpler solution that isn't occurring to me? Thanks for your input!
Lucas Bradstreet's honours thesis Labelling Maps using Multi-Objective Evolutionary Algorithms has quite a good discussion of this.
First off, this paper creates usable metrics for a number of aspects of labelling quality.
For example, clarity (how obvious the mapping between sites and labels was): clarity(s) = r_s + r_s * (1/r_t)
where r_s is the distance between a site and its label and r_t is the distance between a site and its closest other label.
It also has useful metrics for the conflicts between labels, sites and borders, as well as for measuring the density and symmetry of labels. Bradstreet then uses a multiple objective genetic algorithm to generate a "Pareto frontier" of feasible solutions. The thesis also includes information about how he mutated the individuals, and some notes on improving the speed of the algorithm.
There's a lot of detail in it, and it should provide some good food for thought.
Let's forget about information design for a moment. This task recalls some memories related to PCB routing algorithms. Actually, there are a lot of common requirements, including:
intersections optimization
size optimization
gaps optimization
So, it could be possible to turn the initial task into something similar to PCB routing.
There is a lot of information available, but I would suggest looking through Algorithmic studies on PCB routing by Tan Yan.
It provides a lot of details and dozens of hints.
Adaptation for the current task
The idea is to treat markers on the image and labels as two sets of pins and use escape routing to solve the task. Usually the PCB area is represented as an array of pins. Same can be done to the image with possible optimizations:
avoid low contrast areas
avoid text boxes if any
etc
So the task can be reduced to "routing in case of unused pins"
Final result can be really close to the requested style:
Algorithmic studies on PCB routing by Tan Yan is a good place to continue.
Additional notes
I changed the style of the drawing a little bit, in order to accentuate the similarity.
It should not be a big problem to do some reverse transformation, keeping the good look and readability.
Anyway, adepts of simplicity (like me, for example) can spend several minutes and invent something better (or something different):
As for me, curves do not look like a complete solution, at least at this stage. Anyway, I've just tried to show there is room for enhancements, so the PCB routing approach can be considered as an option.
One option is to turn it into an integer programming problem.
Let's say you have n points and n corresponding labels distributed around the outside of the diagram.
The number of possible lines is n^2. If we look at all possible intersections, there are fewer than n^4 intersections (if all possible lines were displayed).
In our integer programming problem we add the following constraints:
(to decide if a line is switched on (i.e. displayed to the screen))
For each point on the diagram, only one of the possible n lines connecting to it is to be switched on.
For each label, only one of the possible n lines connecting to it is to be switched on.
For each pair of intersecting line segments line1 and line2, only zero or one of these lines may be switched on.
Optionally, we can minimize the total distance of all the switched on lines. This enhances aesthetics.
When all of these constraints hold at the same time, we have a solution:
The code below produced the above diagram for 24 random points.
Once you start to get more than 15 or so points, the run time of the program will start to slow.
I used the PuLP package with its default solver. I used PyGame for the display.
Here is the code:
__author__ = 'Robert'
import pygame
pygame.font.init()
import pulp
from random import randint

class Line():
    def __init__(self, p1, p2):
        self.p1 = p1
        self.p2 = p2
        self.length = (p1[0] - p2[0])**2 + (p1[1] - p2[1])**2

    def intersect(self, line2):
        #Copied some equations for wikipedia. Not sure if this is the best way to check intersection.
        x1, y1 = self.p1
        x2, y2 = self.p2
        x3, y3 = line2.p1
        x4, y4 = line2.p2
        xtop = (x1*y2-y1*x2)*(x3-x4)-(x1-x2)*(x3*y4-y3*x4)
        xbottom = (x1-x2)*(y3-y4) - (y1-y2)*(x3-x4)
        ytop = (x1*y2-y1*x2)*(y3-y4)-(y1-y2)*(x3*y4-y3*x4)
        ybottom = xbottom
        if xbottom == 0:
            #lines are parallel. Can only intersect if they are the same line. I'm not checking that however,
            #which means there could be a rare bug that occurs if more than 3 points line up.
            if self.p1 in (line2.p1, line2.p2) or self.p2 in (line2.p1, line2.p2):
                return True
            return False
        x = float(xtop) / xbottom
        y = float(ytop) / ybottom
        if min(x1, x2) <= x <= max(x1, x2) and min(x3, x4) <= x <= max(x3, x4):
            if min(y1, y2) <= y <= max(y1, y2) and min(y3, y4) <= y <= max(y3, y4):
                return True
        return False

def solver(lines):
    #returns best line matching
    lines = list(lines)
    prob = pulp.LpProblem("diagram labelling finder", pulp.LpMinimize)
    label_points = {} #a point at each label
    points = {} #points on the image
    line_variables = {}
    variable_to_line = {}
    for line in lines:
        point, label_point = line.p1, line.p2
        if label_point not in label_points:
            label_points[label_point] = []
        if point not in points:
            points[point] = []
        line_on = pulp.LpVariable("point{0}-point{1}".format(point, label_point),
                                  lowBound=0, upBound=1, cat=pulp.LpInteger) #variable controls if line used or not
        label_points[label_point].append(line_on)
        points[point].append(line_on)
        line_variables[line] = line_on
        variable_to_line[line_on] = line
    for lines_to_point in points.values():
        prob += sum(lines_to_point) == 1 #1 label to each point..
    for lines_to_label in label_points.values():
        prob += sum(lines_to_label) == 1 #1 point for each label.
    for i, line1 in enumerate(lines):
        for line2 in lines[:i]: #visit each pair of lines once
            if line1.intersect(line2):
                line1_on = line_variables[line1]
                line2_on = line_variables[line2]
                prob += line1_on + line2_on <= 1 #only switch one on.
    #minimize length of switched on lines:
    prob += sum(line.length * line_variables[line] for line in lines)
    prob.solve()
    print(prob.solutionTime)
    print(pulp.LpStatus[prob.status]) #should say "Optimal"
    print(len(prob.variables()))
    for line_on, line in variable_to_line.items():
        if line_on.varValue > 0:
            yield line #yield the lines that are switched on

class Diagram():
    def __init__(self, num_points=20, width=700, height=800, offset=150):
        assert(num_points % 2 == 0) #if even then labels align nicer (-:
        self.background_colour = (255,255,255)
        self.width, self.height = width, height
        self.screen = pygame.display.set_mode((width, height))
        pygame.display.set_caption('Diagram Labeling')
        self.screen.fill(self.background_colour)
        self.offset = offset
        self.points = list(self.get_points(num_points))
        self.num_points = num_points
        self.font_size = min((self.height - 2 * self.offset)//num_points, self.offset//4)

    def get_points(self, n):
        for i in range(n):
            x = randint(self.offset, self.width - self.offset)
            y = randint(self.offset, self.height - self.offset)
            yield (x, y)

    def display_outline(self):
        w, h = self.width, self.height
        o = self.offset
        outline1 = [(o, o), (w - o, o), (w - o, h - o), (o, h - o)]
        pygame.draw.lines(self.screen, (0, 100, 100), True, outline1, 1)
        o = self.offset - self.offset//4
        outline2 = [(o, o), (w - o, o), (w - o, h - o), (o, h - o)]
        pygame.draw.lines(self.screen, (0, 200, 0), True, outline2, 1)

    def display_points(self, color=(100, 100, 0), radius=3):
        for point in self.points:
            pygame.draw.circle(self.screen, color, point, radius, 2)

    def get_label_heights(self):
        for i in range((self.num_points + 1)//2):
            yield self.offset + 2 * i * self.font_size

    def get_label_endpoints(self):
        for y in self.get_label_heights():
            yield (self.offset, y)
            yield (self.width - self.offset, y)

    def get_all_lines(self):
        for point in self.points:
            for end_point in self.get_label_endpoints():
                yield Line(point, end_point)

    def display_label_lines(self, lines):
        for line in lines:
            pygame.draw.line(self.screen, (255, 0, 0), line.p1, line.p2, 1)

    def display_labels(self):
        myfont = pygame.font.SysFont("Comic Sans MS", self.font_size)
        label = myfont.render("label", True, (155, 155, 155))
        for y in self.get_label_heights():
            self.screen.blit(label, (self.offset//4 - 10, y - self.font_size//2))
            pygame.draw.line(self.screen, (255, 0, 0), (self.offset - self.offset//4, y), (self.offset, y), 1)
        for y in self.get_label_heights():
            self.screen.blit(label, (self.width - 2*self.offset//3, y - self.font_size//2))
            pygame.draw.line(self.screen, (255, 0, 0), (self.width - self.offset + self.offset//4, y), (self.width - self.offset, y), 1)

    def display(self):
        self.display_points()
        self.display_outline()
        self.display_labels()
        #self.display_label_lines(self.get_all_lines())
        self.display_label_lines(solver(self.get_all_lines()))

diagram = Diagram()
diagram.display()
pygame.display.flip()
running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
I think the actual solution to this problem is on a slightly different layer. It doesn't seem right to start solving the algorithmic problem while totally ignoring information design. There is an interesting example found here
Let's identify some important questions:
How is the data best viewed?
Will it confuse people?
Is it readable?
Does it actually help to better understand the picture?
By the way, chaos is really confusing. We like order and predictability. There is no need to introduce additional informational noise to the initial image.
The readability of a graphical message is determined by the content and its presentation. Readability of a message involves the reader’s ability to understand the style of text and pictures. You have that interesting algorithmic task because of the additional "noisy" approach. Remove the chaos -- find better solution :)
Please note, this is just a PoC. The idea is to use only horizontal lines with clear markers. Labels placement is straightforward and deterministic. Several similar ideas can be proposed.
With such approach you can easily balance left-right labels, avoid small vertical gaps between lines, provide optimal vertical density for labels, etc.
EDIT
Ok, let's see how initial process may look.
User story: as a user I want important images to be annotated in order to simplify understanding and increase their explanatory value.
Important assumptions:
initial image is a primary object for the user
readability is a must
So, the best possible solution is to have annotations but not have them. (I would really suggest spending some time reading about the theory of inventive problem solving.)
Basically, there should be no obstacles for the user to see the initial picture, but annotations should be right there when needed. It can be slightly confusing, sorry for that.
Do you think intersections issue is the only one behind the following image?
Please note, the actual goal behind the developed approach is to provide two information flows (image and annotations) and help the user to understand everything as fast as possible. By the way, vision memory is also very important.
What is behind human vision:
Selective attention
Familiarity detection
Pattern detection
Do you want to break at least one of these mechanisms? I hope you don't. Because it will make the actual result not very user-friendly.
So what can distract me?
strange lines randomly distributed over the image (random geometric objects are very distracting)
non-uniform annotation placement and style
strange complex patterns as a result of final merge of the image and the annotation layer
Why should my proposal be considered?
It has simple pattern, so pattern detection will let the user stop noticing annotations, but see the picture instead
It has uniform design, so familiarity detection will work too
It does not affect initial image so much as other solutions because lines have minimal width.
Lines are horizontal, anti-aliasing is not used, so it saves more information and provides clean result
Finally, it does simplify routing algorithm a lot.
Some additional comments:
Do not use random points to test your algorithms, use simple but yet important cases. You'll see automated solutions sometimes may fail dramatically.
I do not suggest to use approach proposed by me as is. There are a lot of possible enhancements.
What I really suggest is to go one level up and do several iterations on the meta-level.
Grouping can be used to deal with the complex case, mentioned by Robert King:
Or I can imagine for a second some point is located slightly above its default location. But only for a second, because I do not want to break the main processing flow and affect other markers.
Thank you for reading.
You can find the center of your diagram, and then draw the lines from the points radially outward from the center. The only way you could have a crossing is if two of the points lie on the same ray, in which case you just shift one of the lines a bit one way, and shift the other a bit the other way, like so:
With only actual parts showing:
In case there are two or more points colinear with the center, you can shift the lines slightly to the side:
While this doesn't produce very good multisegment line things, it very clearly labels the diagram. Also, to make it more visually appealing, it may be better to pick a point for the center that is actually the center of your object, rather than just the center of the point set.
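A rough Python sketch of the radial idea, assuming the labels are placed on a circle of a chosen radius around the center. The function name radial_labels is made up, and the small sideways shift for points that share a ray is not implemented here.

```python
import math


def radial_labels(points, radius, center=None):
    """Place each label where the ray from the center through the point meets a circle."""
    if center is None:
        # Default to the center of the point set.
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
    else:
        cx, cy = center
    labels = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        dist = math.hypot(dx, dy)
        if dist == 0:
            # A point exactly at the center has no direction; pick one arbitrarily.
            dx, dy, dist = 1.0, 0.0, 1.0
        labels.append((cx + radius * dx / dist, cy + radius * dy / dist))
    return labels


points = [(1, 0), (0, 2), (-3, 0), (0, -4)]
labels = radial_labels(points, radius=10, center=(0, 0))
print(labels)
```

Passing center explicitly corresponds to picking the center of the object rather than the center of the point set.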
I would add one more thing to your prototype - maybe it will be acceptable after this:
Iterate through every intersection and swap labels; repeat while there are intersections.
This process is finite, because the number of states is finite and every swap reduces the sum of all line lengths, so no loop is possible.
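A minimal Python sketch of the swap loop, assuming labels[i] is the label position currently connected to points[i]. The helper names are invented, and only proper crossings are detected: touching endpoints and collinear overlaps are ignored.

```python
def orientation(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    value = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (value > 0) - (value < 0)


def segments_cross(p1, q1, p2, q2):
    """True when the two segments properly cross (touching endpoints do not count)."""
    return (orientation(p1, q1, p2) * orientation(p1, q1, q2) < 0 and
            orientation(p2, q2, p1) * orientation(p2, q2, q1) < 0)


def uncross(points, labels):
    """Swap the labels of crossing lines until no pair of lines crosses."""
    labels = list(labels)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(points)):
            for j in range(i + 1, len(points)):
                if segments_cross(points[i], labels[i], points[j], labels[j]):
                    # Swapping the two labels shortens the total line length.
                    labels[i], labels[j] = labels[j], labels[i]
                    swapped = True
    return labels


points = [(1, 1), (1, 3)]
labels = [(0, 3), (0, 1)]          # the two lines cross at (0.5, 2)
fixed = uncross(points, labels)
print(fixed)
```

Fixing one crossing can create others, which is why the outer loop repeats until a full pass makes no swap.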
This problem can be cast as graph layout.
I recommend you look at e.g. the Graphviz library. I have not done any experiments, but believe that by expressing the points to be labeled and the labels themselves as nodes and the lead lines as edges, you would get good results.
You would have to express areas where labels should not go as "dummy" nodes not to be overlapped.
Graphviz has bindings for many languages.
Even if Graphviz does not have quite enough flexibility to do exactly what you need, the "Theory" section of that page has references for energy minimization and spring algorithms that can be applied to your problem. The literature on graph layout is enormous.
Here is the deal. I have multiple points (X,Y) that form an 'ellipse like' shape.
I would like to evaluate/fit the 'best' ellipse possible and get its properties (a,b,F1,F2), or just the center of the ellipse.
Any ideas/leads would be appreciated.
Gilad.
There's a Matlab function fit_ellipse that can do the job. There's also this paper on methods for orthogonal distance fitting of ellipses. A web search for orthogonal ellipse fit will probably turn up a lot of other resources as well.
The ellipse fitting method proposed by:
Z. L. Szpak, W. Chojnacki, and A. van den Hengel.
Guaranteed ellipse fitting with a confidence region and an uncertainty measure for centre, axes, and orientation.
J. Math. Imaging Vision, 2015.
may be of interest to you. They provide estimates of both algebraic and geometric ellipse
parameters, together with covariance matrices that express the uncertainty of the parameter estimates.
They also provide a means of computing a planar 95% confidence region associated with the estimate
that allows one to visualise the uncertainty in the ellipse fit.
A pre-print version of the paper is available on the authors' website (http://cs.adelaide.edu.au/~wojtek/publicationsWC.html).
A MATLAB implementation of the method is also available for download:
https://sites.google.com/site/szpakz/source-code/guaranteed-ellipse-fitting-with-a-confidence-region-and-an-uncertainty-measure-for-centre-axes-and-orientation
I will explain how I would approach the problem. I would suggest a hill climbing approach. First compute the gravity center of the points as a start point and choose two values for a and b in some way (probably arbitrary positive values will do). You need to have a fit function, and I would suggest that it return the number of points lying on (or close enough to) a given ellipse:
int fit(x, y, a, b)
    int res := 0
    for point in points
        if point_almost_on_ellipse(x, y, a, b, point)
            res = res + 1
        end_if
    end_for
    return res
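The fit function above could look like this in Python. This is a sketch under assumptions the answer does not state: the ellipse is axis-aligned, "almost on" means the normalized ellipse equation is within a made-up tolerance of 1, and the points are passed in explicitly instead of being global.

```python
import math


def point_almost_on_ellipse(x, y, a, b, point, tolerance=0.05):
    """Axis-aligned ellipse centred at (x, y) with semi-axes a and b."""
    px, py = point
    # A point exactly on the ellipse gives a value of 1.
    value = ((px - x) / a) ** 2 + ((py - y) / b) ** 2
    return abs(value - 1.0) <= tolerance


def fit(points, x, y, a, b):
    """Count the points that lie (almost) on the ellipse."""
    return sum(1 for p in points if point_almost_on_ellipse(x, y, a, b, p))


# Six points sampled from the ellipse with center (3, 1) and semi-axes 4 and 2.
sample = [(3 + 4 * math.cos(t), 1 + 2 * math.sin(t))
          for t in (0.0, 0.7, 1.9, 3.1, 4.4, 5.6)]
print(fit(sample, 3, 1, 4, 2))
print(fit(sample, 3, 1, 1, 1))
```

Because the score is an integer count, many neighbouring positions can tie, so the tolerance probably needs tuning for real data.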
Now start with some step. I would choose a big enough value to be sure the best center of the ellipse will never be more than step away from the first point. Choosing such a big value is not necessary, but the slowest part of the algorithm is the time it takes to get close to the best center, so a bigger value is better, I think.
So now we have some initial point (x, y), some initial values of a and b, and an initial step. The algorithm iteratively moves to the best of the neighbours of the current point if any neighbour is better than it, or halves the step otherwise. Here 'best' is measured by the fit function. A position is defined by four values (x, y, a, b), and it has 8 neighbours: (x+-step, y, a, b), (x, y+-step, a, b), (x, y, a+-step, b), (x, y, a, b+-step) (if results are not good enough you can add more neighbours by also moving diagonally, for instance (x+-step, y+-step, a, b) and so on). Here is how you do that:
neighbours = [[-1, 0, 0, 0], [1, 0, 0, 0], [0, -1, 0, 0], [0, 1, 0, 0],
              [0, 0, -1, 0], [0, 0, 1, 0], [0, 0, 0, -1], [0, 0, 0, 1]]
iterate (cx, cy, ca, cb, step)
    current_fit = fit(cx, cy, ca, cb)
    best_neighbour = []
    best_fit = current_fit
    for neighbour in neighbours
        tx = cx + neighbour[0]*step
        ty = cy + neighbour[1]*step
        ta = ca + neighbour[2]*step
        tb = cb + neighbour[3]*step
        tfit = fit(tx, ty, ta, tb)
        if (tfit > best_fit)
            best_fit = tfit
            best_neighbour = [tx, ty, ta, tb]
        end_if
    end_for
    if best_neighbour.size == 4
        cx := best_neighbour[0]
        cy := best_neighbour[1]
        ca := best_neighbour[2]
        cb := best_neighbour[3]
    else
        step = step * 0.5
    end_if
You continue iterating until the value of step is smaller than a given threshold (for instance 1e-6). I have written everything in pseudo code as I am not sure which language you want to use.
It is not guaranteed that the answer found this way will be optimal, but I am pretty sure it will be a good enough approximation.
Here is an article about hill climbing.
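For reference, the pseudo code above could be sketched in Python roughly as follows. This is only an illustration under some assumptions of mine: the ellipse is axis-aligned, point_almost_on_ellipse is a hypothetical helper (the answer does not define it) that accepts a point when the normalised ellipse equation is within a tolerance tol of 1, and non-positive semi-axes are scored as -1 so the search never moves to them.

```python
import math

def point_almost_on_ellipse(x, y, a, b, point, tol=0.1):
    # Hypothetical helper: a point counts as "on" the axis-aligned ellipse
    # centred at (x, y) when ((px-x)/a)^2 + ((py-y)/b)^2 is within tol of 1.
    px, py = point
    value = ((px - x) / a) ** 2 + ((py - y) / b) ** 2
    return abs(value - 1.0) <= tol

def fit(points, x, y, a, b):
    # Number of points lying (almost) on the ellipse.
    # Assumption: invalid (non-positive) semi-axes get the worst score.
    if a <= 0 or b <= 0:
        return -1
    return sum(point_almost_on_ellipse(x, y, a, b, p) for p in points)

# The 8 neighbours: one step up or down in each of x, y, a, b.
NEIGHBOURS = [(-1, 0, 0, 0), (1, 0, 0, 0), (0, -1, 0, 0), (0, 1, 0, 0),
              (0, 0, -1, 0), (0, 0, 1, 0), (0, 0, 0, -1), (0, 0, 0, 1)]

def hill_climb(points, a, b, step, threshold=1e-6):
    # Start from the gravity center of the points.
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    ca, cb = a, b
    while step > threshold:
        best_fit = fit(points, cx, cy, ca, cb)
        best = None
        for dx, dy, da, db in NEIGHBOURS:
            cand = (cx + dx * step, cy + dy * step,
                    ca + da * step, cb + db * step)
            cand_fit = fit(points, *cand)
            if cand_fit > best_fit:
                best_fit, best = cand_fit, cand
        if best is not None:
            cx, cy, ca, cb = best   # move to the best neighbour
        else:
            step *= 0.5             # no improvement: halve the step
    return cx, cy, ca, cb

# Usage: 12 points sampled from an ellipse centred at (2, 3), a=4, b=2.
sample = [(2 + 4 * math.cos(2 * math.pi * k / 12),
           3 + 2 * math.sin(2 * math.pi * k / 12)) for k in range(12)]
result = hill_climb(sample, a=4.0, b=2.0, step=8.0)
```

Because the fit function only counts points, large parts of the search space score the same, so a poor starting guess for a and b can leave the search stuck on a plateau. A smoother score (for example a sum of distances) might behave better, but that is a change to the method described above.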
I think that the Wild Magic library contains a function for ellipse fitting. There is an article describing the method.
The problem is to define "best". What is best in your case? The ellipse with the smallest area which contains n% of the points?
If you define "best" in terms of probability, you can simply use the covariance matrix of your points, and compute the error ellipse.
An error ellipse for this "multivariate Gaussian distribution" would then contain the points corresponding to whatever confidence interval you decide.
Many computing packages can compute the covariance matrix, together with its eigenvalues and eigenvectors. The angle of the ellipse is the angle between the x axis and the eigenvector corresponding to the largest eigenvalue. The semi-axes are proportional to the square roots of the eigenvalues.
If your routine returns everything normalized (which it should), then you can decide by what factor to multiply everything to obtain an alpha-confidence interval.
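As a rough sketch of that computation in plain Python (no external packages), assuming 2D points that are roughly Gaussian: the function name error_ellipse and its return layout are my own choices. The 2x2 eigen-decomposition is done in closed form, and the scale factor uses the chi-square quantile for 2 degrees of freedom, which is -2 ln(1 - p).

```python
import math

def error_ellipse(points, confidence=0.95):
    # Returns (centre, semi_major, semi_minor, angle_in_radians).
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2.0
    root = math.hypot((sxx - syy) / 2.0, sxy)
    lam_max = half_trace + root
    lam_min = max(half_trace - root, 0.0)  # guard against rounding below zero
    # Angle between the x axis and the eigenvector of the largest eigenvalue.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Chi-square quantile with 2 degrees of freedom (about 5.99 for 95%).
    scale = -2.0 * math.log(1.0 - confidence)
    return (mx, my), math.sqrt(scale * lam_max), math.sqrt(scale * lam_min), angle

# Usage: points spread twice as far along x as along y.
centre, major, minor, angle = error_ellipse([(-2, 0), (2, 0), (0, -1), (0, 1)])
```

The semi-axes come out as sqrt(scale * eigenvalue), so changing the confidence level only rescales the ellipse and leaves its orientation alone.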
For this problem speed is pretty crucial. I've drawn a nice image to explain the problem better. The algorithm needs to determine whether, if the edges of a rectangle are extended to the confines of the canvas, any edge will intersect another rectangle.
We know:
The size of the canvas
The size of each rectangle
The position of each rectangle
The faster the solution is the better! I'm pretty stuck on this one and don't really know where to start.
alt text http://www.freeimagehosting.net/uploads/8a457f2925.gif
Cheers
Just create the set of intervals for each of the X and the Y axis. Then for each new rectangle, see if there are intersecting intervals in the X or the Y axis. See here for one way of implementing the interval sets.
In your first example, the interval set on the horizontal axis would be { [0-8], [0-8], [9-10] }, and on the vertical: { [0-3], [4-6], [0-4] }
This is only a sketch; I have abstracted away many details here (e.g. one would usually ask an interval set/tree "which intervals overlap this one" rather than "which intersect this one", but that is nothing that cannot be done).
Edit
Please watch this related MIT lecture (it's a bit long, but absolutely worth it).
Even if you find simpler solutions (than implementing an augmented red-black tree), it's good to know the ideas behind these things.
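Before reaching for an interval tree, the interval idea can be tried out with a brute-force check. The sketch below makes two assumptions of mine: rectangles are given as (x1, y1, x2, y2) tuples, and an extended edge that merely touches another rectangle's boundary does not count as an intersection. It uses the coordinates of the example intervals quoted above.

```python
def edge_cuts(a, b):
    # True if an extended edge of rectangle a passes through the interior
    # of rectangle b. Rectangles are (x1, y1, x2, y2) tuples (assumed layout).
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # A vertical edge x = c cuts b when c lies strictly inside b's x interval;
    # a horizontal edge y = c cuts b when c lies strictly inside b's y interval.
    cuts_x = any(bx1 < x < bx2 for x in (ax1, ax2))
    cuts_y = any(by1 < y < by2 for y in (ay1, ay2))
    return cuts_x or cuts_y

def find_conflicts(rects):
    # All ordered pairs (i, j) where an edge of rects[i] cuts rects[j].
    return [(i, j)
            for i, a in enumerate(rects)
            for j, b in enumerate(rects)
            if i != j and edge_cuts(a, b)]

# Usage with x intervals [0-8], [0-8], [9-10] and y intervals [0-3], [4-6], [0-4].
rects = [(0, 0, 8, 3), (0, 4, 8, 6), (9, 0, 10, 4)]
conflicts = find_conflicts(rects)
```

This is quadratic in the number of rectangles, so it is only useful as a reference to test a faster interval-set implementation against.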
Lines that are not parallel to each other are going to intersect at some point. Calculate the slopes of each line and then determine what lines they won't intersect with.
Start with that, and then let's see how to optimize it. I'm not sure how your data is represented and I can't see your image.
Using slopes is a simple equality check which probably means you can take advantage of sorting the data. In fact, you can probably just create a set of distinct slopes. You'll have to figure out how to represent the data such that the two slopes of the same rectangle are not counted as intersecting.
EDIT: Wait... how can two rectangles whose edges go to infinity not intersect? Rectangles are basically two lines that are perpendicular to each other. Shouldn't that mean it always intersects with another if those lines are extended to infinity?
Since you didn't mention the language you chose to solve the problem, I will use some kind of pseudo code.
The idea is that if everything is ok, then a sorted collection of rectangle edges along one axis should be a sequence of non-overlapping intervals.
1. Number all your rectangles, assigning them individual ids.
2. Create an empty binary tree collection (btc). This collection should have a method to insert an integer node with info: btc::insert(key, value).
3. For all rectangles, do:
    foreach rect in rects do
        btc.insert(rect.top, rect.id)
        btc.insert(rect.bottom, rect.id)
4. Now iterate through the btc (this will give you a sorted order):
    btc_item = btc.first()
    do
        id = btc_item.id
        btc_item = btc.next()
        if(id != btc_item.id)
        then report_invalid_placement(id, btc_item.id)
        btc_item = btc.next()
    while btc_item is valid
5-7. Repeat steps 2, 3 and 4 for the rect.left and rect.right coordinates.
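A possible Python variation of this sorted-edges idea is sketched below, with a plain sorted list standing in for the binary tree. Two details are my own additions and not part of the pseudo code above: identical intervals are collapsed first (two rectangles sharing the same extent on an axis do not cut each other), and at equal coordinates an interval end is ordered before an interval start so that touching intervals are accepted.

```python
def axis_is_valid(intervals):
    # intervals: (lo, hi) pairs with lo < hi, one per rectangle, for one axis.
    # Returns True when no interval endpoint falls strictly inside another.
    distinct = sorted(set(intervals))  # assumption: identical intervals are fine
    events = []
    for idx, (lo, hi) in enumerate(distinct):
        events.append((lo, 1, idx))  # 1 marks a start
        events.append((hi, 0, idx))  # 0 marks an end, sorts before a start
    events.sort()
    # In a valid layout the sorted events pair up as start/end of one interval.
    for k in range(0, len(events), 2):
        if events[k][2] != events[k + 1][2]:
            return False
    return True

# Usage: one call per axis.
x_ok = axis_is_valid([(0, 8), (0, 8), (9, 10)])
y_ok = axis_is_valid([(0, 3), (4, 6), (0, 4)])
```

Sorting dominates the cost, so each axis takes O(n log n) time.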
I like this question. Here is my try to get on it:
If possible:
Create a polygon from each rectangle. Treat each edge as a line of maximum length that must be clipped. Use a clipping algorithm to check whether or not a line intersects with another. For example this one: Line Clipping
But keep in mind: if you find an intersection which is at a vertex position, it's a valid one.
Here's an idea. Instead of creating each rectangle with (x, y, width, height), instantiate them with (x1, y1, x2, y2), or at least have it interpret these values given the width and height.
That way, you can check which rectangles have a similar x or y value and make sure the corresponding rectangle has the same secondary value.
Example:
The rectangles you have given have the following values:
Square 1: [0, 0, 8, 3]
Square 3: [0, 4, 8, 6]
Square 4: [9, 0, 10, 4]
First, we compare Square 1 to Square 3 (no collision):
Compare the x values
[0, 8] to [0, 8] These are exactly the same, so there's no crossover.
Compare the y values
[0, 3] to [4, 6] None of these numbers are similar, so they're not a factor
Next, we compare Square 3 to Square 4 (collision):
Compare the x values
[0, 8] to [9, 10] None of these numbers are similar, so they're not a factor
Compare the y values
[4, 6] to [0, 4] The rectangles have the number 4 in common, but 0 != 6, therefore, there is a collision
By now we know that a collision will occur, so the method will end, but let's evaluate Square 1 and Square 4 for some extra clarity.
Compare the x values
[0, 8] to [9, 10] None of these numbers are similar, so they're not a factor
Compare the y values
[0, 3] to [0, 4] The rectangles have the number 0 in common, but 3 != 4, therefore, there is a collision
Let me know if you need any extra details :)
Heh, taking the overlapping intervals answer to the extreme, you simply determine all distinct intervals along the x and y axis. For each cutting line, do an upper bound search along the axis it will cut based on the interval's starting value. If you don't find an interval or the interval does not intersect the line, then it's a valid line.
The slightly tricky part is to realize that valid cutting lines will not intersect a rectangle's bounds along an axis, so you can combine overlapping intervals into a single interval. You end up with a simple sorted array (which you fill in O(n) time) and an O(log n) search for each cutting line.
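Here is a minimal Python sketch of that merge-then-search idea, using the standard bisect module. The function names are mine, and I am assuming that a cutting line lying exactly on an interval boundary is valid. Note that this sketch sorts the intervals itself, which costs O(n log n) unless they already arrive in order.

```python
from bisect import bisect_left

def merge_intervals(intervals):
    # Combine intervals whose interiors overlap into single [lo, hi] entries.
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return merged

def line_is_valid(merged, starts, c):
    # A cutting line at coordinate c is valid unless it lies strictly
    # inside one of the merged intervals.
    i = bisect_left(starts, c) - 1  # last interval that starts before c
    return not (i >= 0 and c < merged[i][1])

# Usage on the y axis: a horizontal edge at y = c is tested against the
# merged y intervals of all rectangles.
merged_y = merge_intervals([(0, 3), (4, 6), (0, 4)])
starts_y = [lo for lo, _ in merged_y]
checks = {c: line_is_valid(merged_y, starts_y, c) for c in (0, 3, 4, 6)}
```

In this example only the edge at y = 3 is rejected, because it lies strictly inside the merged interval from 0 to 4.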