Connect points from set in the line segments - algorithm

I have been given a task where I have to connects all the points in the 2D plane.
There are four conditions to to be met:
Length of the all segments joined together has to be minimal.
One point can be a part of only one line segment.
Line segments cannot intersect
All points have to be used(one can't be left alone but only if it cannot be avoided)
Image to visualize the problem:
The wrong image connected points correctly, although the total length is bigger that the the one in on the left.
At first I thought about sorting the points and doing it with a sweeping line and building a tree of all possibilities, although it does seem like a way to complicated solution with huge complexity. Therefore I search better approaches. I would appreciate some hints what to do, or how could I approach the problem.

I would start with a Delaunay triangulation of the point set. This should already give you the nearest neighbor connections of each point without any intersections. In the next step I'd look at the triangles that result from the triangulation - the convenient property here is that based on your ruleset you can pick exactly one side from each triangle and remove the remaining two from the selection.
The problem that remains now is to pick those edges that give you the smallest total sum which of course will not always be the smallest side since that one might already have been blocked by a neighboring triangle. I'd start with a greedy approach, always picking the smallest remaining edge that has not been blocked by neighboring triangles yet.
Edit: In the next step you retrieve a list of all the edges in that triangulation and sort them by length. You also make another list in which you count the amount of connections each point has. Now you iterate through the edge list going from the longest edge to the shortest one and check the two points it connects in the connection count list: if each of the points has still more than 1 connection left, you can discard the edge and decrement the connection count for the two points involved. If at least one of the points has only one connection left, you have got yourself one of the edges you are looking for. You repeat the process until there are no edges left and this should hopefully give you the smallest possible edge sum.
If I am not mistaken this problem is loosely related to the knapsack problem which is NP-Hard so I am not sure if this solution really gives you the best possible one.

I'd say this is an extension to the well-known travelling salesman problem.
A good technique (if a little old-fashioned) is to use a simulated annealing optimisation technique.
You'll need to make adjustments to the cost (a.k.a. objective) function to miss out sections of the path. But given a candidate continuous path, it's reasonably trivial to decide which sections to miss out to minimise its length. (You'd first remove the longer of any intersecting lines).

Wow, that's a tricky one. That's a lot of conditions to meet.
I think from a programming standpoint, the "simplest" solution might actually be to just loop through, find all the possibilities that satisfy the last 3 conditions, and record the total length as you loop through, and just choose the one with the shortest length in the end - brute force, guess-and-check. I think this is what you were referring to in your OP when you mentioned a "sweeping line and building a tree of all possibilities". This approach is very computationally expensive, but if the code is written right, it should always work in the end.
If you want the "best" solution, where you want to just solve for the single final answer right away, I'm afraid my math skills aren't strong enough for that - I'm not even sure if there is any single analytical solution to that problem for any arbitrary collection of points. Maybe try checking with the people over at MathOverflow. If someone over there can explain you with the math behind that calculation, and you then you still need help to convert that math into code in a certain programming language, update your question here (maybe with a link to the answer they provide you) and I'm sure someone will be able to help you out from that point.

One of the possible solutions is to use graph theory.
Construct a bipartite graph G, such that each point has its copy in both parts. Now put the edges between the points i and j with the weight = i == j ? infinity : distance[i][j]. The minimal weight maximum matching in the graph will be your desired configuration.
Notice that since this is on a euclidean 2D plane, the resulting "edges" of the matching will not intersect. Let's say that edges AB and XY intersect for points A, B, X, Y. Then the matching is not of the minimum weight, because either AX, BY or AY, BX will produce a smaller total weight without an intersection (this comes from triangle inequality a+b > c)

Related

Merge adjacent vertices of a graph until single vertex left in the fewest steps possible

I have a game system that can be represented as an undirected, unweighted graph where each vertex has one (relevant) property: a color. The goal of the game in terms of the graph representation is to reduce it down to one vertex in the fewest "steps" possible. In each step, the player can change the color of any one vertex, and all adjacent vertices of the same color are merged with it. (Note that in the example below I just happened to show the user only changing one specific vertex the whole game, but the user can pick any vertex in each step.)
What I am after is a way to compute the fewest amount of steps necessary to "beat" a given graph per the procedure described above, and also provide the specific moves needed to do so. I'm familiar with the basics of path-finding, BFS, and things of that nature, but I'm having a hard time framing this problem in terms of a "fastest path" solution.
I am unable to find this same problem anywhere on Google, or even a graph-theory term that encapsulates the problem. Does anyone have an idea of at least how to get started approaching this problem? Can anyone point me in the right direction?
EDIT Since this problem seems to be really difficult to solve efficiently, perhaps I could change the aim of my question. Could someone describe how I would even set up a brute force, breadth first search for this? (Brute force could possibly be okay, since in practice these graphs will only be 20 vertices at most.) I know how to write a BFS for a normal linked graph data structure... but in this case it seems quite weird since each vertex would have to contain a whole graph within itself, and the next vertices in the search graph would have to be generated based on possible moves to make in the graph within the vertex. How would one setup the data structure and search algorithm to accomplish this?
EDIT 2 This is an old question, but I figured it might help to just state outright what the game was. The game was essentially to be a rip-off of Kami 2 for iOS, except my custom puzzle editor would automatically figure out the quickest possible way to solve your puzzle, instead of having to find the shortest move number by trial and error yourself. I'm not sure if Kami was a completely original game concept, or if there is a whole class of games like it with the same "flood-fill" mechanic that I'm unaware of. If this is a common type of game, perhaps knowing the name of it could allow finding more literature on the algorithm I'm seeking.
EDIT 3 This Stack Overflow question seems like it may have some relevant insights.
Intuitively, the solution seems global. If you take a larger graph, for example, which dot you select first will have an impact on the direct neighbours which will have an impact on their neighbours and so on.
It sounds as if it were of the same breed of problems as the map colouring problem. Not because of the colours but because of the implications of a local selection to the other end of the graph down the road. In the map colouring, you have to decide what colour to draw a country and its neighbouring countries so two countries that touch don't have the same colour. That first set of selections have an impact on whether there is a solution in the subsequent iterations.
Just to show how complex problem is.
Lets check simpler problem where graph is changed with a tree, and only root vertex can change a colour. In that case path to a leaf can be represented as a sequence of colours of vertices on that path. Sequence A of colour changes collapses a leaf if leaf's sequence is subsequence of A.
Problem can be stated that for given set of sequences problem is to find minimal length sequence (S) so that each initial sequence is contained in S. That is called shortest common supersequence problem, and it is NP-complete.
Your problem is for sure more complex than this one :-/
Edit *
This is a comment on question's edit. Check this page for a terms.
Number of minimal possible moves is >= than graph radius. With that it seems good strategy to:
use central vertices for moves,
use moves that reduce graph radius, or at least reduce distance from central vertices to 'large' set of vertices.
I would go with a strategy that keeps track of central vertices and distances of all graph vertices to these central vertices. Step is to check all meaningful moves and choose one that reduce radius or distance to central vertices the most. I think BFS can be used for distance calculation and how move influences them. There are tricky parts, like when central vertices changes after moves. Maybe it is good idea to use not only central vertices but also vertices close to central.
I think the graph term you are looking for is the "valence" of a graph, which is the number of edges that a node is connected to. It looks like you want to change the color based on what node has the highest valence. Then in the resulting graph change the color for the node that has the highest valence, etc. until you have just one node left.

How to break a geometry into blocks?

I am certain there is already some algorithm that does what I need, but I am not sure what phrase to Google, or what is the algorithm category.
Here is my problem: I have a polyhedron made up by several contacting blocks (hyperslabs), i. e. the edges are axis aligned and the angles between edges are 90°. There may be holes inside the polyhedron.
I want to break up this concave polyhedron in as little convex rectangular axis-aligned whole blocks are possible (if the original polyhedron is convex and has no holes, then it is already such a block, and therefore, the solution). To illustrate, some 2-D images I made (but I need the solution for 3-D, and preferably, N-D):
I have this geometry:
One possible breakup into blocks is this:
But the one I want is this (with as few blocks as possible):
I have the impression that an exact algorithm may be too expensive (is this problem NP-hard?), so an approximate algorithm is suitable.
One detail that maybe make the problem easier, so that there could be a more appropriated/specialized algorithm for it is that all edges have sizes multiple of some fixed value (you may think all edges sizes are integer numbers, or that the geometry is made up by uniform tiny squares, or voxels).
Background: this is the structured grid discretization of a PDE domain.
What algorithm can solve this problem? What class of algorithms should I
search for?
Update: Before you upvote that answer, I want to point out that my answer is slightly off-topic. The original poster have a question about the decomposition of a polyhedron with faces that are axis-aligned. Given such kind of polyhedron, the question is to decompose it into convex parts. And the question is in 3D, possibly nD. My answer is about the decomposition of a general polyhedron. So when I give an answer with a given implementation, that answer applies to the special case of polyhedron axis-aligned, but it might be that there exists a better implementation for axis-aligned polyhedron. And when my answer says that a problem for generic polyhedron is NP-complete, it might be that there exists a polynomial solution for the special case of axis-aligned polyhedron. I do not know.
Now here is my (slightly off-topic) answer, below the horizontal rule...
The CGAL C++ library has an algorithm that, given a 2D polygon, can compute the optimal convex decomposition of that polygon. The method is mentioned in the part 2D Polygon Partitioning of the manual. The method is named CGAL::optimal_convex_partition_2. I quote the manual:
This function provides an implementation of Greene's dynamic programming algorithm for optimal partitioning [2]. This algorithm requires O(n4) time and O(n3) space in the worst case.
In the bibliography of that CGAL chapter, the article [2] is:
[2] Daniel H. Greene. The decomposition of polygons into convex parts. In Franco P. Preparata, editor, Computational Geometry, volume 1 of Adv. Comput. Res., pages 235–259. JAI Press, Greenwich, Conn., 1983.
It seems to be exactly what you are looking for.
Note that the same chapter of the CGAL manual also mention an approximation, hence not optimal, that run in O(n): CGAL::approx_convex_partition_2.
Edit, about the 3D case:
In 3D, CGAL has another chapter about Convex Decomposition of Polyhedra. The second paragraph of the chapter says "this problem is known to be NP-hard [1]". The reference [1] is:
[1] Bernard Chazelle. Convex partitions of polyhedra: a lower bound and worst-case optimal algorithm. SIAM J. Comput., 13:488–507, 1984.
CGAL has a method CGAL::convex_decomposition_3 that computes a non-optimal decomposition.
I have the feeling your problem is NP-hard. I suggest a first step might be to break the figure into sub-rectangles along all hyperplanes. So in your example there would be three hyperplanes (lines) and four resulting rectangles. Then the problem becomes one of recombining rectangles into larger rectangles to minimize the final number of rectangles. Maybe 0-1 integer programming?
I think dynamic programming might be your friend.
The first step I see is to divide the polyhedron into a trivial collection of blocks such that every possible face is available (i.e. slice and dice it into the smallest pieces possible). This should be trivial because everything is an axis aligned box, so k-tree like solutions should be sufficient.
This seems reasonable because I can look at its cost. The cost of doing this is that I "forget" the original configuration of hyperslabs, choosing to replace it with a new set of hyperslabs. The only way this could lead me astray is if the original configuration had something to offer for the solution. Given that you want an "optimal" solution for all configurations, we have to assume that the original structure isn't very helpful. I don't know if it can be proven that this original information is useless, but I'm going to make that assumption in this answer.
The problem has now been reduced to a graph problem similar to a constrained spanning forest problem. I think the most natural way to view the problem is to think of it as a graph coloring problem (as long as you can avoid confusing it with the more famous graph coloring problem of trying to color a map without two states of the same color sharing a border). I have a graph of nodes (small blocks), each of which I wish to assign a color (which will eventually be the "hyperslab" which covers that block). I have the constraint that I must assign colors in hyperslab shapes.
Now a key observation is that not all possibilities must be considered. Take the final colored graph we want to see. We can partition this graph in any way we please by breaking any hyperslab which crosses the partition into two pieces. However, not every partition is meaningful. The only partitions that make sense are axis aligned cuts, which always break a hyperslab into two hyperslabs (as opposed to any more complicated shape which could occur if the cut was not axis aligned).
Now this cut is the reverse of the problem we're really trying to solve. That cutting is actually the thing we did in the first step. While we want to find the optimal merging algorithm, undoing those cuts. However, this shows a key feature we will use in dynamic programming: the only features that matter for merging are on the exposed surface of a cut. Once we find the optimal way of forming the central region, it generally doesn't play a part in the algorithm.
So let's start by building a collection of hyperslab-spaces, which can define not just a plain hyperslab, but any configuration of hyperslabs such as those with holes. Each hyperslab-space records:
The number of leaf hyperslabs contained within it (this is the number we are eventually going to try to minimize)
The internal configuration of hyperslabs.
A map of the surface of the hyperslab-space, which can be used for merging.
We then define a "merge" rule to turn two or more adjacent hyperslab-spaces into one:
Hyperslab-spaces may only be combined into new hyperslab-spaces (so you need to combine enough pieces to create a new hyperslab, not some more exotic shape)
Merges are done simply by comparing the surfaces. If there are features with matching dimensionalities, they are merged (because it is trivial to show that, if the features match, it is always better to merge hyperslabs than not to)
Now this is enough to solve the problem with brute force. The solution will be NP-complete for certain. However, we can add an additional rule which will drop this cost dramatically: "One hyperslab-space is deemed 'better' than another if they cover the same space, and have exactly the same features on their surface. In this case, the one with fewer hyperslabs inside it is the better choice."
Now the idea here is that, early on in the algorithm, you will have to keep track of all sorts of combinations, just in case they are the most useful. However, as the merging algorithm makes things bigger and bigger, it will become less likely that internal details will be exposed on the surface of the hyperslab-space. Consider
+===+===+===+---+---+---+---+
| : : A | X : : : :
+---+---+---+---+---+---+---+
| : : B | Y : : : :
+---+---+---+---+---+---+---+
| : : | : : : :
+===+===+===+ +---+---+---+
Take a look at the left side box, which I have taken the liberty of marking in stronger lines. When it comes to merging boxes with the rest of the world, the AB:XY surface is all that matters. As such, there are only a handful of merge patterns which can occur at this surface
No merges possible
A:X allows merging, but B:Y does not
B:Y allows merging, but A:X does not
Both A:X and B:Y allow merging (two independent merges)
We can merge a larger square, AB:XY
There are many ways to cover the 3x3 square (at least a few dozen). However, we only need to remember the best way to achieve each of those merge processes. Thus once we reach this point in the dynamic programming, we can forget about all of the other combinations that can occur, and only focus on the best way to achieve each set of surface features.
In fact, this sets up the problem for an easy greedy algorithm which explores whichever merges provide the best promise for decreasing the number of hyperslabs, always remembering the best way to achieve a given set of surface features. When the algorithm is done merging, whatever that final hyperslab-space contains is the optimal layout.
I don't know if it is provable, but my gut instinct thinks that this will be an O(n^d) algorithm where d is the number of dimensions. I think the worst case solution for this would be a collection of hyperslabs which, when put together, forms one big hyperslab. In this case, I believe the algorithm will eventually work its way into the reverse of a k-tree algorithm. Again, no proof is given... it's just my gut instinct.
You can try a constrained delaunay triangulation. It gives very few triangles.
Are you able to determine the equations for each line?
If so, maybe you can get the intersection (points) between those lines. Then if you take one axis, and start to look for a value which has more than two points (sharing this value) then you should "draw" a line. (At the beginning of the sweep there will be zero points, then two (your first pair) and when you find more than two points, you will be able to determine which points are of the first polygon and which are of the second one.
Eg, if you have those lines:
verticals (red):
x = 0, x = 2, x = 5
horizontals (yellow):
y = 0, y = 2, y = 3, y = 5
and you start to sweep through of X axis, you will get p1 and p2, (and we know to which line-equation they belong ) then you will get p3,p4,p5 and p6 !! So here you can check which of those points share the same line of p1 and p2. In this case p4 and p5. So your first new polygon is p1,p2,p4,p5.
Now we save the 'new' pair of points (p3, p6) and continue with the sweep until the next points. Here we have p7,p8,p9 and p10, looking for the points which share the line of the previous points (p3 and p6) and we get p7 and p10. Those are the points of your second polygon.
When we repeat the exercise for the Y axis, we will get two points (p3,p7) and then just three (p1,p2,p8) ! On this case we should use the farest point (p8) in the same line of the new discovered point.
As we are using lines equations and points 2 or more dimensions, the procedure should be very similar
ps, sorry for my english :S
I hope this helps :)

Tetromino space-filling: need to check if it's possible

I'm writing a program that needs to quickly check whether a contiguous region of space is fillable by tetrominoes (any type, any orientation). My first attempt was to simply check if the number of squares was divisible by 4. However, situations like this can still come up:
As you can see, even though these regions have 8 squares each, they are impossible to tile with tetrominoes.
I've been thinking for a bit and I'm not sure how to proceed. It seems to me that the "hub" squares, or squares that lead to more than two "tunnels", are the key to this. It's easy in the above examples, since you can quickly count the spaces in each such tunnel — 3, 1, and 3 in the first example, and 3, 1, 1, and 2 in the second — and determine that it's impossible to proceed due to the fact that each tunnel needs to connect to the hub square to fit a tetromino, which can't happen for all of them. However, you can have more complicated examples like this:
...where a simple counting technique just doesn't work. (At least, as far as I can tell.) And that's to say nothing of more open spaces with a very small number of hub squares. Plus, I don't have any proof that hub squares are the only trick here. For all I know, there may be tons of other impossible cases.
Is some sort of search algorithm (A*?) the best option for solving this? I'm very concerned about performance with hundreds, or even thousands, of squares. The algorithm needs to be very efficient, since it'll be used for real-time tiling (more or less), and in a browser at that.
Perfect matching on a perfect matching
[EDIT 28/10/2014: As noticed by pix, this approach never tries to use T-tetrominoes, so it is even more likely to give an incorrect "No" answer than I thought...]
This will not guarantee a solution on an arbitrary shape, but it will work quickly and well most of the time.
Imagine a graph in which there is a vertex for each white square, and an edge between two vertices if and only if their corresponding white squares are adjacent. (Each vertex can therefore touch at most 4 edges.) A perfect matching in this graph is a subset of edges such that every vertex touches exactly one edge in the subset. In other words, it is a way of pairing up adjacent vertices -- or in yet other words, a domino tiling of the white squares. Later I'll explain how to find a nicely random-looking perfect matching; for now, let's just assume that it can be done.
Then, starting from this domino tiling, we can just repeat the matching process, gluing dominos together into tetrominos! The only differences the second time around are that instead of having a vertex per white square, we have a vertex per domino; and because we must add an edge whenever two dominos are adjacent, a vertex can now have as many as 6 edges.
The first step (domino tiling) step cannot fail: if a domino tiling for the given shape exists, then one will be found. However, it is possible for the second step (gluing dominos together into tetrominos) to fail, because it has to work with the already-decided domino tiling, and this may limit its options. Here is an example showing how different domino tilings of the same shape can enable or spoil the tetromino tiling:
AABCDD --> XXXYYY Success :)
BC XY
AABBDD --> Failure.
CC
Solving the maximum matching problems
In order to generate a random pattern of dominos, the edges in the graph can be given random weights, and the maximum weighted matching problem can be solved. The weights should be in the range [1, V/(V-2)), to guarantee that it is never possible to achieve a higher score by leaving some vertices unpaired. The graph is in fact bipartite as it contains no odd-length cycles, meaning that the faster O(V^2*E) algorithm for the maximum weighted bipartite matching problem can be used for this step. (This is not true for the second matching problem: one domino can touch two other dominos that touch each other.)
If the second step fails to find a complete set of tetrominos, then either no solution is possible*, or a solution is possible using a different set of dominos. You can try randomly reweighting the graph used to find the domino tiling, and then rerunning the first step. Alternatively, instead of completely reweighting it from scratch, you could just increase the weights for the problematic dominos, and try again.
* For a plain square with even side lengths, we know that a solution is always possible: just fill it with 2x2 square tetrominos.

Solving a picture jumble!

What algorithm would you make a computer use, if you want to solve a picture jumble?
You can always argue in terms of efficiency of the algorithm, but here I am really looking at what approaches to follow.
Thanks
What you need to do is to define an indexing vocabulary for each face of a jigsaw puzzle, such that the index of a right-facing edge can can tell you what the index of a corresponding left-facing edge is (e.g, a simple vocabulary: "convex" and "concave", with "convex" on a face implying "concave" on a matching opposite face), and then classify each piece according to the indexing vocabulary. The finer the vocabulary, the more discrimantory the face-matching and the faster your algorthm will go, however you implement it. (For instance, you may have "flat edge, straight-edge-leans-left, straight-edge-leans-right, concave, convex, knob, knob-hole, ...). We assume that the indexing scheme abstracts the actual shape of the edge, and that there is a predicate "exactly-fits(piece1,edge1,piece2,edge2)" that is true only if the edges exactly match. We further assume that there is at most one exact match of a piece with a particular edge.
The goal is grow a set of regions, e.g., a set of connected pieces, until it is no longer possible to grow the regions. We first mark all pieces with unique region names, 1 per piece, and all edges as unmatched. Then we enumerate the piece edges in any order. For each enumerated piece P with edge E, use the indexing scheme to select potentially matching piece/edge pairs. Check the exactly-fits predicate; at most one piece Q, with edge F, exactly-matches. Combine the regions for P and Q together to make a large region. Repeat. I think this solves the puzzle.
Solving a jigsaw puzzle can basically be reduced to matching like edges to like edges. In a true jigsaw puzzle only one pair of pieces can properly be interlocked along a particular edge, and each piece will have corners so that you know where a particular side starts and ends.
Thus, just find the endpoints of each edge and select a few control points. Then iterate through all pieces with unbound edges until you find the right one. When there are no more unbound edges, you have solved the puzzle.
To elaborate on Ira Baxter's answer, another way of conceptualizing the problem is to think about the jigsaw puzzle as a graph, where each piece is a node and each iterface with another piece is an edge.
For example if you were designing a puzzle game, storing the "answer" this way would make "check if this fits" code a lot faster, since it could be reduced to some sort of hash-lookup.
.1 Find 2x2-grams such that all four edges will fit. Then, evaluate how well the image content matches with each other.
P1 <--> P2
^ ^
| |
v v
P3 <--> P4
.2 Tag orientations (manually or heuristically), but only use them as heuristics scores (for ranking), not as definitive search criteria.
.3 Shape Context

Mapping a set of 3D points to another set with minimal sum of distances

Given are two sets of three-dimensional points, a source and a destination set. The number of points on each set is arbitrary (may be zero). The task is to assign one or no source point to every destination point, so that the sum of all distances is minimal. If there are more source than destination points, the additional points are to be ignored.
There is a brute-force solution to this problem, but since the number of points may be big, it is not feasible. I heard this problem is easy in 2D with equal set sizes, but sadly these preconditions are not given here.
I'm interested in both approximations and exact solutions.
Edit: Haha, yes, I suppose it does sound like homework. Actually, it's not. I'm writing a program that receives positions of a large number of cars and i'm trying to map them to their respective parking cells. :)
One way you could approach this problem is to treat is as the classical assignment problem: http://en.wikipedia.org/wiki/Assignment_problem
You treat the points as the vertices of the graph, and the weights of the edges are the distance between points. Because the fastest algorithms assume that you are looking for maximum matching (and not minimum as in your case), and that the weights are non-negative, you can redefine weights to be e.g.:
weight(A, B) = bigNumber- distance(A,B)
where bigNumber is bigger than your longest distance.
Obviously you end up with a bipartite graph. Then you use one of the standard algorithms for maximum weighted bipartite matching (lots of resources on the web, e.g. http://valis.cs.uiuc.edu/~sariel/teach/courses/473/notes/27_matchings_notes.pdf or Wikipedia for overview: http://en.wikipedia.org/wiki/Perfect_matching#Maximum_bipartite_matchings) This way you will end-up with a O(NM max(N,M)) algoritms, where N and M are sizes of your sets of points.
Off the top of my head, spatial sort followed by simulated annealing.
Grid the space & sort the sets into spatial cells.
Solve the O(NM) problem within each cell, then within cell neighborhoods, and so on, to get a trial matching.
Finally, run lots of cycles of simulated annealing, in which you randomly alter matches, so as to explore the nearby space.
This is heuristic, getting you a good answer though not necessarily the best, and it should be fairly efficient due to the initial grid sort.
Although I don't really have an answer to your question, I can suggest looking into the following topics. (I know very little about this, but encountered it previously on Stack Overflow.)
Nearest Neighbour Search
kd-tree
Hope this helps a bit.

Resources