Fair division of a kingdom [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Recently, I attended a programming competition. There was a problem from it that I am still mulling over. The programming language does not matter, but I wrote it in C++. The task was this:
As you already know, Flatland is located on the plane. There are n
cities in Flatland, i-th of these cities is located at the point (xi,
yi). There are ai citizens living in i-th city. The king of
Flatland has decided to divide the kingdom between his two sons. He
wants to build a wall in the form of infinite straight line; each of
the parts will be ruled by one of the sons. The wall cannot pass
through any city. To avoid envy between brothers, the populations of
two parts must be as close as possible; formally, if a and b are
the total number of citizens living in cities of the first and the
second part respectively, the value of |a - b| must be minimized.
Help the king to find the optimal division. The number of cities is less
than 1000, and all coordinates are integers. The output of the algorithm
should be an integer: the minimal |a - b|.
Okay, if I knew the direction of the line, it would be a really easy task: binary search.
I don't want code, I want ideas, because I don't have any. If I catch the idea, I can write the code!
I don't know the optimal direction, but I think it could be found somehow. So can it be found, or is this task solved some other way?
An example where the horizontal/vertical line is not optimal:
1
\
\
2 \ 1

The Ansatz
A brute force method would be to check all possible divisions...
First, it should be noted that the exact orientation of the line does not matter. It can always be shifted by small amounts, and there are cases with more than one minimum. What matters is which cities go to which side of the kingdom. Even when simply trying all such possible combinations, it is not trivial to find them. To do so, I propose the following algorithm:
How to find all possible divisions
For each pair of cities x and y, the line connecting them, divides the kingdom in "left" and "right". Then consider all possible combinations of left, right, x and y:
left + x + y vs right (C)
left + x vs right + y (A)
left + y vs right + x (D)
left vs right + x + y (B)
Actually, I am not 100% sure, but I think that in this way you can find all possible divisions with a finite number of trials. As the cities have no size (I assumed 0 radius), the line connecting x and y can be shifted slightly to put either city on either side of the kingdom.
One counterexample where this simple method will definitely fail is when more than 2 cities lie on a straight line.
Example
This picture illustrates one step of my above algorithm for the example from the OP. x and y are the two cities with 1 inhabitant each. Actually, with this pair of cities one already gets all possible divisions. (However, 3 points is trivial anyhow, as there is no geometrical restriction on which combinations are possible. Interestingly, the location of the points on the plane only really matters starting with 4 points.)
Collinear points
Following some discussion and fruitful comments, I came to the conclusion that collinear points are not really a problem. One just has to consider these points when evaluating the 4 possible divisions (for each pair of points). E.g. assume there is another point at (-1,2) in the above example. Then this point would lie on the left for A and C and on the right for B and D.

For each angle A, consider the family of parallel lines which make an angle of A with the x-axis, with special case A=0 corresponding to the family of lines parallel to the X-axis.
Given A, you can use a binary search to find the line in the family which divides the kingdom most nearly equally. So we have a function f from angles to integers, mapping each angle A to the minimum value of |a-b| for lines in the family corresponding to A.
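As a rough illustration of evaluating f for one family of parallel lines, here is a minimal Python sketch. The function name min_imbalance_for_normal, the (x, y, population) tuple layout and the use of a normal vector instead of an angle are assumptions made for this example. It also swaps the binary search for a plain linear scan over the sorted projections, which is simpler and costs no more than the sort itself.

```python
from itertools import groupby

def min_imbalance_for_normal(cities, normal):
    """Smallest |a - b| over all walls perpendicular to `normal`.

    cities: list of (x, y, population) tuples (hypothetical layout).
    normal: (nx, ny), a nonzero vector perpendicular to the walls.
    """
    nx, ny = normal
    total = sum(pop for _, _, pop in cities)
    # Project every city onto the normal; a wall is a threshold on this value.
    projected = sorted((x * nx + y * ny, pop) for x, y, pop in cities)
    best = total  # wall placed outside all cities: one side gets everyone
    left = 0
    # Cities with equal projection lie on one line parallel to the wall,
    # so they always end up on the same side and are moved as a group.
    for _, group in groupby(projected, key=lambda item: item[0]):
        left += sum(pop for _, pop in group)
        best = min(best, abs(left - (total - left)))
    return best
```

For the three-city example from the question, a normal of (1, 1) separates the city with 2 inhabitants from the other two, while axis-aligned walls cannot do better than a difference of 2.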
How many angles do we need to try? The situation changes materially only when A is an angle corresponding to a line between two points, an angle which I will call a "jump angle". The function is continuous, and therefore constant, away from jump angles. We have to try jump angles, of which there are about n choose 2, approximately 500,000 at most. We also have to try intervals of angles between jump angles, doubling the size, to 1,000,000 at most.
Instead of angles, it's probably more sensible to use slopes. I just like thinking in terms of angles.
The time complexity of this approach is O(n^2 log n): n^2 for the number of angles and log n for the binary search, provided the cities are already available in sorted order along each direction. Sorting them from scratch for every angle adds a factor of about n. If we can learn more about the function f, it may be possible to use a faster method to minimize f than checking every possibility. For example, it seems reasonable that the minimum of f can be found at an angle not equal to a jump angle.
It may also be possible to eliminate the binary search by using the centroid of the cities. We calculate the weighted average
(a1(x1,y1) + a2(x2,y2) + ... + an(xn,yn))/(a1+a2+...+an)
I think that lines balancing the population will pass through that point. (Hmm.) If that's the case, we only have to think about angles.

Case where n is less than 3
The base case is where there are two cities, in which case you simply take a line perpendicular to the line that connects the two cities.
Case with three or more cities
You can discretize the tangent by taking every pair of cities and using the line that connects them as the direction of the infinite line.
Why this works
If you split the cities into two parts, there is at least one part with two or more cities. For that part, there are two points that are the closest to the border. Whether the border passes "very closely" to the line through those two points or coincides with it does not matter, because a "slightly different tangent" will not swap any city (otherwise these cities were not the closest). Since we try "every border", we will eventually generate a border with the given tangent.
Example:
Say you have the following scenario:
1
\
2\ 1
With the numbers showing the populations. In this case the two closest points at the border are the one at the top and the one on the right. So we construct a line that points 45 degrees downwards. Now we use binary search to find the optimal split: we rotate all points, order them by ascending rotated x-value, then perform binary search on the weights. The optimal one is to split between the origin and the two other points.
Now with four points:
1 2
2 1
Here we will investigate the following lines:
\ 1\|/2 /
\ /|\ /
----+----
/ \|/ \
/ 2/|\1 \
And this will return either the horizontal or the vertical line.
There is one special case, as pointed out by @Nemo: all the points may lie on the same line. In such a case there is no tangent that makes sense, so one can use the perpendicular tangent as well.
Pseudocode:
for v in V
    for w in V\{v}
        calculate tangent
        for tangent and perpendicular tangent
            rotate all points such that the tangent is rotated to the y-axis
            look for a rotated line in the y-direction that splits the cities optimally
return the best split found
Furthermore, like nearly all geometrical approaches, this method can suffer when multiple points are located on the same line, in which case adding a small rotation can either include or exclude one of the points. This is indeed a dirty hack to the problem.
This Haskell program calculates the "optimal direction" (if the above solution is correct) for a given list of points:
import Data.List
type Point = (Int,Int)
type WPoint = (Point,Int)
type Direction = Point
dirmul :: Direction -> WPoint -> Int
dirmul (dx,dy) ((xa,ya),_) = xa*dx+ya*dy
dirCompare :: Direction -> WPoint -> WPoint -> Ordering
dirCompare d pa pb = compare (dirmul d pa) (dirmul d pb)
optimalSplit :: [WPoint] -> Direction
optimalSplit pts = (-dy,dx)
    where wsum = sum $ map snd pts
          (dx,dy) = argmin (bestSplit pts wsum) $ concat [splits pa pb | pa <- pts, pb <- pts, pa /= pb]
splits :: WPoint -> WPoint -> [Direction]
splits ((xa,ya),_) ((xb,yb),_) = [(xb-xa,yb-ya),(ya-yb,xb-xa)]
bestSplit :: [WPoint] -> Int -> Direction -> Int
bestSplit pts wsum d = bestSplitScan cmp ordl 0 wsum
    where cmp = dirCompare d
          ordl = sortBy cmp pts
bestSplitScan :: ((a,Int) -> (a,Int) -> Ordering) -> [(a,Int)] -> Int -> Int -> Int
bestSplitScan _ [] l r = abs $ l-r
bestSplitScan cmp ((x1,w1):xs) l r = min (abs $ l-r) (bestSplitScan cmp (dropWhile eqf xs) (l+d) (r-d))
    where eqf = (==) EQ . cmp (x1,w1)
          d = w1+(sum $ map snd $ takeWhile eqf xs)
argmin :: (Ord b) => (a -> b) -> [a] -> a
argmin _ [x] = x
argmin f (x:xs) | (f x) <= f ax = x
                | otherwise = ax
    where ax = argmin f xs
For instance:
*Main> optimalSplit [((0,0),2),((0,1),1),((1,0),1)]
(-1,1)
*Main> optimalSplit [((0,0),2),((0,1),1),((1,0),1),((1,1),2)]
(-1,0)
So the direction is a line in which, if the line moves one element to the left, it moves one element up as well. This is the first example. For the second case, it picks a line that moves in the x-direction, so it splits horizontally. This algorithm allows only integral points and does not take into account slightly tweaking the line in case points are placed on the same line: for a parallel line, these are either all in or all out.

[Edit: Bold-faced text is relevant to concerns expressed previously in comments.]
[Edit 2: As I should have pointed out earlier, this answer is a supplement to the earlier answer by tobi303, which gives a similar algorithm. The main purpose was to show that the basic idea of that algorithm is sound and sufficiently general.
Despite minor differences in the details of the algorithms proposed in the two answers, I think a careful reading of the "why it works" section, applied to either algorithm, will show that the algorithm is in fact complete.]
If all the cities are in one straight line
(including the case where there
are only one or two cities), then the solution is simple.
I assume you can detect and solve this case, so the rest of the
answer will deal with all other cases.
If there are more than two cities and they are not all collinear,
the "brute force" solution is:
for each city X,
    for each city Y where Y is not X
        construct a directed line that passes through X and then Y.
        Divide the cities in two subsets:
            S1 = all the cities to the left of this line
            S2 = all the other cities (including cities exactly on the line)
        Evaluate the "unfairness" of this division.
Among all subdivisions of cities found in this way,
choose the one with the least unfairness. Return the difference. Done.
Note that the line found in this way is not the line that divides the cities "fairly"; it is merely parallel to some such line.
If we had to find the actual dividing line we would have to do a little more work to figure out
exactly where to put that parallel line. But the requested return value
is merely |a-b|.
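The brute-force procedure above can be sketched in Python roughly as follows. The function name min_unfairness and the (x, y, population) tuple layout are assumptions for this example, and the sketch presumes at least three cities at distinct positions that are not all collinear. It runs in O(n^3) time, which is about 10^9 elementary steps for n = 1000, so it illustrates the idea rather than a tuned solution.

```python
def min_unfairness(cities):
    """Minimal |a - b| over the divisions generated by directed lines X -> Y.

    cities: list of (x, y, population) tuples (hypothetical layout).
    """
    total = sum(pop for _, _, pop in cities)
    best = total
    for i, (x1, y1, _) in enumerate(cities):
        for j, (x2, y2, _) in enumerate(cities):
            if i == j:
                continue
            dx, dy = x2 - x1, y2 - y1
            # S1: cities strictly to the left of the directed line X -> Y,
            # detected by a positive cross product. Everything else,
            # including cities exactly on the line, belongs to S2.
            s1 = sum(pop for x, y, pop in cities
                     if dx * (y - y1) - dy * (x - x1) > 0)
            best = min(best, abs(s1 - (total - s1)))
    return best
```

Only integer arithmetic is used, so the left/right test is exact for integer coordinates.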
Why this works:
Suppose that the line L1 divides the cities in the fairest way possible.
There is not a unique line that does this;
there will be (mathematically speaking) an infinite number of lines
that achieve the same "best" division, but such lines exist, and
all we need to suppose is that L1 is one of those lines.
Let the city A be the closest to L1 on one side of the line
and the city B be closest to L1 on the other side.
(If A and B are not uniquely identified, that is if there are two or more
cities on one side of L1 that are tied for "closest to L1",
we can set L2 = L1 and skip forward to the procedure for L2, below.)
Consider rotations of L1 in each direction, using the point where L1 crosses
the line AB as a pivot point. In at least one direction of rotation,
a rotated image of L1 will "hit" one of the other cities,
call it C, without touching either A or B.
(This follows from the fact that the cities are not all along one line.)
At that point, C is closer to the image of L1 than A or B (whichever
of those cities is on the same side of the original L1 as C was).
The Intermediate Value Theorem of calculus tells us that at some point during
the rotation, C was exactly as close to the rotated image of L1
as the city A or B, whichever is on the same side of that line.
What this shows is that there is always a line L2 that divides the cities
as fairly as possible, such that there are two cities, D and E,
on the same side of L2 and tied for "closest city to L2" among all
cities on that side of L2.
Now consider two directed lines through D and E: L3, which passes through
D and then E, and L4, which passes through E and then D.
The cities that are on the other side of L2 than D and E consist either of
all the cities to the left of L3, or all the cities to the left of L4.
(Note that this works even if L3 and L4 happen
to pass through more than two cities.)
The procedure described before is simply a way to find all possible
lines that could be line L3 or line L4 in any execution of this
procedure starting from a line L1 that solves the problem.
(Note that while there are always infinite possible choices of L1,
every such L1 results in lines L3 and L4 selected from the finite set of
lines that pass through two or more cities.)
So the procedure will find the division of cities described by L1,
which is the solution to the problem.

Related

Finding closest pair of points in the plane with non-distinct x-coordinates in O(n log n)

Most of the implementations of the algorithm to find the closest pair of points in the plane that I've seen online have one of two deficiencies: either they fail to meet an O(n log n) runtime, or they fail to accommodate the case where some points share an x-coordinate. Is a hash map (or equivalent) required to solve this problem optimally?
Roughly, the algorithm in question is (per CLRS Ch. 33.4):
For an array of points P, create additional arrays X and Y such that X contains all points in P, sorted by x-coordinate and Y contains all points in P, sorted by y-coordinate.
Divide the points in half - drop a vertical line so that you split X into two arrays, XL and XR, and divide Y similarly, so that YL contains all points left of the line and YR contains all points right of the line, both sorted by y-coordinate.
Make recursive calls for each half, passing XL and YL to one and XR and YR to the other, and finding the minimum distance, d in each of those halves.
Lastly, determine if there's a pair with one point on the left and one point on the right of the dividing line with distance smaller than d; through a geometric argument, we find that we can adopt the strategy of just searching through the next 7 points for every point within distance d of the dividing line, meaning the recombination of the divided subproblems is only an O(n) step (even if it looks like n^2 at first glance).
This has some tricky edge cases. One way people deal with this is sorting the strip of points of distance d from the dividing line at every recombination step (e.g. here), but this is known to result in an O(n log^2 n) solution.
Another way people deal with edge cases is by assuming each point has a distinct x-coordinate (e.g. here): note the snippet in closestUtil which adds to Pyl (or YL as we call it) if the x-coordinate of a point in Y is <= the line, or to Pyr (YR) otherwise. Note that if all points lie on the same vertical line, this would result in us writing past the end of the array in C++, as we write all n points to YL.
So the tricky bit when points can have the same x-coordinate is dividing the points in Y into YL and YR depending on whether a point p in Y is in XL or XR. The pseudocode in CLRS for this is (edited slightly for brevity):
for i = 1 to Y.length
    if Y[i] in X_L
        Y_L.length = Y_L.length + 1;
        Y_L[Y_L.length] = Y[i]
    else Y_R.length = Y_R.length + 1;
        Y_R[Y_R.length] = Y[i]
However, outside of pseudocode, if we're working with plain arrays, we don't have a magic function that can determine whether Y[i] is in X_L in O(1) time. If we're assured that all x-coordinates are distinct, sure: we know that anything with an x-coordinate less than the dividing line is in XL, so with one comparison we know which array to partition any point p in Y into. But in the case where x-coordinates are not necessarily distinct (e.g. in the case where they all lie on the same vertical line), do we require a hash map to determine whether a point in Y is in XL or XR and successfully break down Y into YL and YR in O(n) time? Or is there another strategy?
Yes, there are at least two approaches that work here.
The first, as Bing Wang suggests, is to apply a rotation. If the angle is sufficiently small, this amounts to breaking ties by y coordinate after comparing by x, no other math needed.
The second is to adjust the algorithm on G4G to use a linear-time partitioning algorithm to divide the instance, and a linear-time sorted merge to conquer it. Presumably this was not done because the author valued the simplicity of sorting relative to the previously mentioned algorithms in most programming languages.
Kleinberg & Tardos suggest annotating each point with its position (index) in X.
You could do this in O(n) time, or, if you really, really want to, you could do it "for free" in the sorting operation.
With this annotation, you could do your O(1) partitioning, and then take the position pr of the right-most point in Xl in O(1), using it to determine whether a point in Y goes in Yl (position <= pr), or Yr (position > pr). This does not require an extra data structure like a hash map, but it does require that those same positions are used in X and Y.
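A minimal Python sketch of that annotation idea, showing only the divide step and not the full closest-pair algorithm. The helper names annotate and split and the (x, y, rank) tuple layout are assumptions for this example.

```python
def annotate(points):
    """Sort by (x, y) and attach each point's position in X as a third field."""
    X = [(x, y, rank) for rank, (x, y) in enumerate(sorted(points))]
    Y = sorted(X, key=lambda p: p[1])  # the same tuples, ordered by y
    return X, Y

def split(X, Y):
    """One divide step: partition X and Y in O(n) using ranks only.

    Expects len(X) >= 2 so that the left half is not empty.
    """
    mid = len(X) // 2
    XL, XR = X[:mid], X[mid:]
    pr = XL[-1][2]                      # rank of the right-most point in XL
    YL = [p for p in Y if p[2] <= pr]   # a single pass keeps the y-order
    YR = [p for p in Y if p[2] > pr]
    return XL, XR, YL, YR
```

Because the ranks are unique even when x-coordinates repeat, points on one vertical line are still split evenly, and the same ranks keep working in the recursive calls since pr is read from the sub-array itself.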
NB:
It is not immediately obvious to me that the partitioning of Y is the only problem that arises when multiple points have the same coordinate on the x-axis. It seems to me that the proof of linearity of the number of comparisons necessary across partitions breaks, but I have seen only the proof that you need only 15 comparisons, not the proof for the stricter 7-point version, so I cannot be sure.

Skew Lines Midpoint from 4 points?

I'm trying to understand the basics underlying a piece of source code I was given to use. It works, this is proven. I'm just trying to wrap my head around the why of it well enough that I could do it myself, or possibly extend/expand upon it.
The code in question finds the midpoint of the shortest line between two skew lines in 3D space. This paper is the closest I've come to finding something that matches, but I'm still missing some conceptual steps (and my linear algebra skills are decades out of use)
In this application, P1, P2, P3, and P4 are 3D (X,Y,Z) points in space. The lines we're concerned with are P1-P2 and P3-P4.
The language this system runs on doesn't include a Determinant function, hence why the original programmer wrote their own. VectMagn is a system function that simply returns the Norm of a 3D point value (ie, SQRT(X^2 + Y^2 + Z^2)). Pow(i,j) is just what it looks like, returning i^j.
Where I'm most getting stuck is the large formula for t. Based on my research so far, I would expect that both lines would need to be converted into unit-vector lines, then processed as per the first PDF. But the t formula appears to be doing all of this in one jump, and I'm missing the intermediate steps. It's obviously creating two 2x2 matrices from various matrix math on Ps 1-4, then dividing the Determinant of one matrix by the other.
If I'm understanding this correctly, t is r1 and r2 from the PDF, depending on which order the Points were passed to iv3DSkewLinePoint. But I haven't yet found any papers or formulae that explain why/how this particular algorithm works. So far, everything I've found starts with unit-vector lines and moves on from there.
! Returns the determinant of a 2x2 matrix
LOCAL FUNC num Det(num a,num b,num c,num d)
    RETURN (a*d-c*b);
ENDFUNC
! Returns a point on a line (P1-P2) closest to a point on a skewed line (P3-P4)
FUNC pos iv3DSkewLinePoint(pos P1,pos P2,pos P3,pos P4)
    VAR num x;
    VAR num y;
    VAR num z;
    VAR num t;
    t:=Det(DotProd(P3-P1,P2-P1),DotProd(P4-P3,P2-P1),DotProd(P3-P1,P4-P3),Pow(VectMagn(P4-P3),2))/Det(Pow(VectMagn(P2-P1),2),DotProd(P4-P3,P2-P1),DotProd(P2-P1,P4-P3),Pow(VectMagn(P4-P3),2));
    x:=P1.x+(P2.x-P1.x)*t;
    y:=P1.y+(P2.y-P1.y)*t;
    z:=P1.z+(P2.z-P1.z)*t;
    RETURN [x,y,z];
ENDFUNC
! Returns the closest point to two skewed lines in space
FUNC pos iv3DSkewLineMidpoint(pos P1,pos P2,pos P3,pos P4)
    RETURN 0.5*(iv3DSkewLinePoint(P1,P2,P3,P4)+iv3DSkewLinePoint(P3,P4,P1,P2));
ENDFUNC
The shortest segment between two skew lines in 3D must be perpendicular to both (perpendicular projection is the shortest one).
So we have to get two points A and B that fulfill the next conditions:
A lies on P1..P2 line, so in parametric form using vector notation:
A = P1 + u*(P2-P1)
B lies on P3..P4 line, so in parametric form
B = P3 + v*(P4-P3)
vector AB is perpendicular to P1P2, so dot product is zero
(B-A).dot.(P2-P1) = 0
vector AB is perpendicular to P3P4, so dot product is zero
(B-A).dot.(P4-P3) = 0
The rest is vector algebra to find the parameters u and v (t in iv3DSkewLinePoint in your code).
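To make that algebra concrete, here is a small Python sketch. Writing d1 = P2-P1, d2 = P4-P3 and w = P3-P1 (names introduced here for illustration), substituting A and B into the two dot-product conditions gives a 2x2 linear system in u and v, and Cramer's rule yields u as a ratio of two 2x2 determinants, which has the same shape as the formula for t in the question. The function names are assumptions for this example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def closest_point_on_first_line(p1, p2, p3, p4):
    """Point on line p1-p2 that is closest to line p3-p4."""
    d1 = sub(p2, p1)
    d2 = sub(p4, p3)
    w = sub(p3, p1)
    # With B - A = w + v*d2 - u*d1, the two perpendicularity conditions are
    #   u*(d1.d1) - v*(d1.d2) = w.d1
    #   u*(d1.d2) - v*(d2.d2) = w.d2
    # Cramer's rule for u (a common sign is cancelled top and bottom):
    denom = dot(d1, d1) * dot(d2, d2) - dot(d1, d2) ** 2
    if denom == 0:
        raise ValueError("lines are parallel or a line is degenerate")
    u = (dot(w, d1) * dot(d2, d2) - dot(w, d2) * dot(d1, d2)) / denom
    return tuple(a + u * d for a, d in zip(p1, d1))

def skew_midpoint(p1, p2, p3, p4):
    """Midpoint of the shortest segment between the two lines."""
    a = closest_point_on_first_line(p1, p2, p3, p4)
    b = closest_point_on_first_line(p3, p4, p1, p2)  # roles swapped gives v
    return tuple((x + y) / 2 for x, y in zip(a, b))
```

For example, for the x-axis and a line parallel to the y-axis at height z = 1, the closest points are (0, 0, 0) and (0, 0, 1), so the midpoint is (0, 0, 0.5).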
I think that the first expressions in the paper, with dot products, are simpler to calculate than the expressions with many vector products at the end. Note that the wiki approach requires only one vector product calculation.
It seems your long formula for t represents the r1 formula from the paper, but it is rather hard to read.
in addition:
Paul Bourke short article with codes
My code based on "Geometric Tools for Computer Graphics" book of D.Eberly (geometrictools.com)

How many paths of length n with the same start and end point can be found on a hexagonal grid?

Given this question, what about the special case when the start point and end point are the same?
Another change in my case is that we must move at every step. How many such paths can be found and what would be the most efficient approach? I guess this would be a random walk of some sort?
My thinking so far is: since we must always return to our starting point, thinking about n/2 might be easier. At every step, except at step n/2, we have 6 choices. At n/2 we have a different number of choices depending on whether n is even or odd. We also have a different number of choices depending on where we are (what previous choices we made). For example, if n is even and we went straight out, we only have one choice at n/2: going back. But if n is even and we didn't go straight out, we have more choices.
It is all the cases at this turning point that I have trouble getting straight.
Am I on the right track?
To be clear, I just want to count the paths. So I guess we are looking for some conditioned permutation?
This version of the combinatorial problem looks like it actually has a short formula as an answer.
Nevertheless, the general version, both this and the original question's, can be solved by dynamic programming in O (n^3) time and O (n^2) memory.
Consider a hexagonal grid which spans at least n steps in all directions from the target cell.
Introduce a coordinate system, so that every cell has coordinates of the form (x, y).
Let f (k, x, y) be the number of ways to arrive at cell (x, y) from the starting cell after making exactly k steps.
These can be computed either recursively or iteratively:
f (k, x, y) is just the sum of f (k-1, x', y') for the six neighboring cells (x', y').
The base case is f (0, xs, ys) = 1 for the starting cell (xs, ys), and f (0, x, y) = 0 for every other cell (x, y).
The answer for your particular problem is the value f (n, xs, ys).
The general structure of an iterative solution is as follows:
let f be an array [0..n] [-n-1..n+1] [-n-1..n+1] (all inclusive) of integers
f[0][*][*] = 0
f[0][xs][ys] = 1
for k = 1, 2, ..., n:
    for x = -n, ..., n:
        for y = -n, ..., n:
            f[k][x][y] =
                f[k-1][x-1][y] +
                f[k-1][x][y-1] +
                f[k-1][x+1][y] +
                f[k-1][x][y+1]
answer = f[n][xs][ys]
OK, I cheated here: the solution above is for a rectangular grid, where the cell (x, y) has four neighbors.
The six neighbors of a hexagon depend on how exactly we introduce a coordinate system.
I'd prefer other coordinate systems than the one in the original question.
This link gives an overview of the possibilities, and here is a short summary of that page on StackExchange, to protect against link rot.
My personal preference would be axial coordinates.
Note that, if we allow standing still instead of moving to one of the neighbors, that just adds one more term, f[k-1][x][y], to the formula.
The same goes for using triangular, rectangular, or hexagonal grid, for using 4 or 8 or some other subset of neighbors in a grid, and so on.
If you want to arrive to some other target cell (xt, yt), that is also covered: the answer is the value f[n][xt][yt].
Similarly, if you have multiple start or target cells, and you can start and finish at any of them, just alter the base case or sum the answers in the cells.
The general layout of the solution remains the same.
This obviously runs in n * (2n+1) * (2n+1) * number-of-neighbors steps, which is O(n^3) for any constant number of neighbors (4 or 6 or 8...) a cell may have in our particular problem.
Finally, note that, at step k of the main loop, we need only two layers of the array f: f[k-1] is the source layer, and f[k] is the target layer.
So, instead of storing all layers for the whole time, we can store just two layers, as we don't need more: one for odd k and one for even k.
Using only two layers is as simple as changing all f[k] and f[k-1] to f[k%2] and f[(k-1)%2], respectively.
This lowers the memory requirement from O(n^3) down to O(n^2), as advertised in the beginning.
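As a concrete illustration of the dynamic programming with only two layers, here is a minimal Python sketch for the hexagonal case using axial coordinates. The function name count_hex_walks and the constant HEX_STEPS are assumptions for this example, and dictionaries keyed by cell stand in for the fixed-size arrays of the pseudocode, so cells that cannot be reached are never stored.

```python
from collections import defaultdict

# The six neighbours of a cell in axial coordinates (q, r).
HEX_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def count_hex_walks(n, target=(0, 0)):
    """Number of n-step walks from (0, 0) to `target`, moving at every step."""
    current = {(0, 0): 1}          # layer k = 0
    for _ in range(n):
        nxt = defaultdict(int)     # layer k, built from layer k - 1
        for (q, r), ways in current.items():
            for dq, dr in HEX_STEPS:
                nxt[(q + dq, r + dr)] += ways
        current = nxt              # the older layer is no longer needed
    return current.get(target, 0)
```

For the closed-walk case in the question this gives 1, 0, 6, 12, 90 for n = 0 through 4.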
For a more mathematical solution, here are some steps that would perhaps lead to one.
First, consider the following problem: what is the number of ways to go from (xs, ys) to (xt, yt) in n steps, each step moving one square north, west, south, or east?
To arrive from x = xs to x = xt, we need H = |xt - xs| steps in the right direction (without loss of generality, let it be east).
Similarly, we need V = |yt - ys| steps in another right direction to get to the desired y coordinate (let it be south).
We are left with k = n - H - V "free" steps, which can be split arbitrarily into pairs of north-south steps and pairs of east-west steps.
Obviously, if k is odd or negative, the answer is zero.
So, for each possible split k = 2h + 2v of "free" steps into horizontal and vertical steps, what we have to do is construct a path of H+h steps east, h steps west, V+v steps south, and v steps north. These steps can be done in any order.
The number of such sequences is a multinomial coefficient, and is equal to n! / (H+h)! / h! / (V+v)! / v!.
To finally get the answer, just sum these over all possible h and v such that k = 2h + 2v.
This solution calculates the answer in O(n) if we precalculate the factorials, also in O(n), and consider all arithmetic operations to take O(1) time.
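The sum for the square grid can be written out as a short Python sketch. The function name count_square_walks is an assumption for this example, and the factorials are computed on demand instead of being precalculated, which is enough to check small cases.

```python
from math import factorial

def count_square_walks(n, xs, ys, xt, yt):
    """Number of n-step walks on the square grid from (xs, ys) to (xt, yt)."""
    H, V = abs(xt - xs), abs(yt - ys)
    k = n - H - V                  # "free" steps that must cancel in pairs
    if k < 0 or k % 2:
        return 0
    total = 0
    for h in range(k // 2 + 1):    # h east-west pairs, v north-south pairs
        v = k // 2 - h
        total += factorial(n) // (factorial(H + h) * factorial(h)
                                  * factorial(V + v) * factorial(v))
    return total
```

For walks that return to the start, this gives 4 for n = 2 and 36 for n = 4.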
For a hexagonal grid, a complicating feature is that there is no such clear separation into horizontal and vertical steps.
Still, given the starting cell and the number of steps in each of the six directions, we can find the final cell, regardless of the order of these steps.
So, a solution can go as follows:
Enumerate all possible partitions of n into six summands a1, ..., a6.
For each such partition, find the final cell.
For each partition where the final cell is the cell we want, add multinomial coefficient n! / a1! / ... / a6! to the answer.
As is, this takes O(n^6) time and O(1) memory.
By carefully studying the relations between different directions on a hexagonal grid, perhaps we can actually consider only the partitions which arrive at the target cell, and completely ignore all other partitions.
If so, this solution can be optimized into at least some O(n^3) or O(n^2) time, maybe further with decent algebraic skills.
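The unoptimized enumeration can be sketched in Python as follows, for the closed-walk case only. The function name and the assignment of a1..a6 to the axial directions (1,0), (-1,0), (0,1), (0,-1), (1,-1), (-1,1) are assumptions for this example. It is only practical for small n.

```python
from math import factorial

def count_hex_closed_walks_by_enumeration(n):
    """Closed n-step walks on a hexagonal grid, summing multinomials."""
    total = 0
    for a1 in range(n + 1):
        for a2 in range(n + 1 - a1):
            for a3 in range(n + 1 - a1 - a2):
                for a4 in range(n + 1 - a1 - a2 - a3):
                    for a5 in range(n + 1 - a1 - a2 - a3 - a4):
                        a6 = n - a1 - a2 - a3 - a4 - a5
                        # Final cell in axial coordinates for these counts.
                        q = a1 - a2 + a5 - a6
                        r = a3 - a4 - a5 + a6
                        if q == 0 and r == 0:
                            ways = factorial(n)
                            for a in (a1, a2, a3, a4, a5, a6):
                                ways //= factorial(a)
                            total += ways
    return total
```

The results agree with the dynamic programming values for small n: 1, 0, 6, 12, 90 for n = 0 through 4.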

Choice of optimization algorithm for distributing lines inside a shape

Consider the following representation of a concrete slab element with reinforcement bars and holes.
I need an algorithm that automatically distributes lines over an arbitrary shape with different holes.
The main constraints are:
Lines cannot be outside of the region or inside a hole
The distance between two side-by-side lines cannot exceed a variable D
Lines have to be positioned on a fixed interval I, i.e. y mod I = 0, where y is the line's Y coordinate.
Each available point inside the shape cannot be further from a line than D/2
I want to optimize the solution by minimizing the total number of lines N. What kind of optimization algorithm would suit this problem?
I assume most approaches involve simplifying the shape into a raster (with pixel height I) and disabling or enabling each pixel. I thought this was an obvious LP problem and tried to set it up with GLPK, but found it very hard to describe the problem using this simplified raster for an arbitrary number of lines. I also suspect that the solution space might be too big.
I have already implemented an algorithm in C# that does the job, but not very optimized. This is how it works:
Create a simplified raster of the geometry
Calculate a score for each cell using a complicated formula that takes possible line length and distances to other rods and obstacles into account.
Determine which cells need reinforcement (where number of free cells in y direction > D)
Pick the cell with the highest score that needs reinforcement, and reinforce it as far as possible in -x and +x directions
Repeat
Depending on the complicated formula, this works pretty well but starts giving unwanted results when placing the last few lines, since it can never move an already placed line.
Are there any other optimization techniques that I should have a look at?
I'm not sure what follows is what you want - I'm fairly sure it's not what you had in mind - but if it sounds reasonable you might try it out.
Because the distance is simply at most d, and can be anything less than that, it seems at first blush like a greedy algorithm should work here. Always place the next line(s) so that (1) as few as possible are needed and (2) they are as far away from existing lines as possible.
Assume you have an optimal algorithm for this problem, and it places the next line(s) at distance a <= d from the last line. Say it places b lines. Our greedy algorithm will definitely place no more than b lines (since the first criterion is to place as few as possible), and if it places b lines it will place them at distance c with a <= c <= d, since it then places the lines as far as possible.
If the greedy algorithm did not do what the optimal algorithm did, it differed in one of the following ways:
It placed the same or fewer lines farther away. Suppose the optimal algorithm had gone on to place b' lines at distance a' away at the next step. Then these lines would be at distance a+a' and there would be b+b' lines in total. But the greedy algorithm can mimic the optimal algorithm in this case by placing b' lines at displacement a+a' by choosing c' = (a+a') - c. Since c > a and a' < d, c' < d and this is a legal placement.
It placed fewer lines closer together. This case is actually problematic. It is possible that this places k unnecessary lines, if any placement requires at least k lines and the farthest ones require more, and the arrangement of holes is chosen so that (e.g.) the distance it spans is a multiple of d.
So the greedy algorithm turns out not to work in case 2. However, it does in other cases. In particular, our observation in the first case is very useful: for any two placements (distance, lines) and (distance', lines'), if distance >= distance' and lines <= lines', the first placement is always to be preferred. This suggests the following algorithm:
PlaceLines(start, stop)
    // if we are close enough to the other edge,
    // don't place any more lines.
    if start + d >= stop then return ([], 0)
    // see how many lines we can place at distance
    // d from the last placed lines. no need to
    // ever place more lines than this
    nmax = min_lines_at_distance(start + d)
    // see how that selection pans out by recursively
    // seeing how line placement works after choosing
    // nmax lines at distance d from the last lines.
    optimal = PlaceLines(start + d, stop)
    optimal[0] = [d] . optimal[0]
    optimal[1] = nmax + optimal[1]
    // we only need to try fewer lines, never more
    for n = 1 to nmax do
        // find the max displacement a from the last placed
        // lines where we can place n lines.
        a = max_distance_for_lines(start, stop, n)
        if a is undefined then continue
        // see how that choice pans out by placing
        // the rest of the lines
        candidate = PlaceLines(start + a, stop)
        candidate[0] = [a] . candidate[0]
        candidate[1] = n + candidate[1]
        // replace the last best placement with the
        // one we just tried, if it turned out to be
        // better than the last
        if candidate[1] < optimal[1] then
            optimal = candidate
    // return the best placement we found
    return optimal
This can be improved by memoization by putting results (seq, lines) into a cache indexed by (start, stop). That way, we can recognize when we are trying to compute assignments that may already have been evaluated. I would expect that we'd have this case a lot, regardless of whether you use a coarse or a fine discretization for problem instances.
I won't go into detail about how the max_lines_at_distance and max_distance_for_lines functions might work, but a word on each.
The first tells you at a given displacement how many lines are required to span the geometry. If you have pixelated your geometry and colored holes black, this would mean looking at the row of cells at the indicated displacement, considering the contiguous black line segments, and determining from there how many lines that implies.
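As a rough illustration of the pixelated approach, the sketch below counts the contiguous black segments in each row of a grid. The grid representation (a list of rows, truthy cells meaning "hole") and the names `count_black_runs` and `runs_per_row` are assumptions made for this example; how a number of segments translates into a number of lines depends on the geometry and is left open here.

```python
def count_black_runs(row):
    """Count maximal contiguous runs of black (truthy) cells in one row."""
    runs = 0
    previous_black = False
    for cell in row:
        # A run starts wherever a black cell follows a non-black cell.
        if cell and not previous_black:
            runs += 1
        previous_black = bool(cell)
    return runs


def runs_per_row(grid):
    """Precompute the run count for every row (one row per displacement)."""
    return [count_black_runs(row) for row in grid]


if __name__ == "__main__":
    grid = [
        [0, 1, 1, 0, 1],  # two separate black segments
        [0, 0, 0, 0, 0],  # no holes in this row
        [1, 1, 1, 1, 1],  # one long segment
    ]
    print(runs_per_row(grid))
```

Computing this once per row gives a table that can be looked up by displacement instead of rescanning the geometry.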
The second tells you, for a given candidate number of lines, the maximum distance from the current position at which that number of lines can be placed. You could make this better by having it tell you the maximum distance at which that number of lines, or fewer, could be placed. If you use this improvement, you could reverse the direction in which you're iterating n and:
if f(start, stop, x) = a and y < x, you only need to search up to a, not stop, from then on;
if f(start, stop, x) is undefined and y < x, you don't need to search any more.
Note that this function can be undefined if it is impossible to place n or fewer lines anywhere between start and stop.
Note also that you can memoize these functions separately to save repeated lookups. You can precompute max_lines_at_distance for each row and store it in a cache for later. Then, max_distance_for_lines could be a loop that checks the cache back to front inside two bounds.
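One way the memoized recursion could look in Python is sketched below. It assumes the per-row line counts have already been precomputed into a list, here called `lines_needed` (a hypothetical name), with integer displacements, the last placed lines at row 0, and `stop` equal to the length of the list. It uses the "n or fewer" variant of max_distance_for_lines, so a candidate is charged the actual count of the row it lands on rather than n. Treat it as a sketch under those assumptions, not a tuned implementation.

```python
from functools import lru_cache


def place_lines(lines_needed, d):
    """Return (displacements, total_lines) for rows 0..len(lines_needed)-1.

    lines_needed[r] is the (precomputed, assumed) number of lines row r needs;
    consecutive placements may be at most d rows apart.
    """
    stop = len(lines_needed)

    def max_distance_for_lines(start, n):
        # Scan back to front: the largest displacement a <= d such that
        # the row at start + a needs n lines or fewer.
        for a in range(d, 0, -1):
            pos = start + a
            if pos < stop and lines_needed[pos] <= n:
                return a
        return None  # undefined: no such row within reach

    @lru_cache(maxsize=None)  # memoize on start (stop is fixed here)
    def solve(start):
        # Close enough to the other edge: nothing more to place.
        if start + d >= stop:
            return ((), 0)
        # Baseline: jump the full distance d.
        nmax = lines_needed[start + d]
        seq, total = solve(start + d)
        best = ((d,) + seq, nmax + total)
        # Only strictly fewer lines than nmax can beat the baseline.
        for n in range(nmax):
            a = max_distance_for_lines(start, n)
            if a is None:
                continue
            seq, total = solve(start + a)
            candidate = ((a,) + seq, lines_needed[start + a] + total)
            if candidate[1] < best[1]:
                best = candidate
        return best

    return solve(0)


if __name__ == "__main__":
    print(place_lines([0, 3, 1, 2, 5, 1, 4, 2, 0], 3))
```

Because the recursion is keyed only on `start`, each row is solved at most once. For very long inputs the recursion depth could become an issue, in which case an iterative table filled from the far edge backwards would be the natural replacement.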

Find points given distances between them

Here is an example:
Suppose there are 4 points: A, B, C, and D
Given that Point A is at (0,0):
and the distances:
A to B: 7
A to C: 5
A to D: 9
B to C: 6
B to D: 5
C to D: 7
The goal would be to find a solution to points B(x,y), C(x,y) and D(x,y)
What is an algorithm to find the points ( up to 50 of them ) given the distances between them?
OK, you have 4 points A, B, C, and D, separated from one another such that the pairwise distances are AB=7, AC=5, BC=6, AD=9, BD=5, and CD=7. The result is Axyz=(0,0,0), Bxyz=(7,0,0), Cxyz=(2.7,4.2,0), Dxyz=(7.5,1.9,4.6) (rounding to the first decimal). Here is how to get there.
We set point A at the origin Axyz= (0,0,0).
We set point B at x=7,y=0,z=0 Bxyz= (7,0,0).
We find the x coordinate for point C by using the law of cosines:
((AB^2+AC^2-BC^2)/2)/Bx = Cx
((7^2+5^2-6^2)/2)/7=
((49+25-36)/2)/7= 38/14 = 2.714286
We then use the pythagorean theorem to find Cy:
sqrt(AC^2-Cx^2)=Cy
sqrt(25-7.367347)=4.199
So Cxyz=(2.714,4.199,0)
We find Dx in much the same way we found Cx:
((AB^2+AD^2-BD^2)/2)/Bx =Dx
((49+81-25)/2)/7= 7.5 = Dx
We find Dy by a slightly different formula:
(((AC^2+AD^2-CD^2)/2)-(Cx*Dx))/Cy = Dy
(((25+81-49)/2)-(2.714*7.5))/4.199= 1.94 (approx)
Having found Dx and Dy, we can find Dz by using Pythagorean theorem:
sqrt(AD^2-Dx^2-Dy^2) = Dz
sqrt(9^2-7.5^2-1.94^2) = 4.58
So Dxyz=(7.5, 1.94, 4.58)
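The arithmetic above can be checked with a few lines of Python. This is only a sanity check of the worked example: it recomputes the coordinates with the same formulas and then measures all six distances again. The variable names are chosen for this sketch.

```python
import math

# Given pairwise distances from the example.
AB, AC, AD, BC, BD, CD = 7, 5, 9, 6, 5, 7

A = (0.0, 0.0, 0.0)
B = (float(AB), 0.0, 0.0)

# C: law of cosines for x, Pythagorean theorem for y.
cx = ((AB**2 + AC**2 - BC**2) / 2) / B[0]
cy = math.sqrt(AC**2 - cx**2)
C = (cx, cy, 0.0)

# D: law of cosines against B for x, against C for y, Pythagoras for z.
dx = ((AB**2 + AD**2 - BD**2) / 2) / B[0]
dy = (((AC**2 + AD**2 - CD**2) / 2) - cx * dx) / cy
dz = math.sqrt(AD**2 - dx**2 - dy**2)
D = (dx, dy, dz)

if __name__ == "__main__":
    print("C =", C)
    print("D =", D)
    pairs = [("AB", A, B, AB), ("AC", A, C, AC), ("AD", A, D, AD),
             ("BC", B, C, BC), ("BD", B, D, BD), ("CD", C, D, CD)]
    for name, p, q, given in pairs:
        # math.dist needs Python 3.8 or newer.
        print(name, "given:", given, "recomputed:", round(math.dist(p, q), 6))
```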
If you have pairwise distances between each of a set of 50 points, then you might need as many as 49 dimensions in order to obtain coordinates for all the points. If A, B, C, D, and E are each separated from every other point by 10 length units, then you would need 4 spatial dimensions; if you introduce another point (F) which is also equidistant from all the other points, then you will need 5 dimensions. The algorithm works the same no matter how many dimensions are necessary (and in fact it works best when the maximum number of dimensions IS required). The algorithm also works when the distances violate the triangle inequality, such as AB=3, AC=4, and BC=13: the coordinates are A=(0,0), B=(3,0), and C=(-24,23.66i). If the triangle inequality is violated, then some of the coordinates will simply be imaginary valued. No big deal.
In general for point G, the coordinates (x1st, x2nd, x3rd, x4th, x5th, and x6th) can be found thusly:
G1st=((AB^2+AG^2-BG^2)/2)/(B1st)
G2nd=(((AC^2+AG^2-CG^2)/2)-(C1st*G1st))/(C2nd)
G3rd=(((AD^2+AG^2-DG^2)/2)-(D1st*G1st)-(D2nd*G2nd))/(D3rd)
G4th=(((AE^2+AG^2-EG^2)/2)-(E1st*G1st)-(E2nd*G2nd)-(E3rd*G3rd))/(E4th)
G5th=(((AF^2+AG^2-FG^2)/2)-(F1st*G1st)-(F2nd*G2nd)-(F3rd*G3rd)-(F4th*G4th))/(F5th)
G6th=sqrt(AG^2-G1st^2-G2nd^2-G3rd^2-G4th^2-G5th^2)
For the 5th point you find the first three coordinates with law-of-cosines calculations and the 4th coordinate with a Pythagorean-theorem calculation. For the 6th point you find the first 4 coordinates with 4 law-of-cosines calculations and then obtain the final coordinate with the Pythagorean-theorem calculation. For the 50th point, you find the first 48 coordinates with 48 law-of-cosines calculations and the 49th coordinate with a Pythagorean-theorem calculation. So for 50 points, there will be 48 Pythagorean-theorem calculations altogether plus 1 + 2 + ... + 48 = 1176 law-of-cosines calculations.
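The general recipe can be sketched in Python as follows. The function name `embed`, the distance-matrix input format, and the use of complex numbers (so that imaginary coordinates fall out of `cmath.sqrt` naturally) are choices made for this sketch. Note that with imaginary coordinates the "squared distance" is the plain sum of squared differences, not a sum of absolute values squared. The sketch does not handle the degenerate case where a reference coordinate is zero; there the division raises ZeroDivisionError.

```python
import cmath


def embed(dist):
    """Coordinates for n points from an n x n matrix of pairwise distances.

    Point 0 sits at the origin; point g gets g nonzero-capable coordinates.
    Returns a list of n coordinate lists (complex), each of length n - 1.
    """
    n = len(dist)
    coords = [[0j] * (n - 1) for _ in range(n)]
    for g in range(1, n):
        ag2 = dist[0][g] ** 2
        # Law-of-cosines steps: one per earlier point 1 .. g-1.
        for j in range(g - 1):
            ref = j + 1  # reference point whose last nonzero axis is j
            dot = (dist[0][ref] ** 2 + ag2 - dist[ref][g] ** 2) / 2
            dot -= sum(coords[ref][i] * coords[g][i] for i in range(j))
            coords[g][j] = dot / coords[ref][j]
        # Pythagorean step: whatever length is left goes on the new axis.
        used = sum(c * c for c in coords[g][:g - 1])
        coords[g][g - 1] = cmath.sqrt(ag2 - used)
    return coords


def squared_distance(p, q):
    """Sum of squared coordinate differences (may be complex-typed)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


if __name__ == "__main__":
    # The four-point example: order A, B, C, D.
    dist = [
        [0, 7, 5, 9],
        [7, 0, 6, 5],
        [5, 6, 0, 7],
        [9, 5, 7, 0],
    ]
    pts = embed(dist)
    for name, p in zip("ABCD", pts):
        print(name, [complex(round(c.real, 3), round(c.imag, 3)) for c in p])
```

Feeding it the triangle-violating distances AB=3, AC=4, BC=13 gives C a first coordinate of -24 and a purely imaginary second coordinate of magnitude about 23.66.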
The algorithm is fairly straightforward:
A is always set at the origin and B is set at x=AB (or rather B1st=AB)
C1st is found by using the law of cosines ((AB^2+AC^2-BC^2)/2)/(B1st)
C2nd is then found with pythagorean theorem (sqrt(AC^2-C1st^2))
BUT WHAT IF C2nd = 0? This is not necessarily a problem, but it can become a problem for finding D2nd, D3rd, E2nd, E3rd, E4th, etc.
If AB=4, AC=8, BC=4, then we will obtain A (0,0), B (4,0), and C (8,0). If AD=4, BD=8, and CD=12, then there will be no problem for finding coordinates for D which would be D (-4,0).
However, if CD is not equal to 12, then we WILL have a problem. For instance, if CD=5, then we might find that we should go back and calculate coordinates for the points in a different order such as ACDB, that way we can get A=(0,0,0);C=(8,0,0); D=(3.44,2.04,0); and B=(4,-14.55,14.55i). This is a fairly intuitive solution, but it interrupts the flow of the algorithm because we have to go backwards and start over in a different order.
Another solution to the problem which does not necessitate interrupting the flow of computations is to deliberately introduce an error whenever a Pythagorean-theorem calculation gives us a zero. Instead of a zero, put a 0.1 or 0.01 as the C2nd coordinate. This will allow one to proceed with calculating coordinates for the remaining points without interruption and the accuracy of the final results will suffer only a little (truth be told the algorithm is subject to cumulative rounding errors anyhow, so it's no big deal). Also the deliberate introduction of error is the only way to obtain a solution at all in some cases:
Consider once again 4 points A, B, C, and D with distances such that AB=4, AC=8, BC=4, AD=4, BD=8, and CD=4 (we previously have had CD at 12, and CD at 5). When CD=4, there IS NO exact solution no matter what order you calculate the points. Go ahead and try.
A=(0,0,0), B=(4,0,0), C=(8,0,0)... If you introduce an error at C2nd so that instead of zero you put 0.1 such that C=(8,0.1,0), then you can obtain a solution for point D's coordinates D=(-4,640,640i). If you introduce a smaller error for C2nd such that C=(8,0.01,0), then you get D=(-4,6400,6400i). As C2nd gets closer and closer to zero, D2nd and D3rd just get farther and farther away along the same direction. A similar result occurs sometimes when the distance between two points is close to zero. The algorithm of course will not work with a distance that is actually equal to zero, such as with AB=5, AC=8, and BC=0. But it will work with BC=0.000001.
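The effect of the nudge is easy to reproduce. The sketch below hard-codes the distances from this example and replaces the exact C2nd = 0 by a small value `eps` (a name chosen for this sketch), then computes D with the same formulas as before. D2nd works out to 64/eps, which is why it grows as the nudge shrinks.

```python
import cmath


def d_with_nudge(eps):
    """Coordinates of D when C2nd is nudged from 0 to eps."""
    AB, AC, BC, AD, BD, CD = 4, 8, 4, 4, 8, 4
    bx = AB
    cx = ((AB**2 + AC**2 - BC**2) / 2) / bx  # 8.0, so C lies on the x axis
    cy = eps                                 # exact value would be 0
    dx = ((AB**2 + AD**2 - BD**2) / 2) / bx  # -4.0
    dy = (((AC**2 + AD**2 - CD**2) / 2) - cx * dx) / cy  # 64 / eps
    dz = cmath.sqrt(AD**2 - dx**2 - dy**2)   # imaginary: radicand is negative
    return dx, dy, dz


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, d_with_nudge(eps))
```

Calling it with `eps = 0` raises ZeroDivisionError, which is the failure the nudge is meant to sidestep.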
Anyway, I think this has answered your question you asked a year ago.

Resources