Identify start and end of arc out of three points/angles - algorithm

I have three points that I know to be on a circle that represent the start, middle and end of an arc (a, m and b). I also have the angle of these points, from the positive X axis in an anti-clockwise direction, using atan2(y,x) of the three respective vectors from the centre to the points, so we have -pi < theta <= pi.
I also know which of the points is m, and what I want to know is which of a and b is the clockwise end of the arc.
I can see that there are 8 ways the points can be arranged:
   "East"      "West"      "East"
     0      -pi  |  pi       0
-----------------+-----------------
   a   m   b     |
   a   m         |   b
   a             |   m   b
                 |   a   m   b
   b   m   a     |
   b   m         |   a
   b             |   m   a
                 |   b   m   a
where the first four have a as the "end" and b as the "start", and the latter four are the other way around. Bear in mind that the order can wrap around at 0 and appear on the right or the left, so the sign is not helpful.
Is there a tidy way to work out which is the start and which is the end point? Other than laboriously checking relative values among each of the 8 options in a big, dense if/else-if block, that is.
Implementation language is Python, but this is not a language specific question!

If m is on the left side of the directed line segment from a to b, then a is the counterclockwise end; otherwise, it's the clockwise end.
That is, take the left perpendicular of the vector ab, and find its dot product with am. If the dot product is positive, a is the CCW endpoint.
Incidentally, the tidiest way to deal with angles is to avoid using them. Vectors and linear algebra out-tidy angles and trigonometry any day of the week.
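A minimal Python sketch of the perpendicular/dot-product test described in this answer. The function name a_is_ccw_end and the (x, y) tuple representation are assumptions made for illustration; the circle's centre is not needed.

```python
def a_is_ccw_end(a, m, b):
    """Return True if a is the counterclockwise end of the arc a-m-b.

    a, m and b are (x, y) tuples on the same circle, with m between a and b.
    """
    abx, aby = b[0] - a[0], b[1] - a[1]  # vector from a to b
    amx, amy = m[0] - a[0], m[1] - a[1]  # vector from a to m
    # The left perpendicular of (x, y) is (-y, x); dot it with a -> m.
    return (-aby) * amx + abx * amy > 0
```

For example, with a = (1, 0), m = (0, -1) and b = (-1, 0) the arc runs clockwise from a, so a is the counterclockwise end and the function returns True.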

I have just run into the same problem - thanks for the great writeup of the question. I think solving it the way you were heading is actually not so "laborious".
You can use your table to see it's just a question of which order the three angles are arranged, cyclically. A < M < B or A > M > B.
// = a < m < b ||
// b < a < m ||
// m < b < a;
anticlockwise = a < m && m < b || b < a && a < m || m < b && b < a;
Depending on how your language of choice feels about coercing booleans into integers, you might be able to write that as:
anticlockwise = ((a < m) + (m < b) + (b < a)) === 2;
(Well I found this a lot easier than trying to understand and compute perpendiculars, dot products, linear algebra...)
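Since the question mentions Python, the same cyclic-order test might look like the sketch below. The function name is an assumption, and the three angles are assumed to be distinct values as returned by atan2.

```python
def is_anticlockwise(a, m, b):
    """True if the angles a, m, b appear in increasing (anticlockwise) cyclic order."""
    # Python booleans are integers, so the sum counts how many comparisons hold.
    # Exactly two of the three hold when a -> m -> b runs anticlockwise.
    return (a < m) + (m < b) + (b < a) == 2
```

The wrap-around case is covered as well: is_anticlockwise(3.0, -3.0, -2.0) is True, because going anticlockwise from 3.0 rad crosses pi and continues at -3.0.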

Related

Algorithm to find closest integer values that meet certain criteria

Edited to clarify the application by adding units (ml) and explaining the difficulty to measure wet reagents by units of 1/26. The word 'solution' was ambiguous because it was used to mean both a chemical solution as well as the solution to the problem.
Added results based on Edward's reply
The real world application is that I am trying to determine the closest "convenient" volumes to use when mixing reagents A and B to create a solution (in the wet chemistry sense) that best approximates a specific A:B ratio. Let's define "convenient" as divisible by 5.
Example
Given:
1. X = A/(A+B) * C
2. Y = B/(A+B) * C
3. X + Y = C
4. A, B, C always positive integer
// e.g. a 500ml solution (wet chemistry sense) C with a 1:25 ratio of A and B
A = 1
B = 25
C = 500
This gives the volumes to use of X and Y to create the solution (wet chemistry sense) with the proper A:B ratio.
X = 500/26 = ~19.23ml
Y = 12500/26 = ~480.77ml
C = 13000/26 = 500ml
These are the exact volumes that create a total volume of 500ml, but trying to measure reagent volumes in units of 1/26ml is a challenge.
How to find "convenient values" (integer divisible by 5) for X, Y, and C that best approximate the exact values of X, Y, and C that would be multiples of 1/26? In this case I found as the closest "convenient" values for X, Y, C:
X = 20ml
Y = 500ml
C = 520ml
C in this case (520ml) is more than the required volume of 500ml, but it is more practical to physically measure the volumes of 20mL and 500mL than it would be to measure reagent volumes in 1/26ths. The extra 20mL is discarded, the cost for using nice values.
RESULTS BASED ON EDWARD'S ANSWER
A=1 B=25 C=500
X=20 Y=500 C2=520
A=1 B=20 C=500
X=25 Y=500 C2=525
A=1 B=100 C=500
X=5 Y=500 C2=505
A=1 B=75 C=500
X=10 Y=750 C2=760
A=1 B=50 C=900
X=20 Y=1000 C2=1020
One way to approach this would be to adjust C so that it absorbs the factor A+B. Then the ratio of A to B would be exact, and X, Y, and C would all be integers. Let D = 5*(A+B), C2 = ceiling(C/((double)D)) * D (round up so you get enough C), X = C2/(A+B)*A, Y = C2/(A+B)*B. If you want the closest value of C, use C2 = round(C/((double)D))*D instead.
If you're mixing chemicals, you probably want to round up rather than round to closest so you'll have enough with a little waste left over, which is better than not having enough.
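A short Python sketch of the round-up variant described above, using integer arithmetic only. The function name and the step parameter (5 for "divisible by 5") are assumptions for illustration.

```python
def convenient_volumes(A, B, C, step=5):
    """Return (X, Y, C2) with X:Y exactly A:B, X and Y multiples of step, C2 >= C."""
    D = step * (A + B)          # C2 must be a multiple of this
    C2 = -(-C // D) * D         # ceiling(C / D) * D without floating point
    unit = C2 // (A + B)        # volume of one "part"; a multiple of step
    return unit * A, unit * B, C2
```

With A=1, B=25, C=500 this gives (20, 500, 520), matching the first row of the results listed in the question.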
You can phrase this as an optimization problem with an L1 (absolute value) objective function. (This is using a cannon to swat a mosquito, but I did it because I wanted to figure out about the L1 optimization.) I used the program glpsol from the GLPK package (open source). Here is my program:
param A, integer, >= 0;
param B, integer, >= 0;
param C, integer, >= 0;
var x, integer, >= 0;
var y, integer, >= 0;
var e1x, >= 0;
var e1y, >= 0;
minimize e1 : e1x + e1y;
subject to
c1 : (5*x - (C*A)/(A + B)) <= e1x;
c2 : ((C*A)/(A + B) - 5*x) <= e1x;
c3 : (5*y - (C*B)/(A + B)) <= e1y;
c4 : ((C*B)/(A + B) - 5*y) <= e1y;
solve;
printf "x=%g, y=%g, error=%g\n", x, y, e1;
data;
param A := 1;
param B := 25;
param C := 500;
Here is the output:
$ glpsol --model find_nice_integers.mod
[... snip ...]
x=4, y=96, error=1.53846
Here are some notes about how to handle absolute values in optimization problems.
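No constraint in the model above couples x and y, so the L1 objective is separable and each variable can simply be rounded to the nearest multiple of 5 on its own. The sketch below illustrates that shortcut in Python; it is not a replacement for the GLPK model, and the function name is an assumption.

```python
from fractions import Fraction

def nearest_multiples(A, B, C, step=5):
    """Return (x, y, error) minimising |step*x - C*A/(A+B)| + |step*y - C*B/(A+B)|."""
    target_x = Fraction(C * A, A + B)   # exact volume of A
    target_y = Fraction(C * B, A + B)   # exact volume of B
    x = round(target_x / step)          # nearest integer count of step-sized units
    y = round(target_y / step)
    error = abs(step * x - target_x) + abs(step * y - target_y)
    return x, y, float(error)
```

For A=1, B=25, C=500 this reproduces the solver output shown above (x=4, y=96, error about 1.53846). Note that this formulation does not preserve the A:B ratio exactly.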
So, you are given an integer number C and the ratio p:q between two other integer numbers A and B (i.e., A/B = p/q).
I will interpret your definition of convenient as requiring that X and Y are both multiples of 5 where
X = A / (A+B) * C'
Y = B / (A+B) * C'
C' is close to C
Replacing A/B with p/q we get
X = p / (p+q) * C'
Y = q / (p+q) * C'
Now, in order for X and Y to be integers, both p * C' and q * C' must be multiples of (p+q). And since we can assume that p:q is irreducible (i.e., p and q have no common factors), this means that C' must be divisible by p+q. In addition, C'/(p+q) must be a multiple of 5. So, C' must be a multiple of 5*(p+q).
The multiple of 5*(p+q) that is closest to C is:
C' := round(C/(5*(p+q)))*5*(p+q)
Now we can calculate:
X := p/(p+q)*C'
Y := q/(p+q)*C'
and they are indeed multiples of 5 because C'/(p+q) is.
Let's see how this behaves with your example:
Inputs:
p = 1
q = 25
C = 500
Then
C' := round(500/(5*(1+25)))*5*(1+25) = round(100/26)*5*26 = 4*5*26 = 520
Hence
X := p/(p+q)*C' = 1/(1+25)*4*5*26 = 1/26*4*5*26 = 4*5 = 20
Y := q/(p+q)*C' = 25/(1+25)*4*5*26 = 25/26*4*5*26 = 25*4*5 = 500.
Voila!
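The derivation above might be sketched in Python as follows. The function name is an assumption, the ratio is reduced with gcd so that p:q is irreducible, and clamping the multiplier to at least 1 (so the result is never zero volume) is an extra assumption not stated in the answer.

```python
from math import gcd

def closest_convenient(A, B, C, step=5):
    """Return (X, Y, C') where C' is the multiple of step*(p+q) closest to C."""
    g = gcd(A, B)
    p, q = A // g, B // g                   # irreducible ratio p:q
    D = step * (p + q)                      # C' must be a multiple of D
    k = max(1, (2 * C + D) // (2 * D))      # round(C / D), halves rounded up; at least 1 (assumption)
    return p * k * step, q * k * step, k * D
```

With p=1, q=25, C=500 this returns (20, 500, 520), the same values as the worked example.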
Let's first calculate the optimal (floating-point) A and B.
It can be observed that the optimal integer solutions are either {floor(A), ceiling(B)} or {ceiling(A), floor(B)}, so we simply try both and choose the answer with less error.

Calculating normal to line towards given side

Given is a line (segment), defined by two vectors start(x,y) and end(x,y). I also have a point p(x,y), which is on either of the two areas separated by the line (i.e. it is not exactly on the line).
How can I calculate the normal to the line that is facing towards the side in which p is?
Let:
A = (a,b) and B = (c,d) define the line segment
P = (p,q) be the other point.
Define:
dot( (p,q), (r,s) ) == p*r + q*s
Then the vector:
v = ( c-a, d-b)
defines the direction along the line segment. Its perpendicular is:
u = (d-b, -(c-a)) = (d-b, a-c)
This can be seen by taking the dot product with v. To get the normal from the perpendicular, just divide by its length:
n = u /|u|, |u| = sqrt( dot(u,u))
We now just need to know where P lies relative to the normal. If we take:
dir = dot( (P-A), n )
Then dir > 0 means n is in the same direction as P, whilst dir < 0 means it is in the opposite direction. Should dir == 0, then P is in fact on the extended line (not necessarily the line segment itself).
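Putting these steps together, a Python sketch could look like this. The function name normal_towards is an assumption, and it assumes A and B are distinct and P is not on the line.

```python
import math

def normal_towards(A, B, P):
    """Unit normal of the line through A and B that points towards the side of P."""
    (a, b), (c, d), (p, q) = A, B, P
    ux, uy = d - b, a - c                    # perpendicular u = (d-b, a-c)
    length = math.hypot(ux, uy)              # |u| = sqrt(dot(u, u))
    nx, ny = ux / length, uy / length        # unit normal n
    side = (p - a) * nx + (q - b) * ny       # dir = dot(P - A, n)
    if side < 0:                             # n points away from P, so flip it
        nx, ny = -nx, -ny
    return nx, ny
```

For the horizontal segment from (0, 0) to (1, 0), a point above the line yields (0, 1) and a point below yields (0, -1).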
First, determine which side of the line the point lies on, by taking the cross product of end-start and p-end:
z = (xend-xstart)(yp-yend) - (yend-ystart)(xp-xend)
If z>0, then the point is to the left of the line (as seen by a person standing at start and facing end). If z<0, then the point is to the right of the line.
Second, normalize the line segment:
S = end - start
k = S/|S|
Finally, if the point is to the left of the line, rotate k to the left:
(xk, yk) => (-yk, xk)
or if the point is to the right of the line, rotate k to the right:
(xk, yk) => (yk, -xk)
My math skills are a bit rusty, so I can't give you the exact calculations, but what you do is this (assuming 2D from your description):
First you calculate a normal n.
Then you calculate P' which is the perpendicular projection of your point P onto your line.
Basically, what you do is, you "create" another line and use your vector n from step 1 as the direction (y = p + x * n where y,p and n are vectors, p is actually your p(x,y) and x is a real number), then you intersect this line with the first one and the point where they intersect is P'.
Seeing you're from Austria, everyone else please forgive me for using one German word, I really don't know the English translation and couldn't find any. P' = Lotfußpunkt (in English, the foot of the perpendicular)
Calculate P - P'. If it has the same sign as n in both components, n is the normal you're searching for. If it has the opposite sign, -n is the one you're searching for.
I hope the idea is clear even though I don't know all the technical terms in English.
For
start = (a,b)
end = (c,d)
p = (x,y)
Slope(startend) = (d - b) / (c - a)
Slope(norm) = -(c - a) / (d - b)
Norm line must include p = (x,y), so
ynorm = -((c - a) / (d - b)) * xnorm + (y + ((c - a) / (d - b)) * x)
y = mx + c
is the normal line equation where m is the slope and c is any constant.
You have start and end. Let's call them (x1,y1) and (x2,y2) and the line joining them L1.
The slope of this line, m1, is (y2-y1)/(x2-x1). This line is perpendicular to the line you need, which we can call L2 with slope m2. The product of the slopes of 2 mutually perpendicular lines is -1. Hence,
m1*m2=-1.
Hence you can calculate m2. Now, you need to find the equation of the line L2. You have 1 point in the line P (x,y). You can substitute in this manner:
y=m2*x+c.
This will give you c. Once you have the line equation, you can convert it to parametric form as shown below:
http://thejuniverse.org/PUBLIC/LinearAlgebra/LOLA/lines/index.html
The equation of the line is given as
A = start.y-end.y
B = end.x-start.x
C = start.x*end.y-start.y*end.x
A*x + B*y + C = 0
The minimum distance d to the line of a point p=(px,py) is
d = (A*px+B*py+C)/sqrt(A^2+B^2)
If the value is positive then the point is at a counter clockwise rotation from the vector (start->end). If negative then it is in clockwise rotation. So if (start->end) is pointing up, then a positive distance is to the left of the line.
Example
start = (8.04, -0.18)
end = (6.58, 1.72)
P = (2.82, 0.66)
A = (-0.18)-(1.72) = -1.9
B = (6.58)-(8.04) = -1.46
C = (8.04)*(1.72)-(-0.18)*(6.58) = 15.01
d = (A*(2.82)+B*(0.66)+C)/√(A^2+B^2) = 3.63
The calculation for d shows the identical value as the length of vector (near->P) in the sketch.
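The signed-distance formula and the worked example above can be checked with a small Python sketch (the function name is an assumption):

```python
import math

def signed_distance(start, end, p):
    """Signed distance from p to the line start->end; positive means p is to the left."""
    A = start[1] - end[1]
    B = end[0] - start[0]
    C = start[0] * end[1] - start[1] * end[0]
    return (A * p[0] + B * p[1] + C) / math.hypot(A, B)

d = signed_distance((8.04, -0.18), (6.58, 1.72), (2.82, 0.66))
```

Here d comes out at roughly 3.63, in agreement with the example, and its positive sign says that P lies to the left of start->end.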
N = (Ey - Sy, Sx - Ex) is perpendicular to the line (it is SE rotated by 90°, not normalized).
Then compute the sign of the dot product
N . SP = (Ey - Sy)(Px - Sx) + (Sx - Ex)(Py - Sy),
it will tell you on what side the normal is pointing.

How to solve this programming contest exercise

I stumbled upon this algorithm question, I couldn't get any better approach than brute force, can someone guide me please?
Given a M * N grid of characters (A,B). You are allowed to flip any
number of columns i.e. change A to B and B to A. What is the max
number of rows that can have the same symbols after all possible flipping?
Eg,
A B A|
B A B|
A B B|
B B A|
The answer is 2, if we flip both column 1 & 3. Please let me know if further explanation is required.
First, note that A B A and B A B are essentially the same for the purposes of this problem: whenever one gets flipped to all As, the other gets flipped to all Bs, and vice versa, so both count for the answer at the same time.
On the other hand, when A B A or B A B is flipped so that it contains the same letters, all other possible rows contain different letters.
So, the first suggested step would be to flip all rows which start with a B, since it will merge the pairs of rows which count for the answer at the same time.
Now, we have
A B A|
A B A| (flipped from B A B)
A B B|
A A B| (flipped from B B A)
What's left is to find a row that occurs most often.
This can be done by constructing a map which, well, maps rows to the number of their occurrences.
For the example, it will look like {A B A: 2, A B B: 1, A A B: 1}.
Now, A B A obviously wins since it occurs twice, so we flip all the columns with Bs in that row. Flipping all the columns with As is another option.
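The normalise-then-count idea above could be sketched in Python like this. The function name is an assumption, and the grid is assumed to be a non-empty list of equal-length strings over the letters A and B.

```python
from collections import Counter

def max_uniform_rows(grid):
    """Maximum number of rows that can be made uniform by flipping columns."""
    flip = {"A": "B", "B": "A"}
    counts = Counter()
    for row in grid:
        # Canonical form: flip the whole row if it starts with B, so that a
        # row and its complement are counted under the same key.
        if row[0] == "B":
            row = "".join(flip[ch] for ch in row)
        counts[row] += 1
    return max(counts.values())
```

For the example grid ["ABA", "BAB", "ABB", "BBA"] this returns 2. The running time is proportional to the size of the grid.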
I have a solution that takes O(M^2) time and O(M) extra space. It is only slightly better than brute force, since the second loop starts at i+1 of the first loop. Tell me what you think about it:
First we need to convert the A, B matrix into a bit matrix in which each row is a binary number. Your matrix becomes:
0 1 0|
1 0 1|
0 1 1|
1 1 0|
Now, this is based on the fact that two rows can be made uniform by the same column flips exactly when they are identical or bitwise complements of each other, i.e. when their XOR is all zeros or all ones (for example, 010 ^ 101 = 111).
Given N and M;
int allOnes = (1 << N) - 1; // the N low bits set
int maxSameSymbole[M];
for (int i = 0; i < M; i++) {
    maxSameSymbole[i] = 1; // row i itself always counts
    for (int j = i+1; j < M; j++) {
        int diff = line[i].toBinary ^ line[j].toBinary;
        if (diff == 0 || diff == allOnes) // identical or complementary rows: one set of flips makes both rows uniform
            maxSameSymbole[i]++;
    }
}
// Simple find max in the maxSameSymbole list :
int max = maxSameSymbole[0];
for (int i = 1; i < M; i++) {
    if (maxSameSymbole[i] > max)
        max = maxSameSymbole[i];
}
Hope this helped to find a better solution.
Start from the first column and get the indices of the rows that would have the same symbols if you flipped that column, for example rows (1,4,9). Get those indices for every column and put them as keys in a map that maps each list of indices to the number of times it occurred. The highest value is the answer.

Enumerate matrix combinations with fixed row and column sums

I'm attempting to find an algorithm (not a matlab command) to enumerate all possible NxM matrices with the constraints of having only positive integers in each cell (or 0) and fixed sums for each row and column (these are the parameters of the algorithm).
Example:
Enumerate all 2x3 matrices with row totals 2, 1 and column totals 0, 1, 2:
| 0 0 2 | = 2
| 0 1 0 | = 1
  0 1 2

| 0 1 1 | = 2
| 0 0 1 | = 1
  0 1 2
This is a rather simple example, but as N and M increase, as well as the sums, there can be a lot of possibilities.
Edit 1
I might have a valid arrangement to start the algorithm:
matrix = new Matrix(N, M) // NxM matrix filled with 0s
FOR i FROM 0 TO matrix.rows().count()
    FOR j FROM 0 TO matrix.columns().count()
        a = target_row_sum[i] - matrix.rows[i].sum()
        b = target_column_sum[j] - matrix.columns[j].sum()
        matrix[i, j] = min(a, b)
    END FOR
END FOR
target_row_sum[i] being the expected sum on row i.
In the example above it gives the 2nd arrangement.
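A runnable Python version of this greedy fill, as a sketch: the function name is an assumption, and running remainders are tracked instead of re-summing rows and columns on every step.

```python
def initial_arrangement(row_sums, col_sums):
    """Greedily build one matrix with the given row and column sums."""
    rows_left = list(row_sums)   # what each row still needs
    cols_left = list(col_sums)   # what each column still needs
    matrix = [[0] * len(col_sums) for _ in row_sums]
    for i in range(len(row_sums)):
        for j in range(len(col_sums)):
            value = min(rows_left[i], cols_left[j])
            matrix[i][j] = value
            rows_left[i] -= value
            cols_left[j] -= value
    return matrix
```

For row totals (2, 1) and column totals (0, 1, 2) it returns [[0, 1, 1], [0, 0, 1]], the second arrangement of the example.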
Edit 2:
(based on j_random_hacker's last statement)
Let M be any matrix satisfying the given conditions (fixed row and column sums, non-negative cell values).
Let (a, b, c, d) be 4 cell values in M where (a, b) and (c, d) are on the same row, and (a, c) and (b, d) are on the same column.
Let Xa be the row number of the cell containing a and Ya be its column number.
Example:
| 1 a b |
| 1 2 3 |
| 1 c d |
-> Xa = 0, Ya = 1
-> Xb = 0, Yb = 2
-> Xc = 2, Yc = 1
-> Xd = 2, Yd = 2
Here is an algorithm to get all the combinations that satisfy the initial conditions while varying only a, b, c and d:
// A matrix array containing a single element, M
// It will be filled with all possible combinations
matrices = [M]
I = min(a, d)
J = min(b, c)
// a and d shrink while b and c grow, so every row and column sum is unchanged
FOR i FROM 1 TO I
    tmp_matrix = copy of M
    tmp_matrix[Xa, Ya] = a - i
    tmp_matrix[Xb, Yb] = b + i
    tmp_matrix[Xc, Yc] = c + i
    tmp_matrix[Xd, Yd] = d - i
    matrices.add(tmp_matrix)
END FOR
// b and c shrink while a and d grow
FOR j FROM 1 TO J
    tmp_matrix = copy of M
    tmp_matrix[Xa, Ya] = a + j
    tmp_matrix[Xb, Yb] = b - j
    tmp_matrix[Xc, Yc] = c - j
    tmp_matrix[Xd, Yd] = d + j
    matrices.add(tmp_matrix)
END FOR
It should then be possible to find every possible combination of matrix values:
Apply the algorithm on the first matrix for every possible group of 4 cells ;
Recursively apply the algorithm on each sub-matrix obtained by the previous iteration, for every possible group of 4 cells except any group already used in a parent execution ;
The recursive depth should be (N*(N-1)/2)*(M*(M-1)/2), each execution resulting in ((N*(N-1)/2)*(M*(M-1)/2) - depth)*(I+J+1) sub-matrices. But this creates a LOT of duplicate matrices, so this could probably be optimized.
Are you needing this to calculate Fisher's exact test? Because that requires what you're doing, and based on that page, it seems there will in general be a vast number of solutions, so you probably can't do better than a brute force recursive enumeration if you want every solution. OTOH it seems Monte Carlo approximations are successfully used by some software instead of full-blown enumerations.
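As an illustration of what a brute-force recursive enumeration might look like, here is a Python sketch. The function name is an assumption; it fills the cells row by row and backtracks as soon as a row or column budget would be exceeded.

```python
def enumerate_matrices(row_sums, col_sums):
    """Yield every non-negative integer matrix with the given row and column sums."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    matrix = [[0] * n_cols for _ in range(n_rows)]
    col_left = list(col_sums)    # remaining budget of each column

    def fill(i, j, row_left):
        if i == n_rows:                      # all rows placed
            if not any(col_left):            # every column budget used up
                yield [row[:] for row in matrix]
            return
        if j == n_cols:                      # end of row i
            if row_left == 0:
                nxt = row_sums[i + 1] if i + 1 < n_rows else 0
                yield from fill(i + 1, 0, nxt)
            return
        for value in range(min(row_left, col_left[j]) + 1):
            matrix[i][j] = value
            col_left[j] -= value
            yield from fill(i, j + 1, row_left - value)
            col_left[j] += value             # undo before trying the next value

    if n_rows and n_cols:
        yield from fill(0, 0, row_sums[0])
```

For row totals (2, 1) and column totals (0, 1, 2) it yields exactly the two matrices from the question. The number of results can grow very quickly with the matrix size and the sums.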
I asked a similar question, which might be helpful. Although that question deals with preserving frequencies of letters in each row and column rather than sums, some results can be translated across. E.g. if you find any submatrix (pair of not-necessarily-adjacent rows and pair of not-necessarily-adjacent columns) with numbers
xy
yx
Then you can rearrange these to
yx
xy
without changing any row or column sums. However:
mhum's answer proves that there will in general be valid matrices that cannot be reached by any sequence of such 2x2 swaps. This can be seen by taking his 3x3 matrices and mapping A -> 1, B -> 2, C -> 4 and noticing that, because no element appears more than once in a row or column, frequency preservation in the original matrix is equivalent to sum preservation in the new matrix. However...
someone's answer links to a mathematical proof that it actually will work for matrices whose entries are just 0 or 1.
More generally, if you have any submatrix
ab
cd
where the (not necessarily unique) minimum is d, then you can replace this with any of the d+1 matrices
ef
gh
where h = d-i, g = c+i, f = b+i and e = a-i, for any integer 0 <= i <= d.
For a NXM matrix you have NXM unknowns and N+M equations. Put random numbers to the top-left (N-1)X(M-1) sub-matrix, except for the (N-1, M-1) element. Now, you can find the closed form for the rest of N+M elements trivially.
More details: There are a total of T = N*M elements.
There are R = (N-1)*(M-1) - 1 randomly filled out elements.
Remaining number of unknowns: T-R = N*M - (N-1)*(M-1) + 1 = N+M

Arranging integers in a specific order

Given a set of distinct unsorted integers s1, s2, .., sn how do you arrange integers such that s1 < s2 > s3 < s4...
I know this can be solved by looking at the array from left to right and, if the condition is not satisfied, swapping those two elements gives the right answer. Can someone explain to me why this algorithm works?
Given any three successive numbers in the array, there are four possible relationships:
a < b < c
a < b > c
a > b < c
a > b > c
In the first case we know that a < c. Since the first condition is met, we can swap b and c to meet the second condition, and the first condition is still met.
In the second case, both conditions are already met.
In the third case, we have to swap a and b to give b < a ? c. But we already know that b < c, so if a < c then swapping to meet that second condition doesn't invalidate the first condition.
In the last case we know that a > c, so swapping a and b to meet the first condition maintains the validity of the second condition.
Now, you add a fourth number to the sequence. You have:
a < b > c ? d
If c < d then there's no need to change anything. But if we have to swap c and d, the prior condition is still met. Because if b > c and c > d, then we know that b > d. So swapping c and d gives us b > d < c.
You can use similar reasoning when you add the fifth number. You have a < b > c < d ? e. If d > e, then there's no need to change anything. If d < e, then by definition c < e as well, so swapping maintains the prior condition.
Pseudo code that implements the algorithm:
for i = 0 to n-2
    if i is even
        if (a[i] > a[i+1])
            swap(a[i], a[i+1])
        end if
    else
        if (a[i] < a[i+1])
            swap(a[i], a[i+1])
        end if
    end if
end for
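The pseudocode translates almost line for line into Python. This is a sketch: the function name is an assumption, and it works on a copy so the input list is left untouched.

```python
def zigzag(values):
    """Rearrange distinct numbers so that s1 < s2 > s3 < s4 ... in a single pass."""
    a = list(values)
    for i in range(len(a) - 1):
        if i % 2 == 0:
            wrong_order = a[i] > a[i + 1]   # even index must be a valley
        else:
            wrong_order = a[i] < a[i + 1]   # odd index must be a peak
        if wrong_order:
            a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

For example, zigzag([1, 2, 3, 4, 5]) returns [1, 3, 2, 5, 4].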
Here is the code for the suggested solution in Java.
public static int[] alternatingList(int[] list) {
    int first, second, third;
    for (int i = 0; i < list.length - 2; i += 2) {
        first = list[i];
        second = list[i+1];
        third = list[i+2];
        if (first > second && first > third) {
            list[i+1] = first;
            list[i] = second;
        }
        else if (third > first && third > second) {
            list[i+1] = third;
            list[i+2] = second;
        }
    }
    return list;
}
In this code since all the numbers are distinct there will always be a bigger number to put into the "peaks". Swapping the numbers will not change the consistency of the last part you did because the number you swap out will always be smaller than the one you put into the new peak.
Keep in mind this code doesn't handle some edge cases like even length lists and lists smaller than three, I wrote it pretty fast :), I only wrote the code to illustrate the concept of the solution
In addition, this solution is better than the one in the proposed dupe because it makes one pass. The solution in the dupe uses Hoare's selection algorithm, which is O(n) but requires multiple passes of decreasing size over the list; it also needs to make another O(n) pass over the list after using Hoare's (or the median of medians).
More mathematical proof:
For every three consecutive numbers a,b,c there are three options
a > b && a > c
b > c && b > a
c > a && c > b
In the first case you switch a into the middle because it's the largest; in the second case you do nothing (the largest is already in the middle); and in the third case c goes to the middle.
Now you have a < b > c d e, where for now d and e are unknown. The new a, b, c are c, d, e, and you do the same operation. This is guaranteed not to mess up the order: c will only be moved if it is larger than d and e, so the number moved into c's spot will be smaller than c, and therefore smaller than b, and will not break the ordering. This can continue indefinitely with the order never breaking.
