How to formulate such problem mathematicaly? (line continuation search) - algorithm

I have an array of "lines" each defined by 2 points. I am working with only the line segments lying between those points. I need to search lines that could continue one another (relative to some angle) and lie on the same line (with some offset)
I mean I had something like 3 lines
I solved some mathematical problem (formulation of which is my question) and got understanding that there are lines that could be called relatively one line (with some angle K and offset J)
And of course by math formulation I meant some kind of math formula like

Sort all your segments based on angle (in range 0 to Pi), and build a sorted map of angle-to-segment.
Decide on some angle difference threshold below which two segments can be considered parallel. Iterate through your map and for each mapping, consider adjacent mappings on either side of the angle (needs wrap around) which are considered parallel.
Within each set of nearly-parallel segments, see if they are "continuations" of each other.
Two segments (A,B) and (C,D) are roughly collinear if all possible pairings of the 4 points are roughly parallel. You can use the same test as above.
Pseudo-code:
Angle(A,B)
return Atan((B.y-A.y) / (B.x-A.x)) // use atan2 if possible, but needs wrapping
NearlyParallel(angle1, angle2)
delta = Abs(angle1-angle2)
return (delta < threshold) or (delta > Pi-threshold)
Collinear(A,B, C,D)
// Assume NearlyParallel(Angle(A,B), Angle(C,D)) == true
return NearlyParallel(Angle(A,C), Angle(B,D)) and NearlyParallel(Angle(A,D), Angle(B,C))

What have you tried so far?
I guess one way is to look for pairs of lines where:
the directions are similar |theta_1 - theta_2| < eps for some eps
there is at least one pair of end points where the points are close
Depending on the size of your problem you may be able to just check all pairs of lines for the conditions

Starting with: A = a2 – a1 where a1 and a2 are the angle of the two lines.
We can do this:
tan(A) = tan(a1 – a2) = (tan(a2) – tan(a1)) / (1 + tan(a1) tan(a2))
If tan(a2) is bigger than tan(a1) then tan(A) will be the acute angle between the two lines. You can then check tan(A) against your tolerance.
So I guess the formula would be:
tan(A) = (tan(a2) – tan(a1)) / (1 + tan(a1) tan(a2)) when tan(a2) > tan(a1)
But I'm no mathematician

Related

Normalizing the edit distance

I have a question that can we normalize the levenshtein edit distance by dividing the e.d value by the length of the two strings?
I am asking this because, if we compare two strings of unequal length, the difference between the lengths of the two will be counted as well.
for eg:
ed('has a', 'has a ball') = 4 and ed('has a', 'has a ball the is round') = 15.
if we increase the length of the string, the edit distance will increase even though they are similar.
Therefore, I can not set a value, what a good edit distance value should be.
Yes, normalizing the edit distance is one way to put the differences between strings on a single scale from "identical" to "nothing in common".
A few things to consider:
Whether or not the normalized distance is a better measure of similarity between strings depends on the application. If the question is "how likely is this word to be a misspelling of that word?", normalization is a way to go. If it's "how much has this document changed since the last version?", the raw edit distance may be a better option.
If you want the result to be in the range [0, 1], you need to divide the distance by the maximum possible distance between two strings of given lengths. That is, length(str1)+length(str2) for the LCS distance and max(length(str1), length(str2)) for the Levenshtein distance.
The normalized distance is not a metric, as it violates the triangle inequality.
I used the following successfully:
len = std::max(s1.length(), s2.length());
// normalize by length, high score wins
fDist = float(len - levenshteinDistance(s1, s2)) / float(len);
Then chose the highest score. 1.0 means an exact match.
I had used a normalized edit distance or similarity (NES) which I think is very useful, defined by Daniel Lopresti and Jiangyin Zhou, in Equation (6) of their work: http://www.cse.lehigh.edu/~lopresti/Publications/1996/sdair96.pdf.
The NES in python is:
import math
def normalized_edit_similarity(m, d):
# d : edit distance between the two strings
# m : length of the shorter string
return ( 1.0 / math.exp( d / (m - d) ) )
print(normalized_edit_similarity(3, 0))
print(normalized_edit_similarity(3, 1))
print(normalized_edit_similarity(4, 1))
print(normalized_edit_similarity(5, 1))
print(normalized_edit_similarity(5, 2))
1.0
0.6065306597126334
0.7165313105737893
0.7788007830714049
0.513417119032592
More examples can be found in Table 2 in the above paper.
The variable m in the above function can be replaced with the length of the longer string, depending on your application.

Solving Rubik's Cubes for Dummies

Mr. Dum: Hello, I'm very stupid but I still want to solve a 3x3x3 Rubik's cube.
Mr. Smart: Well, you're in luck. Here is guidance to do just that!
Mr. Dum: No that won't work for me because I'm Dum. I'm only capable of following an algorithm like this.
pick up cube
look up a list of moves from some smart person
while(cube is not solved)
perform the next move from list and turn
the cube as instructed. If there are no
more turns in the list, I'll start from the
beginning again.
hey look, it's solved!
Mr. Smart: Ah, no problem here's your list!
Ok, so what sort of list would work for a problem like this? I know that the Rubik's cube can never be farther away from 20 moves to solved, and that there are 43,252,003,274,489,856,000 permutations of a Rubik's Cube. Therefore, I think that this list could be (20 * 43,252,003,274,489,856,000) long, but
Does anyone know the shortest such list currently known?
How would you find a theoretically shortest list?
Note that this is purely a theoretical problem and I don't actually want to program a computer to do this.
An idea to get such a path through all permutations of the Cube would be to use some of the sequences that human solvers use. The main structure of the algorithm for Mr Smart would look like this:
function getMoves(callback):
paritySwitchingSequences = getParitySwitchingSequences()
cornerCycleSequences = getCornerCycleSequences()
edgeCycleSequences = getEdgeCycleSequences()
cornerRotationSequences = getCornerRotationSequences()
edgeFlipSequences = getEdgeFlipSequences()
foreach paritySeq in paritySwitchingSequences:
if callback(paritySeq) return
foreach cornerCycleSeq in cornerCycleSequences:
if callback(cornerCycleSeq) return
foreach edgeCycleSeq in edgeCycleSequences:
if callback(edgeCycleSeq) return
foreach cornerRotationSeq in cornerRotationSequences:
if callback(cornerRotationSeq) return
foreach edgeFLipSeq in edgeFlipSequences:
if callback(edgeFlipSeq) return
The 5 get... functions would all return an array of sequences, where each sequence is an array of moves. A callback system will avoid the need for keeping all moves in memory, and could be rewritten in the more modern generator syntax if available in the target language.
Mr Dumb would have this code:
function performMoves(sequence):
foreach move in sequence:
cube.do(move)
if cube.isSolved() then return true
return false
getMoves(performMoves)
Mr Dumb's code passes his callback function once to Mr Smart, who will then keep calling back that function until it returns true.
Mr Smart's code will go through each of the 5 get functions to retrieve the basic sequences he needs to start producing sequences to the caller. I will describe those functions below, starting with the one whose result is used in the innermost loop:
getEdgeFlipSequences
Imagine a cube that has all pieces in their right slots and rightly rotated, except for the edges which could be flipped, but still in right slot. If they would be flipped, the cube would be solved. As there are 12 edges, but edges can only be flipped with 2 at the same time, the number of ways this cube could have its edges flipped (or not) is 2^11 = 2048. Otherwise put, there are 11 of the 12 edges that can have any flip status (flipped or not), while the last one is bound by the flips of the other 11.
This function should return just as many sequences, such that after applying one of those sequences the next state of the cube is produced that has a unique set of edges flipped.
function getEdgeFlipSequences
sequences = []
for i = 1 to 2^11:
for edge = 1 to 11:
if i % (2^edge) != 0 then break
sequence = getEdgePairFlipSequence(edge, 12)
sequences.push(sequence)
return sequences
The inner loop makes sure that with one flip in each iteration of the outer loop you get exactly all possible flip states.
It is like listing all numbers in binary representation by just flipping one bit to arrive at the next number. The numbers' output will not be in order when produced that way, but you will get them all. For example, for 4 bits (instead of 11), it would go like this:
0000
0001
0011
0010
0110
0111
0101
0100
1100
1101
1111
1110
1010
1011
1001
1000
The sequence will determine which edge to flip together with the 12th edge. I will not go into defining that getEdgePairFlipSequence function now. It is evident that there are sequences for flipping any pair of edges, and where they are not publicly available, one can easily make a few moves to bring those two edges in a better position, do the double flip and return those edges to their original position again by applying the starting moves in reversed order and in opposite direction.
getCornerRotationSequences
The idea is the same as above, but now with rotated corners. The difference is that a corner can have three rotation states. But like with the flipped edges, if you know the rotations of 7 corners (already in their right position), the rotation of the 8th corner is determined as well. So there are 3^7 possible ways a cube can have its corners rotated.
The trick to rotate a corner together with the 8th corner, and so find all possible corner rotations also works here. The pattern in the 3-base number representation would be like this (for 3 corners):
000
001
002
012
011
010
020
021
022
122
121
120
110
111
112
102
101
100
200
201
202
212
211
210
220
221
222
So the code for this function would look like this:
function getCornerRotationSequences
sequences = []
for i = 1 to 3^7:
for corner = 1 to 7:
if i % (3^edge) != 0 break
sequence = getCornerPairRotationSequence(corner, 8)
sequences.push(sequence)
return sequences
Again, I will not define getCornerPairRotationSequence. A similar reasoning as for the edges applies.
getEdgeCycleSequences
When you want to move edges around without affecting the rest of the cube, you need to cycle at least 3 of them, as it is not possible to swap two edges without altering anything else.
For instance, it is possible to swap two edges and two corners. But that would be out of the scope of this function. I will come back to this later when dealing with the last function.
This function aims to find all possible cube states that can be arrived at by repeatedly cycling 3 edges. There are 12 edges, and if you know the position of 10 of them, the positions of the 2 remaining ones are determined (still assuming corners remain at their position). So there are 12!/2 = 239 500 800 possible permutations of edges in these conditions.
This may be a bit of problem memory-wise, as the array of sequences to produce will occupy a multiple of that number in bytes, so we could be talking about a few gigabytes. But I will assume there is enough memory for this:
function getEdgeCycleSequences
sequences = []
cycles = getCyclesReachingAllPermutations([1,2,3,4,5,6,7,8,9,10,11,12])
foreach cycle in cycles:
sequence = getEdgeTripletCycleSequence(cycle[0], cycle[1], cycle[3])
sequences.push(sequence)
return sequences
The getCyclesAchievingAllPermutations function would return an array of triplets of edges, such that if you would cycle the edges from left to right as listed in a triplet, and repeat this for the complete array, you would get to all possible permutations of edges (without altering the position of corners).
Several answers for this question I asked can be used to implement getCyclesReachingAllPermutations. The pseudo code based on this answer could look like this:
function getCyclesReachingAllPermutations(n):
c = [0] * n
b = [0, 1, ... n]
triplets = []
while (true):
triplet = [0]
for (parity = 0; parity < 2; parity++):
for (k = 1; k <= c[k]; k++):
c[k] = 0
if (k == n - 1):
return triplets
c[k] = c[k] + 1
triplet.add( b[k] )
for (j = 1, k--; j < k; j++, k--):
swap(b, j, k)
triplets.add(triplet)
Similarly for the other main functions, also here is a dependency on a function getEdgeTripletCycleSequence, which I will not expand on. There are many known sequences to cycle three edges, for several positions, and others can be easily derived from them.
getCornerCycleSequences
I will keep this short, as it is the same thing as for edges. There are 8!/2 possible permutations for corners if edges don't move.
function getCornerCycleSequences
sequences = []
cycles = getCyclesReachingAllPermutations([1,2,3,4,5,6,7,8])
foreach cycle in cycles:
sequence = getCornerTripletCycleSequence(cycle[0], cycle[1], cycle[3])
sequences.push(sequence)
return sequences
getParitySwitchingSequences
This extra level is needed to deal with the fact that a cube can be in an odd or even position. It is odd when an odd number of quarter-moves (a half turn counts as 2 then) is needed to solve the cube.
I did not mention it before, but all the above used sequences should not change the parity of the cube. I did refer to it implicitly when I wrote that when permuting edges, corners should stay in their original position. This ensures that the parity does not change. If on the other hand you would apply a sequence that swaps two edges and two corners at the same time, you are bound to toggle the parity.
But since that was not accounted for with the four functions above, this extra layer is needed.
The function is quite simple:
function getParitySwitchingSequences
return = [
[L], [-L]
]
L is a constant that represents the quarter move of the left face of the cube, and -L is the same move, but reversed. It could have been any face.
The simplest way to toggle the parity of a cube is just that: perform a quarter move.
Thoughts
This solution is certainly not the optimal one, but it is a solution that will eventually go through all states of the cube, albeit with many duplicate statuses appearing along the way. And it will do so with less than 20 moves between two consecutive permutations. The number of moves will vary between 1 -- for parity toggle -- and 18 -- for flipping two edges allowing for 2 extra moves to bring an edge in a good relative position and 2 for putting that edge back after the double flip with 14 moves, which I think is the worst case.
One quick optimisation would be to put the parity loop as the inner loop, as it only consists of one quarter move it is more efficient to have that one repeated the most.
Hamilton Graph: the best
A graph has been constructed where each edge represents one move, and where the nodes represent all unique cube states. It is cyclic, such that the edge forward from the last node, brings you back to the first node.
So this should allow you to go through all cube states with as many moves. Clearly a better solution cannot exist. The graph can be downloaded.
You can use the De Bruijn sequence to get a sequence that will definitely solve a rubik's cube (because it will contain every possible permutation of size 20).
From wiki (Python):
def de_bruijn(k, n):
"""
De Bruijn sequence for alphabet k
and subsequences of length n.
"""
try:
# let's see if k can be cast to an integer;
# if so, make our alphabet a list
_ = int(k)
alphabet = list(map(str, range(k)))
except (ValueError, TypeError):
alphabet = k
k = len(k)
a = [0] * k * n
sequence = []
def db(t, p):
if t > n:
if n % p == 0:
sequence.extend(a[1:p + 1])
else:
a[t] = a[t - p]
db(t + 1, p)
for j in range(a[t - p] + 1, k):
a[t] = j
db(t + 1, t)
db(1, 1)
return "".join(alphabet[i] for i in sequence)
You can use it kinda like this:
print(de_bruijn(x, 20))
Where 20 is the size of your sequence and x is a list/string containing every possible turn (couldn't think of a better word) of the cube.

Compare two arrays of points [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
Improve this question
I'm trying to find a way to find similarities in two arrays of different points. I drew circles around points that have similar patterns and I would like to do some kind of auto comparison in intervals of let's say 100 points and tell what coefficient of similarity is for that interval. As you can see it might not be perfectly aligned also so point-to-point comparison would not be a good solution also (I suppose). Patterns that are slightly misaligned could also mean that they are matching the pattern (but obviously with a smaller coefficient)
What similarity could mean (1 coefficient is a perfect match, 0 or less - is not a match at all):
Points 640 to 660 - Very similar (coefficient is ~0.8)
Points 670 to 690 - Quite similar (coefficient is ~0.5-~0.6)
Points 720 to 780 - Let's say quite similar (coefficient is ~0.5-~0.6)
Points 790 to 810 - Perfectly similar (coefficient is 1)
Coefficient is just my thoughts of how a final calculated result of comparing function could look like with given data.
I read many posts on SO but it didn't seem to solve my problem. I would appreciate your help a lot. Thank you
P.S. Perfect answer would be the one that provides pseudo code for function which could accept two data arrays as arguments (intervals of data) and return coefficient of similarity.
Click here to see original size of image
I also think High Performance Mark has basically given you the answer (cross-correlation). In my opinion, most of the other answers are only giving you half of what you need (i.e., dot product plus compare against some threshold). However, this won't consider a signal to be similar to a shifted version of itself. You'll want to compute this dot product N + M - 1 times, where N, M are the sizes of the arrays. For each iteration, compute the dot product between array 1 and a shifted version of array 2. The amount you shift array 2 increases by one each iteration. You can think of array 2 as a window you are passing over array 1. You'll want to start the loop with the last element of array 2 only overlapping the first element in array 1.
This loop will generate numbers for different amounts of shift, and what you do with that number is up to you. Maybe you compare it (or the absolute value of it) against a threshold that you define to consider two signals "similar".
Lastly, in many contexts, a signal is considered similar to a scaled (in the amplitude sense, not time-scaling) version of itself, so there must be a normalization step prior to computing the cross-correlation. This is usually done by scaling the elements of the array so that the dot product with itself equals 1. Just be careful to ensure this makes sense for your application numerically, i.e., integers don't scale very well to values between 0 and 1 :-)
i think HighPerformanceMarks's suggestion is the standard way of doing the job.
a computationally lightweight alternative measure might be a dot product.
split both arrays into the same predefined index intervals.
consider the array elements in each intervals as vector coordinates in high-dimensional space.
compute the dot product of both vectors.
the dot product will not be negative. if the two vectors are perpendicular in their vector space, the dot product will be 0 (in fact that's how 'perpendicular' is usually defined in higher dimensions), and it will attain its maximum for identical vectors.
if you accept the geometric notion of perpendicularity as a (dis)similarity measure, here you go.
caveat:
this is an ad hoc heuristic chosen for computational efficiency. i cannot tell you about mathematical/statistical properties of the process and separation properties - if you need rigorous analysis, however, you'll probably fare better with correlation theory anyway and should perhaps forward your question to math.stackexchange.com.
My Attempt:
Total_sum=0
1. For each index i in the range (m,n)
2. sum=0
3. k=Array1[i]*Array2[i]; t1=magnitude(Array1[i]); t2=magnitude(Array2[i]);
4. k=k/(t1*t2)
5. sum=sum+k
6. Total_sum=Total_sum+sum
Coefficient=Total_sum/(m-n)
If all values are equal, then sum would return 1 in each case and total_sum would return (m-n)*(1). Hence, when the same is divided by (m-n) we get the value as 1. If the graphs are exact opposites, we get -1 and for other variations a value between -1 and 1 is returned.
This is not so efficient when the y range or the x range is huge. But, I just wanted to give you an idea.
Another option would be to perform an extensive xnor.
1. For each index i in the range (m,n)
2. sum=1
3. k=Array1[i] xnor Array2[i];
4. k=k/((pow(2,number_of_bits))-1) //This will scale k down to a value between 0 and 1
5. sum=(sum+k)/2
Coefficient=sum
Is this helpful ?
You can define a distance metric for two vectors A and B of length N containing numbers in the interval [-1, 1] e.g. as
sum = 0
for i in 0 to 99:
d = (A[i] - B[i])^2 // this is in range 0 .. 4
sum = (sum / 4) / N // now in range 0 .. 1
This now returns distance 1 for vectors that are completely opposite (one is all 1, another all -1), and 0 for identical vectors.
You can translate this into your coefficient by
coeff = 1 - sum
However, this is a crude approach because it does not take into account the fact that there could be horizontal distortion or shift between the signals you want to compare, so let's look at some approaches for coping with that.
You can sort both your arrays (e.g. in ascending order) and then calculate the distance / coefficient. This returns more similarity than the original metric, and is agnostic towards permutations / shifts of the signal.
You can also calculate the differentials and calculate distance / coefficient for those, and then you can do that sorted also. Using differentials has the benefit that it eliminates vertical shifts. Sorted differentials eliminate horizontal shift but still recognize different shapes better than sorted original data points.
You can then e.g. average the different coefficients. Here more complete code. The routine below calculates coefficient for arrays A and B of given size, and takes d many differentials (recursively) first. If sorted is true, the final (differentiated) array is sorted.
procedure calc(A, B, size, d, sorted):
if (d > 0):
A' = new array[size - 1]
B' = new array[size - 1]
for i in 0 to size - 2:
A'[i] = (A[i + 1] - A[i]) / 2 // keep in range -1..1 by dividing by 2
B'[i] = (B[i + 1] - B[i]) / 2
return calc(A', B', size - 1, d - 1, sorted)
else:
if (sorted):
A = sort(A)
B = sort(B)
sum = 0
for i in 0 to size - 1:
sum = sum + (A[i] - B[i]) * (A[i] - B[i])
sum = (sum / 4) / size
return 1 - sum // return the coefficient
procedure similarity(A, B, size):
sum a = 0
a = a + calc(A, B, size, 0, false)
a = a + calc(A, B, size, 0, true)
a = a + calc(A, B, size, 1, false)
a = a + calc(A, B, size, 1, true)
return a / 4 // take average
For something completely different, you could also run Fourier transform using FFT and then take a distance metric on the returning spectra.

matlab: optimum amount of points for linear fit

I want to make a linear fit to few data points, as shown on the image. Since I know the intercept (in this case say 0.05), I want to fit only points which are in the linear region with this particular intercept. In this case it will be lets say points 5:22 (but not 22:30).
I'm looking for the simple algorithm to determine this optimal amount of points, based on... hmm, that's the question... R^2? Any Ideas how to do it?
I was thinking about probing R^2 for fits using points 1 to 2:30, 2 to 3:30, and so on, but I don't really know how to enclose it into clear and simple function. For fits with fixed intercept I'm using polyfit0 (http://www.mathworks.com/matlabcentral/fileexchange/272-polyfit0-m) . Thanks for any suggestions!
EDIT:
sample data:
intercept = 0.043;
x = 0.01:0.01:0.3;
y = [0.0530642513911393,0.0600786706929529,0.0673485248329648,0.0794662409166333,0.0895915873196170,0.103837395346484,0.107224784565365,0.120300492775786,0.126318699218730,0.141508831492330,0.147135757370947,0.161734674733680,0.170982455701681,0.191799936622712,0.192312642057298,0.204771365716483,0.222689541632988,0.242582251060963,0.252582727297656,0.267390860166283,0.282890010610515,0.292381165948577,0.307990544720676,0.314264952297699,0.332344368808024,0.355781519885611,0.373277721489254,0.387722683944356,0.413648156978284,0.446500064130389;];
What you have here is a rather difficult problem to find a general solution of.
One approach would be to compute all the slopes/intersects between all consecutive pairs of points, and then do cluster analysis on the intersepts:
slopes = diff(y)./diff(x);
intersepts = y(1:end-1) - slopes.*x(1:end-1);
idx = kmeans(intersepts, 3);
x([idx; 3] == 2) % the points with the intersepts closest to the linear one.
This requires the statistics toolbox (for kmeans). This is the best of all methods I tried, although the range of points found this way might have a few small holes in it; e.g., when the slopes of two points in the start and end range lie close to the slope of the line, these points will be detected as belonging to the line. This (and other factors) will require a bit more post-processing of the solution found this way.
Another approach (which I failed to construct successfully) is to do a linear fit in a loop, each time increasing the range of points from some point in the middle towards both of the endpoints, and see if the sum of the squared error remains small. This I gave up very quickly, because defining what "small" is is very subjective and must be done in some heuristic way.
I tried a more systematic and robust approach of the above:
function test
%% example data
slope = 2;
intercept = 1.5;
x = linspace(0.1, 5, 100).';
y = slope*x + intercept;
y(1:12) = log(x(1:12)) + y(12)-log(x(12));
y(74:100) = y(74:100) + (x(74:100)-x(74)).^8;
y = y + 0.2*randn(size(y));
%% simple algorithm
[X,fn] = fminsearch(#(ii)P(ii, x,y,intercept), [0.5 0.5])
[~,inds] = P(X, y,x,intercept)
end
function [C, inds] = P(ii, x,y,intercept)
% ii represents fraction of range from center to end,
% So ii lies between 0 and 1.
N = numel(x);
n = round(N/2);
ii = round(ii*n);
inds = min(max(1, n+(-ii(1):ii(2))), N);
% Solve linear system with fixed intercept
A = x(inds);
b = y(inds) - intercept;
% and return the sum of squared errors, divided by
% the number of points included in the set. This
% last step is required to prevent fminsearch from
% reducing the set to 1 point (= minimum possible
% squared error).
C = sum(((A\b)*A - b).^2)/numel(inds);
end
which only finds a rough approximation to the desired indices (12 and 74 in this example).
When fminsearch is run a few dozen times with random starting values (really just rand(1,2)), it gets more reliable, but I still wouln't bet my life on it.
If you have the statistics toolbox, use the kmeans option.
Depending on the number of data values, I would split the data into a relative small number of overlapping segments, and for each segment calculate the linear fit, or rather the 1-st order coefficient, (remember you know the intercept, which will be same for all segments).
Then, for each coefficient calculate the MSE between this hypothetical line and entire dataset, choosing the coefficient which yields the smallest MSE.

What is a tidy algorithm to find overlapping intervals?

I'm sure this must have been asked before, but I'm not finding it: I'm only finding related, but harder questions.
I've got four points, representing two lines like this:
A C B D
|------*---|-----+----|-*---+---|----------|
0 10 20 30 40
So in the example, AB = {7, 21} and CD = {16,26}. (The lines could be in any relation to each other, and any size.) I want to find out whether or not they overlap, and by how much if so. (In the example, the answer would be 5.) My current solution involves a bunch of complicated if/then steps, and I can't help but think there's a nice arithmetical solution. Is there?
(P.S. Really, I'm doing bounding-box intersection, but if I can get it in one dimension, the other will be the same, obviously.)
Try this:
intersects = (max(a,b) > min(c,d)) && (min(a,b) < max(c,d))
overlap = min(max(a,b), max(c,d)) - max(min(c,d), min(a,b))
If you can assume a <= b and c <= d:
intersects = (b > c) && (a < d)
overlap = min(b, d) - max(c, a)
You can also calculate intersects as follows:
intersects = (overlap > 0)
A line segment is the intersection of two opposing rays (two half-infinite lines in opposite directions). You have two line segments to intersect -- the result is the intersection of all 4 rays. So you can factor your code as three successive ray-intersections: the leftward of the left-facing rays intersected with the rightward of the right-facing rays.
(This is another way of stating the now-accepted answer.)

Resources