Can anyone suggest a quick triangle detection algorithm?

I have some data in the form of Line objects (eg Line1(start, end), where start and end are coordinates in the form of point objects). Is there a quick way to go through all the lines to see if any of them form a triangle? By quick I mean anything better than going through all nC3 possibilities.
Edit: Just realised I may not understand all the replies (I'm no Adrian Lamo). Please try and explain wrt Python.

1) geometric step: enter every line segment in a dictionary keyed by endpoint, storing each segment under both of its endpoints with the other endpoint as the value. Keys will repeat, so keep a list (or set) of values per key rather than a single value. In principle the lists will contain no duplicates (unless you enter the same edge twice).
2) topological step: for every entry P in the dictionary, consider all pairs (Q, R) of elements from its list. Look up Q and check whether R belongs to Q's list. If it does, you have found the triangle (P, Q, R).
By symmetry, each triangle will be reported six times, once per permutation. You can avoid that by enforcing P < Q < R in the lexicographic sense.
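Since the question asks for Python, here is a minimal sketch of the two steps above; `find_triangles` is a name chosen for illustration, and points are assumed to be hashable `(x, y)` tuples. Storing each edge only under its lexicographically smaller endpoint bakes the P < Q < R rule in, so each triangle is reported exactly once:

```python
def find_triangles(lines):
    """Find triangles among line segments given as (start, end) point pairs.

    Points must be hashable (e.g. (x, y) tuples). Each edge is stored under
    its lexicographically smaller endpoint, so every triangle (P, Q, R)
    with P < Q < R is reported exactly once.
    """
    # geometric step: dictionary from endpoint to the set of larger endpoints
    neighbours = {}
    for start, end in lines:
        p, q = min(start, end), max(start, end)
        if p != q:  # ignore degenerate zero-length segments
            neighbours.setdefault(p, set()).add(q)

    # topological step: for each P, test every pair (Q, R) from its list
    triangles = []
    for p, adjacent in neighbours.items():
        for q in adjacent:
            for r in adjacent:
                if q < r and r in neighbours.get(q, ()):
                    triangles.append((p, q, r))
    return triangles

lines = [((0, 0), (1, 0)), ((1, 0), (0, 1)), ((0, 1), (0, 0)), ((2, 2), (3, 3))]
print(find_triangles(lines))  # [((0, 0), (0, 1), (1, 0))]
```

The work is one dictionary pass plus, per vertex, one pass over the pairs in its adjacency list, which is far below the nC3 brute force when the segment graph is sparse.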

Is there a quick way to go through all the lines to see if any of them form a triangle?
Yes. Assuming that your Points are integers (or can be easily converted to such, because they have fixed significant digits or similar):
Being creative for you here:
Make two quickly searchable storage structures (e.g. std::multimap<int, Line*>), one for each of the x and y coordinates of the endpoints, associating each coordinate with a pointer to the respective line.
In the first structure, search for elements with the same x coordinate. Those are the only candidates for being a corner of a triangle. Finding duplicate entries is fast because you are using an appropriate data structure.
For each candidate, verify (using the second structure) that the y coordinates match as well, i.e. that two segments really share an endpoint. If not, discard it.
For each of the remaining corners, verify that the two "opposite" line ends are joined by a third segment. Discard the others. Done.

Related

Find optimal local alignment of two strings using local & global alignments

I have a homework question that I have been trying to solve for many hours without success; maybe someone can guide me to the right way of thinking about it.
The problem:
We want to find an optimal local alignment of two strings S1 and S2, we know that there exists such an alignment
with the two aligned substrings of S1 and S2 both of length at most q.
Besides, we know that the number of the table cells with the maximal value, opt, is at most r.
Describe an algorithm solving the problem in O(mn + r*q^2) time, using working space of at most O(n + r + q^2).
Restrictions: you may run the algorithm for finding the optimal local alignment value (with additions of your choice, such as the list of index pairs) only once. However, you may run any variant of the algorithm for solving the global optimal alignment problem as many times as you wish.
I know how to solve this problem by running the local alignment many times and the global alignment only once, but not the other way around.
(The original question attached images of the global and local alignment algorithms here.)
Any help would be appreciated.
The answer in case someone will be interested in this question in the future:
Compute the optimal local alignment score OPT of both strings in $O(mn)$ time and $O(n)$ space by maintaining just a single row of the DP matrix. (Since we are only computing the score and never perform a traceback, we don't need the full DP matrix.) As you do so, keep track of the highest cell value seen so far, together with a list of the coordinates $(i, j)$ of cells attaining it: whenever a new maximum is seen, clear the list and update the maximum; whenever a cell equal to the current maximum is seen (including the cell that just set it), append its coordinates to the list. At the end, we have a list of the endpoints of all optimal local alignments; by assumption, there are at most $r$ of these.
For each entry $(i, j)$ in the list:
Set $R1$ to the reverse of the substring $S1[i-q+1..i]$, and $R2$ to the reverse of $S2[j-q+1..j]$.
Perform optimal global alignment of $R1$ and $R2$ in $O(q^2)$ time, maintaining the full $O(q^2)$ DP matrix this time.
Search for the highest entry in the matrix (also $O(q^2)$ time; or you can perform this during the previous step).
If this entry is OPT, we have found a solution: Trace back towards the top-left corner from this cell to find the full solution, reverse it, and output it, and stop.
By assumption, at least one of the alignments performed in the previous step reaches a score of OPT. (Note that reversing both strings does not change the score of an alignment.)
Step 2 iterates at most $r$ times, and does $O(q^2)$ work each time, using at most $O(q^2)$ space, so overall the time and space bounds are met.
(A simpler way, that avoids reversing strings, would be to simply perform local alignments of the length-$q$ substrings, but the restrictions appear to forbid this.)
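A Python sketch of step 1 above (score-only Smith-Waterman keeping a single DP row, collecting all cells that attain OPT); the linear gap penalty and the match/mismatch scores are placeholder parameters, not part of the problem statement:

```python
def local_alignment_max_cells(s1, s2, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score in O(n) space: returns (OPT, cells), where
    cells lists the 1-based (i, j) coordinates of all cells equal to OPT."""
    n = len(s2)
    prev = [0] * (n + 1)          # single DP row; no traceback kept
    best, cells = 0, []
    for i, a in enumerate(s1, 1):
        curr = [0] * (n + 1)
        for j, b in enumerate(s2, 1):
            score = max(0,
                        prev[j - 1] + (match if a == b else mismatch),
                        prev[j] + gap,
                        curr[j - 1] + gap)
            curr[j] = score
            if score > best:       # new maximum: reset the list
                best, cells = score, []
            if score == best and best > 0:
                cells.append((i, j))
        prev = curr
    return best, cells

print(local_alignment_max_cells("ACGT", "CG"))  # (2, [(3, 2)])
```

Only two rows of the matrix are ever alive, giving the $O(n)$ working space the exercise requires for this phase.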

3x3 grid puzzle solving (JS)

I have an image separated into a 3x3 grid. The grid is represented by an array. Each column or row can be rotated through. E.g., the top row [1,2,3] could become [3,1,2], etc.
The array needs to end up as:
[1,2,3]
[4,5,6]
[7,8,9]
And would start from something like:
[5,3,9]
[7,1,4]
[8,6,2]
It will always be solvable, so this doesn't need to be checked.
I've tried a long-handed approach of looking for '1' and moving it left then up to its correct place, and so on for 2, 3, ... but I end up going round in circles forever.
Any help would be appreciated, even if you can just give me a starting point/reference... I can't seem to think through this one.
your problem is that the moves to shift one value will mess up others. i suspect with enough set theory you can work out an exact solution, but here's a heuristic that has more chance of working.
first, note that if every number in a row belongs to that row then it's either trivial to solve, or some values are swapped. [2,3,1] is trivial, while [3,2,1] is swapped, for example.
so an "easier" target than placing 1 top left is to get all rows into that state. how might we do that? let's look at the columns...
if the column contains one number from each row, then we are in a similar state to above (it's either trivial to shift so numbers are in the correct rows, or it's swapped).
so, what i would suggest is:
for column in columns:
    if column is not one value from each row:
        pick a value from column that is from a duplicate row
        rotate that row
for column in columns:
    as well as possible, shift until each value is in correct row
for row in rows:
    as well as possible, shift until each value is in correct column
now, that is not guaranteed to work, although it will tend to get close, and can solve some set of "almost right" arrangements.
so what i would then do is put that in a loop and, on each run, record a "hash" of the state (for example, a string containing the values read row by row). then, if on some run i detect that the state has already occurred (by checking whether the hash is one we have seen before, meaning we are repeating ourselves), i would invoke a "random shuffle" that mixes things up.
so the idea is that we have something that has a chance of working once we are close, and a shuffle that we resort to when that gets stuck in a loop.
as i said, i am sure there are smarter ways to do this, but if i were desperate and couldn't find anything on google, that's the kind of heuristic i would try... i am not even sure the above is right, but the more general tactic is:
identify something that will solve very close solutions (in a sense, find out where the puzzle is "linear")
try repeating that
shuffle if it repeats
and that's really all i am saying here.
Since the grid is 3x3, you can not only find the solution, but find the smallest number of moves to solve the problem.
You would need to use Breadth First Search for this purpose.
Represent each configuration as a linear array of 9 elements. Each move takes you to a different configuration. Since the array is a permutation of the numbers 1-9, there are at most 9! = 362,880 different configurations.
If we consider each configuration as a node and each move as an edge, we can explore the entire graph in O(n), where n is the number of configurations. We must make sure not to re-solve a configuration we have already seen, so you need a 'visited' structure that marks each configuration as it is encountered.
When you reach the 'solved' configuration, you can trace back the moves taken by using a 'parent' array, which stores the configuration you came from.
Also note, if it had been a 4x4 grid, the problem would have been quite intractable, since n would equal (4x4)! = 16! = 2.09227899 × 10^13. But for smaller problems like this, you can find the solution pretty fast.
Edit:
TL;DR:
Guaranteed to work, and pretty fast at that: 362,880 is a pretty small number for today's computers.
It will find the shortest sequence of moves.
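A compact sketch of this BFS in Python (function and move labels are illustrative): states are 9-tuples, each of the 12 possible moves rotates one row or column by one step, and the `parent` dictionary doubles as the visited set, so tracing it back from the goal yields the shortest move sequence.

```python
from collections import deque

def solve(start):
    """BFS over row/column rotations of a 3x3 grid; returns the shortest
    list of (kind, index, direction) moves reaching 1..9, or None."""
    def moves(state):
        g = [list(state[i * 3:(i + 1) * 3]) for i in range(3)]
        for i in range(3):
            for d in (1, -1):
                rows = [row[:] for row in g]          # rotate row i by d
                rows[i] = rows[i][-d:] + rows[i][:-d]
                yield ('row', i, d), tuple(x for row in rows for x in row)
                cols = [row[:] for row in g]          # rotate column i by d
                col = [cols[r][i] for r in range(3)]
                col = col[-d:] + col[:-d]
                for r in range(3):
                    cols[r][i] = col[r]
                yield ('col', i, d), tuple(x for row in cols for x in row)

    goal = tuple(range(1, 10))
    start = tuple(start)
    parent = {start: None}        # also serves as the visited set
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while parent[state]:  # walk parents back to the start
                state, move = parent[state]
                path.append(move)
            return path[::-1]
        for move, nxt in moves(state):
            if nxt not in parent:
                parent[nxt] = (state, move)
                queue.append(nxt)
    return None                   # unreachable (odd permutation)

print(solve([2, 3, 1, 4, 5, 6, 7, 8, 9]))  # [('row', 0, 1)]
```

Note that row and column rotations are 3-cycles, i.e. even permutations, so only half of the 9! arrangements are actually reachable; BFS simply returns None for the rest.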

Is there an efficient algorithm to generate random points in general position in the plane?

I need to generate n random points in general position in the plane, i.e. no three points can lie on a same line. Points should have coordinates that are integers and lie inside a fixed square m x m. What would be the best algorithm to solve such a problem?
Update: square is aligned with the axes.
Since they're integers within a square, treat them as points in a bitmap. When you add a point after the first, use Bresenham's algorithm to paint all pixels on each of the lines going through the new point and one of the old ones. When you need to add a new point, get a random location and check if it's clear; otherwise, try again. Since each pair of pixels gives a new line, and thus excludes up to m-2 other pixels, as the number of points grows you will have several random choices rejected before you find a good one. The advantage of the approach I'm suggesting is that you only pay the cost of going through all lines when you have a good choice, while rejecting a bad one is a very quick test.
(if you want to use a different definition of line, just replace Bresenham's with the appropriate algorithm)
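A Python sketch of this scheme. One detail: since the points are lattice points, this version marks exactly the integer points on each line (stepping by the gcd-reduced direction) in place of Bresenham's rasterisation, which is what collinearity of integer points actually requires; the function name and the retry limit are arbitrary choices:

```python
import random
from math import gcd

def general_position_points(n, m, tries=10000):
    """Pick n integer points in general position inside an m x m grid.

    blocked[x][y] is True if (x, y) lies on a line through two chosen
    points; candidates are drawn at random and rejected quickly if blocked.
    """
    blocked = [[False] * m for _ in range(m)]
    points = []
    for _ in range(n):
        for _ in range(tries):
            x, y = random.randrange(m), random.randrange(m)
            if not blocked[x][y] and (x, y) not in points:
                break
        else:
            raise RuntimeError("could not place a point")
        # extend the line through (x, y) and every old point across the grid
        for px, py in points:
            dx, dy = x - px, y - py
            g = gcd(abs(dx), abs(dy)) or 1
            dx, dy = dx // g, dy // g   # smallest integer step on the line
            for sign in (1, -1):
                cx, cy = px, py
                while 0 <= cx < m and 0 <= cy < m:
                    blocked[cx][cy] = True
                    cx += sign * dx
                    cy += sign * dy
        points.append((x, y))
    return points
```

Rejecting a bad candidate is a single array lookup; the full line-marking cost is only paid once per accepted point, exactly as the answer argues.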
Can't see any way around checking each point as you add it, either by (a) running through all of the possible lines it could be on, or (b) eliminating conflicting points as you go along to reduce the possible locations for the next point. Of the two, (b) seems like it could give you better performance.
Similar to @LaC's answer. If memory is not a problem, you could do it like this:
Add all points on the plane to a list (L).
Shuffle the list.
For each point (P) in the list:
    For each point (Q) previously picked:
        Remove every point from L which is collinear with P and Q.
    Add P to the picked list.
You could continue the outer loop until you have enough points, or run out of them.
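A direct (unoptimised) Python sketch of these steps; the collinearity check uses the integer cross product, and the function name is made up for illustration:

```python
import random
from itertools import product

def pick_general_position(n, m):
    """Shuffle all m*m grid points; pick points in that order, discarding
    everything collinear with any pair of already-picked points."""
    available = set(product(range(m), repeat=2))
    order = list(available)
    random.shuffle(order)
    picked = []
    for p in order:
        if p not in available:      # already crossed out by an earlier pair
            continue
        for q in picked:
            # remove every grid point collinear with p and q (cross product 0)
            doomed = {r for r in available
                      if (q[0] - p[0]) * (r[1] - p[1])
                      == (q[1] - p[1]) * (r[0] - p[0])}
            available -= doomed
        available.discard(p)
        picked.append(p)
        if len(picked) == n:
            return picked
    return picked                   # ran out of usable points
```

Because each pick never gets rejected, the loop terminates after at most m*m iterations; the expensive part is the removal step, which a bitmap over the grid can speed up as discussed below.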
This might just work (though might be a little constrained on being random). Find the largest circle you can draw within the square (this seems very doable). Pick any n points on the circle, no three will ever be collinear :-).
This should be an easy enough task in code. Say the circle is centered at origin (so something of the form x^2 + y^2 = r^2). Assuming r is fixed and x randomly generated, you can solve to find y coordinates. This gives you two points on the circle for every x which are diametrically opposite. Hope this helps.
Edit: Oh, integer points, just noticed that. That's a pity. I'm going to keep this solution up though, since I like the idea.
Both @LaC's and @MizardX's solutions are very interesting, but you can combine them to get an even better solution.
The problem with @LaC's solution is that random choices get rejected. The more points you have already generated, the harder it gets to generate new ones. If there is only one available position left, you have only a slight chance (1/m^2) of randomly choosing it.
In @MizardX's solution you never get rejected choices; however, if you directly implement the step that removes every point from L collinear with P and Q, you'll get worse complexity (O(n^5)).
Instead it would be better to use a bitmap to find which points from L are to be removed. The bitmap would contain a value indicating whether a point is free to use and what is its location on the L list or a value indicating that this point is already crossed out. This way you get worst-case complexity of O(n^4) which is probably optimal.
EDIT:
I've just found this question: Generate Non-Degenerate Point Set in 2D - C++
It's very similar to this one, and it would be good to use the solution from that answer. Modifying it a bit to use radix or bucket sort, and adding all the n^2 possible points to the set P initially and shuffling it, one can also get a worst-case complexity of O(n^4) with much simpler code. Moreover, if space is a problem and @LaC's solution is not feasible due to its space requirements, then this algorithm will fit in without modifications and offer a decent complexity.
Here is a paper that may help with your problem: "Point-Sets in General Position with Many Similar Copies of a Pattern" by Bernardo M. Abrego and Silvia Fernandez-Merchant.
um, you don't specify which plane... but just generate 3 random numbers and assign them to x, y, and z.
if 'the plane' is arbitrary, then set z=0 every time or something...
do a check on x and y to see if they are in your m boundary,
compare the third x,y pair to see if it is on the same line as the first two... if it is, then regenerate the random values.

Tricky algorithm for sorting symbols in an array while preserving relationships via order

The problem
I have multiple groups which specify the relationships of symbols. For example:
[A B C]
[A D E]
[X Y Z]
What these groups mean is that (for the first group) the symbols, A, B, and C are related to each other. (The second group) The symbols A, D, E are related to each other.. and so forth.
Given all these data, I would need to put all the unique symbols into a 1-dimension array wherein the symbols which are somehow related to each other would be placed closer to each other. Given the example above, the result should be something like:
[B C A D E X Y Z]
or
[X Y Z D E A B C]
In this resulting array, since the symbol A has multiple relationships (namely with B and C in one group and with D and E in another) it's now located between those symbols, somewhat preserving the relationship.
Note that the order is not important. In the result, X Y Z can be placed first or last since those symbols are not related to any other symbols. However, the closeness of the related symbols is what's important.
What I need help in
I need help in determining an algorithm that takes groups of symbol relationships, then outputs the 1-dimension array using the logic above. I'm pulling my hair out on how to do this since with real data, the number of symbols in a relationship group can vary, there is also no limit to the number of relationship groups and a symbol can have relationships with any other symbol.
Further example
To further illustrate the trickiness of my dilemma, suppose you add another relationship group to the example above. Let's say:
[C Z]
The result now should be something like:
[X Y Z C B A D E]
Notice that the symbols Z and C are now closer together since their relationship was reinforced by the additional data. All previous relationships are still retained in the result also.
The first thing you need to do is to precisely define the result you want.
You do this by defining how good a result is, so that you know which is the best one. Mathematically you do this by a cost function. In this case one would typically choose the sum of the distances between related elements, the sum of the squares of these distances, or the maximal distance. Then a list with a small value of the cost function is the desired result.
It is not clear whether in this case it is feasible to compute the best solution by some special method (maybe if you choose the maximal distance or the sum of the distances as the cost function).
In any case it should be easy to find a good approximation by standard methods.
A simple greedy approach would be to insert each element in the position where the resulting cost function for the whole list is minimal.
Once you have a good starting point you can try to improve it further by modifying the list towards better solutions, for example by swapping elements or rotating parts of the list (local search, hill climbing, simulated annealing, other).
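A small Python sketch of this greedy insertion, using the sum of pairwise distances between related symbols as the cost function (one of the choices suggested above); `order_symbols` is an illustrative name:

```python
def order_symbols(groups):
    """Greedily build the 1-D array: insert each unique symbol at the
    position that minimises the total distance between related symbols."""
    pairs = set()
    for g in groups:
        for i in range(len(g)):
            for j in range(i + 1, len(g)):
                if g[i] != g[j]:
                    pairs.add(frozenset((g[i], g[j])))

    def cost(order):
        # sum of |pos(a) - pos(b)| over related pairs already placed
        pos = {s: k for k, s in enumerate(order)}
        return sum(abs(pos[a] - pos[b]) for a, b in map(tuple, pairs)
                   if a in pos and b in pos)

    order = []
    for sym in dict.fromkeys(s for g in groups for s in g):  # unique, in order
        candidates = [order[:k] + [sym] + order[k:]
                      for k in range(len(order) + 1)]
        order = min(candidates, key=cost)
    return order
```

A local-search pass (swapping elements and keeping improvements) can then be layered on top of this starting point, as the answer suggests.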
I think, because with large amounts of data and lack of additional criteria, it's going to be very very difficult to make something that finds the best option. Have you considered doing a greedy algorithm (construct your solution incrementally in a way that gives you something close to the ideal solution)? Here's my idea:
Sort your sets of related symbols by size, and start with the largest one. Keep those all together, because without any other criteria, we might as well say their proximity is the most important since it's the biggest set. Consider every symbol in that first set an "endpoint", an endpoint being a symbol you can rearrange and put at either end of your array without damaging your proximity rule (everything in the first set is an endpoint initially because they can be rearranged in any way). Then go through your list and as soon as one set has one or more symbols in common with the first set, connect them appropriately. The symbols that you connected to each other are no longer considered endpoints, but everything else still is. Even if a bigger set only has one symbol in common, I'm going to guess that's better than smaller sets with more symbols in common, because this way, at least the bigger set stays together as opposed to possibly being split up if it was put in the array later than smaller sets.
I would go on like this, updating the list of endpoints that existed so that you could continue making matches as you went through your set. I would keep track of if I stopped making matches, and in that case, I'd just go to the top of the list and just tack on the next biggest, unmatched set (doesn't matter if there are no more matches to be made, so go with the most valuable/biggest association). Ditch the old endpoints, since they have no matches, and then all the symbols of the set you just tacked on are the new endpoints.
This may not have a good enough runtime, I'm not sure. But hopefully it gives you some ideas.
Edit: Obviously, as part of the algorithm, ditch duplicates (trivial).
The problem as described is essentially the problem of drawing a graph in one dimension.
Using the relationships, construct a graph. Treat the unique symbols as the vertices of the graph. Place an edge between any two vertices that co-occur in a relationship; more sophisticated would be to construct a weight based on the number of relationships in which the pair of symbols co-occur.
Algorithms for drawing graphs place well-connected vertices closer to one another, which is equivalent to placing related symbols near one another. Since only an ordering is needed, the symbols can just be ranked based on their positions in the drawing.
There are a lot of algorithms for drawing graphs. In this case, I'd go with Fiedler ordering, which orders the vertices using a particular eigenvector (the Fiedler vector) of the graph Laplacian. Fiedler ordering is straightforward, effective, and optimal in a well-defined mathematical sense.
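A sketch of Fiedler ordering in Python, assuming NumPy is available; edge weights count co-occurrences as suggested above, and for a disconnected graph the second eigenvector merely separates the components, which still keeps related symbols together:

```python
import numpy as np

def fiedler_order(groups):
    """Order symbols by the Fiedler vector of the graph Laplacian."""
    symbols = sorted({s for g in groups for s in g})
    index = {s: i for i, s in enumerate(symbols)}
    n = len(symbols)
    W = np.zeros((n, n))                 # weighted adjacency matrix
    for g in groups:
        for i in range(len(g)):
            for j in range(i + 1, len(g)):
                a, b = index[g[i]], index[g[j]]
                W[a, b] += 1
                W[b, a] += 1
    L = np.diag(W.sum(axis=1)) - W       # graph Laplacian L = D - W
    _, eigvecs = np.linalg.eigh(L)       # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]              # eigenvector of 2nd-smallest eigenvalue
    return [symbols[i] for i in np.argsort(fiedler)]

groups = [['A', 'B', 'C'], ['A', 'D', 'E'], ['X', 'Y', 'Z'], ['C', 'Z']]
print(fiedler_order(groups))
```

Sorting by the Fiedler vector's components is exactly the "rank the vertices by their positions in the drawing" step from the answer.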
It sounds like you want to do topological sorting: http://en.wikipedia.org/wiki/Topological_sorting
Regarding the initial ordering, it seems like you are trying to enforce some kind of stability condition, but it is not really clear to me what this should be from your question. Could you try to be a bit more precise in your description?

Computing overlaps of grids

Say I have two maps, each represented as a 2D array. Each map contains several distinct features (rocks, grass, plants, trees, etc.). I know the two maps are of the same general region but I would like to find out: 1.) if they overlap and 2.) if so, where does this overlap occur. Does anyone know of any algorithms which would help me do this?
[EDIT]
Each feature is contained entirely inside an array index. Although it is possible to discern (for example) a rock from a patch of grass, it is not possible to discern one rock from another (or one patch of grass from another).
When doing this in 1D, I would, for each index in the first collection (a string, really), try to find the largest match in the second collection. If the match runs to the end, I have an overlap (as in 'action' and 'ionbeam').
match( A on B ):
    for each i in length(A):
        see if A[i..] matches B[0..]
if no match found: do the same for B on A.
For 2D, you do the same thing, basically: find an 'edge' of A that overlaps with the opposite edge of B. Only the edges aren't 1D, but 2D:
for each point xa,ya in A:
    find a row yb in B that has a match( A[ya] on B[yb] )
    see if A[ya..] matches B[yb..]
You need to do this for the 2 diagonals, in each sense.
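One direct way to realise this in Python is to try every relative offset of the two grids and compare the overlapping region; this brute force covers all four "senses" at once (names are illustrative):

```python
def find_overlap(a, b):
    """Test every relative offset of grid b over grid a; return the first
    (dx, dy) whose overlapping cells all agree, or None."""
    ha, wa = len(a), len(a[0])
    hb, wb = len(b), len(b[0])
    for dy in range(-hb + 1, ha):
        for dx in range(-wb + 1, wa):
            # cells of a that fall inside b shifted by (dx, dy)
            cells = [(y, x)
                     for y in range(max(0, dy), min(ha, dy + hb))
                     for x in range(max(0, dx), min(wa, dx + wb))]
            if cells and all(a[y][x] == b[y - dy][x - dx] for y, x in cells):
                return dx, dy
    return None
```

With a one-cell overlap region almost anything matches, so in practice you would also require a minimum overlap area before accepting an offset.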
For one map, go through each feature and find the nearest other feature to it. Record these in a list, storing the types of the two features and the (dx, dy) between them, in a hash table or sorted list. These records are now location-invariant, since they store only relative distances.
Now for your second map, start doing the same: pick any feature, find its closest neighbor, find the delta. Look for the same correspondence in the original map list. If the features are shared between the maps, you'll find it in the list and you now know one correspondence between the maps. Repeat for many features if necessary. The results will give you a decent answer of if the maps overlap, and if so, at what offset.
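A Python sketch of this nearest-neighbour signature idea; feature maps are given as lists of `(x, y, kind)` tuples, and both function names are made up for illustration:

```python
from collections import defaultdict
from math import hypot

def pair_signatures(features):
    """For each feature, record (type, nearest type, dx, dy): a
    location-invariant signature as described above."""
    sigs = defaultdict(list)
    for (x, y, kind) in features:
        others = [(ox, oy, ok) for ox, oy, ok in features if (ox, oy) != (x, y)]
        nx, ny, nk = min(others, key=lambda o: hypot(o[0] - x, o[1] - y))
        sigs[(kind, nk, nx - x, ny - y)].append((x, y))
    return sigs

def estimate_offset(map_a, map_b):
    """Match signatures between the maps; each shared signature votes
    for one candidate offset of map_a relative to map_b."""
    a, b = pair_signatures(map_a), pair_signatures(map_b)
    votes = defaultdict(int)
    for sig, positions in b.items():
        for (bx, by) in positions:
            for (ax, ay) in a.get(sig, []):
                votes[(ax - bx, ay - by)] += 1
    return max(votes, key=votes.get) if votes else None
```

Each shared signature votes for one offset, so noise in either map only weakens, rather than breaks, the estimate.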
Sounds like image registration (wikipedia), finding a transformation (translation only, in your case) which can align two images. There's a pile of software that does this sort of thing linked off the wikipedia page.
