Algorithm for finding symmetries of a tree


I have n sectors, enumerated 0 to n-1 counterclockwise. The boundaries between these sectors are infinite branches (n of them).
The sectors live in the complex plane, and for n even,
sector 0 and n/2 are bisected by the real axis, and the sectors are evenly spaced.
These branches meet at certain points, called junctions. Each junction is adjacent to a subset of the sectors (at least 3 of them).
Specifying the junctions (in prefix order, let's say, starting from the junction adjacent to sectors 0 and 1) and the distances between the junctions uniquely describes the tree.
Now, given such a representation, how can I see if it is symmetric wrt the real axis?
For example, with n=6, the tree (0,1,5)(1,2,4,5)(2,3,4) has three junctions on the real line,
so it is symmetric wrt the real axis.
If the distance between (015) and (1245) is equal to the distance from (1245) to (234),
it is also symmetric wrt the imaginary axis.
The tree (0,1,5)(1,2,5)(2,4,5)(2,3,4) has 4 junctions, and it is never symmetric wrt either the imaginary or the real axis, but it has 180-degree rotation symmetry if the distance between the first two junctions equals the distance between the last two junctions in the representation.
Edit:
Here are all trees with 6 branches, distances 1.
http://www2.math.su.se/~per/files/allTrees.pdf
So, given the description/representation, I want to find some algorithm to decide if it is symmetric wrt the real axis, the imaginary axis, and rotation by 180 degrees. The last example has 180-degree symmetry.
Edit 2:
This is actually for my research. I have posted the question at mathoverflow as well,
but my days in competition programming tell me that this is more like an IOI task.
Code in mathematica would be excellent, but java, python, or any other language readable by a human suffices.
(These symmetries correspond to special kinds of potential in the Schroedinger equation,
which has nice properties in quantum mechanics.)

Could you please define better what you mean by symmetry of the tree?
You first say that
"The sectors live in the complex
plane, and for n even, sector 0 and
n/2 are bisected by the real axis, and
the sectors are evenly spaced."
and that you want to find symmetry
wrt real, imaginary, and rotation 180 degrees
I would then expect that the symmetries would be purely geometrical, but then you also say, in the comment to Justin's answer
"There is also not a canonical way to draw a tree,
and my drawing algorithm does not respect all possible
symmetries that a tree can have"
How can you look for geometrical symmetry if the position of the vertices of the tree cannot be uniquely defined on the plane? Furthermore in many of the plots you have given (N=6, even) sectors 0 and 3 are not bisected by the x axis (real axis), so I would deem your own drawings wrong.

Since you already have an algorithm to construct the point set for the tree, you only need to determine if the point set has flip symmetry. Ideally your set is computed symbolically (and left in terms of sin(theta), cos(theta)) for non rational points, which should be fine since you seem to be using Mathematica.
You now want to know if your point set has a symmetry about some axis, so represent the flip/rotation transformation as a matrix A, giving {x'} = A{x}. Sort the image set {x'} (using the expressions, not the numeric values), and compare it to the original point set {x}. If there is not a 1-1 correspondence, you don't have a symmetry; otherwise you do.
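I think there is a Mathematica function to find the unique expressions in a set. Union returns the sorted list of distinct elements, so something like Union[beforeImage] == Union[afterImage] should work (Unique itself generates fresh symbol names, which is not what you want here).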
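The comparison described above can be sketched numerically in Python. This is only an illustration under stated assumptions: the points are plain (x, y) floats rather than symbolic expressions, equality is decided by rounding to a tolerance `tol` (a swapped-in substitute for the symbolic comparison), and the names `has_symmetry`, `flip_real`, `flip_imag` and `rot180` are made up for this example.

```python
def has_symmetry(points, transform, tol=1e-9):
    """Return True if `transform` maps the point set onto itself (within tol)."""
    def key(p):
        # Snap to a grid of size tol so nearly equal points compare equal.
        return (round(p[0] / tol), round(p[1] / tol))

    before = sorted(key(p) for p in points)
    after = sorted(key(transform(p)) for p in points)
    return before == after


# The three transformations asked about in the question.
flip_real = lambda p: (p[0], -p[1])    # reflection in the real axis
flip_imag = lambda p: (-p[0], p[1])    # reflection in the imaginary axis
rot180 = lambda p: (-p[0], -p[1])      # rotation by 180 degrees

pts = [(-1, 0), (0, 0), (1, 0), (2, 1), (2, -1)]
print(has_symmetry(pts, flip_real), has_symmetry(pts, flip_imag))
```

Rounding can misjudge points that sit right on a grid boundary, so for exact results the symbolic route is still the safer one.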

I have not had time to implement this, perhaps someone here might take it further:
First partition the junctions by quadrant; this should produce 4 trees {Tpp, Tmp, Tmm, Tpm} (p for plus, m for minus). Checking for symmetry then looks like a directional breadth-first traversal:
It's been a while since I used Mathematica, so none of this will compile
CheckRealFlip[T_] := And[TraverseCompare[Tpp[T], Tpm[T]],
  TraverseCompare[Tmp[T], Tmm[T]]];
CheckImFlip[T_] := And[TraverseCompare[Tpp[T], Tmp[T]],
  TraverseCompare[Tpm[T], Tmm[T]]];
Where TraverseCompare checks the structure of the tree using a breadth-first traversal along one tree, and a reverse-order breadth-first traversal along the other tree (something like the following, but this won't work as written).
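(* Size and Children are placeholder helpers, not built-in functions *)
TraverseCompare[A_, B_] := Size[A] == Size[B] &&
  Length[Children[A]] == Length[Children[B]] &&
  And @@ MapThread[TraverseCompare, {Children[A], Reverse[Children[B]]}];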
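A different route, working directly on the junction representation from the question, is to relabel the sectors and see whether the tree maps to itself. The sketch below assumes n is even and sector k is centred at angle 2*pi*k/n, so reflection in the real axis sends sector k to -k mod n, reflection in the imaginary axis sends k to n/2 - k mod n, and the 180-degree rotation sends k to k + n/2 mod n. It also assumes a combinatorial match with equal edge lengths is what "symmetric" means here. The function names and the `lengths` format (distances keyed by pairs of junction indices) are invented for this example.

```python
def sector_maps(n):
    """Sector relabelings induced by the three symmetries (n even, assumed)."""
    half = n // 2
    return {
        "real": lambda k: (-k) % n,
        "imaginary": lambda k: (half - k) % n,
        "rotate180": lambda k: (k + half) % n,
    }


def symmetries(n, junctions, lengths):
    """junctions: tuples of sector labels; lengths: {(i, j): distance}
    for each pair of junction indices joined by an edge."""
    juncs = [frozenset(j) for j in junctions]
    edges = {frozenset((juncs[i], juncs[j])): d
             for (i, j), d in lengths.items()}
    found = []
    for name, f in sector_maps(n).items():
        image = lambda j: frozenset(f(k) for k in j)
        # The relabeled junctions must be the same set of junctions ...
        if {image(j) for j in juncs} != set(juncs):
            continue
        # ... and every edge must land on an edge of the same length.
        if all(edges.get(frozenset(image(j) for j in e)) == d
               for e, d in edges.items()):
            found.append(name)
    return found


# The two n=6 examples from the question, all distances 1.
print(symmetries(6, [(0, 1, 5), (1, 2, 4, 5), (2, 3, 4)],
                 {(0, 1): 1, (1, 2): 1}))
print(symmetries(6, [(0, 1, 5), (1, 2, 5), (2, 4, 5), (2, 3, 4)],
                 {(0, 1): 1, (1, 2): 1, (2, 3): 1}))
```

Distances are compared with exact equality here; with floating-point lengths a tolerance would be needed.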

Related

Algorithm for >2D skyline query/efficient frontier

The problem at hand:
given a set of N points in a D-dimensional space, with all their coordinates >= 0 (in 2D the points would all be in the 1st quadrant, in 3D in the 1st octant, and so on...), remove all the points for which another point has a value bigger or equal in every coordinate.
In 2D, the result is this:
(image from Vincent Zoonekynd's answer here) and there is a simple algorithm, detailed in that answer, that runs in N*log(N).
With chunking I should have brought it to N*log(H), but optimizations on that are for another question.
I was interested in extending the solution to 3 dimensions (and possibly 4, if it's still reasonable), but my current 3D algorithm is pretty slow, cumbersome and doesn't generalize to 4D nicely:
1. Sort points on the x axis, annotate the position of each point
2. Initialize a sort of segment tree with N leaves, where leaves will hold the points' y values and a node will hold max(child1, child2)
3. Sort points on the z axis
4. For every point from the largest z:
   - Check what position it was in the x order, try to put it in the segment tree in that position
   - Check first if there is a point already down (so it has > z), at a higher place (so it has > x) with a bigger y (this costs log(N), thanks tree)
   - If said point is found, the current point is discarded, otherwise it's inserted and the tree is updated
This still runs in N*log(N), but requires 2 different sorts and a 2*N-big structure.
Extending this would require another sort and a prohibitive 2*N^2-big quad tree.
Are there more efficient (especially CPU-wise) approaches?
I don't think it's relevant, but I'm writing in C, the code is here.
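For readers who want to see the 3D sweep described above in compact form, here is a minimal Python sketch. It is not the linked C code. It swaps the segment tree for a Fenwick (binary indexed) tree that answers "largest y among points with larger x" queries, and it assumes that no two points share a value on any single axis, so ties are not handled. The name `skyline_3d` is made up for this example.

```python
def skyline_3d(points):
    """Return the points not dominated by any other point (distinct coords assumed)."""
    n = len(points)
    # Rank of each point along x (0 = smallest x).
    order_x = sorted(range(n), key=lambda i: points[i][0])
    rank_x = [0] * n
    for r, i in enumerate(order_x):
        rank_x[i] = r

    # Fenwick tree over reversed x-ranks: a prefix max on reversed
    # positions is a max over all points with larger x.
    tree = [float("-inf")] * (n + 1)

    def update(pos, value):
        pos += 1
        while pos <= n:
            tree[pos] = max(tree[pos], value)
            pos += pos & -pos

    def prefix_max(count):
        best = float("-inf")
        while count > 0:
            best = max(best, tree[count])
            count -= count & -count
        return best

    kept = []
    # Sweep from the largest z; everything already inserted has larger z.
    for i in sorted(range(n), key=lambda i: points[i][2], reverse=True):
        rev = n - 1 - rank_x[i]
        if prefix_max(rev) > points[i][1]:
            continue  # some point has larger x, y and z: dominated
        kept.append(points[i])
        update(rev, points[i][1])
    return kept


pts = [(1, 1, 1), (2, 2, 2), (3, 0, 0), (0, 3, 0.5), (0.5, 4, 3)]
print(skyline_3d(pts))
```

The structure holds N + 1 values instead of roughly 2*N, but the two sorts remain, so this only trims constants rather than changing the N*log(N) bound.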

Two salesmen - one always visits the nearest neighbour, the other the farthest

Consider this question relative to graph theory:
Let G be a complete (every vertex is connected to all the other vertices) undirected graph with N vertices, described by an N x N distance matrix. Two "salesmen" travel this way: the first always visits the nearest unvisited vertex, the second the farthest, until they have both visited all the vertices. We must generate a matrix of distances and the starting points for the two salesmen (they can be different) such that:
All the distances are unique (Edit: and positive integers)
The distance from a vertex to itself is always 0.
The difference between the total distance covered by the two salesmen must be a specific number, D.
The distance from A to B is equal to the distance from B to A
What efficient algorithms can be useful to help me? I can only think of backtracking, but I don't see any way to reduce the work to be done by the program.

Geometry is helpful.
Using the distances of points on a circle seems like it would work. It seems like you could adjust D by making the circle radius larger or smaller.
Alternatively, really any 2D shape where the distances are all different could probably be used as well. In this case you should scale the shape up or down to obtain the correct D.
Edit: Now that I think about it, the simplest solution may be to simply pick N random 2D points, say 32 bit integer coordinates to lower the chances of any distances being too close to equal. If two distances are too close, just pick a different point for one of them until it's valid.
Ideally, you'd then just need to work out a formula to determine the relationship between D and the scaling factor, which I'm not sure of offhand. If nothing else, you could also just use binary search or interpolation search or something similar to search for the scaling factor that gives the required D, but that's a slower method.
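Whichever way the distance matrix is generated, it helps to have a quick check of the two totals. The sketch below is only a helper for that check, not a generator: `tour_length` is a made-up name, `dist` is assumed to be a symmetric matrix given as a list of lists, and passing the built-in `min` or `max` as `pick` selects the nearest-first or farthest-first salesman.

```python
def tour_length(dist, start, pick):
    """Total distance travelled when the salesman keeps moving to the
    unvisited vertex chosen by `pick` (min = nearest, max = farthest)."""
    unvisited = set(range(len(dist))) - {start}
    current, total = start, 0
    while unvisited:
        nxt = pick(unvisited, key=lambda v: dist[current][v])
        total += dist[current][nxt]
        unvisited.remove(nxt)
        current = nxt
    return total


# Small symmetric matrix with unique positive integer distances.
dist = [
    [0, 1, 2, 3],
    [1, 0, 4, 5],
    [2, 4, 0, 6],
    [3, 5, 6, 0],
]
nearest = tour_length(dist, 0, min)
farthest = tour_length(dist, 0, max)
print(nearest, farthest, farthest - nearest)
```

With a check like this, a candidate matrix and pair of starting points can be accepted or rejected against the target difference D in O(N^2) time per salesman.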

Finding all points in certain radius of another point

I am making a simple game and stumbled upon this problem. Assume several points in 2D space. What I want is to make points close to each other interact in some way.
Let me throw a picture here for better understanding of the problem:
Now, the problem isn't about computing the distance. I know how to do that.
At first I had around 10 points and I could simply check every combination, but as you can already assume, this becomes extremely inefficient as the number of points increases. What if I had a million points in total, but all of them were very distant from each other?
I'm trying to find a suitable data structure or a way to look at this problem, so every point can only mind their surrounding and not whole space. Are there any known algorithms for this? I don't exactly know how to name this problem so I can google exactly what I want.
If you don't know of such a known algorithm, all ideas are very welcome.

This is a range searching problem. More specifically - the 2-d circular range reporting problem.
Quoting from "Solving Query-Retrieval Problems by Compacting Voronoi Diagrams" [Aggarwal, Hansen, Leighton, 1990]:
Input: A set P of n points in the Euclidean plane E²
Query: Find all points of P contained in a disk in E² with radius r centered at q.
The best results were obtained in "Optimal Halfspace Range Reporting in Three Dimensions" [Afshani, Chan, 2009]. Their method requires an O(n)-space data structure that supports queries in O(log n + k) worst-case time. The structure can be preprocessed by a randomized algorithm that runs in O(n log n) expected time. (n is the number of input points, and k is the number of output points).
The CGAL library supports circular range search queries. See here.

You're still going to have to iterate through every point, but there are two optimizations you can perform:
1) You can eliminate obvious non-matches by checking whether the difference in x or the difference in y alone already exceeds the radius (like Brent already mentioned in another answer).
2) Instead of calculating the distance, you can calculate the square of the distance and compare it to the square of the allowed radius. This saves you from performing expensive square root calculations.
This is probably the best performance you're gonna get.

This looks like a nearest neighbor problem. You should be using the kd tree for storing the points.
https://en.wikipedia.org/wiki/K-d_tree

Space partitioning is what you want: https://en.wikipedia.org/wiki/Quadtree

If you could get those points to be sorted by x and y values, then you could quickly pick out those points (binary search?) which are within a box of the central point: x +- r, y +- r. Once you have that subset of points, then you can use the distance formula to see if they are within the radius.
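A rough Python sketch of this idea, with only the x coordinate handled by binary search and the y check done on the surviving slice. The function name `within_radius` is made up, and the point list is assumed to be pre-sorted as (x, y) tuples.

```python
from bisect import bisect_left, bisect_right


def within_radius(sorted_pts, q, r):
    """sorted_pts: list of (x, y) tuples sorted by x (then y)."""
    # Binary search for the slice with q.x - r <= x <= q.x + r.
    lo = bisect_left(sorted_pts, (q[0] - r, float("-inf")))
    hi = bisect_right(sorted_pts, (q[0] + r, float("inf")))
    return [
        p for p in sorted_pts[lo:hi]
        if abs(p[1] - q[1]) <= r                                # box test in y
        and (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= r * r    # exact test
    ]


pts = sorted([(0, 0), (1, 0), (0.5, 0.5), (3, 3), (1, 2)])
print(within_radius(pts, (0, 0), 1))
```

The slice can still be large when many points share a similar x, which is where the grid and tree structures from the other answers do better.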

I assume you have a minimum and maximum X and Y coordinate? If so how about this.
Call our radius R, Xmax-Xmin X, and Ymax-Ymin Y.
Have a 2D matrix of [X/R, Y/R] doubly linked lists. Put each dot structure on the correct linked list.
To find dots you need to interact with, you only need check your cell plus your 8 neighbors.
Example: if X and Y are 100 each, and R is 1, then put a dot at 43.2, 77.1 in cell [43,77]. You'll check cells [42,76] [43,76] [44,76] [42,77] [43,77] [44,77] [42,78] [43,78] [44,78] for matches. Note that not all cells in your own box will match (for instance 43.9,77.9 is in the same list but more than 1 unit distant), and you'll always need to check all 8 neighbors.
As dots move (it sounds like they'd move?) you'd simply unlink them (fast and easy with a doubly linked list) and relink them in their new location. Moving any dot is O(1). Moving them all is O(n).
If that array size gives too many cells, you can make bigger cells with the same algo and probably same code; just be prepared for fewer candidate dots to actually be close enough. For instance if R=1 and the map is a million times R by a million times R, you wouldn't be able to make a 2D array that big. Better perhaps to have each cell be 1000 units wide? As long as density was low, the same code as before would probably work: check each dot only against other dots in this cell plus the neighboring 8 cells. Just be prepared for more candidates failing to be within R.
If some cells will have a lot of dots, then instead of each cell having a linked list, perhaps the cell should have a red-black tree indexed by X coordinate? Even in the same cell the vast majority of other cell members will be too far away, so just traverse the tree from X-R to X+R. Rather than loop over all dots, and go diving into each one's tree, perhaps you could instead iterate through the tree looking for X coords within R and if/when you find them calculate the distance. As you traverse one cell's tree from low to high X, you need only check the tree of the neighboring cell to the left while in the first R entries.
You could also go to cells smaller than R. You'd have fewer candidates that fail to be close enough. For instance with R/2, you'd check 25 linked lists instead of 9, but have on average (if randomly distributed) 25/36ths as many dots to check. That might be a minor gain.
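A minimal Python sketch of the basic cell scheme, assuming cells of width R. It uses a dictionary keyed by cell index in place of the 2D matrix of linked lists, which avoids allocating empty cells on a huge sparse map. The names `build_grid` and `neighbors_within` are invented for this example.

```python
from collections import defaultdict
from math import floor


def build_grid(points, r):
    """Bucket point indices by the cell (floor(x/r), floor(y/r)) they fall in."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(floor(x / r), floor(y / r))].append(idx)
    return grid


def neighbors_within(points, grid, r, idx):
    """Indices of points within distance r of points[idx]."""
    x, y = points[idx]
    cx, cy = floor(x / r), floor(y / r)
    found = []
    # Only the point's own cell and its 8 neighbours can hold matches.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((cx + dx, cy + dy), ()):
                if j == idx:
                    continue
                px, py = points[j]
                # Compare squared distances to avoid the square root.
                if (px - x) ** 2 + (py - y) ** 2 <= r * r:
                    found.append(j)
    return found


pts = [(43.2, 77.1), (43.9, 77.9), (44.0, 77.0), (10.0, 10.0)]
grid = build_grid(pts, 1.0)
print(neighbors_within(pts, grid, 1.0, 0))
```

In the example, the dot at (43.9, 77.9) shares a cell with (43.2, 77.1) but is more than 1 unit away, while (44.0, 77.0) sits in a neighbouring cell and is close enough.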

finding saddle points in 3d heightmap

Given a 3d heightmap (from a laser scanner), how do I find the saddle points?
I.e. given something like this:
I am looking for all points where the curvature is positive in one direction and negative in the other.
(These directions should not need to be aligned with the X and Y axis.
I know how to check whether the curvature in X direction has the opposite sign as the curvature in Y direction, but that does not cover all cases. To make matters worse, the resolution in X is different from the resolution in Y)
Ideally I am looking for an algorithm that can tolerate some amount of noise and only mark "significant" saddle points.

I've been exploring a similar problem for a computational topology class and have had some success with the method outlined below.
First you will need a comparison function that will evaluate the height at two input points and will return < or > (not equal) for any input. One way to do this is that if the points are equal height you use some position-based or random index to find the greater point. You can think of this as adding an infinitesimal perturbation to the height.
Now, for each point, you will compare the height at all the surrounding neighbors (there will be 8 neighbors on a 2D rectangular grid). The lower link for a point will be the set of all neighbors for which the height is less than the point.
If all the neighboring values are in the lower link, you are at a local maximum. If none of the points are in the lower link you are at a local minimum. Otherwise, if the lower link is a single connected set, you are at a regular point on a slope. But if the lower link is two unconnected sets, you are at a saddle.
In 2D you can construct a list of the 8 neighboring points in cyclic order around the point you are checking. You assign a value of +/-1 to each neighbor depending on your comparison function. You can then step through that list (remember to compare the two end points) and count how many times the sign changes; the number of sign changes is twice the number of connected components in the lower link.
Determining which saddles are "important" is a more difficult analysis. You may wish to look at this: http://www.cs.jhu.edu/~misha/ReadingSeminar/Papers/Gyulassy08.pdf for some guidance.
-Michael
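A small Python sketch of the lower-link test, under a few assumptions: the heightmap is a list of rows, only interior points are classified, and ties are broken by grid position (one possible choice of the "infinitesimal perturbation"). The names `RING` and `classify` are made up for this example.

```python
# The 8 neighbours in cyclic order around the centre point.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def classify(height, r, c):
    """Classify interior grid point (r, c) as 'max', 'min', 'regular' or 'saddle'."""
    def lower(dr, dc):
        rr, cc = r + dr, c + dc
        # Ties are broken by position, so two points never compare as equal.
        return (height[rr][cc], rr, cc) < (height[r][c], r, c)

    signs = [lower(dr, dc) for dr, dc in RING]
    # Count sign changes around the ring; i - 1 wraps around for i = 0.
    changes = sum(signs[i] != signs[i - 1] for i in range(8))
    if changes == 0:
        return "max" if signs[0] else "min"
    # Two changes: one connected lower link. Four or more: at least two.
    return "regular" if changes == 2 else "saddle"


saddle = [[(c - 1) ** 2 - (r - 1) ** 2 for c in range(3)] for r in range(3)]
slope = [[c for c in range(3)] for r in range(3)]
print(classify(saddle, 1, 1), classify(slope, 1, 1))
```

This makes no attempt to handle noise; every tiny wiggle in the data registers, so filtering for "important" saddles still has to happen afterwards.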

(From a guess at the maths rather than practical experience)
Fit a quadratic to the surface in a small patch around each candidate point, e.g. with least squares. How big the patch is is one way of controlling noise, and you might gain by weighting points depending on their distance from the candidate point. In matrix notation, you can represent the quadratic as x'Ax + b'x + c, where A is symmetric.
The gradient of the quadratic is 2Ax + b, so it is zero at x = -(A^-1)b/2. If this is not within the patch, discard it.
If A has both +ve and -ve eigenvalues you have a saddle point at x. Since A is only 2x2 and so has at most two eigenvalues, you can ignore the case when it has a zero eigenvalue and so you couldn't invert it at the previous stage.
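Once the quadratic has been fitted, the remaining steps are a few lines of arithmetic. The sketch below skips the least-squares fit and assumes its output is already available as coefficients of z = cxx*x^2 + cxy*x*y + cyy*y^2 + cx*x + cy*y + const. The function and parameter names are made up. It relies on the fact that a symmetric 2x2 matrix has eigenvalues of opposite sign exactly when its determinant is negative.

```python
def quadratic_saddle(cxx, cxy, cyy, cx, cy):
    """Analyse z = cxx*x^2 + cxy*x*y + cyy*y^2 + cx*x + cy*y + const.

    Returns (x0, y0, is_saddle), or None when A is singular.
    """
    # A = [[cxx, cxy/2], [cxy/2, cyy]] and b = (cx, cy).
    det = cxx * cyy - (cxy / 2) ** 2
    if det == 0:
        return None  # zero eigenvalue: no isolated stationary point
    # The stationary point solves 2*A*x + b = 0, i.e. x = -(A^-1) b / 2.
    x0 = -(cyy * cx - (cxy / 2) * cy) / (2 * det)
    y0 = -(cxx * cy - (cxy / 2) * cx) / (2 * det)
    # Eigenvalues of opposite sign <=> negative determinant.
    return x0, y0, det < 0


# z = x^2 - y^2 - 2x + 4y has its stationary point at (1, 2), a saddle.
print(quadratic_saddle(1, 0, -1, -2, 4))
```

The caller still has to check that (x0, y0) lies inside the patch used for the fit, as described above.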

How to select points at a regular density

How do I select a subset of points at a regular density? More formally,
Given
a set A of irregularly spaced points,
a metric of distance dist (e.g., Euclidean distance),
and a target density d,
how can I select a smallest subset B that satisfies the following?
for every point x in A,
there exists a point y in B
which satisfies dist(x,y) <= d
My current best shot is to
start with A itself
pick out the closest (or just particularly close) couple of points
randomly exclude one of them
repeat as long as the condition holds
and repeat the whole procedure for best luck. But are there better ways?
I'm trying to do this with 280,000 18-D points, but my question is in general strategy. So I also wish to know how to do it with 2-D points. And I don't really need a guarantee of a smallest subset. Any useful method is welcome. Thank you.
bottom-up method
select a random point
select among unselected y for which min(d(x,y) for x in selected) is largest
keep going!
I'll call it bottom-up and the one I originally posted top-down. This is much faster in the beginning, so for sparse sampling this should be better?
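The bottom-up method can be sketched in Python as follows. This is not the code from the linked archive. It assumes Euclidean distance, and it adds a stopping rule that the description leaves open: stop as soon as every point lies within d of a selected point. The name `farthest_point_sampling` is made up for this example.

```python
import math
import random


def farthest_point_sampling(points, d, seed=0):
    """Greedily add the point farthest from the current selection until
    every point is within d of some selected point. Returns indices."""
    rng = random.Random(seed)
    first = rng.randrange(len(points))
    selected = [first]
    # nearest[i] = distance from point i to its closest selected point.
    nearest = [math.dist(p, points[first]) for p in points]
    while True:
        far = max(range(len(points)), key=nearest.__getitem__)
        if nearest[far] <= d:
            return selected
        selected.append(far)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[far]))


pts = [(0, 0), (1, 0), (2, 0), (5, 5), (5, 6)]
sel = farthest_point_sampling(pts, 1.5)
print(sel)
```

Each added point costs one pass over all points, so selecting k points takes O(k * N) distance evaluations, which fits the observation that this method is fast while the selection is still sparse.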
performance measure
If guarantee of optimality is not required, I think these two indicators could be useful:
radius of coverage: max {y in unselected} min(d(x,y) for x in selected)
radius of economy: min {y in selected} min(d(x,y) for x in selected, x != y)
RC is minimum allowed d, and there is no absolute inequality between these two. But RC <= RE is more desirable.
my little methods
For a little demonstration of that "performance measure," I generated 256 2-D points distributed uniformly or by standard normal distribution. Then I tried my top-down and bottom-up methods with them. And this is what I got:
RC is red, RE is blue. X axis is number of selected points. Did you think bottom-up could be as good? I thought so watching the animation, but it seems top-down is significantly better (look at the sparse region). Nevertheless, not too horrible given that it's much faster.
Here I packed everything.
http://www.filehosting.org/file/details/352267/density_sampling.tar.gz

You can model your problem with a graph: treat the points as nodes, and connect two nodes with an edge if their distance is smaller than d. Now you should find the minimum number of vertices such that they, together with their connected vertices, cover all nodes of the graph. This is the minimum vertex cover problem (which is NP-hard in general), but you can use a fast 2-approximation: repeatedly take both endpoints of an edge into the vertex cover, then remove them from the graph.
P.S.: Of course, you should first select the nodes that are fully disconnected from the graph. After removing these nodes (that is, selecting them), your problem is vertex cover.

A genetic algorithm may well produce good results here.
update:
I have been playing a little with this problem and these are my findings:
A simple method (call it random-selection) to obtain a set of points fulfilling the stated condition is as follows:
1. start with B empty
2. select a random point x from A and place it in B
3. remove from A every point y such that dist(x, y) < d
4. while A is not empty go to 2
A kd-tree can be used to perform the look ups in step 3 relatively fast.
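The random-selection steps can be sketched in a few lines of Python. This is an illustration rather than the Perl implementation mentioned further down: it assumes Euclidean distance and d > 0, replaces the kd-tree lookup with a plain linear scan to stay self-contained, and uses the made-up name `random_selection`.

```python
import math
import random


def random_selection(points, d, seed=0):
    """Pick random points, each time discarding everything closer than d."""
    rng = random.Random(seed)
    remaining = list(points)
    chosen = []
    while remaining:
        x = rng.choice(remaining)
        chosen.append(x)
        # x itself is dropped too, since dist(x, x) = 0 < d.
        remaining = [y for y in remaining if math.dist(x, y) >= d]
    return chosen


pts = [(0, 0), (0.2, 0.1), (1, 1), (1.1, 0.9), (3, 3)]
picked = random_selection(pts, 0.5)
print(picked)
```

Two properties follow directly from the construction: every input point is closer than d to some chosen point, and any two chosen points are at least d apart.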
The experiments I have run in 2D show that the subsets generated are approximately half the size of the ones generated by your top-down approach.
Then I have used this random-selection algorithm to seed a genetic algorithm that resulted in a further 25% reduction on the size of the subsets.
For mutation, given a chromosome representing a subset B, I randomly choose a hyperball inside the minimal axis-aligned hyperbox that covers all the points in A. Then, I remove from B all the points that are also in the hyperball and use the random-selection to complete it again.
For crossover I employ a similar approach, using a random hyperball to divide the mother and father chromosomes.
I have implemented everything in Perl using my wrapper for the GAUL library (GAUL can be obtained from here).
The script is here: https://github.com/salva/p5-AI-GAUL/blob/master/examples/point_density.pl
It accepts a list of n-dimensional points from stdin and generates a collection of pictures showing the best solution for every iteration of the genetic algorithm. The companion script https://github.com/salva/p5-AI-GAUL/blob/master/examples/point_gen.pl can be used to generate the random points with a uniform distribution.

Here is a proposal which assumes the Chebyshev (maximum-coordinate) distance metric, since two points in the same grid cell differ by less than d in every coordinate:
1. Divide up the entire space into a grid of granularity d. Formally: partition A so that points (x1,...,xn) and (y1,...,yn) are in the same partition exactly when (floor(x1/d),...,floor(xn/d))=(floor(y1/d),...,floor(yn/d)).
2. Pick one point (arbitrarily) from each grid space -- that is, choose a representative from each set in the partition created in step 1. Don't worry if some grid spaces are empty! Simply don't choose a representative for this space.
Actually, the implementation won't have to do any real work to do step one, and step two can be done in one pass through the points, using a hash of the partition identifier (the (floor(x1/d),...,floor(xn/d))) to check whether we have already chosen a representative for a particular grid space, so this can be very, very fast.
Some other distance metrics may be able to use an adapted approach. For example, the Euclidean metric could use d/sqrt(n)-size grids. In this case, you might want to add a post-processing step that tries to reduce the cover a bit (since the grids described above are no longer exactly radius-d balls -- the balls overlap neighboring grids a bit), but I'm not sure how that part would look.
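The one-pass hashing version can be sketched in Python like this. The name `grid_representatives` is made up, and the first point seen in each cell is kept as the representative (an arbitrary choice, as the description allows).

```python
from math import floor


def grid_representatives(points, d):
    """Keep the first point seen in each grid cell of side d."""
    reps = {}
    for p in points:
        # The cell identifier doubles as the hash key.
        cell = tuple(floor(c / d) for c in p)
        reps.setdefault(cell, p)
    return list(reps.values())


pts = [(0.1, 0.1), (0.9, 0.9), (1.1, 0.1), (2.5, 2.5)]
print(grid_representatives(pts, 1.0))
```

Every point shares a cell with its representative, so the two differ by less than d in each coordinate. Passing d/sqrt(n) as the cell size gives the Euclidean variant mentioned above.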

To be lazy, this can be cast as a set cover problem, which can be handled by mixed-integer programming solvers/optimizers. Here is a GNU MathProg model for the GLPK LP/MIP solver. Here C[i] denotes the set of points that can "satisfy" point i.
param N, integer, > 0;
set C{1..N};
var x{i in 1..N}, binary;
s.t. cover{i in 1..N}: sum{j in C[i]} x[j] >= 1;
minimize goal: sum{i in 1..N} x[i];
With 1000 normally distributed points, it didn't find the optimum subset in 4 minutes, but it said it knew the true minimum and it selected only one more point.
