Eigenvalues of large symmetric matrices

When I try to compute the eigenvalues of the adjacency matrix of a very large graph, I get what can charitably be described as garbage. In particular, since the graph is four-regular, the eigenvalues should lie in $[-4, 4]$, but they visibly do not. I used MATLAB (via MATLink) and got the same problems, so this is clearly an issue that transcends Mathematica. The question is: what is the best way to deal with it? I am sure MATLAB and Mathematica use the venerable EISPACK code, so there may be something newer/better.

Eigenvalue methods for dense matrices usually proceed by first transforming the matrix into Hessenberg form; for a symmetric matrix this results in a tridiagonal matrix. After that, some variant of the shifted QR algorithm, such as bulge chasing, is applied to iteratively reduce the off-diagonal elements, splitting the matrix at positions where these become small enough.
But what I would like to draw attention to is that first step and its structure-destroying consequences. It is, for instance, not guaranteed that the computed tridiagonal matrix is still symmetric. The same applies to all further steps if they are not explicitly tailored to symmetric matrices.
What is much more relevant here, though, is that this step ignores any connectivity or non-connectivity of the graph and potentially connects all nodes, albeit with very small weights, when the transformation is reversed.
Each of the $m$ connected components of the graph contributes one eigenvalue $4$, with an eigenvector that is $1$ at the nodes of that component and $0$ elsewhere. These eigenspaces each have dimension $1$. Any small perturbation of the matrix first removes that separation, joining them into a single eigenspace of dimension $m$, and then perturbs this as a multiple eigenvalue. This can result in an approximately regular $m$-pointed star in the complex plane, of radius $4\cdot(10^{-15})^{1/m}$, around the original value $4$. Even for medium-sized $m$ this gives a substantial deviation from the true eigenvalue.
So in summary: use a sparse method, as these usually first re-order the matrix to be as close to block-diagonal as possible, which should recover the block structure corresponding to the connected components. The eigenvalue method then automatically works on each block separately, avoiding the mixing described above. And if possible, use a method for symmetric matrices, or set the corresponding option/flag if one exists.
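For illustration, here is a minimal sketch of that advice in Python/SciPy (my own toy graph; scipy.sparse.linalg.eigsh is a Lanczos-type solver for symmetric sparse matrices):

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import eigsh  # symmetric sparse eigensolver

    # Toy 4-regular graph with two connected components: two copies of K5,
    # in which every vertex has degree 4.
    k5 = sp.csr_matrix(np.ones((5, 5)) - np.eye(5))
    A = sp.block_diag([k5, k5], format="csr")

    # eigsh exploits symmetry and sparsity and never densifies the matrix,
    # so the computed eigenvalues stay real and the components stay separate.
    vals = eigsh(A, k=4, which="LA", return_eigenvectors=False)
    print(vals)  # the eigenvalue 4 appears once per connected component

Note that eigsh cannot return all eigenvalues (k must be smaller than the matrix dimension), which is usually fine for large graphs, where only the extreme eigenvalues are of interest anyway.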


How to break a geometry into blocks?

I am certain there is already some algorithm that does what I need, but I am not sure what phrase to Google, or what the algorithm category is.
Here is my problem: I have a polyhedron made up of several touching blocks (hyperslabs), i.e. the edges are axis-aligned and the angles between edges are 90°. There may be holes inside the polyhedron.
I want to break up this concave polyhedron into as few convex rectangular axis-aligned whole blocks as possible (if the original polyhedron is convex and has no holes, then it is already such a block, and therefore the solution). To illustrate, here are some 2D images I made (but I need the solution for 3D, and preferably N-D):
I have this geometry:
One possible breakup into blocks is this:
But the one I want is this (with as few blocks as possible):
I have the impression that an exact algorithm may be too expensive (is this problem NP-hard?), so an approximate algorithm would also be suitable.
One detail that may make the problem easier, so that there could be a more appropriate/specialized algorithm for it, is that all edge lengths are multiples of some fixed value (you may think of all edge lengths as integers, or of the geometry as being made up of uniform tiny squares, or voxels).
Background: this is the structured grid discretization of a PDE domain.
What algorithm can solve this problem? What class of algorithms should I search for?
Update: Before you upvote this answer, I want to point out that it is slightly off-topic. The original poster has a question about the decomposition of a polyhedron whose faces are axis-aligned: given such a polyhedron, decompose it into convex parts, in 3D and possibly nD. My answer is about the decomposition of a general polyhedron. So where I point to an implementation, that implementation applies to the special case of axis-aligned polyhedra, but there may exist a better implementation specialized for them. And where my answer says that a problem for generic polyhedra is NP-complete, there might still exist a polynomial solution for the axis-aligned special case. I do not know.
Now here is my (slightly off-topic) answer, below the horizontal rule...
The CGAL C++ library has an algorithm that, given a 2D polygon, can compute the optimal convex decomposition of that polygon. The method is described in the 2D Polygon Partitioning chapter of the manual and is named CGAL::optimal_convex_partition_2. I quote the manual:
This function provides an implementation of Greene's dynamic programming algorithm for optimal partitioning [2]. This algorithm requires $O(n^4)$ time and $O(n^3)$ space in the worst case.
In the bibliography of that CGAL chapter, the article [2] is:
[2] Daniel H. Greene. The decomposition of polygons into convex parts. In Franco P. Preparata, editor, Computational Geometry, volume 1 of Adv. Comput. Res., pages 235–259. JAI Press, Greenwich, Conn., 1983.
It seems to be exactly what you are looking for.
Note that the same chapter of the CGAL manual also mentions an approximation algorithm, hence not optimal, that runs in $O(n)$: CGAL::approx_convex_partition_2.
Edit, about the 3D case:
In 3D, CGAL has another chapter about Convex Decomposition of Polyhedra. The second paragraph of the chapter says "this problem is known to be NP-hard [1]". The reference [1] is:
[1] Bernard Chazelle. Convex partitions of polyhedra: a lower bound and worst-case optimal algorithm. SIAM J. Comput., 13:488–507, 1984.
CGAL has a method CGAL::convex_decomposition_3 that computes a non-optimal decomposition.
I have the feeling your problem is NP-hard. I suggest a first step might be to break the figure into sub-rectangles along all hyperplanes; in your example there would be three hyperplanes (lines) and four resulting rectangles. The problem then becomes one of recombining rectangles into larger rectangles so as to minimize the final number of rectangles. Maybe 0-1 integer programming?
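A sketch of that recombination idea as a 0-1 integer program (plain Python with the pulp solver, which is just one choice; the example shape and variable names are mine): enumerate every axis-aligned rectangle made entirely of filled unit cells, then require every cell to be covered exactly once while minimizing the number of chosen rectangles.

    import itertools
    import pulp  # any 0-1 ILP solver would do

    # Filled unit cells of the rectilinear figure (an L-shape here).
    cells = {(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)}

    # Candidate blocks: every axis-aligned rectangle of filled cells.
    rects = []
    for (x0, y0), (x1, y1) in itertools.product(cells, repeat=2):
        if x0 <= x1 and y0 <= y1:
            body = {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}
            if body <= cells:
                rects.append(body)

    prob = pulp.LpProblem("min_block_cover", pulp.LpMinimize)
    use = [pulp.LpVariable(f"r{i}", cat="Binary") for i in range(len(rects))]
    prob += pulp.lpSum(use)            # minimize the number of blocks
    for c in cells:                    # every cell lies in exactly one block
        prob += pulp.lpSum(use[i] for i, r in enumerate(rects) if c in r) == 1
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print([r for i, r in enumerate(rects) if use[i].value() == 1])

The number of candidate rectangles can grow quickly, so this is only practical for modest grids, but it makes the "recombine to minimize" formulation concrete.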
I think dynamic programming might be your friend.
The first step I see is to divide the polyhedron into a trivial collection of blocks such that every possible face is available (i.e. slice and dice it into the smallest pieces possible). This should be easy because everything is an axis-aligned box, so k-tree-like solutions should be sufficient.
This seems reasonable because I can look at its cost. The cost of doing this is that I "forget" the original configuration of hyperslabs, choosing to replace it with a new set of hyperslabs. The only way this could lead me astray is if the original configuration had something to offer for the solution. Given that you want an "optimal" solution for all configurations, we have to assume that the original structure isn't very helpful. I don't know if it can be proven that this original information is useless, but I'm going to make that assumption in this answer.
The problem has now been reduced to a graph problem similar to a constrained spanning forest problem. I think the most natural way to view the problem is to think of it as a graph coloring problem (as long as you can avoid confusing it with the more famous graph coloring problem of trying to color a map without two states of the same color sharing a border). I have a graph of nodes (small blocks), each of which I wish to assign a color (which will eventually be the "hyperslab" which covers that block). I have the constraint that I must assign colors in hyperslab shapes.
Now a key observation is that not all possibilities must be considered. Take the final colored graph we want to see. We can partition this graph in any way we please by breaking any hyperslab which crosses the partition into two pieces. However, not every partition is meaningful. The only partitions that make sense are axis aligned cuts, which always break a hyperslab into two hyperslabs (as opposed to any more complicated shape which could occur if the cut was not axis aligned).
Now this cutting is the reverse of the problem we're really trying to solve: it is exactly what we did in the first step, whereas what we want is the optimal merging, undoing those cuts. However, it reveals a key feature we will use in the dynamic programming: the only features that matter for merging are on the exposed surface of a cut. Once we have found the optimal way of forming a central region, its interior plays no further part in the algorithm.
So let's start by building a collection of hyperslab-spaces, which can define not just a plain hyperslab, but any configuration of hyperslabs such as those with holes. Each hyperslab-space records:
The number of leaf hyperslabs contained within it (this is the number we are eventually going to try to minimize)
The internal configuration of hyperslabs.
A map of the surface of the hyperslab-space, which can be used for merging.
We then define a "merge" rule to turn two or more adjacent hyperslab-spaces into one:
Hyperslab-spaces may only be combined into new hyperslab-spaces (so you need to combine enough pieces to create a new hyperslab, not some more exotic shape)
Merges are done simply by comparing the surfaces. If there are features with matching dimensionalities, they are merged (because it is trivial to show that, if the features match, it is always better to merge hyperslabs than not to)
Now this is enough to solve the problem by brute force, though certainly at exponential cost. However, we can add an additional rule which drops this cost dramatically: "One hyperslab-space is deemed 'better' than another if they cover the same space and have exactly the same features on their surface. In this case, the one with fewer hyperslabs inside it is the better choice."
Now the idea here is that, early on in the algorithm, you will have to keep track of all sorts of combinations, just in case they are the most useful. However, as the merging algorithm makes things bigger and bigger, it will become less likely that internal details will be exposed on the surface of the hyperslab-space. Consider
+===+===+===+---+---+---+---+
| : : A | X : : : :
+---+---+---+---+---+---+---+
| : : B | Y : : : :
+---+---+---+---+---+---+---+
| : : | : : : :
+===+===+===+ +---+---+---+
Take a look at the box on the left, which I have taken the liberty of marking with stronger lines. When it comes to merging this box with the rest of the world, the AB:XY surface is all that matters. As such, there are only a handful of merge patterns which can occur at this surface:
No merges possible
A:X allows merging, but B:Y does not
B:Y allows merging, but A:X does not
Both A:X and B:Y allow merging (two independent merges)
We can merge a larger square, AB:XY
There are many ways to cover the 3x3 square (at least a few dozen). However, we only need to remember the best way to achieve each of those merge processes. Thus once we reach this point in the dynamic programming, we can forget about all of the other combinations that can occur, and only focus on the best way to achieve each set of surface features.
In fact, this sets up the problem for an easy greedy algorithm which explores whichever merges provide the best promise for decreasing the number of hyperslabs, always remembering the best way to achieve a given set of surface features. When the algorithm is done merging, whatever that final hyperslab-space contains is the optimal layout.
I don't know if it is provable, but my gut instinct says this will be an O(n^d) algorithm, where d is the number of dimensions. I think the worst case for this would be a collection of hyperslabs which, when put together, form one big hyperslab. In this case, I believe the algorithm will eventually work its way into the reverse of a k-tree algorithm. Again, no proof is given... it's just my gut instinct.
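To make the merging direction concrete, here is a much-simplified greedy sketch in Python (my own toy version, not the surface-feature dynamic program described above): repeatedly claim the largest axis-aligned rectangle of still-uncovered cells as one block.

    def largest_rectangle(cells):
        # Brute force: largest axis-aligned rectangle lying inside `cells`.
        best = set()
        for x0, y0 in cells:
            for x1, y1 in cells:
                if x0 <= x1 and y0 <= y1:
                    rect = {(x, y) for x in range(x0, x1 + 1)
                                   for y in range(y0, y1 + 1)}
                    if rect <= cells and len(rect) > len(best):
                        best = rect
        return best

    def greedy_blocks(cells):
        # Heuristic cover; easy to reason about, not guaranteed optimal.
        cells, blocks = set(cells), []
        while cells:
            rect = largest_rectangle(cells)
            blocks.append(rect)
            cells -= rect
        return blocks

    # An L-shaped example: greedy finds a 2-block cover here.
    print(greedy_blocks({(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)}))

Greedy can be arbitrarily worse than optimal on adversarial shapes, which is exactly the gap the remembered best-per-surface-feature table is meant to close.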
You can try a constrained Delaunay triangulation. It gives very few triangles.
Are you able to determine the equations for each line?
If so, maybe you can compute the intersection points between those lines. Then take one axis and sweep along it, looking for a value shared by more than two points; there you should "draw" a line. (At the beginning of the sweep there will be zero points, then two (your first pair), and when you find more than two points you will be able to determine which points belong to the first polygon and which to the second.)
Eg, if you have those lines:
verticals (red):
x = 0, x = 2, x = 5
horizontals (yellow):
y = 0, y = 2, y = 3, y = 5
and you start to sweep along the X axis, you will first get p1 and p2 (and we know to which line equation they belong); then you will get p3, p4, p5 and p6. Here you can check which of those points share a line with p1 and p2, in this case p4 and p5. So your first new polygon is p1, p2, p4, p5.
Now we save the "new" pair of points (p3, p6) and continue the sweep until the next points. Here we have p7, p8, p9 and p10; looking for the points which share a line with the previous points (p3 and p6), we get p7 and p10. Those are the points of your second polygon.
When we repeat the exercise for the Y axis, we will first get two points (p3, p7) and then just three (p1, p2, p8). In this case we should use the farthest point (p8) on the same line as the newly discovered point.
Since we are using line equations and points, in two or more dimensions the procedure should be very similar.
I hope this helps!

CUDA Thrust find near neighbor points

In my problem, there are N points in the domain, distributed more or less randomly. For each point I need to find all neighbor points with distance less than a given double-precision floating-point number, DIST.
Is there an efficient way to do this in Thrust?
In serial, I would use a neighborhood table and hope to achieve approximately O(n) instead of the naive O(n^2) algorithm.
I have found a Thrust example for 2D bucket sort, which is a perfect fit for the first part of my problem. But that is not enough, because for each bucket I need to find all points in the neighboring buckets, then compute their distances and see whether any of them is less than DIST. Finding neighbors and computing distances should be relatively easy, but adding those eligible points to a result array seems really difficult for me to implement in Thrust.
A way to rephrase this particular problem is this: I have two 2D arrays A1 and A2; the column number represents the index of the 2D bucket, and each column has a different number of elements that are indices of my points. Each element in column i of A1 will form a potential pair with each element in column i of A2, and all eligible pairs should be recorded to a result array.
I could use a CUDA kernel and allocate tons of potentially unused memory as a workaround, but that would be the last thing I would want to do.
Thanks in advance.
The full solution is beyond the scope of a single Stack Overflow answer, but there is a discussion of how to use Thrust to build a 2D spatial index in this repository:
https://github.com/jaredhoberock/thrust-workshop
Another possibility, simpler than creating a quad-tree, is using a neighborhood matrix.
First place all your points into a 2D square matrix (or a 3D cubic grid, if you are dealing with three dimensions). Then you can run a full or partial spatial sort, so that points become ordered inside the matrix.
Points with small Y would move to the top rows of the matrix, and likewise, points with large Y would go to the bottom rows. The same happens with points with small X coordinates, which should move to the columns on the left; symmetrically, points with large X values will go to the right columns.
After you have done the spatial sort (there are many ways to achieve this, with either serial or parallel algorithms), you can look up the nearest points of a given point P by just visiting the cells adjacent to the one where P is actually stored in the neighborhood matrix.
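Here is the core of that idea as a serial Python sketch (the names and structure are mine; a Thrust version would express the bucketing step with, e.g., thrust::sort_by_key over computed cell indices): points are hashed into cells of side DIST, so all neighbors of a point lie in the 3x3 block of cells around it.

    from collections import defaultdict
    from math import floor, hypot

    def neighbors_within(points, DIST):
        # Bucket each point index by its grid cell of side DIST.
        grid = defaultdict(list)
        for i, (x, y) in enumerate(points):
            grid[(floor(x / DIST), floor(y / DIST))].append(i)

        # Any point closer than DIST must be in one of the 9 nearby cells.
        result = defaultdict(list)
        for i, (x, y) in enumerate(points):
            cx, cy = floor(x / DIST), floor(y / DIST)
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    for j in grid[(cx + dx, cy + dy)]:
                        if j != i and hypot(x - points[j][0], y - points[j][1]) < DIST:
                            result[i].append(j)
        return dict(result)

    print(neighbors_within([(0.0, 0.0), (0.5, 0.1), (3.0, 3.0)], 1.0))
    # {0: [1], 1: [0]}

For roughly uniform point distributions this does O(1) work per point, matching the O(n) hope from the question.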
If this matrix is placed in texture memory, you can use all the spatial caching from CUDA to get very fast access to all neighbors!
You can read more details for this idea in the following paper (you will find PDF copies of it online): Supermassive Crowd Simulation on GPU based on Emergent Behavior.
The sorting step gives you interesting choices. You can use just the even-odd transposition sort described in the paper, which is very simple to implement (even in CUDA). If you run just one pass of it, you get a partial sort, which can already be useful if your matrix is nearly sorted. That is, if your points move slowly, it will save you a lot of computation.
If you need a full sort, you can run such even-odd transposition passes several times (as described in the following Wikipedia page):
http://en.wikipedia.org/wiki/Odd%E2%80%93even_sort
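A single pass of the even-odd transposition sort is only a few lines; each pass compares disjoint pairs, which is why it parallelizes so well on a GPU. A serial Python sketch (running n passes gives the full sort, fewer passes the partial sort mentioned above):

    def odd_even_transposition_sort(a, passes=None):
        n = len(a)
        passes = n if passes is None else passes
        for p in range(passes):
            # Alternate between even and odd pairings; the compare-swaps
            # within one pass are independent and could run in parallel.
            for i in range(p % 2, n - 1, 2):
                if a[i] > a[i + 1]:
                    a[i], a[i + 1] = a[i + 1], a[i]
        return a

    print(odd_even_transposition_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]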
There is a second paper from the same authors, describing an extension to 3D that uses three passes of the bitonic sort (which is highly parallel, but not a spatial sort). They claim it is both more precise than a single even-odd transposition pass and more efficient than a full sort. The paper is A Neighborhood Grid Data Structure for Massive 3D Crowd Simulation on GPU.

Determine whether the two classes are linearly separable (algorithmically in 2D)

There are two classes, let's call them X and O. A number of elements belonging to these classes are spread out in the xy-plane. Here is an example where the two classes are not linearly separable. It is not possible to draw a straight line that perfectly divides the Xs and the Os on each side of the line.
How can one determine, in general, whether the two classes are linearly separable? I am interested in an algorithm where no assumptions are made regarding the number of elements or their distribution. An algorithm of the lowest possible computational complexity is of course preferred.
If you find the convex hull of the X points and of the O points separately (i.e. you have two separate convex hulls at this stage), you then just need to check whether any segments of the hulls intersect or whether either hull is enclosed by the other.
If the two hulls are found to be totally disjoint, the two data sets are geometrically separable.
Since the hulls are convex by definition, any separator would be a straight line.
There are efficient algorithms both for finding the convex hull (qhull is based on an $O(n \log n)$ quickhull approach, I think) and for performing line-line intersection tests on a set of segments (sweep line at $O(n \log n)$), so overall an efficient $O(n \log n)$ algorithm should be possible.
This type of approach should also generalise to general k-way separation tests (where you have k groups of objects) by forming the convex hull and performing the intersection tests for each group.
It should also work in higher dimensions, although the intersection tests would start to become more challenging...
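A quick sketch of this test using the shapely library (one convenient choice; scipy.spatial.ConvexHull plus explicit segment intersection tests would work just as well):

    from shapely.geometry import MultiPoint

    def linearly_separable(xs, os):
        # Disjoint compact convex sets can be separated by a straight line
        # (separating hyperplane theorem), so it suffices to test whether
        # the two convex hulls intersect.
        hull_x = MultiPoint(xs).convex_hull
        hull_o = MultiPoint(os).convex_hull
        return not hull_x.intersects(hull_o)  # intersects() also covers enclosure

    print(linearly_separable([(0, 0), (1, 0)], [(0, 2), (1, 2)]))  # True
    print(linearly_separable([(0, 0), (2, 2)], [(0, 2), (2, 0)]))  # False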
Hope this helps.
Computationally, the most effective way to decide whether two sets of points are linearly separable is by applying linear programming. GLPK is perfect for that purpose, and pretty much every high-level language offers an interface for it: R, Python, Octave, Julia, etc.
Let's say you have a set of points $A$ and a set of points $B$. Then you have to minimize $0$ subject to the conditions
$w \cdot a + c \ge 1$ for all $a \in A$, and $w \cdot b + c \le -1$ for all $b \in B$,
where the unknowns are the vector $w$ and the scalar $c$ (stacked together, these conditions form the constraint matrix of the linear program; that matrix is not the set of points $A$ from above).
"Minimizing $0$" effectively means that you don't need to actually optimize an objective function, because this is not necessary to find out whether the sets are linearly separable.
In the end, $(w, c)$ defines the separating plane.
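The same feasibility check in Python with scipy.optimize.linprog (my own variable names; any LP interface works similarly):

    import numpy as np
    from scipy.optimize import linprog

    def separable(A_pts, B_pts):
        # Feasibility LP: find w, c with w.a + c >= 1 on A and
        # w.b + c <= -1 on B; the objective is the constant 0.
        A_pts, B_pts = np.asarray(A_pts, float), np.asarray(B_pts, float)
        d = A_pts.shape[1]
        A_ub = np.vstack([np.hstack([-A_pts, -np.ones((len(A_pts), 1))]),
                          np.hstack([ B_pts,  np.ones((len(B_pts), 1))])])
        b_ub = -np.ones(len(A_pts) + len(B_pts))
        res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                      bounds=[(None, None)] * (d + 1))  # w, c are free
        return res.success  # feasible iff the sets are linearly separable

    print(separable([(0, 0), (1, 0)], [(0, 2), (1, 2)]))  # True
    print(separable([(0, 0), (2, 2)], [(0, 2), (2, 0)]))  # False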
In case you are interested in a working example in R or the math details, then check this out.
Here is a naive algorithm that I'm quite sure will work (and, if so, shows that the problem is not NP-complete, as another post claims), but I wouldn't be surprised if it can be done more efficiently: if a separating line exists, it will be possible to translate and rotate it until it passes through two of the points. Therefore, we can simply look at all the possible lines that pass through two of the points and see whether any of them is a dividing line. So, for each of the $O(n^2)$ pairs, iterate over all the $n-2$ other elements to see whether all the X's are on one side and all the O's on the other. Total time complexity: $O(n^3)$.
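A direct sketch of this brute-force check (points lying on the candidate line itself are counted as "weak" separation, the usual convention when the hulls may touch):

    from itertools import combinations

    def separable_bruteforce(xs, os):
        pts = [(p, 'X') for p in xs] + [(p, 'O') for p in os]
        for ((x1, y1), _), ((x2, y2), _) in combinations(pts, 2):
            # Signed side of the line through the two chosen points.
            side = lambda p: (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            sx = [side(p) for p, lab in pts if lab == 'X']
            so = [side(p) for p, lab in pts if lab == 'O']
            if max(sx) <= 0 <= min(so) or max(so) <= 0 <= min(sx):
                return True
        return False

    print(separable_bruteforce([(0, 0), (1, 0)], [(0, 2), (1, 2)]))  # True
    print(separable_bruteforce([(0, 0), (2, 2)], [(0, 2), (2, 0)]))  # False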
A linear perceptron is guaranteed to find such a separation if one exists (note, though, that the algorithm will not terminate if one does not).
See: http://en.wikipedia.org/wiki/Perceptron .
You can probably apply linear programming to this problem. I'm not sure of its computational complexity in formal terms, but the technique is successfully applied to very large problems covering a wide range of domains.
Computing a linear SVM and then determining which side of the computed plane with optimal margins each point lies on will tell you whether the points are linearly separable.
This is overkill, but if you need a quick one-off solution, there are many existing SVM libraries that will do this for you.
As mentioned by ElKamina, Linear Perceptron is guaranteed to find a solution if one exists. This approach is not efficient for large dimensions. Computationally the most effective way to decide whether two sets of points are linearly separable is by applying linear programming.
Code with an example of solving this using a perceptron in MATLAB is here.
In general that problem is NP-hard but there are good approximate solutions like K-means clustering.
Well, both the perceptron and SVMs (support vector machines) can tell whether two data sets are linearly separable, but the SVM can also find the optimal separating hyperplane. Besides, it can work with n-dimensional vectors, not only points.
It is used in applications such as face recognition. I recommend going deeper into this topic.

Approximate Estimation of Distance Matrices

I have a set of N objects, and I'd like to compute an NxN distance matrix. Sometimes my set of N objects is very large, and I'd like to compute an approximation to the NxN distance matrix by computing only a subset of the distance comparisons.
Can anyone point me in the direction of something that calculates approximations to a full distance matrix? I have some ideas in mind, but I'd like to avoid re-inventing the wheel.
Edit: An example of the type of algorithm would take advantage of the fact that if there is a very small distance between object A and object B, and there is a very small distance between object B and object C, there has to be a somewhat short distance between objects A and C.
I had this same question and ended up writing Python code for it:
https://github.com/jpeterbaker/lazyDistance
README.md explains how the triangle inequality can be used to update upper and lower bounds for each distance.
Just run the Python file as a script for an example in 2-dimensional space. The plotted lines are the only distances that were actually calculated.
In my version, the time savings aren't about having a large number of objects. As I've written it, it's an O(n^4) algorithm, so it's actually worse than just calculating all distances if the number of objects is large. But my method will save time when you have a modest number of objects and the distance function is very expensive to calculate. It assumes that it is faster to do several O(n^2) operations than a single distance measurement.
If n is large, you could look for cheaper methods to decide which distance to calculate next (ones that don't involve arithmetic over the n^2 entries of the distance-bound matrices). You also may not need to update all 2*n^2 bounds every time, the way this code does.
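A tiny sketch of the bound-update idea (my own simplification, not the repository's code): after each measured distance, tighten all upper bounds with $d(a,b) \le d(a,k) + d(k,b)$ and all lower bounds with $d(a,b) \ge d(a,k) - d(k,b)$.

    import numpy as np

    def bounded_distances(n, dist, pairs_to_measure):
        lower = np.zeros((n, n))
        upper = np.full((n, n), np.inf)
        np.fill_diagonal(upper, 0.0)
        for i, j in pairs_to_measure:
            lower[i, j] = lower[j, i] = upper[i, j] = upper[j, i] = dist(i, j)
            for k in range(n):
                # One relaxation pass; bounds stay valid even if not tightest.
                upper = np.minimum(upper, upper[:, k:k+1] + upper[k:k+1, :])
                lower = np.maximum(lower, lower[:, k:k+1] - upper[k:k+1, :])
        return lower, upper

    # Four points on a line at 0, 1, 2, 3; measure only 3 of the 6 pairs.
    pos = [0.0, 1.0, 2.0, 3.0]
    lo, up = bounded_distances(4, lambda i, j: abs(pos[i] - pos[j]),
                               [(0, 1), (1, 2), (2, 3)])
    print(lo[0, 3], up[0, 3])  # valid bounds on the unmeasured d(0, 3)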
Honestly, I think it depends on how close you want your approximation to be and how big your subset is. If you just want some overall feel of what the matrix will look like, you can do simple linear interpolation on a random subset (including the maximal and minimal nodes) and get pretty accurate (tm) results.
I think the real trick here is figuring out the heuristic (linear, quadratic, etc interpolation) and the subset size. You could also figure out the distance matrices of various subsets and then interpolate those matrices with some method (linear, spherical linear, cubic).
Depending on your initial sample, it's pretty much heuristic trial and error until you go "oh, that's good enough for what I need".
Are your "objects" on a network? If the objects are in a network, you can use this or this that yields the all-pairs shortest paths. If not, you're pretty much stuck with calculated all the n x n distances, I think.
The solution you require is similar to what we commonly see in a graph: you can use all-pairs shortest paths for finding the distances; you can also look at Johnson's algorithm.

What are eigenvalues and expansions?

What are eigenvalues, eigenvectors and eigen expansions, and as an algorithm designer how can I use them?
EDIT: I want to know how YOU have used it in your program so that I get an idea. Thanks.
They're used for a lot more than matrix algebra. Examples include:
the asymptotic state distribution of a hidden Markov model is given by the left eigenvector associated with the eigenvalue of unity of the state transition matrix.
one of the best and fastest methods of finding community structure in a network is to construct what's called the modularity matrix (which basically measures how "surprising" a connection between two nodes is); the signs of the elements of the eigenvector associated with the largest eigenvalue then tell you how to partition the network into two communities.
in principal component analysis you essentially select the eigenvectors associated with the $k$ largest eigenvalues of the $n \ge k$ dimensional covariance matrix of your data and project your data down to the $k$-dimensional subspace. Using the largest eigenvalues ensures that you retain the dimensions that are most significant to the data, since they are the ones with the greatest variance (see the sketch after this list).
many methods of image recognition (e.g. facial recognition) rely on building an eigenbasis from known data (a large set of faces) and seeing how difficult it is to reconstruct a target image using that eigenbasis: if it's easy, then the target image is likely to be from the set the eigenbasis describes (i.e. eigenfaces easily reconstruct faces, but not cars).
if you're into scientific computing, the eigenvectors of a quantum Hamiltonian are those states that are stable, in that if a system is in an eigenstate at time $t_1$, then at time $t_2 > t_1$, if it hasn't been disturbed, it will still be in that eigenstate. Also, the eigenvector associated with the smallest eigenvalue of a Hamiltonian is the ground state of the system.
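As a concrete illustration of the PCA bullet above, here is a minimal eigendecomposition-based PCA in Python (synthetic data and my own variable names):

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 5))        # 200 samples in n = 5 dimensions
    data -= data.mean(axis=0)               # center the data

    cov = np.cov(data, rowvar=False)        # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh is for symmetric matrices

    k = 2                                   # keep the k largest eigenvalues
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    projected = data @ top                  # project onto the k-dim subspace
    print(projected.shape)                  # (200, 2)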
Eigenvectors and their corresponding eigenvalues are mainly used to switch between different coordinate systems. This can simplify problems and computations enormously by moving the problem from one coordinate system to another.
The new coordinate system has the eigenvectors as its basis vectors, i.e. they "span" it. Since they can be normalized and (for a symmetric matrix) are perpendicular to each other, the transformation matrix from the first coordinate system is orthonormal: the eigenvectors have magnitude 1 and are mutually perpendicular.
In the transformed coordinate system, the linear operation $A$ (a matrix) becomes purely diagonal. See the Spectral Theorem and Eigendecomposition for more information.
A quick implication, for example, is that you can take a general quadratic curve
$ax^2 + 2bxy + cy^2 + 2dx + 2fy + g = 0$
and rewrite it as
$AX^2 + BY^2 + C = 0$,
where $X$ and $Y$ are measured along the directions of the eigenvectors.
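A small numeric illustration of that (my own coefficient values), using numpy.linalg.eigh to remove the cross term:

    import numpy as np

    # The conic 2x^2 + 2xy + 2y^2 = 1, written with its symmetric matrix.
    a, b, c = 2.0, 1.0, 2.0
    M = np.array([[a, b],
                  [b, c]])

    eigvals, eigvecs = np.linalg.eigh(M)     # orthonormal eigenvector columns
    A, B = eigvals
    print(f"{A:.3f}*X^2 + {B:.3f}*Y^2 = 1")  # 1.000*X^2 + 3.000*Y^2 = 1

    # The form's value is unchanged when expressed in eigen-coordinates.
    p = np.array([0.7, -0.3])
    X, Y = eigvecs.T @ p                     # coordinates along the eigenvectors
    assert np.isclose(p @ M @ p, A * X**2 + B * Y**2)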
Cheers !
Check out http://mathworld.wolfram.com/Eigenvalue.html
Using eigenvalues in algorithms will require you to be proficient with the math involved.
I'm absolutely the wrong person to be talking about math: I puke on it.
Cheers, jrh.
Eigenvalues and eigenvectors are used in matrix computations, such as finding the inverse of a matrix. So if you need to write math code, precomputing them can speed up some operations.
In short, you need them if you do matrix algebra, linear algebra etc.
Using the notation favored by physicists, if we have an operator $H$, then $|x\rangle$ is an eigenstate of $H$ if and only if
$H|x\rangle = h|x\rangle$,
where we call $h$ the eigenvalue associated with the eigenvector $|x\rangle$ under $H$.
(Here the state of the system can be represented by a matrix, making this math isomorphic with all the other expressions already linked.)
Which brings us to the uses of these things once they have been discovered:
The full set of eigenvectors of a system under a given operator forms an orthogonal spanning set for the system. This set may be a basis if there is no degeneracy. This is very useful because it allows extremely compact expressions of arbitrary (non-eigen) states of the system.
