What are the advantages of using a permutation matrix to swap rows? Why would one create a permutation matrix and then apply a matrix multiplication? Is it easier or more efficient than just swapping rows with a for loop?
Permutation matrices are a useful mathematical abstraction, because they allow analysis using the normal rules of matrix algebra, without having to introduce another type of operation.
In software, good implementations do not store a permutation matrix as a full matrix; they store a permutation array and apply it directly (without a full matrix multiplication).
Depending on the sizes of the matrices and the operations and access patterns involved, it may be cheaper not to apply the permutation to the data in memory at all, but just to use it as an extra indirection. So, when you request (P * M)(i,j), where P is a permutation matrix and M is some other matrix that you are permuting, the data need not be re-arranged at all, but rather the element access operation will look up the permuted row when you access the element.
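To make the two options above concrete, here is a minimal Python sketch. The names `permute_rows` and `PermutedRows` are made up for illustration, and it assumes the convention that row i of P * M is row perm[i] of M (that is, P has a 1 at position (i, perm[i])). The first function moves the rows once; the class leaves M untouched and resolves the permutation on every element access.

```python
def permute_rows(perm, M):
    """Eagerly build P * M: row i of the result is row perm[i] of M."""
    return [M[p] for p in perm]


class PermutedRows:
    """Lazy view of P * M: no data is moved, each access goes through perm."""

    def __init__(self, perm, M):
        self.perm = perm
        self.M = M

    def __getitem__(self, index):
        i, j = index
        # One extra lookup per access instead of re-arranging the data.
        return self.M[self.perm[i]][j]


M = [[1, 2], [3, 4], [5, 6]]
perm = [2, 0, 1]

swapped = permute_rows(perm, M)   # rows re-ordered once
view = PermutedRows(perm, M)      # rows left where they are
```

Which of the two is cheaper depends on how often the permuted matrix is read afterwards: the view costs an indirection per access, the eager version costs one pass over the rows.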
The first thing that comes to mind is the issue called "spatial locality". Caches assume that if a memory location is accessed, nearby locations are likely to be accessed soon. In some programming languages the elements of a row are neighbors in memory (row-major order), whereas in others the elements of a column are (column-major order). It depends on the implementation. My guess is that permutation matrices are meant to help with this problem, since optimizing matrix multiplication is one of the problems that algorithms research works hardest on. A simple loop structure may not be able to make use of the cache to improve performance.
Related
It might not be evident, but Prolog also offers arrays out of the box. A Prolog compound has a functor and a number of arguments. This means we could represent an array such as:
[[1,2],[3,4]]
Replacing the Prolog lists by the following Prolog compounds:
matrice(vector(1,2), vector(3,4))
The advantage would be faster element access from an integer index. Can this representation be used to realize a matrix multiplication?
There is yet another approach, as implemented in R (the statistical environment). The dimensions of the array and the values are kept separately. So your square could also be represented as:
array(dims(2, 2), v(1,2,3,4))
This approach has some (questionable) benefits and drawbacks. You can start reading here, if you are at all interested: https://stat.ethz.ch/R-manual/R-devel/library/base/html/dim.html
To your question: yes, you can implement matrix multiplication regardless of how you decide to represent the matrix. It would be interesting to see how the two approaches (array of arrays vs. one array and calculating indexes from the dimensions) compare in terms of efficiency.
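As a rough illustration of the two representations, here is a sketch in Python rather than Prolog, using the naive triple-loop algorithm. The function names are made up, and the flat version assumes row-major order, so that dims (2, 2) with values 1,2,3,4 stands for [[1,2],[3,4]] as in the example above.

```python
def matmul_nested(A, B):
    """Array-of-arrays representation: A is n x k, B is k x m."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]


def matmul_flat(dims_a, a, dims_b, b):
    """Flat representation: element (i, j) lives at index i * cols + j."""
    n, k = dims_a
    k2, m = dims_b
    if k != k2:
        raise ValueError("inner dimensions do not match")
    out = [0] * (n * m)
    for i in range(n):
        for j in range(m):
            out[i * m + j] = sum(a[i * k + t] * b[t * m + j] for t in range(k))
    return (n, m), out


nested = matmul_nested([[1, 2], [3, 4]], [[1, 2], [3, 4]])
flat = matmul_flat((2, 2), [1, 2, 3, 4], (2, 2), [1, 2, 3, 4])
```

Both compute the same product; the only difference is whether the index arithmetic is done by the nesting or by hand.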
What algorithm do you want to use for the matrix multiplication? Is it any of the ones described here: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm?
EDIT: do you want to allow the client code to be able to provide the product and sum operations? Do you want to allow specialization of the values? For example, if you want to use matrix multiplication for finding the transitive closure of a graph, you could represent the boolean square matrix as an unbounded integer. This will make the matrix itself at least quite small.
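A sketch of the unbounded-integer idea, again in Python for brevity. It assumes that bit i * n + j of the integer is set when there is an edge from i to j, and it uses Warshall's algorithm instead of repeated matrix products; the helper names are hypothetical.

```python
def from_edges(n, edges):
    """Pack an n x n boolean matrix into one integer, bit i*n + j per entry."""
    m = 0
    for i, j in edges:
        m |= 1 << (i * n + j)
    return m


def has_edge(m, n, i, j):
    return (m >> (i * n + j)) & 1 == 1


def transitive_closure(m, n):
    """Warshall's algorithm: OR whole rows together with single bit operations."""
    row_mask = (1 << n) - 1
    for k in range(n):
        row_k = (m >> (k * n)) & row_mask
        for i in range(n):
            # If i reaches k, then i also reaches everything k reaches.
            if (m >> (i * n + k)) & 1:
                m |= row_k << (i * n)
    return m


n = 3
closure = transitive_closure(from_edges(n, [(0, 1), (1, 2)]), n)
```

Here the closure gains the edge from 0 to 2, and each row update is one shift and one OR on a big integer.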
This post concerns very short, wide arrays (the number of columns can be several orders of magnitude larger than the number of rows).
Due to the disparity in row/column number and the large size of the matrices I work with, it's usually infeasible to hold the U part of an LU decomposition in memory. Does Eigen have functionality to compute just the L? Equivalently, to place the input matrix in echelon form using row operations?
General notes
(1) I saw a related question here
https://forum.kde.org/viewtopic.php?f=74&t=138686&p=371097&hilit=echelon#p371097
The answer suggested looking at the image() method under FullPivLU, but I wasn't able to find the necessary information in the docs. In particular, it's often important to obtain the matrix L, in practice. An arbitrary basis for the column space of the matrix does not suffice.
(2) There was another question here
https://forum.kde.org/viewtopic.php?f=74&t=130430&p=348923&hilit=echelon#p348923
but it did not seem to get a response.
(3) Issues of stability are of less concern in the (fairly specialized) application domain that motivates this question, since we usually work over finite fields.
Thanks!
I recently discovered genetic algorithms, and after doing a little research I can't find any example of how to evolve structures more complex than a vector or a string.
Let's say that I'm using a covariance matrix for a certain computation (to compute a Mahalanobis distance, for example) and I want to look for a better matrix to do the job and minimize a certain criterion. Are there any classic examples of how to evolve the matrix, and which crossover operators to use?
Thanks !
Any structure of fixed size and shape that is made of numbers (or any other elements) can be rewritten to a 1-D vector and back. You can then use any operator you like which works on vectors.
If you wanted to work with matrices (or any other structures) directly you can always design your own operators, but a matrix basically is a vector, just written in a different way. For the matrix case there are a number of possible crossover operators:
Swap rows/columns (between the parents)
Swap submatrices (generalization of the above)
Continuous-space crossover methods like BLX-alpha, PCX, arithmetic crossover... These are all designed for vectors, but you just treat the matrix as a vector (it's really not that different).
Mutation is probably going to be more or less identical to the vector case - you just mutate the elements (or some of them).
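A minimal sketch of some of these operators in Python, with matrices as lists of rows. The function names, the 50/50 row choice and the Gaussian mutation are illustrative assumptions, not a standard recipe.

```python
import random


def flatten(M):
    return [x for row in M for x in row]


def unflatten(v, cols):
    return [v[i:i + cols] for i in range(0, len(v), cols)]


def arithmetic_crossover(A, B, alpha):
    """Treat both parents as vectors and blend them element by element."""
    child = [alpha * a + (1 - alpha) * b for a, b in zip(flatten(A), flatten(B))]
    return unflatten(child, len(A[0]))


def row_swap_crossover(A, B, rng):
    """Take each row of the child from one parent or the other."""
    return [list(ra) if rng.random() < 0.5 else list(rb) for ra, rb in zip(A, B)]


def mutate(M, rate, sigma, rng):
    """Perturb each element with probability `rate` by Gaussian noise."""
    return [[x + rng.gauss(0, sigma) if rng.random() < rate else x for x in row]
            for row in M]


rng = random.Random(0)
A = [[1, 2], [3, 4]]
B = [[3, 4], [5, 6]]
blend = arithmetic_crossover(A, B, 0.5)
mixed = row_swap_crossover(A, B, rng)
```

One thing to watch for in the covariance case: row swapping and element-wise mutation do not preserve symmetry, whereas a blend with alpha between 0 and 1 of two symmetric positive semidefinite matrices stays symmetric positive semidefinite.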
Are there any algorithms that allow efficient creation (element filling) of sparse (e.g. CSR or coordinate) matrix in parallel?
If you store your matrix as a coordinate map, any language which has a concurrent dictionary implementation available should do the job for you.
Java's got the ConcurrentHashMap, and .NET 4 has ConcurrentDictionary, both of which allow multi-threaded non-blocking (afaik) element insertion in parallel.
There are no efficient algorithms for creating sparse matrices in a data-parallel way. The coordinate format is a plausible choice, but it requires sorting after the contents have been filled in, and that format is slow for matrix products and similar operations.
The solution is not to build the sparse matrix at all: instead of keeping it in memory, you perform the operations implicitly, in place, as you calculate the elements of the sparse matrix.
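For comparison, the coordinate-format route mentioned above might look like the following Python sketch: each worker fills its own list of (row, column, value) triplets, and a sort plus one pass turns the combined list into CSR afterwards. All names are hypothetical, it assumes no duplicate coordinates, and in CPython the threads mainly illustrate the structure rather than give a real speed-up for CPU-bound filling.

```python
from concurrent.futures import ThreadPoolExecutor


def fill_rows(rows, n_cols, entry):
    """Worker: collect the nonzero entries of the given rows as triplets."""
    out = []
    for i in rows:
        for j in range(n_cols):
            v = entry(i, j)
            if v != 0:
                out.append((i, j, v))
    return out


def coo_to_csr(n_rows, triplets):
    """Sort triplets by (row, col), then build the three CSR arrays."""
    triplets = sorted(triplets, key=lambda t: (t[0], t[1]))
    indptr = [0] * (n_rows + 1)
    indices, data = [], []
    for i, j, v in triplets:
        indptr[i + 1] += 1          # count entries per row
        indices.append(j)
        data.append(v)
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]  # prefix sum gives the row offsets
    return indptr, indices, data


def build_parallel(n_rows, n_cols, entry, workers=4):
    # Each worker gets a disjoint set of rows, so no locking is needed.
    chunks = [range(w, n_rows, workers) for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda rows: fill_rows(rows, n_cols, entry), chunks))
    return coo_to_csr(n_rows, [t for part in parts for t in part])


csr = build_parallel(3, 3, lambda i, j: i + 1 if i == j else 0)
```

The sort is the sequential bottleneck the answer refers to; the filling itself parallelises trivially because the workers never touch shared state.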
I currently have an algorithm that operates on an adjacency matrix of size n by m. In my algorithm, I need to zero out entire rows or columns at a time. My implementation is currently O(m) or O(n) depending on if it's a column or row.
Is there any way to zero out a column or row in O(1) time?
Essentially this depends on the chip architecture that you're dealing with. For most CPUs, it isn't possible to zero out whole swathes of memory in one go, and therefore each word will require a separate memory operation, no matter what facilities your programming language provides.
Contiguous memory helps tremendously with access time, because memory adjacent to memory just accessed will be cached, and subsequent accesses will hit the cache, resulting in fast performance.
The result of this is that if your matrix is large, it may be faster to zero out a row at a time or a column at a time, rather than vice versa, depending on whether your data is written by column or by row.
EDIT: I have assumed that your matrices aren't sparse, or triangular, or otherwise special, since you talk about "zeroing out a whole row". If you know that your matrix is mostly empty or somehow otherwise fits a special pattern, you would be able to represent your matrix in a different way (not a simple nxm array) and the story would be different. But if you have an nxm matrix right now, then this is the case.
Is the distance a metric, and is the graph undirected? (In that case the matrix is symmetric.) If so, you could just operate on lower or upper triangular matrices throughout the program. That way you only have to zero out one row (or one column, if you are dealing with upper triangular), and even then it won't be a whole row: on average, half.
It depends on how your matrices are implemented.
If you have a representation such as an array of arrays, you can point to a shared zeroed element array, as long as you check that you don't subsequently write to it. This means that one of the two, rows or columns, can be zeroed in O(1), with a constant cost on all other write operations.
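A sketch of the shared-zero-row idea in Python, for a matrix stored as a list of rows. The class name is made up, and the copy-on-write in `set` is one possible way to honour the "don't write to the shared array" rule, not the only one.

```python
class RowSharedMatrix:
    def __init__(self, n_rows, n_cols, fill=1):
        self._zero = [0] * n_cols  # shared by all zeroed rows, never written
        self.rows = [[fill] * n_cols for _ in range(n_rows)]

    def zero_row(self, i):
        # Constant-time reference swap; the old row is simply dropped.
        self.rows[i] = self._zero

    def set(self, i, j, value):
        # The extra check every write pays: never modify the shared row.
        if self.rows[i] is self._zero:
            self.rows[i] = list(self._zero)  # private copy, O(n_cols) once
        self.rows[i][j] = value

    def get(self, i, j):
        return self.rows[i][j]


m = RowSharedMatrix(3, 4)
m.zero_row(1)
m.zero_row(2)
m.set(2, 0, 9)  # row 2 gets its own copy; row 1 still shares the zero row
```

With rows as the outer arrays, only rows can be zeroed this way; zeroing a column still means touching every row.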
You could also have a couple of arrays - one for rows, one for columns - which scale the values in the matrix. Putting a zero in either would be an O(1) operation to mask out a row or column, at the cost of extra processing for every read; but it may be worth it as a way of temporarily removing a node from the graph if that's a common use case. It also leaves the original version of the matrix untouched, so you could parallelise your algorithm (assuming the only operation it requires is pruning all edges into or out of a node).
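The scaling-array variant could be sketched like this in Python. The class and method names are hypothetical, and `restore_row` is only there to show that the masking is reversible.

```python
class MaskedMatrix:
    def __init__(self, data):
        self.data = data                      # never modified by masking
        self.row_scale = [1] * len(data)
        self.col_scale = [1] * len(data[0])

    def zero_row(self, i):
        self.row_scale[i] = 0                 # O(1)

    def zero_col(self, j):
        self.col_scale[j] = 0                 # O(1)

    def restore_row(self, i):
        self.row_scale[i] = 1                 # undo the mask, also O(1)

    def get(self, i, j):
        # Every read pays two extra multiplications.
        return self.data[i][j] * self.row_scale[i] * self.col_scale[j]


adj = MaskedMatrix([[1, 2, 3], [4, 5, 6]])
adj.zero_row(0)
adj.zero_col(2)
```

After these two calls, reads from row 0 and from column 2 return 0 while `adj.data` still holds the original values.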