I have found some mentions in another question of matrix addition being a quadratic operation. But I think it is linear.
If I double the size of a matrix, I need to calculate double the additions, not quadruple.
The main point of disagreement seems to be what the size of the problem is. To me, it's the number of elements in the matrix. Others think it is the number of rows or columns, hence the O(n^2) complexity.
Another problem I have with seeing it as a quadratic operation is that that means adding 3-dimensional matrices is cubic, and adding 4-dimensional matrices is O(n^4), etc, even though all of these problems can be reduced to the problem of adding two vectors, which has an obviously linear solution.
Am I right or wrong? If wrong, why?
As you already noted, it depends on your definition of the problem size: is it the total number of elements, or the width/height of the matrix? Whichever is appropriate depends on the larger problem that the matrix addition is part of.
NB: on some hardware (GPU, vector machines, etc) the addition might run faster than expected (even though complexity is still the same, see discussion below), because the hardware can perform multiple additions in one step. For a bounded problem size (like n < 3) it might even be one step.
It's O(M*N) for a 2-dimensional matrix with M rows and N columns.
Or you can say it's O(L) where L is the total number of elements.
Usually the problem is defined using square matrices "of size N", meaning NxN. By that definition, matrix addition is O(N^2), since you must visit each of the NxN elements exactly once.
By that same definition, matrix multiplication (using square NxN matrices) is O(N^3) because you need to visit N elements in each of the source matrices to compute each of the NxN elements in the product matrix.
Generally, any matrix operation whose result depends on every element has a lower bound of Ω(N^2), simply because you must visit each element at least once to compute anything involving the whole matrix.
Think of the general-case implementation:
for i = 1 : n
    for j = 1 : m
        c[i][j] = a[i][j] + b[i][j]
For a simple square matrix (m = n), that is n x n additions.
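The loop above can be made runnable as a minimal Python sketch. The function name add_matrices and the addition counter are assumptions added for illustration only; matrices are assumed to be lists of equal-length rows.

```python
def add_matrices(a, b):
    """Add two n x m matrices (lists of rows) and count the scalar additions."""
    n = len(a)
    m = len(a[0]) if n else 0
    c = [[0] * m for _ in range(n)]
    additions = 0
    for i in range(n):
        for j in range(m):
            c[i][j] = a[i][j] + b[i][j]
            additions += 1  # exactly one addition per element
    return c, additions


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[10, 20, 30], [40, 50, 60]]
    total, count = add_matrices(a, b)
    print(total)  # [[11, 22, 33], [44, 55, 66]]
    print(count)  # 6, i.e. n * m
```

The counter grows as n * m, which is linear in the number of elements and quadratic in the side length when n = m.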
I apologize for the vagueness of this question, but I'm trying to ascertain a way to perform divide and conquer multiplication of rectangular matrices A and B such that A = n x m and B = m x p
I've done a bit of reading and Strassen's method seems promising, but I can't determine how I would use this algorithm on rectangular matrices. I've seen some people refer to "padding" with zeros to make both matrices square and then "unpadding" the result, but I'm not clear on what the unpadding stage would entail.
Thank you for your advice!
The padded rows and columns of the operands produce only zeros in the corresponding rows and columns of the result matrix. To get back to your rectangular result, you just crop the result, i.e. take the upper-left n x p corner of the result matrix, based on the dimensions of the operands.
However, padding by itself seems wise only when n, m and p are very close. When they are disproportionate, you are going to do a lot of multiplications by zero matrices.
For example if n = 2m = p, Strassen's algorithm is going to divide multiplication into 7 multiplications of m-size matrices. However, 4 of these multiplications would involve zero matrices and are not necessary.
I think there are two ways to improve the performance:
1. Use padding and remember which part of each matrix is padded. Then, at each multiplication step, check whether you are multiplying by a zero matrix. If you are, the result is also a zero matrix and there is no need to compute it. This would remove most of the cost involved with padding.
2. Do not use padding. NonSquareStrassen: divide the rectangular matrices into square regions and remainders. Run vanilla Strassen on the square regions. Run NonSquareStrassen again on the remainders. Afterwards, combine these results. This algorithm will most likely be faster than the first, but it is not entirely easy to implement. However, the logic will be quite similar to Strassen's algorithm for square matrices.
For the sake of simplicity I would choose the first option.
Note:
Remember that Strassen's approach can also be applied to rectangular matrices, and that below a certain matrix size the O(n^2) cost of the additional matrix additions becomes more significant, so it's better to finish small sizes using normal cubic multiplication. This means that Strassen's approach is still quite easy to implement for non-square matrices. The above assumes that you already have the algorithm for square matrices implemented.
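The padding and cropping described above can be sketched as follows. This is only an illustration under assumptions: multiply_square is a plain cubic stand-in for wherever a square Strassen routine would be plugged in, and all function names are hypothetical. Padding to a power of two is one common choice so that a recursive square algorithm can keep halving.

```python
def next_power_of_two(k):
    """Smallest power of two that is >= k."""
    p = 1
    while p < k:
        p *= 2
    return p


def pad(matrix, size):
    """Return a size x size copy of matrix, filled with zeros on the right and bottom."""
    rows = [row + [0] * (size - len(row)) for row in matrix]
    rows += [[0] * size for _ in range(size - len(matrix))]
    return rows


def multiply_square(a, b):
    """Stand-in for a square Strassen routine: ordinary cubic multiplication."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def multiply_rectangular(a, b):
    """Multiply an n x m matrix by an m x p matrix via padding and cropping."""
    n, m, p = len(a), len(b), len(b[0])
    size = next_power_of_two(max(n, m, p))
    product = multiply_square(pad(a, size), pad(b, size))
    # "Unpadding": keep only the upper-left n x p corner.
    return [row[:p] for row in product[:n]]


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
    b = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
    print(multiply_rectangular(a, b))  # [[4, 5], [10, 11]]
```

Everything outside the cropped corner is zero, so nothing is lost by discarding it.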
Algorithm requirements
Input is an arbitrary square matrix M of size N×N, which just fits in memory.
The algorithm's output must be true if M[i,j] = M[j,i] for all j≠i, false otherwise.
Obvious solutions
1. Check if the transpose equals the matrix itself (M^T = M). Easiest to program in many environments, but (usually) consumes twice the memory and requires N² comparisons in the worst case. Therefore, this is O(N²) and has high peak memory.
2. Check if the lower triangular part equals the upper triangular part. Of course, the algorithm returns on the first inequality found. This would make the worst case (the worst case being that the matrix is indeed symmetric) require N(N-1)/2 = N²/2 - N/2 comparisons, since the diagonal does not need to be checked. So although it is better than option 1, this is still O(N²).
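As a minimal sketch of the triangular comparison with early exit, assuming the matrix is a list of rows: the name is_symmetric and the comparison counter are illustrative additions, not part of the question.

```python
def is_symmetric(m):
    """Compare each strictly-upper element with its mirror; stop at the first mismatch.

    Returns (symmetric, comparisons_made).
    """
    n = len(m)
    comparisons = 0
    for i in range(n):
        for j in range(i + 1, n):  # the diagonal is skipped
            comparisons += 1
            if m[i][j] != m[j][i]:
                return False, comparisons
    return True, comparisons


if __name__ == "__main__":
    sym = [[1, 2, 3, 4],
           [2, 5, 6, 7],
           [3, 6, 8, 9],
           [4, 7, 9, 0]]
    print(is_symmetric(sym))  # (True, 6), and 6 == 4 * 3 / 2
```

In this sketch the worst case performs N(N-1)/2 comparisons, which happens when the matrix is symmetric.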
Question
Although it's hard to see how it would be possible (the N² elements will all have to be compared somehow), is there an algorithm doing this check that is better than O(N²)?
Or, provided there is a proof of non-existence of such an algorithm: how to implement this most efficiently for a multi-core CPU (Intel or AMD) taking into account things like cache-friendliness, optimal branch prediction, other compiler-specific specializations, etc.?
This question stems mostly from academic interest, although I imagine a practical use could be to determine what solver to use if the matrix describes a linear system AX=b...
Since you have to examine all the elements except the diagonal, the complexity, IMO, can't be better than O(n^2).
For a dense matrix, the answer is a definite "no", because any uninspected (non-diagonal) elements could be different from their transposed counterparts.
For standard representations of a sparse matrix, the same reasoning indicates that you can't generally do better than the input size.
However, the same reasoning doesn't apply to arbitrary matrix representations. For example, you could store sparse representations of the symmetric and antisymmetric components of your matrix, which can easily be checked for symmetry in O(1) time by checking whether the antisymmetric component has any entries at all...
I think you can take a probabilistic approach here.
I think it is unlikely to be pure coincidence if x randomly picked lower-triangular elements all match their upper-triangular counterparts. In that case the chance is high that the matrix is indeed symmetric.
So instead of going through all the ½n(n - 1) off-diagonal pairs, you can check p random coordinates and say that the matrix is symmetric with confidence:
p / (½n(n - 1))
You can then decide on a threshold above which you believe that the matrix must be symmetric.
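A minimal sketch of this sampling idea, under the assumption that the matrix is a list of rows. The name probably_symmetric and the optional rng parameter are hypothetical. Note the asymmetry of the outcome: a single mismatch proves the matrix is not symmetric, while p matching samples only make symmetry plausible.

```python
import random


def probably_symmetric(m, p, rng=None):
    """Check p randomly chosen off-diagonal pairs.

    False means definitely not symmetric; True means no mismatch was found
    among the sampled pairs, which is not a proof of symmetry.
    """
    rng = rng or random.Random()
    n = len(m)
    if n < 2:
        return True  # nothing off the diagonal to compare
    for _ in range(p):
        i, j = rng.sample(range(n), 2)  # two distinct indices
        if m[i][j] != m[j][i]:
            return False
    return True


if __name__ == "__main__":
    sym = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
    print(probably_symmetric(sym, p=5, rng=random.Random(0)))  # True
```

A matrix that differs from a symmetric one in only a single pair can easily pass such a test, so the threshold has to reflect how many mismatches you expect a non-symmetric input to contain.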
Say we have two square matrices of the same size n, named A and B.
A and B share the property that each entry on their main diagonals is the same value (i.e., A[0,0] = A[1,1] = A[2,2] = ... = A[n-1,n-1] and B[0,0] = B[1,1] = B[2,2] = ... = B[n-1,n-1]).
Is there a way to represent A and B so that they can be added to each other in O(n) time, rather than O(n^2)?
In general: No.
For an nxn matrix, there are n^2 output values to populate; that takes O(n^2) time.
In your case: No.
Even if O(n) of the input/output values are dependent, that leaves O(n^2) that are independent. So there is no representation that can reduce the overall runtime below O(n^2).
But...
In order to reduce the runtime, it is necessary (but not necessarily sufficient) to increase the number of dependent values to O(n^2). Obviously, whether or not this is possible is dictated by the particular scenario...
To complement Oli Charlesworth's answer, I'd like to point out that in the specific case of sparse matrices, you can often obtain a runtime of O(n).
For instance, if you happen to know that your matrices are diagonal, you also know that the resulting matrix will be diagonal, and hence you only need to compute n values.
Similarly, there are band matrices that can be added in O(n), as well as more "random" sparse matrices. In general, in a sparse matrix, the number of non-zero elements per row is more or less constant (you obtain these elements from a finite element computation for example, or from graph adjacency matrices etc.), and as such, using an appropriate representation such as "Compressed row storage" or "Compressed column storage", you will end up using O(n) operations to add your two matrices.
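To illustrate why sparse addition costs time proportional to the number of stored entries rather than n^2, here is a minimal sketch. It uses a dictionary-of-keys representation ({(row, col): value}) instead of compressed row storage purely to keep the example short; that substitution and the name add_sparse are assumptions for illustration.

```python
def add_sparse(a, b):
    """Add two sparse matrices stored as {(row, col): value} dictionaries.

    Only stored (non-zero) entries are touched, so the work is proportional to
    the number of non-zeros in a and b (with average-case O(1) dict operations).
    """
    result = dict(a)
    for key, value in b.items():
        total = result.get(key, 0) + value
        if total != 0:
            result[key] = total
        else:
            result.pop(key, None)  # keep the result sparse
    return result


if __name__ == "__main__":
    a = {(0, 0): 1, (1, 1): 2}               # diagonal entries only
    b = {(0, 0): 3, (1, 1): -2, (2, 2): 5}
    print(add_sparse(a, b))  # {(0, 0): 4, (2, 2): 5}
```

With a roughly constant number of non-zeros per row, the number of stored entries is O(n), and so is the addition.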
Also, a special mention goes to sublinear randomized algorithms, which only promise a final value that is "not too far" from the real solution, up to random errors.
Let's say I have a matrix that has X rows and Y columns. The total number of elements is X*Y, correct? So does that make n=X*Y?
for (i=0; i<X; i++)
{
    for (j=0; j<Y; j++)
    {
        print(matrix[i][j]);
    }
}
Then wouldn't that mean that this nested for loop is O(n)? Or am I misunderstanding how time complexities work?
Generally, I thought all nested for loops were O(n^2), but if it goes through X*Y calls to print(), doesn't that mean that the time complexity is O(X*Y) and X*Y is equal to n?
If you have a matrix of size rows*columns, then the inner loop (let's say) is O(columns), and the nested loops together are O(rows*columns).
You are confusing a problem size of N for a problem size of N^2. You can either say your matrix is size N or your matrix is size N^2, though unless your matrix is square you should say that you have a matrix of size Rows*Columns.
You are right when you say n = X x Y but wrong when you say the nested loops should be O(n). The meaning of a nested loop can be understood if you dry-run your code. You will notice that for each iteration of the outer loop the inner loop runs n (or whatever the size condition is) times. Hence, by simple math, you can deduce that it's O(n^2). But if you had just one loop iterating over all X x Y elements (e.g. for(i = 0; i<(X*Y); i++)), then it would be O(n), because you are not restarting your iteration at any point.
Hope this makes sense.
This answer was written hastily and received a few downvotes, so I decided to clarify and rewrite it
Time complexity of an algorithm is an expression of the number of operations of the algorithm in terms of the size of the problem the algorithm is intended to solve.
There are two sizes involved here.
The first size is the number of elements of the matrix, X × Y. This corresponds to what is known in complexity theory as the size of input. Let k = X × Y denote the number of elements in the matrix. Since the number of operations in the twin loop is X × Y, it is in O(k).
The second size is the number of columns and rows of the matrix. Let m = max(X,Y). The number of operations in the twin loop is in O(m^2). Usually in Linear Algebra this kind of size is used to characterize the complexity of matrix operations on m × m matrices.
When you talk about complexity you have to specify precisely how you encode an instance problem and what parameter you use to specify its size. In Complexity Theory we usually assume that the input to an algorithm is given as a string of characters coming from some finite alphabet and measure the complexity of an algorithm in terms of an upper bound on the number of operations on an instance of a problem given by a string of length n. That is in Complexity Theory n is usually the size of input.
In practical Complexity Analysis of algorithms we often use other measures of the size of an instance that are more meaningful in specific context. For instance if A is a connectivity matrix of a graph, we may use the number of vertices V as a measure of complexity of an instance of a problem, or if A is a matrix of a linear operator acting on a vector space, we may use the dimension of a vector space as such a measure. For square matrices the convention is to specify the complexity in terms of the dimension of the matrix, that is to measure the complexity of algorithms acting upon n × n matrices in terms of n. It often makes practical sense and also agrees with the conventions of a specific application field even if it may contradict the conventions of Complexity Theory.
Let us give the name Matrix Scan to our twin loop. You may legitimately say that the size of an instance of Matrix Scan is the length of a string encoding of the matrix. Assuming entries of bounded size, this is proportional to the number of elements in the matrix, k. Then we can say the complexity of Matrix Scan is in O(k). On the other hand, if we take m = max(X,Y) as the parameter that characterizes the size of an instance, as is customary in many applications, then the complexity of Matrix Scan for an X×Y matrix is also in O(m^2). For a square matrix X = Y = m and O(k) = O(m^2).
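The two size measures can be compared directly with a small sketch. The function name matrix_scan and the visit counter are illustrative assumptions; the loop simply counts how many elements the twin loop touches.

```python
def matrix_scan(matrix):
    """Visit every element once, as the twin loop does, and count the visits."""
    visits = 0
    for row in matrix:
        for _ in row:
            visits += 1
    return visits


if __name__ == "__main__":
    x, y = 3, 5
    matrix = [[0] * y for _ in range(x)]
    k = x * y      # number of elements (size of input)
    m = max(x, y)  # side-length style parameter
    visits = matrix_scan(matrix)
    print(visits, k, m * m)  # 15 15 25
```

The visit count equals k exactly and never exceeds m^2, so the same loop is O(k) in one measure and O(m^2) in the other.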
Notice: Some people in the comments asked whether we can always find an encoding of the problem to reduce any polynomial problem to a linear problem. This is not true. For some algorithms the number of operations grows faster than the length of the string encoding of their input. For instance, there is no known algorithm to multiply two m×m matrices with Θ(m^2) operations. Here the size of input grows as m^2; however, Ran Raz proved that (for arithmetic circuits with bounded coefficients) the number of operations grows at least as fast as m^2 log m. If n is in Θ(m^2) then m^2 log m is in Θ(n log n), and the complexity of the best known algorithms grows as O(m^(2+c)) = O(n^(1+c/2)), where c is roughly 0.37 for versions of the Coppersmith-Winograd algorithm and c = 1 for the common iterative algorithm.
"Generally, I thought all nested for loops were O(n^2)"
You are wrong about that. What confuses you, I guess, is that people often use a square (X == Y) matrix as an example, so the complexity is n*n (X == n, Y == n).
If you want to practise your O(*) skills, try to figure out why matrix multiplication is O(n^3). If you don't know the algorithm for matrix multiplication, it is easy to find online.
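As a hint for that exercise, here is a minimal sketch of the common iterative algorithm for square matrices with a counter on the innermost step. The name multiply_counting and the counter are assumptions added for illustration.

```python
def multiply_counting(a, b):
    """Multiply two n x n matrices with the textbook triple loop.

    Returns (product, number_of_multiply_add_steps).
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    steps = 0
    for i in range(n):          # n rows of the result
        for j in range(n):      # n columns of the result
            for k in range(n):  # n terms in each dot product
                c[i][j] += a[i][k] * b[k][j]
                steps += 1
    return c, steps


if __name__ == "__main__":
    product, steps = multiply_counting([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(product)  # [[19, 22], [43, 50]]
    print(steps)    # 8, i.e. 2 ** 3
```

Three nested loops of length n give n * n * n steps, whereas the two nested loops of a scan give only X * Y.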
In a paper I'm writing I make use of an n x n matrix multiplying a dense vector of dimension n. In its natural form, this matrix has O(n^2) space complexity and the multiplication takes time O(n^2).
However, it is known that the matrix is symmetric, and has zero values along its diagonal. The matrix is also highly sparse: the majority of non-diagonal entries are zero.
Could anyone link me to an algorithm/paper/data structure which uses a sparse symmetric matrix representation to approach O(n log n) or maybe even O(n), in cases of high sparsity?
I would have a look at the csparse library by Tim Davis. There's also a corresponding book that describes a whole range of sparse matrix algorithms.
In the sparse case the A*x operation can be made to run in O(|A|) complexity - i.e. linear in the number of non-zero elements in the matrix.
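A minimal sketch of such a product for the symmetric, zero-diagonal case in the question. It assumes only the strictly upper-triangular non-zeros are stored, as a {(i, j): value} dictionary with i < j; this storage choice and the name symmetric_sparse_matvec are illustrative assumptions and are not taken from CSparse.

```python
def symmetric_sparse_matvec(upper, x):
    """Compute y = A * x for a symmetric matrix A with zero diagonal.

    upper holds only the non-zeros above the diagonal as {(i, j): value}, i < j.
    Each stored entry is used twice (for A[i][j] and A[j][i]), so the work is
    linear in the number of stored non-zeros.
    """
    y = [0] * len(x)
    for (i, j), value in upper.items():
        y[i] += value * x[j]
        y[j] += value * x[i]
    return y


if __name__ == "__main__":
    upper = {(0, 2): 2, (1, 2): 3}  # A[0][2] = A[2][0] = 2, A[1][2] = A[2][1] = 3
    print(symmetric_sparse_matvec(upper, [1, 10, 100]))  # [200, 300, 32]
```

Storing one triangle halves the memory for the non-zeros, and the loop never touches the zero entries at all.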
Are you interested in parallel algorithms of this sort?
http://www.cs.cmu.edu/~scandal/cacm/node9.html