Adding square matrices in O(n) time?

Say we have two square matrices of the same size n, named A and B.
A and B share the property that all entries on their main diagonals are equal (i.e., A[0,0] = A[1,1] = A[2,2] = ... = A[n-1,n-1] and B[0,0] = B[1,1] = B[2,2] = ... = B[n-1,n-1]).
Is there a way to represent A and B so that they can be added to each other in O(n) time, rather than O(n^2)?

In general: No.
For an nxn matrix, there are n^2 output values to populate; that takes O(n^2) time.
In your case: No.
Even if O(n) of the input/output values are dependent, that leaves O(n^2) that are independent. So there is no representation that can reduce the overall runtime below O(n^2).
But...
In order to reduce the runtime, it is necessary (but not necessarily sufficient) to increase the number of dependent values to O(n^2). Obviously, whether or not this is possible is dictated by the particular scenario...

To complement Oli Charlesworth's answer, I'd like to point out that in the specific case of sparse matrices, you can often obtain a runtime of O(n).
For instance, if you happen to know that your matrices are diagonal, you also know that the resulting matrix will be diagonal, and hence you only need to compute n values.
Similarly, band matrices can be added in O(n), as can more "random" sparse matrices. In general, in a sparse matrix the number of non-zero elements per row is more or less constant (such matrices arise, for example, from finite element computations or graph adjacency matrices), so with an appropriate representation such as compressed row storage (CRS) or compressed column storage (CCS), you end up using O(n) operations to add your two matrices.
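To make the CRS case concrete, here is a minimal C sketch of adding two matrices stored that way (the struct layout and names are just illustrative, not any particular library's format). It assumes both matrices have the same dimension, that column indices within each row are sorted, and that the output arrays have room for at most nnz(A) + nnz(B) entries; the running time is O(n + nnz(A) + nnz(B)), i.e. O(n) when each row holds a bounded number of non-zeros.

typedef struct {
    int n;          /* matrix dimension */
    int *rowptr;    /* length n + 1: start of each row in col/val */
    int *col;       /* column index of each stored entry */
    double *val;    /* value of each stored entry */
} csr_t;

/* c = a + b; returns the number of non-zeros written to c. */
int csr_add(const csr_t *a, const csr_t *b, csr_t *c)
{
    int nz = 0;
    for (int i = 0; i < a->n; i++) {
        int pa = a->rowptr[i], ea = a->rowptr[i + 1];
        int pb = b->rowptr[i], eb = b->rowptr[i + 1];
        c->rowptr[i] = nz;
        while (pa < ea || pb < eb) {
            if (pb >= eb || (pa < ea && a->col[pa] < b->col[pb])) {
                c->col[nz] = a->col[pa];           /* entry only in A */
                c->val[nz++] = a->val[pa++];
            } else if (pa >= ea || b->col[pb] < a->col[pa]) {
                c->col[nz] = b->col[pb];           /* entry only in B */
                c->val[nz++] = b->val[pb++];
            } else {                               /* same column in both */
                c->col[nz] = a->col[pa];
                c->val[nz++] = a->val[pa++] + b->val[pb++];
            }
        }
    }
    c->rowptr[a->n] = nz;
    c->n = a->n;
    return nz;
}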
A special mention also goes to sublinear randomized algorithms, which only promise a final value that is "not too far" from the real solution, up to random errors.


Is there an algorithm better than O(N²) to determine if matrix is symmetric?

Algorithm requirements
Input is an arbitrary square matrix M of size N×N, which just fits in memory.
The algorithm's output must be true if M[i,j] = M[j,i] for all j≠i, false otherwise.
Obvious solutions
Check if the transpose equals the matrix itself (Mᵀ = M). Easiest to program in many environments, but (usually) consumes twice the memory and requires N² comparisons in the worst case. Therefore, this is O(N²) and has high peak memory.
Check if the lower triangular part equals the upper triangular part. Of course, the algorithm returns on the first inequality found. This makes the worst case (the worst case being that the matrix is indeed symmetric) require (N² − N)/2 comparisons, since the diagonal does not need to be checked. So although it is better than option 1, this is still O(N²).
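For reference, option 2 as a plain C sketch over a flat row-major array (illustrative only, not tuned for cache or multiple cores):

#include <stdbool.h>
#include <stddef.h>

/* Compare the lower triangle with the upper triangle and return on the first
 * mismatch. Worst case (a symmetric matrix) does (N*N - N)/2 comparisons,
 * i.e. still O(N^2). */
bool is_symmetric(const double *m, size_t n)
{
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (m[i * n + j] != m[j * n + i])
                return false;
    return true;
}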
Question
Although it's hard to see how it would be possible (the N² elements will all have to be compared somehow), is there an algorithm doing this check that is better than O(N²)?
Or, provided there is a proof of non-existence of such an algorithm: how to implement this most efficiently for a multi-core CPU (Intel or AMD) taking into account things like cache-friendliness, optimal branch prediction, other compiler-specific specializations, etc.?
This question stems mostly from academic interest, although I imagine a practical use could be to determine what solver to use if the matrix describes a linear system AX=b...
Since you will have to examine all the elements except the diagonal, the complexity IMO can't be better than O(n^2).
For a dense matrix, the answer is a definite "no", because any uninspected (non-diagonal) elements could be different from their transposed counterparts.
For standard representations of a sparse matrix, the same reasoning indicates that you can't generally do better than the input size.
However, the same reasoning doesn't apply to arbitrary matrix representations. For example, you could store sparse representations of the symmetric and antisymmetric components of your matrix; symmetry can then be checked in O(1) time by checking whether the antisymmetric component has any entries at all...
I think you can take a probabilistic approach here.
If x randomly picked lower-triangular elements all match their upper-triangular counterparts, that is unlikely to be a coincidence; the chance is then very high that the matrix is indeed symmetric.
So instead of going through all (n² − n)/2 off-diagonal pairs, you can check p random coordinates and report that the matrix is symmetric with confidence:
p / ((n² − n)/2)
You can then decide on a threshold above which you believe the matrix must be symmetric.
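A rough sketch of this idea (illustrative only; it uses rand() for simplicity and assumes n >= 2). Note the asymmetry: a single mismatch proves the matrix is not symmetric, while p matching samples only give probabilistic confidence.

#include <stdbool.h>
#include <stdlib.h>

/* Sample `trials` random off-diagonal positions and compare each with its
 * transposed counterpart. m is a flat row-major n*n array. */
bool probably_symmetric(const double *m, size_t n, int trials)
{
    for (int t = 0; t < trials; t++) {
        size_t i = 1 + (size_t)rand() % (n - 1);   /* row 1..n-1 */
        size_t j = (size_t)rand() % i;             /* column below the diagonal */
        if (m[i * n + j] != m[j * n + i])
            return false;                          /* definitely not symmetric */
    }
    return true;                                   /* symmetric with some confidence */
}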

How do I fill a 2D array with a constant value, with a better efficiency than n^2?

This is a general question that could apply to any language, like C, C++, Java, etc.
I figured that however you implement it, you can't get more efficient than using 2 nested loops, which gives a complexity of n^2.
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        a[i][j] = 1;
I was asked this at an interview recently, and couldn't think of anything more efficient. All I got from the interviewer was that I could use recursion or convert the 2D array to a linked list to make it more efficient than n^2. Anyone know if this is possible, and if yes, how? At least theoretically, if not practically.
edit: The actual question gives me the coordinates of two cells, and I have to fill the paths taken by all possible shortest routes with 1.
E.g., if I have a 5x5 matrix and my two coordinates are (2,0) and (3,3), I'd have to fill:
(2,0)(2,1)(2,2)(2,3)
(3,0)(3,1)(3,2)(3,3)
while leaving the rest of the cells as they were.
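A sketch of what I mean, assuming 4-directional moves (in which case the union of all shortest routes between the two cells is exactly their bounding rectangle; the function name and fixed size are just for illustration):

#define N 5

/* Fill every cell lying on some shortest 4-directional path between
 * (r1,c1) and (r2,c2), i.e. the bounding rectangle of the two cells.
 * Cells outside the rectangle are left untouched. */
void fill_shortest_paths(int a[N][N], int r1, int c1, int r2, int c2)
{
    int rlo = r1 < r2 ? r1 : r2, rhi = r1 < r2 ? r2 : r1;
    int clo = c1 < c2 ? c1 : c2, chi = c1 < c2 ? c2 : c1;
    for (int i = rlo; i <= rhi; i++)
        for (int j = clo; j <= chi; j++)
            a[i][j] = 1;
}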
It depends on what you mean. If the question is about plain arrays (a sequence of contiguous memory locations) and by initialization you mean putting a value in every memory location of this "matrix", then the answer is no, better than O(n*m) is not possible, and we can prove it:
Assume an algorithm fill(A[n][m], init_val) is correct (i.e. it fills all the memory locations of A) and has complexity g(n,m) that is less than O(n*m) (meaning g(n,m) is not in Ω(n*m)). Then for big enough n and m we have g(n,m) < n*m = the number of memory locations. Since filling a memory location requires one operation, the algorithm can fill at most g(n,m) locations [actually about half, because it must also do at least one operation to "select" each memory location, unless the hardware provides a combined operation], which is strictly less than n*m. This implies that the algorithm fill is not correct.
The same applies if filling k memory locations takes constant time; you simply have to choose bigger values of n and m.
As others have already suggested, you can use other data structures to avoid the O(n^2) initialization time. amit's suggestion uses a kind of lazy evaluation, which allows you to not initialize the array at all but only do it when you access the elements.
Note that this removes the Ω(n^2) cost at the beginning, but requires more complex operations to access the array's elements and also requires more memory.
It is not clear what your interviewer meant: converting an array into a linked list requires Ω(L) time (where L is the length of the array), so simply converting the whole matrix into a linked list would require Ω(n^2) time on top of the real initialization. Using recursion does not help at all:
you simply end up with recurrences such as T(n) = 4T(n/2) + O(1), which again gives no benefit for the asymptotic complexity.
As a general rule, all algorithms have to scan at least all of their input, unless they have some form of knowledge beforehand (e.g. the elements are sorted). In your case the space to fill is Θ(n^2), and thus every algorithm that wants to fill it must be at least Ω(n^2). Anything with lower complexity either makes some assumption (e.g. the memory contains the initializer value by default -> O(1)) or solves a different problem (e.g. uses lazy arrays or other data structures).
You can initialize an array in O(1), but it consumes roughly triple the amount of space and requires extra "work" for each element access.
Since in practice a matrix is a 1D array in memory, the same principles still hold.
The linked page describes how it can be done in detail.
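For what it's worth, here is a minimal sketch of that constant-time-initialization trick (names and int values are just illustrative, error handling omitted). A slot only counts as written if the two bookkeeping arrays vouch for each other, so neither the data array nor the bookkeeping arrays ever need to be cleared.

#include <stdlib.h>

typedef struct {
    int *data;      /* stored values; contents start out arbitrary */
    size_t *when;   /* when[i]: position in `which` that claims slot i */
    size_t *which;  /* which[k]: the k-th slot that was actually written */
    size_t count;   /* number of slots written so far */
    size_t size;
    int default_value;
} lazy_array;

lazy_array lazy_init(size_t size, int default_value)
{
    lazy_array a;
    a.data  = malloc(size * sizeof(int));     /* deliberately left uninitialized */
    a.when  = malloc(size * sizeof(size_t));  /* stale garbage here is harmless */
    a.which = malloc(size * sizeof(size_t));
    a.count = 0;
    a.size  = size;
    a.default_value = default_value;
    return a;
}

static int is_set(const lazy_array *a, size_t i)
{
    /* Only trust when[i] if the entry it points to points back at i. */
    return a->when[i] < a->count && a->which[a->when[i]] == i;
}

int lazy_get(const lazy_array *a, size_t i)
{
    return is_set(a, i) ? a->data[i] : a->default_value;
}

void lazy_set(lazy_array *a, size_t i, int v)
{
    if (!is_set(a, i)) {            /* first write to this slot: record it in O(1) */
        a->when[i] = a->count;
        a->which[a->count++] = i;
    }
    a->data[i] = v;
}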
When you fill a 2D array with the same element, if you really fill every element, at least n^2 operations have to be made (given that the 2D array is n*n).
The only way to decrease the running time is to use a parallel programming approach. For example, given n processors, the first row of the array is filled first; this is n operations. Then each processor Pi copies element i of row k to element i of row k+1, for k = 0 to n-1. This is again O(n) since we have n processors working in parallel.
If you really want to implement this approach, you can look at free parallel programming environments like Open MPI and MPICH.
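As a quick shared-memory illustration (using an OpenMP loop rather than the MPI libraries above, simply because it is shorter to show): the n*n stores still all happen, they are just spread across the available cores.

/* Fill a flat row-major n x n array in parallel. Total work is still n*n
 * assignments; with p cores the wall-clock time is roughly n*n / p.
 * Compile with e.g. -fopenmp; without it the pragma is simply ignored. */
void parallel_fill(int *a, int n, int value)
{
    #pragma omp parallel for
    for (int i = 0; i < n * n; i++)
        a[i] = value;
}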

Time Complexity of a nested for loop that parses a matrix

Let's say I have a matrix that has X rows and Y columns. The total number of elements is X*Y, correct? So does that make n=X*Y?
for (i = 0; i < X; i++)
{
    for (j = 0; j < Y; j++)
    {
        print(matrix[i][j]);
    }
}
Then wouldn't that mean that this nested for loop is O(n)? Or am I misunderstanding how time complexities work?
Generally, I thought all nested for loops were O(n^2), but if it goes through X*Y calls to print(), doesn't that mean that the time complexity is O(X*Y) and X*Y is equal to n?
If you have a matrix of size rows*columns, then the inner loop (let's say) is O(columns), and the nested loops together are O(rows*columns).
You are confusing a problem size of N with a problem size of N^2. You can either say your matrix is of size N or of size N^2, though unless your matrix is square you should say that you have a matrix of size Rows*Columns.
You are right when you say n = X × Y, but wrong when you say the nested loops must therefore be O(n). The behaviour of the nested loops becomes clear if you dry-run the code: for each iteration of the outer loop, the inner loop runs as many times as its bound says. Hence, if each bound is n (a square n×n matrix), simple math gives O(n^2). But if you had just one loop iterating over all X × Y elements (e.g. for(i = 0; i < (X*Y); i++)), then it would be O(n) with n = X × Y, because you never restart your iteration.
Hope this makes sense.
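To see that both views count the same work, here is a plain C sketch (illustrative only): both functions touch every element exactly once, i.e. X*Y prints; only the name you give the problem size changes.

#include <stdio.h>

/* Nested loops: X iterations of the outer loop, Y of the inner one. */
void scan_nested(int X, int Y, int matrix[X][Y])
{
    for (int i = 0; i < X; i++)
        for (int j = 0; j < Y; j++)
            printf("%d\n", matrix[i][j]);
}

/* Single loop over the same elements viewed as a flat array of length X*Y. */
void scan_flat(int X, int Y, const int *flat)
{
    for (int k = 0; k < X * Y; k++)
        printf("%d\n", flat[k]);
}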
This answer was written hastily and received a few downvotes, so I decided to clarify and rewrite it
Time complexity of an algorithm is an expression of the number of operations of the algorithm in terms of the size of the problem the algorithm is intended to solve.
There are two sizes involved here.
The first size is the number of elements of the matrix, X × Y. This corresponds to what is known in complexity theory as the size of the input. Let k = X × Y denote the number of elements in the matrix. Since the number of operations in the twin loop is X × Y, it is in O(k).
The second size is the number of columns and rows of the matrix. Let m = max(X,Y). The number of operations in the twin loop is in O(m^2). Usually in Linear Algebra this kind of size is used to characterize the complexity of matrix operations on m × m matrices.
When you talk about complexity you have to specify precisely how you encode an instance problem and what parameter you use to specify its size. In Complexity Theory we usually assume that the input to an algorithm is given as a string of characters coming from some finite alphabet and measure the complexity of an algorithm in terms of an upper bound on the number of operations on an instance of a problem given by a string of length n. That is in Complexity Theory n is usually the size of input.
In practical Complexity Analysis of algorithms we often use other measures of the size of an instance that are more meaningful in specific context. For instance if A is a connectivity matrix of a graph, we may use the number of vertices V as a measure of complexity of an instance of a problem, or if A is a matrix of a linear operator acting on a vector space, we may use the dimension of a vector space as such a measure. For square matrices the convention is to specify the complexity in terms of the dimension of the matrix, that is to measure the complexity of algorithms acting upon n × n matrices in terms of n. It often makes practical sense and also agrees with the conventions of a specific application field even if it may contradict the conventions of Complexity Theory.
Let us give the name Matrix Scan to our twin loop. You may legitimately say that the size of an instance of Matrix Scan is the length of a string encoding of the matrix. Assuming entries of bounded size, this is proportional to the number of elements in the matrix, k, and we can say the complexity of Matrix Scan is in O(k). On the other hand, if we take m = max(X,Y) as the parameter that characterizes the size of an instance, as is customary in many applications, then the complexity of Matrix Scan for an X×Y matrix is also in O(m^2). For a square matrix X = Y = m and O(k) = O(m^2).
Notice: Some people in the comments asked whether we can always find an encoding of the problem to reduce any polynomial problem to a linear one. This is not true. For some algorithms the number of operations grows faster than the length of the string encoding of their input. For instance, there is no algorithm that multiplies two m×m matrices with Θ(m^2) operations. Here the size of the input grows as m^2; however, Ran Raz proved that the number of operations grows at least as fast as m^2 log m. If n is in O(m^2), then m^2 log m is in O(n log n), and the complexity of the best known algorithms grows as O(m^(2+c)) = O(n^(1+c/2)), where c is about 0.372 for the best versions of the Coppersmith-Winograd algorithm and c = 1 for the common iterative algorithm.
Generally, I thought all nested for loops were O(n^2),
You are wrong about that. What confuses you, I guess, is that people often use a square matrix (X == Y) as the example, so the complexity is n*n (X == n, Y == n).
If you want to practise your O(*) skills, try to figure out why matrix multiplication is O(n^3). If you don't know the algorithm for matrix multiplication, it is easy to find online.
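For reference, the common iterative algorithm is just the triple loop below (a sketch for square n×n matrices): three nested loops of n iterations each give n*n*n multiply-adds, hence O(n^3).

/* c = a * b for n x n matrices, the straightforward O(n^3) way. */
void matmul(int n, const double a[n][n], const double b[n][n], double c[n][n])
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            c[i][j] = 0.0;
            for (int k = 0; k < n; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
}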

Reference for lowest order complexity of sparse symmetric matrix premultiplying full vector

In a paper I'm writing I make use of an n x n matrix multiplying a dense vector of dimension n. In its natural form, this matrix has O(n^2) space complexity and the multiplication takes time O(n^2).
However, it is known that the matrix is symmetric, and has zero values along its diagonal. The matrix is also highly sparse: the majority of non-diagonal entries are zero.
Could anyone link me to an algorithm/paper/data structure which uses a sparse symmetric matrix representation to approach O(n log n) or maybe even O(n), in cases of high sparsity?
I would have a look at the csparse library by Tim Davis. There's also a corresponding book that describes a whole range of sparse matrix algorithms.
In the sparse case the A*x operation can be made to run in O(|A|) complexity - i.e. linear in the number of non-zero elements in the matrix.
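To illustrate that O(|A|) bound, here is a generic compressed-row-storage sketch (not CSparse's actual API): one pass over the stored non-zeros plus O(n) for the row pointers. For a symmetric matrix you could additionally store only one triangle and accumulate both y[i] and y[j] per stored entry, roughly halving the memory.

/* y = A*x for an n x n matrix in compressed row storage:
 * rowptr has length n+1, col/val hold the non-zero entries row by row. */
void csr_matvec(int n, const int *rowptr, const int *col, const double *val,
                const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            sum += val[k] * x[col[k]];
        y[i] = sum;
    }
}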
Are you interested in parallel algorithms of this sort?
http://www.cs.cmu.edu/~scandal/cacm/node9.html

What is the complexity of matrix addition?

I have found some mentions in another question of matrix addition being a quadratic operation. But I think it is linear.
If I double the size of a matrix, I need to calculate double the additions, not quadruple.
The main diverging point seems to be what the size of the problem is. To me, it's the number of elements in the matrix. Others think it is the number of columns or rows, hence the O(n^2) complexity.
Another problem I have with seeing it as a quadratic operation is that it would mean adding 3-dimensional matrices is cubic, adding 4-dimensional matrices is O(n^4), etc., even though all of these problems can be reduced to the problem of adding two vectors, which has an obviously linear solution.
Am I right or wrong? If wrong, why?
As you already noted, it depends on your definition of the problem size: is it the total number of elements, or the width/height of the matrix? Which one is correct actually depends on the larger problem the matrix addition is part of.
NB: on some hardware (GPU, vector machines, etc) the addition might run faster than expected (even though complexity is still the same, see discussion below), because the hardware can perform multiple additions in one step. For a bounded problem size (like n < 3) it might even be one step.
It's O(M*N) for a 2-dimensional matrix with M rows and N columns.
Or you can say it's O(L) where L is the total number of elements.
Usually the problem is defined using square matrices "of size N", meaning NxN. By that definition, matrix addition is an O(N^2) since you must visit each of the NxN elements exactly once.
By that same definition, matrix multiplication (using square NxN matrices) is O(N^3) because you need to visit N elements in each of the source matrices to compute each of the NxN elements in the product matrix.
Generally, all matrix operations have a lower bound of Ω(N^2) simply because you must visit each element at least once to compute anything involving the whole matrix.
Think of the general-case implementation:
for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
        c[i][j] = a[i][j] + b[i][j];
If we take the simple square matrix (m = n), that is n × n additions.
