Sparse Matrices Storage formats - Conversion - matrix

Is there an efficient way of converting a sparse matrix in Compressed Row Storage(CRS) format to Coordinate List (COO) format ?

Have a look at Yousef Saad's library SPARSKIT -- he has subroutines to convert back and forth between compressed sparse row and coordinate formats, as well as several other sparse matrix storage schemes.
Anyhow, to see how to get the coordinate format from the compressed one, it's easiest to consider how you could have come up with the compressed row format in the first place. Say you have a sparse matrix in COO, where you've put everything in order, for example
rows: 1 1 1 1 2 2 2 2 2 3 3 3 ...
cols: 1 3 5 9 2 3 7 9 11 1 2 3 ...
So the non-zero entries in row 1 are (1,1), (1,3), (1,5), (1,9) and so forth. You're storing a lot of redundant data in the array of rows; you can instead just have an array ia such that ia(i) tells you the starting address in the array cols for row i. In our example above, we would then have
ia : 1 5 10 ...
cols: 1 3 5 9 2 3 7 9 11 1 2 3 ...
To go from COO to CSR, we just use the fact that
ia(i+1) = ia(i) + number of non-zero entries in row i
for any i. Knowing that, you can work backwards to get the COO format from CSR.

Related

Convert diamond matrix 2d coordinates to 1d index and back

I have a 2d game board that expands as tiles are added to the board. Tiles can only be adjacent to existing tiles in the up, down, left and right positions.
So I thought a diamond spiral matrix would be the most efficient way to store the board, but I cannot find a way to convert the x,y coordinates to a 1d array index or the reverse operation.
like this layout
X -3 -2 -1 0 1 2 3
Y 3 13
2 24 5 14
1 23 12 1 6 15
0 22 11 4 0 2 7 16
-1 21 10 3 8 17
-2 20 9 18
-3 19
Tile 1 will always be at position 0, tile 2 will be at 1,2,3 or 4, tile 3 somewhere from 1 to 12 etc.
So I need an algorithm that goes from X,Y to an index and from an index back to the original X and Y.
Anyone know how to do this, or recommend another space filling algorithm that suits my needs. I'm probably going to use Java but would prefer something language neutral.
Thanks
As I can understand form the problem statement, there is no guarantee that the tiles will be filled evenly on the sides. for example:
X -3 -2 -1 0 1 2 3
Y 3 6
2 3 4 5
1 1
0 0 2
-1
So, I think a diamond matrix won't be the best choice.
I would suggest storing them in a hash-map, like implementing a dictionary for 2 letter words.
Also, You need to be more specific to what your requirements are. Like, do you prioritize space complexity over time? Or do you want a fast access time and don't care about memory usage that much.
IMPORTANT :
Also, what is the
Max number of tiles that we have to hold
Max width and height of the board.

APL: how to search for a value's index in a matrix

In APL, matrices and vectors are used to hold data. I was wondering if there was a way to search within a matrix for a given value, and have that values index returned. For example, say I have the following 2-dimensional matrices:
VALUES ← 1 2 3 4 5 6 7 8 9 10 11... all the way up to 36
KINDS ← 0 0 0 2 0 0 0 3 0 ... filled with 0's the rest of the way to 36 length.
If I laminated these two matrices with
kinds,[.5] values
so that they are laminated one on top of the other
1 2 3 4 5 6 7 8 9 10...
0 0 0 2 0 0 0 3 0 ....
is there a functionally easy way to search for the index of the 2 value in the "second row" of the newly laminated matrix? eg. the column containing
4
2
and return that matrix index?
The value 2 also appears in row 1 of your newly laminated matrix (nlm), and as you stated, you really do not want to search the whole matrix, but only the second row. So, since you're only searching within a given row, getting the column index in that row gives you the complete answer:
row←2
⎕←col←nlm[row;]⍳2
4
nlm[;col] ⍝ values in matched column
4 2
Try it online!

Ascending Cardinal Numbers in APL

In the FinnAPL Idiom Library, the 19th item is described as “Ascending cardinal numbers (ranking, all different) ,” and the code is as follows:
⍋⍋X
I also found a book review of the same library by R. Peschi, in which he said, “'Ascending cardinal numbers (ranking, all different)' How many of us understand why grading the result of Grade Up has that effect?” That's my question too. I searched extensively on the internet and came up with zilch.
Ascending Cardinal Numbers
For the sake of shorthand, I'll call that little code snippet “rank.” It becomes evident what is happening with rank when you start applying it to binary numbers. For example:
X←0 0 1 0 1
⍋⍋X ⍝ output is 1 2 4 3 5
The output indicates the position of the values after sorting. You can see from the output that the two 1s will end up in the last two slots, 4 and 5, and the 0s will end up at positions 1, 2 and 3. Thus, it is assigning rank to each value of the vector. Compare that to grade up:
X←7 8 9 6
⍋X ⍝ output is 4 1 2 3
⍋⍋X ⍝ output is 2 3 4 1
You can think of grade up as this position gets that number and, you can think of rank as this number gets that position:
7 8 9 6 ⍝ values of X
4 1 2 3 ⍝ position 1 gets the number at 4 (6)
⍝ position 2 gets the number at 1 (7) etc.
2 3 4 1 ⍝ 1st number (7) gets the position 2
⍝ 2nd number (8) gets the position 3 etc.
It's interesting to note that grade up and rank are like two sides of the same coin in that you can alternate between the two. In other words, we have the following identities:
⍋X = ⍋⍋⍋X = ⍋⍋⍋⍋⍋X = ...
⍋⍋X = ⍋⍋⍋⍋X = ⍋⍋⍋⍋⍋⍋X = ...
Why?
So far that doesn't really answer Mr Peschi's question as to why it has this effect. If you think in terms of key-value pairs, the answer lies in the fact that the original keys are a set of ascending cardinal numbers: 1 2 3 4. After applying grade up, a new vector is created, whose values are the original keys rearranged as they would be after a sort: 4 1 2 3. Applying grade up a second time is about restoring the original keys to a sequence of ascending cardinal numbers again. However, the values of this third vector aren't the ascending cardinal numbers themselves. Rather they correspond to the keys of the second vector.
It's kind of hard to understand since it's a reference to a reference, but the values of the third vector are referencing the orginal set of numbers as they occurred in their original positions:
7 8 9 6
2 3 4 1
In the example, 2 is referencing 7 from 7's original position. Since the value 2 also corresponds to the key of the second vector, which in turn is the second position, the final message is that after the sort, 7 will be in position 2. 8 will be in position 3, 9 in 4 and 6 in the 1st position.
Ranking and Shareable
In the FinnAPL Idiom Library, the 2nd item is described as “Ascending cardinal numbers (ranking, shareable) ,” and the code is as follows:
⌊.5×(⍋⍋X)+⌽⍋⍋⌽X
The output of this code is the same as its brother, ascending cardinal numbers (ranking, all different) as long as all the values of the input vector are different. However, the shareable version doesn't assign new values for those that are equal:
X←0 0 1 0 1
⌊.5×(⍋⍋X)+⌽⍋⍋⌽X ⍝ output is 2 2 4 2 4
The values of the output should generally be interpreted as relative, i.e. The 2s have a relatively lower rank than the 4s, so they will appear first in the array.

effective way of transformation from 2D to 1D vector

i want to create 1D vector in matlab from given matrix,for this i have implemented following algorithm ,which use trivial way
% create one dimensional vector from 2D matrix
function [x]=one_dimensional(b,m,n)
k=1;
for i=1:m
for t=1:n
x(k)=b(i,t);
k=k+1;
end
end
x;
end
when i run it using following example,it seems to do it's task fine
b=[2 1 3;4 2 3;1 5 4]
b =
2 1 3
4 2 3
1 5 4
>> one_dimensional(b,3,3)
ans =
2 1 3 4 2 3 1 5 4
but generally i know that,arrays are not good way to use in matlab,because it's performance,so what should be effective way for transformation matrix into row/column vector?i am just care about performance.thanks very much
You can use the (:) operator...But it works on columns not rows so you need to transpose using the 'operator before , for example:
b=b.';
b(:)'
ans=
2 1 3 4 2 3 1 5 4
and I transposed again to get a row output (otherwise it'll the same vector only in column form)
or also, this is an option (probably a slower one):
reshape(b.',1,[])

Matrix, algorithm interview question

This was one of my interview questions.
We have a matrix containing integers (no range provided). The matrix is randomly populated with integers. We need to devise an algorithm which finds those rows which match exactly with a column(s). We need to return the row number and the column number for the match. The order of of the matching elements is the same. For example, If, i'th row matches with j'th column, and i'th row contains the elements - [1,4,5,6,3]. Then jth column would also contain the elements - [1,4,5,6,3]. Size is n x n.
My solution:
RCEQUAL(A,i1..12,j1..j2)// A is n*n matrix
if(i2-i1==2 && j2-j1==2 && b[n*i1+1..n*i2] has [j1..j2])
use brute force to check if the rows and columns are same.
if (any rows and columns are same)
store the row and column numbers in b[1..n^2].//b[1],b[n+2],b[2n+3].. store row no,
// b[2..n+1] stores columns that
//match with row 1, b[n+3..2n+2]
//those that match with row 2,etc..
else
RCEQUAL(A,1..n/2,1..n/2);
RCEQUAL(A,n/2..n,1..n/2);
RCEQUAL(A,1..n/2,n/2..n);
RCEQUAL(A,n/2..n,n/2..n);
Takes O(n^2). Is this correct? If correct, is there a faster algorithm?
you could build a trie from the data in the rows. then you can compare the columns with the trie.
this would allow to exit as soon as the beginning of a column do not match any row. also this would let you check a column against all rows in one pass.
of course the trie is most interesting when n is big (setting up a trie for a small n is not worth it) and when there are many rows and columns which are quite the same. but even in the worst case where all integers in the matrix are different, the structure allows for a clear algorithm...
You could speed up the average case by calculating the sum of each row/column and narrowing your brute-force comparison (which you have to do eventually) only on rows that match the sums of columns.
This doesn't increase the worst case (all having the same sum) but if your input is truly random that "won't happen" :-)
This might only work on non-singular matrices (not sure), but...
Let A be a square (and possibly non-singular) NxN matrix. Let A' be the transpose of A. If we create matrix B such that it is a horizontal concatenation of A and A' (in other words [A A']) and put it into RREF form, we will get a diagonal on all ones in the left half and some square matrix in the right half.
Example:
A = 1 2
3 4
A'= 1 3
2 4
B = 1 2 1 3
3 4 2 4
rref(B) = 1 0 0 -2
0 1 0.5 2.5
On the other hand, if a column of A were equal to a row of A then column of A would be equal to a column of A'. Then we would get another single 1 in of of the columns of the right half of rref(B).
Example
A=
1 2 3 4 5
2 6 -3 4 6
3 8 -7 6 9
4 1 7 -5 3
5 2 4 -1 -1
A'=
1 2 3 4 5
2 6 8 1 2
3 -3 -7 7 4
4 4 6 -5 -1
5 6 9 3 -1
B =
1 2 3 4 5 1 2 3 4 5
2 6 -3 4 6 2 6 8 1 2
3 8 -7 6 9 3 -3 -7 7 4
4 1 7 -5 3 4 4 6 -5 -1
5 2 4 -1 -1 5 6 9 3 -1
rref(B)=
1 0 0 0 0 1.000 -3.689 -5.921 3.080 0.495
0 1 0 0 0 0 6.054 9.394 -3.097 -1.024
0 0 1 0 0 0 2.378 3.842 -0.961 0.009
0 0 0 1 0 0 -0.565 -0.842 1.823 0.802
0 0 0 0 1 0 -2.258 -3.605 0.540 0.662
1.000 in the top row of the right half means that the first column of A matches on of its rows. The fact that the 1.000 is in the left-most column of the right half means that it is the first row.
Without looking at your algorithm or any of the approaches in the previous answers, but since the matrix has n^2 elements to begin with, I do not think there is a method which does better than that :)
IFF the matrix is truely random...
You could create a list of pointers to the columns sorted by the first element. Then create a similar list of the rows sorted by their first element. This takes O(n*logn).
Next create an index into each sorted list initialized to 0. If the first elements match, you must compare the whole row. If they do not match, increment the index of the one with the lowest starting element (either move to the next row or to the next column). Since each index cycles from 0 to n-1 only once, you have at most 2*n comparisons unless all the rows and columns start with the same number, but we said a matrix of random numbers.
The time for a row/column comparison is n in the worst case, but is expected to be O(1) on average with random data.
So 2 sorts of O(nlogn), and a scan of 2*n*1 gives you an expected run time of O(nlogn). This is of course assuming random data. Worst case is still going to be n**3 for a large matrix with most elements the same value.

Resources