Is there a simple (non-for-loop) way to create a model matrix in Octave? In R I use model.matrix() to do this.
I have this array:
array = [1;2;3;2]
and I need (for regression reasons) the following model, where column 1 marks the 1s, column 2 the 2s, and so on:
model = [1 0 0 ; 0 1 0 ; 0 0 1 ; 0 1 0]
(EDIT on my side: an earlier version showed model = [1 0 0 0; 0 1 0 1; 0 0 1 0].)
I can do this with a for loop:
model = zeros(4,3);
for i=1:4
  model(i,array(i)) = 1;
end
but it would be nice to do this in one step, something like:
model = model.matrix(array)
so that I can then include it in a formula straight away.
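You need to turn your values into linear indices, like so (array is a row vector here; sub2ind expects both subscript vectors to have the same shape, so use array(:).' if yours is a column):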
octave:1> array = [1 2 3 2];
octave:2> model = zeros ([numel(array) max(array)]);
octave:3> model(sub2ind (size (model), 1:numel(array), array)) = 1
model =
1 0 0
0 1 0
0 0 1
0 1 0
Because your matrix will be very sparse, a possible optimization is to create a sparse matrix instead.
octave:4> sp = sparse (1:numel(array), array, 1, numel (array), max (array))
sp =
Compressed Column Sparse (rows = 4, cols = 3, nnz = 4 [33%])
(1, 1) -> 1
(2, 2) -> 1
(4, 2) -> 1
(3, 3) -> 1
octave:5> full (sp)
ans =
1 0 0
0 1 0
0 0 1
0 1 0
This takes a lot less memory, but many functions cannot handle sparse matrices and will convert them to full matrices anyway. So whether this is worthwhile depends on what you want to do next.
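For readers who want to see the indexing idea outside Octave, here is a minimal plain-Python sketch of the same indicator matrix. The function name `model_matrix` is made up for this illustration, and it assumes the levels are positive integers starting at 1, as in the question.

```python
def model_matrix(levels):
    """Build an indicator (one-hot) matrix as a list of rows.

    Assumes `levels` holds positive integers starting at 1; column k
    (1-based) is set to 1 in every row whose level equals k.
    """
    n_cols = max(levels)
    return [[1 if level == k else 0 for k in range(1, n_cols + 1)]
            for level in levels]


array = [1, 2, 3, 2]
model = model_matrix(array)
for row in model:
    print(row)
```

Each output row has a single 1, in the column given by the corresponding entry of `array`, which is the same layout the sub2ind and sparse calls above produce.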
This is the first time I have dealt with the compressed column storage (CCS) format for matrices. After googling a bit, my understanding is that a matrix with n nonzero elements is stored in CCS as follows:
- a vector A_v of dimensions n x 1 stores the n nonzero elements of the matrix;
- a second vector A_ir of dimensions n x 1 stores the row indices of the nonzero elements;
- a third vector A_jc stores, for each column, the index into A_v of that column's first element, plus a final value which by convention equals n+1 and marks the end of the matrix (pointing, in theory, to a virtual extra column).
So for instance,
if
M = [1 0 4 0 0;
0 3 5 2 0;
2 0 0 4 6;
0 0 7 0 8]
we get
A_v = [1 2 3 4 5 7 2 4 6 8];
A_ir = [1 3 2 1 2 4 2 3 3 4];
A_jc = [1 3 4 7 9 11];
My questions are:
I) is what I wrote correct, or did I misunderstand anything?
II) what if I want to represent a matrix with some columns which are zeroes, e.g.,
M2 = [0 1 0 0 4 0 0;
0 0 3 0 5 2 0;
0 2 0 0 0 4 6;
0 0 0 0 7 0 8]
wouldn't the representation of M2 in CCS be identical to the one of M?
Thanks for the help!
I) is what I wrote correct, or did I misunderstand anything?
You are perfectly correct. However, take care if you use a C or C++ library: offsets and indices then start at 0. I guess you read some Fortran documentation, where indices start at 1. To be clear, here is the C version, which simply shifts the indices of your (correct) Fortran-style answer by one:
A_v = unmodified
A_ir = [0 2 1 0 1 3 1 2 2 3] (in short [1 3 2 1 2 4 2 3 3 4] - 1)
A_jc = [0 2 3 6 8 10] (in short [1 3 4 7 9 11] - 1)
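As a cross-check of the 0-based arrays, here is a small Python sketch that builds them from the dense matrix M. The helper name `to_ccs` is hypothetical and not taken from any library; it just walks the matrix column by column.

```python
def to_ccs(dense):
    """Convert a dense matrix (list of rows) into 0-based CCS arrays.

    Returns (values, row_idx, col_ptr), where col_ptr[c] is the offset
    in `values` of the first stored element of column c and the last
    entry equals the number of nonzeros.
    """
    n_rows, n_cols = len(dense), len(dense[0])
    values, row_idx, col_ptr = [], [], [0]
    for c in range(n_cols):
        for r in range(n_rows):
            if dense[r][c] != 0:
                values.append(dense[r][c])
                row_idx.append(r)
        col_ptr.append(len(values))  # end of column c, start of column c+1
    return values, row_idx, col_ptr


M = [[1, 0, 4, 0, 0],
     [0, 3, 5, 2, 0],
     [2, 0, 0, 4, 6],
     [0, 0, 7, 0, 8]]
A_v, A_ir, A_jc = to_ccs(M)
print(A_v)   # nonzero values, column by column
print(A_ir)  # 0-based row index of each value
print(A_jc)  # 0-based column offsets
```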
II) what if I want to represent a matrix with some columns which are
zeroes, e.g., M2 = [0 1 0 0 4 0 0;
0 0 3 0 5 2 0;
0 2 0 0 0 4 6;
0 0 0 0 7 0 8]
wouldn't the representation of M2 in CCS be identical to the one of M?
If you have an empty column, simply add a new entry to the offset table A_jc. As this column contains no elements, it starts and ends at the same offset, so an empty column shows up as two consecutive equal entries in A_jc. M2 has two empty columns (the first and the fourth), so with indices starting at 0 you have:
A_v = unmodified
A_ir = unmodified
A_jc = [0 0 2 3 3 6 8 10] (to be compared to [0 2 3 6 8 10])
Hence the two representations are different.
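To see why the offset table alone separates the two matrices, the following Python sketch decodes 0-based CCS arrays back into a dense matrix. The function `from_ccs` is a hypothetical helper written for this illustration; M2 has 7 columns, so its offset table has 8 entries.

```python
def from_ccs(values, row_idx, col_ptr, n_rows):
    """Rebuild a dense matrix (list of rows) from 0-based CCS arrays.

    A column c is empty exactly when col_ptr[c] == col_ptr[c + 1].
    """
    n_cols = len(col_ptr) - 1
    dense = [[0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        for k in range(col_ptr[c], col_ptr[c + 1]):
            dense[row_idx[k]][c] = values[k]
    return dense


A_v = [1, 2, 3, 4, 5, 7, 2, 4, 6, 8]
A_ir = [0, 2, 1, 0, 1, 3, 1, 2, 2, 3]

# Same values and row indices, different offset tables.
M = from_ccs(A_v, A_ir, [0, 2, 3, 6, 8, 10], n_rows=4)
M2 = from_ccs(A_v, A_ir, [0, 0, 2, 3, 3, 6, 8, 10], n_rows=4)
for row in M2:
    print(row)
```

The values and row indices are shared; only the repeated offsets (0, 0 and 3, 3) tell the decoder that columns 1 and 4 of M2 hold nothing.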
If you are just starting to learn about sparse matrices, there is an excellent free book here: http://www-users.cs.umn.edu/~saad/IterMethBook_2ndEd.pdf
Given an M x N matrix whose elements are either "." or "*", where "." represents road and "*" represents a block or wall. A person can move to an adjacent cell forward, down, or diagonally. We need to find the maximum number of "." cells the person can cover without being blocked by a wall. Example (in image).
Can you please suggest an efficient algorithm for this problem?
You have to do this: https://en.wikipedia.org/wiki/Flood_fill
Take the biggest flood you can do.
1. Go through your matrix until you find a '.'.
2. Do a flood from that point, counting the elements you flood, and compare that count with the maximum found so far. To make this easy, flood with a letter, a number, or whatever you want except '.', and treat that marker like a wall ('*') so you never try to flood the same area again.
3. Continue through the matrix to find the next '.'. All previously visited '.' have been flooded, so you won't consider the same area twice.
4. Repeat from step 2 until you can't find any more '.'. The maximum then contains your answer.
5. Once you have the answer, you know which letter or number marks the area with the maximum count, so you can go back through the matrix and print the biggest area.
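The steps above can be sketched in Python with an explicit queue instead of recursion, which avoids hitting the recursion limit on large grids. This is only an illustration under stated assumptions: the grid is a list of equal-length strings, movement is allowed to all 8 neighbours (the question's movement rule is ambiguous), and the name `largest_region` is made up here.

```python
from collections import deque


def largest_region(grid):
    """Return the size of the largest 8-connected region of '.' cells."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]  # plays the role of the flood marker
    best = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != '.' or seen[r0][c0]:
                continue
            # Flood from (r0, c0), counting every cell reached.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            size = 0
            while queue:
                r, c = queue.popleft()
                size += 1
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] == '.' and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            best = max(best, size)
    return best


grid = [
    "..*.",
    "****",
    "....",
]
print(largest_region(grid))
```

Every cell is enqueued at most once, so the running time is proportional to the number of cells.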
Are you looking for the exact path or only the number of cells?
Edit: here is a small Python script which creates a random matrix and counts the number of cells in each zone delimited by your "walls".
import numpy as np

matrix = np.random.randint(2, size=(10, 10))
print(matrix)
M, N = matrix.shape
walked = set()  # cells already counted
zonesCount = []

def pathCount(x, y):
    # Count the cells reachable from (x, y), diagonal neighbours included
    if x < 0 or y < 0 or x >= M or y >= N:
        return 0
    if matrix[x, y] == 1:  # I replaced * by 1 and . by 0 for easier generation
        return 0
    if (x, y) in walked:
        return 0
    walked.add((x, y))
    count = 1
    for i in [x - 1, x, x + 1]:
        for j in [y - 1, y, y + 1]:
            if (i, j) != (x, y):
                count += pathCount(i, j)
    return count

for x in range(M):
    for y in range(N):
        if (x, y) not in walked:
            zonesCount.append(pathCount(x, y))
print('Max zone count :', max(zonesCount))
And here is the result:
[[0 0 1 0 0 0 1 0 1 0]
[1 0 1 0 0 0 1 0 1 1]
[0 1 0 0 1 0 0 1 1 1]
[0 0 1 0 0 0 1 1 0 1]
[1 0 1 1 1 1 0 1 1 0]
[1 0 1 1 1 1 0 1 1 0]
[0 0 0 1 1 1 0 0 0 0]
[1 0 0 1 1 0 0 1 1 0]
[0 1 0 1 0 0 1 0 1 1]
[0 1 1 0 0 0 1 0 1 0]]
Max zone count : 50
I need an algorithm in Matlab that counts, without using loops, how many adjacent and non-overlapping (1,1) pairs there are in each row of an m x (n*2) matrix A. E.g.
A=[1 1 1 0 1 1 0 0 0 1; 1 0 1 1 1 1 0 0 1 1] %m=2, n=5
Then I want
B=[2;3] %mx1
Specific case
Assuming A to have ones and zeros only, this could be one way -
B = sum(reshape(sum(reshape(A',2,[]))==2,size(A,2)/2,[]))
General case
If you are looking for a general approach that must work for all integers and a case where you can specify the pattern of numbers, you may use this -
patt = [0 1] %%// pattern to be found out
B = sum(reshape(ismember(reshape(A',2,[])',patt,'rows'),[],2))
Output
With patt = [1 1], B = [2 3]
With patt = [0 1], B = [1 0]
You can transpose and then reshape so that each pair of consecutive values ends up in one column, then compare the top and bottom rows (boolean comparison, or compare the sum of each column to 2), then sum the result of the comparison and reshape it to your liking.
In code, it would look like:
A=[1 1 1 0 1 1 0 0 0 1; 1 0 1 1 1 1 0 0 1 1] ;
m = size(A,1) ;
n = size(A,2)/2 ;
Atemp = reshape(A.' , 2 , [] , m ) ;
B = squeeze(sum(sum(Atemp)==2))
You could pack everything into one line of code if you want, but several lines are usually easier to understand. For clarity, the Atemp matrix looks like this:
Atemp(:,:,1) =
1 1 1 0 0
1 0 1 0 1
Atemp(:,:,2) =
1 1 1 0 1
0 1 1 0 1
You'll notice that each row of the original A matrix has been broken down into 2 rows, so that every column of Atemp holds one pair. The last line of the code simply compares the sum of each column with 2, then sums the successful comparisons.
The squeeze command only removes the singleton dimensions that are no longer needed.
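The same pairing logic can be written as a short plain-Python sketch, which may help when checking results by hand. It assumes every row has an even number of entries and that pairs are aligned at columns (1,2), (3,4), and so on, as in the reshape-based answers; `count_aligned_pairs` is a name invented for this example.

```python
def count_aligned_pairs(matrix, pattern=(1, 1)):
    """Count, per row, the aligned column pairs equal to `pattern`.

    Pairs are taken at positions (0,1), (2,3), ... (0-based), so they
    never overlap.
    """
    target = tuple(pattern)
    return [sum(1 for k in range(0, len(row), 2)
                if (row[k], row[k + 1]) == target)
            for row in matrix]


A = [[1, 1, 1, 0, 1, 1, 0, 0, 0, 1],
     [1, 0, 1, 1, 1, 1, 0, 0, 1, 1]]
print(count_aligned_pairs(A))          # counts of (1, 1) pairs per row
print(count_aligned_pairs(A, (0, 1)))  # counts of (0, 1) pairs per row
```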
You can use imresize (from the Image Processing Toolbox), for example
imresize(A,[size(A,1),size(A,2)/2])>0.8
ans =
1 0 1 0 0
0 1 1 0 1
This places a 1 where you have [1 1] pairs (the 0.8 threshold acts on interpolated values, so check it on your own data); then you can just use sum.
For any pair type [x y] you can do:
x=0; y=1;
R = zeros(size(A,1),size(A,2)/2); % preallocate memory
for n=1:size(A,1)
  b=[A(n,1:2:end)' A(n,2:2:end)']; % one pair per row
  R(n, b(:,1)==x & b(:,2)==y) = 1;
end
R =
0 0 0 0 1
0 0 0 0 0
With diff (to detect the start and end of each run of ones) and accumarray (to group runs of the same row; each run contributes half its length, rounded down):
B = diff([zeros(1,size(A,1)); A.'; zeros(1,size(A,1))]); %'// columnwise is easier
[is, js] = find(B==1); %// rows and columns of starts of runs of ones
[ie, je] = find(B==-1); %// rows and columns of ends of runs of ones
result = accumarray(js, floor((ie-is)/2), [size(A,1) 1]); %// sum values for each row of A; the size argument keeps trailing rows that contain no ones
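The run-based idea can be sketched in Python with itertools.groupby. Note that it answers a slightly different question from the aligned-pair approaches: it counts non-overlapping pairs inside each run of ones wherever the run starts. The function name `count_pairs_in_runs` is an assumption for this illustration.

```python
from itertools import groupby


def count_pairs_in_runs(matrix):
    """Per row, sum floor(run_length / 2) over all runs of ones."""
    return [sum(len(list(group)) // 2
                for value, group in groupby(row) if value == 1)
            for row in matrix]


A = [[1, 1, 1, 0, 1, 1, 0, 0, 0, 1],
     [1, 0, 1, 1, 1, 1, 0, 0, 1, 1]]
print(count_pairs_in_runs(A))

# A row where the two interpretations differ: the run of two ones
# straddles the aligned pairs (0, 1) and (1, 0).
print(count_pairs_in_runs([[0, 1, 1, 0]]))
```

For the example A both interpretations give the same counts, but for a row such as [0 1 1 0] the run-based count is 1 while the aligned-pair count is 0, so it is worth deciding which definition is intended.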
I came across this interview question:
In a N x N bi-dimensional array of boolean elements, how do you determine if the values form a square?
For example:
true true true true
true false false true
true false false true
true true true true
form a square.
I figured that I have to start by checking if there is a square in the middle (if N is odd that is always true) and then recursively checking the values at the perimeter.
Is this the best way to do it or is there a better, faster, way to find out?
A square can be determined by two points: say the top-left point (x1,y1) and the bottom-right point (x2,y2). Let's use 1 for true and 0 for false.
Consider an array:
array = [None] * 5
array[0] = [1, 1, 1, 1, 0]
array[1] = [1, 0, 0, 1, 0]
array[2] = [1, 0, 0, 1, 0]
array[3] = [1, 1, 1, 1, 0]
array[4] = [1, 0, 0, 1, 0]
Obviously, (0,0)(3,3) forms a square in this case. We can find the following property:
A square is formed if and only if:
- adding the two row borders together element-wise gives a run of 2s, and the length of that run equals the distance between the two row borders plus one (the side of the square);
- adding the two column borders together element-wise gives a run of 2s, and the length of that run equals the distance between the two column borders plus one.
Exploiting the property above, you get this algorithm:
row_segment = []
col_segment = []
for v1 in range(len(array)):
    for v2 in range(v1+1, len(array)):
        # element-wise sums of the two candidate borders
        add_row = [array[v1][col]+array[v2][col] for col in range(len(array))]
        add_col = [array[row][v1]+array[row][v2] for row in range(len(array))]
        row_distance = v2-v1
        # sliding window of width row_distance+1 over both sums
        row_sum = sum(add_row[:row_distance+1])
        col_sum = sum(add_col[:row_distance+1])
        for i in range(len(array)-row_distance):
            j = i+row_distance
            if row_sum == 2*(row_distance+1):
                row_segment.append([v1, i, v2, j])
            if col_sum == 2*(row_distance+1):
                col_segment.append([i, v1, j, v2])
            row_sum = row_sum - add_row[i] + add_row[j+1] if j+1 < len(array) else None
            col_sum = col_sum - add_col[i] + add_col[j+1] if j+1 < len(array) else None
# a square needs matching row borders and column borders
for i in row_segment:
    if i in col_segment:
        print("Square ({x1}, {y1}) ({x2}, {y2})".format(x1=i[0], y1=i[1], x2=i[2], y2=i[3]))
Let's run some tests:
Test 1:
0 0 0 0 0
0 0 1 1 1
0 0 1 0 1
0 0 1 1 1
0 0 0 0 0
Square (1, 2) (3, 4)
Test 2:
0 0 0 0 0
1 1 1 1 1
1 0 1 0 1
1 1 1 0 1
0 0 1 1 1
Square (1, 0) (3, 2)
Test 3:
0 0 0 0 0
1 1 1 1 1
1 0 1 0 1
1 1 1 1 1
0 0 0 0 0
Square (1, 0) (3, 2)
Square (1, 2) (3, 4)
Test 4:
1 1 1 1 1
1 0 1 0 1
1 1 0 0 1
1 0 0 1 0
1 1 1 1 1
No squares found
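A brute-force checker is a convenient way to cross-check results like the ones above. The sketch below is not the sliding-window algorithm from the answer; it simply tests every candidate square's four borders directly, assuming an N x N grid of 0/1 values and squares of side at least 2. The name `find_squares` is hypothetical.

```python
def find_squares(grid):
    """Return (x1, y1, x2, y2) for every square whose border is all ones.

    (x1, y1) is the top-left corner and (x2, y2) the bottom-right one,
    both as (row, column); only squares with x2 > x1 are reported.
    """
    n = len(grid)
    found = []
    for x1 in range(n):
        for y1 in range(n):
            # d is the distance between opposite borders
            for d in range(1, n - max(x1, y1)):
                x2, y2 = x1 + d, y1 + d
                border_ok = all(
                    grid[x1][y1 + k] and grid[x2][y1 + k]      # top and bottom rows
                    and grid[x1 + k][y1] and grid[x1 + k][y2]  # left and right columns
                    for k in range(d + 1)
                )
                if border_ok:
                    found.append((x1, y1, x2, y2))
    return found


test3 = [[0, 0, 0, 0, 0],
         [1, 1, 1, 1, 1],
         [1, 0, 1, 0, 1],
         [1, 1, 1, 1, 1],
         [0, 0, 0, 0, 0]]
test4 = [[1, 1, 1, 1, 1],
         [1, 0, 1, 0, 1],
         [1, 1, 0, 0, 1],
         [1, 0, 0, 1, 0],
         [1, 1, 1, 1, 1]]
print(find_squares(test3))
print(find_squares(test4))
```

This costs roughly O(N^4) in the worst case, so it is meant for validating small inputs rather than as a faster alternative.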
I have this matrix:
S.No. A B
1 5268020 1756
2 15106230 5241
3 24298744 9591
4 23197375 9129
I want to get a matrix with two columns [X,Y]. X takes its values from S.No. and Y can be either 1 or 0. For example, for the row 1 5268020 1756 there should be a total of 5268020 (1,0) pairs, i.e. (X,Y) pairs, and 1756 (1,1) pairs.
How can I get this matrix in Octave?
If I understand your question correctly, you want to fill a matrix with repeated entries (x,0) and (x,1), where x=1...4, and where the number of repetitions is given by the values in columns A and B. Given the values you supplied, that is going to be a huge matrix (67,896,086 rows). So, you could try something like this (m below has fewer elements for illustrative purposes; replace it with your data):
m = [1, 2, 1;
2, 3, 2;
3, 2, 1;
4, 2, 2];
res = [];
for k = 1:4
  res = [res ; [k*ones(m(k, 2), 1), zeros(m(k, 2), 1);
                k*ones(m(k, 3), 1), ones(m(k, 3), 1)]];
endfor
which yields
res =
1 0
1 0
1 1
2 0
2 0
2 0
2 1
2 1
3 0
3 0
3 1
4 0
4 0
4 1
4 1
Out of curiosity, is there any reason not to consider a matrix like this?
1 0 n
1 1 m
2 0 p
2 1 q
...
where n, m, p, q are the values found in columns A and B. This would probably be easier to handle, no?
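To illustrate the compact alternative, here is a minimal Python sketch that keeps only the counts and expands them into (X, Y) pairs lazily, so the tens of millions of rows never have to sit in memory at once. The generator name `expand_pairs` and the small table `m` are assumptions made for this example (the table mirrors the illustrative m used above).

```python
from itertools import repeat


def expand_pairs(table):
    """Lazily yield (x, y) pairs from rows of (x, n_zeros, n_ones)."""
    for x, n_zeros, n_ones in table:
        yield from repeat((x, 0), n_zeros)  # n_zeros copies of (x, 0)
        yield from repeat((x, 1), n_ones)   # n_ones copies of (x, 1)


# Small stand-in for the real table: S.No., count for Y=0, count for Y=1.
m = [(1, 2, 1),
     (2, 3, 2),
     (3, 2, 1),
     (4, 2, 2)]

pairs = list(expand_pairs(m))  # materialized here only because m is tiny
print(len(pairs))
print(pairs[:4])
```

With the real counts, one would iterate over `expand_pairs(...)` directly, or work with the count table itself, rather than build the full list.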