I have the following data:
client_id <- c(1,2,3,1,2,3)
product_id <- c(10,10,10,20,20,20)
connected <- c(1,1,0,1,0,0)
clientID_productID <- paste0(client_id,";",product_id)
df <- data.frame(client_id, product_id,connected,clientID_productID)
  client_id product_id connected clientID_productID
1         1         10         1               1;10
2         2         10         1               2;10
3         3         10         0               3;10
4         1         20         1               1;20
5         2         20         0               2;20
6         3         20         0               3;20
The goal is to produce a relational matrix:
  client_id product_id clientID_productID client_pro_1_10 client_pro_2_10 client_pro_3_10 client_pro_1_20 client_pro_2_20 client_pro_3_20
1         1         10               1;10               0               1               0               0               0               0
2         2         10               2;10               1               0               0               0               0               0
3         3         10               3;10               0               0               0               0               0               0
4         1         20               1;20               0               0               0               0               0               0
5         2         20               2;20               0               0               0               0               0               0
6         3         20               3;20               0               0               0               0               0               0
In other words, when product_id equals 10, clients 1 and 2 are connected. Importantly, I do not want client 1 to be connected with herself. When product_id=20, I have only one client, meaning that there is no connection, so I should have only zeros.
To be more specific, all that I am trying to create is a square matrix of relations, with all the combinations of client/product in the columns. A client can only be connected with another if they bought the same product.
I have searched a bunch and played with other code. The difference between this problem and others already answered is that I want to keep on my table client number 3, even though she never bought any product. I want to show that she does not have a relationship with any other client. Right now, I am able to create the matrix by stacking the relationships by product (How to create relational matrix in R?), but I am struggling with a way to not stack them.
I apologize if the question is not specific enough, or too specific. Thank you anyway, stackoverflow is a lifesaver for beginners.
I believe I figured it out.
It is for sure not the most elegant answer, though.
library(dplyr)     # provides inner_join()
library(reshape2)  # provides dcast()
client_id <- c(1,2,3,1,2,3)
product_id <- c(10,10,10,20,20,20)
connected <- c(1,1,0,1,0,0)
clientID_productID <- paste0(client_id,";",product_id)
df <- data.frame(client_id, product_id,connected,clientID_productID)
df2 <- inner_join(df[c(1:3)], df[c(1:3)], by = c("product_id", "connected"))
df2$Source <- paste0(df2$client_id.x,"|",df2$product_id)
df2$Target <- paste0(df2$client_id.y,"|",df2$product_id)
df2 <- df2[order(df2$product_id),]
indices = unique(as.character(df2$Source))
mtx <- as.matrix(dcast(df2, Source ~ Target, value.var="connected", fill=0))
rownames(mtx) = mtx[,"Source"]
mtx <- mtx[,-1]
diag(mtx)=0
mtx = as.data.frame(mtx)
mtx = mtx[indices, indices]
I got the result I wanted:
     1|10 2|10 3|10 1|20 2|20 3|20
1|10    0    1    0    0    0    0
2|10    1    0    0    0    0    0
3|10    0    0    0    0    0    0
1|20    0    0    0    0    0    0
2|20    0    0    0    0    0    0
3|20    0    0    0    0    0    0
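For readers who want to check the rule independently of dplyr and reshape2, here is a minimal Python sketch of the same matrix. It assumes the rule stated in the question: two client/product rows are related only if they share a product, belong to different clients, and both have connected equal to 1. The function name relation_matrix and the tuple layout are made up for this illustration.

```python
def relation_matrix(rows):
    """Build a square 0/1 matrix over (client_id, product_id, connected) rows.

    Entry [i][j] is 1 only when rows i and j share a product, belong to
    different clients, and both have connected == 1.
    """
    labels = [f"{client}|{product}" for client, product, _ in rows]
    matrix = []
    for client_a, product_a, on_a in rows:
        matrix.append([
            1 if (product_a == product_b and client_a != client_b
                  and on_a == 1 and on_b == 1) else 0
            for client_b, product_b, on_b in rows
        ])
    return labels, matrix


# Same data as in the question: (client_id, product_id, connected)
DATA = [(1, 10, 1), (2, 10, 1), (3, 10, 0),
        (1, 20, 1), (2, 20, 0), (3, 20, 0)]

labels, matrix = relation_matrix(DATA)
for label, row in zip(labels, matrix):
    print(label, row)
```

Because every client/product row gets its own row and column, client 3 stays in the matrix with all zeros, and the diagonal is zero by construction.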
I recently worked on a task where I needed to identify new clients.
I found something similar on Google, and the final result was the measure below, which I don't fully understand. Maybe you can help me understand the logic behind it. I wrongly assumed that the second filter should be >=MIN(Sheet1[Data]), not <MIN(Sheet1[Data]).
I improvised some data along with the formula.
new_cust =
CALCULATE(
DISTINCTCOUNT(Sheet1[Cust_id])
,FILTER(
ALL(Sheet1[Data])
,Sheet1[Data]<=MAX(Sheet1[Data])
)
)
-
CALCULATE(
DISTINCTCOUNT(Sheet1[Cust_id])
,FILTER(
ALL(Sheet1[Data])
,Sheet1[Data]<MIN(Sheet1[Data])
)
)
Cust_id Data New_Cust
1 1/1/2023 1
1 1/2/2023 0
2 1/3/2023 1
2 1/4/2023 0
2 1/5/2023 0
3 1/6/2023 1
3 1/7/2023 0
1 2/1/2023 0
1 2/2/2023 0
3 2/3/2023 0
3 2/4/2023 0
3 2/5/2023 0
4 2/6/2023 1
4 2/7/2023 0
4 2/8/2023 0
1 3/1/2023 0
1 3/2/2023 0
2 3/3/2023 0
2 3/4/2023 0
3 3/5/2023 0
3 3/6/2023 0
4 3/7/2023 0
4 3/8/2023 0
6 3/9/2023 1
6 3/10/2023 0
Thank you in advance for your understanding and help
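One way to read the measure, sketched in Python rather than DAX: the first CALCULATE counts every distinct customer seen up to the end of the current period, and the second counts every distinct customer seen strictly before the period starts. Their difference is the number of customers whose first purchase falls inside the period. That would explain the <MIN: the second term has to cover only the time before the period. With >=MIN it would count customers inside the period instead. The names SALES and new_customers are made up for this illustration, and the dates are assumed to be month/day/year.

```python
from datetime import date

# (customer id, month, day) transcribed from the sample table, all in 2023
RAW = [(1, 1, 1), (1, 1, 2), (2, 1, 3), (2, 1, 4), (2, 1, 5), (3, 1, 6),
       (3, 1, 7), (1, 2, 1), (1, 2, 2), (3, 2, 3), (3, 2, 4), (3, 2, 5),
       (4, 2, 6), (4, 2, 7), (4, 2, 8), (1, 3, 1), (1, 3, 2), (2, 3, 3),
       (2, 3, 4), (3, 3, 5), (3, 3, 6), (4, 3, 7), (4, 3, 8), (6, 3, 9),
       (6, 3, 10)]
SALES = [(cust, date(2023, month, day)) for cust, month, day in RAW]


def new_customers(sales, period_start, period_end):
    """Distinct customers seen up to period_end minus those seen before period_start."""
    seen_up_to_end = {cust for cust, day in sales if day <= period_end}
    seen_before_start = {cust for cust, day in sales if day < period_start}
    return len(seen_up_to_end) - len(seen_before_start)


# Evaluated per row (period = a single day) this mirrors the New_Cust column
per_row = [new_customers(SALES, day, day) for _, day in SALES]
print(per_row)

# Evaluated per month it gives the number of first-time customers in that month
print(new_customers(SALES, date(2023, 1, 1), date(2023, 1, 31)))
print(new_customers(SALES, date(2023, 2, 1), date(2023, 2, 28)))
```

In the DAX version, ALL(Sheet1[Data]) plays the role of scanning the whole sales list regardless of the current date filter, while MIN and MAX pick up the start and end of the period from the filter context.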
I am trying to find islands of numbers in a matrix.
By an island, I mean a rectangular area where ones are connected with each other horizontally, vertically, or diagonally, including the boundary layer of zeros.
Suppose I have this matrix:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1
0 0 0 1 1 1 0 1 1 0 0 0 1 1 1 1 0
0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1
0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0
0 0 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
By boundary layer, I mean rows 2 and 7 and columns 3 and 10 for island #1.
This is shown below:
I want the row and column indices of the islands. So for the above matrix, the desired output is:
isl{1}= {[2 3 4 5 6 7]; % row indices of island#1
[3 4 5 6 7 8 9 10]} % column indices of island#1
isl{2}= {[2 3 4 5 6 7]; % row indices of island#2
[12 13 14 15 16 17]}; % column indices of island#2
isl{3} ={[9 10 11 12]; % row indices of island#3
[2 3 4 5 6 7 8 9 10 11];} % column indices of island#3
It doesn't matter which island is detected first.
I know that [r,c] = find(matrix) gives the row and column indices of the ones, but I have no idea how to detect which ones are connected, since they can be connected horizontally, vertically, or diagonally.
Any ideas on how to deal with this problem?
You should look at the BoundingBox and ConvexHull stats returned by regionprops:
a = imread('circlesBrightDark.png');
bw = a < 100;
s = regionprops('table',bw,'BoundingBox','ConvexHull')
https://www.mathworks.com/help/images/ref/regionprops.html
Finding the connected components and their bounding boxes is the easy part. The more difficult part is merging the bounding boxes into islands.
Bounding Boxes
First the easy part.
function bBoxes = getIslandBoxes(lMap)
% find bounding box of each candidate island
% lMap is a logical matrix containing zero or more connected components
bw = bwlabel(lMap); % label connected components in logical matrix
bBoxes = struct2cell(regionprops(bw, 'BoundingBox')); % get bounding boxes
bBoxes = cellfun(@round, bBoxes, 'UniformOutput', false); % round values
end
The values are rounded because each bounding box returned by regionprops lies on the grid lines just outside its component rather than on the cell centers, and we need integer values to use as subscripts into the matrix. For example, a component that looks like this:
0 0 0
0 1 0
0 0 0
will have a bounding box of
[ 1.5000 1.5000 1.0000 1.0000 ]
which we round to
[ 2 2 1 1]
Merging
Now the hard part. First, the merge condition:
We merge bounding box b2 into bounding box b1 if b2 and the island of b1 (including the boundary layer) have a non-null intersection.
This condition ensures that bounding boxes are merged when one component is wholly or partially inside the bounding box of another, but it also catches the edge cases when a bounding box is within the zero boundary of another. Once all of the bounding boxes are merged, they are guaranteed to have a boundary of all zeros (or border the edge of the matrix), otherwise the nonzero value in its boundary would have been merged.
Since merging involves deleting the merged bounding box, the loops are done backwards so that we don't end up indexing non-existent array elements.
Unfortunately, making one pass through the array comparing each element to all the others is insufficient to catch all cases. To signal that all of the possible bounding boxes have been merged into islands, we use a flag called anyMerged and loop until we get through one complete iteration without merging anything.
function mBoxes = mergeBoxes(bBoxes)
% find bounding boxes that intersect, and merge them
mBoxes = bBoxes;
% merge bounding boxes that overlap
anyMerged = true; % flag to show when we've finished
while (anyMerged)
anyMerged = false; % no boxes merged on this iteration so far...
for box1 = numel(mBoxes):-1:2
for box2 = box1-1:-1:1
% if intersection between bounding boxes is > 0, merge
% the size of box1 is increased by 1 on all sides...
% this is so that components that lie within the borders
% of another component, but not inside the bounding box,
% are merged
if (rectint(mBoxes{box1} + [-1 -1 2 2], mBoxes{box2}) > 0)
coords1 = rect2corners(mBoxes{box1});
coords2 = rect2corners(mBoxes{box2});
minX = min(coords1(1), coords2(1));
minY = min(coords1(2), coords2(2));
maxX = max(coords1(3), coords2(3));
maxY = max(coords1(4), coords2(4));
mBoxes{box2} = [minX, minY, maxX-minX+1, maxY-minY+1]; % merge
mBoxes(box1) = []; % delete redundant bounding box
anyMerged = true; % bounding boxes merged: loop again
break;
end
end
end
end
end
The merge function uses a small utility function that converts rectangles with the format [x y width height] to a vector of subscripts for the top-left, bottom-right corners [x1 y1 x2 y2]. (This was actually used in another function to check that an island had a zero border, but as discussed above, this check is unnecessary.)
function corners = rect2corners(rect)
% change from rect = x, y, width, height
% to corners = x1, y1, x2, y2
corners = [rect(1), ...
rect(2), ...
rect(1) + rect(3) - 1, ...
rect(2) + rect(4) - 1];
end
Output Formatting and Driver Function
The return value from mergeBoxes is a cell array of rectangle objects. If you find this format useful, you can stop here, but it's easy to get to the format requested with ranges of rows and columns for each island:
function rRanges = rect2range(bBoxes, mSize)
% convert rect = x, y, width, height to
% range = y:y+height-1; x:x+width-1
% and expand range by 1 in all 4 directions to include zero border,
% making sure to stay within borders of original matrix
rangeFun = @(rect) {max(rect(2)-1,1):min(rect(2)+rect(4),mSize(1));...
max(rect(1)-1,1):min(rect(1)+rect(3),mSize(2))};
rRanges = cellfun(rangeFun, bBoxes, 'UniformOutput', false);
end
All that's left is a main function to tie all of the others together and we're done.
function theIslands = getIslandRects(m)
% get rectangle around each component in map
lMap = logical(m);
% get the bounding boxes of candidate islands
bBoxes = getIslandBoxes(lMap);
% merge bounding boxes that overlap
bBoxes = mergeBoxes(bBoxes);
% convert bounding boxes to row/column ranges
theIslands = rect2range(bBoxes, size(lMap));
end
Here's a run using the sample matrix given in the question:
M =
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1
0 0 0 1 1 1 0 1 1 0 0 0 1 1 1 1 0
0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1
0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0
0 0 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> getIslandRects(M)
ans =
{
[1,1] =
{
[1,1] =
9 10 11 12
[2,1] =
2 3 4 5 6 7 8 9 10 11
}
[1,2] =
{
[1,1] =
2 3 4 5 6 7
[2,1] =
3 4 5 6 7 8 9 10
}
[1,3] =
{
[1,1] =
2 3 4 5 6 7
[2,1] =
12 13 14 15 16 17
}
}
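For anyone without the Image Processing Toolbox, the same pipeline can be sketched in plain Python. This is only an illustration of the idea described above, not a port of the MATLAB code: a breadth-first flood fill stands in for bwlabel and regionprops, boxes are stored as 0-based inclusive corners (r1, c1, r2, c2), and the function names are made up.

```python
from collections import deque


def component_boxes(grid):
    """Bounding box (r1, c1, r2, c2) of every 8-connected component of ones."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            r1, c1, r2, c2 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r1, c1, r2, c2 = min(r1, y), min(c1, x), max(r2, y), max(c2, x)
                for ny in (y - 1, y, y + 1):
                    for nx in (x - 1, x, x + 1):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((r1, c1, r2, c2))
    return boxes


def touches(a, b):
    """True if box b overlaps box a grown by one cell on every side."""
    return (b[0] <= a[2] + 1 and b[2] >= a[0] - 1
            and b[1] <= a[3] + 1 and b[3] >= a[1] - 1)


def merge_boxes(boxes):
    """Merge touching boxes until a full pass merges nothing."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(len(boxes) - 1, i, -1):  # backwards: deleting is safe
                if touches(boxes[i], boxes[j]):
                    a, b = boxes[i], boxes[j]
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
    return boxes


def islands(grid):
    """1-based (row indices, column indices) per island, zero border included."""
    rows, cols = len(grid), len(grid[0])
    return [(list(range(max(r1 - 1, 0) + 1, min(r2 + 1, rows - 1) + 2)),
             list(range(max(c1 - 1, 0) + 1, min(c2 + 1, cols - 1) + 2)))
            for r1, c1, r2, c2 in merge_boxes(component_boxes(grid))]


ROWS = ["00000000000000000", "00000000000000000", "00000011000000001",
        "00011101100011110", "00000010100001111", "00010101100010000",
        "00000000000000000", "00000000000000000", "00000000000000000",
        "00010101110000000", "00101111100000000", "00000000000000000"]
GRID = [[int(ch) for ch in row] for row in ROWS]

for row_idx, col_idx in islands(GRID):
    print(row_idx, col_idx)
```

On the sample matrix, the isolated one at row 6, column 4 forms its own component, which is exactly the case the merge step is meant to absorb into island #1.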
Quite easy!
Just use bwboundaries to get the boundaries of each of the blobs. You can then take the min and max in the x and y directions of each boundary to build your box.
Use image dilation and regionprops
mat = [...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1;
0 0 0 1 1 1 0 1 1 0 0 0 1 1 1 1 0;
0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1;
0 0 0 1 0 1 0 1 1 0 0 0 1 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 0 0;
0 0 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0];
mat=logical(mat);
dil_mat=imdilate(mat,true(2,2)); %here we make bridges to 1 px away ones
l_mat=bwlabel(dil_mat,8);
bb = regionprops(l_mat,'BoundingBox');
bb = struct2cell(bb); bb = cellfun(@(x) fix(x), bb, 'un',0);
isl = cellfun(@(x) {max(1,x(2)):min(x(2)+x(4),size(mat,1)),...
max(1,x(1)):min(x(1)+x(3),size(mat,2))},bb,'un',0);
I have a set N with n = 4 elements, and I am interested in its possible adjacent combinations (runs of consecutive elements, plus the empty combination).
So the total number of possible combinations is c = 11, which can be calculated with the formula c = n^2/2 + n/2 + 1.
I can model this using an n-by-c matrix A whose elements can be represented as a(n,c).
I have tried to implement this in MATLAB, but since I have hard-coded the above math my code is not extensible for cases where n > 4:
n=4;
c=((n^2)/2)+(n/2)+1;
A=zeros(n,c);
for i=1:n
A(i,i+1)=1;
end
for i=1:n-1
A(i,n+i+1)=1;
A(i+1,n+i+1)=1;
end
for i=1:n-2
A(i,n+i+4)=1;
A(i+1,n+i+4)=1;
A(i+2,n+i+4)=1;
end
for i=1:n-3
A(i,n+i+6)=1;
A(i+1,n+i+6)=1;
A(i+2,n+i+6)=1;
A(i+3,n+i+6)=1;
end
Is there a relatively low complexity method to transform this problem in MATLAB with n number of elements of set N, following my above mathematical formulation?
The easy way to go about this is to take a bit pattern with the first k bits set and shift it down n - k times, saving each shifted column vector to the result. So, starting from
1
0
0
0
Shift 1, 2, and 3 times to get
|1 0 0 0|
|0 1 0 0|
|0 0 1 0|
|0 0 0 1|
We'll use circshift to achieve this.
function A = adjcombs(n)
c = (n^2 + n)/2 + 1; % number of combinations
A = zeros(n,c); % preallocate output array
col_idx = 1; % skip the first (all-zero) column
curr_col = zeros(n,1); % column vector containing current combination
for elem_count = 1:n
curr_col(elem_count) = 1; % add another element to our combination
for shift_count = 0:(n - elem_count)
col_idx = col_idx + 1; % increment column index
% shift the current column and insert it at the proper index
A(:,col_idx) = circshift(curr_col, shift_count);
end
end
end
Calling the function with n = 4 and 6 we get:
>> A = adjcombs(4)
A =
0 1 0 0 0 1 0 0 1 0 1
0 0 1 0 0 1 1 0 1 1 1
0 0 0 1 0 0 1 1 1 1 1
0 0 0 0 1 0 0 1 0 1 1
>> A = adjcombs(6)
A =
0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 1 0 1
0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 1 1 0 1 1 1
0 0 0 1 0 0 0 0 1 1 0 0 1 1 1 0 1 1 1 1 1 1
0 0 0 0 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 1 1 1
0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 1 0 1 1 1 1 1
0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 1 0 1 1
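The same construction can be cross-checked without circshift. The Python sketch below is an illustration only, with a made-up function name: instead of shifting a column, it writes each run of consecutive ones directly, which produces the same columns in the same order (the all-zero column first, then runs of length 1, 2, and so on up to n).

```python
def adjacent_combinations(n):
    """Return the n-by-c 0/1 matrix of adjacent combinations as a list of rows."""
    columns = [[0] * n]                      # first column: the empty combination
    for size in range(1, n + 1):             # length of the run of ones
        for start in range(n - size + 1):    # position of the first one
            columns.append([1 if start <= i < start + size else 0
                            for i in range(n)])
    # columns were built one combination at a time; transpose to get rows
    return [list(row) for row in zip(*columns)]


A = adjacent_combinations(4)
for row in A:
    print(*row)
```

The number of columns is 1 + n + (n-1) + ... + 1, which equals (n^2 + n)/2 + 1, the same count used to preallocate A in the MATLAB function.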
I would like to create an incidence matrix.
I have a file with 3 columns, like:
id x y
A 22 2
B 4 21
C 21 360
D 26 2
E 22 58
F 2 347
And I want a matrix like (without col and row names):
2 4 21 22 26 58 347 360
A 1 0 0 1 0 0 0 0
B 0 1 1 0 0 0 0 0
C 0 0 1 0 0 0 0 1
D 1 0 0 0 1 0 0 0
E 0 0 0 1 0 1 0 0
F 1 0 0 0 0 0 1 0
I have started the code like:
haps = readdlm("File.txt",header=true)
hap1_2 = map(Int64,haps[1][:,2:end])
ID = (haps[1][:,1])
dic1 = Dict()
for (i in 1:21)
dic1[ID[i]] = hap1_2[i,:]
end
X=[zeros(21,22)]; #the original file has 21 rows and 22 columns
X1 = hcat(ID,X)
The problem now is that I don't know how to fill the matrix with 1s in the specific columns as in the example above.
I'm also not sure if I'm on the right way.
Any suggestion that could help me??
Thanks!
NamedArrays is a neat package which allows naming both rows and columns and seems to fit the bill for this problem. Suppose the data is in data.csv, here is one method to go about it (install NamedArrays with Pkg.add("NamedArrays")):
data,header = readcsv("data.csv",header=true);
# get the column names by looking at unique values in columns
cols = unique(vec([(header[j+1],data[i,j+1]) for i in 1:size(data,1),j=1:2]))
# row names from ID column
rows = data[:,1]
using NamedArrays
narr = NamedArray(zeros(Int,length(rows),length(cols)),(rows,cols),("id","attr"));
# now stamp in the 1s in the right places
for r=1:size(data,1),c=2:size(data,2) narr[data[r,1],(header[c],data[r,c])] = 1 ; end
Now we have (note I transposed narr for better printout):
julia> narr'
10x6 NamedArray{Int64,2}:
attr ╲ id │ A B C D E F
──────────┼─────────────────
("x",22) │ 1 0 0 0 1 0
("x",4) │ 0 1 0 0 0 0
("x",21) │ 0 0 1 0 0 0
("x",26) │ 0 0 0 1 0 0
("x",2) │ 0 0 0 0 0 1
("y",2) │ 1 0 0 1 0 0
("y",21) │ 0 1 0 0 0 0
("y",360) │ 0 0 1 0 0 0
("y",58) │ 0 0 0 0 1 0
("y",347) │ 0 0 0 0 0 1
But, if DataFrames are necessary, similar tricks should apply.
---------- UPDATE ----------
In case the column of a value should be ignored i.e. x=2 and y=2 should both set a 1 on column for value 2, then the code becomes:
using NamedArrays
data,header = readcsv("data.csv",header=true);
rows = data[:,1]
cols = map(string,sort(unique(vec(data[:,2:end]))))
narr = NamedArray(zeros(Int,length(rows),length(cols)),(rows,cols),("id","attr"));
for r=1:size(data,1),c=2:size(data,2) narr[data[r,1],string(data[r,c])] = 1 ; end
giving:
julia> narr
6x8 NamedArray{Int64,2}:
id ╲ attr │ 2 4 21 22 26 58 347 360
──────────┼───────────────────────────────────────
A │ 1 0 0 1 0 0 0 0
B │ 0 1 1 0 0 0 0 0
C │ 0 0 1 0 0 0 0 1
D │ 1 0 0 0 1 0 0 0
E │ 0 0 0 1 0 1 0 0
F │ 1 0 0 0 0 0 1 0
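The logic of the updated version does not depend on NamedArrays. As a language-neutral illustration, here is a small Python sketch of the same steps: collect the sorted unique values as columns, map each value to its column position, then stamp in the 1s. The function name incidence_matrix and the tuple layout are made up for this example.

```python
def incidence_matrix(records):
    """records: list of (id, x, y). Returns (ids, column values, 0/1 matrix)."""
    ids = [rec[0] for rec in records]
    # one column per distinct value, regardless of whether it came from x or y
    values = sorted({value for rec in records for value in rec[1:]})
    column_of = {value: j for j, value in enumerate(values)}
    matrix = [[0] * len(values) for _ in records]
    for i, rec in enumerate(records):
        for value in rec[1:]:
            matrix[i][column_of[value]] = 1
    return ids, values, matrix


RECORDS = [("A", 22, 2), ("B", 4, 21), ("C", 21, 360),
           ("D", 26, 2), ("E", 22, 58), ("F", 2, 347)]

ids, values, matrix = incidence_matrix(RECORDS)
print("  ", values)
for name, row in zip(ids, matrix):
    print(name, row)
```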
Here is a slight variation on something that I use for creating sparse matrices out of categorical variables for regression analyses. The function includes a variety of comments and options to suit it to your needs. Note: as written, it treats the appearances of "2" and "21" in x and y as separate. It is far less elegant in naming and appearance than the nice response from Dan Getz. The main advantage here is that it works with sparse matrices so if your data is huge, this will be helpful in reducing storage space and computation time.
function OneHot(x::Array, header::Bool)
UniqueVals = unique(x)
Val_to_Idx = [Val => Idx for (Idx, Val) in enumerate(unique(x))] ## create a dictionary that maps unique values in the input array to column positions in the new sparse matrix.
ColIdx = convert(Array{Int64}, [Val_to_Idx[Val] for Val in x])
MySparse = sparse(collect(1:length(x)), ColIdx, ones(Int32, length(x)))
if header
return [UniqueVals' ; MySparse] ## note: this won't be sparse
## alternatively use return (MySparse, UniqueVals) to get a tuple, second element is the header which you can then feed to something to name the columns or do whatever else with
else
return MySparse ## use MySparse[:, 2:end] to drop a value (which you would want to do for categorical variables in a regression)
end
end
x = [22, 4, 21, 26, 22, 2];
y = [2, 21, 360, 2, 58, 347];
Incidence = [OneHot(x, true) OneHot(y, true)]
7x10 Array{Int64,2}:
22 4 21 26 2 2 21 360 58 347
1 0 0 0 0 1 0 0 0 0
0 1 0 0 0 0 1 0 0 0
0 0 1 0 0 0 0 1 0 0
0 0 0 1 0 1 0 0 0 0
1 0 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 1
Let Y be a vector of length N, containing numbers from 1 to 10. As example code you can use:
Y = vec(1:10);
I am writing code that must create an N x 10 matrix, each row consisting of all zeros except for a 1 in the position that corresponds to the number in vector Y. Thus, 1 in Y becomes 1000000000, 3 becomes 0010000000, and so on.
This approach works:
cell2mat(arrayfun(@(x)eye(10)(x,:), Y, 'UniformOutput', false))
My next idea was to "optimize", so eye(10) is not generated N times, and I wrote this:
theEye = eye(10);
cell2mat(arrayfun(@(x)theEye(x,:), Y, 'UniformOutput', false))
However, now Octave is giving me error:
error: can't perform indexing operations for diagonal matrix type
error: evaluating argument list element number 1
Why do I get this error? What is wrong?
Bonus questions — do you see a better way to do what I am doing? Is my attempt to optimize making things easier for Octave?
I ran this code in Octave and eye creates a matrix of a class (or whatever this is) known as a Diagonal Matrix:
octave:3> theEye = eye(10);
octave:4> theEye
theEye =
Diagonal Matrix
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 1
In fact, the documentation for Octave says that if the matrix is diagonal, a special object is created to handle the diagonal matrices instead of a standard matrix: https://www.gnu.org/software/octave/doc/interpreter/Creating-Diagonal-Matrices.html
What's interesting is that we can slice into this matrix outside of the arrayfun call, regardless of it being in a separate class.
octave:1> theEye = eye(10);
octave:2> theEye(1,:)
ans =
Diagonal Matrix
1 0 0 0 0 0 0 0 0 0
However, as soon as we put this into an arrayfun call, it decides to crap out:
octave:5> arrayfun(@(x)theEye(x,:), 1:3, 'uni', 0)
error: can't perform indexing operations for diagonal matrix type
This to me doesn't make any sense, especially since we can slice into it outside of arrayfun. One may suspect that it has something to do with arrayfun and since you are specifying UniformOutput to be false, a cell array of elements is returned per element in Y and perhaps something is going wrong when storing these slices into each cell array element.
However, this doesn't seem to be the culprit either. I took the first three rows of theEye, placed them into a cell array and merged them together using cell2mat:
octave:6> cell2mat({theEye(1,:); theEye(2,:); theEye(3,:)})
ans =
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
As such, I suspect that it may be some sort of internal bug (if you could call it that...). Thanks to user carandraug (see comment above), this is indeed a bug and it has been reported: https://savannah.gnu.org/bugs/?47510. What may also provide insight is that this code runs as expected in MATLAB.
In any case, one thing you can take away from this is that I would seriously refrain from using cell2mat. Just use straight up indexing:
Y = vec(1:10);
theEye = eye(10);
out = theEye(Y,:);
This would index into theEye and extract out the relevant rows stored in Y and create a matrix where each row is zero except for the corresponding value seen in each element Y.
Also, have a look at this post for a similar example: Replace specific columns in a matrix with a constant column vector
However, it is defined over the columns instead of the rows, but it's very similar to what you want to achieve.
Another approach; We start with the data:
>> len = 10; % max number
>> vec = randi(len, [1 7]) % vector of numbers
vec =
1 10 9 5 7 3 6
Now we build the indicator matrix:
>> I = full(sparse(1:numel(vec), vec, 1, numel(vec), len))
I =
1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0
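Both answers boil down to "row i has a single 1 in column vec(i)". As a quick cross-check outside Octave, here is that rule as a Python sketch. The function name one_hot_rows is made up, and the values are assumed to be 1-based, as in the question.

```python
def one_hot_rows(values, width):
    """One row per value: all zeros except a 1 at the value's 1-based position."""
    return [[1 if col == value else 0 for col in range(1, width + 1)]
            for value in values]


vec = [1, 10, 9, 5, 7, 3, 6]   # same numbers as in the example above
indicator = one_hot_rows(vec, 10)
for row in indicator:
    print(*row)
```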