I have two vectors, A = [1,3,5] and B = [1,2,3,4,5,6,7,8,9,10]. I want to get C=[2,4,6,7,8,9,10] by extracting some elements from B that A doesn't have.
I don't want to use loops, because this is a simplified problem from a real data simulation. In the real case A and B are huge, but A is included in B.
Here are two methods,
C=setdiff(B,A)
but if values are repeated in B they will only come up once in C, or
C=B(~ismember(B,A))
which will preserve repeated values in B.
One approach with unique, sort and diff -
C = [A B];
[~,~,idC] = unique(C);
[sidC,id_idC] = sort(idC);
start_id = id_idC(diff([0 sidC])==1);
out = C(start_id(start_id>numel(A)))
Sample runs -
Case #1 (Sample from question):
A =
1 3 5
B =
1 2 3 4 5 6 7 8 9 10
out =
2 4 6 7 8 9 10
Case #2 (Bit more generic case):
A =
11 15 14
B =
19 14 6 8 9 11 15
out =
6 8 9 19
Related
Suppose I have an array X of size n by p by q. I would like to reshape it as a matrix with p rows, and in each row put the concatenation of the n rows of size q, resulting in a matrix of size p by nq.
I managed to do it with a loop but it takes a while say if n=1000, p=300, q=300.
F0=[];
for k=1:size(F,1)
F0=[F0,squeeze(X(k,:,:))];
end
Is there a faster way?
I think this is what you want:
Y = reshape(permute(X, [2 1 3]), size(X,2), []);
Example with n=2, p=3, q=4:
>> X
X(:,:,1) =
0 6 9
8 3 0
X(:,:,2) =
4 7 1
3 7 4
X(:,:,3) =
4 7 2
6 7 6
X(:,:,4) =
6 1 9
1 4 3
>> Y = reshape(permute(X, [2 1 3]), size(X,2), [])
Y =
0 8 4 3 4 6 6 1
6 3 7 7 7 7 1 4
9 0 1 4 2 6 9 3
Try this -
reshape(permute(X,[2 3 1]),p,[])
Thus, for code verification, one can look into a sample case run -
n = 2;
p = 3;
q = 4;
X = rand(n,p,q)
F0=[];
for k=1:n
F0=[F0,squeeze(X(k,:,:))];
end
F0
F0_noloop = reshape(permute(X,[2 3 1]),p,[])
Output is -
F0 =
0.4134 0.6938 0.3782 0.4775 0.2177 0.0098 0.7043 0.6237
0.1257 0.8432 0.7295 0.2364 0.3089 0.9223 0.2243 0.1771
0.7261 0.7710 0.2691 0.8296 0.7829 0.0427 0.6730 0.7669
F0_noloop =
0.4134 0.6938 0.3782 0.4775 0.2177 0.0098 0.7043 0.6237
0.1257 0.8432 0.7295 0.2364 0.3089 0.9223 0.2243 0.1771
0.7261 0.7710 0.2691 0.8296 0.7829 0.0427 0.6730 0.7669
Rather than using vectorization to solve the problem, you could look at the code to try and figure out what may improve performance. In this case, since you know the size of your output matrix F0 should be px(n*q), you could pre-allocate memory to F0 and avoid the constant resizing of the matrix at each iteration of the for loop
n=1000;
p=300;
q=300;
F0=zeros(p,n*q);
for k=1:size(F,1)
F0(:,(k-1)*q+1:k*q) = squeeze(F(k,:,:));
end
While probably not as efficient as the other two solutions, it is an alternative. Try the above and see what happens!
I have a vector that should contain n sequences from 00 to 11
A = [00;01;02;03;04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]
and I would like to check that the sequence "00 - 11 " is always respected (no missing values).
for example if
A =[00;01;02; 04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]
(missing 03 in the 3rd position)
For each missing value I would like to have back this information in another vector
missing=
[value_1,position_1;
value_2, position_2;
etc, etc]
Can you help me?
For sure we know that the last element must be 11, so we can already check for this and make our life easier for testing all previous elements. We ensure that A is 11-terminated, so an "element-wise change" approach (below) will be valid. Note that the same is true for the beginning, but changing A there would mess with indices, so we better take care of that later.
missing = [];
if A(end) ~= 11
missing = [missing; 11, length(A) + 1];
A = [A, 11];
end
Then we can calculate the change dA = A(2:end) - A(1:end-1); from one element to another, and identify the gap positions idx_gap = find((dA~=1) & (dA~=-11));. Now we need to expand all missing indices and expected values, using ev for the expected value. ev can be obtained from the previous value, as in
for k = 1 : length(idx_gap)
ev = A(idx_gap(k));
Now, the number of elements to fill in is the change dA in that position minus one (because one means no gap). Note that this can wrap over if there is a gap at the boundary between segments, so we use the modulus.
for n = 1 : mod(dA(idx_gap(k)) - 1, 12)
ev = mod(ev + 1, 12);
missing = [missing; ev, idx_gap(k) + 1];
end
end
As a test, consider A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]. That's a case where the special initialization from the beginning will fire, memorizing the missing 11 already, and changing A to [5 6 ... 7 8 11]. missing then will yield
11 24 % recognizes improper termination of A.
11 7
0 7 % properly handles wrap-over here.
1 7
2 7
5 21 % recognizes single element as missing.
9 24
10 24
which should be what you are expecting. Now what's missing still is the beginning of A, so let's say missing = [0 : A(1) - 1, 1; missing]; to complete the list.
This will give you the missing values and their positions in the full sequence:
N = 11; % specify the repeating 0:N sub-sequence
n = 3; % reps of sub-sequence
A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]'; %' column from s.bandara
da = diff([A; N+1]); % EDITED to include missing end
skipLocs = find(~(da==1 | da==-N));
skipLength = da(skipLocs)-1;
skipLength(skipLength<0) = N + skipLength(skipLength<0) + 1;
firstSkipVal = A(skipLocs)+1;
patchFun = #(x,y)(0:y)'+x - (N+1)*(((0:y)'+x)>N);
patches = arrayfun(patchFun,firstSkipVal,skipLength-1,'uni',false);
locs = arrayfun(#(x,y)(x:x+y)',skipLocs+cumsum([A(1); skipLength(1:end-1)])+1,...
skipLength-1,'uni',false);
Then putting them together, including any missing values at the beginning:
>> gapMap = [vertcat(patches{:}) vertcat(locs{:})-1]; % not including lead
>> gapMap = [repmat((0 : A(1) - 1)',1,2); gapMap] %' including lead
gapMap =
0 0
1 1
2 2
3 3
4 4
11 11
0 12
1 13
2 14
5 29
9 33
10 34
11 35
The first column contains the missing values. The second column is the 0-based location in the hypothetical full sequence.
>> Afull = repmat(0:N,1,n)
>> isequal(gapMap(:,1), Afull(gapMap(:,2)+1)')
ans =
1
Although this doesn't solve your problem completely, you can identify the position of missing values, or of groups of contiguous missing values, like this:
ind = 1+find(~ismember(diff(A),[1 -11]));
ind gives the position with respect to the current sequence A, not to the completed sequence.
For example, with
A =[00;01;02; 04;05;06;07;08;09;10;11;00;01;02;03; ;06;07;08;09;10;11];
this gives
>> ind = 1+find(~ismember(diff(A),[1 -11]))
ind =
4
16
I have a problem with removing the rows when the columns are identical.
I have used a for and if loop but the run time is too long.
I was thinking if there are any more efficient and faster run time method.
say
A=[ 2 4 6 8;
3 9 7 9;
4 8 7 6;
8 5 4 6;
2 10 11 2]
I would want the result to be
A=[ 2 4 6 8;
4 8 7 6;
8 5 4 6]
eliminating the 2nd row because of the repeated '9' and remove the 5th row because of repeated '2'.
You can use sort and diff to identify the rows with repeated values
A = A(all(diff(sort(A'))),:)
returns
A =
2 4 6 8
4 8 7 6
8 5 4 6
The trick here is how to find the rows with repeated values in an efficient manner.
How about this:
% compare all-vs-all for each row using `bsxfun`
>> c = bsxfun( #eq, A, permute( A, [1 3 2] ) );
>> c = sum( c, 3 ); % count the number of matches each element has in the row
>> c = any( c > 1, 2 ); % indicates rows with repeated values - an element with more than one match
>> A = A( ~c, : )
A =
2 4 6 8
4 8 7 6
8 5 4 6
I am in the process of building a function in MATLAB. As a part of it I have to calculate differences between elements in two matrices and sum them up.
Let me explain considering two matrices,
1 2 3 4 5 6
13 14 15 16 17 18
and
7 8 9 10 11 12
19 20 21 22 23 24
The calculations in the first row - only four elements in both matrices are considered at once (zero indicates padding):
(1-8)+(2-9)+(3-10)+(4-11): This replaces 1 in initial matrix.
(2-9)+(3-10)+(4-11)+(5-12): This replaces 2 in initial matrix.
(3-10)+(4-11)+(5-12)+(6-0): This replaces 3 in initial matrix.
(4-11)+(5-12)+(6-0)+(0-0): This replaces 4 in initial matrix. And so on
I am unable to decide how to code this in MATLAB. How do I do it?
I use the following equation.
Here i ranges from 1 to n(h), n(h), the number of distant pairs. It depends on the lag distance chosen. So if I choose a lag distance of 1, n(h) will be the number of elements - 1.
When I use a 7 X 7 window, considering the central value, n(h) = 4 - 1 = 3 which is the case here.
You may want to look at the circshfit() function:
a = [1 2 3 4; 9 10 11 12];
b = [5 6 7 8; 12 14 15 16];
for k = 1:3
b = circshift(b, [0 -1]);
b(:, end) = 0;
diff = sum(a - b, 2)
end
I have array say "a"
a =
1 4 5
6 7 2
if i use function
b=sort(a)
gives ans
b =
1 4 2
6 7 5
but i want ans like
b =
5 1 4
2 6 7
mean 2nd row should be sorted but elements of ist row should remain unchanged and should be correspondent to row 2nd.
sortrows(a',2)'
Pulling this apart:
a = 1 4 5
6 7 2
a' = 1 6
4 7
5 2
sortrows(a',2) = 5 2
1 6
4 7
sortrows(a',2)' = 5 1 4
2 6 7
The key here is sortrows sorts by a specified row, all the others follow its order.
You can use the SORT function on just the second row, then use the index output to sort the whole array:
[junk,sortIndex] = sort(a(2,:));
b = a(:,sortIndex);
How about
a = [1 4 5; 6 7 2]
a =
1 4 5
6 7 2
>> [s,idx] = sort(a(2,:))
s =
2 6 7
idx =
3 1 2
>> b = a(:,idx)
b =
5 1 4
2 6 7
in other words, you use the second argument of sort to get the sort order you want, and then you apply it to the whole thing.