How to find smallest number of lists needed to cover all elements in another list - algorithm

I'm working on MATLAB code in which I need to find the smallest number of lists (from a given set of lists) needed to cover all the elements of a reference list.
For example, say my reference list is
X = [0 1 2 3 4 5 6 7 8 9]
And I have a given set of lists as follows:
A = [0 1 3 5 6 7 9]
B = [0 1 2 3 4]
C = [5 6 7 8 9]
D = [1 2 3 4]
E = [1 5 7 8]
The smallest number of lists needed to cover every element in X is 2 (B and C). However, if I first pick the list that covers the most elements (A) and then look for other lists to cover the remaining elements, I end up using at least 3 lists. What would be the best way to write code that finds the smallest number of lists necessary (so that the output here would be B and C)? Any help at all would be greatly appreciated. Even a conceptual explanation (not actual code) of how best to approach this problem would be a huge help!

Approach #1: Iterative "brute-force" of all possible combinations
Below is one possible algorithm that illustrates how to solve the problem. The code itself should be self-explanatory, but the idea is that we test all possible combinations of lists until a valid one is found (hence we don't encounter the problem you described where we mistakenly choose lists based on their length).
function varargout = q36323802
R = [0 1 2 3 4 5 6 7 8 9]; %// Reference List
L = {... // As per Dan's suggestion:
[0 1 3 5 6 7 9]
[0 1 2 3 4]
[5 6 7 8 9]
[1 2 3 4]
[1 5 7 8]
};
out = []; %// Initialize output
%% // Brute-force approach:
nLists = numel(L);
for indN = 1:nLists
setCombinationsToCheck = nchoosek(1:nLists,indN);
for indC = 1:size(setCombinationsToCheck,1)
u = unique(cat(2,L{setCombinationsToCheck(indC,:)}));
if all(ismember(R,u))
out = setCombinationsToCheck(indC,:);
disp(['The minimum number of required sets is ' num2str(indN) ...
', and their indices are: ' num2str(out)]);
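if nargout > 0, varargout{1} = out; end %// Assign the output before leaving, otherwise it is never set
return;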
end
end
end
disp('No amount of lists found to cover the reference.');
if nargout > 0
varargout{1} = out;
end
For your example the output is:
The minimum number of required sets is 2, and their indices are: 2 3
Note(s):
This method does some redundant computation: iteration n recomputes every union from scratch instead of reusing the unions of n-1 lists that were already computed in the previous iteration (when applicable). A recursive solution may work in this case.
There is probably a way to vectorize this, which I did not really think about in depth.
I assumed all inputs are row vectors. There would have to be some extra steps if this is not the case.
Thanks go to Adiel for suggesting some improvements, and to Amro for finding some bugs!
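For readers who want to try the same idea outside MATLAB, here is a minimal Python sketch of the brute-force search. It is not the answer's code: the function name smallest_cover is hypothetical, the indices are 0-based, and it assumes the lists are small enough for an exhaustive search to be practical.

```python
from itertools import combinations

def smallest_cover(reference, lists):
    """Return 0-based indices of a smallest group of lists covering reference, or None."""
    target = set(reference)
    sets = [set(lst) for lst in lists]
    # Try group sizes 1, 2, ... so that the first hit is guaranteed to be minimal.
    for size in range(1, len(sets) + 1):
        for combo in combinations(range(len(sets)), size):
            covered = set().union(*(sets[i] for i in combo))
            if target <= covered:
                return combo
    return None

X = list(range(10))
lists = [
    [0, 1, 3, 5, 6, 7, 9],  # A
    [0, 1, 2, 3, 4],        # B
    [5, 6, 7, 8, 9],        # C
    [1, 2, 3, 4],           # D
    [1, 5, 7, 8],           # E
]
print(smallest_cover(X, lists))  # (1, 2), i.e. B and C
```

Because group sizes are tried in increasing order, the search can stop at the first cover it finds.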
Approach #2: Tree search (experimental)
I've attempted to also build a recursive solver. Now it finds a solution, but it's not general enough (actually the problem is that it only returns the first result, not necessarily the best result). The reasoning behind this approach is that we can treat your question as a tree search problem, and so we can employ search/pathfinding algorithms (see BFS, DFS, IDS etc.). I think the algorithm below is closest to DFS. As before, this should mainly illustrate an approach to solving your problem.
function q36323802_DFS(R,L)
%% //Input checking:
if nargin < 2 || isempty(L)
L = {... // As per Dan's suggestion:
[0 1 3 5 6 7 9]
[0 1 2 3 4]
[5 6 7 8 9]
[1 2 3 4]
[1 5 7 8]
};
end
if nargin < 1 || isempty(R)
R = [0 1 2 3 4 5 6 7 8 9]; %// Reference List
end
%% // Algorithm (DFS: depth-first search):
out = DFS_search(R,L,0);
if isempty(out)
disp('No amount of lists found to cover the reference.');
else
disp(['The minimum number of required sets is ' num2str(numel(out)) ...
', and their indices are: ' num2str(out)]);
end
end
function out = DFS_search(R,L,depth)
%// Check to see if we should stop:
if isempty(R) || isempty(L)
% // Backtrack here?
out = [];
return;
end
if all(isnan(R)) %// R was marked as "hopeless" by the caller
out = [];
return;
end
nLists = numel(L);
reducedR = cellfun(@(R,L)setdiff(R,L),repmat({R},[nLists,1]),L,'UniformOutput',false)';
%'// We consider a case where the reduction had no effect as "hopeless" and
%// "drop" it.
isFullCoverage = cellfun(@isempty,reducedR);
isHopeless = cellfun(@(R)all(isnan(R)),reducedR) | cellfun(@(rR)isequal(rR,R),reducedR);
reducedR(isHopeless) = deal({NaN});
if all(isHopeless) && ~any(isFullCoverage)
out = [];
return
end
if any(isFullCoverage) %// Check current "breadth level"
out = find(isFullCoverage,1,'first');
return
else
for indB = 1:nLists
out = DFS_search(reducedR{indB},L,depth+1);
if ~isempty(out)
out = [indB out]; %#ok
%// TODO: test if one of the sets is covered by the others and remove it
%// from the list "out".
%// Also, keep track of the best path and only return (finally) if shortest
return
end
end
end
end
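The TODO in the code above (keep track of the best path and only return the shortest one) could be handled roughly as in the following Python sketch. It is an illustration under assumptions, not a port of the MATLAB function: min_cover_dfs and its inner helper search are hypothetical names, indices are 0-based, and lists are only tried in increasing index order, so each combination is visited once.

```python
def min_cover_dfs(reference, lists):
    """Depth-first search with pruning; returns 0-based indices of a smallest cover, or None."""
    sets = [set(lst) for lst in lists]
    best = None  # shortest full cover found so far

    def search(remaining, chosen, start):
        nonlocal best
        if not remaining:
            # Everything is covered: keep this path only if it beats the best one.
            if best is None or len(chosen) < len(best):
                best = list(chosen)
            return
        # Prune: adding one more list could not make this path shorter than the best.
        if best is not None and len(chosen) + 1 >= len(best):
            return
        for i in range(start, len(sets)):
            reduced = remaining - sets[i]
            if len(reduced) == len(remaining):
                continue  # "hopeless" branch: this list covers nothing new
            chosen.append(i)
            search(reduced, chosen, i + 1)
            chosen.pop()

    search(set(reference), [], 0)
    return best

lists = [
    [0, 1, 3, 5, 6, 7, 9],
    [0, 1, 2, 3, 4],
    [5, 6, 7, 8, 9],
    [1, 2, 3, 4],
    [1, 5, 7, 8],
]
print(min_cover_dfs(range(10), lists))  # [1, 2]
```

Skipping lists that cover nothing new is safe here, because every list in a minimum cover must contribute at least one element that the others do not.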

A similar solution to Dev-iL's 1st approach, by Amro:
function varargout = q36323802A
R = [0 1 2 3 4 5 6 7 8 9];
names = {'A' 'B' 'C' 'D' 'E'};
L = {...
[0 1 3 5 6 7 9]
[0 1 2 3 4]
[5 6 7 8 9]
[1 2 3 4]
[1 5 7 8]
};
N = numel(L);
%// powerset of L: set of all subsets (excluding empty set)
powerset = cell(1,N);
for k=1:N
sets = nchoosek(1:N, k);
powerset{k} = num2cell(sets,2);
end
powerset = cat(1, powerset{:});
%// for each possible subset, check if it covers the target R
mask = false(size(powerset));
for i=1:numel(powerset)
elems = unique([L{powerset{i}}]);
mask(i) = all(ismember(R, elems));
end
if ~any(mask), error('cant cover target'); end
%// from candidates, choose the one with least amount of sets
candidates = powerset(mask);
len = cellfun(@numel, candidates);
[~,idx] = min(len);
out = candidates{idx};
varargout{1} = names(out);

Related

Neighbors in the matrix - algorithm

I have a problem with coming up with an algorithm for the "graph" :(
Maybe one of you would be so kind and direct me somehow <3
The task is as follows:
We have a board of at least 3x3 (it doesn't have to be a square, it can be 4x5 for example). The user specifies a sequence of moves (as in Android lock pattern). The task is to check how many points he has given are adjacent to each other horizontally or vertically.
Here is an example:
Matrix:
1 2 3 4
5 6 7 8
9 10 11 12
The user entered the code: 10,6,7,3
The algorithm should return the number 3 because:
10 is a neighbor of 6
6 is a neighbor of 7
7 is a neighbor of 3
Eventually return 3
Second example:
Matrix:
1 2 3
4 5 6
7 8 9
The user entered the code: 7,8,6,3
The algorithm should return 2 because:
7 is a neighbor of 8
8 is not a neighbor of 6
6 is a neighbor of 3
Eventually return 2
Of course, the number of checks equals the length of the array minus 1.
Sorry for "ile" and "tutaj", i'm polish
If all the codes are unique, use them as keys to a dictionary (with (row, col) pairs as values). Loop from the 2nd item in the user input to the end and check whether math.Abs(cur.row-prev.row)+math.Abs(cur.col-prev.col)==1. This is not space-efficient, but it processes the user input in linear time.
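A minimal Python sketch of this dictionary idea follows. The function name count_adjacent_moves and the nested-list board representation are assumptions made for illustration, and the codes on the board are assumed to be unique.

```python
def count_adjacent_moves(matrix, moves):
    # Map each code to its (row, col) position; assumes all codes are unique.
    position = {
        code: (r, c)
        for r, row in enumerate(matrix)
        for c, code in enumerate(row)
    }
    count = 0
    for prev, cur in zip(moves, moves[1:]):
        (pr, pc), (cr, cc) = position[prev], position[cur]
        # A Manhattan distance of 1 means a horizontal or vertical neighbour.
        if abs(cr - pr) + abs(cc - pc) == 1:
            count += 1
    return count

board = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(count_adjacent_moves(board, [10, 6, 7, 3]))  # 3
```

Since positions are looked up rather than computed, this also works when the board is not numbered consecutively.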
The idea is you have 4 conditions, one for each direction. Given any matrix of the shape n,m which is made of a sequence of integers AND given any element:
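The element left or right will always be + or - 1 to the given element (as long as both are in the same row).
The element up or down will always be + or - m to the given element.
So, if abs(x-y) is m, or abs(x-y) is 1 and both are in the same row, then x and y are neighbors. The same-row condition matters at the row ends: in a 3-column matrix, 3 and 4 differ by 1 but are not adjacent.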
I demonstrate this in python.
import numpy as np

def get_neighbors(seq, matrix):
    m = matrix.shape[1]  # number of columns
    # Conditions: vertical neighbours differ by m; horizontal neighbours
    # differ by 1 and must lie in the same row (values are 1..n*m, row by row)
    check = lambda x, y: abs(x - y) == m or (abs(x - y) == 1 and (x - 1) // m == (y - 1) // m)
    # Consecutive pairs of the sequence
    params = zip(seq, seq[1:])
    neighbours = [check(*i) for i in params]
    count = sum(neighbours)
    return neighbours, count

seq = [10, 6, 7, 3]
matrix = np.arange(1, 13).reshape((3, 4))
neighbours, count = get_neighbors(seq, matrix)
print('Matrix:')
print(matrix)
print('')
print('Sequence:', seq)
print('')
print('Count of neighbors:', count)
Matrix:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
Sequence: [10, 6, 7, 3]
Count of neighbors: 3
Another example -
seq = [7,8,6,3]
matrix = np.arange(1,10).reshape((3,3))
neighbours, count = get_neighbors(seq, matrix)
Matrix:
[[1 2 3]
[4 5 6]
[7 8 9]]
Sequence: [7, 8, 6, 3]
Count of neighbors: 2
So your input is the width of a table, the height of a table, and a list of numbers.
W = 4, H = 3, list = [10,6,7,3]
There are two steps:
Convert the list of numbers into a list of row/column coordinates (1 to [1,1], 5 to [2,1], 12 to [3,4]).
In the new list of coordinates, find consecutive pairs that have one coordinate identical and the other differing by 1.
Both steps are quite simple ("for" loops). Do you have problems with 1 or 2?
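The two steps could look like this in Python. This is a sketch, assuming the cells are numbered from 1, row by row; to_coords and count_neighbours are hypothetical names. Only the width W is needed for the conversion, so H is not passed in.

```python
def to_coords(number, width):
    # 1-based cell number -> 1-based (row, column), assuming row-major numbering.
    row, col = divmod(number - 1, width)
    return row + 1, col + 1

def count_neighbours(width, numbers):
    # Step 1: convert the numbers into coordinates.
    coords = [to_coords(n, width) for n in numbers]
    # Step 2: count consecutive pairs that share one coordinate and differ by 1 in the other.
    count = 0
    for (r1, c1), (r2, c2) in zip(coords, coords[1:]):
        same_row = r1 == r2 and abs(c1 - c2) == 1
        same_col = c1 == c2 and abs(r1 - r2) == 1
        if same_row or same_col:
            count += 1
    return count

print(to_coords(12, 4))                    # (3, 4)
print(count_neighbours(4, [10, 6, 7, 3]))  # 3
```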

remove non matching elements from matrix

I am trying to compare two matrices A and B. If elements in the first two columns of A match those in B, I want to delete all non matching rows from A. The third column in B should not factor into the comparison.
A = [1 2 3
     3 4 5
     6 7 8]
B = [1 2 8
     3 4 5]
Desired result:
A = [1 2 3
3 4 5]
So far I only found ways to remove duplicate entries, which is the exact opposite of what I want. How can I do this?
You can efficiently use ismember for this task:
% Input matrices
A = [1 2 3; 3 4 5; 7 8 9];
B = [1 2 8; 3 4 5];
A1 = A(:,1:2); % Extract first two columns for both matrices
B1 = B(:,1:2);
tf = ismember(A1,B1,'rows'); % Logical mask: true where a row of A1 also occurs in B1
% (The second output of ismember holds row positions in B1, so it must not be used to index A)
A(tf,:) % Index with the mask to keep only the matching rows of A
All of this can be written more compactly, but I wanted to show the step-by-step process first:
A(ismember(A(:,1:2),B(:,1:2),'rows'),:)
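The underlying idea is a membership test on the first two columns, which yields one true/false decision per row of A. A minimal Python sketch of that idea, with the hypothetical name keep_matching_rows and matrices given as nested lists:

```python
def keep_matching_rows(A, B, ncols=2):
    # Rows of B, reduced to their first `ncols` entries, stored as a set for fast lookup.
    keys = {tuple(row[:ncols]) for row in B}
    # Keep each row of A whose leading entries appear somewhere in B.
    return [row for row in A if tuple(row[:ncols]) in keys]

A = [[1, 2, 3], [3, 4, 5], [6, 7, 8]]
B = [[1, 2, 8], [3, 4, 5]]
print(keep_matching_rows(A, B))  # [[1, 2, 3], [3, 4, 5]]
```

The decision is made per row of A, so the result does not depend on where the matching row sits in B.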
A = [1 2 3;3 4 5;7 8 9];
B = [1 2 8; 3 4 5];
tmp = min([size(A,1) size(B,1)]); % get size to loop over
k = false(tmp,1); % storage counter
for ii = 1:tmp
if all(A(ii,1:2)==B(ii,1:2)) % if the first two columns match
k(ii)=true; % store
end
end
C = A(k,:) % extract requested rows

Scanning Elements of A Matrix

I need to write code that scans a matrix along its diagonals, starting from the bottom-left element and moving to the right.
For example for the matrix [1 2 3; 4 5 6] it should return 4,5,1,6,2,3
Any ideas where to start?
Since you didn't show your attempts, I'll let you figure out how this code works :-)
x = [1 2 3; 4 5 6];
m = bsxfun(@minus, (1:size(x,1)).', 1:size(x,2));
[~, ind] = sort(reshape(m, 1, []));
result = x(flip(ind));
You may need to read about
linear indexing;
bsxfun.
A solution using spdiags*:
x = [1 2 3; 4 5 6];
result = x(nonzeros(flipud(spdiags(reshape(1:numel(x),size(x))))));
*It may not be as fast as @LuisMendo's solution, but it is a one-liner!
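For anyone who wants the ordering spelled out, here is a small Python sketch (the name scan_diagonals is hypothetical). It relies on one observation: all cells on the same diagonal share the same value of column minus row, and within a diagonal the requested order starts at the lower row.

```python
def scan_diagonals(x):
    rows, cols = len(x), len(x[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    # Diagonals are identified by c - r; within one diagonal, lower rows come first.
    cells.sort(key=lambda rc: (rc[1] - rc[0], -rc[0]))
    return [x[r][c] for r, c in cells]

print(scan_diagonals([[1, 2, 3], [4, 5, 6]]))  # [4, 5, 1, 6, 2, 3]
```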

Matlab: sorting a matrix in a unique way

I have a problem with sorting some finance data based on firmnumbers. So given is a matrix that looks like:
[1 3 4 7;
1 2 7 8;
2 3 7 8;]
On Matlab i would like the matrix to be sorted as follows:
[1 0 3 4 7 0;
1 2 0 0 7 8;
0 2 3 0 7 8;]
So basically every column needs to consist of 1 type of number.
I have tried many things but i cant get the matrix sorted properly.
A = [1 3 4 7;
1 2 7 8;
2 3 7 8;]
%// Get a unique list of numbers in the order that you want them to appear as the new columns
U = unique(A(:))'
%'//For each column (of your output, same as columns of U), find which rows have that number. Do this by making A 3D so that bsxfun compares each element with each element
temp1 = bsxfun(@eq,permute(A,[1,3,2]),U)
%// Consolidate this into a boolean matrix with the right dimensions and 1 where you'll have a number in your final answer
temp2 = any(temp1,3)
%// Finally multiply each line with U
bsxfun(@times, temp2, U)
So you can do that all in one line but I broke it up to make it easier to understand. I suggest you run each line and look at the output to see how it works. It might seem complicated but it's worthwhile getting to understand bsxfun as it's a really useful function. The first use which also uses permute is a bit more tricky so I suggest you first make sure you understand that last line and then work backwards.
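If the bsxfun steps are hard to follow, the same logic can be written out element by element. The following is a plain Python sketch with the hypothetical name spread_by_value; it mirrors the roles of U and temp2 above rather than the bsxfun mechanics.

```python
def spread_by_value(A):
    # Sorted unique values become the output columns (the role of U above).
    U = sorted({v for row in A for v in row})
    # For each row, keep u where the row contains it (the role of temp2) and write 0 elsewhere.
    return [[u if u in row else 0 for u in U] for row in A]

A = [[1, 3, 4, 7], [1, 2, 7, 8], [2, 3, 7, 8]]
for row in spread_by_value(A):
    print(row)
```

Like the MATLAB version, this records only whether a value occurs in a row, so a value repeated within a row still yields a single entry.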
What you are asking can also be seen as a histogram:
A = [1 3 4 7;
1 2 7 8;
2 3 7 8;]
uniquevalues = unique(A(:))
N = histc(A,uniquevalues' ,2) %//'
B = bsxfun(@times,N,uniquevalues') %//'
%// bsxfun can replace the following instructions:
%//(the instructions are equivalent only when each value appears only once per row )
%// B = repmat(uniquevalues', size(A,1),1)
%// B(N==0) = 0
Answer without assumptions - Simplified
I did not feel comfortable with my old answer, which assumed that everything is an integer and ruled out duplicates, so I came up with a different solution based on @lib's suggestion of using a histogram and counting method.
The only case I can see this not working for is if a 0 is entered. You will end up with a column of all zeros, which one might interpret as all rows initially containing a zero, but that would be incorrect. You could use NaN instead of zeros in that case, but I am not sure what this data is being fed into, or whether that processing would cope with NaN.
EDITED
Includes sorting of secondary matrix, B, along with A.
A = [-1 3 4 7 9; 0 2 2 7 8.2; 2 3 5 9 8];
B = [5 4 3 2 1; 1 2 3 4 5; 10 9 8 7 6];
keys = unique(A);
[counts,bin] = histc(A,transpose(unique(A)),2);
A_sorted = cell(size(A,1),1);
for ii = 1:size(A,1)
for jj = 1:numel(keys)
temp = zeros(1,max(counts(:,jj)));
temp(1:counts(ii,jj)) = keys(jj);
A_sorted{ii} = [A_sorted{ii},temp];
end
end
A_sorted = cell2mat(A_sorted);
B_sorted = nan(size(A_sorted));
for ii = 1:size(bin,1)
for jj = 1:size(bin,2)
idx = bin(ii,jj);
while ~isnan(B_sorted(ii,idx))
idx = idx+1;
end
B_sorted(ii,idx) = B(ii,jj);
end
end
B_sorted(isnan(B_sorted)) = 0
You can create, at the beginning, a zero matrix with max(A(:)) columns (8 in this example), and treat the values in your original matrix as column indices. This requires the values to be positive integers.
A = [1 3 4 7;
1 2 7 8;
2 3 7 8;]
B = zeros(3,max(A(:)))
for i = 1:size(A,1)
B(i,A(i,:)) = A(i,:)
end
B(:,~any(B,1)) = []

Replacing elements of a 2D matrix

I'm trying to increase the efficiency of my MATLAB code. What it does is, it replaces the nonzero elements of a matrix with the multiplication of the rest of the nonzero elements in the same row. For instance,
X = [2 3 6 0; 0 3 4 2]
transforms into
X = [18 12 6 0; 0 8 6 12]
It's an easy task to implement with a for loop: check every row, find the nonzero values and do the replacements. I want to get rid of the for loop, though. Is there a way to implement this without a loop?
Code
X = [2 3 6 0; 0 3 4 2];
X1 = X;
X1(~X) = 1;
out = bsxfun(@rdivide,prod(X1,2),X1).*(X~=0)
Output
out =
18 12 6 0
0 8 6 12
Probably the simplest way is to compute each row's product once and then divide it by the element you want to leave out:
X = [2 3 6 0; 0 3 4 2]
Y=X
%get the product of all elements in a row
Y(Y==0)=1
Y=prod(Y,2)
%repeat Y to match the size of X
Y=repmat(Y,1,size(X,2))
%For all but the zero elements, divide Y by X, which is the product of all other elements.
X(X~=0)=Y(X~=0)./X(X~=0)
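The divide-the-row-product trick can be checked with a short Python sketch. The name product_of_others is hypothetical, and the code assumes integer entries, so the integer division is exact. It uses an explicit loop over rows for clarity, so it illustrates the arithmetic rather than the vectorization.

```python
from math import prod

def product_of_others(X):
    out = []
    for row in X:
        # Product of the nonzero entries of this row (zeros are simply skipped).
        total = prod(v for v in row if v != 0)
        # Each nonzero entry becomes total / entry; zeros stay zero.
        out.append([total // v if v != 0 else 0 for v in row])
    return out

X = [[2, 3, 6, 0], [0, 3, 4, 2]]
print(product_of_others(X))  # [[18, 12, 6, 0], [0, 8, 6, 12]]
```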
