How to generate puzzles to 'crack the code'? [closed] - algorithm

In a puzzle book I found the following puzzle:
The caption is in Dutch, but I can briefly describe it:
There is a sequence of numbers in the range 0...6, which is not given
There are 6 mathematical statements (listed below) that are true
By using logic and crossing off numbers that certainly cannot appear, whoever plays the puzzle can find the sequence of numbers (the code).
For example, in the pictured case, we know that A+C=5, so we can infer that neither A nor C can be 6. By applying such deductions repeatedly, you will find the sequence.
Now, my question is: How could I write an algorithm that generates these puzzle values?
I've tried to come up with a random sequence and describe it with 6 similar mathematical statements, but in some cases the puzzle will be unsolvable.
One such unsolvable case is the sequence 012345 with the statements A+B=1, B+C=3, C+D=5, D+E=7, E+F=9, F+A=5: these statements do not determine the code uniquely, since 103254 satisfies all six of them as well.

First, let's outline a solving procedure S for this problem:
For each variable, create a set {0,1,2,3,4,5,6} with its possible values.
Analyse each equation and cut values from the variables' sets based on the equations.
example: A + B = 3 (we can cut 4, 5 and 6 from A and B sets).
When all sets have only 1 element, that's your answer.
On the other hand, if all equations are analysed and no set changes, the procedure is stuck and the puzzle cannot be solved this way (see the sketch after the worked example below).
Example: (1) A + C = 5 (2) B - D = 5 (3) B + C = 6
(4) E * F = 6 (5) B - E = 3 (6) A + D = 6
Analysing (1): Reduce A and C to {0,1,2,3,4,5}.
Analysing (2): Reduce B to {5,6}. Reduce D to {0,1}.
Analysing (3): Reduce C to {0,1}
Analysing (4): Reduce E and F to {1,2,3,6}.
Analysing (5): Reduce E to {2,3} (since E = B - 3 and B is {5,6}).
Analysing (6): Reduce A to {5}. Reduce D to {1}.
--------------------------------------------------
Not over yet, but something changed. Restart:
--------------------------------------------------
Analysing (1): Reduce C to {0}
Analysing (2): Reduce B to {6}.
Analysing (3): Can't reduce.
Analysing (4): Reduce F to {2,3} (since E is now {2,3}).
Analysing (5): Reduce E to {3}.
Analysing (6): Can't reduce.
--------------------------------------------------
Not over yet, but something changed. Restart:
--------------------------------------------------
Analysing (1): Can't reduce.
Analysing (2): Can't reduce.
Analysing (3): Can't reduce.
Analysing (4): Reduce F to {2}.
Analysing (5): Can't reduce.
Analysing (6): Can't reduce.
--------------------------------------------------
Over. All sets of size 1. Answer: (5,6,0,1,3,2)
--------------------------------------------------
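Here is a minimal Python sketch of procedure S (untested; the representation and names are mine): each variable keeps a set of candidate digits, and each equation repeatedly filters the two sets it touches until a full pass changes nothing.

OPS = {'+': lambda x, y: x + y,
       '-': lambda x, y: x - y,
       '*': lambda x, y: x * y}

def propagate(equations, n_vars=6, digits=range(7)):
    # equations: list of (i, j, op, value), meaning "Vi op Vj = value"
    sets = [set(digits) for _ in range(n_vars)]
    changed = True
    while changed:
        changed = False
        for i, j, op, value in equations:
            # keep a value for Vi only if some value of Vj completes the equation,
            # and vice versa
            new_i = {x for x in sets[i] if any(OPS[op](x, y) == value for y in sets[j])}
            new_j = {y for y in sets[j] if any(OPS[op](x, y) == value for x in sets[i])}
            if new_i != sets[i] or new_j != sets[j]:
                sets[i], sets[j] = new_i, new_j
                changed = True
    if all(len(s) == 1 for s in sets):
        return [s.pop() for s in sets]  # solved
    return None                         # stuck: reject this puzzle

# The example above, with indices 0..5 standing for A..F:
eqs = [(0, 2, '+', 5), (1, 3, '-', 5), (1, 2, '+', 6),
       (4, 5, '*', 6), (1, 4, '-', 3), (0, 3, '+', 6)]
print(propagate(eqs))  # [5, 6, 0, 1, 3, 2]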
Now, knowing how to solve, I would create puzzles the following way:
Choose a random answer.
example: (4, 1, 3, 2, 0, 5).
Generate 6 random equations with it.
Run S with these equations.
If S terminates with every set reduced to one element, you have a usable group of equations.
Otherwise, restart.
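A sketch of that generation loop, reusing the propagate function and OPS table from above (again untested, and assuming, as in the examples here, that all six digits of the code are distinct):

import random

def random_equation(code):
    # pick two distinct positions and an operation, and state a true fact about the code
    i, j = random.sample(range(6), 2)
    op = random.choice('+-*')
    return (i, j, op, OPS[op](code[i], code[j]))

def make_puzzle():
    while True:
        code = random.sample(range(7), 6)          # choose a random answer
        equations = [random_equation(code) for _ in range(6)]
        if propagate(equations) == code:           # S finished: keep this group
            return code, equations

code, equations = make_puzzle()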

There are 7^6 = 117649 possible codes. With 6 questions, we only need each to remove 6/7 of the search space. If the first few do more, then so much the better.
For the first three statements, it is easy. When you look at the distribution of ways to get a sum or a difference, you find that at most 7 of the 49 ordered pairs can produce any given answer, and usually fewer. So partition A, B, C, D, E, F into 3 pairs of 2, supply random operations, and you have no more than 7^3 = 343 possible answers left.
With your example, I randomly generated D-A=4, C+F=7, D-B=2. For the first pair we have 3 possibilities, for the second we have 6, for the third we have 5.
Now what we need to do is actually construct all of the possibilities (in this case there are 90 of them). Then you can construct random statements and accept only those that stay on the path to 1 solution. That means that with 3 questions left you cut down by a factor of at least the cube root (here leaving no more than 20 possibilities), with 2 left by the square root (leaving no more than 4), and then down to a single possibility.
Given that a random question on average cuts your search space by at least 6/7, random questions are likely to do this pretty easily. (The challenge is in verifying the answer.)
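A rough Python sketch of this strategy (untested; the statement format (i, j, op, value), meaning "variable i op variable j = value", and all names are mine):

import itertools, random

OPS = {'+': lambda x, y: x + y,
       '-': lambda x, y: x - y,
       '*': lambda x, y: x * y}

def survivors(cands, stmt):
    i, j, op, value = stmt
    return [c for c in cands if OPS[op](c[i], c[j]) == value]

def complete(code, first_three):
    # code: a tuple like (4, 1, 3, 2, 0, 5)
    # first_three: e.g. D-A=4, C+F=7, D-B=2 become (3,0,'-',4), (2,5,'+',7), (3,1,'-',2)
    cands = list(itertools.product(range(7), repeat=6))  # all 7^6 = 117649 codes
    for stmt in first_three:
        cands = survivors(cands, stmt)
    stmts = list(first_three)
    for k in (3, 2, 1):  # questions left
        # stay on the path to 1 solution: cut by at least the k-th root
        target = max(1, int(len(cands) ** ((k - 1) / k)))
        while True:  # note: a production version would re-draw earlier
                     # statements if no single statement can hit the target
            i, j = random.sample(range(6), 2)
            op = random.choice('+-*')
            stmt = (i, j, op, OPS[op](code[i], code[j]))  # true of the real code by construction
            cut = survivors(cands, stmt)
            if len(cut) <= target:
                cands, stmts = cut, stmts + [stmt]
                break
    assert cands == [code]
    return stmts

Calling complete((4, 1, 3, 2, 0, 5), first_three) would then return six statements whose unique solution is that code.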

This seems like nothing more than simple algebraic substitution. Moreover, the choice of operations in the equations does not matter. From
(D, B) (B, C)
we get
(D, C)
(for instance, from D - B = 1 and B + C = 5 we get D + C = 6); then from
(D, C) (C, A)
we get
(D, A)
An equation for (A, D) already exists, so we can solve for both. From there, it's easy to see how to complete the solution.
In order to create a puzzle, start with any such double equation, say
(E, F) (E, F)
then split one of them into the more obscure, like
(E, B) (B, F)
that gives us three of the six equations (remember, the choice of operations can be purely random). I leave it to the reader to give more thought to completion.
I wouldn't expect completion and implementation to be trivial, but hopefully this gives an idea of how it can be done quite deliberately, without backtracking.
JavaScript code:
const vars = ["A", "B", "C", "D", "E", "F"];

function getOps(pair){
  return pair.includes(0) ? ["+", "-"] : ["+", "-", "*"];
}

function getRandom(k, list){
  let result = [];
  for (let i = 0; i < k; i++)
    result = result.concat(
      list.splice(~~(Math.random() * list.length), 1));
  return result;
}

function getNum(a, b, op){
  return eval(`${ a } ${ op } ${ b }`);
}

function getEquation(a, b, op, idxs){
  // Randomise the order of the variables
  if (Math.random() > 0.5)
    [a, b] = [b, a];
  return `${ vars[idxs[a]] } ${ op } ${ vars[idxs[b]] } = ${ getNum(a, b, op) }`;
}

function f(){
  let remaining = [0, 1, 2, 3, 4, 5, 6];
  let result = [];
  // Remove one entry
  remaining.splice(~~(Math.random() * 7), 1);
  // Shuffle
  remaining = getRandom(6, remaining);
  const idxs = Object.fromEntries(
    remaining.map((x, i) => [x, i]));
  // Select equation pairs
  const [a, b] = getRandom(2, remaining);
  const [c, d] = getRandom(2, remaining);
  result.push(
    getEquation(a, c, getRandom(1, getOps([a, c])), idxs),
    getEquation(a, d, getRandom(1, getOps([a, d])), idxs),
    getEquation(b, c, getRandom(1, getOps([b, c])), idxs),
    getEquation(b, d, getRandom(1, getOps([b, d])), idxs)
  );
  const [e] = getRandom(1, remaining);
  const [f] = remaining;
  const [aa] = getRandom(1, [a, b, c, d]);
  result.push(
    getEquation(e, f, getRandom(1, getOps([e, f])), idxs),
    getEquation(e, aa, getRandom(1, getOps([e, aa])), idxs)
  );
  const legend = Object.entries(idxs)
    .map(([x, i]) => [vars[i], Number(x)])
    .map(([a, b]) => `(${ a } ${ b })`)
    .sort()
    .join(' ');
  return {
    equations: getRandom(6, result).join('\n'),
    legend: legend
  };
}
var result = f();
console.log(result.equations);
console.log('');
console.log(result.legend);

Related

How to solve a McNuggets problem using z3py

I am new to z3py and wondering if this problem can be easily solved using "from z3 import *".
The McNuggets version of the coin problem was introduced by Henri Picciotto,
who included it in his algebra textbook co-authored with Anita Wah. Picciotto
thought of the application in the 1980s while dining with his son at
McDonald's, working the problem out on a napkin. A McNugget number is
the total number of McDonald's Chicken McNuggets in any number of boxes.
In the United Kingdom, the original boxes (prior to the introduction of
the Happy Meal-sized nugget boxes) were of 6, 9, and 20 nuggets.
[Wikipedia]
Task
McDonald's sells McNuggets in boxes of A=6, B=9, C=20, D=27. You and your friends are hungry and want to eat X chicken pieces.
First question: is it possible to buy S boxes of size A, T boxes of size B, U boxes of size C and V boxes of size D such that you get exactly X chicken pieces (without any left over)? (For example, X=36.)
Second question: determine the minimal number X which gives a satisfiable solution while Y=X-1 is not satisfiable; this Y is then the solution of the Chicken McNuggets problem for the fixed A, B, C, D values above. In other words, it is the largest number Y which can NOT be represented this way: no matter how many boxes (S,T,U,V) of the given sizes (A,B,C,D) you buy, if you and your friends eat exactly Y pieces, there have to be some left-over pieces.
Stack Overflow works best if you show what you tried and what sort of problems you ran into. I'm guessing you're not quite familiar with the idioms of z3py: start by reading through https://ericpony.github.io/z3py-tutorial/guide-examples.htm which will get you on the right track.
Having said that, the first question is trivial to code in z3py:
from z3 import *

A = 6
B = 9
C = 20
D = 27

S, T, U, V = Ints('S T U V')
X = 36

s = Solver()
s.add(S >= 0)
s.add(T >= 0)
s.add(U >= 0)
s.add(V >= 0)
s.add(S*A + T*B + U*C + V*D == X)

while s.check() == sat:
    m = s.model()
    print(m)
    block = []
    for var in [S, T, U, V]:
        v = m.eval(var, model_completion=True)
        block.append(var != v)
    s.add(Or(block))
This prints:
[S = 6, T = 0, U = 0, V = 0]
[S = 0, T = 1, U = 0, V = 1]
[S = 3, T = 2, U = 0, V = 0]
[S = 0, T = 4, U = 0, V = 0]
giving you all the solutions.
The second question is a bit confusing to read, but you might want to use the Optimize object instead of Solver. Start by reading through https://rise4fun.com/Z3/tutorial/optimization and see if you can make progress. If not, feel free to ask a new question; detailing what problem you ran into. (The tutorial there is in SMTLib, but you can do the same in Python using the Optimize class.)
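For what it's worth, the second question can also be answered without Optimize, by a plain scan with Solver (a sketch, untested; it relies on the fact that the smallest box is 6, so once 6 consecutive values are representable, every larger value is representable too by adding more boxes of 6):

from z3 import Ints, Solver, sat

S, T, U, V = Ints('S T U V')

def representable(x):
    s = Solver()
    s.add(S >= 0, T >= 0, U >= 0, V >= 0, 6*S + 9*T + 20*U + 27*V == x)
    return s.check() == sat

run, y, last_gap = 0, 0, None
while run < 6:               # stop after 6 representable values in a row
    y += 1
    run = run + 1 if representable(y) else 0
    if run == 0:
        last_gap = y
print(last_gap)  # should print 43: since 27 = 3*9 adds nothing, this matches the classic 6/9/20 answer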

Integral changes variable subscripts

I have been away from Mathematica for quite a while and am trying to fix some old notebooks from v4 that are no longer working under v11. I'm also a tad rusty.
I am attempting to use functional minimization to fit a polynomial of variable degree to an arbitrary function (F) given a starting guess (ao) and domain of interest (d). Note that while F is arbitrary, its nature is such that the integral of the product of F and a polynomial (or F^2) can always be evaluated algebraically.
For the sake of example, I'll use the following inputs:
ao = { 1, 2, 3, 4 }
d = { -1, 1 }
F = Sin[x]
To do so, I create an array of 'indexed' variables
polyCoeff = Array[a, Length[ao], 0]
Result: polyCoeff = {a[0], a[1], a[2], a[3]}
I then create the polynomial itself using the following
genPoly[{},x_] := 0
genPoly[a_List,x_] := First[a] + x genPoly[Rest[a],x]
poly = genPoly[polyCoeff,x]
Result: poly = a[0] + x (a[1] + x (a[2] + x a[3]))
I then define my objective function as the integral, over the domain of interest, of the squared difference between this poly and the function I am attempting to fit:
Q = Integrate[ (poly - F[x])^2, {x, d[[1]],d[[2]]} ]
result: Q = 0.545351 - 2. a[0.]^2 + 0.66667 a[1.]^2 + .....
And this is where things break down. poly looks just as I expected: a polynomial in x with coefficients that look like a[0], a[1], a[2], ... But Q is not exactly what I expected. I expected, and got, a new polynomial, but now the coefficients contain a[0.], a[1.], a[2.], ...
The next step is to create the initial guess for FindMinimum
init = Transpose[{polyCoeff,ao}]
Result: {{a[0],1},{a[1],2},{a[2],3},{a[3],4}}
This looks fine.
But when I make the call to FindMinimum, I get an error because the coefficients passed in the objective (a[0.],a[1.],...) do not match those passed in the initial guess (a[0],a[1],...).
S = FindMinimum[Q,init]
So I think my question is: how do I keep Integrate from changing the arguments of my coefficients? But I am open to other approaches as well. Keep in mind, though, that this is "legacy" work that I really don't want to have to completely revamp.
Thanks much for any/all help.

Algorithm to group items in groups of 3

I am trying to solve a problem where I have pairs like:
A C
B F
A D
D C
F E
E B
A B
B C
E D
F D
and I need to group them in groups of 3, where each group must form a triangle of matches from that list. Basically I need to determine whether or not it is possible to group the collection.
So the possible groupings are (ACD and BFE), or (ABC and DEF), and this collection is groupable, since all letters can be placed in groups of 3 with none left out.
I made a script that achieves this for small amounts of input, but for large amounts it gets too slow.
My logic is:
make a nested loop to find the first match (looping until I find a match)
> remove the 3 elements from the collection
> run again
and I do this until I am out of letters. Since there can be different combinations, I run this multiple times, starting with different letters, until I find a grouping.
I understand that this takes on the order of at least N^N loops and can get too slow. Is there better logic for such problems? Could a binary tree be used here?
This problem can be modeled as a graph clique cover problem. Every letter is a node and every pair is an edge, and you want to partition the graph into vertex-disjoint cliques of size 3 (triangles). If you want the partitioning to be of minimum cardinality, then you want a minimum clique cover.
Actually this would be a k-clique cover problem, because in the clique cover problem you can have cliques of arbitrary/different sizes.
As Alberto Rivelli already stated, this problem is reducible to the Clique Cover problem, which is NP-hard.
It is also reducible to the problem of finding a clique of a particular/maximum size. Maybe there are other, non-NP-hard problems your particular case could be reduced to, but I didn't think of any.
However, there do exist algorithms which find the solution quickly in many practical cases, although not in the worst case. One of them is the Bron-Kerbosch algorithm, one of the most efficient known algorithms for finding maximal cliques, with a worst case of O(3^(n/3)). I don't know the size of your inputs, but I hope it will be sufficient for your problem.
Here is the code in Python, ready to go:
#!/usr/bin/python3
# by DeFazer
# Solution to:
# stackoverflow.com/questions/40193648/algorithm-to-group-items-in-groups-of-3
# Input:
# N P - number of vertices and number of pairs
# P pairs, 1 pair per line
# Output:
# "YES" and the groups themselves if grouping is possible, and "NO" otherwise
# Input example:
# 6 10
# 1 3
# 2 6
# 1 4
# 4 3
# 6 5
# 5 2
# 1 2
# 2 3
# 5 4
# 6 4
# Output example:
# YES
# 1-2-3
# 4-5-6
# Output commentary:
# There are 2 possible coverages: 1-2-3*4-5-6 and 2-5-6*1-3-4.
# If required, it can be easily modified to return all possible groupings rather than just one.
# Algorithm:
# 1) List *all* existing triangles (1-2-3, 1-3-4, 2-5-6...)
# 2) Build a graph where vertices represent triangles and edges connect triangles with no common vertices.
# 3) Use [this](en.wikipedia.org/wiki/Bron-Kerbosch_algorithm) algorithm (slightly modified) to find a clique of size N/3.
#    The grouping is possible if such a clique exists.

N, P = map(int, input().split())
assert (N % 3 == 0) and (N > 0)
cliquelength = N // 3
pairs = {}  # {a:{b, d, c}, b:{a, c, f}, c:{a, b}...}

# Get input
# [(0, 1), (1, 3), (3, 2)...]
##pairlist = list(map(lambda ab: tuple(map(lambda a: int(a)-1, ab)), (input().split() for pair in range(P))))
pairlist = []
for pair in range(P):
    a, b = map(int, input().split())
    if a > b:
        b, a = a, b
    a, b = a - 1, b - 1
    pairlist.append((a, b))
pairlist.sort()
for pair in pairlist:
    a, b = pair
    if a not in pairs:
        pairs[a] = set()
    pairs[a].add(b)

# Make list of triangles
triangles = []
for a in range(N - 2):
    for b in pairs.get(a, []):
        for c in pairs.get(b, []):
            if c in pairs[a]:
                triangles.append((a, b, c))

def no_mutual_elements(sortedtupleA, sortedtupleB):
    # Utility function
    # TODO: if too slow, can be improved to O(n) since the tuples are sorted.
    # However, there are only 9 comparisons in the case of triangles.
    return all((a not in sortedtupleB) for a in sortedtupleA)

# Make a graph out of that list
tgraph = []  # if a<b and (b in tgraph[a]), then triangles[a] has no common elements with triangles[b]
T = len(triangles)
for t1 in range(T):
    s = set()
    for t2 in range(t1 + 1, T):
        if no_mutual_elements(triangles[t1], triangles[t2]):
            s.add(t2)
    tgraph.append(s)

def connected(a, b):
    if a > b:
        b, a = a, b
    return (b in tgraph[a])

# Finally, the magic algorithm!
CSUB = set()
def extend(CAND: set, NOT: set) -> bool:
    # while CAND is not empty and there is no vertex in NOT connected to *all* vertices in CAND
    while CAND and all((any(not connected(n, c) for c in CAND)) for n in NOT):
        v = CAND.pop()
        CSUB.add(v)
        newCAND = {c for c in CAND if connected(c, v)}
        newNOT = {n for n in NOT if connected(n, v)}
        if (not newCAND) and (not newNOT) and (len(CSUB) == cliquelength):  # the last condition is the algorithm modification
            return True
        elif extend(newCAND, newNOT):
            return True
        else:
            CSUB.remove(v)
            NOT.add(v)
    return False

if extend(set(range(T)), set()):
    print("YES")
    # If the clique itself is not needed, it's enough to remove the following 2 lines
    for a, b, c in [triangles[c] for c in CSUB]:
        print("{}-{}-{}".format(a + 1, b + 1, c + 1))
else:
    print("NO")
If this solution is still too slow, perhaps it would be more efficient to solve the Clique Cover problem instead. If that's the case, I can try to find a proper algorithm for it.
Hope that helps!
Well, I have implemented the job in JS, where I feel most confident. I also tried it with 100,000 edges randomly selected from 26 letters. Provided that they are all unique and there is no degenerate edge such as ["A","A"], it resolves in roughly 90~500 msecs. The most convoluted part was obtaining the non-identical groups, i.e. eliminating results that differ only in the order of the triangles. For the given edges data it resolves within 1 msec.
In summary, the first reduce stage finds the triangles and the second reduce stage groups the disconnected ones.
function getDisconnectedTriangles(edges){
  return edges.reduce(function(p,e,i,a){
    var ce = a.slice(i+1)
              .filter(f => f.some(n => e.includes(n))), // connected edges
        re = [];                                        // resulting edges
    if (ce.length > 1){
      re = ce.reduce(function(r,v,j,b){
        var xv = v.find(n => e.indexOf(n) === -1), // find the external vertex
            xe = b.slice(j+1)                      // find the external edges
                  .filter(f => f.indexOf(xv) !== -1);
        return xe.length ? (r.push([...new Set(e.concat(v,xe[0]))]),r) : r;
      },[]);
    }
    return re.length ? p.concat(re) : p;
  },[])
  .reduce((s,t,i,a) => t.used ? s
                              : (s.push(a.map((_,j) => a[(i+j)%a.length])
                                         .reduce((p,c,k) => k-1 ? p.every(t => t.every(n => c.every(v => n !== v))) ? (c.used = true, p.push(c), p) : p
                                                                : [p].every(t => t.every(n => c.every(v => n !== v))) ? (c.used = true, [p,c]) : [p])), s)
         ,[]);
}
var edges = [["A","C"],["B","F"],["A","D"],["D","C"],["F","E"],
             ["E","B"],["A","B"],["B","C"],["E","D"],["F","D"]],
    ps = 0,
    pe = 0,
    result = [];

ps = performance.now();
result = getDisconnectedTriangles(edges);
pe = performance.now();
console.log("Disconnected triangles are calculated in", pe-ps, "msecs and the result is:");
console.log(result);
You may generate random edge sets of different sizes and play with the code.

Four nested for loops optimization - I promise I searched

I've tried to find a good way to speed up the code for a problem I've been working on. The basic idea of the code is very simple. There are five inputs:
Four 1xm (for some m < n; they can be different sizes) matrices (A, B, C, D) that are pairwise-disjoint subsets of {1,2,...,n}, and one nxn symmetric binary matrix (M). The basic idea of the code is to check an inequality for every combination of elements and, if the inequality holds, return the values that cause it to hold, i.e.:
for a = A
    for b = B
        for c = C
            for d = D
                if M(a,c) + M(b,d) < M(a,d) + M(b,c)
                    result = [a b c d];
                    return
                end
            end
        end
    end
end
I know there has to be a better way to do this. First, since M is symmetric, I can cut the number of checks in half because M(a,b) = M(b,a). I've been researching vectorization and found several MATLAB functions I'd never heard of (I'm relatively new), but I can't find anything that particularly helps with this specific problem. I've thought of other ways to approach the problem, but nothing has been perfected, and I just don't know what to do at this point.
For example, I could possibly split this into two cases:
1) The right hand side is 1: then I have to check that both terms on the left side are 0.
2) The right hand side is 2: then I have to check that at least one term on the left hand side is 0.
But, again, I won't be able to avoid nesting.
I appreciate all the help you can offer. Thank you!
You're asking two questions here: (1) is there a more efficient algorithm to perform this search, and (2) how can I vectorize this in MATLAB. The first one is very interesting to think about, but may be a little beyond the scope of this forum. The second one is easier to answer.
As pointed out in the comments below your question, you can vectorize the for loop by enumerating all of the possibilities and checking them all together, and the answers from this question can help:
[a,b,c,d] = ndgrid(A,B,C,D); % Enumerate all combos
a=a(:); b=b(:); c=c(:); d=d(:); % Reshape from 4-D matrices to vectors
ac = sub2ind(size(M),a,c); % Convert subscript pairs to linear indices
bd = sub2ind(size(M),b,d);
ad = sub2ind(size(M),a,d);
bc = sub2ind(size(M),b,c);
mask = (M(ac) + M(bd) < M(ad) + M(bc)); % Test the inequality
results = [a(mask), b(mask), c(mask), d(mask)]; % Select the ones that pass
Again, this isn't an algorithmic change: it still has the same complexity as your nested for loop. The vectorization may cause it to run faster, but it also lacks early termination, so in certain cases it may be slower.
Since M is binary, we can think about this as a graph problem. i,j in {1..n} correspond to nodes, and M(i,j) indicates whether there is an undirected edge connecting them.
Since A,B,C,D are disjoint, that simplifies the problem a bit. We can approach the problem in stages:
Find all (c,d) for which there exists an a such that M(a,c) < M(a,d). Let's call this set CD_lt_a (the subset of C*D such that the "less than" inequality holds for some a).
Find all (c,d) for which there exists a such that M(a,c) <= M(a,d), and call this set CD_le_a.
Repeat for b, forming CD_lt_b for M(b,d) < M(b,c) and CD_le_b for M(b,d)<=M(b,c).
One way to satisfy the overall inequality is for M(a,c) < M(a,d) and M(b,d) <= M(b,c), so we can look at the intersection of CD_lt_a and CD_le_b.
The other way is if M(a,c) <= M(a,d) and M(b,d) < M(b,c), so look at the intersection of CD_le_a and CD_lt_b.
With (c,d) known, we can go back and find the (a,b).
And so my implementation is:
% 0. Some preliminaries
% Get the size of each set
mA = numel(A); mB = numel(B); mC = numel(C); mD = numel(D);
% 1. Find all (c,d) for which there exists a such that M(a,c) < M(a,d)
CA_linked = M(C,A);
AD_linked = M(A,D);
CA_not_linked = ~CA_linked;
% Multiplying these matrices tells us, for each (c,d), how many nodes
% in A satisfy this M(a,c)<M(a,d) inequality
% Ugh, we need to cast to double to use the matrix multiplication
CD_lt_a = (CA_not_linked * double(AD_linked)) > 0;
% 2. For M(a,c) <= M(a,d), check that the converse is false for some a
AD_not_linked = ~AD_linked;
CD_le_a = (CA_linked * double(AD_not_linked)) < mA;
% 3. Repeat for b
CB_linked = M(C,B);
BD_linked = M(B,D);
CD_lt_b = (CB_linked * double(~BD_linked)) > 0;
CD_le_b = (~CB_linked * double(BD_linked)) < mB;
% 4. Find the intersection of CD_lt_a and CD_le_b - this is one way
% to satisfy the inequality M(a,c)+M(b,d) < M(a,d)+M(b,c)
CD_satisfy_ineq_1 = CD_lt_a & CD_le_b;
% 5. The other way to satisfy the inequality is CD_le_a & CD_lt_b
CD_satisfy_ineq_2 = CD_le_a & CD_lt_b;
inequality_feasible = any(CD_satisfy_ineq_1(:) | CD_satisfy_ineq_2(:));
Note that you can stop here if feasibility is your only concern. The complexity is A*C*D + B*C*D, which is better than the worst-case A*B*C*D complexity of the for loop. However, early termination means your nested for loops may still be faster in certain cases.
The next block of code enumerates all the a,b,c,d that satisfy the inequality. It's not very well optimized (it appends to a matrix from within a loop), so it can be pretty slow if there are many results.
% 6. With (c,d) known, find a and b
% We can define these functions to help us search
find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d));
find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d));
find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d));
find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d));
% I'm gonna assume there aren't too many results, so I will be appending
% to an array inside of a for loop. Bad for performance, but maybe a bit
% more readable for a StackOverflow answer.
results = zeros(0,4);

% Find those that satisfy it the first way
[c_list,d_list] = find(CD_satisfy_ineq_1);
for ii = 1:numel(c_list)
    c = c_list(ii); d = d_list(ii);
    a = find_a_lt(c,d);
    b = find_b_le(c,d);
    % a,b might be vectors, in which case all combos are valid
    % Many ways to find all combos, gonna use ndgrid()
    [a,b] = ndgrid(a,b);
    % Append these to the growing list of results
    abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
    results = [results; abcd];
end

% Repeat for the second way
[c_list,d_list] = find(CD_satisfy_ineq_2);
for ii = 1:numel(c_list)
    c = c_list(ii); d = d_list(ii);
    a = find_a_le(c,d);
    b = find_b_lt(c,d);
    % a,b might be vectors, in which case all combos are valid
    % Many ways to find all combos, gonna use ndgrid()
    [a,b] = ndgrid(a,b);
    % Append these to the growing list of results
    abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
    results = [results; abcd];
end

% Remove duplicates
results = unique(results, 'rows');

% And actually these a,b,c,d will be indices into A,B,C,D because they
% were obtained from calling find() on submatrices of M.
if ~isempty(results)
    results(:,1) = A(results(:,1));
    results(:,2) = B(results(:,2));
    results(:,3) = C(results(:,3));
    results(:,4) = D(results(:,4));
end
I tested this on the following test case:
m = 1000;
A = (1:m); B = A(end)+(1:m); C = B(end)+(1:m); D = C(end)+(1:m);
M = rand(D(end),D(end)) < 1e-6; M = M | M';
I like to think that first part (see if the inequality is feasible for any a,b,c,d) worked pretty well. The other vectorized answers (that use ndgrid or combvec to enumerate all combinations of a,b,c,d) would require 8 terabytes of memory for a problem of this size!
But I would not recommend running the second part (enumerating all of the results) when there are more than a few hundred c,d that satisfy the inequality, because it will be pretty damn slow.
P.S. I know I answered already, but that answer was about vectorizing such loops in general, and is less specific to your particular problem.
P.P.S. This kinda reminds me of the stable marriage problem. Perhaps some of those references contain algorithms relevant to your problem as well. I suspect that a true graph-based algorithm could achieve the same worst-case complexity as this while additionally offering early termination. But I think it would be difficult to implement such an algorithm efficiently in MATLAB.
P.P.P.S. If you only want one of the feasible solutions, you can simplify step 6 to only return a single value, e.g.
find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d), 1, 'first');
find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d), 1, 'first');
find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d), 1, 'first');
find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d), 1, 'first');

if any(CD_satisfy_ineq_1(:))
    [c,d] = find(CD_satisfy_ineq_1, 1, 'first');
    a = find_a_lt(c,d);
    b = find_b_le(c,d);
    result = [A(a), B(b), C(c), D(d)];
elseif any(CD_satisfy_ineq_2(:))
    [c,d] = find(CD_satisfy_ineq_2, 1, 'first');
    a = find_a_le(c,d);
    b = find_b_lt(c,d);
    result = [A(a), B(b), C(c), D(d)];
else
    result = zeros(0,4);
end
If you have access to the Neural Network Toolbox, combvec could be helpful here.
Running allCombs = combvec(A,B,C,D) will give you a (4 by m1*m2*m3*m4) matrix that looks like:
[...
a1, a1, a1, a1, a1 ... a1... a2... am1;
b1, b1, b1, b1, b1 ... b2... b1... bm2;
c1, c1, c1, c1, c2 ... c1... c1... cm3;
d1, d2, d3, d4, d1 ... d1... d1... dm4]
You can then use sub2ind and matrix indexing to set up the two sides of your inequality:
indices = [sub2ind(size(M),allCombs(1,:),allCombs(3,:));
sub2ind(size(M),allCombs(2,:),allCombs(4,:));
sub2ind(size(M),allCombs(1,:),allCombs(4,:));
sub2ind(size(M),allCombs(2,:),allCombs(3,:))];
testValues = M(indices);
testValues(5,:) = (testValues(1,:) + testValues(2,:) < testValues(3,:) + testValues(4,:))
Your final a,b,c,d indices could be retrieved by saying
allCombs(:,find(testValues(5,:)))
which would print a matrix with all the columns for which the inequality was true.

The Movie Scheduling _Problem_

Currently I'm reading "The Algorithm Design Manual" by Skiena (well, beginning to read it).
He poses a problem he calls the "Movie Scheduling Problem":
Problem: Movie Scheduling Problem
Input: A set I of n intervals on the line.
Output: What is the largest subset of mutually non-overlapping intervals which can
be selected from I?
Example: (each dashed line is a movie; you want to find a set with the highest number of movies)
----a---
-----b---- -----c--- ---d---
-----e--- -------f---
--g-- --h--
The algorithm I thought of to solve it is this:
I could throw out the "worst offender" (the interval that intersects the most other movies) until there are no offenders left (zero intersections). The only problem I see is that if there is a tie (say, two different movies each intersecting 3 other movies), could it matter which one I throw out?
Basically I'm wondering how to turn this idea into "math" and how to prove it correct or incorrect.
The algorithm is incorrect. Let's consider the following example:
Counterexample
|----F----| |-----G------|
|-------D-------| |--------E--------|
|-----A------| |------B------| |------C-------|
You can see that there is a solution of size at least 3 because you can pick A, B and C.
Firstly, let's count, for each interval the number of intersections:
A = 2 [F, D]
B = 4 [D, F, E, G]
C = 2 [E, G]
D = 3 [A, B, F]
E = 3 [B, C, G]
F = 3 [A, B, D]
G = 3 [B, C, E]
Now consider a run of your algorithm. In the first step we delete B, because it intersects the largest number of intervals, and we get:
|----F----| |-----G------|
|-------D-------| |--------E--------|
|-----A------| |------C-------|
It's easy to see that now from {A, D, F} you can choose only one, because each pair intersects. The same goes for {G, E, C}, so after deleting B you can choose at most one from {A, D, F} and at most one from {G, E, C}, for a total of 2, which is smaller than the size of {A, B, C}.
The conclusion is that after deleting B, which intersects the largest number of intervals, you can't get the maximum number of non-intersecting movies.
Correct solution
The problem is very well known and one solution is to pick the interval which ends first, delete all intervals intersecting it, and continue until there are no intervals left to examine. This is an example of a greedy method, and you can find or develop a proof that it's correct.
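For reference, a minimal Python sketch of that greedy rule (my representation: intervals as (start, end) pairs, with touching endpoints not counted as overlapping):

def max_non_overlapping(intervals):
    # Sort by finishing time; always take the interval that ends first,
    # then skip everything that overlaps it.
    chosen = []
    last_end = float('-inf')
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen

print(max_non_overlapping([(0, 6), (1, 4), (3, 5), (3, 8), (5, 7), (5, 9), (6, 10), (8, 11)]))
# [(1, 4), (5, 7), (8, 11)]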
This looks like a dynamic programming problem to me:
Define the following functions:
sched(t) = best schedule starting at time t
next(t) = set of movies that start next after time t
len(m) = length of movie m
next returns a set because there may be more than one movie that starts at the same time.
then sched should be defined as follows:
sched(t) = max { 1 + sched(t + len(m)), sched(t+1) } where m in next(t)
This recursive function selects a movie m from next(t) and compares the largest possible sets that either include or don't include m.
Invoke sched with the time of your first movie and you will get the size of the optimal set. Getting the optimal set itself just requires a little extra logic to remember which movies you select at each invocation.
I think this recursive (as opposed to iterative) algorithm runs in O(n^2) if you use memoization, where n is the number of movies.
It's correct, though I'd have to consult my algorithms textbook to give you an explicit proof; hopefully the algorithm makes intuitive sense as to why.
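A quick memoized sketch of this recurrence in Python (my representation: movies as (start, length) pairs with integer times; the movie list is made up for illustration):

from functools import lru_cache

movies = [(0, 5), (2, 4), (4, 3), (6, 4)]  # (start, length) pairs
t_end = max(s + l for s, l in movies)

@lru_cache(maxsize=None)
def sched(t):
    # size of the best schedule starting at time t
    if t > t_end:
        return 0
    best = sched(t + 1)                  # option: skip time t
    for s, l in movies:
        if s == t:                       # m in next(t): movies starting now
            best = max(best, 1 + sched(t + l))
    return best

print(sched(min(s for s, _ in movies)))  # 2 for this movie list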
# go through the database and create a 2-D matrix indexed a..h by a..h. Set each
# element of the matrix to 1 if the row index movie overlaps the column index movie.
mtx = []
for i in range(8):
    column = []
    for j in range(8):
        column.append(0)
    mtx.append(column)

# b <> e
mtx[1][4] = 1
mtx[4][1] = 1
# e <> g
mtx[4][6] = 1
mtx[6][4] = 1
# e <> c
mtx[4][2] = 1
mtx[2][4] = 1
# c <> a
mtx[2][0] = 1
mtx[0][2] = 1
# c <> f
mtx[2][5] = 1
mtx[5][2] = 1
# c <> g
mtx[2][6] = 1
mtx[6][2] = 1
# c <> h
mtx[2][7] = 1
mtx[7][2] = 1
# d <> f
mtx[3][5] = 1
mtx[5][3] = 1
# a <> f
mtx[0][5] = 1
mtx[5][0] = 1
# a <> d
mtx[0][3] = 1
mtx[3][0] = 1
# a <> h
mtx[0][7] = 1
mtx[7][0] = 1
# e <> h
mtx[4][7] = 1
mtx[7][4] = 1

# print out constraints
for line in mtx:
    print(line)

# keep track of which movies are still allowed
allowed = set(range(8))

# loop through in greedy fashion, picking the movie that throws out the least
# number of other movies at each step
while allowed:
    best_col = None
    best_lost = set()
    best = 8  # worse than any movie's possible conflict count
    # each step, only try movies still allowed
    for col in allowed:
        lost = set()
        for row in range(8):
            # keep track of other movies eliminated by this selection
            if mtx[row][col] == 1:
                lost.add(row)
        # this was the best of all the allowed choices so far
        if len(lost) < best:
            best_col = col
            best_lost = lost
            best = len(lost)
    # process the selection
    print('watch movie:', chr(best_col + ord('a')))
    for row in best_lost:
        # now eliminate the other movies you can't watch
        if row in allowed:
            print('throwing out:', chr(row + ord('a')))
            allowed.remove(row)
    # also remove this movie from the allowed list (can't watch twice)
    allowed.remove(best_col)

# this is just a greedy algorithm, not guaranteed optimal!
# you could also iterate through all possible combinations of movies
# and simply eliminate all illegal possibilities (brute force search)
