How to solve this non 0-1 integer knapsack problem in Ruby

Question:
Minimise x1 + x2 + ... + xn
subject to k1*x1 + k2*x2 + ... + kn*xn = T
k1,k2,...,kn and T are known integers and > 0
k1 > k2 > k3 > ... > kn
All the x are also integers and >= 0
Find all the x
I was trying to use Rglpk and GLPK, but I can't find an example with only one row in the constraint matrix. Is this integer programming? And is it solvable? Many thanks.
Some Ruby code I wrote:
require 'rglpk'

ks = [33, 18, 15, 5, 3]
t = 999
problem = Rglpk::Problem.new
problem.name = "test"
problem.obj.dir = Rglpk::GLP_MIN
rows = problem.add_rows(1)
rows[0].name = "sum of x equals t"
rows[0].set_bounds(Rglpk::GLP_UP, t, t)
cols = problem.add_cols(ks.size)
ks.each_with_index do |k, index|
  cols[index].name = "k: #{k}"
  cols[index].set_bounds(Rglpk::GLP_LO, 0.0, 0.0)
end
problem.obj.coefs = Array.new(ks.size, 1) # minimise x1 + x2 + ... + xn
problem.set_matrix(ks)                    # the single constraint row
problem.simplex
minimum_x_sum = problem.obj.get
xs = []
cols.each do |col|
  xs << col.get_prim
end
xs

Yes, it is an integer program, a rather famous one, the so-called "knapsack problem". You therefore can solve it with either of the packages you mention (provided the number of variables is not too great) but a much more efficient approach is to use dynamic programming (see the above link). The use of DP here is quite simple to implement. This is one Ruby implementation I found by Googling.
I should mention a few related tidbits. Firstly, your constraint is an equality constraint:
k1x1 + k2x2 +...+ knxn = T
but this is normally assumed to be an inequality by (DP) knapsack algorithms:
k1x1 + k2x2 +...+ knxn <= T
To deal with an equality constraint you can either modify the algorithm slightly, or add the penalty term:
M*(T - k1x1 - k2x2 -...- knxn)
to the objective you are minimizing, where M is a very large number (10^6, perhaps), thereby forcing equality at the optimal solution. (When expanded, the coefficient for each xi becomes 1 - M*ki. The constant term M*T can be disregarded.)
Two more details:
DP algorithms permit the variables in the objective to have coefficients other than 1 (and there is no gain in efficiency when all the coefficients equal 1); and
If the DP algorithm maximizes (rather than minimizes) the objective, you can simply negate the coefficients of the variables in the objective to obtain an optimal solution to the minimization problem.
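For illustration, here is a minimal dynamic-programming sketch in Python (my own, not the linked Ruby implementation). It handles the equality constraint directly by only accepting totals that are exactly reachable, and it minimises the plain count of items:

def min_items_exact(ks, t):
    INF = float("inf")
    best = [0] + [INF] * t        # best[v] = fewest items summing exactly to v
    choice = [None] * (t + 1)     # remember one item used to reach v
    for v in range(1, t + 1):
        for k in ks:
            if k <= v and best[v - k] + 1 < best[v]:
                best[v] = best[v - k] + 1
                choice[v] = k
    if best[t] == INF:
        return None               # no exact combination exists
    xs = {k: 0 for k in ks}       # recover the multiplicity of each k
    v = t
    while v:
        xs[choice[v]] += 1
        v -= choice[v]
    return xs

print(min_items_exact([33, 18, 15, 5, 3], 999))   # prints one optimal set of multiplicities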


Product of consecutive numbers f(n) = n(n-1)(n-2)(n-3)(n- ...) find the value of n

Is there a way to find programmatically the consecutive natural numbers?
On the Internet I found some examples using either factorization or polynomial solving.
Example 1
For n(n−1)(n−2)(n−3) = 840
n = 7, -4, (3+i√111)/2, (3-i√111)/2
Example 2
For n(n−1)(n−2)(n−3) = 1680
n = 8, −5, (3+i√159)/2, (3-i√159)/2
Both of those examples give 4 results (because both are 4th-degree equations), but for my use case I'm only interested in the natural value. Also, the solution should work for any sequence size of consecutive numbers, in other words, n(n−1)(n−2)(n−3)(n−4)...
The solution can be an algorithm or come from any open math library. The parameters passed to the algorithm will be the product and the degree (the sequence size); for those two examples the product is 840 or 1680 and the degree is 4 for both.
Thank you
If you're interested only in the natural "n" solution, then this reasoning may help:
Let's say n(n-1)(n-2)(n-3)...(n-k) = A
The solution n = s then satisfies:
remainder of A/s = 0
remainder of A/(s-1) = 0
remainder of A/(s-2) = 0
and so on
Now, s is on the order of t = A^(1/(k+1)): A is roughly s multiplied by itself k+1 times, since the product has k+1 factors. More precisely, A^(1/(k+1)) lies strictly between s-k and s, so we can start with v = t and finish at v = t+k+1 (the extra 1 absorbs the truncation of t). The solution, if it exists, will be between these two values.
So the algorithm may be, roughly:
s = 0
t = (int) (A^(1/(k+1)))   // integer truncation; the loop range below compensates
theLoop:
for (v = t to v = t+k+1, step = +1)
{   i = 0
    while (i <= k)
    {   if (A % (v - k + i) > 0)   // % gives the remainder
            continue at theLoop    // v cannot be the solution, try the next one
        i = i + 1
    }
    // v-k, ..., v-1, v are all divisors of A: solution found
    s = v
    break
}
if (s == 0)
    no natural solution
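A rough, runnable Python version of that search (the function name is mine; I also rebuild the product before accepting v, since divisibility by all of v-k, ..., v is necessary but not sufficient on its own):

def find_n(product, k):
    # n is searched near product ** (1/(k+1)); the window is widened slightly
    # to absorb rounding, and v - k is kept >= 1 to avoid dividing by zero.
    t = int(round(product ** (1.0 / (k + 1))))
    for v in range(max(t - 1, k + 1), t + k + 2):
        if all(product % (v - i) == 0 for i in range(k + 1)):
            p = 1
            for i in range(k + 1):
                p *= (v - i)
            if p == product:      # confirm: divisibility alone is not enough
                return v
    return None                   # no natural solution

print(find_n(840, 3))    # -> 7
print(find_n(1680, 3))   # -> 8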
Assuming that:
n is an integer, and
n > 0, and
k < n
Then approximately:
n = FLOOR( product ** (1/(k+1)) + (k+1)/2 )
The only cases I have found where this isn't exactly right are when k is very close to n. You can of course check it by back-calculating the product and seeing if it matches. If not, the true n is almost certainly only 1 or 2 higher than this estimate, so just keep incrementing n until the product matches. (I can write this up in pseudocode if you need it.)
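A short Python sketch of that estimate-and-check idea (the function name and the small retry limit are my own choices):

def estimate_n(product, k):
    n = int(product ** (1.0 / (k + 1)) + (k + 1) / 2.0)   # closed-form estimate
    for _ in range(4):            # the estimate is rarely off by more than 1 or 2
        p = 1
        for i in range(k + 1):
            p *= (n - i)
        if p == product:
            return n
        n += 1
    return None

print(estimate_n(840, 3))    # -> 7
print(estimate_n(1680, 3))   # -> 8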

Checking that a matrix is positive semidefinite with a given rank (in Julia)

I am writing a function that checks if a matrix X is positive semidefinite with a given rank k. To do this, I compute the eigenvalues of X, and I check that exactly k of them are positive and the rest are 0. Here's what I have so far:
using LinearAlgebra
function ispossemdef(X::AbstractMatrix, k::Int, ϵ::Real = 1e-10)
    n = size(X, 1)                  # dim of X
    !issymmetric(X) && return false # short-circuit if X is asymmetric
    k > n && error("k > n")         # throw error if k > n
    eigs = eigvals(X)               # eigenvalues of X in ascending order
    z = eigs[1:(n - k)]             # the values that should be zero
    p = eigs[(n - k + 1):end]       # the values that should be positive
    n_minus_k_zero_eigenvalues = norm(z) < ϵ
    k_positive_eigenvalues = all(p .> ϵ)
    return n_minus_k_zero_eigenvalues & k_positive_eigenvalues
end
Is there a better algorithm for doing this? Better might mean faster (avoids computing the eigenvalues), or more numerically stable (lets me get away with a stricter error tolerance).
For example, the isposdef function (which is the k = n special case of what I'm doing) works by attempting to compute the Cholesky factor of X, and reporting back with whether or not it could. Can I generalize this procedure to semidefinite matrices? If so, is it better than checking the eigenvalues?
It will not work on all matrices, but have you looked at the isposdef() function?
using LinearAlgebra # for Julia 1+
help?> isposdef

Four nested for loops optimization - I promise I searched

I've tried to find a good way to speed up the code for a problem I've been working on. The basic idea of the code is very simple. There are five inputs:
Four 1-by-m matrices (A, B, C, D), for possibly different values of m (each m < n), that are pairwise-disjoint subsets of {1,2,...,n}, and one n-by-n symmetric binary matrix (M). The basic idea of the code is to check an inequality for every combination of elements and, if the inequality holds, return the values that cause it to hold, i.e.:
for a = A
    for b = B
        for c = C
            for d = D
                if M(a,c) + M(b,d) < M(a,d) + M(b,c)
                    result = [a b c d];
                    return
                end
            end
        end
    end
end
I know there has to be a better way to do this. First, since it's symmetric, I can cut down half of the items checked since M(a,b) = M(b,a). I've been researching vectorization, found several functions I'd never heard of with MATLAB (since I'm relatively new), but I can't find anything that will particularly help me with this specific problem. I've thought of other ways to approach the problem, but nothing has been perfected, and I just don't know what to do at this point.
For example, I could possibly split this into two cases:
1) The right hand side is 1: then I have to check that both terms on the left side are 0.
2) The right hand side is 2: then I have to check that at least one term on the left hand side is 0.
But, again, I won't be able to avoid nesting.
I appreciate all the help you can offer. Thank you!
You're asking two questions here: (1) is there a more efficient algorithm to perform this search, and (2) how can I vectorize this in MATLAB. The first one is very interesting to think about, but may be a little beyond the scope of this forum. The second one is easier to answer.
As pointed out in the comments below your question, you can vectorize the for loop by enumerating all of the possibilities and checking them all together, and the answers from this question can help:
[a,b,c,d] = ndgrid(A,B,C,D); % Enumerate all combos
a=a(:); b=b(:); c=c(:); d=d(:); % Reshape from 4-D matrices to vectors
ac = sub2ind(size(M),a,c); % Convert subscript pairs to linear indices
bd = sub2ind(size(M),b,d);
ad = sub2ind(size(M),a,d);
bc = sub2ind(size(M),b,c);
mask = (M(ac) + M(bd) < M(ad) + M(bc)); % Test the inequality
results = [a(mask), b(mask), c(mask), d(mask)]; % Select the ones that pass
Again, this isn't an algorithmic change: it still has the same complexity as your nested for loop. The vectorization may cause it to run faster, but it also lacks early termination, so in certain cases it may be slower.
Since M is binary, we can think about this as a graph problem. i,j in {1..n} correspond to nodes, and M(i,j) indicates whether there is an undirected edge connecting them.
Since A,B,C,D are disjoint, that simplifies the problem a bit. We can approach the problem in stages:
Find all (c,d) for which there exists a such that M(a,c) < M(a,d). Let's call this set CD_lt_a, (the subset of C*D such that the "less than" inequality holds for some a).
Find all (c,d) for which there exists a such that M(a,c) <= M(a,d), and call this set CD_le_a.
Repeat for b, forming CD_lt_b for M(b,d) < M(b,c) and CD_le_b for M(b,d)<=M(b,c).
One way to satisfy the overall inequality is for M(a,c) < M(a,d) and M(b,d) <= M(b,c), so we can look at the intersection of CD_lt_a and CD_le_b.
The other way is if M(a,c) <= M(a,d) and M(b,d) < M(b,c), so look at the intersection of CD_le_a and CD_lt_b.
With (c,d) known, we can go back and find the (a,b).
And so my implementation is:
% 0. Some preliminaries
% Get the size of each set
mA = numel(A); mB = numel(B); mC = numel(C); mD = numel(D);
% 1. Find all (c,d) for which there exists a such that M(a,c) < M(a,d)
CA_linked = M(C,A);
AD_linked = M(A,D);
CA_not_linked = ~CA_linked;
% Multiplying these matrices tells us, for each (c,d), how many nodes
% in A satisfy this M(a,c)<M(a,d) inequality
% Ugh, we need to cast to double to use the matrix multiplication
CD_lt_a = (CA_not_linked * double(AD_linked)) > 0;
% 2. For M(a,c) <= M(a,d), check that the converse is false for some a
AD_not_linked = ~AD_linked;
CD_le_a = (CA_linked * double(AD_not_linked)) < mA;
% 3. Repeat for b
CB_linked = M(C,B);
BD_linked = M(B,D);
CD_lt_b = (CB_linked * double(~BD_linked)) > 0;
CD_le_b = (~CB_linked * double(BD_linked)) < mB;
% 4. Find the intersection of CD_lt_a and CD_le_b - this is one way
% to satisfy the inequality M(a,c)+M(b,d) < M(a,d)+M(b,c)
CD_satisfy_ineq_1 = CD_lt_a & CD_le_b;
% 5. The other way to satisfy the inequality is CD_le_a & CD_lt_b
CD_satisfy_ineq_2 = CD_le_a & CD_lt_b;
inequality_feasible = any(CD_satisfy_ineq_1(:) | CD_satisfy_ineq_2(:));
Note that you can stop here if feasibility is your only concern. The complexity is A*C*D + B*C*D, which is better than the worst-case A*B*C*D complexity of the for loop. However, early termination means your nested for loops may still be faster in certain cases.
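As an aside (not part of the original answer), here is a rough NumPy translation of that feasibility test, in case a Python version of steps 1-5 is easier to follow; the function name and the 0-based index convention are my own assumptions:

import numpy as np

def inequality_feasible(M, A, B, C, D):
    # M is an n-by-n boolean (or 0/1) array; A, B, C, D are 0-based index arrays.
    CA = M[np.ix_(C, A)].astype(int)   # CA[c,a] = M(a,c) by symmetry
    AD = M[np.ix_(A, D)].astype(int)
    CB = M[np.ix_(C, B)].astype(int)
    BD = M[np.ix_(B, D)].astype(int)
    mA, mB = len(A), len(B)
    CD_lt_a = (1 - CA) @ AD > 0        # exists a with M(a,c) < M(a,d)
    CD_le_a = CA @ (1 - AD) < mA       # exists a with M(a,c) <= M(a,d)
    CD_lt_b = CB @ (1 - BD) > 0        # exists b with M(b,d) < M(b,c)
    CD_le_b = (1 - CB) @ BD < mB       # exists b with M(b,d) <= M(b,c)
    return bool(np.any((CD_lt_a & CD_le_b) | (CD_le_a & CD_lt_b)))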
The next block of code enumerates all the a,b,c,d that satisfy the inequality. It's not very well optimized (it appends to a matrix from within a loop), so it can be pretty slow if there are many results.
% 6. With (c,d) known, find a and b
% We can define these functions to help us search
find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d));
find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d));
find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d));
find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d));
% I'm gonna assume there aren't too many results, so I will be appending
% to an array inside of a for loop. Bad for performance, but maybe a bit
% more readable for a StackOverflow answer.
results = zeros(0,4);
% Find those that satisfy it the first way
[c_list,d_list] = find(CD_satisfy_ineq_1);
for ii = 1:numel(c_list)
c = c_list(ii); d = d_list(ii);
a = find_a_lt(c,d);
b = find_b_le(c,d);
% a,b might be vectors, in which case all combos are valid
% Many ways to find all combos, gonna use ndgrid()
[a,b] = ndgrid(a,b);
% Append these to the growing list of results
abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
results = [results; abcd];
end
% Repeat for the second way
[c_list,d_list] = find(CD_satisfy_ineq_2);
for ii = 1:numel(c_list)
c = c_list(ii); d = d_list(ii);
a = find_a_le(c,d);
b = find_b_lt(c,d);
% a,b might be vectors, in which case all combos are valid
% Many ways to find all combos, gonna use ndgrid()
[a,b] = ndgrid(a,b);
% Append these to the growing list of results
abcd = [a(:), b(:), repmat([c d],[numel(a),1])];
results = [results; abcd];
end
% Remove duplicates
results = unique(results, 'rows');
% And actually these a,b,c,d will be indices into A,B,C,D because they
% were obtained from calling find() on submatrices of M.
if ~isempty(results)
results(:,1) = A(results(:,1));
results(:,2) = B(results(:,2));
results(:,3) = C(results(:,3));
results(:,4) = D(results(:,4));
end
I tested this on the following test case:
m = 1000;
A = (1:m); B = A(end)+(1:m); C = B(end)+(1:m); D = C(end)+(1:m);
M = rand(D(end),D(end)) < 1e-6; M = M | M';
I like to think that first part (see if the inequality is feasible for any a,b,c,d) worked pretty well. The other vectorized answers (that use ndgrid or combvec to enumerate all combinations of a,b,c,d) would require 8 terabytes of memory for a problem of this size!
But I would not recommend running the second part (enumerating all of the results) when there are more than a few hundred c,d that satisfy the inequality, because it will be pretty damn slow.
P.S. I know I answered already, but that answer was about vectorizing such loops in general, and is less specific to your particular problem.
P.P.S. This kinda reminds me of the stable marriage problem. Perhaps some of those references would contain algorithms relevant to your problem as well. I suspect that a true graph-based algorithm could achieve the same worst-case complexity as this approach while additionally offering early termination. But I think it would be difficult to implement a graph-based algorithm efficiently in MATLAB.
P.P.P.S. If you only want one of the feasible solutions, you can simplify step 6 to only return a single value, e.g.
find_a_lt = @(c,d) find(CA_not_linked(c,:)' & AD_linked(:,d), 1, 'first');
find_a_le = @(c,d) find(CA_not_linked(c,:)' | AD_linked(:,d), 1, 'first');
find_b_lt = @(c,d) find(CB_linked(c,:)' & ~BD_linked(:,d), 1, 'first');
find_b_le = @(c,d) find(CB_linked(c,:)' | ~BD_linked(:,d), 1, 'first');
if any(CD_satisfy_ineq_1(:))
    [c,d] = find(CD_satisfy_ineq_1, 1, 'first');
    a = find_a_lt(c,d);
    b = find_b_le(c,d);
    result = [A(a), B(b), C(c), D(d)];
elseif any(CD_satisfy_ineq_2(:))
    [c,d] = find(CD_satisfy_ineq_2, 1, 'first');
    a = find_a_le(c,d);
    b = find_b_lt(c,d);
    result = [A(a), B(b), C(c), D(d)];
else
    result = zeros(0,4);
end
If you have access to the Neural Network Toolbox, combvec could be helpful here.
running allCombs = combvec(A,B,C,D) will give you a (4 by m1*m2*m3*m4) matrix that looks like:
[...
a1, a1, a1, a1, a1 ... a1... a2... am1;
b1, b1, b1, b1, b1 ... b2... b1... bm2;
c1, c1, c1, c1, c2 ... c1... c1... cm3;
d1, d2, d3, d4, d1 ... d1... d1... dm4]
You can then use sub2ind and matrix indexing to set up the values you need for your inequality:
indices = [sub2ind(size(M),allCombs(1,:),allCombs(3,:));
sub2ind(size(M),allCombs(2,:),allCombs(4,:));
sub2ind(size(M),allCombs(1,:),allCombs(4,:));
sub2ind(size(M),allCombs(2,:),allCombs(3,:))];
testValues = M(indices);
testValues(5,:) = (testValues(1,:) + testValues(2,:) < testValues(3,:) + testValues(4,:))
Your final a,b,c,d indices could be retrieved by saying
allCombs(:,find(testValues(5,:)))
This would print a matrix whose columns are all the combinations for which the inequality was true.
This article might be of some use.

Algorithm for functions permutating integers

I want to write some functions as follows:
y = f(x), and another function,
x = g(y), that acts as its reverse, where
y = f(g(y)) and where x and y are permuted integers.
For a very simple example, on the range of integers 0 to 10, it could look like this:
0->1
1->2
2->3
...
9->10
10->0
but this is just the simplest method: adding 1, and reversing by subtracting 1.
I want a more sophisticated algorithm that can do the following,
234927773->4299
34->33928830
850033->23234243423
but where the reverse can still be obtained by computation.
The mapping could be implemented with a huge table storing pairs of unique integers, but that is not acceptable: it must be a function.
You could just XOR.
y = x XOR p
x = y XOR p
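For example, a two-line Python sketch (the key p is an arbitrary choice of mine):

p = 0x5DEECE66D          # any fixed key; XOR with it is its own inverse
def f(x): return x ^ p
def g(y): return y ^ p

assert g(f(234927773)) == 234927773

On a fixed-width domain (say 64-bit integers) XOR with a constant is a genuine self-inverse permutation, although the pattern it produces is easy to spot.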
Though not my area of expertise, I think that cryptography should provide some valuable answers to your question.
If the domain of your permutation is a power of 2, you can use any block cipher: 'f' is encryption with a specific key, and 'g' is decryption with the same key. If your domain is not a power of 2, you can probably still use a block cipher: see this article.
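To make the block-cipher idea concrete without pulling in a crypto library, here is a toy Python sketch of a 4-round Feistel permutation on 32-bit integers and its inverse; the keys and the round function are arbitrary choices of mine and offer no real security:

KEYS = [0x9E3779B9, 0x85EBCA6B, 0xC2B2AE35, 0x27D4EB2F]

def _round(x, k):
    return ((x * 2654435761) ^ k) & 0xFFFF       # arbitrary 16-bit mixing

def f(x):                                        # permutes 0 .. 2**32 - 1
    left, right = x >> 16, x & 0xFFFF
    for k in KEYS:
        left, right = right, left ^ _round(right, k)
    return (left << 16) | right

def g(y):                                        # inverse of f
    left, right = y >> 16, y & 0xFFFF
    for k in reversed(KEYS):
        left, right = right ^ _round(left, k), left
    return (left << 16) | right

assert all(g(f(x)) == x for x in (0, 34, 850033, 234927773))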
You could use polynomial interpolation methods to interpolate a function one way, then do reverse interpolation to find the inverse function.
Here is some example code in MATLAB:
function [a] = Coef(x, y)
    n = length(x);
    a = y;
    for j = 2:n
        for i = n:-1:j
            a(i) = (a(i) - a(i-1)) / (x(i) - x(i-j+1));
        end
    end
end

function [val] = Eval(x, a, t)
    n = length(x);
    val = a(n);
    for i = n-1:-1:1
        val = a(i) + val*(t-x(i));
    end
end
It builds a divided-difference table and evaluates the interpolating polynomial in Newton form.
Then, if your sets of points are x and y (vectors of the same length, where x(i) corresponds to y(i)), your forward interpolation function at value n would be Eval(x, Coef(x, y), n) and the reverse interpolation function would be Eval(y, Coef(y, x), n).
Depending on your language, there are probably much cleaner ways to do this, but this gets down and dirty with the maths.
Here is an excerpt from the Text Book which is used in my Numerical Methods class: Google Book Link

Dynamic programming idiom for combinations

Consider the problem in which you have a value of N and you need to calculate how many ways you can sum up to N dollars using [1,2,5,10,20,50,100] Dollar bills.
Consider the classic DP solution:
C = [1,2,5,10,20,50,100]
def comb(p):
    if p == 0:
        return 1
    c = 0
    for x in C:
        if x <= p:
            c += comb(p-x)
    return c
It does not account for the order of the summed parts being irrelevant. For example, comb(4) will yield 5 results: [1,1,1,1],[2,1,1],[1,2,1],[1,1,2],[2,2], whereas there are actually only 3 results ([2,1,1],[1,2,1],[1,1,2] are all the same).
What is the DP idiom for calculating this problem? (non-elegant solutions such as generating all possible solutions and removing duplicates are not welcome)
Not sure about any DP idioms, but you could try using Generating Functions.
What we need to find is the coefficient of x^N in
(1 + x + x^2 + ...)(1 + x^2 + x^4 + ...)(1 + x^5 + x^10 + ...)...(1 + x^100 + x^200 + ...)
where there is one factor per denomination and each factor contributes (number of times that bill appears) * (its value) to the exponent, so the terms multiplying out to x^N correspond exactly to the ways of making N.
This product is the same as the reciprocal of
(1-x)(1-x^2)(1-x^5)(1-x^10)(1-x^20)(1-x^50)(1-x^100).
You can now factorize each factor in terms of roots of unity, split the reciprocal into partial fractions (a one-time step), find the coefficient of x^N in each term (which will be of the form Polynomial/(x-w)), and add them up.
You could do some DP in calculating the roots of unity.
You should not start from the beginning of C each time, but at most from where you came from at the previous depth.
That means you have to pass two parameters: the start index and the remaining total.
C = [1,2,5,10,20,50,100]
def comb(p, start=0):
    if p == 0:
        return 1
    c = 0
    for i, x in enumerate(C[start:]):
        if x <= p:
            c += comb(p-x, i+start)
    return c
or, equivalently (this might be more readable):
C = [1,2,5,10,20,50,100]
def comb(p, start=0):
    if p == 0:
        return 1
    c = 0
    for i in range(start, len(C)):
        x = C[i]
        if x <= p:
            c += comb(p-x, i)
    return c
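For reference, the same count can also be computed bottom-up with the standard coin-change DP; this sketch is mine, not part of the answer. Iterating over the bills in the outer loop is what makes each unordered combination be counted exactly once:

def count_combinations(n, bills=(1, 2, 5, 10, 20, 50, 100)):
    ways = [1] + [0] * n              # ways[v] = combinations summing to v
    for bill in bills:
        for v in range(bill, n + 1):
            ways[v] += ways[v - bill]
    return ways[n]

print(count_combinations(4))   # -> 3  ([1,1,1,1], [1,1,2], [2,2])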
Terminology: what you are looking for is the "integer partitions"
into prescribed parts (you should replace "combinations" in the title).
Ignoring the "dynamic programming" part of the question, a routine
for your problem is given in the first section of chapter 16
("Integer partitions", p.339ff) of the fxtbook, online at
http://www.jjj.de/fxt/#fxtbook
