degenerate in simplex method - algorithm

I am reading about linear programming using the simplex method in the book Introduction to Algorithm Design and Analysis.
I am having difficulty understanding the text. Here is the snippet:
The principal advantage of the standard form lies in the simple mechanism it provides for identifying extreme points of the feasible region. For the general case of a problem with m equations in n unknowns (n ≥ m), n − m variables need to be set to zero to get a system of m equations in m unknowns. If the system obtained has a unique solution—as any nondegenerate system of linear equations with the number of equations equal to the number of unknowns does—we have a basic solution; its coordinates set to zero before solving the system are called nonbasic, and its coordinates obtained by solving the system are called basic.
My questions about the above text are:
What is a nondegenerate system of linear equations?
What does the author mean by "unique solution" in this context?

As the author says, "For the general case of a problem with m equations in n unknowns (n ≥ m), (n − m) variables need to be set to zero to get a system of m equations in m unknowns." So when you set (n − m) variables to zero, you are left with a system of m equations in m variables.
Let me explain with an example and some basic math. Suppose that m = 2, and the system of equations you have is:
x + y = 2 and x - y = 0
This system of equations has a "unique solution", namely (x = 1, y = 1).
Now suppose that m = 3, and the system of equations you have is:
x + y + z = 3, x + y - z = 1, and x + y = 2
In this case, any point with z = 1 that satisfies the equation x + y = 2 will satisfy all three equations: for example, (x = 1, y = 1, z = 1), (x = 2, y = 0, z = 1), and (x = 4, y = -2, z = 1). So here you have more than one solution to this system of equations, and you will say that this system has "multiple solutions".
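As a quick numerical sketch of the distinction (using numpy; a nonsingular coefficient matrix means a unique solution, a singular one means none or infinitely many):

import numpy as np

A2 = np.array([[1.0, 1.0],
               [1.0, -1.0]])                 # x + y = 2, x - y = 0
print(np.linalg.det(A2))                     # -2.0: nonsingular, unique solution
print(np.linalg.solve(A2, [2.0, 0.0]))       # [1. 1.]

A3 = np.array([[1.0, 1.0, 1.0],
               [1.0, 1.0, -1.0],
               [1.0, 1.0, 0.0]])             # the m = 3 example above
print(np.linalg.det(A3))                     # 0.0: singular, no unique solution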
Now let us talk about degeneracy. Assuming that the system of m equations in m variables has a unique solution, solving it gives a numerical value for each of the m variables. If any of these m variables has a value equal to zero, the solution is called degenerate. By nondegenerate, the author means that all of the variables have nonzero values in the solution. Degeneracy tends to increase the number of simplex iterations needed before reaching the optimal solution.
Now let us talk a little about the simplex method. Suppose you have set (n − m) of the n variables to zero (as the author says), and you get a unique, nondegenerate solution. This is called a basic solution. But there is a catch: you don't know which (n − m) of the n variables have to be set to zero to find the optimal solution. In principle, you would have C(n, n − m) basic solutions to evaluate. This is where the simplex method helps: it identifies which variables to set to zero in order to reach the optimal solution. Once those variables are identified, and the reduced system of equations has a unique solution, that solution is the optimal solution.
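To illustrate, a hypothetical sketch that enumerates all C(n, m) candidate bases of a standard-form system A x = b by brute force, solving for each choice of basic variables (a zero among the basic values would signal a degenerate basic solution):

from itertools import combinations
import numpy as np

def basic_solutions(A, b):
    # A is m x n in standard form; pick m basic columns, set the rest to zero
    m, n = A.shape
    for basis in combinations(range(n), m):
        B = A[:, list(basis)]
        if abs(np.linalg.det(B)) < 1e-12:
            continue                            # singular: no unique solution
        x = np.zeros(n)
        x[list(basis)] = np.linalg.solve(B, b)  # nonbasic coordinates stay zero
        yield basis, x

# Example: x + y + s1 = 2, x - y + s2 = 0 (s1, s2 are slack variables)
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0, 1.0]])
for basis, x in basic_solutions(A, np.array([2.0, 0.0])):
    print(basis, x)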
I hope it helps :-)


Multiobjective integer programming

I wish to use integer programming to enumerate Pareto optimal solutions.
I would like to implement an algorithm that uses Gurobi or a similar single-objective integer programming solver to do this, but I don't know of any such algorithms. Could someone please propose an algorithm for enumerating the efficient frontier?
In this answer, I'll address how to enumerate all Pareto efficient solutions of 2-objective pure integer optimization problems of the form
min_x {g'x, h'x}
s.t. Ax <= b
x integer
Algorithm
We start the algorithm by optimizing for one of the objectives (we'll use g here). Since this is a standard single-objective integer optimization problem, it can easily be solved with Gurobi or any other integer programming solver:
min_x g'x
s.t. Ax <= b
x integer
We initialize a set P, which will eventually contain all the pareto efficient solutions, to P = {x*}, where x* is the optimal solution of this model. To get the next point on the efficient frontier, which has the second-smallest g'x value and an improved h'x value, we can add the following constraints to our model:
Remove x* from the feasible set of solutions (details on these x != x* constraints are provided later in the answer).
Add a constraint that h'x <= h'x*
The new optimization model that we need to solve is:
min_x g'x
s.t. Ax <= b
x != x* for all x* in P
h'x <= h'x* for all x* in P
x integer
Again, this is a single-objective integer optimization model that can be solved with gurobi or another solver (once you follow the details below on how to model x != x* constraints). As you repeatedly solve this model, adding the optimal solutions to P, solutions will get progressively larger (worse) g'x values and progressively smaller (better) h'x values. Eventually, the model will become infeasible, which means no more points on the pareto frontier exist, and the algorithm terminates.
At this point, there may be some pairs of solutions x, y in P for which g'x = g'y and h'x > h'y, in which case x is dominated by y and can be removed. After filtering in this way, the set P represents the full pareto efficient frontier.
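To make the loop concrete, here is a hypothetical sketch; solve_single_objective(g, P) is an assumed helper that builds and solves the single-objective model above with Gurobi or any MIP solver (with the exclusion and h'x <= h'x* cuts for every x* in P) and returns an optimal point as a numpy vector, or None when the model is infeasible:

import numpy as np

def enumerate_pareto(solve_single_objective, g, h):
    P = []
    x = solve_single_objective(g, P)      # first point: minimize g'x alone
    while x is not None:                  # loop until the model is infeasible
        P.append(x)
        x = solve_single_objective(g, P)  # next point on the frontier
    # drop dominated points: x is dominated when some y ties on g'x
    # but strictly improves h'x
    return [x for x in P
            if not any(g @ x == g @ y and h @ x > h @ y for y in P)]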
x != x* Constraints
All that remains is to model constraints of the form x != x*, where x and x* are n-dimensional vectors of integer variables. Even in one dimension this is a non-convex constraint (see here for details), so we need to add auxiliary variables to help us model the constraint.
Denote the n variables stored in the optimization model (collectively denoted x) as x_1, x_2, ..., x_n, and similarly denote the variable values in x* as x*_1, x*_2, ..., x*_n. We can add new binary variables y_1, y_2, ..., y_n to the model, where y_i is set to 1 when x_i > x*_i. Because x_i and x*_i are integer valued, this is the same as saying x_i >= x*_i + 1, and this can be implemented with the following constraints (M is a large constant):
x_i - x*_i >= 1 - M(1-y_i) for all i = 1, ..., n
Similarly, we can add new binary variables z_1, z_2, ..., z_n to the model, where z_i is set to 1 when x_i < x*_i. Because x_i and x*_i are integer valued, this is the same as saying x_i <= x*_i - 1, and this can be implemented with the following big-M constraints:
x_i - x*_i <= -1 + M(1-z_i) for all i = 1, ..., n
If at least one of the y or z variables is set, then we know that our x != x* constraint is satisfied. Therefore, we can replace x != x* with:
y_1 + y_2 + ... + y_n + z_1 + z_2 + ... + z_n >= 1
In short, each x != x* constraint can be handled by adding 2n binary variables and 2n+1 constraints to the model, where n is the number of variables in the original model.
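For concreteness, a minimal gurobipy sketch of this construction (the helper name and the lo/hi bound arguments are my own assumptions; M just needs to exceed the largest possible |x_i - x*_i|):

import gurobipy as gp
from gurobipy import GRB

def add_exclusion(model, x, x_star, lo=-100, hi=100):
    # Force x != x_star, where x is a list of integer variables with
    # values in [lo, hi] and x_star is the integer point to cut off.
    n = len(x)
    M = hi - lo + 1                            # big-M valid for these bounds
    y = model.addVars(n, vtype=GRB.BINARY)     # y[i] = 1 forces x[i] >= x_star[i] + 1
    z = model.addVars(n, vtype=GRB.BINARY)     # z[i] = 1 forces x[i] <= x_star[i] - 1
    for i in range(n):
        model.addConstr(x[i] - x_star[i] >= 1 - M * (1 - y[i]))
        model.addConstr(x[i] - x_star[i] <= -1 + M * (1 - z[i]))
    model.addConstr(y.sum() + z.sum() >= 1)    # at least one coordinate differs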
PolySCIP is an academic open-source solver for multi-objective mixed integer linear programs, in case you do not want to implement your own solver, or want to compare yours against another one.

Finding integral solution of an equation

This is part of a bigger question. It's actually a mathematical problem, so it would be really great if someone could direct me to an algorithm for it; pseudocode would also be a help.
The question: given an equation, check whether it has an integral solution.
For example:
(26a + 5)/32 = b
Here a is an integer. Is there an algorithm to predict or find whether b can be an integer? I need a general solution, not one specific to this equation. The equation can vary. Thanks
Your problem is an example of a linear Diophantine equation. About that, Wikipedia says:
This Diophantine equation [i.e., a x + b y = c] has a solution (where x and y are integers) if and only if c is a multiple of the greatest common divisor of a and b. Moreover, if (x, y) is a solution, then the other solutions have the form (x + k v, y - k u), where k is an arbitrary integer, and u and v are the quotients of a and b (respectively) by the greatest common divisor of a and b.
In this case, (26 a + 5)/32 = b is equivalent to 26 a - 32 b = -5. The gcd of the coefficients of the unknowns is gcd(26, -32) = 2. Since -5 is not a multiple of 2, there is no solution.
A general Diophantine equation is a polynomial in the unknowns, and can only be solved (if at all) by more complex methods. A web search might turn up specialized software for that problem.
Linear Diophantine equations take the form ax + by = c. Write g = gcd(a, b), so that a = z'g and b = z''g for integers z' and z''. Bézout's identity says that there exist integers x and y with
ax + by = g
so the equation ax + by = c has integer solutions exactly when c is a multiple of g, and in that case it has infinitely many of them. So instead of a trial-search method, you can simply check whether gcd(a, b) divides c.
If it does, a particular pair x and y (one of which is typically negative) can be computed with the extended Euclidean algorithm, which finds the integers satisfying Bézout's identity. (As a side note, this holds in any other Euclidean domain, i.e. polynomial rings too, and every Euclidean domain is a unique factorization domain.) You can use the iterative method to find these solutions, as in this linked question and the sketch below it:
Integral solution to equation `a + bx = c + dy`
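A minimal Python sketch of that iterative method (function names are illustrative):

def extended_gcd(a, b):
    # Iteratively compute (g, x, y) with a*x + b*y = g = gcd(a, b)
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:                  # normalize so the gcd is positive
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def solve_linear_diophantine(a, b, c):
    # Return one integer solution (x, y) of a*x + b*y = c, or None
    g, x0, y0 = extended_gcd(a, b)
    if c % g != 0:
        return None                # c is not a multiple of gcd(a, b)
    return x0 * (c // g), y0 * (c // g)

# The example above: 26a - 32b = -5 has no solution, since gcd(26, 32) = 2
print(solve_linear_diophantine(26, -32, -5))   # None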

arrangement with constraint on the sum

I'm looking to construct an algorithm which gives the arrangements with repetition of n values drawn from a grid of a given step S (which can be a positive real number), under the constraint that the sum of each arrangement is k, with k a positive integer.
My problem is thus to find the solutions to the equation:
x_1 + x_2 + ⋯ + x_n = k
where
0 ≤ x_i ≤ b_i
and each x_i is a multiple of the step S, a real number with finitely many decimal places.
For instance, if 0 ≤ x_i ≤ 50 and S = 2.5, then x_i ∈ {0, 2.5, 5, ..., 47.5, 50}.
The point here is to look only through the combinations whose sum equals k, because if n is big it is not possible to generate all the arrangements, so I would like to bypass that and generate only the combinations that match the constraint.
I was thinking of starting with n = 2, for instance, and finding all combinations that match the constraint.
For example, if x_i ∈ {0, 2.5, 5, ..., 47.5, 50} and k = 100, then for n = 2 we only have one combination: {50, 50}.
For n = 3, we have the n = 2 combination in 3 positions, i.e. {50, 50, 0}, {50, 0, 50} and {0, 50, 50}, plus the combinations {50, 47.5, 2.5} in 3! orders, etc.
If x_i ∈ {0, 2.5, 5, ..., 37.5, 40} and k = 100, then we have 0 combinations for n = 2 because 2 × 40 < 100, and we have {40, 40, 20} in 3 orders for n = 3... (if I'm not mistaken)
I'm a bit lost as I can't seem to find a proper way to start the algorithm, knowing that I should have the step S and b as inputs.
Do you have any suggestions?
Thanks
You can transform your problem into an integer problem by dividing everything through by S: we want to find all integer sequences y_1, ..., y_n with:
(1) 0 ≤ y_i ≤ ⌊b / S⌋
(2) y_1 + ... + y_n = k / S
We can see that there is no solution if k is not a multiple of S. Once we have reduced the problem, I would suggest using a pseudopolynomial dynamic programming algorithm to solve the subset sum problem and then reconstruct the solution from it. Let f(i, j) be the number of ways to make sum j with i elements. We have the following recurrence:
f(0,0) = 1
f(0,j) = 0 forall j > 0
f(i,j) = sum_{m = 0}^{min(floor(b / S), j)} f(i - 1, j - m)
We can solve f in O(n * k / S) time by filling it row by row. Now we want to reconstruct the solutions; in Python:

def reconstruct(i, j):
    # f[i][j] is the table from the recurrence above, filled beforehand;
    # B = floor(b / S) is the per-element bound
    if f[i][j] == 0:               # prune: sum j is unreachable with i elements
        return
    if i == 0:
        yield []
        return
    for m in range(min(B, j) + 1):
        for rest in reconstruct(i - 1, j - m):
            yield [m] + rest

result = list(reconstruct(n, K))   # K = k / S, an integer by assumption
result will be a list of all possible combinations.
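For completeness, a sketch of filling the f table row by row under the same assumptions (B = floor(b / S) and K = k / S are precomputed integers):

def fill_table(n, K, B):
    # f[i][j] = number of ways to write j as an ordered sum of i values in 0..B
    f = [[0] * (K + 1) for _ in range(n + 1)]
    f[0][0] = 1
    for i in range(1, n + 1):
        for j in range(K + 1):
            f[i][j] = sum(f[i - 1][j - m] for m in range(min(B, j) + 1))
    return f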
What you are describing sounds like a special case of the subset sum problem. Once you put it in those terms, you'll find that Pisinger apparently has a linear time algorithm for solving a more general version of your problem, since your weights are bounded. If you're interested in designing your own algorithm, you might start by reading Pisinger's thesis to get some ideas.
Since you are looking for all possible solutions and not just a single solution, the dynamic programming approach is probably your best bet.

Pollard Rho factorization method

Pollard's rho factorization method uses a generator function f(x) = x^2 - a (mod n) or f(x) = x^2 + a (mod n). Does the choice of this (parabolic) function have any significance, or may we use any function (cubic, higher-degree polynomial, or even linear), since all we have to do is find numbers belonging to the same congruence class modulo n to find the nontrivial divisor?
In Knuth Vol II (The Art of Computer Programming - Seminumerical Algorithms), section 4.5.4, Knuth says:
Furthermore if f(y) mod p behaves as a random mapping from the set {0, 1, ..., p-1} into itself, exercise 3.1-12 shows that the average value of the least such m will be of order sqrt(p)... From the theory in Chapter 3, we know that a linear polynomial f(x) = ax + c will not be sufficiently random for our purpose. The next simplest case is quadratic, say f(x) = x^2 + 1. We don't know that this function is sufficiently random, but our lack of knowledge tends to support the hypothesis of randomness, and empirical tests show that this f does work essentially as predicted.
The probability theory that says f(x) has a cycle of length about sqrt(p) assumes in particular that there can be two values y and z such that f(y) = f(z), since f is chosen at random. The rho in Pollard's rho contains such a junction, with multiple tails leading onto the cycle. For a linear function f(x) = ax + b with gcd(a, p) = 1 (which is likely, since p is prime), f(y) = f(z) mod p implies y = z mod p, so there are no such junctions.
If you look at http://www.agner.org/random/theory/chaosran.pdf you will see that the expected cycle length of a random function is about the square root of the state size, but the expected cycle length of a random bijection is about the state size. If you think of generating the random function only as you evaluate it, you can see that if the function is entirely random, every value seen so far is available to be chosen again, so the odds of closing the cycle increase with the length of the path; but if the function has to be invertible, the only way to close the cycle is to generate the starting point, which is much less likely.
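For reference, a minimal sketch of the method with the standard quadratic map f(x) = x^2 + c (mod n) and Floyd cycle detection:

from math import gcd

def pollard_rho(n, c=1, x0=2):
    # Returns a nontrivial factor of n, or None if this choice of (c, x0)
    # fails (in that case, retry with a different c or starting point).
    x, y, d = x0, x0, 1
    f = lambda v: (v * v + c) % n
    while d == 1:
        x = f(x)                   # tortoise moves one step
        y = f(f(y))                # hare moves two steps
        d = gcd(abs(x - y), n)
    return d if d != n else None

print(pollard_rho(8051))           # 97 (8051 = 83 * 97)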

How can I transform a polynomial to another coordinate system?

Using assorted matrix math, I've solved a system of equations resulting in coefficients for a polynomial of degree n:
Ax^(n-1) + Bx^(n-2) + ... + Z
I then evaluate the polynomial over a given x range; essentially I'm rendering the polynomial curve. Now here's the catch. I've done this work in one coordinate system, which we'll call "data space". Now I need to present the same curve in another coordinate space. It is easy to transform input/output to and from the coordinate spaces, but the end user is only interested in the coefficients [A, B, ..., Z], since they can reconstruct the polynomial on their own. How can I present a second set of coefficients [A', B', ..., Z'] which represent the same shaped curve in the other coordinate system?
If it helps, I'm working in 2D space. Plain old x's and y's. I also feel like this may involve multiplying the coefficients by a transformation matrix? Would it somehow incorporate the scale/translation factor between the coordinate systems? Would it be the inverse of this matrix? I feel like I'm headed in the right direction...
Update: Coordinate systems are linearly related. Would have been useful info eh?
The problem statement is slightly unclear, so first I will clarify my own interpretation of it:
You have a polynomial function
f(x) = C_n x^n + C_{n-1} x^(n-1) + ... + C_0
[I changed A, B, ..., Z into C_n, C_{n-1}, ..., C_0 to more easily work with the linear algebra below.]
Then you also have a linear transformation between the coordinate systems. Write the old variable in terms of the new one as x = az + b (if your map is given the other way, as z = αx + β, invert it first: a = 1/α, b = −β/α). You want coefficients for the same polynomial, but in terms of z:
f(z) = D_n z^n + D_{n-1} z^(n-1) + ... + D_0
This can be done pretty easily with some linear algebra. In particular, you can define an (n+1)×(n+1) matrix T which allows us to do the matrix multiplication
d = T * c,
where d is a column vector with top entry D_0 and last entry D_n, the column vector c is similar for the C_i coefficients, and the matrix T has (i,j)-th [i-th row, j-th column] entry t_ij given by
t_ij = (j choose i) * a^i * b^(j-i)
where (j choose i) is the binomial coefficient, and t_ij = 0 when i > j. Also, unlike standard matrices, the indices i and j each range from 0 to n (usually you would start at 1).
This is basically a nice way of writing out the expansion and re-compression of the polynomial when you plug in x = az + b by hand and use the binomial theorem.
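As a small illustration (a sketch; coefficient vectors are ordered constant term first, matching the indexing above, and math.comb supplies the binomial coefficient):

import math
import numpy as np

def substitution_matrix(n, a, b):
    # T[i][j] = (j choose i) * a^i * b^(j-i), zero when i > j
    T = np.zeros((n + 1, n + 1))
    for j in range(n + 1):
        for i in range(j + 1):
            T[i, j] = math.comb(j, i) * a**i * b**(j - i)
    return T

# Example: substituting x = z + 1 into f(x) = x^2 gives z^2 + 2z + 1
c = np.array([0.0, 0.0, 1.0])               # C_0, C_1, C_2 for f(x) = x^2
print(substitution_matrix(2, 1, 1) @ c)     # [1. 2. 1.] = D_0, D_1, D_2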
If I understand your question correctly, there is no guarantee that the function will remain polynomial after you change coordinates. For example, let y=x^2, and the new coordinate system x'=y, y'=x. Now the equation becomes y' = sqrt(x'), which isn't polynomial.
Tyler's answer is the right one if you have to compute this change of variable many times (I mean, for many different polynomials). On the other hand, if you have to do it just once, it is much faster to combine the computation of the matrix coefficients with the final evaluation. The best way to do it is to symbolically evaluate your polynomial at the point (ax + b) by Horner's method:
you store the polynomial coefficients in a vector V (at the beginning, all coefficients are zero), and for i = n down to 0, you multiply it by (ax + b) and add C_i.
Adding C_i means adding it to the constant term.
Multiplying by (ax + b) means multiplying all coefficients by b into a vector K1, multiplying all coefficients by a and shifting them up one degree away from the constant term into a vector K2, and putting K1 + K2 back into V.
This will be easier to program, and faster to compute.
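A quick sketch of that procedure (input coefficients are given highest degree first, the result is returned constant term first, and the linear form substituted for the variable is a*x + b):

def horner_change_of_variable(C, a, b):
    # Symbolically evaluate the polynomial at (a*x + b) by Horner's method
    V = [0]                                  # running coefficients, constant term first
    for c in C:                              # from C_n down to C_0
        K1 = [b * v for v in V] + [0]        # multiply V by b
        K2 = [0] + [a * v for v in V]        # multiply V by a, shift up one degree
        V = [k1 + k2 for k1, k2 in zip(K1, K2)]
        V[0] += c                            # add C_i to the constant term
    return V[:len(C)]                        # trim the unused top slot

# Example: f(x) = x^2 evaluated at (x + 1) gives x^2 + 2x + 1
print(horner_change_of_variable([1, 0, 0], 1, 1))   # [1, 2, 1]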
Note that changing y into w = cy+d is really easy. Finally, as mattiast points out, a general change of coordinates will not give you a polynomial.
Technical note: if you still want to compute the matrix T (as defined by Tyler), you should compute it using a weighted version of Pascal's rule (this is what the Horner computation does implicitly):
t_i,j = b * t_i,j-1 + a * t_i-1,j-1
This way, you can compute it simply, column after column, from left to right.
You have the equation:
y = Ax^(n-1) + Bx^(n-2) + ... + Z
in xy space, and you want it in some x'y' space. What you need are transformation functions f(x) = x' and g(y) = y' (or h(x') = x and j(y') = y). In the first case you need to solve for x and for y; once you have expressed x and y in terms of x' and y', you can substitute those results into the original equation and solve for y'.
Whether or not this is trivial depends on the complexity of the functions used to transform from one space to another. For example, equations such as:
5x = x' and 10y = y'
are extremely easy to solve, giving x = x'/5 and y = y'/10, and hence the result
y' = (10/5^(n-1)) A x'^(n-1) + (10/5^(n-2)) B x'^(n-2) + ... + 10Z
If the input spaces are linearly related, then yes, a matrix should be able to transform one set of coefficients to another. For example, if you had your polynomial in your "original" x-space:
ax^3 + bx^2 + cx + d
and you wanted to transform into a different w-space where w = px+q
then you want to find a', b', c', and d' such that
ax^3 + bx^2 + cx + d = a'w^3 + b'w^2 + c'w + d'
and with some algebra,
a'w^3 + b'w^2 + c'w + d' = a'p^3x^3 + 3a'p^2qx^2 + 3a'pq^2x + a'q^3 + b'p^2x^2 + 2b'pqx + b'q^2 + c'px + c'q + d'
therefore
a = a'p^3
b = 3a'p^2q + b'p^2
c = 3a'pq^2 + 2b'pq + c'p
d = a'q^3 + b'q^2 + c'q + d'
which can be rewritten as a matrix problem and solved.
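For instance, a sketch of that solve for the cubic case with numpy (coefficients ordered [a, b, c, d]; the matrix rows are exactly the four equations above):

import numpy as np

def transform_cubic(a, b, c, d, p, q):
    # Solve the triangular system a = a'p^3, b = 3a'p^2 q + b'p^2, ...
    T = np.array([
        [p**3,         0.0,       0.0, 0.0],
        [3 * p**2 * q, p**2,      0.0, 0.0],
        [3 * p * q**2, 2 * p * q, p,   0.0],
        [q**3,         q**2,      q,   1.0],
    ], dtype=float)
    return np.linalg.solve(T, np.array([a, b, c, d], dtype=float))

# Example: x^3 with w = x + 1 (so x = w - 1): (w - 1)^3 = w^3 - 3w^2 + 3w - 1
print(transform_cubic(1, 0, 0, 0, 1, 1))    # [ 1. -3.  3. -1.]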
