Creating normally distributed Multi-dimensional Knapsack Problem (MKP) data from a basis instance

I am trying to solve the dynamic MKP for my thesis, and I need to do the calculations described in the following passage of an article to create new instances ("templates") from a basis one:
2.3 The Dynamic MKP
In our study, we use a dynamic version of the MKP as proposed in [3] and described below. Basis is the first instance
given in the file mknapcb4.txt which can be downloaded from [1]. It has 100 items, 10 knapsacks and a tightness ratio of 0.25. For every change, the profits, resource consumptions and the constraints are multiplied by a normally distributed random variable as follows:
pj ← pj ∗ (1 + N (0, σp))
rij ← rij ∗ (1 + N (0, σr))
ci ← ci ∗ (1 + N(0, σc))
Unless specified otherwise, the standard deviation of the normally distributed random variable used for the changes has been set to σp = σr = σc = 0.05 which requires on average 11 out of the 100 possible items to be added or removed from one optimal solution to the next. Each profit pj, resource consumption rij and constraint ci is restricted to an interval as determined in Eq. 11.
lbp ∗ pj ≤ pj ≤ ubp ∗ pj
lbr ∗ rij ≤ rij ≤ ubr ∗ rij
lbc ∗ ci ≤ ci ≤ ubc ∗ ci
where lbp = lbr = lbc = 0.8 and ubp = ubr = ubc = 1.2. If any of the changes causes any of the lower or upper bounds to be exceeded, the value is bounced back from the bounds and set to a corresponding value within the allowed boundaries.
Link to full version of the above paper: https://web.itu.edu.tr/~etaner/evostoc06.pdf
And here is my C++ code to implement it:
void createTemplate()
{
    double lb = 0.8;
    double ub = 1.2;
    double previous;
    std::random_device generator;
    default_random_engine eng{generator()};
    normal_distribution<double> distribution(0, 0.05);
    double number = distribution(generator);
    //updating profits
    for (int i = 0; i < mkpInfo.numberOfItems; i++) {
        previous = mkpInfo.profits[i];
        mkpInfo.profits[i] = mkpInfo.profits[i] * (1.0 + number);
        if (mkpInfo.profits[i] < previous * lb)
            mkpInfo.profits[i] = lb * previous;
        if (mkpInfo.profits[i] > ub * previous)
            mkpInfo.profits[i] = ub * previous;
    }
    //updating weights
    for (int i = 0; i < mkpInfo.numberOfResources; i++)
    {
        for (int j = 0; j < mkpInfo.numberOfItems; j++) {
            previous = mkpInfo.Rij[i][j];
            mkpInfo.Rij[i][j] = mkpInfo.Rij[i][j] * (1.0 + number);
            if (mkpInfo.Rij[i][j] < previous * lb)
                mkpInfo.Rij[i][j] = lb * previous;
            if (mkpInfo.Rij[i][j] > ub * previous)
                mkpInfo.Rij[i][j] = ub * previous;
        }
    }
    //updating resources
    for (int i = 0; i < mkpInfo.numberOfResources; i++) {
        previous = mkpInfo.resources[i];
        mkpInfo.resources[i] = mkpInfo.resources[i] * (1.0 + number);
        if (mkpInfo.resources[i] < previous * lb)
            mkpInfo.resources[i] = lb * previous;
        if (mkpInfo.resources[i] > ub * previous)
            mkpInfo.resources[i] = ub * previous;
    }
}
I have created 9 different environments by calling the createTemplate() function above and solved each of them with CPLEX to find its actual optimum value. Then I solved them with the algorithm that I developed for my thesis. Strangely, either my algorithm is one of a kind, or I have done something really wrong while creating the templates: the results are far better than those reported in previous studies in the literature. I know the algorithm itself is fine, because I have published it and solved similar problems with it before. One thing I noticed is that the expression double number = distribution(generator); always produces the same number within a run.
So, my questions are:
Is it normal for the code above to apply the same random number over and over?
Considering the explanation in the article, am I doing this right? Or do I need to generate a different random number for each value in the data file, i.e., put the expression number = distribution(generator) inside the for loops?
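For what it's worth, the usual reading of the update rules is that a fresh N(0, σ) value is drawn for every single profit, resource consumption and capacity, not one value per change. Below is a minimal sketch of that reading; it assumes the same mkpInfo structure and field names as the code above, clamps to the bounds rather than implementing the paper's "bounce back", and only shows the profit loop (the Rij and resources loops follow the same pattern).
#include <algorithm>
#include <random>

void createTemplateSketch()
{
    const double lb = 0.8;
    const double ub = 1.2;

    std::random_device rd;
    std::default_random_engine eng{rd()};
    std::normal_distribution<double> dist(0.0, 0.05);

    // One fresh draw per profit: pj <- pj * (1 + N(0, 0.05))
    for (int j = 0; j < mkpInfo.numberOfItems; j++) {
        double previous = mkpInfo.profits[j];
        double updated  = previous * (1.0 + dist(eng));
        // keep the value inside [lb*previous, ub*previous]
        mkpInfo.profits[j] = std::min(std::max(updated, lb * previous), ub * previous);
    }
    // ... same idea for mkpInfo.Rij[i][j] and mkpInfo.resources[i],
    // each with its own dist(eng) call inside the loop body.
}
Note that the paper says a value that would leave the interval is "bounced back" inside it rather than clamped, and it is worth double-checking whether the interval is meant relative to the current or to the original basis values; the sketch above simply clamps relative to the previous value, as your code does.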

Related

how to avoid repeating linear regression procedures when adding new points

I know how to do regression on a set of N samples. However, my project requires doing the linear regression of the first 2, 3, 4, ..., k, k+1, ..., N samples respectively. Instead of repeating the whole procedure each time a new sample is added, is there a faster method that reuses the previous result (or intermediate results) to solve the regression after adding a new point? Thank you.
In linear least squares method coefficients of approximation line are calculated using these formulas:
a = (N * Sum(Xi*Yi) - Sum(Xi)*Sum(Yi)) / (N * Sum(Xi^2) - (Sum(Xi))^2)
b = (Sum(Yi) - a * Sum(Xi)) / N
So you can store the values of Nth sums
Sum(Xi*Yi)
Sum(Xi)
Sum(Yi)
Sum(Xi^2)
and update them at (N+1)th step.
Sum(Xi)[N+1] = Sum(Xi)[N] + X(N+1)
Sum(Xi*Yi)[N+1] = Sum(Xi*Yi)[N] + X(N+1)*Y(N+1)
and so on, and calculate new coefficients values.
Note: such algorithms are called 'running' or 'online' algorithms - see the analogous running computation of the standard deviation.
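As a concrete illustration of the running-sums idea, here is a small C++ sketch (the struct and method names are mine). Each new point updates the four sums in O(1), so producing the whole sequence of fits for 2, 3, ..., N points costs O(N) overall:
#include <cstddef>

// Incremental simple linear regression y = a*x + b using running sums.
// Assumes at least two samples with distinct x values before slope() is called.
struct RunningRegression {
    std::size_t n = 0;
    double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0;

    // O(1) update when the (n+1)-th sample arrives.
    void add(double x, double y) {
        ++n;
        sumX  += x;
        sumY  += y;
        sumXY += x * y;
        sumXX += x * x;
    }

    // a = (N*Sum(Xi*Yi) - Sum(Xi)*Sum(Yi)) / (N*Sum(Xi^2) - (Sum(Xi))^2)
    double slope() const {
        return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    }

    // b = (Sum(Yi) - a*Sum(Xi)) / N
    double intercept() const {
        return (sumY - slope() * sumX) / n;
    }
};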

implementing stochastic ACO algorithm

I am trying to implement a stochastic ant colony optimisation algorithm, and I'm having trouble working out how to implement movement choices based on probabilities.
The standard (greedy) version that I have implemented so far is that an ant m at a vertex i on a graph G = (V,E), where E is the set of edges (i, j), will choose the next vertex j based on the following criterion:
j = argmax(<fitness function for j>)
such that j is connected to i
The problem I am having is in trying to implement a stochastic version of this, so that the criterion for choosing a new vertex j is now:
P(j) = <fitness function for j>/sum(<fitness function for J>)
where P(j) is the probability of choosing vertex j,
such j is connected to i,
and J is the set of all vertices connected to i
I understand the mathematics behind it, I am just having trouble working out how I should actually implement it.
If, say, I have 3 vertices connected to i, each with a probability of 0.2, 0.3, 0.5 - what is the best way to make the selection? Should I just randomly select a vertex j, then generate a random number r in the range (0,1) and, if r >= P(j), select vertex j? Or is there a better way?
Looking at the problem statement, I think you are not trying to visit all nodes connected to i (say), but only some of them, based on some probability distribution. Let's take an example:
You have a node i and connected to it are 5 nodes, a1...a5, with probabilities p1...p5, such that sum(p_i) = 1. Now, say the precision of the probabilities you consider is 2 places after the decimal point. Also, you don't want to visit all 5 nodes, but only k of them. Let's say, in this example, k = 2. Since 2 decimal places is your probability precision, add 3 to it to increase the granularity of the random draws. (You can change this 3 to any number of your choice, as far as performance is concerned.) (Since you have not tagged any language, I'll use Java's nextInt() function as an example of generating random numbers; a C++ analogue is sketched below.)
Let's give some values:
p1...p5 = {0.17, 0.11, 0.45, 0.03, 0.24}
Now, in a loop from 1 to k, generate a random number in the range (0...10^5) {5 = 2 + 3, i.e., precision + 3}. If the generated number is between 0 and 16999, go with node a1; between 17000 and 27999, go with a2; between 28000 and 72999, go with a3... and so on. You get the idea.
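A C++ analogue of that integer-range idea could look like the sketch below (the scale factor and names are mine):
#include <cstddef>
#include <random>
#include <vector>

// Pick an index according to probabilities given with two decimal places,
// by mapping each probability to a block of integers out of 100000
// (2 decimal places of precision + 3 extra digits, as suggested above).
std::size_t pickByIntegerRanges(const std::vector<double>& probs, std::mt19937& rng)
{
    const long scale = 100000;                               // 10^(precision + 3)
    std::uniform_int_distribution<long> dist(0, scale - 1);
    long r = dist(rng);

    long upper = 0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        upper += static_cast<long>(probs[i] * scale + 0.5);  // e.g. 0.17 -> 17000
        if (r < upper) return i;
    }
    return probs.size() - 1;   // guard against rounding slack
}
With the example values above, r in [0, 16999] selects a1, r in [17000, 27999] selects a2, and so on.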
What you're trying to implement is a weighted random choice depending on the probabilities of the components of the solution, or a random proportional selection rule in ACO terms. Here is a snippet of the implementation of this rule in the Isula Framework:
double value = random.nextDouble();
while (componentWithProbabilitiesIterator.hasNext()) {
    Map.Entry<C, Double> componentWithProbability =
            componentWithProbabilitiesIterator.next();
    Double probability = componentWithProbability.getValue();

    total += probability;
    if (total >= value) {
        nextNode = componentWithProbability.getKey();
        getAnt().visitNode(nextNode);
        return true;
    }
}
You just need to generate a random value between 0 and 1 (stored in value) and start accumulating the probabilities of the components (in the total variable). When the accumulated total exceeds the threshold stored in value, we have found the component to add to the solution.
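The same rule as a self-contained C++ sketch, independent of the Isula classes (the names are mine; the weights do not even need to be normalised, because the threshold is drawn from [0, total)):
#include <cstddef>
#include <random>
#include <vector>

// Roulette-wheel (random proportional) selection: returns the index of the
// chosen candidate, where weights[i] is proportional to the probability of i.
std::size_t rouletteSelect(const std::vector<double>& weights, std::mt19937& rng)
{
    double total = 0.0;
    for (double w : weights) total += w;

    std::uniform_real_distribution<double> uniform(0.0, total);
    double threshold = uniform(rng);

    double cumulative = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        cumulative += weights[i];
        if (cumulative >= threshold) return i;
    }
    return weights.size() - 1;   // guard against floating-point round-off
}
Called with the fitness values of the neighbours of i, e.g. {0.2, 0.3, 0.5}, it returns index 2 about half of the time.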

Combinatorial Optimization - Variation on Knapsack

Here is a real-world combinatorial optimization problem.
We are given a large set of value propositions for a certain product. The value propositions are of different types, but each type is independent and adds equal benefit to the overall product. In building the product, we can include any non-negative integer number of "units" of each type. However, after adding the first unit of a certain type, the marginal benefit of additional units of that type continually decreases. In fact, the marginal benefit of a new unit is the inverse of the number of units of that type after adding the new unit. Our product must have at least one unit of some type, and there is a small correction that we must make to the overall value because of this requirement.
Let T[] be an array representing the number of each type in a certain production run of the product. Then the overall value V is given by (pseudo code):
V = 1
For Each t in T
    V = V * (t + 1)
Next t
V = V - 1 // correction
On the cost side, units of the same type have the same cost, but units of different types each have unique, irrational costs. The number of types is large, but we are given an array of type costs C[] that is sorted from smallest to largest. Let's further assume that the type quantity array T[] is also sorted by cost from smallest to largest. Then the overall cost U is simply the sum of the unit costs:
U = 0
For i = 0, i < NumOfValueTypes
    U = U + T[i] * C[i]
Next i
So far so good. So here is the problem: given product P with value V and cost U, find the product Q with cost U' and value V' such that U' is minimal, U' > U, and V'/U' > V/U.
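To make the two formulas and the feasibility test concrete, here is a small C++ sketch (the function names are mine) that evaluates a candidate quantity vector T against a given product (V, U):
#include <cstddef>
#include <vector>

// Overall value: V = prod(t + 1) - 1 over all types.
double productValue(const std::vector<int>& T) {
    double v = 1.0;
    for (int t : T) v *= (t + 1);
    return v - 1.0;  // correction for the "at least one unit" requirement
}

// Overall cost: U = sum(T[i] * C[i]).
double productCost(const std::vector<int>& T, const std::vector<double>& C) {
    double u = 0.0;
    for (std::size_t i = 0; i < T.size(); ++i) u += T[i] * C[i];
    return u;
}

// Is candidate T feasible with respect to the given product (V, U)?
bool isFeasible(const std::vector<int>& T, const std::vector<double>& C,
                double V, double U) {
    double u = productCost(T, C);
    double v = productValue(T);
    return u > U && v / u > V / U;
}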
The problem you've described is a nonlinear integer programming problem because it contains a product of the integer variables t. Its feasibility set is not closed because of the strict inequalities, which can be worked around by using non-strict inequalities and adding a small positive number (epsilon) to the right-hand sides. The value-per-cost condition V'/U' > V/U can be multiplied through by the (positive) cost U', so it becomes prod(t + 1) - 1 >= U' * V / U. The problem can then be formulated in AMPL as follows:
set Types;
param Costs{Types}; # C
param GivenProductValue; # V
param GivenProductCost; # U
param Epsilon;
var units{Types} integer >= 0; # T
var productCost = sum {t in Types} units[t] * Costs[t];
minimize cost: productCost;
s.t. greaterCost: productCost >= GivenProductCost + Epsilon;
s.t. greaterValuePerCost:
prod {t in Types} (units[t] + 1) - 1 >=
productCost * GivenProductValue / GivenProductCost + Epsilon;
This problem can be solved using a nonlinear integer programming solver such as Couenne.
Honestly, I don't think there is an easy way to solve this. The best thing would be to write down the system and solve it with a solver (the Excel solver will do the trick, but you can also use AMPL to solve this nonlinear program).
The Program:
Define: U;
V;
C=[c1,...cn];
Variables: T=[t1,t2,...tn];
Objective Function: SUM(ti.ci)
Constraints:
For all i: ti integer
SUM(ti.ci) > U
(PROD(ti+1)-1).U > V.SUM(ti.ci)
It works well with Excel (you just replace > U by >= U + d, where d is the smallest significant increment of the costs - i.e., if C = [1.1, 1.8, 3.0, 9.3] then d = 0.1 - since Excel doesn't allow strict inequalities in the solver).
I guess with a real solver like AMPL it will work perfectly.
Hope it helps,

Dynamic programming question

I am stuck on one of my algorithms homework problems. Can anyone give me a hint to solve it? Here is the question:
Consider a chain-structured computation represented by a weighted graph G = (V, E) where V = {v1, v2, ..., vn} and E = {(vi, vi+1) : 1 <= i <= n-1}. We are also given a chain of m identical processors P = {P1, ..., Pm} (i.e., there exists a communication link between Pk and Pk+1 for 1 <= k <= m-1).
The set of vertices V represents computation modules, and the set of edges E represents communication between two modules. Each node vi is assigned a weight wi denoting the execution time of the module on a single processor. Each edge (vi, vi+1) is assigned a weight ci denoting the communication time between the two modules if they are assigned to different processors. If multiple modules are assigned to the same processor, they must be consecutive. Suppose modules va, va+1, ..., vb are assigned to processor Pk. Then the time taken by Pk, denoted Tk, is the time to compute the assigned modules plus the time to communicate with the neighbouring processors. Hence, Tk = wa + ... + wb + c(a-1) + cb. Note here that c(a-1) = 0 if a = 1 and cb = 0 if b = n.
The objective of the problem is to find an assignment of V to P such that the maximum of Tk over 1 <= k <= m
is minimized, where we assume that each processor must take at least one module. (This
assumption can be relaxed by adding m dummy modules with zero weight on computational
and communication time.)
Develop a dynamic programming algorithm to solve this problem in polynomial time (i.e., O(mn)).
I tried to find the minimum execution time for each Pk and then find the max, but I doubt my solution is dynamic programming since there is no recursive formula. Please give me some hints!
Thanks!
I think you might be able to modify the Viterbi algorithm to solve this problem.
Okay, this is easy.
Decompose your problem into a function you need to minimise, say F(n, k), which gives the minimum achievable maximum processor time when the first n modules are assigned to the first k processors.
Then derive the recurrence by choosing where the block of modules on the k-th processor starts:
F(n, k) = min over i in [k..n] of max( F(i-1, k-1), w[i] + ... + w[n] + c[i-1] + c[n] )
c[0] = 0, and c[n] = 0 when n is the last module overall
F(0, 0) = 0
F(j, 0) = infinity for j > 0
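As a sketch (C++, with names of my own choosing), the recurrence can be filled in bottom-up; this direct version is O(m·n^2), and squeezing it down to the O(mn) bound the exercise asks for takes a further refinement:
#include <algorithm>
#include <limits>
#include <vector>

// Minimise the maximum processor time when splitting a chain of n modules
// into m consecutive, non-empty blocks.
// w has n+1 entries (w[0] unused, w[1..n] module weights);
// c has n+1 entries with c[0] = 0, c[1..n-1] edge costs, c[n] = 0.
double chainAssignment(const std::vector<double>& w,
                       const std::vector<double>& c,
                       int m)
{
    const int n = static_cast<int>(w.size()) - 1;
    const double INF = std::numeric_limits<double>::infinity();

    // Prefix sums so that w[i] + ... + w[j] = pre[j] - pre[i-1].
    std::vector<double> pre(n + 1, 0.0);
    for (int j = 1; j <= n; ++j) pre[j] = pre[j - 1] + w[j];

    // F[j][k] = minimum achievable max-time for the first j modules on the first k processors.
    std::vector<std::vector<double>> F(n + 1, std::vector<double>(m + 1, INF));
    F[0][0] = 0.0;

    for (int k = 1; k <= m; ++k) {
        for (int j = k; j <= n; ++j) {
            for (int i = k; i <= j; ++i) {       // modules i..j form the block on processor k
                double block = pre[j] - pre[i - 1] + c[i - 1] + c[j];
                F[j][k] = std::min(F[j][k], std::max(F[i - 1][k - 1], block));
            }
        }
    }
    return F[n][m];
}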

How can I efficiently calculate the binomial cumulative distribution function?

Let's say that I know the probability of a "success" is P. I run the test N times, and I see S successes. The test is akin to tossing an unevenly weighted coin (perhaps heads is a success, tails is a failure).
I want to know the approximate probability of seeing either S successes, or a number of successes less likely than S successes.
So for example, if P is 0.3, N is 100, and I get 20 successes, I'm looking for the probability of getting 20 or fewer successes.
If, on the other hand, P is 0.3, N is 100, and I get 40 successes, I'm looking for the probability of getting 40 or more successes.
I'm aware that this problem relates to finding the area under a binomial curve, however:
My math-fu is not up to the task of translating this knowledge into efficient code
While I understand a binomial curve would give an exact result, I get the impression that it would be inherently inefficient. A fast method to calculate an approximate result would suffice.
I should stress that this computation has to be fast, and should ideally be determinable with standard 64 or 128 bit floating point computation.
I'm looking for a function that takes P, S, and N - and returns a probability. As I'm more familiar with code than mathematical notation, I'd prefer that any answers employ pseudo-code or code.
Exact Binomial Distribution
from functools import reduce

def factorial(n):
    if n < 2: return 1
    return reduce(lambda x, y: x*y, range(2, int(n)+1))

def prob(s, p, n):
    x = 1.0 - p
    a = n - s
    b = s + 1
    c = a + b - 1
    prob = 0.0
    for j in range(a, c + 1):
        prob += factorial(c) / (factorial(j)*factorial(c-j)) \
                * x**j * (1 - x)**(c-j)
    return prob
>>> prob(20, 0.3, 100)
0.016462853241869437
>>> 1-prob(40-1, 0.3, 100)
0.020988576003924564
Normal Estimate, good for large n
import math

def erf(z):
    t = 1.0 / (1.0 + 0.5 * abs(z))
    # use Horner's method
    ans = 1 - t * math.exp( -z*z - 1.26551223 +
                            t * ( 1.00002368 +
                            t * ( 0.37409196 +
                            t * ( 0.09678418 +
                            t * (-0.18628806 +
                            t * ( 0.27886807 +
                            t * (-1.13520398 +
                            t * ( 1.48851587 +
                            t * (-0.82215223 +
                            t * ( 0.17087277))))))))))
    if z >= 0.0:
        return ans
    else:
        return -ans

def normal_estimate(s, p, n):
    u = n * p
    o = (u * (1-p)) ** 0.5
    return 0.5 * (1 + erf((s-u)/(o*2**0.5)))
>>> normal_estimate(20, 0.3, 100)
0.014548164531920815
>>> 1-normal_estimate(40-1, 0.3, 100)
0.024767304545069813
Poisson Estimate: Good for large n and small p
import math
def poisson(s,p,n):
    L = n*p
    sum = 0
    for i in range(0, s+1):
        sum += L**i/factorial(i)
    return sum*math.e**(-L)
>>> poisson(20, 0.3, 100)
0.013411150012837811
>>> 1-poisson(40-1, 0.3, 100)
0.046253037645840323
I was on a project where we needed to be able to calculate the binomial CDF in an environment that didn't have a factorial or gamma function defined. It took me a few weeks, but I ended up coming up with the following algorithm which calculates the CDF exactly (i.e. no approximation necessary). Python is basically as good as pseudocode, right?
import numpy as np

def binomial_cdf(x,n,p):
    cdf = 0
    b = 0
    for k in range(x+1):
        if k > 0:
            b += np.log(n-k+1) - np.log(k)
        log_pmf_k = b + k * np.log(p) + (n-k) * np.log(1-p)
        cdf += np.exp(log_pmf_k)
    return cdf
Performance scales with x. For small values of x, this solution is about an order of magnitude faster than scipy.stats.binom.cdf, with similar performance at around x=10,000.
I won't go into a full derivation of this algorithm because stackoverflow doesn't support MathJax, but the thrust of it is first identifying the following equivalence:
For all k > 0, sp.misc.comb(n,k) == np.prod([(n - i + 1) / i for i in range(1, k + 1)])
Which we can rewrite as:
sp.misc.comb(n,k) == sp.misc.comb(n,k-1) * (n-k+1)/k
or in log space:
np.log( sp.misc.comb(n,k) ) == np.log(sp.misc.comb(n,k-1)) + np.log(n-k+1) - np.log(k)
Because the CDF is a summation of PMFs, we can use this formulation to calculate the binomial coefficient (the log of which is b in the function above) for PMF_{x=i} from the coefficient we calculated for PMF_{x=i-1}. This means we can do everything inside a single loop using accumulators, and we don't need to calculate any factorials!
The reason most of the calculations are done in log space is to improve the numerical stability of the polynomial terms, i.e. p^x and (1-p)^(n-x) have the potential to be extremely large or extremely small, which can cause computational errors.
EDIT: Is this a novel algorithm? I've been poking around on and off since before I posted this, and I'm increasingly wondering if I should write this up more formally and submit it to a journal.
I think you want to evaluate the incomplete beta function.
There's a nice implementation using a continued fraction representation in "Numerical Recipes In C", chapter 6: 'Special Functions'.
I can't totally vouch for the efficiency, but Scipy has a module for this
from scipy.stats.distributions import binom
binom.cdf(successes, attempts, chance_of_success_per_attempt)
An efficient and, more importantly, numerically stable algorithm exists in the domain of Bezier Curves used in Computer Aided Design. It is called de Casteljau's algorithm and is used to evaluate the Bernstein polynomials that define Bezier Curves.
I believe that I am only allowed one link per answer so start with Wikipedia - Bernstein Polynomials
Notice the very close relationship between the Binomial Distribution and the Bernstein Polynomials. Then click through to the link on de Casteljau's algorithm.
Let's say I know the probability of throwing heads with a particular coin is P.
What is the probability of throwing the coin T times and getting at least S heads?
Set n = T
Set beta[i] = 0 for i = 0, ... S - 1
Set beta[i] = 1 for i = S, ... T
Set t = p
Evaluate B(t) using de Casteljau
or at most S heads?
Set n = T
Set beta[i] = 1 for i = 0, ... S
Set beta[i] = 0 for i = S + 1, ... T
Set t = p
Evaluate B(t) using de Casteljau
Open source code probably exists already. NURBS Curves (Non-Uniform Rational B-spline Curves) are a generalization of Bezier Curves and are widely used in CAD. Try openNurbs (the license is very liberal) or failing that Open CASCADE (a somewhat less liberal and opaque license). Both toolkits are in C++, though, IIRC, .NET bindings exist.
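As a sketch of how that recipe could be coded (C++; the names are mine, and the control values beta are set exactly as described above), de Casteljau repeatedly blends neighbouring coefficients and is numerically well behaved:
#include <cstddef>
#include <vector>

// Evaluate B(t) = sum_i beta[i] * C(n, i) * t^i * (1-t)^(n-i)
// with de Casteljau's algorithm (n = beta.size() - 1).
double deCasteljau(std::vector<double> beta, double t)
{
    for (std::size_t round = beta.size() - 1; round >= 1; --round) {
        for (std::size_t i = 0; i < round; ++i) {
            beta[i] = (1.0 - t) * beta[i] + t * beta[i + 1];
        }
    }
    return beta[0];
}

// P(at most S heads in T throws with P(heads) = p):
// beta[i] = 1 for i = 0..S, beta[i] = 0 for i = S+1..T, evaluated at t = p.
double binomialCdf(int S, int T, double p)
{
    std::vector<double> beta(T + 1, 0.0);
    for (int i = 0; i <= S; ++i) beta[i] = 1.0;
    return deCasteljau(beta, p);
}
For the "at least S heads" case, flip the control values (0 for i < S, 1 for i >= S), as in the first recipe above.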
If you are using Python, no need to code it yourself. Scipy got you covered:
from scipy.stats import binom
# probability that you get 20 or less successes out of 100, when p=0.3
binom.cdf(20, 100, 0.3)
>>> 0.016462853241869434
# probability that you get exactly 20 successes out of 100, when p=0.3
binom.pmf(20, 100, 0.3)
>>> 0.0075756449257260777
From the portion of your question "getting at least S heads", you want the cumulative binomial distribution function. See http://en.wikipedia.org/wiki/Binomial_distribution for the equation, which is described as being in terms of the "regularized incomplete beta function" (as already answered). If you just want to calculate the answer without having to implement the entire solution yourself, the GNU Scientific Library provides the functions gsl_cdf_binomial_P and gsl_cdf_binomial_Q.
The DCDFLIB Project has C# functions (wrappers around C code) to evaluate many CDF functions, including the binomial distribution. You can find the original C and FORTRAN code here. This code is well tested and accurate.
If you want to write your own code to avoid being dependent on an external library, you could use the normal approximation to the binomial mentioned in other answers. Here are some notes on how good the approximation is under various circumstances. If you go that route and need code to compute the normal CDF, here's Python code for doing that. It's only about a dozen lines of code and could easily be ported to any other language. But if you want high accuracy and efficient code, you're better off using third party code like DCDFLIB. Several man-years went into producing that library.
Try this one, used in GMP. Another reference is this.
import numpy as np
np.random.seed(1)
# 10000 simulated runs of 20 coin flips, each flip with probability of heads 0.6
x = np.random.binomial(20, 0.6, 10000)
sum(x > 12) / len(x)
The output is about 0.41, i.e. in roughly 41% of the simulated runs we got more than 12 heads.
