Test a model with random data on GLPK - random

I'm new on GLPK, I want to test my simple model,
I use this comment to generate different random data:
param seed:=gmtime();
param u{(i,j) in E}:=(round(seed*Uniform01())) mod 40 ;
and I want to solve model for 100 times and obtain the average value of optimal value of objective function.
I don't know how to code iterated expression to repeat solving model in .mod file. Could you please help me?
This is my model:
### VARIABLES ###
var x{(i,j) in E} >= 0, <= u[i,j];
### OBJECTIVE ###
maximize Val: sum {(1,j) in E} x[1,j];
### CONSTRAINTS ###
subject to Balance {i in V diff {1,n}}:
sum {(j,i) in E} x[j,i] = sum {(i,k) in E} x[i,k];
solve;

I also asked this question in "https://lists.gnu.org/mailman/listinfo/help-glpk",
and I got this answer from Heinrich Schuchardt and it works.
"Dear Shaghayegh,
glpsol cannot iterate over multiple models or data sets by itself.
https://en.wikibooks.org/wiki/GLPK/Scripting_plus_MathProg
shows how to use a scripting language to call glpsol multiple times.
You can pass the seed value for the random number generator like this
glpsol --seed SEEDVALUE
In awk you can use function rand() to create random numbers."
To complete the model, the only thing we need is to add following comments:
param n >=1 integer; # Number of nodes
set V := 1..n; # Set of nodes
set E within (V cross V); # Set of arcs
This is a maximum flow problem.

Related

Pairing the weight of a protein sequence with the correct sequence

This piece of code is part of a larger function. I already created a list of molecular weights and I also defined a list of all the fragments in my data.
I'm trying to figure out how I can go through the list of fragments, calculate their molecular weight and check if it matches the number in the other list. If it matches, the sequence is appended into an empty list.
combs = [397.47, 2267.58, 475.63, 647.68]
fragments = ['SKEPFKTRIDKKPCDHNTEPYMSGGNY', 'KMITKARPGCMHQMGEY', 'AINV', 'QIQD', 'YAINVMQCL', 'IEEATHMTPCYELHGLRWV', 'MQCL', 'HMTPCYELHGLRWV', 'DHTAQPCRSWPMDYPLT', 'IEEATHM', 'MVGKMDMLEQYA', 'GWPDII', 'QIQDY', 'TPCYELHGLRWVQIQDYA', 'HGLRWVQIQDYAINV', 'KKKNARKW', 'TPCYELHGLRWV']
frags = []
for c in combs:
for f in fragments:
if c == SeqUtils.molecular_weight(f, 'protein', circular = True):
frags.append(f)
print(frags)
I'm guessing I don't fully know how the SeqUtils.molecular_weight command works in Python, but if there is another way that would also be great.
You are comparing floating point values for equality. That is bound to fail. You always have to account for some degree of error when dealing with floating point values. In this particular case you also have to take into account the error margin of the input values.
So do not compare floats like this
x == y
but instead like this
abs(x - y) < epsilon
where epsilon is some carefully selected arbitrary number.
I did two slight modifications to your code: I swapped the order of the f and the c loop to be able to store the calculated value of w. And I append the value of w to the list frags as well in order to better understand what is happening.
Your modified code now looks like this:
from Bio import SeqUtils
combs = [397.47, 2267.58, 475.63, 647.68]
fragments = ['SKEPFKTRIDKKPCDHNTEPYMSGGNY', 'KMITKARPGCMHQMGEY', 'AINV', 'QIQD', 'YAINVMQCL', 'IEEATHMTPCYELHGLRWV',
'MQCL', 'HMTPCYELHGLRWV', 'DHTAQPCRSWPMDYPLT', 'IEEATHM', 'MVGKMDMLEQYA', 'GWPDII', 'QIQDY',
'TPCYELHGLRWVQIQDYA', 'HGLRWVQIQDYAINV', 'KKKNARKW', 'TPCYELHGLRWV']
frags = []
threshold = 0.5
for f in fragments:
w = SeqUtils.molecular_weight(f, 'protein', circular=True)
for c in combs:
if abs(c - w) < threshold:
frags.append((f, w))
print(frags)
This prints the result
[('AINV', 397.46909999999997), ('IEEATHMTPCYELHGLRWV', 2267.5843), ('MQCL', 475.6257), ('QIQDY', 647.6766)]
As you can see, the first value for the weight differs from the reference value by about 0.0009. That's why you did not catch it with your approach.

Is there a way to get a random sample from a particular decile of the normal distribution using Stata's rnormal() function?

I'm working with a dataset where the values of my variable of interest are hidden. I have the range (min max), mean, and sd of this variable and for each observation, I have information on which decile the value for observation lies in. Is there any way I can impute some values for this variable using the random number generator or rnormal() suite of commands in Stata? Something along the lines of:
set seed 1
gen imputed_var=rnormal(mean,sd,decile) if decile==1
Appreciate any help on this, thanks!
I am not familiar with Stata, but the following may get you in the right direction.
In general, to generate a random number in a certain decile:
Generate a random number in [(decile-1)/10, decile/10], where decile is the desired decile, from 1 through 10.
Find the quantile of the random number just generated.
Thus, in pseudocode, the following will achieve what you want (I'm not sure about the exact names of the corresponding functions in Stata, though, which is why it's pseudocode):
decile = 4 # 4th decile
# Generate a random number in the decile (here, [0.3, 0.4]).
v = runiform((decile-1)/10, decile/10)
# Convert the number to a normal random number
q = qnormal(v) # Quantile of the standard normal distribution
# Scale and shift the number to the desired mean
# and standard deviation
q = q * sd + mean
This is precisely the suggestion just made by #Peter O. I make the same assumption he did: that by a common abuse of terminology, "decile" is your shorthand for decile class, bin or interval. Historically, deciles are values corresponding to cumulative probabilities 0.1(0.1)0.9, not any bins those values delimit.
. clear
. set obs 100
number of observations (_N) was 0, now 100
. set seed 1506
. gen foo = invnormal(runiform(0, 0.1))
. su foo
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
foo | 100 -1.739382 .3795648 -3.073447 -1.285071
and (closer to your variable names)
gen wanted = invnormal(runiform(0.1 * (decile - 1), 0.1 * decile))

If Random variable is a function

The basic definition of random variable is that it is a function based on random experiment.the question is that if it is a function say f then how can it take numerical values..
Suppose if we toss two coins and X be random variable relating no. of heads with (0,1,2) .For event of two heads say w....we have X(w)=2 is value of function X at w. and not of X itself..
But sometimes it is written that x is a r .v taking values 0,1,2,....
Don't it sound wrong to say function and takes values?
A random variable is a well defined function X: E -> R, whose domain E is a probability space and its codomain is (generally speaking) the set of real numbers.
Intuitively, X is some kind of metric or measurement on the elements of E.
Example 1
Let E be the set of users of Stack Overflow at a given point in time, say right now. And let X be the function that assigns their reputation to every SO user. For example, you could calculate P(X >= 5000) which is the percent of SO users with a reputation of 5000 or more.
Notice that P(X >= 5000) is nothing but a compact notation for the subset of E defined as:
{u in E | X(u) >= 5000}
meaning the subset of SO users u with a reputation of 5000 or more.
Example 2
Let E be the set of questions in SO and X the function that assigns the number of votes (at certain point in time) to each question. If you pick one question q at random, X(q) would be its number of votes and we could ask for the probability of, say, X < 0 (down-voted questions.)
Here the subset of such questions is
{q in E | X(q) < 0}
i.e., the subset of questions q having a negative vote count.
Conclusion
There is nothing random in a Random variable. The randomness is in the way we pick elements (or subsets) from its domain.
Speaking of functions - Yes, it is safe to say that a function can take certain values. Speaking of random variables and probability, the definition I know is:
A random variable assigns a numerical value to each possible outcome of a random experiment
This definition does indeed say that X (aka random variable) is a function. In your case, where it is said that X (as in function) can take values 0,1,2 is basically saying that the subset of the codomain (or even the codomain or target set itself) of function X is the set {0,1,2}, or interval
[0,2] ⊂ ℕ.

Implementing cartesian product, such that it can skip iterations

I want to implement a function which will return cartesian product of set, repeated given number. For example
input: {a, b}, 2
output:
aa
ab
bb
ba
input: {a, b}, 3
aaa
aab
aba
baa
bab
bba
bbb
However the only way I can implement it is firstly doing cartesion product for 2 sets("ab", "ab), then from the output of the set, add the same set. Here is pseudo-code:
function product(A, B):
result = []
for i in A:
for j in B:
result.append([i,j])
return result
function product1(chars, count):
result = product(chars, chars)
for i in range(2, count):
result = product(result, chars)
return result
What I want is to start computing directly the last set, without computing all of the sets before it. Is this possible, also a solution which will give me similar result, but it isn't cartesian product is acceptable.
I don't have problem reading most of the general purpose programming languages, so if you need to post code you can do it in any language you fell comfortable with.
Here's a recursive algorithm that builds S^n without building S^(n-1) "first". Imagine an infinite k-ary tree where |S| = k. Label with the elements of S each of the edges connecting any parent to its k children. An element of S^m can be thought of as any path of length m from the root. The set S^m, in that way of thinking, is the set of all such paths. Now the problem of finding S^n is a problem of enumerating all paths of length n - and we can name a path by considering the sequence of edge labels from beginning to end. We want to directly generate S^n without first enumerating all of S^(n-1), so a depth-first search modified to find all nodes at depth n seems appropriate. This is essentially how the below algorithm works:
// collection to hold generated output
members = []
// recursive function to explore product space
Products(set[1...n], length, current[1...m])
// if the product we're working on is of the
// desired length then record it and return
if m = length then
members.append(current)
return
// otherwise we add each possible value to the end
// and generate all products of the desired length
// with the new vector as a prefix
for i = 1 to n do
current.addLast(set[i])
Products(set, length, current)
currents.removeLast()
// reset the result collection and request the set be generated
members = []
Products([a, b], 3, [])
Now, a breadth-first approach is no less efficient than a depth-first one, and if you think about it would be no different from exactly what you're already doing. Indeed, and approach that generates S^n must necessarily generate S^(n-1) at least once, since that can be found in a solution to S^n.

Solving State Space Response with Variable A matrix

I am trying to verify my RK4 code and have a state space model to solve the same system. I have a 14 state system with initial conditions, but the conditions change with time (each iteration). I am trying to formulate A,B,C,D matrices and use sys and lsim in order to compile the results for all of my states for the entire time span. I am trying to do it similar to this:
for t=1:1:5401
y1b=whatever
.
.
y14b = whatever
y_0 = vector of ICs
A = (will change with time)
B = (1,14) with mostly zeros and 3 ones
C = ones(14,1)
D = 0
Q = eye(14)
R = eye(1)
k = lqr(A,B,C,D)
A_bar = A - B*k
sys = ss(A_bar,B,C,D)
u = zeros(14,1)
sto(t,14) = lsim(sys,u,t,y_0)
then solve for new y1b-y14b from outside function
end
In other words I am trying to use sto(t,14) to store each iteration of lsim and end up with a matrix of all of my states for each time step from 1 to 5401. I keep getting this error message:
Error using DynamicSystem/lsim (line 85)
In time response commands, the time vector must be real, finite, and must contain
monotonically increasing and evenly spaced time samples.
and
Error using DynamicSystem/lsim (line 85)
When simulating the response to a specific input signal, the input data U must be a
matrix with as many rows as samples in the time vector T, and as many columns as
input channels.
Any helpful input is greatly appreciated. Thank you
For lsim to work, t has to contain at least 2 points.
Also, the sizes of B and C are flipped. You have 1 input and 1 output so u should be length of t in lsim by 1.
Lastly, it looks like you try to put all initials conditions at once in lsim with y_0 where you just want the part relevant to this iteration.
s = [t-1 t];
u = [0; 0];
if t==1
y0 = y_0;
else
y0 = sto(t-1,1:14);
end
y = lsim(sys, u, s, y0);
sto(t,1:14) = y(end,:);
I'm not sure I understood correctly your question but I hope it helps.

Resources