How to formulate the problem of finding the optimal PID parameters in GEKKO?

I have defined a first-order process model and would like to find the optimal PID parameters for this process. The optimization objective is to minimize the IAE (integral of absolute error between the setpoint and the process value) for a setpoint change over a horizon of 5 times the process time constant.
It is neither a dynamic optimization (IMODE=6) problem nor a pure steady-state optimization (IMODE=3) problem, as it involves the derivatives. How can I formulate the above problem in GEKKO?
from gekko import GEKKO

m = GEKKO(remote=False)
# Controller model
Kc = m.Var(1.0,lb=0.01,ub=10) # controller gain
tauI = m.Var(2.0,lb=0.01,ub=1000) # controller reset time
tauD = m.Var(1.0,lb=0.0,ub=100) # derivative constant
OP = m.Var(value=0.0,lb=0.0,ub=100) # controller output
PV = m.Var(value=0.0) # process variable
SP = 1.0 # set point
Intgl = m.Var(value=0.0) # integral of the error
err = m.Intermediate(SP-PV) # set point error
m.Equation(Intgl.dt()==err) # integral of the error
m.Equation(OP == Kc*(err + (1/tauI)*Intgl + tauD*PV.dt()))
# Process model
Kp = 2 # process gain
tauP = 10.0 # process time constant
m.Equation(tauP*PV.dt() + PV == Kp*OP)
m.Obj((SP-PV)**2) # how to define the objective to minimize the error over a horizon
m.options.IMODE=3
m.solve(disp=False)
print(str(Kc.VALUE))
print(str(tauI.VALUE))
print(str(tauD.VALUE))
print(str(m.options.OBJFCNVAL))

There is a video tutorial on simulating (00:00-17:00) and optimizing (17:00-23:41) PID tuning parameters with GEKKO. There is starting code as problem #14 in this list of tutorials.
The main points from the video are to switch to IMODE=6 and set the STATUS=1 for the parameters that should be adjusted to minimize the error: (SP-PV)**2.
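Following the video's approach, a minimal sketch of the IMODE=6 formulation might look like the following (the time grid, the unit setpoint step, and the squared-error stand-in for the IAE are illustrative assumptions, not the tutorial's exact code):

import numpy as np
from gekko import GEKKO

m = GEKKO(remote=False)
tauP = 10.0                               # process time constant
m.time = np.linspace(0, 5*tauP, 101)      # horizon = 5 process time constants

# Tuning constants become FVs (a single value over the whole horizon)
# with STATUS=1 so the optimizer is free to adjust them
Kc = m.FV(value=1.0, lb=0.01, ub=10); Kc.STATUS = 1
tauI = m.FV(value=2.0, lb=0.01, ub=1000); tauI.STATUS = 1
tauD = m.FV(value=1.0, lb=0.0, ub=100); tauD.STATUS = 1

OP = m.Var(value=0.0, lb=0.0, ub=100)     # controller output
PV = m.Var(value=0.0)                     # process variable
SP = m.Param(value=np.ones(101))          # unit setpoint step at t=0
Intgl = m.Var(value=0.0)                  # integral of the error

err = m.Intermediate(SP - PV)             # setpoint error
m.Equation(Intgl.dt() == err)
m.Equation(OP == Kc*(err + (1/tauI)*Intgl + tauD*PV.dt()))

Kp = 2.0                                  # process gain
m.Equation(tauP*PV.dt() + PV == Kp*OP)

m.Obj((SP - PV)**2)                       # summed over the horizon in IMODE=6
m.options.IMODE = 6                       # simultaneous dynamic optimization
m.solve(disp=False)
print(Kc.value[0], tauI.value[0], tauD.value[0])

If a true IAE objective is needed, one of GEKKO's absolute-value formulations (e.g. m.abs3) could replace the squared error, at the cost of a harder, mixed-integer problem.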

Related

Use Dymos and OpenMDAO to simulate a pressurized bottle fluid expulsion

I am opening this new thread because I am looking for some help using Dymos to simulate a dynamic system.
I am trying to simulate a system composed of a pressurized bottle with a fluid inside. At t=0, the pressure pushes the fluid through the bottle outlet, and as a result the pressure inside the bottle decreases. My aim is to simulate the behaviour of the pressure inside the bottle and the volumetric flow rate of the fluid escaping from the bottle. I found a Dymos example which is very similar to what I am trying to do, but simpler: https://openmdao.github.io/dymos/examples/water_rocket/water_rocket.html
To model my system, I am using two explicit components: PressureRate and VolumeFlowRate. Then I define the group component BottleModelODE to connect these two components and their variables.
Here are these components:
class PressureRate(om.ExplicitComponent):

    def initialize(self):
        self.options.declare('num_nodes', types=int)

    def setup(self):
        nn = self.options['num_nodes']

        # Inputs
        self.add_input('p', shape=(nn,), desc='Pressure inside the nox bottle', units='Pa')
        self.add_input('Vb', shape=(nn,), desc='Bottle volume', units='m**3')
        self.add_input('Vl', shape=(nn,), desc='Liquid volume', units='m**3')
        self.add_input('Vl_dot', shape=(nn,), desc='Liquid volume flow rate', units='m**3/s')
        self.add_input('gamma', shape=(nn,), desc='Heat capacity ratio')

        # Outputs
        self.add_output('p_dot', val=np.ones(nn), desc='Pressure change rate', units='Pa/s')

        self.declare_partials(of='*', wrt='*', method='fd')

    def compute(self, inputs, outputs):
        p = inputs['p']
        Vb = inputs['Vb']
        Vl = inputs['Vl']
        Vl_dot = inputs['Vl_dot']
        gamma = inputs['gamma']

        outputs['p_dot'] = gamma * p/(Vb - Vl) * Vl_dot


class VolumeFlowRate(om.ExplicitComponent):
    """
    A Dymos ODE for a damped harmonic oscillator.
    """

    def initialize(self):
        self.options.declare('num_nodes', types=int)

    def setup(self):
        nn = self.options['num_nodes']

    def setup(self):
        # Inputs
        self.add_input('p', desc='Pressure inside the nox_bottle', units='Pa')
        self.add_input('pout', desc='Pressure outside the nox_bottle', units='Pa')
        self.add_input('deltap', desc='Nox bottle pressure losses', units='Pa')
        self.add_input('rhol', desc='Liquid density', units='kg/m**3')
        self.add_input('Aout', desc='Output nox_bottle area', units='m**2')

        # Outputs
        self.add_output('Vl_dot', desc='Volume flow rate', units='m**3/s')

        self.declare_partials(of='*', wrt='*', method='fd')

    def compute(self, inputs, outputs):
        p = inputs['p']
        pout = inputs['pout']
        deltap = inputs['deltap']
        rhol = inputs['rhol']
        Aout = inputs['Aout']

        outputs['Vl_dot'] = Aout*np.sqrt(2/rhol*(p - pout - deltap))


class BottleModelODE(om.Group):

    def initialize(self):
        self.options.declare('num_nodes', types=int)

    def setup(self):
        nn = self.options['num_nodes']

        self.add_subsystem('pressure_rate', subsys=PressureRate(num_nodes=nn),
                           promotes_inputs=['p', "Vb", "Vl", "Vl_dot", "gamma"], promotes_outputs=['p_dot'])
        self.add_subsystem('volume_flow_rate', subsys=VolumeFlowRate(num_nodes=nn),
                           promotes_inputs=['p', "pout", 'deltap', 'rhol', "Aout"], promotes_outputs=['Vl_dot'])

        self.connect('pressure_rate.p', 'volume_flow_rate.p')
        self.connect('pressure_rate.Vl_dot', 'volume_flow_rate.Vl_dot')
Then, to solve these equations and simulate my model, I built a program based on the oscillator example: https://openmdao.github.io/dymos/getting_started/intro_to_dymos/intro_segments.html
I am defining a phase called "expulsion" by using the following function:
def expulsion_phase_fn(transcription: dm.transcriptions.pseudospectral.radau_pseudospectral.Radau, pamb: float):
    phase = dm.Phase(ode_class=BottleModelODE, transcription=transcription)

    phase.set_time_options(fix_initial=True, fix_duration=True)

    phase.add_state('p', units='bar', rate_source='pressure_rate.p_dot',
                    targets=['pressure_rate.p', "volume_flow_rate.p"], fix_initial=True, fix_final=False, lower=pamb)
    phase.add_state('Vl', units='m**3', rate_source='volume_flow_rate.Vl_dot', targets=['pressure_rate.Vl'],
                    fix_initial=True, fix_final=False, lower=0)

    phase.add_parameter('Vb', targets=['pressure_rate.Vb'], units='m**3')
    phase.add_parameter('gamma', targets=['pressure_rate.gamma'])
    phase.add_parameter('rhol', targets=['volume_flow_rate.rhol'], units='kg/m**3')
    phase.add_parameter('Aout', targets=['volume_flow_rate.Aout'], units='m**2')
    phase.add_parameter('pout', targets=['volume_flow_rate.pout'], units="Pa")
    phase.add_parameter('deltap', targets=['volume_flow_rate.deltap'], units="Pa")

    return phase
Then, I am defining a trajectory with this function:
def trajectory(pamb: float):
    transcript = dm.Radau(num_segments=50, solve_segments='forward')
    traj = dm.Trajectory()

    # Add phases to trajectory
    expulsion_phase = traj.add_phase('expulsion',
                                     expulsion_phase_fn(transcription=transcript, pamb=pamb))

    return traj, expulsion_phase
And finally, I am setting up the OpenMDAO problem and providing the initial values with the following lines, which are based on the oscillator example:
def launch_compt():

    # Set ambient conditions
    Tamb = 20 + 273.15
    pamb = 100*10**3
    deltap = 0
    Vb = 5*10**-3
    Aout = 10*10**-4

    # Set NOX bottle properties up
    bottle_params = {"Vb": 5*10**-3, "gamma": 1.4, "Aout": 3*10**-2, "rhol": 1000, "pout":
                     100*10**3, "pinit": 300*10**3, "Vl": 1*10**-3}

    # Instantiate an OpenMDAO Problem instance
    prob = om.Problem(model=om.Group())
    prob.driver = om.ScipyOptimizeDriver(optimizer='SLSQP')

    # Instantiate a Dymos trajectory and add it to the Problem model
    traj, phase = trajectory(pamb=100*10*3)
    phase.add_objective("time", loc="final")

    # Setup the OpenMDAO problem
    prob.model.add_subsystem("traj", traj)
    prob.setup()

    # Assign values to the times and states
    prob.set_val('traj.explusion.t_initial', 0.0)
    prob.set_val('traj.explusion.t_duration', 200.0)
    prob.set_val('traj.explusion.states:p', bottle_params["pinit"])
    prob.set_val('traj.explusion.states:Vl', bottle_params["Vl"])
    prob.set_val('traj.explusion.parameters:Vb', bottle_params["Vb"])
    prob.set_val('traj.explusion.parameters:gamma', bottle_params["gamma"])
    prob.set_val('traj.explusion.parameters:rhol', bottle_params["rhol"])
    prob.set_val('traj.explusion.parameters:Aout', bottle_params["Aout"])
    prob.set_val('traj.explusion.parameters:pout', bottle_params["pout"])
    prob.set_val('traj.explusion.parameters:deltap', bottle_params["deltap"])

    prob.run_driver()
Unfortunately, that does not work and I cannot understand why. It tells me that the parameter Vb (bottle total volume) is not provided, but I cannot understand why: it is provided when I add the parameters to the problem, just as in the oscillator example.
I am therefore reaching out in the hope of finding some help.
Thanks in advance for any answer.
PS: Here is the error message that I get when I try to run the program:
raise ValueError(f'Invalid parameter in phase `{self.pathname}`.\n{str(e)}') from e
ValueError: Invalid parameter in phase `traj.phases.expulsion`.
Parameter `Vb` has invalid target(s).
No such ODE input: 'pressure_rate.Vb'.
The primary issue that you have asked about, related to the No such ODE input error, is caused by the way you coded your ODE, and more specifically the way you promoted variables and then added the ODE to the phase.
For example, you promoted your input p but then set the state target to pressure_rate.p. This is incorrect. When you promoted p, that effectively moved the name up to the top level of the ODE, so the target is just p now. You can read more about promotion vs connection in the docs. You have this problem throughout most of your script, where you are not accounting for promotion when you set targets.
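For instance, once p is promoted to the top level of the ODE, the state should reference the promoted names directly (a minimal sketch of the corrected pattern; the full fixed script appears further below):

phase.add_state('p', units='bar', rate_source='p_dot', targets=['p'],
                fix_initial=True, fix_final=False, lower=pamb)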
Unfortunately, this is not the only issue in your script. There are several more, enough that I was not able to get things fully working.
Here are some other modest issues, in rough order of importance:
The VolumeFlowRate component inputs and outputs are scalar, but seem to be intended to connect to the vector (of size num_nodes) variables of PressureRate. I suspect you meant to make them vectors as well, but am not 100% sure.
You have an execution order issue between PressureRate and VolumeFlowRate. PressureRate needs Vl_dot as an input, which comes from VolumeFlowRate, but you added PressureRate first, so it will run BEFORE the component providing its input value.
You had a typo in your set_val calls (explusion vs expulsion)
You did not have a deltap key in the parameter dictionary, but you did have a variable for it.
After fixing those, I could get the problem to start running, but it did not converge or give an answer. You had solve_segments set to 'forward' and had 50 segments. Both of those seemed like bad settings to me, so I changed to 3 segments and removed the solve_segments option.
Then I was able to get the optimizer to take a few steps, but it errored with
Current function value: [200.]
Iterations: 6
Function evaluations: 12
Gradient evaluations: 2
Optimization FAILED.
Positive directional derivative for linesearch
This indicated a problem with the derivatives, so I changed your setting for partial derivatives from fd to cs. That allowed it to iterate more, but it still didn't converge. Without diving more into the physics of your problem, I can't easily diagnose this further. I suspect you have some bad boundary conditions and probably bad initial guesses, though.
Here is the modified script I came up with to at least get the optimizer iterating.
import numpy as np

import openmdao.api as om
import dymos as dm


class PressureRate(om.ExplicitComponent):

    def initialize(self):
        self.options.declare('num_nodes', types=int)

    def setup(self):
        nn = self.options['num_nodes']

        # Inputs
        self.add_input('p', shape=(nn,), desc='Pressure inside the nox bottle', units='Pa')
        self.add_input('Vb', shape=(nn,), desc='Bottle volume', units='m**3')
        self.add_input('Vl', shape=(nn,), desc='Liquid volume', units='m**3')
        self.add_input('Vl_dot', shape=(nn,), desc='Liquid volume flow rate', units='m**3/s')
        self.add_input('gamma', shape=(nn,), desc='Heat capacity ratio')

        # Outputs
        self.add_output('p_dot', val=np.ones(nn), desc='Pressure change rate', units='Pa/s')

        self.declare_partials(of='*', wrt='*', method='cs')

    def compute(self, inputs, outputs):
        p = inputs['p']
        Vb = inputs['Vb']
        Vl = inputs['Vl']
        Vl_dot = inputs['Vl_dot']
        gamma = inputs['gamma']

        outputs['p_dot'] = gamma * p/(Vb - Vl) * Vl_dot


class VolumeFlowRate(om.ExplicitComponent):
    """
    A Dymos ODE for a damped harmonic oscillator.
    """

    def initialize(self):
        self.options.declare('num_nodes', types=int)

    def setup(self):
        nn = self.options['num_nodes']

        # Inputs
        self.add_input('p', shape=(nn,), desc='Pressure inside the nox_bottle', units='Pa')
        self.add_input('pout', shape=(nn,), desc='Pressure outside the nox_bottle', units='Pa')
        self.add_input('deltap', shape=(nn,), desc='Nox bottle pressure losses', units='Pa')
        self.add_input('rhol', shape=(nn,), desc='Liquid density', units='kg/m**3')
        self.add_input('Aout', shape=(nn,), desc='Output nox_bottle area', units='m**2')

        # Outputs
        self.add_output('Vl_dot', shape=(nn,), desc='Volume flow rate', units='m**3/s')

        self.declare_partials(of='*', wrt='*', method='cs')

    def compute(self, inputs, outputs):
        p = inputs['p']
        pout = inputs['pout']
        deltap = inputs['deltap']
        rhol = inputs['rhol']
        Aout = inputs['Aout']

        outputs['Vl_dot'] = Aout*np.sqrt(2/rhol*(p - pout - deltap))


class BottleModelODE(om.Group):

    def initialize(self):
        self.options.declare('num_nodes', types=int)

    def setup(self):
        nn = self.options['num_nodes']

        self.add_subsystem('volume_flow_rate', subsys=VolumeFlowRate(num_nodes=nn),
                           promotes_inputs=['p', "pout", 'deltap', 'rhol', "Aout"], promotes_outputs=['Vl_dot'])
        self.add_subsystem('pressure_rate', subsys=PressureRate(num_nodes=nn),
                           promotes_inputs=['p', "Vb", "Vl", "Vl_dot", "gamma"], promotes_outputs=['p_dot'])


def expulsion_phase_fn(transcription: dm.transcriptions.pseudospectral.radau_pseudospectral.Radau, pamb: float):
    phase = dm.Phase(ode_class=BottleModelODE, transcription=transcription)

    phase.set_time_options(fix_initial=True, fix_duration=True)

    phase.add_state('p', units='bar', rate_source='p_dot', targets=['p'],
                    fix_initial=True, fix_final=False, lower=pamb)
    phase.add_state('Vl', units='m**3', rate_source='Vl_dot', targets=['Vl'],
                    fix_initial=True, fix_final=False, lower=0)

    phase.add_parameter('Vb', targets=['Vb'], units='m**3')
    phase.add_parameter('gamma', targets=['gamma'])
    phase.add_parameter('rhol', targets=['rhol'], units='kg/m**3')
    phase.add_parameter('Aout', targets=['Aout'], units='m**2')
    phase.add_parameter('pout', targets=['pout'], units="Pa")
    phase.add_parameter('deltap', targets=['deltap'], units="Pa")

    return phase


def trajectory(pamb: float):
    # transcript = dm.Radau(num_segments=50, solve_segments='forward')
    transcript = dm.Radau(num_segments=3)
    traj = dm.Trajectory()

    # Add phases to trajectory
    expulsion_phase = traj.add_phase('expulsion', expulsion_phase_fn(transcription=transcript, pamb=pamb))

    return traj, expulsion_phase
if __name__ == "__main__":
    # Set ambient conditions
    Tamb = 20 + 273.15
    pamb = 100*10**3
    deltap = 0
    Vb = 5*10**-3
    Aout = 10*10**-4

    # Set NOX bottle properties up
    bottle_params = {"Vb": 5*10**-3, "gamma": 1.4, "Aout": 3*10**-2, "rhol": 1000, "pout": 100*10**3, "pinit": 300*10**3, "Vl": 1*10**-3}

    # Instantiate an OpenMDAO Problem instance
    prob = om.Problem(model=om.Group())
    prob.driver = om.ScipyOptimizeDriver(optimizer='SLSQP')

    # Instantiate a Dymos trajectory and add it to the Problem model
    traj, phase = trajectory(pamb=100*10*3)
    phase.add_objective("time", loc="final")

    # Setup the OpenMDAO problem
    prob.model.add_subsystem("traj", traj)
    prob.setup()

    # Assign values to the times and states
    prob.set_val('traj.expulsion.t_initial', 0.0)
    prob.set_val('traj.expulsion.t_duration', 200.0)
    prob.set_val('traj.expulsion.states:p', bottle_params["pinit"])
    prob.set_val('traj.expulsion.states:Vl', bottle_params["Vl"])
    prob.set_val('traj.expulsion.parameters:Vb', bottle_params["Vb"])
    prob.set_val('traj.expulsion.parameters:gamma', bottle_params["gamma"])
    prob.set_val('traj.expulsion.parameters:rhol', bottle_params["rhol"])
    prob.set_val('traj.expulsion.parameters:Aout', bottle_params["Aout"])
    prob.set_val('traj.expulsion.parameters:pout', bottle_params["pout"])
    prob.set_val('traj.expulsion.parameters:deltap', deltap)

    prob.run_driver()

TensorFlow - directly calling tf.function much faster than calling tf.function returned from wrapper

I am training a VAE (using federated learning, but that is not so important) and wanted to keep the loss and train functions easy to exchange. The initial approach was to have a tf.function as the loss function and a tf.function as the train function, as follows:
@tf.function
def kl_reconstruction_loss(model, model_input, beta):
    x, y = model_input
    mean, logvar = model.encode(x, y)
    z = model.reparameterize(mean, logvar)
    x_logit = model.decode(z, y)

    cross_ent = tf.nn.sigmoid_cross_entropy_with_logits(logits=x_logit, labels=x)
    reconstruction_loss = tf.reduce_mean(tf.reduce_sum(cross_ent, axis=[1, 2, 3]), axis=0)
    kl_loss = tf.reduce_mean(0.5 * tf.reduce_sum(tf.exp(logvar) + tf.square(mean) - 1. - logvar, axis=-1), axis=0)
    loss = reconstruction_loss + beta * kl_loss

    return loss, kl_loss, reconstruction_loss


@tf.function
def train_fn(model: tf.keras.Model, batch, optimizer, kl_beta):
    """Trains the model on a single batch.

    Args:
        model: The VAE model.
        batch: A batch of inputs [images, labels] for the vae.
        optimizer: The optimizer to train the model.
        kl_beta: Weighting of KL loss
    Returns:
        The loss.
    """
    def vae_loss():
        """Does the forward pass and computes losses for the generator."""
        # N.B. The complete pass must be inside loss() for gradient tracing.
        return kl_reconstruction_loss(model, batch, kl_beta)

    with tf.GradientTape() as tape:
        loss, kl_loss, rc_loss = vae_loss()
        grads = tape.gradient(loss, model.trainable_variables)
        grads_and_vars = zip(grads, model.trainable_variables)
        optimizer.apply_gradients(grads_and_vars)
    return loss
For my dataset this results in an epoch duration of approx. 25 seconds. However, since I have to call those functions directly in my code, I would have to substitute different ones if I wanted to try out different loss/train functions.
So, alternatively, I followed https://github.com/google-research/federated/tree/master/gans and wrapped the loss function in a class and the train function in another function. Now I have:
class VaeKlReconstructionLossFns(AbstractVaeLossFns):

    @tf.function
    def vae_loss(self, model, model_input, labels, global_round):
        # KL Reconstruction loss
        mean, logvar = model.encode(model_input, labels)
        z = model.reparameterize(mean, logvar)
        x_logit = model.decode(z, labels)

        cross_ent = tf.nn.sigmoid_cross_entropy_with_logits(logits=x_logit, labels=model_input)
        reconstruction_loss = tf.reduce_mean(tf.reduce_sum(cross_ent, axis=[1, 2, 3]), axis=0)
        kl_loss = tf.reduce_mean(0.5 * tf.reduce_sum(tf.exp(logvar) + tf.square(mean) - 1. - logvar, axis=-1), axis=0)
        loss = reconstruction_loss + self._get_beta(global_round) * kl_loss

        if model.losses:
            loss += tf.add_n(model.losses)

        return loss, kl_loss, reconstruction_loss


def create_train_vae_fn(
        vae_loss_fns: vae_losses.AbstractVaeLossFns,
        vae_optimizer: tf.keras.optimizers.Optimizer):
    """Create a function that trains VAE, binding loss and optimizer.

    Args:
        vae_loss_fns: Instance of gan_losses.AbstractVAELossFns interface,
            specifying the VAE training loss.
        vae_optimizer: Optimizer for training the VAE.
    Returns:
        Function that executes one step of VAE training.
    """
    # We check that the optimizer has not been used previously, which ensures
    # that when it is bound the train fn isn't holding onto a different copy of
    # the optimizer variables than the copy that is being exchanged b/w server
    # and clients.
    if vae_optimizer.variables():
        raise ValueError(
            'Expected vae_optimizer to not have been used previously, but '
            'variables were already initialized.')

    @tf.function
    def train_vae_fn(model: tf.keras.Model,
                     model_inputs,
                     labels,
                     global_round,
                     new_optimizer_state=None):
        """Trains the model on a single batch.

        Args:
            model: The VAE model.
            model_inputs: A batch of inputs (usually images) for the VAE.
            labels: A batch of labels corresponding to the inputs.
            global_round: The current global FL round for beta calculation
            new_optimizer_state: A possible optimizer state to overwrite the current one with.
        Returns:
            The number of examples trained on.
            The loss.
            The updated optimizer state.
        """
        def vae_loss():
            """Does the forward pass and computes losses for the generator."""
            # N.B. The complete pass must be inside loss() for gradient tracing.
            return vae_loss_fns.vae_loss(model, model_inputs, labels, global_round)

        # Set optimizer vars
        optimizer_state = get_optimizer_state(vae_optimizer)
        if new_optimizer_state is not None:
            # if optimizer is uninitialised, initialise vars
            try:
                tf.nest.assert_same_structure(optimizer_state, new_optimizer_state)
            except ValueError:
                initialize_optimizer_vars(vae_optimizer, model)
                optimizer_state = get_optimizer_state(vae_optimizer)
                tf.nest.assert_same_structure(optimizer_state, new_optimizer_state)
            tf.nest.map_structure(lambda a, b: a.assign(b), optimizer_state, new_optimizer_state)

        with tf.GradientTape() as tape:
            loss, kl_loss, rc_loss = vae_loss()
            grads = tape.gradient(loss, model.trainable_variables)
            grads_and_vars = zip(grads, model.trainable_variables)
            vae_optimizer.apply_gradients(grads_and_vars)

        return tf.shape(model_inputs)[0], loss, optimizer_state

    return train_vae_fn
This new formulation takes about 86 seconds per epoch.
I am struggling to understand why the second version performs so much worse than the first one. Does anyone have a good explanation for this?
Thanks in advance!
EDIT: My Tensorflow version is 2.5.0

Trying to put together a teaching-example with pyhf

I'm trying to learn more about pyhf and my understanding of what the goals are might be limited. I would love to fit my HEP data outside of ROOT, but I could be imposing expectations on pyhf which are not what the authors intended for its use.
I'd like to write myself a hello-world example, but I might just not know what I'm doing. My misunderstanding could also be gaps in my statistical knowledge.
With that preface, let me explain what I'm trying to explore.
I have some observed set of events for which I calculate some observable and make a binned histogram of that data. I hypothesize that there are two contributing physics processes, which I call signal and background. I generate some Monte Carlo samples for these processes and the theorized total number of events is close to, but not exactly what I observe.
I would like to:
Fit the data to this two process hypothesis
Get from the fit the optimal values for the number of events for each process
Get the uncertainties on these fitted values
If appropriate, calculate an upper limit on the number of signal events.
My starter code is below, where all I'm doing is a maximum-likelihood fit, but I'm not sure where to go from there. I know it's not set up to do what I want, but I'm getting lost in the examples I find on RTD. I'm sure it's me; this is not a criticism of the documentation.
import pyhf
import numpy as np
import matplotlib.pyplot as plt
nbins = 15
# Generate a background and signal MC sample
MC_signal_events = np.random.normal(5,1.0,200)
MC_background_events = 10*np.random.random(1000)
signal_data = np.histogram(MC_signal_events,bins=nbins)[0]
bkg_data = np.histogram(MC_background_events,bins=nbins)[0]
# Generate an observed dataset with a slightly different
# number of events
signal_events = np.random.normal(5,1.0,180)
background_events = 10*np.random.random(1050)
observed_events = np.array(signal_events.tolist() + background_events.tolist())
observed_sample = np.histogram(observed_events,bins=nbins)[0]
# Plot these samples, if you like
plt.figure(figsize=(12,4))
plt.subplot(1,3,1)
plt.hist(observed_events,bins=nbins,label='Observations')
plt.legend()
plt.subplot(1,3,2)
plt.hist(MC_signal_events,bins=nbins,label='MC signal')
plt.legend()
plt.subplot(1,3,3)
plt.hist(MC_background_events,bins=nbins,label='MC background')
plt.legend()
# Use a very naive estimate of the background
# uncertainties
bkg_uncerts = np.sqrt(bkg_data)
print("Defining the PDF.......")
pdf = pyhf.simplemodels.hepdata_like(signal_data=signal_data.tolist(), \
bkg_data=bkg_data.tolist(), \
bkg_uncerts=bkg_uncerts.tolist())
print("Fit.......")
data = pyhf.tensorlib.astensor(observed_sample.tolist() + pdf.config.auxdata)
bestfit_pars, twice_nll = pyhf.infer.mle.fit(data, pdf, return_fitted_val=True)
print(bestfit_pars)
print(twice_nll)
plt.show()
Note: this answer is based on pyhf v0.5.2.
Alright, so it looks like you've managed to figure out most of the big pieces. However, there are two different ways to do this, depending on how you prefer to set things up. In both cases, I assume you want an unconstrained fit and you want to...
fit your signal+background model to observed data
fit your background model to observed data
First, let's discuss uncertainties briefly. At the moment, we default to numpy for the tensor backend and scipy for the optimizer. See the documentation:
numpy backend
scipy optimizer
However, one unfortunate drawback right now with the scipy optimizer is that it cannot return the uncertainties. What you need to do anywhere in your code before the fit (although we generally recommend as early as possible) is to use the minuit optimizer instead:
pyhf.set_backend('numpy', 'minuit')
This will get you the nice features of being able to get the correlation matrix, the uncertainties on the fitted parameters, and the Hessian, amongst other things. We're working to make this consistent for scipy as well, but it is not ready right now.
All optimizations go through our optimizer API which you can currently view through the mixin here in our documentation. Specifically, the signature is
minimize(
    objective,
    data,
    pdf,
    init_pars,
    par_bounds,
    fixed_vals=None,
    return_fitted_val=False,
    return_result_obj=False,
    do_grad=None,
    do_stitch=False,
    **kwargs)
There are a lot of options here. Let's just focus on the fact that one of the keyword arguments we can pass through is return_uncertainties, which augments the best-fit parameters with a column for the fitted parameter uncertainties, which is what you want.
1. Signal+Background
In this case, we want to just use the default model
result, twice_nll = pyhf.infer.mle.fit(
    data,
    pdf,
    return_uncertainties=True,
    return_fitted_val=True
)
bestfit_pars, errors = result.T
2. Background-Only
In this case, we need to turn off the signal. The way we do this is by setting the parameter of interest (POI) fixed to 0.0. Then we can get the fitted parameters for the background-only model in a similar way, but using fixed_poi_fit instead of an unconstrained fit:
result, twice_nll = pyhf.infer.mle.fixed_poi_fit(
    0.0,
    data,
    pdf,
    return_uncertainties=True,
    return_fitted_val=True
)
bestfit_pars, errors = result.T
Note that this is quite simply a shortcut for the following fit:
bkg_params = pdf.config.suggested_init()
fixed_params = pdf.config.suggested_fixed()
bkg_params[pdf.config.poi_index] = 0.0
fixed_params[pdf.config.poi_index] = True

result, twice_nll = pyhf.infer.mle.fit(
    data,
    pdf,
    init_pars=bkg_params,
    fixed_params=fixed_params,
    return_uncertainties=True,
    return_fitted_val=True
)
bestfit_pars, errors = result.T
Hopefully that clears things up!
Giordon's solution should answer all of your questions, but I thought I'd also write out the code to address everything we can.
I also take the liberty of changing some of your values a bit, so that the signal isn't so strong that the observed CLs value sits far off to the right of the Brazil band (the results aren't wrong, obviously, but at that point it probably makes more sense to talk about using the discovery test statistic than about setting limits :)).
Environment
For this example I'm going to setup a clean Python 3 virtual environment and then install the dependencies (here we're going to be using pyhf v0.5.2)
$ python3 -m venv "${HOME}/.venvs/question"
$ . "${HOME}/.venvs/question/bin/activate"
(question) $ cat requirements.txt
pyhf[minuit,contrib]~=0.5.2
black
(question) $ python -m pip install -r requirements.txt
Code
While we can't easily get the best fit value for both the number of signal events and the number of background events, we definitely can do inference to get the best fit value for the signal strength.
The following chunk of code (which is long only because of the visualization) should address all of the points of your question.
# answer.py
import numpy as np
import pyhf
import matplotlib.pyplot as plt
import pyhf.contrib.viz.brazil

# Goals:
# - Fit the model to the observed data
# - Infer the best fit signal strength given the model
# - Get the uncertainties on the best fit signal strength
# - Calculate a 95% CL upper limit on the signal strength


def plot_hist(ax, bins, data, bottom=0, color=None, label=None):
    bin_width = bins[1] - bins[0]
    bin_leftedges = bins[:-1]
    bin_centers = [edge + bin_width / 2.0 for edge in bin_leftedges]
    ax.bar(
        bin_centers, data, bin_width, bottom=bottom, alpha=0.5, color=color, label=label
    )


def plot_data(ax, bins, data, label="Data"):
    bin_width = bins[1] - bins[0]
    bin_leftedges = bins[:-1]
    bin_centers = [edge + bin_width / 2.0 for edge in bin_leftedges]
    ax.scatter(bin_centers, data, color="black", label=label)


def invert_interval(test_mus, hypo_tests, test_size=0.05):
    # This will be taken care of in v0.5.3
    cls_obs = np.array([test[0] for test in hypo_tests]).flatten()
    cls_exp = [
        np.array([test[1][idx] for test in hypo_tests]).flatten() for idx in range(5)
    ]
    crossing_test_stats = {"exp": [], "obs": None}
    for cls_exp_sigma in cls_exp:
        crossing_test_stats["exp"].append(
            np.interp(
                test_size, list(reversed(cls_exp_sigma)), list(reversed(test_mus))
            )
        )
    crossing_test_stats["obs"] = np.interp(
        test_size, list(reversed(cls_obs)), list(reversed(test_mus))
    )
    return crossing_test_stats


def main():
    np.random.seed(0)
    pyhf.set_backend("numpy", "minuit")

    observable_range = [0.0, 10.0]
    bin_width = 0.5
    _bins = np.arange(observable_range[0], observable_range[1] + bin_width, bin_width)

    n_bkg = 2000
    n_signal = int(np.sqrt(n_bkg))

    # Generate simulation
    bkg_simulation = 10 * np.random.random(n_bkg)
    signal_simulation = np.random.normal(5, 1.0, n_signal)

    bkg_sample, _ = np.histogram(bkg_simulation, bins=_bins)
    signal_sample, _ = np.histogram(signal_simulation, bins=_bins)

    # Generate observations
    signal_events = np.random.normal(5, 1.0, int(n_signal * 0.8))
    bkg_events = 10 * np.random.random(int(n_bkg + np.sqrt(n_bkg)))

    observed_events = np.array(signal_events.tolist() + bkg_events.tolist())
    observed_sample, _ = np.histogram(observed_events, bins=_bins)

    # Visualize the simulation and observations
    fig, ax = plt.subplots()
    fig.set_size_inches(7, 5)

    plot_hist(ax, _bins, bkg_sample, label="Background")
    plot_hist(ax, _bins, signal_sample, bottom=bkg_sample, label="Signal")
    plot_data(ax, _bins, observed_sample)
    ax.legend(loc="best")
    ax.set_ylim(top=np.max(observed_sample) * 1.4)
    ax.set_xlabel("Observable")
    ax.set_ylabel("Count")
    fig.savefig("components.png")

    # Build the model
    bkg_uncerts = np.sqrt(bkg_sample)
    model = pyhf.simplemodels.hepdata_like(
        signal_data=signal_sample.tolist(),
        bkg_data=bkg_sample.tolist(),
        bkg_uncerts=bkg_uncerts.tolist(),
    )
    data = pyhf.tensorlib.astensor(observed_sample.tolist() + model.config.auxdata)

    # Perform inference
    fit_result = pyhf.infer.mle.fit(data, model, return_uncertainties=True)
    bestfit_pars, par_uncerts = fit_result.T
    print(
        f"best fit parameters:\
        \n * signal strength: {bestfit_pars[0]} +/- {par_uncerts[0]}\
        \n * nuisance parameters: {bestfit_pars[1:]}\
        \n * nuisance parameter uncertainties: {par_uncerts[1:]}"
    )

    # Perform hypothesis test scan
    _start = 0.0
    _stop = 5
    _step = 0.1
    poi_tests = np.arange(_start, _stop + _step, _step)

    print("\nPerforming hypothesis tests\n")
    hypo_tests = [
        pyhf.infer.hypotest(
            mu_test,
            data,
            model,
            return_expected_set=True,
            return_test_statistics=True,
            qtilde=True,
        )
        for mu_test in poi_tests
    ]

    # Upper limits on signal strength
    results = invert_interval(poi_tests, hypo_tests)

    print(f"Observed Limit on µ: {results['obs']:.2f}")
    print("-----")
    for idx, n_sigma in enumerate(np.arange(-2, 3)):
        print(
            "Expected {}Limit on µ: {:.3f}".format(
                " " if n_sigma == 0 else "({} σ) ".format(n_sigma),
                results["exp"][idx],
            )
        )

    # Visualize the "Brazil band"
    fig, ax = plt.subplots()
    fig.set_size_inches(7, 5)

    ax.set_title("Hypothesis Tests")
    ax.set_ylabel(r"$\mathrm{CL}_{s}$")
    ax.set_xlabel(r"$\mu$")

    pyhf.contrib.viz.brazil.plot_results(ax, poi_tests, hypo_tests)
    fig.savefig("brazil_band.png")


if __name__ == "__main__":
    main()
which when run gives
(question) $ python answer.py
best fit parameters:
* signal strength: 1.5884737977889158 +/- 0.7803435235862329
* nuisance parameters: [0.99020988 1.06040191 0.90488207 1.03531383 1.09093327 1.00942088
1.07789316 1.01125627 1.06202964 0.95780043 0.94990993 1.04893286
1.0560711 0.9758487 0.93692481 1.04683181 1.05785515 0.92381263
0.93812855 0.96751869]
* nuisance parameter uncertainties: [0.06966439 0.07632218 0.0611428 0.07230328 0.07872258 0.06899675
0.07472849 0.07403246 0.07613661 0.08606657 0.08002775 0.08655314
0.07564512 0.07308117 0.06743479 0.07383134 0.07460864 0.06632003
0.06683251 0.06270965]
Performing hypothesis tests
/home/stackoverflow/.venvs/question/lib/python3.7/site-packages/pyhf/infer/calculators.py:229: RuntimeWarning: invalid value encountered in double_scalars
teststat = (qmu - qmu_A) / (2 * self.sqrtqmuA_v)
Observed Limit on µ: 2.89
-----
Expected (-2 σ) Limit on µ: 0.829
Expected (-1 σ) Limit on µ: 1.110
Expected Limit on µ: 1.542
Expected (1 σ) Limit on µ: 2.147
Expected (2 σ) Limit on µ: 2.882
Let us know if you have any further questions!

mpi4py: Internal Error: invalid error code 409e0e (Ring ids do not match)

I am coding in Python and using mpi4py to do some optimization in parallel. I am using ordinary least squares (OLS), and my data is too large to fit on one processor, so I have a master process that spawns other processes. These child processes each import a section of the data that they work with throughout the optimization.
I am using scipy.optimize.minimize for the optimization, so the child processes receive a coefficient guess from the parent process and report the sum of squared errors (SSE) back to the parent, and scipy.optimize.minimize iterates, trying to find a minimum of the SSE. After each iteration of the minimize function, the parent broadcasts the new coefficient guesses to the child processes, which then calculate the SSE again. In the child processes, this algorithm is set up in a while loop. In the parent process, I simply call scipy.optimize.minimize.
The part that is giving me a problem is a nested optimization, i.e. an optimization within an optimization. The inner optimization is the OLS regression described above, and the outer optimization minimizes another function that uses the coefficient of the inner optimization (the OLS regression).
So in my parent process, I have two functions that I minimize, and the second function calls the first and does a new optimization for every iteration of its own optimization. The child processes have a nested while loop for those two optimizations.
Hopefully that all makes sense. If more information is needed, please let me know.
Here is the relevant code for the parent process:
comm = MPI.COMM_SELF.Spawn(sys.executable, args=['IVQTparallelSlave_cdf.py'], maxprocs=processes)

# First stage: reg D on Z, X
def OLS(betaguess):
    comm.Bcast([betaguess, MPI.DOUBLE], root=MPI.ROOT)
    SSE = np.array([0], dtype='d')
    comm.Reduce(None, [SSE, MPI.DOUBLE], op=MPI.SUM, root=MPI.ROOT)
    comm.Bcast([np.array([1], 'i'), MPI.INT], root=MPI.ROOT)
    return SSE

# Here is the CDF function.
def CDF(yguess, delta_FS, tau):
    # Calculate W(y) in the slave process
    # Solving the Reduced form after every iteration: reg W(y) on Z, X
    comm.Bcast([yguess, MPI.DOUBLE], root=MPI.ROOT)
    betaguess = np.zeros(94).astype('d')

    ###########
    # This calculates the reduced form coefficient
    coeffs_RF = scipy.minimize(OLS, betaguess, method='Powell')

    # This little block is to get the slave processes to stop
    comm.Bcast([betaguess, MPI.DOUBLE], root=MPI.ROOT)
    SSE = np.array([0], dtype='d')
    comm.Reduce(None, [SSE, MPI.DOUBLE], op=MPI.SUM, root=MPI.ROOT)
    cont = np.array([0], 'i')
    comm.Bcast([cont, MPI.INT], root=MPI.ROOT)
    ###########

    contCDF = np.array([1], 'i')
    comm.Bcast([contCDF, MPI.INT], root=MPI.ROOT)  # This is to keep the outer while loop going

    delta_RF = coeffs_RF.x[1]
    return abs(delta_RF/delta_FS - tau)

########### This one finds Y(1) ##############

betaguess = np.zeros(94).astype('d')

######### First Stage: reg D on Z, X #########
coeffs_FS = scipy.minimize(OLS, betaguess, method='Powell')
print coeffs_FS

# This little block is to get the slave processes' while loops to stop
comm.Bcast([betaguess, MPI.DOUBLE], root=MPI.ROOT)
SSE = np.array([0], dtype='d')
comm.Reduce(None, [SSE, MPI.DOUBLE], op=MPI.SUM, root=MPI.ROOT)
cont = np.array([0], 'i')
comm.Bcast([cont, MPI.INT], root=MPI.ROOT)

delta_FS = coeffs_FS.x[1]

######### CDF Function #########
yguess = np.array([3340], 'd')
CDF1 = lambda yguess: CDF(yguess, delta_FS, tau)
y_minned_1 = scipy.minimize(CDF1, yguess, method='Powell')
Here is the relevant code for the child processes:
#IVQTparallelSlave_cdf.py
comm = MPI.Comm.Get_parent()
.
.
.
# Importing data. The data is the matrices D, and ZX
.
.
.
########### This one finds Y(1) ##############

######### First Stage: reg D on Z, X #########
cont = np.array([1], 'i')
betaguess = np.zeros(94).astype('d')

# This corresponds to 'coeffs_FS = scipy.minimize(OLS,betaguess,method='Powell')' of the parent process
while cont[0]:
    comm.Bcast([betaguess, MPI.DOUBLE], root=0)
    SSE = np.array(((D - np.dot(ZX, betaguess).reshape(local_n, 1))**2).sum(), 'd')
    comm.Reduce([SSE, MPI.DOUBLE], None, op=MPI.SUM, root=0)
    comm.Bcast([cont, MPI.INT], root=0)

if rank == 0: print '1st Stage OLS regression done'

######### CDF Function #########
cont = np.array([1], 'i')
betaguess = np.zeros(94).astype('d')
contCDF = np.array([1], 'i')
yguess = np.array([0], 'd')

# This corresponds to 'y_minned_1 = scipy.minimize(CDF1,yguess,method='Powell')'
while contCDF[0]:
    comm.Bcast([yguess, MPI.DOUBLE], root=0)
    # This calculates the reduced form coefficient
    while cont[0]:
        comm.Bcast([betaguess, MPI.DOUBLE], root=0)
        W = 1*(Y <= yguess)*D
        SSE = np.array(((W - np.dot(ZX, betaguess).reshape(local_n, 1))**2).sum(), 'd')
        comm.Reduce([SSE, MPI.DOUBLE], None, op=MPI.SUM, root=0)
        comm.Bcast([cont, MPI.INT], root=0)
        #if rank==0: print cont
    comm.Bcast([contCDF, MPI.INT], root=0)
My problem is that after one iteration through the outer minimization, it spits out the following error:
Internal Error: invalid error code 409e0e (Ring ids do not match) in MPIR_Bcast_impl:1328
Traceback (most recent call last):
File "IVQTparallelSlave_cdf.py", line 100, in <module>
if rank==0: print 'CDF iteration'
File "Comm.pyx", line 406, in mpi4py.MPI.Comm.Bcast (src/mpi4py.MPI.c:62117)
mpi4py.MPI.Exception: Other MPI error, error stack:
PMPI_Bcast(1478).....: MPI_Bcast(buf=0x2409f50, count=1, MPI_INT, root=0, comm=0x84000005) failed
MPIR_Bcast_impl(1328):
I haven't been able to find any information about this "ring id" error or how to fix it. Help would be much appreciated. Thanks!

PID control - value of process parameter based on PID result

I'm trying to implement a PID controller following http://en.wikipedia.org/wiki/PID_controller
The mechanism I am trying to control works as follows:
1. I have an input variable which I can control. Typical values would be 0.5...10.
2. I have an output value which I measure daily. My goal is for the output to be roughly in the same range.
The two variables are strongly correlated: when the process parameter goes up, the output generally goes up, but there's quite a bit of noise.
I'm following the implementation here:
http://code.activestate.com/recipes/577231-discrete-pid-controller/
Now the PID output seems to be correlated with the error term, not with the measured level of the output. So my guess is that I am not supposed to use it as-is for the process variable, but rather as some correction to the current value? How is that supposed to work exactly?
For example, if we take Kp=1, Ki=Kd=0, the process (input) variable is 4, the current output level is 3, and my target is a value of 2, I get the following:
error = 2-3 = -1
PID = -1
Should I then set the process variable to -1? Or to 4-1=3?
You need to think of the PID controller as correcting a manipulated variable (MV) for errors, and you need an I term to reach an on-target steady-state result. The I term is how the PID retains and applies memory of the prior behavior of the system.
If you think of the controller's output as changes in the MV, it is more of a 'velocity form' PID, where the memory of prior errors and behavior is integrated and accumulated into the prior MV setting.
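For instance, a one-step sketch of that velocity form (dt = 1 assumed; the function name and gains are illustrative, not from the recipe linked above):

# The controller emits a *change* in MV; the memory of prior errors lives in
# the accumulated MV value itself rather than in an explicit integral term.
def velocity_pid_step(MV, e, e_prev, e_prev2, Kp=1.0, Ki=1.0, Kd=0.0):
    dMV = Kp*(e - e_prev) + Ki*e + Kd*(e - 2*e_prev + e_prev2)
    return MV + dMV

# With the question's numbers (MV=4, SV=2, PV=3, so e=-1) and a system that
# had been at steady state (e_prev = e_prev2 = 0):
print(velocity_pid_step(4.0, -1.0, 0.0, 0.0))   # 4 + (-1 - 1) = 2.0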
From your example, it seems that a manipulated value of -1 is not feasible and that you would like the controller to suggest a value like 3 to get a process output (PV) of 2. For a PID controller to make use of "The process (input) variable is 4, ..." (the MV in my terms), Ki must be non-zero, and if the system was at steady state, whatever was accumulated in the integral (sum_e = sum(e)) would precisely equal 4/Ki, so:
Kp = Ki = 1; Kd = 0
error = SV - PV = 2 - 3 = -1
sum_e = sum_e + error = 4/Ki - 1
MV = PID = Kp*error + Ki*sum_e = Kp*(-1) + Ki*(4/Ki - 1) = -1 + 4 - Ki = -1 + 4 - 1 = 2
If you used a slower Ki than 1, it would smooth out the noise more and not adjust the MV as quickly:
Ki = 0.1
MV = PID = Kp*(-1) + Ki*(4/Ki - 1) = -1 + 4 - 0.1 = 2.9
At steady state at target (PV = SV), sum_e * Ki should produce the steady-state MV:
PV = SV
error = SV - PV = 0
Kp * error = 0
MV = 3 = PID = 0 * Kp + Ki * sum_e
A nice way to understand the PID controller is to put units on everything and to think of Kp, Ki, and Kd as conversions of the process error, the accumulated error*timeUnit, and the rate of change of error/timeUnit into terms of the manipulated variable, with the controlled system converting the controller's manipulated variable into units of output.
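To make the arithmetic above concrete, here is a minimal positional-form sketch (assuming Kp = Ki = 1, Kd = 0, with the integral pre-loaded to 4/Ki as in the example):

# Positional-form PID: MV is recomputed from the current error and the full
# accumulated error history on every step.
Kp, Ki = 1.0, 1.0
SV, PV = 2.0, 3.0        # setpoint and measured process value
sum_e = 4.0 / Ki         # integral pre-loaded with the system's prior history

error = SV - PV          # -1
sum_e += error           # 4/Ki - 1 = 3
MV = Kp*error + Ki*sum_e # -1 + 3 = 2, matching the hand calculation above
print(MV)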
