Binary Image Classification - Validation loss is much higher than training loss - image

I'm facing a strange behaviour that I can't explain. During training, the loss (BinaryCrossentropy) on my validation set is really high, around 20 or even higher. But after training, when I predict on the test set, I get a loss lower than 1. Why is that? I went through my code over and over and can't find the problem.
I'm doing binary image classification of brain tumors on a dataset provided via Kaggle (link).
And you can find my notebook here: Google-Colab Notebook
My data is loaded this way:
batch_size = 20
train_ds = tf.keras.utils.image_dataset_from_directory(
    train_data_path,
    subset='training',
    seed=42,
    color_mode='grayscale',
    batch_size=batch_size,
    validation_split=0.30
)
valid_ds = tf.keras.utils.image_dataset_from_directory(
    train_data_path,
    subset='validation',
    seed=42,
    batch_size=batch_size,
    color_mode='grayscale',
    validation_split=0.30
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    test_data_path,
    color_mode='grayscale',
    batch_size=batch_size,
    shuffle=False
)
This is my model structure:
input_shape = image_batch[0].shape
# set up the model structure
model = tf.keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid")
])
model.summary()
This is my callback function which returns the plots during training:
import matplotlib.pyplot as plt
from IPython.display import clear_output
class PlotLearning(tf.keras.callbacks.Callback):
    """
    Callback to plot the learning curves of the model during training.
    """
    def on_train_begin(self, logs=None):
        self.metrics = {}
        for metric in logs or {}:
            self.metrics[metric] = []
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # Storing metrics
        print(logs)
        for metric in logs:
            if metric in self.metrics:
                self.metrics[metric].append(logs.get(metric))
            else:
                self.metrics[metric] = [logs.get(metric)]
        # Plotting
        metrics = [x for x in logs if 'val' not in x]
        f, axs = plt.subplots(1, len(metrics), figsize=(15,5))
        clear_output(wait=True)
        for i, metric in enumerate(metrics):
            axs[i].plot(range(1, epoch + 2),
                        self.metrics[metric],
                        label=metric)
            # Plot the validation curve only if that metric was logged
            if 'val_' + metric in logs:
                axs[i].plot(range(1, epoch + 2),
                            self.metrics['val_' + metric],
                            label='val_' + metric)
            axs[i].legend()
            axs[i].grid()
        plt.tight_layout()
        plt.show()
callbacks_list = [PlotLearning()]
and this is the part where I start the training
# compile model
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
              metrics=['accuracy']
              )
# fit model
history = model.fit(prep_train_ds,
                    epochs=30,
                    validation_data=valid_ds,
                    callbacks=callbacks_list)
This is the output of the callback function after the last epoch run through:
As you can see, the validation loss is really high and oscillates around 20, so I guess the model is overfitting.
But as mentioned above, here is what I get when I predict on the test set and calculate the binary cross-entropy. The loss is again less than 1, at least in the range of the training loss.
I tried many things, such as changing the batch size, because a batch might not contain enough samples of one class. Then I wanted to see whether it is overfitting, so I changed the number of filters, applied dropout, etc. But I couldn't get the loss down on the validation set. I'm quite new to the field of image classification and may be overlooking something.
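One thing that may be worth checking: the fit call trains on prep_train_ds but validates on valid_ds, and image_dataset_from_directory yields raw pixel values in [0, 255]. The post does not show how prep_train_ds is built, so the following is only a sketch under the assumption that it rescales pixels to [0, 1]. It uses a made-up one-weight model (W, B and the pixel values are hypothetical) in plain Python to show how binary cross-entropy can blow up when a model fitted on scaled inputs is evaluated on unscaled ones.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def binary_crossentropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# Hypothetical one-weight "model" that was fitted on pixels scaled to [0, 1].
W, B = 4.0, -2.0

def predict(pixels):
    return [sigmoid(W * x + B) for x in pixels]

raw_pixels = [30, 200, 90, 250]   # made-up 8-bit intensities
labels = [0, 1, 0, 1]
scaled_pixels = [p / 255.0 for p in raw_pixels]

loss_scaled = binary_crossentropy(labels, predict(scaled_pixels))
loss_raw = binary_crossentropy(labels, predict(raw_pixels))
print(f"loss on scaled inputs: {loss_scaled:.3f}")
print(f"loss on raw inputs:    {loss_raw:.3f}")
```

If this turns out to be the cause, passing the validation data through the same preprocessing as the training data before calling fit should bring the two losses into the same range.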


auto_arima() m value, and seasonal decomposition period parameter

I am working on ARIMA modeling. The data has hourly granularity, taken from 1st May 2022 to 8th June 2022. I am trying to forecast the next 30 days, i.e. 720 hours. I am confused by the points below; any pointers would be great.
1. I plotted the raw data and found no trend and no seasonality.
a) I ran seasonal_decompose() with a few period values. With period=1 the seasonal component is 0, which matches my understanding.
b) With period=12 (chosen arbitrarily), why does it show a seasonal component? Even if I plot without specifying period, for which the default value is 7, it still shows seasonality. Why?
2. I ran the forecast with seasonal=False, since I do not see any seasonality or trend in the raw plot, and got the plot below. How should it be interpreted, and what should be concluded?
3. Then I tried to capture the seasonality by resampling to daily data and plotting it, which confused me further.
a) With period=7 (the default for seasonal_decompose), I again see a seasonality of 4 days, although the raw plot does not show any.
The forecast for this resampled (daily) data is below.
I am now at a loss as to what to look at. The more I read, the more confused I get.
Below is the code that I am using.
import numpy as np
import pandas as pd
import pmdarima as pm
import matplotlib.pyplot as plt
df = pd.read_csv('~/Desktop/gru-scl/gru-scl-filtered.csv', index_col="time")
del df["Index"]
df.index = pd.to_datetime(df.index)
model = pm.auto_arima(df.bps, start_p=0, start_q=0,
                      test='adf',        # use adftest to find optimal 'd'
                      max_p=3, max_q=3,  # maximum p and q
                      m=24,              # frequency of series
                      d=None,            # let model determine 'd'
                      seasonal=False,    # No Seasonality
                      start_P=0,
                      D=0,
                      trace=True,
                      error_action='ignore',
                      suppress_warnings=True,
                      stepwise=True)
f_steps = 720
fc, confint = model.predict(n_periods=f_steps, return_conf_int=True)
# predict() may return a pandas Series depending on the pmdarima version
fc = np.asarray(fc)
fc_index = np.arange(len(df.bps), len(df.bps) + f_steps)
# mean of the forecast values
mean = fc.mean()
print(mean)
# make series for plotting purpose
fc_series = pd.Series(fc, index=fc_index)
lower_series = pd.Series(confint[:, 0], index=fc_index)
upper_series = pd.Series(confint[:, 1], index=fc_index)
# Plot actuals and forecast on the same integer (hour) axis
plt.plot(np.arange(len(df.bps)), df.bps.values, label="Actual values")
plt.plot(fc_series, color='darkgreen', label="Predicted values")
plt.fill_between(fc_index,
                 lower_series,
                 upper_series,
                 color='k', alpha=.15)
plt.legend(loc='upper left', fontsize=8)
plt.title('Forecast vs Actuals')
plt.xlabel("Hours since 1st May 2022")
plt.ylabel("Bps")
plt.show()
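On the question of why a decomposition shows a seasonal component for any period: a classical additive decomposition estimates the seasonal part by averaging the detrended values at each position within the chosen period, so it returns some repeating pattern even for pure noise. The sketch below reimplements that idea in plain Python to illustrate it. It is a simplified stand-in, not the library's code, and the function names and the synthetic noise series are assumptions made for the example.

```python
import random

def moving_average_trend(values, period):
    """Centered moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: half weight on both end points of the window
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend

def seasonal_component(values, period):
    """Average detrended value per position in the period, centered to mean 0."""
    trend = moving_average_trend(values, period)
    buckets = [[] for _ in range(period)]
    for t, (value, level) in enumerate(zip(values, trend)):
        if level is not None:
            buckets[t % period].append(value - level)
    means = [sum(b) / len(b) for b in buckets]
    center = sum(means) / period
    return [m - center for m in means]

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(24 * 39)]  # about 39 days of hourly noise

seasonal_12 = seasonal_component(noise, 12)
seasonal_1 = seasonal_component(noise, 1)
print([round(s, 3) for s in seasonal_12])
print(seasonal_1)
```

With period=1 the detrended series is identically zero, which matches the observation in a). With period=12 the pattern is non-zero even though the input has no seasonality, so the size of the seasonal component relative to the residual says more than its mere presence.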

Change all images in training set

I have a convolutional neural network, and I want to train it on images from the training set, but first they should be passed through my function change(tensor, float), which takes a tensor/image of shape [height, width, 3] and a float.
batch_size = 4
Loading data:
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)
Cnn architecture
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        # size of inputs [4,3,32,32]
        # size of labels [4]
        inputs = change(inputs, 0.1)  # <--- the error is raised here
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)  # [4, 10]
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0
print('Finished Training')
I am trying to apply the image function change, but it raises an object error.
Is there a quick way to fix it?
I am using a Julia function, but it works completely fine with other objects. Error message:
JULIA: MethodError: no method matching copy(::PyObject)
Closest candidates are:
copy(!Matched::T) where T<:SHA.SHA3_CTX at /opt/julia-1.7.2/share/julia/stdlib/v1.7/SHA/src/types.jl:213
copy(!Matched::T) where T<:SHA.SHA2_CTX at /opt/julia-1.7.2/share/julia/stdlib/v1.7/SHA/src/types.jl:212
copy(!Matched::Number) at /opt/julia-1.7.2/share/julia/base/number.jl:113
I would recommend putting the change function into the transforms list, so that the data is changed at the transformation stage.
partial from functools lets you fix some of the arguments, like this:
from functools import partial
from torchvision.transforms import Compose, RandomResizedCrop, ToTensor
def change(input, float):
    pass
# Use partial to fix number of params, such that change accepts only input
change_partial = partial(change, float=pass_float_value_here)
# Add change_partial to a list of transforms before or after converting to tensors
# (keep only one of the two change_partial entries below)
transforms = Compose([
    RandomResizedCrop(img_size),  # example
    # Add change_partial here if it operates on PIL Image
    change_partial,
    ToTensor(),  # convert to tensor
    # Add change_partial here if it operates on torch tensors
    change_partial,
])
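To make the partial idea concrete without needing torchvision, here is a small self-contained sketch. The change function, the factor argument and the compose helper are hypothetical stand-ins (the real change is the poster's Julia-backed function, and compose only mimics what a transform pipeline does); the point is just that partial turns a two-argument function into a one-argument callable that can sit in a chain of transforms.

```python
from functools import partial

def change(image, factor):
    """Hypothetical stand-in for change(tensor, float): scales every pixel."""
    return [[pixel * factor for pixel in row] for row in image]

def compose(*steps):
    """Minimal stand-in for a transform pipeline: apply the steps in order."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

def round_pixels(image):
    """Second example step, so the pipeline has more than one stage."""
    return [[round(pixel, 3) for pixel in row] for row in image]

# Fix the float argument once; the result accepts only the image
change_fixed = partial(change, factor=0.5)

pipeline = compose(change_fixed, round_pixels)
result = pipeline([[1.0, 2.0], [3.0, 4.0]])
print(result)
```

Because the function then runs per sample inside the dataset, it receives a single image rather than a batch, which is worth keeping in mind if change expects a particular shape.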

TensorFlow - directly calling tf.function much faster than calling tf.function returned from wrapper

I am training a VAE (using federated learning, but that is not so important) and wanted to keep the loss and train functions simple to exchange. The initial approach was to have a tf.function as loss function and a tf.function as train function as follows:
@tf.function
def kl_reconstruction_loss(model, model_input, beta):
    x, y = model_input
    mean, logvar = model.encode(x, y)
    z = model.reparameterize(mean, logvar)
    x_logit = model.decode(z, y)
    cross_ent = tf.nn.sigmoid_cross_entropy_with_logits(logits=x_logit, labels=x)
    reconstruction_loss = tf.reduce_mean(tf.reduce_sum(cross_ent, axis=[1, 2, 3]), axis=0)
    kl_loss = tf.reduce_mean(0.5 * tf.reduce_sum(tf.exp(logvar) + tf.square(mean) - 1. - logvar, axis=-1), axis=0)
    loss = reconstruction_loss + beta * kl_loss
    return loss, kl_loss, reconstruction_loss
@tf.function
def train_fn(model: tf.keras.Model, batch, optimizer, kl_beta):
    """Trains the model on a single batch.
    Args:
        model: The VAE model.
        batch: A batch of inputs [images, labels] for the vae.
        optimizer: The optimizer to train the model.
        kl_beta: Weighting of KL loss
    Returns:
        The loss.
    """
    def vae_loss():
        """Does the forward pass and computes losses for the generator."""
        # N.B. The complete pass must be inside loss() for gradient tracing.
        return kl_reconstruction_loss(model, batch, kl_beta)
    with tf.GradientTape() as tape:
        loss, kl_loss, rc_loss = vae_loss()
    grads = tape.gradient(loss, model.trainable_variables)
    grads_and_vars = zip(grads, model.trainable_variables)
    optimizer.apply_gradients(grads_and_vars)
    return loss
For my dataset this results in an epoch duration of approx. 25 seconds. However, since I have to call those functions directly in my code, I would have to enter different ones if I would want to try out different loss/train functions.
So, alternatively, I followed https://github.com/google-research/federated/tree/master/gans and wrapped the loss function in a class and the train function in another function. Now I have:
class VaeKlReconstructionLossFns(AbstractVaeLossFns):
    @tf.function
    def vae_loss(self, model, model_input, labels, global_round):
        # KL Reconstruction loss
        mean, logvar = model.encode(model_input, labels)
        z = model.reparameterize(mean, logvar)
        x_logit = model.decode(z, labels)
        cross_ent = tf.nn.sigmoid_cross_entropy_with_logits(logits=x_logit, labels=model_input)
        reconstruction_loss = tf.reduce_mean(tf.reduce_sum(cross_ent, axis=[1, 2, 3]), axis=0)
        kl_loss = tf.reduce_mean(0.5 * tf.reduce_sum(tf.exp(logvar) + tf.square(mean) - 1. - logvar, axis=-1), axis=0)
        loss = reconstruction_loss + self._get_beta(global_round) * kl_loss
        if model.losses:
            loss += tf.add_n(model.losses)
        return loss, kl_loss, reconstruction_loss
def create_train_vae_fn(
        vae_loss_fns: vae_losses.AbstractVaeLossFns,
        vae_optimizer: tf.keras.optimizers.Optimizer):
    """Create a function that trains VAE, binding loss and optimizer.
    Args:
        vae_loss_fns: Instance of vae_losses.AbstractVaeLossFns interface,
            specifying the VAE training loss.
        vae_optimizer: Optimizer for training the VAE.
    Returns:
        Function that executes one step of VAE training.
    """
    # We check that the optimizer has not been used previously, which ensures
    # that when it is bound the train fn isn't holding onto a different copy of
    # the optimizer variables than the copy that is being exchanged b/w server and
    # clients.
    if vae_optimizer.variables():
        raise ValueError(
            'Expected vae_optimizer to not have been used previously, but '
            'variables were already initialized.')
    @tf.function
    def train_vae_fn(model: tf.keras.Model,
                     model_inputs,
                     labels,
                     global_round,
                     new_optimizer_state=None):
        """Trains the model on a single batch.
        Args:
            model: The VAE model.
            model_inputs: A batch of inputs (usually images) for the VAE.
            labels: A batch of labels corresponding to the inputs.
            global_round: The current global FL round for beta calculation
            new_optimizer_state: A possible optimizer state to overwrite the current one with.
        Returns:
            The number of examples trained on.
            The loss.
            The updated optimizer state.
        """
        def vae_loss():
            """Does the forward pass and computes losses for the generator."""
            # N.B. The complete pass must be inside loss() for gradient tracing.
            return vae_loss_fns.vae_loss(model, model_inputs, labels, global_round)
        # Set optimizer vars
        optimizer_state = get_optimizer_state(vae_optimizer)
        if new_optimizer_state is not None:
            # if optimizer is uninitialised, initialise vars
            try:
                tf.nest.assert_same_structure(optimizer_state, new_optimizer_state)
            except ValueError:
                initialize_optimizer_vars(vae_optimizer, model)
                optimizer_state = get_optimizer_state(vae_optimizer)
                tf.nest.assert_same_structure(optimizer_state, new_optimizer_state)
            tf.nest.map_structure(lambda a, b: a.assign(b), optimizer_state, new_optimizer_state)
        with tf.GradientTape() as tape:
            loss, kl_loss, rc_loss = vae_loss()
        grads = tape.gradient(loss, model.trainable_variables)
        grads_and_vars = zip(grads, model.trainable_variables)
        vae_optimizer.apply_gradients(grads_and_vars)
        return tf.shape(model_inputs)[0], loss, optimizer_state
    return train_vae_fn
This new formulation takes about 86 seconds per epoch.
I am struggling to understand why the second version performs so much worse than the first one. Does anyone have a good explanation for this?
Thanks in advance!
EDIT: My Tensorflow version is 2.5.0
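To narrow down where the extra time goes, it can help to time individual train-step calls and compare the first call (which includes any one-off setup such as tracing) against the later ones. The helper below is a generic sketch in plain Python; time_calls, summarize and fake_step are hypothetical names, and fake_step merely imitates a step whose first call is slow. In practice step_fn would be a zero-argument wrapper around one call of either train function.

```python
import time

def time_calls(step_fn, n_calls=5):
    """Return the wall-clock duration in seconds of each of n_calls invocations."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        step_fn()
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations):
    """Separate the first call from the mean of the remaining calls."""
    first, rest = durations[0], durations[1:]
    steady = sum(rest) / len(rest) if rest else float("nan")
    return {"first_call": first, "steady_state_mean": steady}

calls = {"n": 0}

def fake_step():
    # Hypothetical stand-in: only the first call is slow, like a call with one-off setup
    calls["n"] += 1
    if calls["n"] == 1:
        time.sleep(0.05)

durations = time_calls(fake_step, n_calls=5)
report = summarize(durations)
print(report)
```

If the steady-state time per call is similar for both versions, the difference is in one-off or repeated setup work; if it differs, the per-step body itself (for example the optimizer state handling) is the place to look.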

Trying to put together a teaching-example with pyhf

I'm trying to learn more about pyhf and my understanding of what the goals are might be limited. I would love to fit my HEP data outside of ROOT, but I could be imposing expectations on pyhf which are not what the authors intended for its use.
I'd like to write myself a hello-world example, but I might just not know what I'm doing. My misunderstanding could also be gaps in my statistical knowledge.
With that preface, let me explain what I'm trying to explore.
I have some observed set of events for which I calculate some observable and make a binned histogram of that data. I hypothesize that there are two contributing physics processes, which I call signal and background. I generate some Monte Carlo samples for these processes and the theorized total number of events is close to, but not exactly what I observe.
I would like to:
Fit the data to this two process hypothesis
Get from the fit the optimal values for the number of events for each process
Get the uncertainties on these fitted values
If appropriate, calculate an upper limit on the number of signal events.
My starter code is below, where all I'm doing is an ML fit but I'm not sure where to go. I know it's not set up to do what I want, but I'm getting lost in the examples I find on RTD. I'm sure it's me, this is not a criticism of the documentation.
import pyhf
import numpy as np
import matplotlib.pyplot as plt
nbins = 15
# Generate a background and signal MC sample
MC_signal_events = np.random.normal(5,1.0,200)
MC_background_events = 10*np.random.random(1000)
signal_data = np.histogram(MC_signal_events,bins=nbins)[0]
bkg_data = np.histogram(MC_background_events,bins=nbins)[0]
# Generate an observed dataset with a slightly different
# number of events
signal_events = np.random.normal(5,1.0,180)
background_events = 10*np.random.random(1050)
observed_events = np.array(signal_events.tolist() + background_events.tolist())
observed_sample = np.histogram(observed_events,bins=nbins)[0]
# Plot these samples, if you like
plt.figure(figsize=(12,4))
plt.subplot(1,3,1)
plt.hist(observed_events,bins=nbins,label='Observations')
plt.legend()
plt.subplot(1,3,2)
plt.hist(MC_signal_events,bins=nbins,label='MC signal')
plt.legend()
plt.subplot(1,3,3)
plt.hist(MC_background_events,bins=nbins,label='MC background')
plt.legend()
# Use a very naive estimate of the background
# uncertainties
bkg_uncerts = np.sqrt(bkg_data)
print("Defining the PDF.......")
pdf = pyhf.simplemodels.hepdata_like(signal_data=signal_data.tolist(),
                                     bkg_data=bkg_data.tolist(),
                                     bkg_uncerts=bkg_uncerts.tolist())
print("Fit.......")
data = pyhf.tensorlib.astensor(observed_sample.tolist() + pdf.config.auxdata)
bestfit_pars, twice_nll = pyhf.infer.mle.fit(data, pdf, return_fitted_val=True)
print(bestfit_pars)
print(twice_nll)
plt.show()
Note: this answer is based on pyhf v0.5.2.
Alright, so it looks like you've managed to figure out most of the big pieces. However, there are two different ways to do this, depending on how you prefer to set things up. In both cases, I assume you want an unconstrained fit and you want to...
fit your signal+background model to observed data
fit your background model to observed data
First, let's discuss uncertainties briefly. At the moment, we default to numpy for the tensor backend and scipy for the optimizer. See documentation:
numpy backend
scipy optimizer
However, one unfortunate drawback right now with the scipy optimizer is that it cannot return the uncertainties. What you need to do anywhere in your code before the fit (although we generally recommend as early as possible) is to use the minuit optimizer instead:
pyhf.set_backend('numpy', 'minuit')
This will get you the nice features of being able to get the correlation matrix, the uncertainties on the fitted parameters, and the hessian -- amongst other things. We're working to make this consistent for scipy as well, but this is not ready right now.
All optimizations go through our optimizer API which you can currently view through the mixin here in our documentation. Specifically, the signature is
minimize(
    objective,
    data,
    pdf,
    init_pars,
    par_bounds,
    fixed_vals=None,
    return_fitted_val=False,
    return_result_obj=False,
    do_grad=None,
    do_stitch=False,
    **kwargs)
There are a lot of options here. Let's just focus on the fact that one of the keyword arguments we can pass through is return_uncertainties which will change the bestfit parameters by adding a column for the fitted parameter uncertainty which you want.
1. Signal+Background
In this case, we want to just use the default model
result, twice_nll = pyhf.infer.mle.fit(
    data,
    pdf,
    return_uncertainties=True,
    return_fitted_val=True
)
bestfit_pars, errors = result.T
2. Background-Only
In this case, we need to turn off the signal. The way we do this is by setting the parameter of interest (POI) fixed to 0.0. Then we can get the fitted parameters for the background-only model in a similar way, but using fixed_poi_fit instead of an unconstrained fit:
result, twice_nll = pyhf.infer.mle.fixed_poi_fit(
    0.0,
    data,
    pdf,
    return_uncertainties=True,
    return_fitted_val=True
)
bestfit_pars, errors = result.T
Note that this is quite simply a quick way of doing the following unconstrained fit
bkg_params = pdf.config.suggested_init()
fixed_params = pdf.config.suggested_fixed()
bkg_params[pdf.config.poi_index] = 0.0
fixed_params[pdf.config.poi_index] = True
result, twice_nll = pyhf.infer.mle.fit(
    data,
    pdf,
    init_pars=bkg_params,
    fixed_params=fixed_params,
    return_uncertainties=True,
    return_fitted_val=True
)
bestfit_pars, errors = result.T
Hopefully that clears things up!
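To go from the fitted parameters to event counts per process, one option is to apply them to the input templates. The sketch below assumes the structure of the hepdata_like model, where the expected count in bin i is mu * signal_i + gamma_i * background_i, and assumes the parameter order [mu, gamma_0, gamma_1, ...]. All numbers are made up for illustration, expected_yields is a hypothetical helper rather than part of pyhf, and the uncertainty is a first-order estimate that ignores correlations between parameters.

```python
def expected_yields(mu, gammas, signal_counts, bkg_counts):
    """Per-process totals for a model of the form mu * s_i + gamma_i * b_i per bin."""
    signal_total = mu * sum(signal_counts)
    bkg_total = sum(g * b for g, b in zip(gammas, bkg_counts))
    return signal_total, bkg_total

# Made-up inputs (three bins), standing in for the MC templates and the fit result
signal_counts = [2.0, 10.0, 3.0]
bkg_counts = [50.0, 52.0, 48.0]
bestfit_pars = [1.2, 1.0, 0.95, 1.05]   # [mu, gamma_0, gamma_1, gamma_2]
errors = [0.4, 0.07, 0.07, 0.07]

mu, gammas = bestfit_pars[0], bestfit_pars[1:]
n_sig, n_bkg = expected_yields(mu, gammas, signal_counts, bkg_counts)

# First-order uncertainty on the signal yield, ignoring parameter correlations
n_sig_err = errors[0] * sum(signal_counts)

print(f"signal events:     {n_sig:.1f} +/- {n_sig_err:.1f}")
print(f"background events: {n_bkg:.1f}")
```

With a real model, pdf.config.poi_index gives the position of the signal strength in the parameter vector, which is safer than assuming it comes first.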
Giordon's solution should answer all of your questions, but I thought I'd also write out the code to address basically everything we can.
I also take the liberty of changing some of your values a bit so that the signal isn't so strong that the observed CLs value is far off to the right of the Brazil band (the results aren't wrong obviously, but it probably makes more sense to be talking about using the discovery test statistic at that point than setting limits. :))
Environment
For this example I'm going to set up a clean Python 3 virtual environment and then install the dependencies (here we're going to be using pyhf v0.5.2)
$ python3 -m venv "${HOME}/.venvs/question"
$ . "${HOME}/.venvs/question/bin/activate"
(question) $ cat requirements.txt
pyhf[minuit,contrib]~=0.5.2
black
(question) $ python -m pip install -r requirements.txt
Code
While we can't easily get the best fit value for both the number of signal events as well as the background events we definitely can do inference to get the best fit value for the signal strength.
The following chunk of code (which is long only because of the visualization) should address all of the points of your question.
# answer.py
import numpy as np
import pyhf
import matplotlib.pyplot as plt
import pyhf.contrib.viz.brazil
# Goals:
# - Fit the model to the observed data
# - Infer the best fit signal strength given the model
# - Get the uncertainties on the best fit signal strength
# - Calculate a 95% CL upper limit on the signal strength
def plot_hist(ax, bins, data, bottom=0, color=None, label=None):
    bin_width = bins[1] - bins[0]
    bin_leftedges = bins[:-1]
    bin_centers = [edge + bin_width / 2.0 for edge in bin_leftedges]
    ax.bar(
        bin_centers, data, bin_width, bottom=bottom, alpha=0.5, color=color, label=label
    )
def plot_data(ax, bins, data, label="Data"):
    bin_width = bins[1] - bins[0]
    bin_leftedges = bins[:-1]
    bin_centers = [edge + bin_width / 2.0 for edge in bin_leftedges]
    ax.scatter(bin_centers, data, color="black", label=label)
def invert_interval(test_mus, hypo_tests, test_size=0.05):
    # This will be taken care of in v0.5.3
    cls_obs = np.array([test[0] for test in hypo_tests]).flatten()
    cls_exp = [
        np.array([test[1][idx] for test in hypo_tests]).flatten() for idx in range(5)
    ]
    crossing_test_stats = {"exp": [], "obs": None}
    for cls_exp_sigma in cls_exp:
        crossing_test_stats["exp"].append(
            np.interp(
                test_size, list(reversed(cls_exp_sigma)), list(reversed(test_mus))
            )
        )
    crossing_test_stats["obs"] = np.interp(
        test_size, list(reversed(cls_obs)), list(reversed(test_mus))
    )
    return crossing_test_stats
def main():
    np.random.seed(0)
    pyhf.set_backend("numpy", "minuit")
    observable_range = [0.0, 10.0]
    bin_width = 0.5
    _bins = np.arange(observable_range[0], observable_range[1] + bin_width, bin_width)
    n_bkg = 2000
    n_signal = int(np.sqrt(n_bkg))
    # Generate simulation
    bkg_simulation = 10 * np.random.random(n_bkg)
    signal_simulation = np.random.normal(5, 1.0, n_signal)
    bkg_sample, _ = np.histogram(bkg_simulation, bins=_bins)
    signal_sample, _ = np.histogram(signal_simulation, bins=_bins)
    # Generate observations
    signal_events = np.random.normal(5, 1.0, int(n_signal * 0.8))
    bkg_events = 10 * np.random.random(int(n_bkg + np.sqrt(n_bkg)))
    observed_events = np.array(signal_events.tolist() + bkg_events.tolist())
    observed_sample, _ = np.histogram(observed_events, bins=_bins)
    # Visualize the simulation and observations
    fig, ax = plt.subplots()
    fig.set_size_inches(7, 5)
    plot_hist(ax, _bins, bkg_sample, label="Background")
    plot_hist(ax, _bins, signal_sample, bottom=bkg_sample, label="Signal")
    plot_data(ax, _bins, observed_sample)
    ax.legend(loc="best")
    ax.set_ylim(top=np.max(observed_sample) * 1.4)
    ax.set_xlabel("Observable")
    ax.set_ylabel("Count")
    fig.savefig("components.png")
    # Build the model
    bkg_uncerts = np.sqrt(bkg_sample)
    model = pyhf.simplemodels.hepdata_like(
        signal_data=signal_sample.tolist(),
        bkg_data=bkg_sample.tolist(),
        bkg_uncerts=bkg_uncerts.tolist(),
    )
    data = pyhf.tensorlib.astensor(observed_sample.tolist() + model.config.auxdata)
    # Perform inference
    fit_result = pyhf.infer.mle.fit(data, model, return_uncertainties=True)
    bestfit_pars, par_uncerts = fit_result.T
    print(
        "best fit parameters:"
        f"\n * signal strength: {bestfit_pars[0]} +/- {par_uncerts[0]}"
        f"\n * nuisance parameters: {bestfit_pars[1:]}"
        f"\n * nuisance parameter uncertainties: {par_uncerts[1:]}"
    )
    # Perform hypothesis test scan
    _start = 0.0
    _stop = 5
    _step = 0.1
    poi_tests = np.arange(_start, _stop + _step, _step)
    print("\nPerforming hypothesis tests\n")
    hypo_tests = [
        pyhf.infer.hypotest(
            mu_test,
            data,
            model,
            return_expected_set=True,
            return_test_statistics=True,
            qtilde=True,
        )
        for mu_test in poi_tests
    ]
    # Upper limits on signal strength
    results = invert_interval(poi_tests, hypo_tests)
    print(f"Observed Limit on µ: {results['obs']:.2f}")
    print("-----")
    for idx, n_sigma in enumerate(np.arange(-2, 3)):
        print(
            "Expected {}Limit on µ: {:.3f}".format(
                " " if n_sigma == 0 else "({} σ) ".format(n_sigma),
                results["exp"][idx],
            )
        )
    # Visualize the "Brazil band"
    fig, ax = plt.subplots()
    fig.set_size_inches(7, 5)
    ax.set_title("Hypothesis Tests")
    ax.set_ylabel(r"$\mathrm{CL}_{s}$")
    ax.set_xlabel(r"$\mu$")
    pyhf.contrib.viz.brazil.plot_results(ax, poi_tests, hypo_tests)
    fig.savefig("brazil_band.png")
if __name__ == "__main__":
    main()
which when run gives
(question) $ python answer.py
best fit parameters:
* signal strength: 1.5884737977889158 +/- 0.7803435235862329
* nuisance parameters: [0.99020988 1.06040191 0.90488207 1.03531383 1.09093327 1.00942088
1.07789316 1.01125627 1.06202964 0.95780043 0.94990993 1.04893286
1.0560711 0.9758487 0.93692481 1.04683181 1.05785515 0.92381263
0.93812855 0.96751869]
* nuisance parameter uncertainties: [0.06966439 0.07632218 0.0611428 0.07230328 0.07872258 0.06899675
0.07472849 0.07403246 0.07613661 0.08606657 0.08002775 0.08655314
0.07564512 0.07308117 0.06743479 0.07383134 0.07460864 0.06632003
0.06683251 0.06270965]
Performing hypothesis tests
/home/stackoverflow/.venvs/question/lib/python3.7/site-packages/pyhf/infer/calculators.py:229: RuntimeWarning: invalid value encountered in double_scalars
teststat = (qmu - qmu_A) / (2 * self.sqrtqmuA_v)
Observed Limit on µ: 2.89
-----
Expected (-2 σ) Limit on µ: 0.829
Expected (-1 σ) Limit on µ: 1.110
Expected Limit on µ: 1.542
Expected (1 σ) Limit on µ: 2.147
Expected (2 σ) Limit on µ: 2.882
Let us know if you have any further questions!
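The upper limits above come from finding the signal strength at which the CLs curve crosses the test size of 0.05, using linear interpolation between scan points. Here is a dependency-free sketch of that step; upper_limit is a hypothetical helper and the scan values are made up, so the printed number has nothing to do with the result above.

```python
def upper_limit(test_mus, cls_values, test_size=0.05):
    """Linearly interpolate the mu at which CLs first drops to test_size."""
    points = list(zip(test_mus, cls_values))
    for (mu_lo, cls_lo), (mu_hi, cls_hi) in zip(points, points[1:]):
        # Look for the first pair of scan points that brackets the test size
        if cls_lo >= test_size >= cls_hi and cls_lo != cls_hi:
            frac = (cls_lo - test_size) / (cls_lo - cls_hi)
            return mu_lo + frac * (mu_hi - mu_lo)
    return None  # no crossing inside the scanned range

# Made-up scan: CLs falls as the tested signal strength grows
scan_mus = [0.0, 1.0, 2.0, 3.0]
scan_cls = [1.0, 0.4, 0.1, 0.02]

limit = upper_limit(scan_mus, scan_cls)
no_limit = upper_limit(scan_mus, [1.0, 0.9, 0.8, 0.7])
print(limit, no_limit)
```

Returning None when there is no crossing makes it obvious that the scan range needs to be extended, instead of silently reporting the edge of the scan as the limit.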

LSTM - future value prediction error

After some research, I was able to predict the future value using the LSTM code below. I have also attached the Dmd1ahr.csv file in the github link that I am using.
https://github.com/ukeshchawal/hello-world/blob/master/Dmd1ahr.csv
As you can see below, the first 90 data points are the training set and the 91st to 100th are the future value predictions.
However, some of the questions that I still have are:
In order to predict these values I originally had to take more than a hundred data points (here, I have taken 500), which is not exactly my primary goal. Is there a way, given 500 data points, to predict the next 10 or 20 out-of-sample data points? If yes, could you please write sample code that takes 500 data points from the attached Dmd1ahr.csv file and predicts some future values (say 501 to 520) based on those 500 points?
The predictions are way off compared to the ones in your blog posts (which definitely indicates a need for parameter tuning; I tried changing the epochs, LSTM layers, activation, and optimizer). What other parameter tuning can I do to make it more robust?
Thank you all in advance.
import numpy as np
import matplotlib.pyplot as plt
import pandas
# By tweaking the architecture it could be made more robust
np.random.seed(7)
numOfSamples = 500
lengthTrain = 90
lengthValidation = 100
look_back = 1 # Can be set higher, in my experiments it made performance worse though
transientTime = 90 # Time to "burn in" time series
series = pandas.read_csv('Dmd1ahr.csv')
def generateTrainData(series, i, look_back):
    return series[i:look_back+i+1]
trainX = np.stack([generateTrainData(series, i, look_back) for i in range(lengthTrain)])
testX = np.stack([generateTrainData(series, lengthTrain + i, look_back) for i in range(lengthValidation)])
trainX = trainX.reshape((lengthTrain,look_back+1,1))
testX = testX.reshape((lengthValidation, look_back + 1, 1))
trainY = trainX[:,1:,:]
trainX = trainX[:,:-1,:]
testY = testX[:,1:,:]
testX = testX[:,:-1,:]
############### Build Model ###############
import keras
from keras.models import Model
from keras import layers
from keras import regularizers
inputs = layers.Input(batch_shape=(1,look_back,1), name="main_input")
inputsAux = layers.Input(batch_shape=(1,look_back,1), name="aux_input")
# these layers make the actual prediction, i.e. decide if and how much it goes up or down
# each LSTM layer takes the output of the previous one
x = layers.LSTM(300, return_sequences=True, stateful=True)(inputs)
x = layers.LSTM(200, return_sequences=True, stateful=True)(x)
x = layers.LSTM(100, return_sequences=True, stateful=True)(x)
x = layers.LSTM(50, return_sequences=True, stateful=True)(x)
x = layers.TimeDistributed(layers.Dense(1, activation="linear",
                                        kernel_regularizer=regularizers.l2(0.005),
                                        activity_regularizer=regularizers.l1(0.005)))(x)
# auxiliary input, the current input will be fed directly to the output
# this way the prediction from the step before will be used as a "base", and the network just has to
# learn if it goes a little up or down
auxX = layers.TimeDistributed(layers.Dense(1,
                                           kernel_initializer=keras.initializers.Constant(value=1),
                                           bias_initializer='zeros',
                                           input_shape=(1,1), activation="linear", trainable=False
                                           ))(inputsAux)
outputs = layers.add([x, auxX], name="main_output")
model = Model(inputs=[inputs, inputsAux], outputs=outputs)
model.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=['mean_squared_error'])
#model.summary()
#model.fit({"main_input": trainX, "aux_input": trainX[look_back-1,look_back,:]},{"main_output": trainY}, epochs=4, batch_size=1, shuffle=False)
model.fit({"main_input": trainX, "aux_input": trainX[:,look_back-1,:].reshape(lengthTrain,1,1)},{"main_output": trainY}, epochs=100, batch_size=1, shuffle=False)
############### make predictions ###############
burnedInPredictions = np.zeros(transientTime)
testPredictions = np.zeros(len(testX))
# burn series in, here use first transientTime number of samples from test data
for i in range(transientTime):
    prediction = model.predict([np.array(testX[i, :, 0].reshape(1, look_back, 1)), np.array(testX[i, look_back - 1, 0].reshape(1, 1, 1))])
    testPredictions[i] = prediction[0,0,0]
burnedInPredictions[:] = testPredictions[:transientTime]
# prediction, now don't use any previous data whatsoever anymore, network just has to run on its own output
for i in range(transientTime, len(testX)):
    prediction = model.predict([prediction, prediction])
    testPredictions[i] = prediction[0,0,0]
# for plotting reasons
testPredictions[:np.size(burnedInPredictions)-1] = np.nan
############### plot results ###############
#import matplotlib.pyplot as plt
plt.plot(testX[:, 0, 0])
plt.show()
plt.plot(burnedInPredictions, label = "training")
plt.plot(testPredictions, label = "prediction")
plt.legend()
plt.show()
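For the first question (use 500 points to predict points 501 to 520), the general pattern is to train a one-step predictor on sliding windows and then roll it forward, feeding each prediction back in as input. The sketch below shows only that pattern in plain Python. make_windows, forecast and naive_drift are hypothetical names, and naive_drift is a trivial stand-in for a trained model so that the example runs without Keras or the CSV file.

```python
def make_windows(series, look_back):
    """Split a 1-D sequence into (input window, next value) training pairs."""
    xs, ys = [], []
    for i in range(len(series) - look_back):
        xs.append(series[i:i + look_back])
        ys.append(series[i + look_back])
    return xs, ys

def forecast(history, predict_next, look_back, n_steps):
    """Roll a one-step predictor forward n_steps, feeding predictions back in."""
    window = list(history[-look_back:])
    predictions = []
    for _ in range(n_steps):
        next_value = predict_next(window)
        predictions.append(next_value)
        window = window[1:] + [next_value]
    return predictions

def naive_drift(window):
    """Hypothetical stand-in for a trained model: last value plus the last step."""
    return window[-1] + (window[-1] - window[-2])

# Made-up series of 500 points; in practice this would be the first 500 rows of the CSV
history = [float(v) for v in range(500)]

xs, ys = make_windows(history, look_back=3)
future = forecast(history, naive_drift, look_back=3, n_steps=20)
print(future[:3], future[-1])
```

With a trained network, predict_next would wrap model.predict on the reshaped window. Errors compound in this setup because each step consumes earlier predictions, so forecasts usually degrade as the horizon grows.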
