binary:logistic like parameter in LightGBM - lightgbm

I want my predictions in probabilities between 0 and 1. I already did that in xgboost but I wanna try out Lightgbm too but its outputting solid predictions(that is in integer only). I could do that in XGBoost by setting 'objective' parameter to binary:logistic but in Lightgbm there doesn't seem to be any parameter like that, It only has binary and it is giving output in 0 or 1.

To get the class probability between 0 and 1 in lightgbm, you have to use a default value of a parameter "objective" is a regression.
'objective' = 'binary' ( return class label 0 or 1)
'objective' = 'regression' ( return class probability between 0 and 1)

You can do it by setting objective: “multiclass” with num_class: 2 as parameters. The results might not be the same with direct binary classification model yet I can ensure you that there will be no performance loss.
Bonus: As loss metric, you can use “multi_error” or “multi_logloss” or interestingly a combination of both like:
metric: “multi_error”, “multi_logloss”

You can use predict(raw_score=True)
If you are using the sklearn API - You can use objective "binary", just use predict_proba() instead of predict()

Related

Tuning max_depth in Random Forest using CARET

I'm building a Random Forest with Caret package on R with method = "rf". I see that every type of random forest on caret seems only tune mtry which is the number of features selected randomly for each tree. I do not understand why max_depth of each tree is not a tunable parameter (like cart) ? In my mind, it is a parameter which can limit over-fitting.
For example, my rf seems really better on train data than the test data :
model <- train(
group ~., data = train.data, method = "rf",
trControl = trainControl("repeatedcv", number = 5,repeats =10),
tuneLength=5
)
> postResample(fitted(model),train.data$group)
Accuracy Kappa
0.9574592 0.9745841
> postResample(predict(model,test.data),test.data$group)
Accuracy Kappa
0.7333333 0.5428571
As you can see my model is clearly over-fitted. However, I tried a lot of different things to handle this but nothing worked. I always have something like 0.7 accuracy on test data and 0.95 on train data. This is why I want to optimize other parameters.
I cannot share my data to reproduce this.

SARIMAX model in PyMC3

I would like to write down the following SARIMAX model (2,0,0) (2,0,0,12) in PyMC3 to perform bayesian estimation of its coefficients but I cannot figure out how to start with the seasonal part
Has anyone tries something like this?
with pm.Model() as ar2:
theta = pm.Normal("theta", 0.0, 1.0, shape=2)
sigma = pm.HalfNormal("sigma", 3)
likelihood = pm.AR("y", theta, sigma=sigma, observed=data)
trace = pm.sample(
1000,
tune=2000,
random_seed=13,
)
idata = az.from_pymc3(trace)
Although it would be best (e.g. best performance) if you can get an answer that uses PyMC3 exclusively, in case that does not exist yet, there is an alternative way to do this that uses the SARIMAX model in Statsmodels in combination with PyMC3.
There are too many details to repeat a full answer here, but basically you wrap the log-likelihood and gradient methods associated with a Statsmodels SARIMAX model. Here is a link to an example Jupyter notebook that shows how to do this:
https://www.statsmodels.org/stable/examples/notebooks/generated/statespace_sarimax_pymc3.html
I'm not sure if you'll still need it, however, expanding on cfulton's answer, here is how to fix the error in the statsmodels example (https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_sarimax_pymc3.html, cell 8):
with pm.Model():
# Priors
arL1 = pm.Uniform('ar.L1', -0.99, 0.99)
maL1 = pm.Uniform('ma.L1', -0.99, 0.99)
sigma2 = pm.InverseGamma('sigma2', 2, 4)
# convert variables to tensor vectors
# # this is wrong:
theta = tt.as_tensor_variable([arL1, maL1, sigma2])
# # this is correct:
theta = tt.as_tensor_variable([arL1, maL1, sigma2], 'v')
# use a DensityDist (use a lamdba function to "call" the Op)
# # this is wrong:
# pm.DensityDist('likelihood', lambda v: loglike(v), observed={'v': theta})
# # this is correct:
pm.DensityDist('likelihood', lambda v: loglike(v), observed=theta)
# Draw samples
trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, cores=4)
I'm no pymc3/theano expert, but I think the error means that Theano has failed to associate the tensor's name with the values. If you define the name along with the values right at the beginning, it works.
I know it's not a direct answer to your question. Nevertheless, I hope it helps.

Variable.assign(value) on Multi-GPU with Tensorflow 2

I have a model that works perfectly on a single GPU as follows:
alpha = tf.Variable(alpha,
name='ws_alpha',
trainable=False,
dtype=tf.float32,
aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA,
)
...
class CustomModel(tf.keras.Model):
#tf.function
def train_step(inputs):
...
alpha.assign_add(increment)
...
model.fit(dataset, epochs=10)
However, when I run on multiple GPUs, the assignment is not being done. It works for two training steps, and then remains the same over the whole epoch.
The alpha is for a weighted sum of two layers e.g. out = a*Layer1 + (1-a)*Layer2. It is not a trainable parameter, but something akin to a step_count variable.
Has anyone had experience with assigning individual values in a multi-GPU setting on tensorflow 2?
Would it be better to assign the variable as:
with tf.device("CPU:0"):
alpha = tf.Variable()
?
Simple fix, as per tensorflow issues
alpha = tf.Variable(alpha,
name='ws_alpha',
trainable=False,
dtype=tf.float32,
aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA,
synchronization=tf.VariableSynchronization.ON_READ,
)

Using tf.metrics.mean_iou during training

I want to train a model using the tensorflow estimator and want to track multiple metrics during training end evaluation. The metrics i want to track are accruacy and mean intersection-over-union (and my loss).
I managed to figure out how to track the accuracy during training:
if mode == tf.estimator.ModeKeys.TRAIN:
...
accuracy = tf.metrics.accuracy(labels=indices_ground_truth, predictions=indices_prediction, name='acc_op')
tf.summary.scalar('accuracy', accuracy[1])
and evaluation:
if mode == tf.estimator.ModeKeys.EVAL:
...
accuracy = tf.metrics.accuracy(labels=indices_ground_truth, predictions=indices_prediction)
eval_metric_ops = {'accuracy': accuracy}
return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops=eval_metric_ops)
For evaluation the mean intersection over union works the same. So its actually:
if mode == tf.estimator.ModeKeys.EVAL:
...
miou = tf.metrics.mean_iou(labels=indices_ground_truth, predictions=indices_prediction, num_classes=13)
accuracy = tf.metrics.accuracy(labels=indices_ground_truth, predictions=indices_prediction)
eval_metric_ops = {'miou': miou,
'accuracy': accuracy}
return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops=eval_metric_ops)
As far as i know i have to track the update operation (the second return value) on the value during training. Otherwise it returns 0 every time. For a single value like the accuracy that works.
But for the miou the second return value is the update operation of the confusion matrix used to calculate the miou. Thats a [numClass,numClass] tensor. If i try to track it like the accuracy tf.summary.scalar('miou', miou[1]) it crashes because a [numClass,numClass] tensor is not a scalar.
tf.summary.scalar('miou', miou[0]) gives me 0s everytime.
So how can i give the miou to the summary?
Here is how I calculate the IoU while training:
mIoU, update_op = tf.contrib.metrics.streaming_mean_iou(predict, raw_gt, num_classes=2, weights=None)
tf.summary.scalar('meanIoU', mIoU)
confusion_matrix, _ = sess.run([update_op, train_op], feed_dict=feed_dict)
iou = sess.run(mIoU)
print('iou score = {:.3f}, ({:.3f} sec/step)'.format(iou, duration))
You don't need to track the confusion matrix output to track the IoU on tensorboard. The above works fine for me. I think, what you are missing is running the tensors in your session. You need to run update_op such as sess.run(update_op), while running metric operations as sess.run(iou)

Using if conditions inside a TensorFlow graph

In tensorflow CIFAR-10 tutorial in cifar10_inputs.py line 174 it is said you should randomize the order of the operations random_contrast and random_brightness for better data augmentation.
To do so the first thing I think of is drawing a random variable from the uniform distribution between 0 and 1 : p_order. And do:
if p_order>0.5:
distorted_image=tf.image.random_contrast(image)
distorted_image=tf.image.random_brightness(distorted_image)
else:
distorted_image=tf.image.random_brightness(image)
distorted_image=tf.image.random_contrast(distorted_image)
However there are two possible options for getting p_order:
1) Using numpy which disatisfies me as I wanted pure TF and that TF discourages its user to mix numpy and tensorflow
2) Using TF, however as p_order can only be evaluated in a tf.Session()
I do not really know if I should do:
with tf.Session() as sess2:
p_order_tensor=tf.random_uniform([1,],0.,1.)
p_order=float(p_order_tensor.eval())
All those operations are inside the body of a function and are run from another script which has a different session/graph. Or I could pass the graph from the other script as an argument to this function but I am confused.
Even the fact that tensorflow functions like this one or inference for example seem to define the graph in a global fashion without explicitly returning it as an output is a bit hard to understand for me.
You can use tf.cond(pred, fn1, fn2, name=None) (see doc).
This function allows you to use the boolean value of pred inside the TensorFlow graph (no need to call self.eval() or sess.run(), hence no need of a Session).
Here is an example of how to use it:
def fn1():
distorted_image=tf.image.random_contrast(image)
distorted_image=tf.image.random_brightness(distorted_image)
return distorted_image
def fn2():
distorted_image=tf.image.random_brightness(image)
distorted_image=tf.image.random_contrast(distorted_image)
return distorted_image
# Uniform variable in [0,1)
p_order = tf.random_uniform(shape=[], minval=0., maxval=1., dtype=tf.float32)
pred = tf.less(p_order, 0.5)
distorted_image = tf.cond(pred, fn1, fn2)

Resources