Training did not improve the model performance on validation data

I am trying to train my ResNet-50 network on a dataset with 5968 images for training and 1492 for validation (746 classes, with 8 images per class for training and 2 images per class for validation). I used the ImageDataGenerator flow_from_directory method to get the labels from the folders.
My problem is that during training, the training accuracy increases and the training loss decreases, which is good. However, the validation accuracy stays very low (around 0.003) and does not improve. The validation loss is also very high and keeps oscillating between very high values.
Here is my code:
import numpy as np
from keras_preprocessing.image import ImageDataGenerator
from keras.utils.vis_utils import plot_model
import resnet
import json
from keras.callbacks import ModelCheckpoint, EarlyStopping
import keras
import pydot as pyd
keras.utils.vis_utils.pydot = pyd
data_path_l =".\\TRAIN\\left_750\\"
test_data_path_l =".\\TEST\\left_750\\"
num_classes=746
train_images=5968
val_images=1492
batch_size=32
epochs=500
img_channels=3
img_rows=224
img_cols=224
input_imgen = ImageDataGenerator(shear_range = 0.2,
zoom_range = 0.2,
rotation_range=5.,
horizontal_flip = True)
valid_imgen = ImageDataGenerator()
train_it = input_imgen.flow_from_directory(directory=data_path_l,target_size=(img_rows,img_cols),
color_mode="rgb",
batch_size=batch_size,
class_mode="categorical",
shuffle=False,
)
valid_it = valid_imgen.flow_from_directory(directory=test_data_path_l,target_size=(img_rows,img_cols),
color_mode="rgb",
batch_size=batch_size,
class_mode="categorical",
shuffle=False,
)
model = resnet.ResnetBuilder.build_resnet_50((img_channels, img_rows, img_cols), num_classes)
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
filepath=".\\conv2D_models\\model-{epoch:02d}-{loss:.4f}.hdf5"
mc = ModelCheckpoint(filepath, save_weights_only=False, verbose=1,
monitor='loss', mode='min')
history=model.fit_generator(train_it,
steps_per_epoch= train_images // batch_size,
validation_data = valid_it,
validation_steps = val_images // batch_size,
epochs = epochs,callbacks=[mc],
shuffle=False)
model.save('resnet2D_1sample.h5')
And here is part of the training log:
Epoch 00059: saving model to .\conv2D_models\model-59-3.6342.hdf5
Epoch 60/500
186/186 [==============================] - 262s 1s/step - loss: 3.6074 - acc: 0.4078 - val_loss: 12.1131 - val_acc: 0.0034
Epoch 00060: saving model to .\conv2D_models\model-60-3.6084.hdf5
Epoch 61/500
186/186 [==============================] - 276s 1s/step - loss: 3.5681 - acc: 0.4236 - val_loss: 12.0455 - val_acc: 0.0034
Epoch 00061: saving model to .\conv2D_models\model-61-3.5683.hdf5
Epoch 62/500
186/186 [==============================] - 100s 536ms/step - loss: 3.4684 - acc: 0.4415 - val_loss: 10.2444 - val_acc: 0.0068
Epoch 00062: saving model to .\conv2D_models\model-62-3.4674.hdf5
Epoch 63/500
186/186 [==============================] - 96s 516ms/step - loss: 3.4523 - acc: 0.4414 - val_loss: 11.6459 - val_acc: 0.0062
Epoch 00063: saving model to .\conv2D_models\model-63-3.4530.hdf5
Epoch 64/500
186/186 [==============================] - 96s 516ms/step - loss: 3.3837 - acc: 0.4782 - val_loss: 12.3293 - val_acc: 0.0062
Epoch 00064: saving model to .\conv2D_models\model-64-3.3847.hdf5
Epoch 65/500
186/186 [==============================] - 96s 515ms/step - loss: 3.2915 - acc: 0.5045 - val_loss: 12.8812 - val_acc: 0.0034
Epoch 00065: saving model to .\conv2D_models\model-65-3.2928.hdf5
Epoch 66/500
186/186 [==============================] - 96s 517ms/step - loss: 3.2506 - acc: 0.5129 - val_loss: 13.2886 - val_acc: 0.0034
Epoch 00066: saving model to .\conv2D_models\model-66-3.2527.hdf5
Epoch 67/500
186/186 [==============================] - 96s 515ms/step - loss: 3.2511 - acc: 0.5123 - val_loss: 14.4090 - val_acc: 0.0034
Epoch 00067: saving model to .\conv2D_models\model-67-3.2530.hdf5
Epoch 68/500
186/186 [==============================] - 97s 519ms/step - loss: 3.2632 - acc: 0.5163 - val_loss: 16.2364 - val_acc: 0.0027
Epoch 00068: saving model to .\conv2D_models\model-68-3.2650.hdf5
Epoch 69/500
186/186 [==============================] - 96s 517ms/step - loss: 3.1477 - acc: 0.5585 - val_loss: 16.2729 - val_acc: 0.0021
Epoch 00069: saving model to .\conv2D_models\tmodel-69-3.1487.hdf5
Epoch 70/500
186/186 [==============================] - 96s 516ms/step - loss: 2.9347 - acc: 0.6099 - val_loss: 16.7732 - val_acc: 0.0014
Epoch 00070: saving model to .\conv2D_models\model-70-2.9369.hdf5
Epoch 71/500
186/186 [==============================] - 96s 515ms/step - loss: 2.7118 - acc: 0.6715 - val_loss: 15.4640 - val_acc: 0.0075
Epoch 00071: saving model to .\conv2D_models\model-71-2.7134.hdf5
Epoch 72/500
186/186 [==============================] - 96s 517ms/step - loss: 2.6145 - acc: 0.6835 - val_loss: 16.2367 - val_acc: 0.0055
Epoch 00072: saving model to .\conv2D_models\model-72-2.6159.hdf5
Epoch 73/500
186/186 [==============================] - 96s 517ms/step - loss: 2.5492 - acc: 0.6816 - val_loss: 16.8155 - val_acc: 0.0000e+00
Epoch 00073: saving model to .\conv2D_models\model-73-2.5503.hdf5
Epoch 74/500
186/186 [==============================] - 96s 516ms/step - loss: 2.5743 - acc: 0.6786 - val_loss: 14.1867 - val_acc: 0.0021
Epoch 00074: saving model to .\conv2D_models\model-74-2.5759.hdf5
Epoch 75/500
186/186 [==============================] - 96s 516ms/step - loss: 2.5295 - acc: 0.6962 - val_loss: 12.3790 - val_acc: 0.0055
Could anyone suggest some potential reasons for this strange training behavior? It has been blocking me for a week.
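For reference when reading the log above, here is a small sketch of two numbers worth checking, using only the figures from the question (746 classes, 8 training images per class, batch size 32): the chance-level accuracy and loss, and how many distinct classes land in each training batch when the directory iterator is not shuffled. The helper `classes_per_batch` is hypothetical, and whether batch composition is the actual cause here is an assumption, not something the log proves.

```python
import math
import numpy as np

num_classes = 746
train_per_class = 8
batch_size = 32

# Baselines for a classifier that guesses uniformly at random.
chance_acc = 1.0 / num_classes
chance_loss = math.log(num_classes)

# Labels in directory order, which is the order flow_from_directory
# yields them in when shuffle=False.
ordered = np.repeat(np.arange(num_classes), train_per_class)

def classes_per_batch(labels, batch_size):
    """Hypothetical helper: number of distinct classes in each batch."""
    return [len(np.unique(labels[i:i + batch_size]))
            for i in range(0, len(labels), batch_size)]

ordered_counts = classes_per_batch(ordered, batch_size)

rng = np.random.default_rng(0)
shuffled_counts = classes_per_batch(rng.permutation(ordered), batch_size)

print("chance accuracy:", chance_acc)
print("chance loss:", chance_loss)
print("max classes per batch, unshuffled:", max(ordered_counts))
print("min classes per full batch, shuffled:", min(shuffled_counts[:-1]))
```

A validation accuracy of about 0.003 is close to the 1/746 chance level, and with shuffle=False each training batch of 32 holds only 4 classes in directory order, so setting shuffle=True on the training generator is a cheap experiment to try.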

Autokeras StructuredDataClassifier fails after a few trials

I'm using StructuredDataClassifier to train a model and I encounter the following error after a few trials.
Trial 3 Complete [00h 00m 23s]
val_accuracy: 0.9289383292198181
Best val_accuracy So Far: 0.9289383292198181
Total elapsed time: 00h 01m 02s
Search: Running Trial #4
Value |Best Value So Far |Hyperparameter
True |True |structured_data_block_1/normalize
False |False |structured_data_block_1/dense_block_1/use_batchnorm
2 |2 |structured_data_block_1/dense_block_1/num_layers
32 |32 |structured_data_block_1/dense_block_1/units_0
0 |0 |structured_data_block_1/dense_block_1/dropout
32 |32 |structured_data_block_1/dense_block_1/units_1
0 |0 |classification_head_1/dropout
adam |adam |optimizer
0.01 |0.001 |learning_rate
Epoch 1/1000
148/148 [==============================] - 2s 9ms/step - loss: 0.1917 - accuracy: 0.9576 - val_loss: 0.5483 - val_accuracy: 0.9289
Epoch 2/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1572 - accuracy: 0.9628 - val_loss: 0.3410 - val_accuracy: 0.9289
Epoch 3/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1434 - accuracy: 0.9628 - val_loss: 0.3330 - val_accuracy: 0.9289
Epoch 4/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1414 - accuracy: 0.9628 - val_loss: 0.3014 - val_accuracy: 0.9289
Epoch 5/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1395 - accuracy: 0.9628 - val_loss: 0.3012 - val_accuracy: 0.9289
Epoch 6/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1334 - accuracy: 0.9628 - val_loss: 0.4439 - val_accuracy: 0.9289
Epoch 7/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1370 - accuracy: 0.9628 - val_loss: 0.2964 - val_accuracy: 0.9289
Epoch 8/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1309 - accuracy: 0.9628 - val_loss: 0.2949 - val_accuracy: 0.9289
Epoch 9/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1282 - accuracy: 0.9628 - val_loss: 0.2927 - val_accuracy: 0.9289
Epoch 10/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1301 - accuracy: 0.9628 - val_loss: 0.2937 - val_accuracy: 0.9289
Epoch 11/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1278 - accuracy: 0.9628 - val_loss: 0.3152 - val_accuracy: 0.9289
Epoch 12/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1270 - accuracy: 0.9628 - val_loss: 0.3062 - val_accuracy: 0.9289
Epoch 13/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1286 - accuracy: 0.9628 - val_loss: 0.3198 - val_accuracy: 0.9289
Epoch 14/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1268 - accuracy: 0.9628 - val_loss: 0.3318 - val_accuracy: 0.9289
Epoch 15/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1244 - accuracy: 0.9628 - val_loss: 0.3038 - val_accuracy: 0.9289
Epoch 16/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1239 - accuracy: 0.9628 - val_loss: 0.3050 - val_accuracy: 0.9289
Epoch 17/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1222 - accuracy: 0.9628 - val_loss: 0.3180 - val_accuracy: 0.9289
Epoch 18/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1239 - accuracy: 0.9628 - val_loss: 0.3298 - val_accuracy: 0.9289
Epoch 19/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1220 - accuracy: 0.9628 - val_loss: 0.2916 - val_accuracy: 0.9289
Epoch 20/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1203 - accuracy: 0.9630 - val_loss: 0.3548 - val_accuracy: 0.9289
Epoch 21/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1243 - accuracy: 0.9628 - val_loss: 0.3047 - val_accuracy: 0.9289
Epoch 22/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1208 - accuracy: 0.9633 - val_loss: 0.4035 - val_accuracy: 0.9289
Epoch 23/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1242 - accuracy: 0.9628 - val_loss: 0.3383 - val_accuracy: 0.9289
Epoch 24/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1181 - accuracy: 0.9635 - val_loss: 0.3576 - val_accuracy: 0.9289
Epoch 25/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1171 - accuracy: 0.9641 - val_loss: 0.3221 - val_accuracy: 0.9289
Epoch 26/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1149 - accuracy: 0.9635 - val_loss: 0.3314 - val_accuracy: 0.9289
Epoch 27/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1136 - accuracy: 0.9635 - val_loss: 0.3554 - val_accuracy: 0.9289
Epoch 28/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1196 - accuracy: 0.9633 - val_loss: 0.3311 - val_accuracy: 0.9289
Epoch 29/1000
148/148 [==============================] - 1s 8ms/step - loss: 0.1176 - accuracy: 0.9635 - val_loss: 0.3684 - val_accuracy: 0.9289
Trial 4 Complete [00h 00m 36s]
val_accuracy: 0.9289383292198181
Best val_accuracy So Far: 0.9289383292198181
Total elapsed time: 00h 01m 37s
Search: Running Trial #5
Value |Best Value So Far |Hyperparameter
True |True |structured_data_block_1/normalize
False |False |structured_data_block_1/dense_block_1/use_batchnorm
2 |2 |structured_data_block_1/dense_block_1/num_layers
32 |32 |structured_data_block_1/dense_block_1/units_0
0 |0 |structured_data_block_1/dense_block_1/dropout
32 |32 |structured_data_block_1/dense_block_1/units_1
0 |0 |classification_head_1/dropout
adam_weight_decay |adam |optimizer
0.001 |0.001 |learning_rate
Epoch 1/1000
2022-12-11 16:22:23.607384: W tensorflow/core/framework/op_kernel.cc:1807] OP_REQUIRES failed at cast_op.cc:121 : UNIMPLEMENTED: Cast string to float is not supported
2022-12-11 16:22:23.607506: W tensorflow/core/framework/op_kernel.cc:1807] OP_REQUIRES failed at cast_op.cc:121 : UNIMPLEMENTED: Cast string to float is not supported
Traceback (most recent call last):
File "/home/anand/automl/automl.py", line 30, in <module>
clf.fit(x=X_train, y=y_train, use_multiprocessing=True, workers=8, verbose=True)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/tasks/structured_data.py", line 326, in fit
history = super().fit(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/tasks/structured_data.py", line 139, in fit
history = super().fit(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/auto_model.py", line 292, in fit
history = self.tuner.search(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/engine/tuner.py", line 193, in search
super().search(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras_tuner/engine/base_tuner.py", line 183, in search
results = self.run_trial(trial, *fit_args, **fit_kwargs)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras_tuner/engine/tuner.py", line 295, in run_trial
obj_value = self._build_and_fit_model(trial, *args, **copied_kwargs)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/engine/tuner.py", line 101, in _build_and_fit_model
_, history = utils.fit_with_adaptive_batch_size(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/utils/utils.py", line 88, in fit_with_adaptive_batch_size
history = run_with_adaptive_batch_size(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/utils/utils.py", line 101, in run_with_adaptive_batch_size
history = func(x=x, validation_data=validation_data, **fit_kwargs)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/utils/utils.py", line 89, in <lambda>
batch_size, lambda **kwargs: model.fit(**kwargs), **fit_kwargs
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/anand/automl/.venv/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnimplementedError: Graph execution error:
Detected at node 'Cast_1' defined at (most recent call last):
File "/home/anand/automl/automl.py", line 30, in <module>
clf.fit(x=X_train, y=y_train, use_multiprocessing=True, workers=8, verbose=True)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/tasks/structured_data.py", line 326, in fit
history = super().fit(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/tasks/structured_data.py", line 139, in fit
history = super().fit(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/auto_model.py", line 292, in fit
history = self.tuner.search(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/engine/tuner.py", line 193, in search
super().search(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras_tuner/engine/base_tuner.py", line 183, in search
results = self.run_trial(trial, *fit_args, **fit_kwargs)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras_tuner/engine/tuner.py", line 295, in run_trial
obj_value = self._build_and_fit_model(trial, *args, **copied_kwargs)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/engine/tuner.py", line 101, in _build_and_fit_model
_, history = utils.fit_with_adaptive_batch_size(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/utils/utils.py", line 88, in fit_with_adaptive_batch_size
history = run_with_adaptive_batch_size(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/utils/utils.py", line 101, in run_with_adaptive_batch_size
history = func(x=x, validation_data=validation_data, **fit_kwargs)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/utils/utils.py", line 89, in <lambda>
batch_size, lambda **kwargs: model.fit(**kwargs), **fit_kwargs
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
return fn(*args, **kwargs)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/engine/training.py", line 1650, in fit
tmp_logs = self.train_function(iterator)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in train_function
return step_function(self, iterator)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/engine/training.py", line 1233, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/engine/training.py", line 1222, in run_step
outputs = model.train_step(data)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/engine/training.py", line 1027, in train_step
self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 527, in minimize
self.apply_gradients(grads_and_vars)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/autokeras/keras_layers.py", line 360, in apply_gradients
return super(AdamWeightDecay, self).apply_gradients(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
return super().apply_gradients(grads_and_vars, name=name)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 632, in apply_gradients
self._apply_weight_decay(trainable_variables)
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1159, in _apply_weight_decay
tf.__internal__.distribute.interim.maybe_merge_call(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1155, in distributed_apply_weight_decay
distribution.extended.update(
File "/home/anand/automl/.venv/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1151, in weight_decay_fn
wd = tf.cast(self.weight_decay, variable.dtype)
Node: 'Cast_1'
2 root error(s) found.
(0) UNIMPLEMENTED: Cast string to float is not supported
[[{{node Cast_1}}]]
(1) CANCELLED: Function was cancelled before it was started
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_70943]
This is my Python code
import tensorflow as tf
import pandas as pd
import numpy as np
import autokeras as ak
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder
data = pd.read_csv("p_feature_df.csv")
y = data.pop('is_p')
y = y.astype(np.int32)
data.pop('idx')
groups = data.pop('owner')
data = data.astype(np.float32)
X = data.to_numpy()
lb = LabelEncoder()
y = lb.fit_transform(y)
logo = LeaveOneGroupOut()
logo.get_n_splits(X,y,groups)
results = []
models = []
for train_index, test_index in logo.split(X,y,groups):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    clf = ak.StructuredDataClassifier(overwrite=True)
    clf.fit(x=X_train, y=y_train, use_multiprocessing=True, workers=8, verbose=True)
    loss, acc = clf.evaluate(x=X_test, y=y_test, verbose=True)
    results.append((loss, acc))
    models.append(clf)
    print((loss, acc))
The code fails when adam_weight_decay is used.
Same issue here
I think it's related to a file that AutoKeras downloaded in Colab but did not download on my local PC.
The file is:
https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50v2_weights_tf_dim_ordering_tf_kernels_notop.h5
I had the same problem. I solved it by creating an AutoModel (the base class) like this:
import autokeras as ak
input_node = ak.StructuredDataInput()
output_node = ak.DenseBlock(use_batchnorm=True)(input_node)
output_node = ak.DenseBlock(dropout=0.1)(output_node)
# Chain from output_node (not input_node) so the two blocks above are not discarded.
output_node = ak.DenseBlock(use_batchnorm=True)(output_node)
output_node = ak.ClassificationHead()(output_node)
clf = ak.AutoModel(
    inputs=input_node, outputs=output_node, overwrite=True)
It seems to be a bug in StructuredDataClassifier.

Single-layer net in Keras with ImageDataGenerator, but loss is always negative

I have tried many kinds of networks, but even with a basic single-layer network, the loss, which is set to binary_crossentropy, is always negative.
Here is the code:
from __future__ import print_function
import numpy as np
import keras
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
import os
import cv2
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
train_path = 'D:/rectangle'
val_path = 'D:/rectang'
model = Sequential()
model.add(Conv2D(32, 1, 1, input_shape=(230, 230, 3)))
model.add(Flatten())
model.add(Dense(64))
model.add(Dropout(0.5))
model.add(Dense(1))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
train_datagen = ImageDataGenerator(
samplewise_center=True,
samplewise_std_normalization=True)
test_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
train_path,
target_size=(230, 230),
batch_size=32,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
val_path,
target_size=(230, 230),
batch_size=32,
class_mode='binary')
model.fit_generator(
train_generator,
steps_per_epoch=200,
epochs=50,
validation_data=validation_generator,
nb_val_samples=800
)
Here is the training output:
1/200 [..............................] -
ETA: 20:17 - loss: 12.9030 - acc: 0.1250
2/200 [..............................] -
ETA: 10:22 - loss: -2.0179 - acc: 0.0625
3/200 [..............................] -
ETA: 7:03 - loss: -6.3273 - acc: 0.0417
4/200 [..............................] -
ETA: 5:23 - loss: -7.8592 - acc: 0.0312
5/200 [..............................] -
ETA: 4:24 - loss: -8.6776 - acc: 0.0250
6/200 [..............................] -
ETA: 3:44 - loss: -9.5563 - acc: 0.0208
7/200 [>.............................] -
ETA: 3:15 - loss: -9.3298 - acc: 0.0179
8/200 [>.............................] -
ETA: 2:54 - loss: -9.3455 - acc: 0.0156
9/200 [>.............................] -
ETA: 2:37 - loss: -10.2439 - acc: 0.0139
10/200 [>.............................] -
ETA: 2:24 - loss: -10.5647 - acc: 0.0125
11/200 [>.............................] -
ETA: 2:13 - loss: -10.8719 - acc: 0.0114
12/200 [>.............................] -
ETA: 2:04 - loss: -11.3775 - acc: 0.0104
13/200 [>.............................] -
ETA: 1:56 - loss: -11.3066 - acc: 0.0096
14/200 [=>............................] -
ETA: 1:49 - loss: -11.4598 - acc: 0.0089
15/200 [=>............................] -
ETA: 1:48 - loss: -11.4930 - acc: 0.0083
16/200 [=>............................] -
ETA: 1:47 - loss: -11.6465 - acc: 0.0078
17/200 [=>............................] -
ETA: 1:51 - loss: -11.6061 - acc: 0.0074
The input images are breast cancer histopathological images, 460*460 in size, about 20000 pictures in PNG format.
I would appreciate any help in solving this!
Since you are doing binary classification (based on your loss), your last activation function should be sigmoid. So instead of
model.add(Dense(1))
your last layer should look like:
model.add(Dense(1, activation='sigmoid'))
Without specifying it, the activation defaults to linear, which fits a regression scenario rather than classification.
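To illustrate why the sigmoid matters, here is a minimal NumPy sketch of the binary cross-entropy formula. It is not Keras's internal implementation; the `sigmoid` and `binary_crossentropy` helpers and the raw output values are made up for illustration. The formula expects predictions in (0, 1), which the sigmoid guarantees, and with 0/1 labels every term is then non-negative.

```python
import numpy as np

def sigmoid(z):
    """Squash any real-valued output into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean binary cross-entropy for predictions p, clipped for stability."""
    p = np.clip(p, eps, 1.0 - eps)
    terms = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return float(np.mean(terms))

y_true = np.array([0.0, 1.0, 1.0, 0.0])
raw = np.array([-3.2, 0.7, 5.1, 12.0])  # made-up outputs of a linear Dense(1)

probs = sigmoid(raw)
loss = binary_crossentropy(y_true, probs)
print("probabilities:", probs)
print("loss:", loss)

# The same formula can go negative when a label lies outside [0, 1].
bad_label_loss = binary_crossentropy(np.array([2.0]), np.array([0.9]))
print("loss with label 2:", bad_label_loss)
```

Because the formula also goes negative for labels outside [0, 1], it may be worth confirming that flow_from_directory reports exactly two classes when class_mode='binary' is used.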

Checking validation results in Keras shows only 50% correct. Clearly random

I'm struggling with a seemingly simple problem. I can't figure out how to match my input images to the resulting probabilities produced by my model.
Training and validation of my model (vanilla VGG16, retrained for 2 classes, dogs and cats) are going fine, getting me close to 97% validation accuracy, but when I run the check to see what I got right and what I got wrong, I only get random results.
Found 1087 correct labels (53.08%)
I am pretty sure it has something to do with the ImageDataGenerator, which produces random batches on my validation images, although I DO set shuffle=False.
I just save the filenames and classes of my generator before I run them and I ASSUME that the index of my filenames and classes is the same as the output of my probabilities.
Here's my setup (Vanilla VGG16, with last layer replaced to match 2 categories for cats and dogs)
new_model.summary()
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
_________________________________________________________________
flatten (Flatten) (None, 25088) 0
_________________________________________________________________
fc1 (Dense) (None, 4096) 102764544
_________________________________________________________________
fc2 (Dense) (None, 4096) 16781312
_________________________________________________________________
Binary_predictions (Dense) (None, 2) 8194
=================================================================
Total params: 134,268,738
Trainable params: 8,194
Non-trainable params: 134,260,544
_________________________________________________________________
batch_size=16
epochs=3
learning_rate=0.01
This is the definition of the generators, for training and validation. I did not yet include the data augmentation part at this point.
train_datagen = ImageDataGenerator()
validation_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
train_path,
target_size=(img_height, img_width),
batch_size=batch_size,
class_mode='categorical')
train_filenames = train_generator.filenames
train_samples = len(train_filenames)
validation_generator = validation_datagen.flow_from_directory(
valid_path,
target_size=(img_height, img_width),
batch_size=batch_size,
class_mode='categorical',
shuffle = False) #Need this to be false, so I can extract the correct classes and filenames in the order that they are predicted
validation_filenames = validation_generator.filenames
validation_samples = len(validation_filenames)
Finetuning the model goes fine
#Fine-tune the model
#DOC: fit_generator(generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None,
# validation_data=None, validation_steps=None, class_weight=None,
# max_queue_size=10, workers=1, use_multiprocessing=False, initial_epoch=0)
new_model.fit_generator(
train_generator,
steps_per_epoch=train_samples // batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=validation_samples // batch_size)
Epoch 1/3
1434/1434 [==============================] - 146s - loss: 0.5456 - acc: 0.9653 - val_loss: 0.5043 - val_acc: 0.9678
Epoch 2/3
1434/1434 [==============================] - 148s - loss: 0.5312 - acc: 0.9665 - val_loss: 0.4293 - val_acc: 0.9722
Epoch 3/3
1434/1434 [==============================] - 148s - loss: 0.5332 - acc: 0.9665 - val_loss: 0.4329 - val_acc: 0.9731
As is the extraction of the validation data
#We need the probabilities/scores for the validation set
#DOC: predict_generator(generator, steps, max_queue_size=10, workers=1,
# use_multiprocessing=False, verbose=0)
probs = new_model.predict_generator(
validation_generator,
steps=validation_samples // batch_size,
verbose = 1)
#Extracting the probabilities and labels
our_predictions = probs[:,0]
our_labels = np.round(1-our_predictions)
expected_labels = validation_generator.classes
Now, when I calculate the success of my validation set by comparing the expected labels and the calculated labels, I get something that is suspiciously close to random:
correct = np.where(our_labels==expected_labels)[0]
print("Found {:3d} correct labels ({:.2f}%)".format(len(correct),
100*len(correct)/len(our_predictions)))
Found 1087 correct labels (53.08%)
Clearly this is not correct.
I suspect this is something to do with the randomness of the Generators, but I set shuffle = False.
This code was DIRECTLY copied from the Fast.ai course by the great Jeremy Howard, but I can't get it to work anymore..
I am using Keras 2.0.8 and TensorFlow 1.3 backend on Python 3.5 under Anaconda...
Please help me retain my sanity!
You need to call validation_generator.reset() in between fit_generator() and predict_generator().
In *_generator() functions, data batches are inserted into a queue before being used to fit/evaluate the model. The underlying queue is always kept full, so there will be some extra batches in the queue when training ends. You can verify it by printing validation_generator.batch_index after training. Therefore, your predict_generator() does not start with the first batch, and probs[0] is not the prediction of the first image. That's why our_labels does not align with expected_labels and the accuracy is low.
BTW, you should use validation_steps=validation_samples // batch_size + 1 (also for the training generator). Unless validation_samples is a multiple of batch_size, you're ignoring one batch in each epoch if you use validation_steps=validation_samples // batch_size, and your model is evaluated on a (slightly) different dataset in each epoch.
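The two points above can be sketched without Keras. `ToyIterator` below is a hypothetical stand-in that only mimics the `batch_index` and `reset()` behaviour described in this answer, and the sample count of 2050 is made up. It shows that predictions collected from a partly consumed iterator start in the middle of the data, that `reset()` restores alignment, and that rounding the step count up covers every sample.

```python
import math

class ToyIterator:
    """Hypothetical stand-in for a Keras directory iterator (not the real class)."""

    def __init__(self, n_samples, batch_size):
        self.n = n_samples
        self.batch_size = batch_size
        self.batch_index = 0

    def __len__(self):
        return math.ceil(self.n / self.batch_size)

    def __iter__(self):
        return self

    def __next__(self):
        start = self.batch_index * self.batch_size
        stop = min(start + self.batch_size, self.n)
        # Wrap around after the last batch, as an endless generator would.
        self.batch_index = (self.batch_index + 1) % len(self)
        return list(range(start, stop))

    def reset(self):
        self.batch_index = 0

def collect(iterator, steps):
    """Concatenate the sample indices returned by `steps` batches."""
    out = []
    for _ in range(steps):
        out.extend(next(iterator))
    return out

val = ToyIterator(n_samples=2050, batch_size=16)
floor_steps = val.n // val.batch_size           # drops the final partial batch
ceil_steps = math.ceil(val.n / val.batch_size)  # covers every sample once

# Simulate a few batches left over in the queue after training.
for _ in range(3):
    next(val)

misaligned = collect(val, ceil_steps)
val.reset()
aligned = collect(val, ceil_steps)

print("first index without reset:", misaligned[0])
print("first index after reset:", aligned[0])
print("samples covered with floor steps:", floor_steps * val.batch_size)
```

Using `math.ceil` instead of `// batch_size + 1` avoids requesting one extra batch when the sample count happens to be an exact multiple of the batch size.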
I ran into a similar problem before. I found predict_generator() unfriendly, so I wrote a function to test the data set.
Here is my code snippet:
import json
import os

import numpy as np
from PIL import Image

def get_img_result(img_path):
    # Load one image, resize it to the model input size and return the prediction.
    image = Image.open(img_path)
    image.load()
    image = image.resize((img_width, img_height))
    if image.mode != 'RGB':
        image = image.convert('RGB')
    array = np.asarray(image, dtype='int32')
    # Only appropriate if the model was trained on inputs scaled the same way.
    array = array / 255
    array = np.asarray([array])
    result = new_model.predict(array)
    print(result)
    return result

# path: the root folder of the validation data set. validation->cat->kitty.jpg
def validate(path):
    result_list = []
    right_count = 0
    wrong_count = 0
    # Sorted so that index i matches the alphabetical class index used by flow_from_directory.
    categories = sorted(os.listdir(path))
    for i in range(len(categories)):
        images = os.listdir(os.path.join(path, categories[i]))
        for image in images:
            result = get_img_result(os.path.join(path, categories[i], image))[0]
            if result[i] != max(result):
                result_list.append({'image': image, 'category': categories[i], 'score': result.tolist(), 'right': 0})
                wrong_count = wrong_count + 1
            else:
                result_list.append({'image': image, 'category': categories[i], 'score': result.tolist(), 'right': 1})
                right_count = right_count + 1
    # Save the per-image results once all images have been scored.
    json_string = json.dumps(result_list)
    with open('result.json', 'w') as f:
        f.write(json_string)
    print('right count : {0} \n wrong count : {1} \n accuracy : {2}'.format(
        right_count, wrong_count, right_count / (right_count + wrong_count)))
I use PIL to convert each image to a NumPy array the same way Keras does, run the model on every image, and save the results to a JSON file.
Hope it helps.
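Once the results are saved, the records can be summarised per category to see which classes fail. The sketch below assumes records shaped like the ones written to result.json above (keys 'category' and 'right'). The function name per_category_accuracy and the sample records are made up for illustration.

```python
from collections import defaultdict


def per_category_accuracy(records):
    """Fraction of correctly classified images for each category."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        totals[rec['category']] += 1
        hits[rec['category']] += rec['right']  # 'right' is 1 or 0
    return {cat: hits[cat] / totals[cat] for cat in totals}


# Made-up records in the same shape as the entries of result.json.
sample_records = [
    {'image': 'a.jpg', 'category': 'cat', 'score': [0.9, 0.1], 'right': 1},
    {'image': 'b.jpg', 'category': 'cat', 'score': [0.4, 0.6], 'right': 0},
    {'image': 'c.jpg', 'category': 'dog', 'score': [0.2, 0.8], 'right': 1},
]
summary = per_category_accuracy(sample_records)
print(summary)
```

For the real file, load the list with json.load() and pass it to the same function.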

Why is my CNN not learning

I am sorry for such a cliche question, but I really don't know why my CNN is not improving.
I am training a CNN for SVHN dataset (single digit) with images of 32x32.
For preprocessing, I transform RGB to grayscale and normalize all pixel data by standardization. So the data range becomes (-1,1). To verify that my X and y correspond to each other correctly, I randomly pick an image from X and a label from y with the same index, and it shows that they do.
Here's my code (Keras, tensorflow backend):
"""
Single Digit Recognition
"""
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Activation, Convolution2D
from keras.layers.pooling import MaxPooling2D
from keras.optimizers import SGD
from keras.layers.core import Dropout, Flatten
model = Sequential()
model.add(Convolution2D(16, 5, 5, border_mode='same', input_shape=(32, 32, 1)))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size=(2, 2), strides=None, border_mode='same', dim_ordering='default'))
model.add(Convolution2D(32, 5, 5, border_mode='same', input_shape=(16, 16, 16)))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size=(2, 2), strides=None, border_mode='same', dim_ordering='default'))
model.add(Convolution2D(64, 5, 5, border_mode='same', input_shape=(32, 8, 8)))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size=(2, 2), strides=None, border_mode='same', dim_ordering='default'))
model.add(Flatten())
model.add(Dense(128, input_dim=1024))
model.add(Activation("relu"))
model.add(Dense(10, input_dim=128))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
model.fit(train_X, train_y,
          validation_split=0.1,
          nb_epoch=20,
          batch_size=64)
score = model.evaluate(test_X, test_y, batch_size=16)
After running 10 epochs, the accuracy is still the same as in the first epoch, and that's why I stopped it.
Train on 65931 samples, validate on 7326 samples
Epoch 1/20
65931/65931 [==============================] - 190s - loss: 2.2390 - acc: 0.1882 - val_loss: 2.2447 - val_acc: 0.1885
Epoch 2/20
65931/65931 [==============================] - 194s - loss: 2.2395 - acc: 0.1893 - val_loss: 2.2399 - val_acc: 0.1885
Epoch 3/20
65931/65931 [==============================] - 167s - loss: 2.2393 - acc: 0.1893 - val_loss: 2.2402 - val_acc: 0.1885
Epoch 4/20
65931/65931 [==============================] - 172s - loss: 2.2394 - acc: 0.1883 - val_loss: 2.2443 - val_acc: 0.1885
Epoch 5/20
65931/65931 [==============================] - 172s - loss: 2.2393 - acc: 0.1884 - val_loss: 2.2443 - val_acc: 0.1885
Epoch 6/20
65931/65931 [==============================] - 179s - loss: 2.2397 - acc: 0.1881 - val_loss: 2.2433 - val_acc: 0.1885
Epoch 7/20
65931/65931 [==============================] - 173s - loss: 2.2399 - acc: 0.1888 - val_loss: 2.2410 - val_acc: 0.1885
Epoch 8/20
65931/65931 [==============================] - 175s - loss: 2.2392 - acc: 0.1893 - val_loss: 2.2439 - val_acc: 0.1885
Epoch 9/20
65931/65931 [==============================] - 175s - loss: 2.2395 - acc: 0.1893 - val_loss: 2.2401 - val_acc: 0.1885
Epoch 10/20
9536/65931 [===>..........................] - ETA: 162s - loss: 2.2372 - acc: 0.1909
Should I keep trying with more patience or is there something wrong with my CNN?
Try switching your optimizer to Adam, which is usually less sensitive to the choice of learning rate than plain SGD. If you also want Nesterov momentum, use Nadam. So I would try the following:
model.compile(loss='categorical_crossentropy',
              optimizer='nadam',
              metrics=['accuracy'])
Nadam adapts the step size for each parameter automatically, so you don't need to tune the learning rate as carefully.
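A quick sanity check on the log can also help here. The loss stays near 2.24 and the accuracy near 0.19, which is just below the loss of a uniform guess over ten digits. That pattern is consistent with a network that outputs almost the same distribution for every image. This is an interpretation of the numbers, not something verified on the poster's data, and the helper names below are hypothetical.

```python
import math


def uniform_guess_loss(num_classes):
    """Cross-entropy of predicting 1/num_classes for every class."""
    return math.log(num_classes)


def constant_prediction_loss(class_frequencies):
    """Cross-entropy of always predicting the class frequencies themselves."""
    return -sum(p * math.log(p) for p in class_frequencies if p > 0)


baseline = uniform_guess_loss(10)
print(round(baseline, 3))  # compare with the logged loss of about 2.24
```

If the training loss never drops clearly below this baseline, the model has not learned anything beyond the label distribution, and waiting for more epochs is unlikely to help without changing the optimizer settings.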

Can a PNG be encrypted without losing the exif data?

I've been working on this Reddit puzzle:
http://www.reddit.com/r/playitforward/comments/1v6jfh/contest_first_one_to_solve_this_riddle_gets_my/
and most of the users in the thread are stumped. Full disclosure: I'd love to win the prize, and I understand that asking for help here lessens my chances, but at this point I want to know what the image says more than anything.
We narrowed the ciphers down to a URL pointing to a text file that contains PNG exif data, but when it is opened as a PNG, it turns out to be corrupted. Could this PNG be encrypted, or purposely corrupted, in a way that preserves the exif data, and what would be the best way to unravel it? Note that the string of numbers and "AK" were explicitly linked to this clue, so I can only assume there is maybe an Asynchronous Key involved or some standard pioneered by Arjen Kampf Lenstra or some Angry Kid behind it all.
Sure, using ImageMagick like this:
# Look at rose image before we start, and its header
identify -verbose rose.jpg
Image: rose.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Mime type: image/jpeg
Class: DirectClass
Geometry: 70x46+0+0
Units: Undefined
Type: TrueColor
Endianess: Undefined
Colorspace: sRGB
Depth: 8-bit
Channel depth:
red: 8-bit
green: 8-bit
blue: 8-bit
Channel statistics:
Pixels: 3220
Red:
min: 31 (0.121569)
max: 255 (1)
mean: 145.56 (0.570825)
standard deviation: 69.1755 (0.271277)
kurtosis: -1.38839
skewness: 0.139004
entropy: 0.97057
Green:
min: 27 (0.105882)
max: 255 (1)
mean: 89.2475 (0.34999)
standard deviation: 52.4516 (0.205693)
kurtosis: 2.60505
skewness: 1.80798
entropy: 0.869705
Blue:
min: 21 (0.0823529)
max: 255 (1)
mean: 80.4214 (0.315378)
standard deviation: 54.9267 (0.215399)
kurtosis: 2.93861
skewness: 1.9566
entropy: 0.85334
Image statistics:
Overall:
min: 21 (0.0823529)
max: 255 (1)
mean: 105.076 (0.412064)
standard deviation: 59.3109 (0.232592)
kurtosis: 1.24657
skewness: 1.44732
entropy: 0.897872
Rendering intent: Perceptual
Gamma: 0.454545
Chromaticity:
red primary: (0.64,0.33)
green primary: (0.3,0.6)
blue primary: (0.15,0.06)
white point: (0.3127,0.329)
Background color: white
Border color: srgb(223,223,223)
Matte color: grey74
Transparent color: black
Interlace: None
Intensity: Undefined
Compose: Over
Page geometry: 70x46+0+0
Dispose: Undefined
Iterations: 0
Compression: JPEG
Quality: 92
Orientation: Undefined
Properties:
date:create: 2015-10-04T18:46:03+01:00
date:modify: 2015-10-04T18:46:03+01:00
jpeg:colorspace: 2
jpeg:sampling-factor: 1x1,1x1,1x1
signature: 38a8912b601557d5a377bff360f03804c383c3298b48d9917504b488e8f4152b
Artifacts:
filename: rose.jpg
verbose: true
Tainted: False
Filesize: 2.65KB
Number pixels: 3.22K
Pixels per second: 3.22EB
User time: 0.000u
Elapsed time: 0:01.000
Version: ImageMagick 6.9.1-10 Q32 x86_64 2015-10-02 http://www.imagemagick.org
Now add a comment into the image and encrypt it as encrypted.png:
convert -comment "Freddy frog" rose.jpg -encipher passphrase.txt encrypted.png
Check the header of the encrypted image to see whether the EXIF data, the comment and the other metadata are still visible within it. Yes, they are:
identify -verbose encrypted.png
Image: encrypted.png
Format: PNG (Portable Network Graphics)
Mime type: image/png
Class: DirectClass
Geometry: 70x46+0+0
Units: Undefined
Type: TrueColor
Endianess: Undefined
Colorspace: sRGB
Depth: 8-bit
Channel depth:
red: 8-bit
green: 8-bit
blue: 8-bit
Channel statistics:
Pixels: 3220
Red:
min: 0 (0)
max: 255 (1)
mean: 126.755 (0.497077)
standard deviation: 73.7824 (0.289343)
kurtosis: -1.18047
skewness: 0.0142557
entropy: 0.99254
Green:
min: 0 (0)
max: 255 (1)
mean: 127.937 (0.501712)
standard deviation: 75.0501 (0.294314)
kurtosis: -1.23185
skewness: -0.0233363
entropy: 0.992485
Blue:
min: 0 (0)
max: 255 (1)
mean: 127.594 (0.500368)
standard deviation: 74.64 (0.292706)
kurtosis: -1.22352
skewness: -0.0177342
entropy: 0.992544
Image statistics:
Overall:
min: 0 (0)
max: 255 (1)
mean: 127.428 (0.499719)
standard deviation: 74.4927 (0.292128)
kurtosis: -1.21239
skewness: -0.00900116
entropy: 0.992523
Rendering intent: Perceptual
Gamma: 0.45455
Chromaticity:
red primary: (0.64,0.33)
green primary: (0.3,0.6)
blue primary: (0.15,0.06)
white point: (0.3127,0.329)
Background color: white
Border color: srgb(223,223,223)
Matte color: grey74
Transparent color: black
Interlace: None
Intensity: Undefined
Compose: Over
Page geometry: 70x46+0+0
Dispose: Undefined
Iterations: 0
Compression: Zip
Orientation: Undefined
Properties:
cipher:mode: CTR
cipher:nonce: d3d57ca43eacb27a9d72b65ef976923e5b761c7aaaee1d1914d1769ca4834488
cipher:type: AES
comment: Freddy frog <--- comment is visible
date:create: 2015-10-04T18:48:43+01:00
date:modify: 2015-10-04T18:48:43+01:00
png:bKGD: chunk was found (see Background color, above)
png:cHRM: chunk was found (see Chromaticity, above)
png:gAMA: gamma=0.45454544 (See Gamma, above)
png:IHDR.bit-depth-orig: 8
png:IHDR.bit_depth: 8
png:IHDR.color-type-orig: 2
png:IHDR.color_type: 2 (Truecolor)
png:IHDR.interlace_method: 0 (Not interlaced)
png:IHDR.width,height: 70, 46
png:sRGB: intent=0 (Perceptual Intent)
png:text: 6 tEXt/zTXt/iTXt chunks were found
signature: 273e3934027f6ffbcf00b3eca7eb0c576d8fd180e87133112ecacd59225986ee
Artifacts:
filename: encrypted.png
verbose: true
Tainted: False
Filesize: 10.1KB
Number pixels: 3.22K
Pixels per second: 3.22EB
User time: 0.000u
Elapsed time: 0:01.000
Version: ImageMagick 6.9.1-10 Q32 x86_64 2015-10-02 http://www.imagemagick.org
Now look at the encrypted image: the pixels are junk.
Decrypt the image as decrypted.jpg: looks like a rose to me :-)
convert encrypted.png -decipher passphrase.txt decrypted.jpg
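To see why the metadata can stay readable, it helps to look at how a PNG file is laid out: after an 8-byte signature it is a sequence of chunks, and text such as the comment lives in its own tEXt chunk, separate from the IDAT chunks that hold the pixel data. The following is a minimal sketch using only the Python standard library. It builds a tiny PNG in memory (the 1x1 image is made up) and lists its chunks. It handles only uncompressed tEXt chunks, and the helper names are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def list_chunks(png_bytes):
    """Return a list of (type, data) pairs for every chunk in a PNG byte string."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = []
    pos = 8
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8].decode("ascii")
        chunks.append((chunk_type, png_bytes[pos + 8:pos + 8 + length]))
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return chunks


def text_chunks(png_bytes):
    """Map keyword -> text for every uncompressed tEXt chunk."""
    texts = {}
    for chunk_type, data in list_chunks(png_bytes):
        if chunk_type == "tEXt":
            keyword, _, text = data.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
    return texts


# Made-up 1x1, 8-bit RGB image with a comment, built entirely in memory.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
pixels = zlib.compress(b"\x00\xff\x00\x00")  # filter byte + one red pixel
demo_png = (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"tEXt", b"Comment\x00Freddy frog")
            + make_chunk(b"IDAT", pixels)
            + make_chunk(b"IEND", b""))

chunk_types = [chunk_type for chunk_type, _ in list_chunks(demo_png)]
comments = text_chunks(demo_png)
print(chunk_types, comments)
```

Running list_chunks() on the bytes of the puzzle file would show which chunks parse cleanly and which do not, which is one way to tell an enciphered image from a damaged file.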

Resources