Image classification, TensorFlow, accuracy stuck at 10%

I'm currently doing the CINIC-10 image classification challenge for my IT studies.
I had no deep learning experience before, so I learned from some YouTube videos.
I first tried the MNIST handwritten digits dataset and had a great experience with it: my model reached 92% accuracy and worked well.
Now I'm trying to classify images, and even when I use different models from Keras, my training accuracy doesn't go above 10%.
Here's how I proceeded.
First I load my datasets; I have a train dataset and a validation dataset.
# loading in the data
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    cinic_directory_train,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(32, 32),
    batch_size=16
)
validation_ds = tf.keras.preprocessing.image_dataset_from_directory(
    cinic_directory_train,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(32, 32),
    batch_size=16
)
With that I can get my class names:
class_names = train_ds.class_names
print(class_names)
Output:
['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
And this is my model construction:
model = keras.Sequential([
    keras.layers.experimental.preprocessing.Rescaling(1./255),
    keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Dropout(0.2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)
])
model.compile(
    optimizer='adam',  # optimization function
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
And when I start the training session:
history = model.fit(
    train_ds,
    validation_data=validation_ds,
    epochs=3
)
my accuracy is stuck between 0.09 and 0.10.
I even tested my friends' code and I keep getting the same accuracy, whereas they get around 30-50% accuracy.
I'm using Google Colab for this.
I tried all these models and I keep getting a low accuracy:
VGG16 => 9%
ResNet50 => 9%
DenseNet => 8%
EfficientNet => 2%
MobileNet => 9%
I can't find the problem or how to fix it!

Your final layer should be
keras.layers.Dense(10, activation='softmax')
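Alternatively, you can keep the raw-logits output layer and tell the loss to expect logits instead. A minimal sketch of that variant (same layers as in the question; only the last layer's interpretation and the compile call change):

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.experimental.preprocessing.Rescaling(1./255),
    keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
    keras.layers.MaxPooling2D(),
    keras.layers.Dropout(0.2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)  # raw logits, no softmax
])
model.compile(
    optimizer='adam',
    # from_logits=True makes the loss apply the softmax internally
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)

As for the pretrained models from keras.applications: each expects its own input preprocessing, and feeding the wrong pixel range can keep accuracy near chance. For example, the EfficientNet variants normalize internally and expect raw values in [0, 255], so an extra Rescaling(1./255) in front of them hurts; for most others you would first apply the matching preprocess_input (e.g. tf.keras.applications.vgg16.preprocess_input).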

Related

Overfitting in my Convolutional Neural Network Model

I have built a CNN model on the CUB-200-2011 dataset, but my training accuracy and validation accuracy differ hugely: training accuracy is above 0.6 while validation accuracy is only around 0.2.
I have tried adding data augmentation and tried different model architectures. I have no idea why this happens.
# get train dataset
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    path,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size
)
# get validation dataset
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    path,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size
)
# create data augmentation layer
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.RandomFlip("horizontal"),
    tf.keras.layers.experimental.preprocessing.RandomRotation(0.1),
    tf.keras.layers.experimental.preprocessing.RandomZoom(0.2)
])
# model architecture
model = tf.keras.models.Sequential([
    data_augmentation,
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255),
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(img_height, img_width, 3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Conv2D(128, 3, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(200, activation="softmax")
])
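A note on why this gap is expected: CUB-200-2011 has only about 30 training images per class across 200 classes, so a CNN trained from scratch like the one above usually memorizes the training set long before it learns features that generalize. The standard remedy is transfer learning from a pretrained backbone. A minimal sketch, assuming the img_height, img_width (at least 32) and data_augmentation block from the question, and the integer labels that image_dataset_from_directory produces by default:

import tensorflow as tf

base = tf.keras.applications.ResNet50(
    include_top=False,
    weights='imagenet',
    input_shape=(img_height, img_width, 3),
    pooling='avg'
)
base.trainable = False  # freeze the pretrained backbone at first

model = tf.keras.Sequential([
    data_augmentation,
    # ResNet50's preprocess_input expects raw [0, 255] pixels, so no Rescaling layer
    tf.keras.layers.Lambda(tf.keras.applications.resnet50.preprocess_input),
    base,
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(200, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Once the new head converges, unfreezing the top of the backbone with a much smaller learning rate usually improves accuracy further.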

Binary Image Classification - Validation loss is much higher than training loss

I'm facing a strange behaviour which I can't figure out. I'm getting a really high loss (BinaryCrossentropy) on my validation batches while training, around 20 or even higher. But after training, when I run a prediction on the test set, I get a loss lower than 1. Why is that? I went through my code over and over and can't find the problem.
I'm doing binary image classification for brain tumors on a dataset provided via Kaggle (link).
You can find my notebook here: Google Colab Notebook
My data is loaded this way:
batch_size = 20

train_ds = tf.keras.utils.image_dataset_from_directory(
    train_data_path,
    subset='training',
    seed=42,
    color_mode='grayscale',
    batch_size=batch_size,
    validation_split=0.30
)
valid_ds = tf.keras.utils.image_dataset_from_directory(
    train_data_path,
    subset='validation',
    seed=42,
    batch_size=batch_size,
    color_mode='grayscale',
    validation_split=0.30
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    test_data_path,
    color_mode='grayscale',
    batch_size=batch_size,
    shuffle=False
)
This is my model structure:
input_shape = image_batch[0].shape

# set up the model structure
model = tf.keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid")
])
model.summary()
This is my callback function, which renders the plots during training:
class PlotLearning(tf.keras.callbacks.Callback):
    """
    Callback to plot the learning curves of the model during training.
    """
    def on_train_begin(self, logs={}):
        self.metrics = {}
        for metric in logs:
            self.metrics[metric] = []

    def on_epoch_end(self, epoch, logs={}):
        # storing metrics
        print(logs)
        for metric in logs:
            if metric in self.metrics:
                self.metrics[metric].append(logs.get(metric))
            else:
                self.metrics[metric] = [logs.get(metric)]
        # plotting
        metrics = [x for x in logs if 'val' not in x]
        f, axs = plt.subplots(1, len(metrics), figsize=(15, 5))
        clear_output(wait=True)
        for i, metric in enumerate(metrics):
            axs[i].plot(range(1, epoch + 2),
                        self.metrics[metric],
                        label=metric)
            if logs['val_' + metric]:
                axs[i].plot(range(1, epoch + 2),
                            self.metrics['val_' + metric],
                            label='val_' + metric)
            axs[i].legend()
            axs[i].grid()
        plt.tight_layout()
        plt.show()

callbacks_list = [PlotLearning()]
And this is the part where I start the training:
# compile model
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
              metrics=['accuracy'])

# fit model
history = model.fit(prep_train_ds,
                    epochs=30,
                    validation_data=valid_ds,
                    callbacks=callbacks_list)
This is the output of the callback function after the last epoch:
As you can see, the validation loss is really high and oscillating around 20, so I guess it is overfitting.
But as mentioned above, here is what I get when I make a prediction on the test set and calculate the binary crossentropy: the loss is again less than 1 and at least in the range of the training loss.
I tried so many things, like changing the batch size (because not enough samples of one class might be in one batch). Then I wanted to see whether it was overfitting, so I changed the number of filters, applied dropout, etc. But I couldn't get the loss down on the validation set. I'm quite new to the field of image classification and maybe I'm overlooking something.
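One detail that stands out in the snippets above: fit() receives prep_train_ds, which suggests the training split was preprocessed somewhere (that step isn't shown), while validation_data receives the raw valid_ds. If the model only ever sees rescaled pixels during training, evaluating it on unscaled [0, 255] images would produce exactly this kind of inflated validation loss, while a test set evaluated with the same preprocessing as training would look fine again. A minimal sketch of applying one transformation to both splits, assuming prep_train_ds was produced by a simple rescale (hypothetical; substitute whatever actually created it):

import tensorflow as tf

rescale = tf.keras.layers.Rescaling(1. / 255)  # hypothetical preprocessing step

prep_train_ds = train_ds.map(lambda x, y: (rescale(x), y))
prep_valid_ds = valid_ds.map(lambda x, y: (rescale(x), y))

history = model.fit(prep_train_ds,
                    epochs=30,
                    validation_data=prep_valid_ds,
                    callbacks=callbacks_list)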

Unable to use multi GPU training with TensorFlow 2

Setup: Win 10, 2x Geforce RTX 2080 Ti, Tensorflow 2.9.1 (also tested older versions), Geforce Driver 512.95
I tried multiple tutorials for multi-GPU training with TensorFlow 2 and was never able to utilize more than one GPU.
Here is my code.
def mnist_dataset(batch_size):
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train / np.float32(255)
    y_train = y_train.astype(np.int64)
    train_dataset = tf.data.Dataset.from_tensor_slices(
        (x_train, y_train)).shuffle(60000).repeat().batch(batch_size)
    return train_dataset

def build_and_compile_cnn_model():
    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(28, 28)),
        tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(512, 3, activation='relu'),
        tf.keras.layers.Conv2D(512, 3, activation='relu'),
        tf.keras.layers.Conv2D(512, 3, activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10)
    ])
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(),
        metrics=['accuracy'])
    model.summary()
    return model
strategy = tf.distribute.MirroredStrategy()
# strategy = tf.distribute.MirroredStrategy(devices=["/gpu:0", "/gpu:1"], cross_device_ops=tf.distribute.ReductionToOneDevice())

print(tf.__version__)
print(tf.config.list_physical_devices())
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

with strategy.scope():
    multi_worker_model = build_and_compile_cnn_model()

batch_size = 64
multi_worker_dataset = mnist_dataset(batch_size)
multi_worker_model = build_and_compile_cnn_model()
multi_worker_model.fit(multi_worker_dataset, epochs=10, steps_per_epoch=250)
Output from tf.config.list_physical_devices() and strategy.num_replicas_in_sync.
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
Number of devices: 2
But only one GPU gets used.
[Screenshot: GPU usage and temperature during the multi-GPU test]
To test whether both GPUs are working and accessible by TensorFlow, I tried this code.
with tf.device('/gpu:0'):
    batch_size = 64
    single_worker_dataset = mnist_dataset(batch_size)
    single_worker_model = build_and_compile_cnn_model()
    single_worker_model.fit(single_worker_dataset, epochs=5, steps_per_epoch=250)

with tf.device('/gpu:1'):
    batch_size = 64
    single_worker_dataset = mnist_dataset(batch_size)
    single_worker_model = build_and_compile_cnn_model()
    single_worker_model.fit(single_worker_dataset, epochs=5, steps_per_epoch=250)
Runs as expected: first GPU:0 is used and then GPU:1. No error. Everything is fine.
[Screenshot: GPU usage and temperature during the single-GPU test]
In TensorFlow 1 I was able to use multiple GPUs with keras.utils' multi_gpu_model and the same setup.
Any idea what the problem could be?
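One thing worth checking in the snippet above: multi_worker_model is first created inside strategy.scope(), but then immediately rebuilt outside the scope, and it is that second, unscoped model that fit() trains. MirroredStrategy only mirrors variables across GPUs when they are created under the scope. A minimal sketch with the duplicate call removed, assuming it was unintentional:

strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

with strategy.scope():
    # variables must be created inside the scope to be mirrored across GPUs
    multi_worker_model = build_and_compile_cnn_model()

batch_size = 64
multi_worker_dataset = mnist_dataset(batch_size)
multi_worker_model.fit(multi_worker_dataset, epochs=10, steps_per_epoch=250)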

Good training and validation accuracy but poor confusion matrix

I have trained my model to detect normal vs. pneumonia chest x-ray classes. This is my dataset pipeline, as listed below:
train_batch = ImageDataGenerator(preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=train_path, target_size=(224, 224),
                         classes=['NORMAL', 'PNEUMONIA'],
                         batch_size=32, class_mode='categorical')
val_batch = ImageDataGenerator(preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=val_path, target_size=(224, 224),
                         classes=['NORMAL', 'PNEUMONIA'],
                         batch_size=32, class_mode='categorical')
test_batch = ImageDataGenerator(preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=test_path, target_size=(224, 224),
                         classes=['NORMAL', 'PNEUMONIA'],
                         batch_size=16, class_mode='categorical', shuffle=False)
Found 3616 images belonging to 2 classes. #training
Found 1616 images belonging to 2 classes. #validation
Found 624 images belonging to 2 classes. #test
My model consists of 5 CNN layers with input images of shape (224, 224, 3), starting with 16 feature maps in the first layer and then 32, 64, 128, 256. Batch normalization, max pooling and dropout are added to every CNN layer; the last dense layer is as follows:
model.add(Dense(units=2, activation='softmax'))

optim = Adam(lr=0.001)
model.compile(optimizer=optim, loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit_generator(train_batch,
                              steps_per_epoch=113,   # 3616/32 = 113
                              epochs=25,
                              validation_data=val_batch,
                              validation_steps=51    # 1616/32 = 51
                              # verbose=2
                              # callbacks=callbacks  # removed to check
                              )
As can be seen in the graph, my training and validation accuracy and loss look good, but when I plot the confusion matrix it does not look good. Why?
prediction = model.predict_generator(test_batch, steps=stepss)  # , verbose=0)
prediction1 = np.argmax(prediction, axis=1)
cm = confusion_matrix(test_batch.classes, prediction1)
print(cm)
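The variable stepss is not defined in the question; if it does not cover the test set exactly once, the rows of prediction will be misaligned with test_batch.classes and the confusion matrix will look far worse than the model really is. A hedged sketch of computing it so that all 624 test images are predicted exactly once:

import numpy as np

# 624 test images / batch_size 16 = 39 full steps
stepss = int(np.ceil(test_batch.samples / test_batch.batch_size))

test_batch.reset()  # restart the generator from the first image
prediction = model.predict_generator(test_batch, steps=stepss)
prediction1 = np.argmax(prediction, axis=1)
cm = confusion_matrix(test_batch.classes, prediction1)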
This is my confusion matrix:
And this is my graph:
After that I fine-tuned my model with VGG16, replacing the last dense layer with my own dense layer with two outputs. Here are the resulting graph and confusion matrix:
I do not understand why my test results are poor even with the VGG16 model, as you can see from the results. Please give me your valuable suggestions. Thanks!

Tensorflow 1 vs Tensorflow 2 Keras Inference Speed Differ by 2+ times

I'm trying to figure out the reason behind the speed difference between two models.
An LSTM RNN model built using TensorFlow 1.x:
self.input_placeholder = tf.placeholder(
    tf.int32, shape=[self.config.batch_size, self.config.num_steps], name='Input')
self.labels_placeholder = tf.placeholder(
    tf.int32, shape=[self.config.batch_size, self.config.num_steps], name='Target')

embedding = tf.get_variable(
    'Embedding', initializer=self.embedding_matrix, trainable=False)
inputs = tf.nn.embedding_lookup(embedding, self.input_placeholder)
inputs = [tf.squeeze(x, axis=1) for x in tf.split(inputs, self.config.num_steps, axis=1)]

self.initial_state = tf.zeros([self.config.batch_size, self.config.hidden_size])
lstm_cell = tf.contrib.rnn.BasicLSTMCell(self.config.hidden_size)
rnn_outputs, _ = tf.contrib.rnn.static_rnn(
    lstm_cell, inputs, dtype=tf.float32,
    sequence_length=[self.config.num_steps] * self.config.batch_size)

with tf.variable_scope('Projection'):
    proj_U = tf.get_variable('Matrix', [self.config.hidden_size, self.config.vocab_size])
    proj_b = tf.get_variable('Bias', [self.config.vocab_size])
    outputs = [tf.matmul(o, proj_U) + proj_b for o in rnn_outputs]
The same model (at least to my understanding) built using TensorFlow 2.0 Keras:
def setup_model():
    model = Sequential()
    model.add(Embedding(input_dim=vocab_size,
                        output_dim=embedding_dim,
                        weights=[embedding_matrix],
                        input_length=4,
                        trainable=False))
    model.add(LSTM(config.hidden_size, activation="tanh"))
    model.add(Dense(vocab_size, activation="softmax"))
    return model
The architecture is:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 4, 100) 55400
_________________________________________________________________
lstm (LSTM) (None, 100) 80400
_________________________________________________________________
dense (Dense) (None, 554) 55954
=================================================================
Total params: 191,754
Trainable params: 136,354
Non-trainable params: 55,400
_________________________________________________________________
I was expecting similar inference runtimes, but the model built with TensorFlow 1.x is much faster. I tried to convert the TensorFlow 1.x model to TensorFlow 2 using only native TensorFlow functions, but I had trouble converting due to the big changes from 1.x to 2, and I was only able to recreate it with tf.keras.
In terms of speed, since I'm using both for generating text sequences plus getting word probabilities, I don't have a single-inference time difference to report (I can't modify the existing API of the TensorFlow 1.x model to get this). But in general, I'm seeing at least a 2x difference in time in my use cases.
What could be the possible reasons behind this difference in inference speed? I'm happy to provide more information if needed.
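One common source of exactly this kind of gap is per-call overhead rather than the model's math: in TF2, each model.predict() call builds and tears down iteration machinery, which dominates when the model is invoked once per generated token on tiny batches. Calling the model directly, optionally through a tf.function-compiled wrapper, usually removes most of that overhead. A minimal sketch, assuming model is the Keras model returned by setup_model():

import tensorflow as tf

# compile the forward pass into a graph once, then reuse it per token
@tf.function
def fast_predict(x):
    return model(x, training=False)

# usage in a generation loop: one small batch of 4 token ids per call
probs = fast_predict(tf.constant([[1, 2, 3, 4]]))  # shape (1, vocab_size)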
