I am trying to run this model, but I keep getting an error about the shape of the input data. I have played around with the shapes, but the errors persist.
Error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape (None, 32, 32, 3)
# Image size
img_width = 32
img_height = 32
# Define X as feature variable and Y as name of the class(label)
X = []
Y = []
for features, label in data_set:
    X.append(features)
    Y.append(label)
X = np.array(X).reshape(-1,img_width,img_height,3)
Y = np.array(Y)
print(X.shape) # Output :(4943, 32, 32, 3)
print(Y.shape) # Output :(4943,)
# Normalize the pixels
X = X/255.0
# Build the model
cnn = Sequential()
cnn.add(keras.Input(shape = (32,32,1)))
cnn.add(Conv2D(32, (3, 3), activation = "relu", input_shape = X.shape[1:]))
cnn.add(MaxPooling2D(pool_size = (2, 2)))
cnn.add(Conv2D(32, (3, 3), activation = "relu",input_shape = X.shape[1:]))
cnn.add(MaxPooling2D(pool_size = (2, 2)))
cnn.add(Conv2D(64, (3,3), activation = "relu",input_shape = X.shape[1:]))
cnn.add(MaxPooling2D(pool_size = (2,2)))
cnn.add(Flatten())
cnn.add(Dense(activation = "relu", units = 150))
cnn.add(Dense(activation = "relu", units = 50))
cnn.add(Dense(activation = "relu", units = 10))
cnn.add(Dense(activation = 'softmax', units = 1))
cnn.summary()
cnn.compile(loss = 'categorical_crossentropy',optimizer = 'adam',metrics = ['accuracy'])
# Model fit
cnn.fit(X, Y, epochs = 15)
I tried reading about this issue, but still didn't understand it very well.
Your input shape should be (32,32,3). Y is your label array; I assume it contains N unique integer values, where N is the number of classes. If N=2 you can treat this as a binary classification problem, in which case your code for the top layer should be
cnn.add(Dense(1, activation = 'sigmoid'))
and your compile call should be
cnn.compile(loss = 'binary_crossentropy',optimizer = 'adam',metrics = ['accuracy'])
If you have more than 2 classes, then your code should be
cnn.add(Dense(N, activation = 'softmax'))
cnn.compile(loss = 'sparse_categorical_crossentropy',optimizer = 'adam',metrics = ['accuracy'])
where N is the number of classes.
And change this line in your model definition (the last dimension) to:
cnn.add(keras.Input(shape = (32,32,3)))
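Putting it together, a minimal corrected sketch of your model (assuming N classes and integer labels, per the sparse_categorical_crossentropy setup; adjust N to your data):

from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

N = 10  # assumption: set this to your actual number of classes

cnn = keras.Sequential()
cnn.add(keras.Input(shape = (32,32,3)))  # 3 channels, matching X's shape
cnn.add(Conv2D(32, (3, 3), activation = "relu"))  # no input_shape args needed here
cnn.add(MaxPooling2D(pool_size = (2, 2)))
cnn.add(Conv2D(32, (3, 3), activation = "relu"))
cnn.add(MaxPooling2D(pool_size = (2, 2)))
cnn.add(Conv2D(64, (3, 3), activation = "relu"))
cnn.add(MaxPooling2D(pool_size = (2, 2)))
cnn.add(Flatten())
cnn.add(Dense(activation = "relu", units = 150))
cnn.add(Dense(activation = "relu", units = 50))
cnn.add(Dense(activation = 'softmax', units = N))  # N output units, not 1
cnn.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
cnn.fit(X, Y, epochs = 15)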
Related
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
image_size = (180, 180)
batch_size = 32
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "PetImages",
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "PetImages",
    validation_split=0.2,
    subset="validation",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
    ]
)
train_ds = train_ds.prefetch(buffer_size=32)
val_ds = val_ds.prefetch(buffer_size=32)
def make_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    # Image augmentation block
    x = data_augmentation(inputs)
    # Entry block
    x = layers.Rescaling(1.0 / 255)(x)
    x = layers.Conv2D(32, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(64, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    previous_block_activation = x  # Set aside residual
    for size in [128, 256, 512, 728]:
        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
        # Project residual
        residual = layers.Conv2D(size, 1, strides=2, padding="same")(
            previous_block_activation
        )
        x = layers.add([x, residual])  # Add back residual
        previous_block_activation = x  # Set aside next residual
    x = layers.SeparableConv2D(1024, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    if num_classes == 2:
        activation = "sigmoid"
        units = 1
    else:
        activation = "softmax"
        units = num_classes
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(units, activation=activation)(x)
    return keras.Model(inputs, outputs)
model = make_model(input_shape=image_size + (3,), num_classes=2)
keras.utils.plot_model(model, show_shapes=True)
epochs = 50
callbacks = [
    keras.callbacks.ModelCheckpoint("save_at_{epoch}.h5"),
]
model.compile(
    optimizer=keras.optimizers.Adam(1e-3),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
model.fit(
    train_ds, epochs=epochs, callbacks=callbacks, validation_data=val_ds,
)
So the strategy was to begin the model with the data_augmentation preprocessor, followed by a Rescaling layer, and to add a dropout layer before the final classification layer, as shown in the make_model function.
For training the model, as you can see, I set epochs=50 and used buffered prefetching for my input data, so batches are yielded from disk without I/O blocking (the pattern is sketched below). The rest of the parameters are pretty standard, nothing too complicated, but when I run my code each epoch takes approximately 40 minutes and I don't know why.
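For reference, this is the buffered-prefetching pattern I mean (a minimal sketch: my code above fixes buffer_size at 32, while tf.data.AUTOTUNE would let TensorFlow pick the buffer size itself):

import tensorflow as tf

# Overlap input preparation with training: while the accelerator works
# on batch N, the input pipeline loads batch N+1 from disk.
train_ds = train_ds.prefetch(buffer_size=tf.data.AUTOTUNE)
val_ds = val_ds.prefetch(buffer_size=tf.data.AUTOTUNE)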
Any suggestions?
I have about 17 soil variables that I'd like to correlate, along with elevation, temperature, and rainfall, against species richness and abundance. I have 39 plots (rows), and the columns contain environmental variables such as elevation, abundance, species richness, temperature, and rainfall, followed by the list of soil variables (17 columns). Below is my script.
Is there a problem with my script, or is it a compatibility issue with the Mac laptop I am using? Please help. Thanks.
After running the code, I get this error:
Error in stop_if_high_cardinality(data, columns, cardinality_threshold) :
Column 'pH' has more levels (24) than the threshold (15) allowed.
Please remove the column or increase the 'cardinality_threshold' parameter. Increasing the cardinality_threshold may produce long processing times
GGally::ggpairs(
  na.omit(nfi_nontree_soilclim_data[, c(11:18)]),
  upper = list(
    continuous = wrap(
      custom_ggally_cor,
      method = "spearman", exact = FALSE,
      size = 2.5, col = "black", family = "serif", digits = 2
    ),
    combo = "box_no_facet", discrete = "count", na = "na"
  ),
  lower = list(
    continuous = wrap(
      ggally_smooth,
      method = "loess", formula = y ~ x,
      se = FALSE, lwd = 3, col = "red", shrink = TRUE
    ),
    combo = "facethist", discrete = "facetbar", na = "na"
  ),
  diag = list(
    continuous = wrap(
      ggally_densityDiag,
      col = "darkgrey", lwd = .1,
      stat = "density", fill = "darkgrey"
    ),
    na = "naDiag"
  ),
  axisLabels = c("show")
) + theme_bw() + theme(
  text = element_text(family = "serif", size = 4),
  axis.text = element_text(family = "serif", size = 4),
  panel.grid = element_blank()
)
This error is a built-in stop because the default parameter only allows 15 levels of a variable to be displayed in one graph. You have 24 levels for one of your variables, so you can either raise the cardinality_threshold parameter to 24 or set it to NULL. NULL may be more generalizable if the number of levels isn't always 24. In general, though, depicting that many levels at once is discouraged, which is why these stop-limits exist.
library(GGally)
data(iris)
# Create data that has a factor with more than 15 levels
iris$group = as.factor(sample(sample(letters, 16), 150, replace = TRUE))
# Just demonstrating that either entry can work
ggpairs(iris, cardinality_threshold = 16)
ggpairs(iris, cardinality_threshold = NULL)
I have trained a CNN+LSTM encoder-decoder model with attention, using different numbers of layers.
The problem I am facing is very strange to me: the validation loss fluctuates around 3.***, as the loss graphs below show. I tried two configurations: 3 CNN layers + 1 BLSTM layer at the encoder with 1 LSTM layer at the decoder, and 3 CNN layers + 2 BLSTM layers at the encoder with 1 LSTM layer at the decoder.
I have also tried weight decay values from 0.1 down to 0.000001, but I still get this type of loss graph. Note that the accuracy of the model is increasing on both the validation and train sets. How is it possible that the validation loss stays around 3 while accuracy is increasing? Can someone explain this?
Thanks
class Encoder(nn.Module):
    def __init__(self, height, width, enc_hid_dim, dec_hid_dim, dropout):
        super().__init__()
        self.height = height
        self.enc_hid_dim = enc_hid_dim
        self.width = width
        self.layer0 = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(8),
            nn.MaxPool2d(2, 2),
        )
        self.layer1 = nn.Sequential(
            nn.Conv2d(8, 32, kernel_size=(3, 3), stride=(1, 1), padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.MaxPool2d(2, 2),
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.MaxPool2d(2, 2),
        )
        self.rnn = nn.LSTM(self.height // 8 * 64, self.enc_hid_dim, bidirectional=True)
        self.fc = nn.Linear(enc_hid_dim * 2, dec_hid_dim)
        self.dropout = nn.Dropout(dropout)
        self.cnn_dropout = nn.Dropout(p=0.2)

    def forward(self, src, in_data_len, train):
        batch_size = src.shape[0]
        out = self.layer0(src)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.dropout(out)  # torch.Size([batch, channel, h, w])
        out = out.permute(3, 0, 2, 1)  # (width, batch, height, channels)
        out = out.contiguous()
        out = out.reshape(-1, batch_size, self.height // 8 * 64)  # (w, batch, height*channels)
        width = out.shape[0]
        src_len = in_data_len.numpy() * (width / self.width)
        src_len = src_len + 0.999  # in case of 0 length value from float to int
        src_len = src_len.astype('int')
        out = pack_padded_sequence(out, src_len.tolist(), batch_first=False)
        outputs, hidden_out = self.rnn(out)
        hidden = hidden_out[0]
        cell = hidden_out[1]
        # outputs: (t, b, f*2), hidden: (2, b, f)
        outputs, output_len = pad_packed_sequence(outputs, batch_first=False)
        hidden = torch.tanh(self.fc(torch.cat((hidden[-2, :, :], hidden[-1, :, :]), dim=1)))
        cell = torch.tanh(self.fc(torch.cat((cell[-2, :, :], cell[-1, :, :]), dim=1)))
        return outputs, hidden, cell, output_len

class Decoder(nn.Module):
    def __init__(self, output_dim, emb_dim, enc_hid_dim, dec_hid_dim, dropout, attention):
        super().__init__()
        self.output_dim = output_dim
        self.attention = attention
        self.embedding = nn.Embedding(output_dim, emb_dim)
        self.rnn = nn.LSTM((enc_hid_dim * 2) + emb_dim, dec_hid_dim)
        self.fc_out = nn.Linear((enc_hid_dim * 2) + dec_hid_dim + emb_dim, output_dim)
        self.dropout_layer = nn.Dropout(dropout)

    def forward(self, input, hidden, cell, encoder_outputs, train):
        input = torch.topk(input, 1)[1]
        embedded = self.embedding(input)
        if train:
            embedded = self.dropout_layer(embedded)
        embedded = embedded.permute(1, 0, 2)
        # embedded = [1, batch size, emb dim]
        a = self.attention(hidden, encoder_outputs)
        # a = [batch size, src len]
        a = a.unsqueeze(1)
        # a = [batch size, 1, src len]
        encoder_outputs = encoder_outputs.permute(1, 0, 2)
        # encoder_outputs = [batch size, src len, enc hid dim * 2]
        weighted = torch.bmm(a, encoder_outputs)
        weighted = weighted.permute(1, 0, 2)
        # weighted = [1, batch size, enc hid dim * 2]
        rnn_input = torch.cat((embedded, weighted), dim=2)
        output, hidden_out = self.rnn(rnn_input, (hidden.unsqueeze(0), cell.unsqueeze(0)))
        hidden = hidden_out[0]
        cell = hidden_out[1]
        assert (output == hidden).all()
        embedded = embedded.squeeze(0)
        output = output.squeeze(0)
        weighted = weighted.squeeze(0)
        prediction = self.fc_out(torch.cat((output, weighted, embedded), dim=1))
        return prediction, hidden.squeeze(0), cell.squeeze(0)
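As a quick sanity check on the encoder's tensor shapes, I can run something like this (a sketch with toy dimensions, height=64 and width=128, not my actual configuration; it assumes the usual torch import plus pack_padded_sequence/pad_packed_sequence from torch.nn.utils.rnn):

import torch

# Toy dimensions, chosen only so that height and width divide by 8
enc = Encoder(height=64, width=128, enc_hid_dim=128, dec_hid_dim=128, dropout=0.5)
src = torch.randn(4, 1, 64, 128)               # (batch, channels, height, width)
lengths = torch.tensor([128, 128, 128, 128])   # unpadded widths, all equal here
outputs, hidden, cell, out_len = enc(src, lengths, train=True)
print(outputs.shape)  # torch.Size([16, 4, 256]): width/8 steps, batch, enc_hid_dim*2
print(hidden.shape)   # torch.Size([4, 128]): batch, dec_hid_dim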
I've been training a U-Net for single-class small lesion segmentation and have been getting consistently volatile validation loss. I have about 20k images split 70/30 between training and validation sets, so I don't think the issue is too little data. I've tried shuffling and resplitting the sets a few times with no change in volatility, so I don't think the validation set is unrepresentative. I have tried lowering the learning rate with no effect on volatility. And I have tried a few loss functions (dice coefficient, focal Tversky, weighted binary cross-entropy). I'm using a decent amount of augmentation to avoid overfitting. I've also run through all my data (512x512 float64s with corresponding 512x512 int64 masks, both stored as numpy arrays) to double-check that the value ranges, dtypes, etc. aren't screwy, and I even removed any ROIs in the masks under 35 pixels in area, which I thought might be artifacts messing with the loss.
I'm using keras ImageDataGenerator.flow_from_directory. I was initially using zca_whitening and brightness_range augmentation, but I think this causes issues with flow_from_directory and the link between mask and image being lost, so I skipped it.
I've tried validation generators with and without shuffle=True. Batch size is 8.
Here's some of my code; happy to include more if it would help:
# loss
from keras.losses import binary_crossentropy
import keras.backend as K
import tensorflow as tf
epsilon = 1e-5
smooth = 1
def dsc(y_true, y_pred):
    smooth = 1.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    score = (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
    return score

def dice_loss(y_true, y_pred):
    loss = 1 - dsc(y_true, y_pred)
    return loss

def bce_dice_loss(y_true, y_pred):
    loss = binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)
    return loss

def confusion(y_true, y_pred):
    smooth = 1
    y_pred_pos = K.clip(y_pred, 0, 1)
    y_pred_neg = 1 - y_pred_pos
    y_pos = K.clip(y_true, 0, 1)
    y_neg = 1 - y_pos
    tp = K.sum(y_pos * y_pred_pos)
    fp = K.sum(y_neg * y_pred_pos)
    fn = K.sum(y_pos * y_pred_neg)
    prec = (tp + smooth) / (tp + fp + smooth)
    recall = (tp + smooth) / (tp + fn + smooth)
    return prec, recall

def tp(y_true, y_pred):
    smooth = 1
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pos = K.round(K.clip(y_true, 0, 1))
    tp = (K.sum(y_pos * y_pred_pos) + smooth) / (K.sum(y_pos) + smooth)
    return tp

def tn(y_true, y_pred):
    smooth = 1
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pred_neg = 1 - y_pred_pos
    y_pos = K.round(K.clip(y_true, 0, 1))
    y_neg = 1 - y_pos
    tn = (K.sum(y_neg * y_pred_neg) + smooth) / (K.sum(y_neg) + smooth)
    return tn

def tversky(y_true, y_pred):
    y_true_pos = K.flatten(y_true)
    y_pred_pos = K.flatten(y_pred)
    true_pos = K.sum(y_true_pos * y_pred_pos)
    false_neg = K.sum(y_true_pos * (1 - y_pred_pos))
    false_pos = K.sum((1 - y_true_pos) * y_pred_pos)
    alpha = 0.7
    return (true_pos + smooth) / (true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth)

def tversky_loss(y_true, y_pred):
    return 1 - tversky(y_true, y_pred)

def focal_tversky(y_true, y_pred):
    pt_1 = tversky(y_true, y_pred)
    gamma = 0.75
    return K.pow((1 - pt_1), gamma)
model = BlockModel((len(os.listdir(os.path.join(imageroot,'train_ct','train'))), 512, 512, 1),filt_num=16,numBlocks=4)
#model.compile(optimizer=Adam(learning_rate=0.001), loss=weighted_cross_entropy)
#model.compile(optimizer=Adam(learning_rate=0.001), loss=dice_coef_loss)
model.compile(optimizer=Adam(learning_rate=0.001), loss=focal_tversky)
train_mask = os.path.join(imageroot,'train_masks')
val_mask = os.path.join(imageroot,'val_masks')
model.load_weights(model_weights_path) #I'm initializing with some pre-trained weights from a similar model
data_gen_args_mask = dict(
    rotation_range=10,
    shear_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=[0.8, 1.2],
    horizontal_flip=True,
    # vertical_flip=True,
    fill_mode='nearest',
    data_format='channels_last'
)
data_gen_args = dict(**data_gen_args_mask)
image_datagen_train = ImageDataGenerator(**data_gen_args)
mask_datagen_train = ImageDataGenerator(**data_gen_args)#_mask)
image_datagen_val = ImageDataGenerator()
mask_datagen_val = ImageDataGenerator()
seed = 1
BS = 8
steps = int(np.floor((len(os.listdir(os.path.join(train_ct,'train'))))/BS))
print(steps)
val_steps = int(np.floor((len(os.listdir(os.path.join(val_ct,'val'))))/BS))
print(val_steps)
train_image_generator = image_datagen_train.flow_from_directory(
    train_ct,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
train_mask_generator = mask_datagen_train.flow_from_directory(
    train_mask,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
val_image_generator = image_datagen_val.flow_from_directory(
    val_ct,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
val_mask_generator = mask_datagen_val.flow_from_directory(
    val_mask,
    target_size=(512, 512),
    color_mode="grayscale",
    classes=None,
    class_mode=None,
    seed=seed,
    shuffle=True,
    batch_size=BS)
train_generator = zip(train_image_generator, train_mask_generator)
val_generator = zip(val_image_generator, val_mask_generator)
# make callback for checkpointing
plot_losses = PlotLossesCallback(skip_first=0,plot_extrema=False)
%matplotlib inline
filepath = os.path.join(versionPath, model_version + "_saved-model-{epoch:02d}-{val_loss:.2f}.hdf5")
if reduce:
    cb_check = [ModelCheckpoint(filepath, monitor='val_loss',
                                verbose=1, save_best_only=False,
                                save_weights_only=True, mode='auto', period=1),
                reduce_lr,
                plot_losses]
else:
    cb_check = [ModelCheckpoint(filepath, monitor='val_loss',
                                verbose=1, save_best_only=False,
                                save_weights_only=True, mode='auto', period=1),
                plot_losses]
# train model
history = model.fit_generator(train_generator, epochs=numEp,
                              steps_per_epoch=steps,
                              validation_data=val_generator,
                              validation_steps=val_steps,
                              verbose=1,
                              callbacks=cb_check,
                              use_multiprocessing=False)
And here's how my loss looks:
Another potentially relevant thing: I tweaked the flow_from_directory code a bit (added npy to the whitelist). But training loss looks fine, so I'm assuming the issue isn't here.
Two suggestions:
Switch to the classic validation data format (i.e. a numpy array) instead of using a generator; this ensures you use exactly the same validation data every epoch. If the validation curve changes after this, then something "random" in the validation generator was giving you different data at different epochs.
Use a fixed set of samples (100 or 1000 should be enough, without any data augmentation) for both training and validation. If everything goes well, you should see your network quickly overfit to this dataset, and your training and validation curves should look very similar. If not, debug your network.
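For the first suggestion, a minimal sketch of freezing the validation set into arrays, using the generator names from your code (this assumes the shared seed keeps images and masks aligned, as in your setup):

import numpy as np

# Draw the validation batches once, so the validation data never changes
val_imgs, val_masks = [], []
for _ in range(val_steps):
    val_imgs.append(next(val_image_generator))
    val_masks.append(next(val_mask_generator))
X_val = np.concatenate(val_imgs, axis=0)
y_val = np.concatenate(val_masks, axis=0)

history = model.fit_generator(train_generator, epochs=numEp,
                              steps_per_epoch=steps,
                              validation_data=(X_val, y_val),
                              verbose=1,
                              callbacks=cb_check)

If the loss curve smooths out with this change, the volatility was coming from the generator, not the model.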
I tried SGD, Adadelta, Adabound, and Adam; every optimizer gives me fluctuations in validation accuracy. I also tried all the activation functions in Keras, but I still get fluctuations in val_acc.
Training samples: 1352
Validation samples: 339
[plot: validation accuracy]
# first (and only) CONV => RELU => POOL block
inpt = Input(shape = input_shape)
x = Conv2D(32, (3, 3), padding = "same")(inpt)
x = Activation("swish")(x)
x = BatchNormalization(axis = channel_dim)(x)
x = MaxPooling2D(pool_size = (3, 3))(x)
# x = Dropout(0.25)(x)
# first CONV => RELU => CONV => RELU => POOL block
x = Conv2D(64, (3, 3), padding = "same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis = channel_dim)(x)
x = Conv2D(64, (3, 3), padding = "same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis = channel_dim)(x)
x = MaxPooling2D(pool_size = (2, 2))(x)
# x = Dropout(0.25)(x)
# second CONV => RELU => CONV => RELU => POOL Block
x = Conv2D(128, (3, 3), padding = "same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis = channel_dim)(x)
x = Conv2D(128, (3, 3), padding = "same")(x)
x = Activation("swish")(x)
x = BatchNormalization(axis = channel_dim)(x)
x = MaxPooling2D(pool_size = (2, 2))(x)
# x = Dropout(0.25)(x)
# first (and only) FC layer
x = Flatten()(x) # Change to GlobalMaxPooling2D
x = Dense(256, activation = 'swish')(x)
x = BatchNormalization(axis = channel_dim)(x)
x = Dropout(0.4)(x)
x = Dense(128, activation = 'swish')(x)
x = BatchNormalization()(x)
x = Dropout(0.4)(x)
x = Dense(64, activation = 'swish')(x)
x = BatchNormalization()(x)
x = Dropout(0.3)(x)
x = Dense(32, activation = 'swish')(x)
x = BatchNormalization()(x)
x = Dense(nc, activation = 'softmax')(x)
model = Model(inputs=inpt, outputs = x)
model.compile(loss = 'categorical_crossentropy', optimizer = 'sgd', metrics = ['accuracy'])
Your model may be too noise-sensitive; see this answer.
Based on the answer in the link and what I see from your model, your network may be too deep for the amount of data you have (a large model and not enough data ==> overfitting ==> noise sensitivity). I suggest using a simpler model as a sanity check.
The learning rate could also be a possible reason (as stated by Neb). You are using the default learning rate of sgd (which is 0.01, maybe too high). Try 1e-3 or below.
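For example, a minimal sketch of that change (assuming the tf.keras imports used elsewhere in the thread; 1e-3 is just the suggested starting point):

from tensorflow.keras.optimizers import SGD

# An explicit SGD instance instead of the 'sgd' string, so the
# learning rate can be lowered from the default 0.01 to 1e-3
model.compile(loss = 'categorical_crossentropy',
              optimizer = SGD(learning_rate = 1e-3),
              metrics = ['accuracy'])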