Huggingface transformers: training loss sometimes decreases really slowly (using Trainer) - sentiment-analysis

I'm fine-tuning a sentiment analysis model on news data. Since the simplest route is a Huggingface pre-trained model (roberta-base), I followed the Huggingface tutorial - https://huggingface.co/blog/sentiment-analysis-python - this one.
The custom input data is simple: there are 2 columns named 'text' and 'labels'. The 'text' column consists of news sentences, and 'labels' consists of '0' (40%) and '1' (60%). The data was then separated into train, eval, and test sets.
So this is the problem I ran into: 'eval_loss' never changes during training, but eval accuracy stays above 50%. Meanwhile the training loss is decreasing, so the model seems to have learned something. Maybe it stopped learning after the first epoch, or the best checkpoint was selected automatically - but I'm confused about what actually happened.
And this is the training code (without labeling code):
from datasets import load_dataset
from transformers import AutoTokenizer
from transformers import DataCollatorWithPadding
from transformers import AutoModelForSequenceClassification
import numpy as np
from datasets import load_metric
from transformers import set_seed

set_seed(42)

dataset = load_dataset('json',
                       data_files={'train': './data/labeled_news/labeled_news_heads_train.json',
                                   'eval': './data/labeled_news/labeled_news_heads_eval.json'},
                       field='data')

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)
train_dataset = tokenized_datasets["train"].shuffle(seed=42)
eval_dataset = tokenized_datasets["eval"].shuffle(seed=42)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

def compute_metrics(eval_pred):
    load_accuracy = load_metric("accuracy")
    load_f1 = load_metric("f1")
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    accuracy = load_accuracy.compute(predictions=predictions, references=labels)["accuracy"]
    f1 = load_f1.compute(predictions=predictions, references=labels)["f1"]
    return {"accuracy": accuracy, "f1": f1}

from transformers import TrainingArguments, Trainer, EarlyStoppingCallback

repo_name = "Direct_v1"

training_args = TrainingArguments(
    output_dir=repo_name,
    learning_rate=2e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=1,
    num_train_epochs=5,
    weight_decay=0.01,
    save_strategy="steps",
    evaluation_strategy='steps',
    eval_steps=250,
    save_steps=250,
    push_to_hub=False,
    save_total_limit=5,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

trainer.train()
And this is the result printed to the console:
Using custom data configuration default-e08b7987c7aa36c3
Reusing dataset json (/home/nvme20142249/.cache/huggingface/datasets/json/default-e08b7987c7aa36c3/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b)
100%|██████████| 2/2 [00:00<00:00, 315.56it/s]
Loading cached processed dataset at /home/nvme20142249/.cache/huggingface/datasets/json/default-e08b7987c7aa36c3/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b/cache-050035fb0e59db40.arrow
Loading cached processed dataset at /home/nvme20142249/.cache/huggingface/datasets/json/default-e08b7987c7aa36c3/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b/cache-2981b391c69b5e0c.arrow
Loading cached shuffled indices for dataset at /home/nvme20142249/.cache/huggingface/datasets/json/default-e08b7987c7aa36c3/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b/cache-26ea42ee0127a8d9.arrow
Loading cached shuffled indices for dataset at /home/nvme20142249/.cache/huggingface/datasets/json/default-e08b7987c7aa36c3/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b/cache-ef064a1251721c99.arrow
Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'roberta.pooler.dense.bias', 'roberta.pooler.dense.weight', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.bias', 'lm_head.layer_norm.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.out_proj.weight', 'classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The following columns in the training set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
/home/nvme20142249/PycharmProjects/StockPrediction/venv/lib/python3.8/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
***** Running training *****
Num examples = 10147
Num Epochs = 5
Instantaneous batch size per device = 24
Total train batch size (w. parallel, distributed & accumulation) = 24
Gradient Accumulation steps = 1
Total optimization steps = 2115
12%|█▏ | 250/2115 [02:04<15:33, 2.00it/s]The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
100%|██████████| 634/634 [00:14<00:00, 53.32it/s]
Saving model checkpoint to Direct_v1/checkpoint-250
Configuration saved in Direct_v1/checkpoint-250/config.json
{'eval_loss': 0.6686041951179504, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 14.2853, 'eval_samples_per_second': 44.381, 'eval_steps_per_second': 44.381, 'epoch': 0.59}
Model weights saved in Direct_v1/checkpoint-250/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-250/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-250/special_tokens_map.json
24%|██▎ | 500/2115 [04:28<14:23, 1.87it/s]The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
{'loss': 0.6803, 'learning_rate': 1.5271867612293146e-05, 'epoch': 1.18}
24%|██▎ | 500/2115 [04:43<14:23, 1.87it/s]
100%|██████████| 634/634 [00:15<00:00, 49.78it/s]
Saving model checkpoint to Direct_v1/checkpoint-500
Configuration saved in Direct_v1/checkpoint-500/config.json
{'eval_loss': 0.6686403751373291, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 15.0809, 'eval_samples_per_second': 42.04, 'eval_steps_per_second': 42.04, 'epoch': 1.18}
Model weights saved in Direct_v1/checkpoint-500/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-500/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-500/special_tokens_map.json
35%|███▌ | 750/2115 [06:56<11:30, 1.98it/s]
The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
35%|███▌ | 750/2115 [07:10<11:30, 1.98it/s]
100%|██████████| 634/634 [00:14<00:00, 51.95it/s]
Saving model checkpoint to Direct_v1/checkpoint-750
Configuration saved in Direct_v1/checkpoint-750/config.json
{'eval_loss': 0.6685948967933655, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 14.3642, 'eval_samples_per_second': 44.138, 'eval_steps_per_second': 44.138, 'epoch': 1.77}
Model weights saved in Direct_v1/checkpoint-750/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-750/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-750/special_tokens_map.json
47%|████▋ | 1000/2115 [09:18<09:18, 2.00it/s]
The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
{'loss': 0.6786, 'learning_rate': 1.054373522458629e-05, 'epoch': 2.36}
47%|████▋ | 1000/2115 [09:32<09:18, 2.00it/s]
100%|██████████| 634/634 [00:14<00:00, 52.47it/s]
Saving model checkpoint to Direct_v1/checkpoint-1000
Configuration saved in Direct_v1/checkpoint-1000/config.json
{'eval_loss': 0.6686900854110718, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 14.7566, 'eval_samples_per_second': 42.964, 'eval_steps_per_second': 42.964, 'epoch': 2.36}
Model weights saved in Direct_v1/checkpoint-1000/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-1000/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-1000/special_tokens_map.json
59%|█████▉ | 1250/2115 [11:40<07:14, 1.99it/s]
The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
59%|█████▉ | 1250/2115 [11:54<07:14, 1.99it/s]
100%|██████████| 634/634 [00:14<00:00, 52.63it/s]
Saving model checkpoint to Direct_v1/checkpoint-1250
Configuration saved in Direct_v1/checkpoint-1250/config.json
{'eval_loss': 0.6696870923042297, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 14.2725, 'eval_samples_per_second': 44.421, 'eval_steps_per_second': 44.421, 'epoch': 2.96}
Model weights saved in Direct_v1/checkpoint-1250/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-1250/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-1250/special_tokens_map.json
71%|███████ | 1500/2115 [14:01<05:09, 1.99it/s]
The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
{'loss': 0.6798, 'learning_rate': 5.815602836879432e-06, 'epoch': 3.55}
71%|███████ | 1500/2115 [14:16<05:09, 1.99it/s]
100%|██████████| 634/634 [00:14<00:00, 52.17it/s]
Saving model checkpoint to Direct_v1/checkpoint-1500
Configuration saved in Direct_v1/checkpoint-1500/config.json
{'eval_loss': 0.6706184148788452, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 14.5084, 'eval_samples_per_second': 43.699, 'eval_steps_per_second': 43.699, 'epoch': 3.55}
Model weights saved in Direct_v1/checkpoint-1500/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-1500/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-1500/special_tokens_map.json
Deleting older checkpoint [Direct_v1/checkpoint-250] due to args.save_total_limit
83%|████████▎ | 1750/2115 [16:25<03:03, 1.99it/s]
The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
83%|████████▎ | 1750/2115 [16:39<03:03, 1.99it/s]
100%|██████████| 634/634 [00:14<00:00, 50.95it/s]
Saving model checkpoint to Direct_v1/checkpoint-1750
Configuration saved in Direct_v1/checkpoint-1750/config.json
{'eval_loss': 0.6691468954086304, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 14.515, 'eval_samples_per_second': 43.679, 'eval_steps_per_second': 43.679, 'epoch': 4.14}
Model weights saved in Direct_v1/checkpoint-1750/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-1750/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-1750/special_tokens_map.json
Deleting older checkpoint [Direct_v1/checkpoint-500] due to args.save_total_limit
95%|█████████▍| 2000/2115 [18:48<00:58, 1.95it/s]
The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 634
Batch size = 1
{'loss': 0.6784, 'learning_rate': 1.087470449172577e-06, 'epoch': 4.73}
95%|█████████▍| 2000/2115 [19:04<00:58, 1.95it/s]
100%|██████████| 634/634 [00:15<00:00, 50.16it/s]
Saving model checkpoint to Direct_v1/checkpoint-2000
Configuration saved in Direct_v1/checkpoint-2000/config.json
{'eval_loss': 0.6719586253166199, 'eval_accuracy': 0.610410094637224, 'eval_f1': 0.7580803134182175, 'eval_runtime': 15.2941, 'eval_samples_per_second': 41.454, 'eval_steps_per_second': 41.454, 'epoch': 4.73}
Model weights saved in Direct_v1/checkpoint-2000/pytorch_model.bin
tokenizer config file saved in Direct_v1/checkpoint-2000/tokenizer_config.json
Special tokens file saved in Direct_v1/checkpoint-2000/special_tokens_map.json
Deleting older checkpoint [Direct_v1/checkpoint-750] due to args.save_total_limit
100%|██████████| 2115/2115 [20:05<00:00, 2.05it/s]
Training completed. Do not forget to share your model on huggingface.co/models =)
100%|██████████| 2115/2115 [20:05<00:00, 1.75it/s]
{'train_runtime': 1205.4397, 'train_samples_per_second': 42.088, 'train_steps_per_second': 1.755, 'train_loss': 0.6791386345035922, 'epoch': 5.0}
I think this is quite weird because the model seems to have learned something, yet eval_loss doesn't change while training. Does 'transformers.Trainer' select the best checkpoint automatically? I'm confused about whether this is an error or not.
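One detail worth noting: the frozen eval numbers match a model that always predicts class '1'. With about 61% positives in the eval set, constant-'1' predictions give accuracy 0.610 and F1 = 2·0.61/(0.61 + 1.0) ≈ 0.758, which is exactly what every evaluation above reports. Also, as far as I know, Trainer keeps the final weights at the end of training unless it is explicitly told to reload the best checkpoint; a minimal sketch of the relevant TrainingArguments:
training_args = TrainingArguments(
    output_dir=repo_name,
    evaluation_strategy='steps',
    eval_steps=250,
    save_strategy='steps',
    save_steps=250,
    load_best_model_at_end=True,        # reload the checkpoint with the best metric when training ends
    metric_for_best_model='eval_loss',  # what "best" means; eval loss here
)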
** edited on 4/25: I changed the compute_metrics function to
load_accuracy = load_metric("accuracy")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return load_accuracy.compute(predictions=predictions, references=labels)
and the training error decreased normally while training. I thought the problem was solved, but sometimes it doesn't: with the same dataset (different checkpoints), the training error didn't decrease. Why did this happen?
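When the same setup sometimes trains and sometimes collapses to the majority class, a plausible culprit (hedged, not verified on this data) is the known seed sensitivity of fine-tuning RoBERTa-sized models on small datasets. Lowering the learning rate and adding warmup are the standard knobs to try first; the values below are illustrative, not tuned:
training_args = TrainingArguments(
    output_dir=repo_name,
    learning_rate=1e-5,   # lower than 2e-5; high LRs can push the head into a constant prediction
    warmup_ratio=0.06,    # ramp the LR up over the first ~6% of steps
    num_train_epochs=5,
    per_device_train_batch_size=24,
    weight_decay=0.01,
)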

Related

Getting The Actual Video Label After Model.Predict Operations With 3DCNN Sequential Model

I have a challenge that I am trying to solve in order to move forward; it is the final piece of the puzzle for my model operations.
What I am trying to do:
is to verify which videos end up in the Xval_test variable after the split operation, as per the example in "In Python sklearn, how do I retrieve the names of samples/variables in test/training data?":
X_train, Xval_test, Y_train, Yval_test = train_test_split(
    X, Y, train_size=0.8, test_size=0.2, random_state=1, shuffle=True)
1.
What I tried:
is calling the name from the actual tag via the file_path name; however, that is not working. (Every time the code runs, the names are taken from the file path and not from the Xval_test variable produced by the split operation. This causes an issue during the model.fit() procedures, as it changes the 1D flattened tensor to (number of rows, 1 column).)
file_paths = []
for file_name in os.listdir(root):
    file_path = os.path.join(root, file_name)
    if os.path.isfile(file_path):
        file_paths.append(file_path)
print('**********************************************************')
print('ALL Directory File Paths Completed', file_paths)
I am not sure if the files are being shuffled properly by my weak attempt, as per the guidelines from the split() post. (Based on my knowledge, every time I run the code those files should be shuffled into a new Xval_test set relative to its specified 80:20 split parameter.)
2.
I tried calling model.predict(), but that presents no labels, which I was hoping it would (maybe I am using it the wrong way for calling the indices, I don't know).
my_pred = model.predict(Xval_test).argmax(axis=1)
I tried calling np.argmax(): (I KNOW THE TOTAL AMT OF FILES IN Xval_test is 16 based on the split())
Y_valpred = np.argmax(model.predict(Xval_test), axis=1) # model
This returns just the class label and not its contents, e.g. the classes in the datastore are folders containing (walking and fencing) rather than the actual video labels such as (walking0.avi…100 and fencing0.avi…100).
I am not certain of the operation to get the folder contents' tags, the actual file itself. It is this that I am trying to get from the X_test variable.
(Or maybe it's the wrong variable or function I am using; again, I am lacking the knowledge to understand this. Please assist so that I can move to the next stage.)
3.
I tried printing all of the variables from the previous operations to see where that name tag would be stored, and it is stored in the name variable below as per my operations. (But how do I carry these folder contents' file tags forward to the X_test variable or, as per my choice, into the model.predict() outputs in a column together with the other metrics? So far, this causes issues with the model.fit() function.)
for files3 in files2:
    name = os.path.join(namelist, files3)
    name1 = name.strip("./dataset/")
    name2 = name1.strip("Fencing/")
    name3 = name2.strip("Stabing/")
    name3 = name3.replace('.av', '')
    name4 = name3.split()
    # print("This is name1 ", name1)
    # name5 = pd.DataFrame({"vid_names": name4}).to_csv("results.csv")
    # name1 = name1.replace('[]', '')
    with open('vid_names.csv', 'a', newline='') as f:
        writer = csv.writer(f)
        writer = writer.writerow(name4)
    # print("My Video Names => ", name3)
3A.
Thank you in advance; I am grateful for any guidance provided. Please assist!
QUESTIONS: ############################################
Ques: 1.
Is it possible to see what video label tags are segmented within the X_Test Variable?
Ques: 1A.
If yes, may I request your guidance here, please, on how this can be done?:
I have been researching for weeks and cannot seem to get this sorted, your efforts would be greatly appreciated.
Ques: 2. MY Expected OUTCOME:
I am trying to access the prediction. So, in the end I would get an output with the actual video tag, indicating the actual video that was used in the prediction operation, along with its class tag (see below):
Initially, the model.predict() operation outputs numerical data relative to the class label.
I am trying to access the actual file label as well:
For example, what I want the predictions to look like is as follows:
X_test_labs Pred_labs Actual_File Pred_Score
0 Fencing Fencing fencing0.avi 0.99650866
1 Walking Fencing walking6.avi 0.9948837
2 Walking Walking walking21.avi 0.9967557
3 Fencing Fencing fencing32.avi 0.9930409
4 Walking Fencing walking43.avi 0.9961387
5 Walking Walking walking48.avi 0.6467387
6 Walking Walking walking50.avi 0.5465369
7 Walking Walking walking9.avi 0.3478027
8 Fencing Fencing fencing22.avi 0.1247543
9 Fencing Fencing fencing46.avi 0.7477777
10 Walking Walking walking37.avi 0.8499399
11 Fencing Fencing fencing19.avi 0.8887722
12 Walking Walking walking12.avi 0.7775351
13 Fencing Fencing fencing33.avi 0.4323323
14 Fencing Fencing fencing51.avi 0.7812434
15 Fencing Fencing fencing8.avi 0.8723476
I am not sure how to achieve this task; this one is a little more tricky for me than anticipated.
This is my code:
'''*******Load Dependencies********'''
from keras.regularizers import l2
from keras.layers import Dense
from keras_tqdm import TQDMNotebookCallback
from tqdm.keras import TqdmCallback
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import math
import tensorflow as tf
from tqdm import tqdm
import videoto3d
import seaborn as sns
import scikitplot as skplt
from sklearn import preprocessing
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, classification_report, confusion_matrix
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score
from keras.utils.vis_utils import plot_model
from keras.utils import np_utils
from tensorflow.keras.optimizers import Adam
from keras.models import Sequential
from keras.losses import categorical_crossentropy
from keras.layers import (Activation, Conv3D, Dense, Dropout, Flatten,MaxPooling3D)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import argparse
import time
import sys
import openpyxl
import os
import re
import csv
from keras import models
import cv2
import pickle
import glob
from numpy import load
np.seterr(divide='ignore', invalid='ignore')
print('**********************************************************')
print('Graphical Representation Of Accuracy & Validation Results Completed')
def plot_history(history, result_dir):
    plt.plot(history.history['val_accuracy'], marker='.')
    plt.plot(history.history['accuracy'], marker='.')
    plt.title('model accuracy')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.grid()
    plt.legend(['Val_acc', 'Test_acc'], loc='lower right')
    plt.savefig(os.path.join(result_dir, 'model_accuracy.png'))
    plt.close()
    plt.plot(history.history['val_loss'], marker='.')
    plt.plot(history.history['loss'], marker='.')
    plt.title('model Loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.grid()
    plt.legend(['Val_loss', 'Test_loss'], loc='upper right')
    plt.savefig(os.path.join(result_dir, 'model_loss.png'))
    plt.close()

# Saving History Accuracy & Validation Accuracy Results To Directory
print('**********************************************************')
print('Generating History Accuracy Results Completed')
def save_history(history, result_dir):
    loss = history.history['loss']
    acc = history.history['accuracy']
    val_loss = history.history['val_loss']
    val_acc = history.history['val_accuracy']
    nb_epoch = len(acc)
    # Creating The Results File To Directory = Store Results
    print('**********************************************************')
    print('Saving History Accuracy Results To Directory Completed')
    with open(os.path.join(result_dir, 'result.txt'), 'w') as fp:
        fp.write('epoch\tloss\tacc\tval_loss\tval_acc\n')
        # print(fp)
        for i in range(nb_epoch):
            fp.write('{}\t{}\t{}\t{}\t{}\n'.format(
                i, loss[i], acc[i], val_loss[i], val_acc[i]))

print('**********************************************************')
print('Loading All Specified Video Data Samples From Directory Completed')
def loaddata(video_dir, vid3d, nclass, result_dir, color=False, skip=True):
    files = os.listdir(video_dir)
    with open('files.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerow(files)
    root = '/Users/symbadian/3DCNN_latest_Version/3DCNNtesting/dataset/'
    dirlist = [item for item in os.listdir(root)
               if os.path.isdir(os.path.join(root, item))]
    print('Get the filesname and path')
    print('DIRLIST Directory Completed', dirlist)
    file_paths = []
    for file_name in os.listdir(root):
        file_path = os.path.join(root, file_name)
        if os.path.isfile(file_path):
            file_paths.append(file_path)
    print('**********************************************************')
    print('ALL Directory File Paths Completed', file_paths)
    roots, dirsy, fitte = next(os.walk(root), ([], [], []))
    print('**********************************************************')
    print('ALL Directory ROOTED', roots, fitte, dirsy)
    X = []
    print('X labels==>', X)  # This stores all variable data in an object format
    labellist = []
    pbar = tqdm(total=len(files))  # generate progress bar for file processing
    print('**********************************************************')
    print('Generating/Join Class Labels For Video Dataset For Input Completed')
    # Accessing files and labels from dataset directory
    for filename in files:
        pbar.update(1)
        if filename == '.DS_Store':  # .DS_Store
            continue
        namelist = os.path.join(video_dir, filename)
        files2 = os.listdir(namelist)
        ###############################################################################
        ######### NEEDS TO FIX THIS Data Adding To CSV Rather Than REWRITING #########
        for files3 in files2:
            name = os.path.join(namelist, files3)
            # Call a function that extracts the frame details of all file names
            label = vid3d.get_UCF_classname(filename)
            if label not in labellist:
                if len(labellist) >= nclass:
                    continue
                labellist.append(label)
            # This X variable is the point where the labels are stored (I think??!?!)
            X.append(vid3d.video3d(name, color=color, skip=skip))
    pbar.close()
    # generating labellist / writing to directory
    print('******************************************************')
    print('Saving All Class Labels For Referencing To Directory Completed')
    with open(os.path.join(result_dir, 'classes.txt'), 'w') as fp:
        for i in range(len(labellist)):
            # print('These are labellist i classes', i)  # Not This
            fp.write('{}\n'.format(labellist[i]))
    # print('These are my labels: ==>', mylabel)
    for num, label in enumerate(labellist):
        for i in range(len(labels)):
            if label == labels[i]:
                labels[i] = num
                # print('This is labels i', labels[i])  # Not this
    if color:  # conforming image channels of image for input sequence
        return np.array(X).transpose((0, 2, 3, 4, 1)), labels
    else:
        return np.array(X).transpose((0, 2, 3, 1)), labels

print('**********************************************************')
print('Generating Args Informative Messages/ Tuning Parameters Options Completed')
def main():
    parser = argparse.ArgumentParser(description='A 3D Convolution Model For Action Recognition')
    parser.add_argument('--batch', type=int, default=130)
    parser.add_argument('--epoch', type=int, default=100)
    parser.add_argument('--videos', type=str, default='dataset', help='Directory Where Videos Are Stored')  # UCF101
    parser.add_argument('--nclass', type=int, default=2)
    parser.add_argument('--output', type=str, required=True)
    parser.add_argument('--color', type=bool, default=False)
    parser.add_argument('--skip', type=bool, default=True)
    parser.add_argument('--depth', type=int, default=10)
    args = parser.parse_args()
    # print('This is the Option Arguments ==>', args)
    print('**********************************************************')
    print('Specifying Input Size and Channels Completed')
    img_rows, img_cols, frames = 32, 32, args.depth
    channel = 3 if args.color else 1
    print('**********************************************************')
    print('Saving Dataset As NPZ To Directory Completed')
    fname_npz = 'dataset_{}_{}_{}.npz'.format(args.nclass, args.depth, args.skip)
    vid3d = videoto3d.Videoto3D(img_rows, img_cols, frames)
    nb_classes = args.nclass
    # loading the data
    if os.path.exists(fname_npz):
        loadeddata = np.load(fname_npz)
        X, Y = loadeddata["X"], loadeddata["Y"]
    else:
        x, y = loaddata(args.videos, vid3d, args.nclass, args.output, args.color, args.skip)
        X = x.reshape((x.shape[0], img_rows, img_cols, frames, channel))
        Y = np_utils.to_categorical(y, nb_classes)
        X = X.astype('float32')
        # save npz data to file
        np.savez(fname_npz, X=X, Y=Y)
        print('Saved Dataset To dataset.npz. Completed')
    print('X_shape:{}\nY_shape:{}'.format(X.shape, Y.shape))
    print('**********************************************************')
    print('Initialise Model Layers & Layer Parameters Completed')
    # Sequential groups a linear stack of layers into a tf.keras.Model.
    # Sequential provides training and inference features on this model
    model = Sequential()
    model.add(Conv3D(32, kernel_size=(3, 3, 3), input_shape=(X.shape[1:]), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv3D(32, kernel_size=(3, 3, 3), padding='same'))
    model.add(MaxPooling3D(pool_size=(3, 3, 3), padding='same'))
    model.add(Conv3D(64, kernel_size=(3, 3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv3D(64, kernel_size=(3, 3, 3), padding='same'))
    model.add(MaxPooling3D(pool_size=(3, 3, 3), padding='same'))
    model.add(Conv3D(128, kernel_size=(3, 3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv3D(128, kernel_size=(3, 3, 3), padding='same'))
    model.add(MaxPooling3D(pool_size=(3, 3, 3), padding='same'))
    model.add(Dropout(0.5))
    model.add(Conv3D(256, kernel_size=(3, 3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv3D(256, kernel_size=(3, 3, 3), padding='same'))
    model.add(MaxPooling3D(pool_size=(3, 3, 3), padding='same'))
    model.add(Dropout(0.5))
    model.add(Flatten())
    # Dense function to convert FCL to 512 values
    model.add(Dense(512, activation='sigmoid'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes, activation='softmax'))
    model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['accuracy'])
    model.summary()
    print('this is the model shape')
    model.output_shape
    plot_model(model, show_shapes=True, to_file=os.path.join(args.output, 'model.png'))
    print('**********************************************************')
    print("Train Test Method HoldOut Performance")
    X_train, Xval_test, Y_train, Yval_test = train_test_split(
        X, Y, train_size=0.8, test_size=0.2, random_state=1, stratify=Y, shuffle=True)
    print('**********************************************************')
    print('Deploying Data Fitting/ Performance Accuracy Guidance Completed')
    # Stop operations when experiencing no learning
    rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1,
                                                  mode='auto', min_delta=0.0001, cooldown=1,
                                                  min_lr=0.0001)
    # Fit the training data
    history = model.fit(X_train, Y_train, validation_split=0.20, batch_size=args.batch,
                        epochs=args.epoch, verbose=1, callbacks=[rlronp], shuffle=True)
    # Predict X_Test (Xval_test) data and Labels
    predict_labels = model.predict(Xval_test, batch_size=args.batch, verbose=1,
                                   use_multiprocessing=True)
    classes = np.argmax(predict_labels, axis=1)
    label = np.argmax(Yval_test, axis=1)
    print('This the BATCH size', args.batch)
    print('This the DEPTH size', args.depth)
    print('This the EPOCH size', args.epoch)
    print('This the TRAIN SPLIT size', len(X_train))
    print('This the TEST SPLIT size', len(Xval_test))
    # https://stackoverflow.com/questions/52261597/keras-model-fit-verbose-formatting
    # A json file enhances the model performance by a simple to save/load model
    model_json = model.to_json()
    if not os.path.isdir(args.output):
        os.makedirs(args.output)
    with open(os.path.join(args.output, 'ucf101_3dcnnmodel.json'), 'w') as json_file:
        json_file.write(model_json)
    # hd5 contains multidimensional arrays of scientific data
    model.save_weights(os.path.join(args.output, 'ucf101_3dcnnmodel.hd5'))
    ''' Evaluation is a process
    '''
    print('**********************************************************')
    print('Displaying Test Loss & Test Accuracy Completed')
    loss, acc = model.evaluate(Xval_test, Yval_test, verbose=2, batch_size=args.batch,
                               use_multiprocessing=True)  # verbose 0
    print('this is args output', args.output)
    plot_history(history, args.output)
    save_history(history, args.output)
    print('**********************************************************')
    # Generating Picture Of Confusion matrix
    print('**********************************************************')
    print('Generating CM InputData/Classification Report Completed')
    # Ground truth (correct) target values.
    y_valtest_arg = np.argmax(Yval_test, axis=1)
    # Estimated targets as returned by a classifier
    Y_valpred = np.argmax(model.predict(Xval_test), axis=1)  # model
    print('y_valtest_arg Shape is ==>', y_valtest_arg.shape)
    print('Y_valpred Shape is ==>', Y_valpred.shape)
    print('**********************************************************')
    print('Classification_Report On Model Performance Completed==')
    print(classification_report(y_valtest_arg.round(), Y_valpred.round(),
                                target_names=filehandle, zero_division=1))
    '''Initiate Confusion Matrix'''
    # print('Model Confusion Matrix Per Test Data Completed===>')
    cm = confusion_matrix(y_valtest_arg, Y_valpred, normalize=None)
    print('Display Confusion Matrix ===>', cm)
    print('**********************************************************')
    print('Model Overall Accuracy')
    print('Model Test loss:', loss)
    print('**********************************************************')
    print('Model Test accuracy:', acc)
    print('**********************************************************')

if __name__ == '__main__':
    main()
I think the solution lies around the prediction, the train/test split, and the evaluation arguments. However, I am lacking the knowledge to access the details required from train_test_split(), if that's where the issue is.
I am really thankful for your guidance in advance. Thanks a whole lot for clarifying this for me and closing the gaps in my understanding; I really appreciate it!
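One way to get the table shown in the expected outcome above is to split index positions alongside X and Y, then use those positions to look up file names collected at load time. A minimal sketch, assuming a file_names list with one entry per sample in X, in the same order (that list is not built by the code above; it is an assumption here):
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# indices travel through the same shuffle as X and Y
idx = np.arange(len(X))
X_train, Xval_test, Y_train, Yval_test, idx_train, idx_test = train_test_split(
    X, Y, idx, train_size=0.8, test_size=0.2, random_state=1, stratify=Y, shuffle=True)

pred = model.predict(Xval_test)
class_names = ['Fencing', 'Walking']  # hypothetical; order must match the label encoding
results = pd.DataFrame({
    'X_test_labs': [class_names[i] for i in np.argmax(Yval_test, axis=1)],
    'Pred_labs': [class_names[i] for i in np.argmax(pred, axis=1)],
    'Actual_File': [file_names[i] for i in idx_test],  # file_names is assumed, see above
    'Pred_Score': pred.max(axis=1),
})
print(results)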

Change all images in training set

I have a convolutional neural network, and I want to train it on images from the training set, but first they should be wrapped with my function change(tensor, float), which takes in a tensor/image of the form [height, width, 3] and a float.
batch_size = 4
# loading data
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)
# CNN architecture
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        # size of inputs [4, 3, 32, 32]
        # size of labels [4]
        inputs = change(inputs, 0.1)  # <----------------------------
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)  # [4, 10]
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')
I am trying to apply the image function change, but it gives an object error.
Is there a quick way to fix it?
I am using a Julia function, but it works completely fine with other objects. Error message:
JULIA: MethodError: no method matching copy(::PyObject)
Closest candidates are:
copy(!Matched::T) where T<:SHA.SHA3_CTX at /opt/julia-1.7.2/share/julia/stdlib/v1.7/SHA/src/types.jl:213
copy(!Matched::T) where T<:SHA.SHA2_CTX at /opt/julia-1.7.2/share/julia/stdlib/v1.7/SHA/src/types.jl:212
copy(!Matched::Number) at /opt/julia-1.7.2/share/julia/base/number.jl:113
I would recommend putting the change function into the transforms list, so you apply the data changes at the transformation stage.
partial from functools will help you fix the number of arguments, like this:
from functools import partial

def change(input, float):
    pass

# Use partial to fix the number of params, such that change accepts only input
change_partial = partial(change, float=pass_float_value_here)

# Add change_partial to the list of transforms before or after converting to tensors
transforms = Compose([
    RandomResizedCrop(img_size),  # example
    # Add change_partial here if it operates on PIL Images
    change_partial,
    ToTensor(),  # convert to tensor
    # Add change_partial here if it operates on torch tensors
    change_partial,
])
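One wrinkle worth noting as a hedged aside: in the training loop above, inputs is a whole batch of shape [4, 3, 32, 32], while change is described as taking a single [height, width, 3] image, which may itself be the source of the error. Inside a transform pipeline each sample arrives as a [3, H, W] tensor, so a small adapter can permute the axes around the call (change and the 0.1 value come from the question; change_hw3 is a made-up name for illustration):
from functools import partial

import torchvision
import torchvision.transforms as T

def change_hw3(img, amount):
    # img arrives as [3, H, W] after ToTensor(); change expects [H, W, 3]
    return change(img.permute(1, 2, 0), amount).permute(2, 0, 1)

transform = T.Compose([
    T.ToTensor(),                               # PIL image -> [3, 32, 32] float tensor
    T.Lambda(partial(change_hw3, amount=0.1)),  # apply change per sample, not per batch
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)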

Saving bert model at every epoch for further training

I am using bert_model.save_pretrained to save the model at the end, since this is the command that saves the model with all configurations and weights, but it cannot be used inside model.fit: the ModelCheckpoint callback saves the model at each epoch, but not with save_pretrained. Can anybody help me save the bert model at each epoch, since I cannot train the whole bert model in one go?
Edit
Code for loading pre trained bert model
bert_model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=num_classes)
Code for compiling the bert model
from tensorflow.keras import optimizers

bert_model.compile(loss='categorical_crossentropy',
                   optimizer=optimizers.Adam(learning_rate=0.00005),
                   metrics=['accuracy'])
bert_model.summary()
Code for training and saving the bert model
checkpoint_filepath_1 = 'callbacks_models/BERT1.{epoch:02d}-{val_loss:.2f}.h5'
checkpoint_filepath_2 = 'callbacks_models/complete_best_BERT_model_1.h5'

callbacks_1 = ModelCheckpoint(
    filepath=checkpoint_filepath_1,
    monitor='val_loss',
    mode='min',
    save_best_only=False,
    save_weights_only=False,
    save_freq='epoch')

callbacks_2 = ModelCheckpoint(
    filepath=checkpoint_filepath_2,
    monitor='val_loss',
    mode='min',
    save_best_only=True)

es = EarlyStopping(monitor='val_loss', mode='min', verbose=1,
                   patience=5)

hist = bert_model.fit([train1_input_ids, train1_attention_masks],
                      y_train1, batch_size=16, epochs=1,
                      validation_data=([val_input_ids, val_attention_masks], y_val),
                      callbacks=[es, callbacks_1, callbacks_2, history_logger])

min_val_score = min(hist.history['val_loss'])
print("\nMinimum validation loss = ", min_val_score)

bert_model.save_pretrained("callbacks_models/Complete_BERT_model_1.h5")
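A hedged sketch of one way around this: Keras lets you subclass Callback, and inside a callback self.model is the Huggingface model itself, so save_pretrained can be called at the end of every epoch (the class name and output directory below are made up for illustration):
import tensorflow as tf

class SavePretrainedCallback(tf.keras.callbacks.Callback):
    """Calls save_pretrained at the end of every epoch."""
    def __init__(self, output_dir):
        super().__init__()
        self.output_dir = output_dir

    def on_epoch_end(self, epoch, logs=None):
        # self.model is the TFAutoModelForSequenceClassification instance
        self.model.save_pretrained(f"{self.output_dir}/epoch_{epoch + 1:02d}")

# then pass it alongside the other callbacks:
# bert_model.fit(..., callbacks=[es, callbacks_1, callbacks_2, SavePretrainedCallback('callbacks_models')])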

Number of features of the model must match the input. Model n_features is 51 and input n_features is 55 error with BERT tokenizer

I am working on a classification model. I have a Description column in my data on which I am using BERT tokenization.
def tokenization_and_encoding(data, model_name, independent_col, target_col):
    tokenizer = BertTokenizerFast.from_pretrained(model_name, do_lower_case=True)
    train_text = list(data[independent_col])
    train_labels = list(data[target_col])
    train_encodings = tokenizer(train_text, truncation=True, padding=True, max_length=256)
    train_encodings = train_encodings['input_ids']
    return train_encodings, train_labels
model_name='uncased_L-12_H-768_A-12/'
data=data[['Description','Target']]
#drop null values
data = data[data['Outage Description'].notnull()]
calibrated_svc = CalibratedClassifierCV(LinearSVC(), method='sigmoid')
calibrated_svc.fit(train_encodings,train_labels)
length_of_encoding = len(train_encodings[0])##length is 51
pickle.dump(calibrated_svc, open(r".\model\bert__"+str(length_of_encoding)+".pkl", 'wb'), protocol=4)
#########################################################################
##########################Prediction#####################################
tokenizer = BertTokenizerFast.from_pretrained(model_name, do_lower_case=True)
# get test text
test_text = list(test_data[independent_col])
# set encoding size
test_encodings_fix = [0] * 51
# encode text
test_encodings = tokenizer(test_text, truncation=True, padding=True, max_length=256)
test_encodings = test_encodings['input_ids']
# make encoding fixed length
for enc in test_encodings:
    test_encodings_fix_trim = test_encodings_fix[len(enc):51]
    enc.extend(test_encodings_fix_trim)
# load model
Pkl_Filename = r'\model_new\bert_model.pkl'
with open(Pkl_Filename, 'rb') as file:
    Pickled_svc_Model = pickle.load(file)
# predict
predict_svc_test_pred_bbc = pd.DataFrame(Pickled_svc_Model.predict(test_encodings))
Running the prediction module throws this error:
ValueError: Number of features of the model must match the input. Model n_features is 51 and input n_features is 55.
When I checked test_encodings, the length there is 55.
My training data has 105 records and the test data has 5 records.
I am not able to figure out where I need to fix this.
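A hedged reading of what is happening: with padding=True the tokenizer pads each batch only to the longest sequence in that batch, so training produced width 51 while the test batch produced width 55, and the trim-and-extend loop above only pads short rows, it never truncates long ones. A minimal sketch of one fix, padding both sides to the same fixed width (the 64 here is an arbitrary illustrative choice):
# at training time
train_encodings = tokenizer(train_text, truncation=True,
                            padding='max_length', max_length=64)['input_ids']

# at prediction time: same settings, so the feature count always matches
test_encodings = tokenizer(test_text, truncation=True,
                           padding='max_length', max_length=64)['input_ids']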

Abnormal number of sims documents in gensim

I'm injecting 77 documents into a gensim model by reading them from a database with a first script, and I save the documents to the file system.
I then load another doc to check its similarity with a vector.
def read_corpus_bdd(cursor, tokens_only=False):
    for i, (url_id, url_label, contenu) in enumerate(cursor):
        tokens = gensim.utils.simple_preprocess(contenu)
        if tokens_only:
            yield tokens
        else:
            # For training data, add tags
            # yield gensim.models.doc2vec.TaggedDocument(tokens, dataLine[0])
            yield gensim.models.doc2vec.TaggedDocument(tokens, [int(str(url_id))])
            print(int(str(url_id)))

targetContentCorpus = list(read_corpus_bdd(cursor))

# Params of the corpus trainer
model = gensim.models.doc2vec.Doc2Vec(vector_size=40, min_count=2, epochs=40)

# Build a vocabulary
model.build_vocab(targetContentCorpus)
###############################################################################
model.train(targetContentCorpus, total_examples=model.corpus_count, epochs=model.epochs)

## generate file model name for save
from datetime import date
pathModelSave = os.getenv("MODEL_BASE_SAVE") + '/projet_' + str(projetId)
When I infer the vector:
inferred_vector = model.infer_vector(test_corpus[0])
sims = model.docvecs.most_similar([inferred_vector], topn=len(model.docvecs))
len(sims) #output 335
So I don't understand where this 335 comes from, and also why
sims[0][0]
returns a different id than the one tagged in the yield section.
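A hedged explanation, based on how gensim handles plain-int tags: when tags are raw ints, Doc2Vec treats them as indexes into its vector array and allocates max(tag) + 1 vectors, so a highest url_id of 334 would yield exactly 335 entries, most of them untrained filler; that would also explain why sims[0][0] can return an id that was never yielded. A minimal sketch of the usual fix, tagging with strings (or with a contiguous 0..N-1 counter) instead:
# inside read_corpus_bdd: string tags make gensim keep one vector per document
yield gensim.models.doc2vec.TaggedDocument(tokens, [str(url_id)])

# afterwards, the number of document vectors should equal the number of documents (77 here)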
