How to train a Huggingface NER dataset?

I'd like to train or fine-tune a transformer for NER (named entity recognition) on a given sentence.
Here is the dataset:
from datasets import load_dataset
dataset = load_dataset("limsc/requirements-entity-recognition")
https://huggingface.co/datasets/limsc/requirements-entity-recognition
Now, how do I train it?
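A minimal fine-tuning sketch using the HuggingFace Trainer API, not taken from the question itself: the column names tokens and ner_tags and the base checkpoint bert-base-cased are assumptions, so check dataset["train"].features first and adjust accordingly.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          DataCollatorForTokenClassification, Trainer, TrainingArguments)

dataset = load_dataset("limsc/requirements-entity-recognition")
label_list = dataset["train"].features["ner_tags"].feature.names  # assumed column name

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained("bert-base-cased",
                                                        num_labels=len(label_list))

def tokenize_and_align(batch):
    # Tokenize pre-split words and align the word-level tags to subword tokens.
    enc = tokenizer(batch["tokens"], is_split_into_words=True, truncation=True)
    labels = []
    for i, tags in enumerate(batch["ner_tags"]):
        word_ids = enc.word_ids(batch_index=i)
        prev, ids = None, []
        for w in word_ids:
            # -100 tells the loss to ignore special tokens and extra subwords
            ids.append(-100 if w is None or w == prev else tags[w])
            prev = w
        labels.append(ids)
    enc["labels"] = labels
    return enc

tokenized = dataset.map(tokenize_and_align, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments("ner-out", num_train_epochs=3, per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()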

Related

How can I change the pre-trained file of the Yolov4 algorithm?

I want to train on my custom dataset using the YOLOv4 algorithm.
The difference between my data and the COCO dataset is that my object labels are as follows:
label annotation: <object-class> <x_center> <y_center> <width> <height> <d>
We usually use pre-trained weights for object detection, but given that my data has a different label format, do I need to change the pre-trained weights?
How should I train on this dataset?

How to map features from two different data using regressor for classification?

I am trying to build a Keras model to implement the approach explained in this paper.
Context of my implementation:
I have two different kinds of data representing the same set of classes (labels) that need to be classified. The first kind is image data, and the second kind is EEG data (a time-series sequence).
I know that to classify image data we can use CNN models like this:
from keras.models import Sequential
from keras.layers import Conv2D, Activation, Flatten, Dense, Dropout, BatchNormalization

model = Sequential()
model.add(Conv2D(filters=256, kernel_size=(11, 11), strides=(1, 1), padding='valid',
                 input_shape=(227, 227, 3)))  # input size assumed (AlexNet-style)
model.add(Activation('relu'))
model.add(Flatten())  # flatten feature maps before the dense layers
model.add(Dense(1000))
model.add(Activation('relu'))
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization())
# Output Layer: one unit per class
model.add(Dense(40))
model.add(Activation('softmax'))
And to classify sequence data we can use LSTM models like this:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Flatten, Dense

model = Sequential()
model.add(LSTM(units=50, return_sequences=True))  # input shape inferred on first fit
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(40, activation='softmax'))
But the approach in the paper above shows that EEG feature vectors can be mapped to image vectors through regression, like this:
The first approach is to train a CNN to map images to corresponding EEG feature vectors. Typically, the first layers of CNN attempt to learn the general (global) features of the images, which are common between many tasks, thus we initialize the weights of these layers using pre-trained models, and then learn the weights of the last layers from scratch in an end-to-end setting. In particular, we used the pre-trained AlexNet CNN, and modified it by replacing the softmax classification layer with a regression layer (containing as many neurons as the dimensionality of the EEG feature vectors), using Euclidean loss as the objective function.
The second approach consists of extracting image features using pre-trained CNN models and then employ regression methods to map image features to EEG feature vectors. We used our fine-tuned AlexNet as feature extractors by reading the output of the last fully connected layer, and then applied several regression methods (namely, k-NN regression, ridge regression, random forest regression) to obtain the predicted feature vectors.
I am not able to work out how to code the above two approaches. I have never used a regressor for feature mapping followed by classification. Any leads on this are much appreciated.
In my understanding the training data consists of (eeg_signal, image, class_label) triplets.
1. Train the LSTM model with input=eeg_signal, output=class_label. Loss is crossentropy.
2. Peel off the last layer of the LSTM model. Let's say the pre-last layer's output is a vector of size 20. Let's call it eeg_representation.
3. Run this truncated model on all your eeg_signal inputs and save the eeg_representation outputs. You will get a tensor of shape [batch, 20].
4. Take the AlexNet mentioned in the paper (or any other image classifier) and peel off its last layer. Let's say the pre-last layer's output is a vector of size 30. Let's call it image_representation.
5. Stitch a linear layer onto the end of the previous layer. This layer will convert image_representation to eeg_representation. It has a 20 x 30 weight matrix.
6. Train the stitched model on (image, eeg_representation) pairs. The loss is the Euclidean distance.
7. And now the fun part: stitch together the model trained in step 6 and the peeled-off last layer of the model trained in step 1. If you input an image, you will get class predictions.
This may not sound like a big deal (because we do image classification all the time), but if this really works, it means that this is a "prediction that is running through our brains" :)
Thank you for bringing up this question and linking the paper.
I feel I just repeated what's in your question and in the paper.
It would be beneficial to have a toy dataset to provide code examples against; a rough sketch of the stitching steps follows below.
Here's a Tensorflow tutorial on how to "peel off" the last layer of a pretrained image classification model.
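In that spirit, here is a minimal Keras sketch of steps 1-7, not a definitive implementation: the input shapes, the 40 classes, and MobileNetV2 standing in for AlexNet (which Keras does not ship) are all assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

# Steps 1-2: EEG classifier with a named 20-dim pre-last layer (shapes assumed).
eeg_in = layers.Input(shape=(250, 8))  # e.g. 250 timesteps, 8 channels (assumption)
h = layers.LSTM(50)(eeg_in)
eeg_repr = layers.Dense(20, name='eeg_representation')(h)
class_out = layers.Dense(40, activation='softmax', name='classifier')(eeg_repr)
eeg_model = Model(eeg_in, class_out)
# ... train eeg_model on (eeg_signal, class_label) with crossentropy loss ...

# Step 3: truncated model that emits eeg_representation.
truncated_eeg = Model(eeg_in, eeg_repr)

# Steps 4-5: frozen pretrained image backbone plus the 20 x 30 linear mapping
# layer (image preprocessing omitted for brevity).
img_in = layers.Input(shape=(224, 224, 3))
backbone = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg')
backbone.trainable = False
img_repr = layers.Dense(30, name='image_representation')(backbone(img_in))
mapped = layers.Dense(20, name='to_eeg_space')(img_repr)
mapper = Model(img_in, mapped)

# Step 6: Euclidean-style loss between mapped image features and EEG features.
mapper.compile(optimizer='adam', loss='mse')
# ... mapper.fit(images, truncated_eeg.predict(eeg_signals)) ...

# Step 7: reuse the EEG classifier's last layer on top of the mapped features.
final_out = eeg_model.get_layer('classifier')(mapper.output)
image_classifier = Model(img_in, final_out)  # image in, class probabilities out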

Produce similar embeddings to another model with BERT

I have a dataset in the form (input_text, embedding_of_input_text), where embedding_of_input_text is an embedding of dimension 512 produced by another model (DistilBERT) when given as input input_text.
I would like to fine-tune BERT on this dataset such that it learns to produce similar embeddings (i.e. a kind of mimicking).
Furthermore, by default BERT returns embeddings of dimension 768, while here embedding_of_input_text are embeddings of dimension 512.
What is the correct way to do that within the HuggingFace library?
You can tokenize the dataset's input text
and add a small neural network on top of BERT to get embeddings of dimension 512.
However, what is the purpose of this operation?
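Concretely, a minimal sketch of that idea, assuming a plain PyTorch training loop; the [CLS]-token pooling choice and the learning rate below are assumptions, not part of the question.
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.train()  # from_pretrained returns the model in eval mode
projection = nn.Linear(768, 512)  # bridges the 768 -> 512 dimension mismatch

optimizer = torch.optim.AdamW(list(bert.parameters()) + list(projection.parameters()),
                              lr=2e-5)
loss_fn = nn.MSELoss()

def training_step(input_texts, target_embeddings):
    # target_embeddings: float tensor of shape (batch, 512) from the other model
    batch = tokenizer(input_texts, padding=True, truncation=True, return_tensors="pt")
    out = bert(**batch)
    cls = out.last_hidden_state[:, 0]   # [CLS] token as the sentence embedding
    pred = projection(cls)              # (batch, 512)
    loss = loss_fn(pred, target_embeddings)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()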

How to use Inception Network for Regression

I'm trying to input an image and get a continuous number as an output.
I built a NN that takes an image and has only a single node in the hidden layer, with a linear activation function. However, the model predicts the same number for every input.
Hence I would like to use the Inception network for this problem, based on a recent paper by Google.
Link: https://arxiv.org/pdf/1904.06435.pdf
x = Dense(1, activation="linear")(x)
This is absolutely possible! The example from keras documentation on pre-trained models should help you with your endeavor. Make sure to adjust the output layer and the loss of your new model.
Edit: code example for your specific case
from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a linear output layer for the continuous target
prediction = Dense(1, activation='linear')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=prediction)
# first: train only the top layers (which were randomly initialized),
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='mean_squared_error')
# train the model on the new data for a few epochs
model.fit_generator(...)
This just trains the new top layers; if you would like to fine-tune the lower layers as well, have a look at the example from the documentation.

How to use LightGBM to fit a function curve?

I want to use LightGBM to fit a function curve, but in LightGBM's example datasets every record has a label column.
I don't know how to create my training set and test set.
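For curve fitting the label column is simply the function value at each sample point. A minimal sketch, assuming the sine function, split sizes, and parameters below purely as illustrative examples:
import numpy as np
import lightgbm as lgb

# Fit y = f(x): the feature is the sample point x, the "label" is the value f(x).
x = np.linspace(0, 10, 1000).reshape(-1, 1)  # assumed example function: sin(x)
y = np.sin(x).ravel()

# Shuffle before splitting so the test points lie inside the training range
# (trees cannot extrapolate beyond the x values seen during training).
idx = np.random.default_rng(0).permutation(len(x))
train_idx, test_idx = idx[:800], idx[800:]

train_data = lgb.Dataset(x[train_idx], label=y[train_idx])
test_data = lgb.Dataset(x[test_idx], label=y[test_idx], reference=train_data)

params = {"objective": "regression", "metric": "l2", "num_leaves": 31}
booster = lgb.train(params, train_data, num_boost_round=200, valid_sets=[test_data])

y_pred = booster.predict(x[test_idx])  # predicted curve values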
