Huggingface models: how to store a different version of a model - huggingface-transformers

I have a model that I pushed to the remote using the following code:
from transformers import CLIPProcessor, CLIPModel
checkpoint = "./checkpoints-15/checkpoint-60"
model = CLIPModel.from_pretrained(checkpoint)
processor = CLIPProcessor.from_pretrained(checkpoint)
repo = "vincentclaes/emoji-predictor"
model.push_to_hub(repo, use_temp_dir=True)
processor.push_to_hub(repo, use_temp_dir=True)
On the UI I see my model under a main branch:
What if I want to store multiple versions of a model?
Can I create a separate git branch?
Can I create a git tag?
How do I do this using the huggingface tools?
Thinking transformers, huggingface_hub, ...
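One possible approach, sketched below, is to keep each version on its own branch and mark releases with tags. It assumes a recent huggingface_hub (which exposes create_branch and create_tag) and a transformers release whose push_to_hub accepts a revision argument; the helper names push_version and tag_version are made up for illustration, and you must already be logged in to the Hub.

```python
def push_version(model, processor, repo_id, branch):
    """Push a model and its processor to a dedicated branch of a Hub repo."""
    # Imported lazily so the helpers can be defined without the dependency.
    from huggingface_hub import create_branch

    # exist_ok=True makes the call a no-op when the branch already exists.
    create_branch(repo_id, branch=branch, exist_ok=True)
    # Assumption: this transformers version supports `revision` in push_to_hub.
    model.push_to_hub(repo_id, revision=branch)
    processor.push_to_hub(repo_id, revision=branch)


def tag_version(repo_id, tag, revision="main"):
    """Create a git tag pointing at `revision` (a branch name or commit hash)."""
    from huggingface_hub import create_tag

    create_tag(repo_id, tag=tag, revision=revision)


# Example usage (needs network access and write permission on the repo):
#   push_version(model, processor, "vincentclaes/emoji-predictor", "v2")
#   tag_version("vincentclaes/emoji-predictor", "v2.0", revision="v2")
#   CLIPModel.from_pretrained("vincentclaes/emoji-predictor", revision="v2.0")
```

A specific version can then be loaded by passing the branch or tag name as revision to from_pretrained.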


saving finetuned model locally

I'm trying to understand how to save a fine-tuned model locally, instead of pushing it to the hub.
In the tutorials I've followed, the last step of fine-tuning a model is running trainer.train(), and the next instruction is usually trainer.push_to_hub.
But what if I don't want to push to the hub? I want to save the model locally and later load it from my own computer for future tasks, so I can run inference without fine-tuning again.
How can I do that?
eg: Initially load a model from hugging face:
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)
trainer = Trainer(
Somehow save the new trained model locally, so that next time I can pass
model = 'some local directory where model and configs (?) got saved'
You can use the save_model method:
Or alternatively, the save_pretrained method:
Then, when reloading your model, specify the path you saved to:
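The three steps in this answer could look like the sketch below. The function names and the output_dir argument are placeholders chosen for illustration; save_model, save_pretrained and from_pretrained are the library calls the answer refers to.

```python
def save_with_trainer(trainer, output_dir):
    """Option 1: let the Trainer write the model (weights and config) to disk."""
    trainer.save_model(output_dir)


def save_with_save_pretrained(model, tokenizer, output_dir):
    """Option 2: save the model and tokenizer directly, without a Trainer."""
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)


def load_local(output_dir):
    """Reload the fine-tuned model by passing the local directory as the name."""
    # Imported lazily so the helpers can be defined without the dependency.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(output_dir)
    tokenizer = AutoTokenizer.from_pretrained(output_dir)
    return model, tokenizer
```

Loading the tokenizer from the same directory only works if it was saved there, so when using option 1 without a tokenizer attached to the Trainer, save the tokenizer separately.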

Huggingface transformer export tokenizer and model

I'm currently working on a text summarizer powered by the Huggingface transformers library. The summarization process has to be done on premise, as such I have the following code (close to documentation):
from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig
model = BartForConditionalGeneration.from_pretrained('sshleifer/distilbart-cnn-6-6')
tokenizer = BartTokenizer.from_pretrained('sshleifer/distilbart-cnn-6-6')
inputs = tokenizer([myTextToSummarize], max_length=1024, return_tensors='pt')
summary_ids = model.generate(inputs['input_ids'], num_beams=4, early_stopping=True)
[tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids]
My problem is that I cannot load the model in memory and have my server expose an API that directly uses the model and tokenizer. I would like both of them to be initialized in a first process and made available in a second one (the one that will expose an HTTP API). I saw that you can export the model to the filesystem, but I don't have access to it (locked k8s environment), so I'd need to store it in a specific database.
Is it possible to export both the model and the tokenizer as a string/buffer/something storable in a database?
Thanks a lot
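One way this could be approached, as a sketch rather than an established recipe: write the model and tokenizer to a temporary directory with save_pretrained, pack that directory into a single bytes object, and store the bytes in a binary column. It assumes the process can at least write to a temporary directory, which a locked-down environment may or may not allow. All function names below are invented for illustration.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def directory_to_bytes(directory):
    """Pack every file under `directory` into an in-memory gzip-compressed tar."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        archive.add(str(directory), arcname=".")
    return buffer.getvalue()


def bytes_to_directory(blob, directory):
    """Unpack an archive produced by directory_to_bytes into `directory`."""
    # Only unpack blobs you created yourself; tar archives can be unsafe.
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        archive.extractall(str(directory))
    return Path(directory)


def export_to_bytes(model, tokenizer):
    """First process: serialize model and tokenizer into one bytes object."""
    with tempfile.TemporaryDirectory() as tmp:
        model.save_pretrained(tmp)
        tokenizer.save_pretrained(tmp)
        return directory_to_bytes(tmp)


def load_from_bytes(blob):
    """Second process: rebuild model and tokenizer from the stored bytes."""
    # Imported lazily so the packing helpers work without transformers.
    from transformers import BartForConditionalGeneration, BartTokenizer

    with tempfile.TemporaryDirectory() as tmp:
        bytes_to_directory(blob, tmp)
        model = BartForConditionalGeneration.from_pretrained(tmp)
        tokenizer = BartTokenizer.from_pretrained(tmp)
    return model, tokenizer
```

The bytes returned by export_to_bytes can be written to any column type that holds binary data, and load_from_bytes reverses the process in the API server.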

Download pre-trained sentence-transformers model locally

I am using the SentenceTransformers library to create sentence embeddings with the pre-trained model bert-base-nli-mean-tokens. I have an application that will be deployed to a device that does not have internet access. How to save the model has already been answered in "Download pre-trained BERT model locally", yet I'm stuck at loading the saved model from the local path.
When I try to save the model using the above-mentioned technique, these are the output files:
When I try to load it in the memory, using
tokenizer = AutoTokenizer.from_pretrained(to_save_path)
I'm getting
Can't load config for '/bert-base-nli-mean-tokens'. Make sure that:
- '/bert-base-nli-mean-tokens' is a correct model identifier listed on ''
- or '/bert-base-nli-mean-tokens' is the correct path to a directory containing a config.json
You can download and load the model like this
from sentence_transformers import SentenceTransformer
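modelPath = "local/path/to/model"
# Download the model once, while internet access is available
model = SentenceTransformer('bert-base-nli-stsb-mean-tokens')
model.save(modelPath)
# Later, load it from the local directory
model = SentenceTransformer(modelPath)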
This worked for me. You can check the SBERT documentation for details on the SentenceTransformer class.
There are many ways to solve this issue:
Assuming you have trained your BERT base model locally (Colab/notebook), the first way to use it with the Huggingface AutoClass is to upload the model (along with the tokenizer files, vocab.txt, configs, special tokens and TF/PyTorch weights) to Huggingface. Once it is uploaded, a repository is created under your username, and the model can then be accessed as follows:
from transformers import AutoTokenizer
from transformers import pipeline
tokenizer = AutoTokenizer.from_pretrained("<username>/<model-name>")
The second way is to use the trained model locally, which can be done with pipelines. The following is an example of how to use a model trained and saved locally (taken from my locally trained QA model):
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
nlp_qa = pipeline('question-answering', model=model_dir, tokenizer=model_dir)  # model_dir: placeholder for your local save directory
result = nlp_qa({
    'question': 'What is the fund price of Huggingface in NYSE?',
    'context': 'Huggingface Co. has a total fund price of $19.6 million dollars'
})
The third way is to directly use Sentence Transformers from the Huggingface models repo.
There are also other ways to resolve this but these might help. Also this list of pretrained models might help.

I want to use "grouped_entities" in the huggingface pipeline for ner task, how to do that?

I want to use "grouped_entities" in the huggingface pipeline for the NER task, but I am having issues doing that.
I looked at the following link on GitHub, but it did not help:
I found the answer: it is very straightforward in transformers v4.0.0. Previously I was using an older version of the transformers package.
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline
tokenizer = AutoTokenizer.from_pretrained("fine_tune_model_path")
model = AutoModelForTokenClassification.from_pretrained("fine_tune_model_path")
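The snippet above loads the tokenizer and model but does not show the pipeline call itself. A minimal sketch of that last step follows, assuming transformers v4.x, where the "ner" pipeline accepts grouped_entities=True (newer releases prefer the aggregation_strategy argument instead). The function name build_grouped_ner is made up for illustration.

```python
def build_grouped_ner(model_path):
    """Build an NER pipeline that merges sub-word tokens into whole entities."""
    # Imported lazily so the helper can be defined without the dependency.
    from transformers import (
        AutoModelForTokenClassification,
        AutoTokenizer,
        pipeline,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForTokenClassification.from_pretrained(model_path)
    # grouped_entities=True groups adjacent tokens that share an entity label.
    return pipeline("ner", model=model, tokenizer=tokenizer, grouped_entities=True)


# Example usage with the placeholder path from the answer:
#   nlp = build_grouped_ner("fine_tune_model_path")
#   entities = nlp("Some text to tag")
```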

Reusing h2o model mojo or pojo file from python

As H2O models are only reusable with the same major version of H2O they were saved with, an alternative is to save the model in MOJO/POJO format. Is there a way these saved models can be reused/loaded from Python code? Or is there any way to keep the model for further development when upgrading the H2O version?
If you want to use your model for scoring via python, you could use either h2o.mojo_predict_pandas or h2o.mojo_predict_csv. But otherwise if you want to load a binary model that you previously saved, you will need to have compatible versions.
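As a rough sketch of the first option, scoring a pandas DataFrame against a MOJO zip might look like this. The wrapper name score_with_mojo and its arguments are placeholders. It assumes Java is installed and that h2o-genmodel.jar is available, either next to the MOJO file or passed explicitly.

```python
def score_with_mojo(frame, mojo_zip_path, genmodel_jar_path=None):
    """Score a pandas DataFrame with a saved MOJO and return the predictions."""
    # Imported lazily so the helper can be defined without the dependency.
    import h2o

    # Assumption: when genmodel_jar_path is None, h2o looks for
    # h2o-genmodel.jar in the same directory as the MOJO zip.
    return h2o.mojo_predict_pandas(
        frame, mojo_zip_path, genmodel_jar_path=genmodel_jar_path
    )


# Example usage (the file names are placeholders):
#   import pandas as pd
#   predictions = score_with_mojo(pd.read_csv("test.csv"), "model.zip")
```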
Outside of H2O-3 you can look into pyjnius as Tom recommended:
Another alternative is to use pysparkling, if you only need it for scoring:
from pysparkling.ml import H2OMOJOModel
# Load test data to predict
df =
# Load mojo model
mojo = H2OMOJOModel.createFromMojo(mojo_path)
# Make predictions
predictions = mojo.transform(df)
# Show predictions with ground truth (y_true and y_pred)
predictions.select('your_target_column', 'prediction').show()
