I'm trying to make Hugging Face's transformers library use a model that I have downloaded and that is not in the Hugging Face model repository.
Where does transformers look for models? Is there an equivalent of the $PATH environment variable for transformers models?
Research
This Hugging Face issue talks about manually downloading models.
This issue suggests that you can work around the question of where Hugging Face looks for models by passing the path as an argument to from_pretrained (model = BertModel.from_pretrained('path/to/your/directory')).
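A minimal sketch of that workaround, assuming the directory holds the files a saved checkpoint normally contains (config.json, the weights file, and the tokenizer files); the directory name is a placeholder. As for a $PATH-style variable: transformers does not search a list of directories, but the download cache location can be moved with the TRANSFORMERS_CACHE (or HF_HOME) environment variable.

from transformers import BertModel, BertTokenizer

# placeholder path: a local directory containing config.json,
# the model weights, and the tokenizer files (e.g. vocab.txt)
local_dir = "path/to/your/directory"

tokenizer = BertTokenizer.from_pretrained(local_dir)  # loads the tokenizer from disk
model = BertModel.from_pretrained(local_dir)          # loads the weights from disk, no download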
Related questions
Where does Hugging Face's transformers save models?
Related
I want to build a text2text model. Specifically, I want to turn some automatically generated, scrambled text pieces into a smooth paragraph in the same language. I've already prepared the text inputs and outputs, so the corpus is not the primary problem now.
I want to use hugging face models like:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")
because it has already acquired the capacity to generate the language. The model is built for masked language modeling, and there is no mature, ready-made task quite like mine, as it is really customized. So how could I use the Hugging Face masked language model as a base text2text model without jeopardizing its capacity? I want to fine-tune it to achieve that task/goal, and I want to know how.
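One possible direction, offered here as a hedged sketch rather than the asker's method: transformers can warm-start a sequence-to-sequence model from two copies of a pretrained BERT checkpoint with EncoderDecoderModel, which keeps the pretrained weights while giving the model a text2text interface. The example strings below are placeholders.

from transformers import AutoTokenizer, EncoderDecoderModel

# warm-start an encoder-decoder model from the masked-LM checkpoint above
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-chinese", "bert-base-chinese"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# one fine-tuning step on a placeholder (rough text -> smooth paragraph) pair
inputs = tokenizer("rough generated text", return_tensors="pt")
labels = tokenizer("smooth target paragraph", return_tensors="pt").input_ids
loss = model(input_ids=inputs.input_ids,
             attention_mask=inputs.attention_mask,
             labels=labels).loss
loss.backward()  # plug into an optimizer or Trainer for real fine-tuning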
I'm using a pre-trained model for feature extraction from CT images for COVID detection, and then feeding the extracted features to a classifier. I need to know what features are extracted when a pre-trained model is used here.
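In the common setup (an assumption, since the post doesn't name the network), the "features" are the activations of the layer just before the classification head: a fixed-length vector summarizing the image. A hedged torchvision sketch; the ResNet-50 backbone and the random input are placeholders for the actual model and a preprocessed CT slice.

import torch
from torchvision import models

# placeholder backbone: ImageNet-pretrained ResNet-50
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()  # drop the classification head
backbone.eval()

with torch.no_grad():
    x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed CT slice
    features = backbone(x)           # shape (1, 2048): penultimate-layer activations

# 'features' is what would be handed to the downstream classifier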
I want to continue training the model.zip file with more images without retraining from the baseline model from scratch. How do I do that?
This isn't possible at the moment. ML.NET's ImageClassificationTrainer already uses a pre-trained model, so you're using transfer learning to create your model. Any further training would have to start "from scratch" from the pre-trained model.
Also, looking at the existing trainers that can be re-trained, the ImageClassificationTrainer isn't listed among them.
Is there a way to save a gensim LDA model to ONNX format? We need to be able to train using Python/gensim and then operationalize it into an Onnx model to publish and use.
Currently (March 2020, gensim-3.8.1) I don't know of any built-in support for ONNX formats in gensim.
Provided the ONNX format can represent LDA models well – & here's an indication it does – it would be a plausible new feature.
You could add a feature request at the gensim issue tracker, but for the feature to be added, it would likely require a contribution from a skilled developer who needs the feature, & can write the code & test cases.
I have created a model in RapidMiner. It is a classification model, and I saved it as PMML. I want to use this model in H2O.ai to make further predictions. Is there any way I can import this PMML model into H2O.ai and use it for further prediction?
I appreciate your suggestions.
Thanks
H2O offers no support for importing/exporting(*) PMML models.
It is hard to offer a good suggestion without knowing your motivation for wanting to use both RapidMiner and H2O. I've not used RapidMiner in about 6 or 7 years, and I know H2O well, so my first choice would just be to re-build the model in H2O.
If you are doing a lot of pre-processing steps in RapidMiner, and that is why you want to use it, you could still do all that data munging there, then export the prepared data to CSV, import that into H2O, and then build the model (see the sketch below).
*: Though I did just find this tool for converting H2O models to PMML: https://github.com/jpmml/jpmml-h2o But that is the opposite direction from what you want.
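A minimal sketch of that CSV route using H2O's Python API; the file name, label column, and choice of algorithm are all placeholders, not anything from the original answer.

import h2o
from h2o.estimators import H2ORandomForestEstimator

h2o.init()

# placeholder file: the prepared data exported from RapidMiner as CSV
frame = h2o.import_file("prepared_data.csv")
frame["label"] = frame["label"].asfactor()  # placeholder label column; make it categorical
train, valid = frame.split_frame(ratios=[0.8], seed=42)

# placeholder algorithm: any H2O estimator would slot in here
model = H2ORandomForestEstimator()
model.train(x=[c for c in frame.columns if c != "label"],
            y="label", training_frame=train, validation_frame=valid)

print(model.model_performance(valid))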