H2O Stacked Ensemble MOJO not returning a value when Deep Learning is included

For H2O version I create stacked ensembles of two models, one Deep Learning and one XGBoost and export the MOJO. I have two APIs working with other MOJO files, but for these stacked ensembles they fail. The MOJO returns an empty prediction. The models work independently, and it appears that the H2O binary works as well. I'm simply creating the model as:
ensemble = H2OStackedEnsembleEstimator(base_models=[DeepLearningModel, XGBoostModel])
ensemble.metalearner_fold_column = 'fold_numbers'
ensemble.train(x=parameters, y=response, training_frame=model_trainer.h2odata)
These fail independent of the dataset I'm training on. Also, StackedEnsemble_BestOfFamily model MOJOs fail in the same manner if DeepLearning is included as an algorithm.
Why do these MOJOs fail to return predictions, and what can I do to fix it? Could Deep Learning be the problem somehow?


How to change the task of a pretrained model without jeopardizing the capacity of it?

I want to build a text2text model. Specifically, I want to transform automatically generated, scrambled text fragments into a smooth paragraph in the same language. I've already prepared the input and output texts, so the corpus is not the main problem now.
I want to use a Hugging Face model like:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")
I chose it because it has already acquired the capacity to generate the language. However, the model is built for masked language modeling, and there is no established task like mine, since it is highly customized. So how can I use a Hugging Face masked language model as the base for a text2text model without jeopardizing its capacity? I want to fine-tune it for this task and would like to know how.

GPT-3 Fine Tune a Fine Tuned Model?

The OpenAI documentation for the model attribute in the fine-tune API states a bit confusingly:
The name of the base model to fine-tune. You can select one of "ada", "babbage", "curie", "davinci", or a fine-tuned model created after 2022-04-21.
My question: is it better to fine-tune a base model or a fine-tuned model?
I created a fine-tuned model from ada with the file mydata1K.jsonl:
ada + mydata1K.jsonl --> ada:ft-acme-inc-2022-06-25
Now I have a bigger file of samples mydata2K.jsonl that I want to use to improve the fine-tuned model.
In this second round of fine-tuning, is it better to fine-tune ada again or to fine-tune my fine-tuned model ada:ft-acme-inc-2022-06-25? I'm assuming this is possible because my fine-tuned model was created after 2022-04-21.
ada + mydata2K.jsonl --> better-model
ada:ft-acme-inc-2022-06-25 + mydata2K.jsonl --> even-better-model?
If you read the Fine-tuning documentation as of Jan 4, 2023, the only part talking about "fine-tuning a fine-tuned model" is the following part under Advanced usage:
Continue fine-tuning from a fine-tuned model
If you have already fine-tuned a model for your task and now have
additional training data that you would like to incorporate, you can
continue fine-tuning from the model. This creates a model that has
learned from all of the training data without having to re-train from scratch.
To do this, pass in the fine-tuned model name when creating a new
fine-tuning job (e.g., -m curie:ft-<org>-<date>). Other training
parameters do not have to be changed, however if your new training
data is much smaller than your previous training data, you may find it
useful to reduce learning_rate_multiplier by a factor of 2 to 4.
Which option to choose?
You're asking about two options:
Option 1: ada + bigger-training-dataset.jsonl
Option 2: ada:ft-acme-inc-2022-06-25 + additional-training-dataset.jsonl
The documentation says nothing about which option is better in terms of which would yield better results.
Choose Option 2
When training a fine-tuned model, the total tokens used will be billed
according to our training rates.
If you choose Option 1, you'll pay for some tokens in your training dataset twice: first when fine-tuning with the initial training dataset, and again when fine-tuning with the bigger training dataset (i.e., bigger-training-dataset.jsonl = initial-training-dataset.jsonl + additional-training-dataset.jsonl).
In terms of cost, it's better to continue fine-tuning from a fine-tuned model because you'll pay only for the tokens in your additional training dataset.
Read more about fine-tuning pricing calculation.
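The billing argument above can be checked with some quick arithmetic. The following is a minimal sketch, assuming that billed training tokens equal the tokens in the file multiplied by the number of epochs, and that both rounds use the same number of epochs. The function name, the token counts, and the epoch count of 4 are all hypothetical values chosen for illustration, not figures from the documentation.

```python
def billed_training_tokens(initial_tokens, additional_tokens, n_epochs=4):
    """Return the training tokens billed across both rounds for each option.

    Assumption: billed tokens = tokens in the training file * n_epochs.
    """
    # Round 1 is identical either way: fine-tune the base model on the initial data.
    first_round = initial_tokens * n_epochs
    # Option 1: start again from the base model with initial + additional data.
    option_1 = first_round + (initial_tokens + additional_tokens) * n_epochs
    # Option 2: continue from the fine-tuned model with the additional data only.
    option_2 = first_round + additional_tokens * n_epochs
    return {"option_1": option_1, "option_2": option_2}


# Hypothetical token counts for the initial and the additional dataset.
costs = billed_training_tokens(initial_tokens=100_000, additional_tokens=100_000)
print(costs)
```

Under these assumptions, the gap between the two options is exactly the initial dataset billed a second time (initial_tokens * n_epochs). This only compares cost; it says nothing about which option yields the better model.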

ML.NET doesn't support resuming training for ImageClassificationTrainer

I want to continue training the model.zip file with more images, without retraining from the baseline model from scratch. How do I do that?
This isn't possible at the moment. ML.NET's ImageClassificationTrainer already uses a pre-trained model, so you're using transfer learning to create your model. Any additions would have to be "from scratch" on the pre-trained model.
Also, looking at the existing trainers that can be re-trained, the ImageClassificationTrainer isn't listed among them.

Multiple Stanford CoreNLP model files made, which one is the correct one to use?

I made a sentiment analysis model using Stanford CoreNLP's library. So I have a bunch of ser.gz files that look like the following:
I was wondering what model to use in my Java code, but based on a previous question, I just used the model with the highest F1 score, which in this case is model-0014-93.73.ser.gz. And in my Java code, I pointed to the model I want to use by using the following line:
props.put("sentiment.model", "/path/to/model-0014-93.73.ser.gz");
However, by referring to just that model, am I excluding the sentiment analysis from the other models that were made? Should I be referring to all the model files to make sure I "covered" all the bases or does the highest scoring model trump everything else?
You should point to only the single highest scoring model. The code has no way to make use of multiple models at the same time.
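If there are many model files, picking the highest scoring one can be scripted. Here is a minimal sketch in Python, assuming the files follow the naming pattern model-<iteration>-<score>.ser.gz, which is inferred from the single file name in the question. The helper name and all file names other than model-0014-93.73.ser.gz are hypothetical.

```python
import re

# Assumed naming scheme, inferred from the one file name in the question:
# model-<iteration>-<score>.ser.gz
MODEL_NAME = re.compile(r"model-(\d+)-(\d+(?:\.\d+)?)\.ser\.gz$")


def best_model(filenames):
    """Return the file name with the highest embedded score, or None if none match."""
    scored = []
    for name in filenames:
        match = MODEL_NAME.search(name)
        if match:
            # Keep (score, name) pairs so max() compares by score first.
            scored.append((float(match.group(2)), name))
    return max(scored)[1] if scored else None


# Hypothetical file names, except for the one taken from the question.
candidates = [
    "model-0003-91.20.ser.gz",
    "model-0014-93.73.ser.gz",
    "model-0020-92.05.ser.gz",
]
print(best_model(candidates))
```

The returned name is what you would then pass as the value of the sentiment.model property.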

statsmodels - create model from params

I am trying to create an empty model from params saved from a previously trained model, but the constructor stubbornly wants me to provide both endogenous and exogenous variables, which I don't have. Is there any way to get around this?
For example, I only want to do:
logit = sm.Logit()
pred = logit.predict(params, X)
But the first line won't work.
No, this is not supported in statsmodels. Models are always associated with data.
However, for the use case of prediction, it is possible to pickle the model and, optionally, to delete all full-length arrays (including the data) from the model instance and from the results instance before pickling. This doesn't work with formulas.
On the other hand, since this is Python, there might be several ways to cheat, at your own risk.
It would be helpful if you opened an issue on GitHub at https://github.com/statsmodels/statsmodels/issues with a description of your use case, and it might be possible to get the relevant features into a future version.
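One way to "cheat" for a logit model in particular is to skip statsmodels at prediction time and apply the logistic function to the linear predictor yourself. The sketch below is plain Python and is not part of the statsmodels API. It assumes the saved params are ordered like the columns of X, and that X already contains a constant column if the model was fit with one. The function name and the coefficient values are hypothetical.

```python
import math


def logit_predict(params, rows):
    """Return P(y = 1) for each row, given fitted logit coefficients."""
    probabilities = []
    for row in rows:
        if len(row) != len(params):
            raise ValueError("each row needs one value per coefficient")
        # Linear predictor: dot product of the coefficients and the row.
        linear = sum(b * x for b, x in zip(params, row))
        # Logistic function, written so that math.exp never overflows.
        if linear >= 0:
            prob = 1.0 / (1.0 + math.exp(-linear))
        else:
            e = math.exp(linear)
            prob = e / (1.0 + e)
        probabilities.append(prob)
    return probabilities


# Hypothetical coefficients: intercept first, then one slope.
params = [0.0, 1.0]
# The leading 1.0 in each row is the constant column.
X = [[1.0, 0.0], [1.0, 2.0]]
pred = logit_predict(params, X)
print(pred)
```

This covers point predictions only. Anything that depends on the fitted results, such as standard errors or confidence intervals, still needs the original model and data.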
