Saving Keras Model: UTF-8 Error

I've built a convolutional neural network in Keras that looks like this:
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Activation, Dropout, Flatten, Dense

model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(img_cols, img_rows, 3)))
convout1 = Activation('relu')
model.add(convout1)
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
convout2 = Activation('relu')
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(convout2)
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])
I'm attempting to save the weights of my model after training using this:
fname = "weights-Test-CNN.hdf5"
model.save_weights(fname)
The program runs and creates a file, but when I open the file this is what is displayed:
Error! C://Users/NAME/weights-Test-CNN.hdf5 is not UTF-8 encoded.
Saving Disabled.
See Console for more Details.
How do I fix this error so that the weights are correctly saved?

The weights are in fact being saved. The message comes from the editor you opened the file with: an HDF5 file is binary, so it cannot be displayed as UTF-8 encoded text. If you load the weights back into the model, they should work.
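As a quick check, a minimal sketch (it assumes the same architecture is built as above and that a held-out test set such as X_test/Y_test exists, which are placeholders here):

# Rebuild the model exactly as above, then load the saved weights back in.
# If this runs without error, the .hdf5 file on disk is fine; it simply is not
# a text file, which is why an editor reports it as "not UTF-8 encoded".
fname = "weights-Test-CNN.hdf5"
model.load_weights(fname)

# Hypothetical evaluation to confirm the restored weights behave as expected.
score = model.evaluate(X_test, Y_test, verbose=0)
print("Test accuracy:", score[1])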

Related

ljspeech Hugging Face examples not working

When trying to run the ljspeech example, I get the following error, even when the model is moved to the only GPU in the system. I am using CUDA 11.7, PyTorch 1.13.1, and fairseq 0.12.2.
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
The code used:
from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
from fairseq.models.text_to_speech.hub_interface import TTSHubInterface
import IPython.display as ipd
import torch
models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
    "facebook/fastspeech2-en-ljspeech",
    arg_overrides={"vocoder": "hifigan", "fp16": False}
)
model = models[0].to(torch.device('cuda'))
models[0] = model
TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
generator = task.build_generator(models, cfg)
text = "Hello, this is a test run."
sample = TTSHubInterface.get_model_input(task, text)
wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)
ipd.Audio(wav, rate=rate)
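One possible direction (an assumption, not a confirmed fix): the traceback complains about an index_select argument on the CPU, which suggests the sample produced by get_model_input still holds CPU tensors while the model sits on cuda:0. A minimal sketch that moves the nested sample to the model's device before calling get_prediction (fairseq's own utils.move_to_cuda helper, if present in your version, does something similar):

import torch

def move_to_device(obj, device):
    """Recursively move any tensors inside dicts/lists/tuples to `device`."""
    if torch.is_tensor(obj):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_device(v, device) for v in obj)
    return obj

# Move the sample to the same device as the model, then generate as before.
sample = move_to_device(sample, torch.device('cuda'))
wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)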

PyTorch Lightning - How can I output a summary of training to the console

PyTorch Lightning can log to TensorBoard. How can I make it log a table to the console summarizing the training run (similar to Hugging Face's Transformers, shown below):
Epoch    Training Loss    Validation Loss    Runtime      Samples Per Second
1        1.220600         1.160322           39.574900    272.496000
2        0.945200         1.121690           39.706000    271.596000
3        0.773000         1.157358           39.734000    271.405000
You can write your own callback and add it to the trainer.
import pytorch_lightning as pl
from pytorch_lightning.utilities import rank_zero_info

class LoggingCallback(pl.Callback):
    def on_validation_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule):
        rank_zero_info("***** Test results *****")
        metrics = trainer.callback_metrics
        # Log results
        for key in sorted(metrics):
            if key not in ["log", "progress_bar"]:
                rank_zero_info("{} = {}\n".format(key, str(metrics[key])))

Error message "Cube::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD" when using the sommer package for GWAS

Output from sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sommer_4.1.4 crayon_1.4.1 lattice_0.20-41 MASS_7.3-53.1 Matrix_1.3-2 data.table_1.14.0
loaded via a namespace (and not attached):
[1] compiler_4.0.5 tools_4.0.5 rstudioapi_0.13 Rcpp_1.0.6 grid_4.0.5
I have been trying to carry out a GWAS using the sommer package with the following code:
var_cov <- A.mat(m_matrix)  ## additive relationship matrix
model <- GWAS(cbind(DW20, PLA07, PLA08, PLA09, PLA10, PLA11, PLA12, PLA13, PLA14, PLA15,
                    PLA16, PLA17, PLA18, RGR07_09, RGR08_10, RGR09_11, RGR10_12, RGR11_13,
                    RGR12_14, RGR13_15, RGR14_16, RGR15_17, RGR16_18, SA, SL, SW) ~ 1,
              random = ~ vs(accession, Gu = var_cov), data = pheno2,
              M = m_matrix, gTerm = "u:accession", n.PC = 5)
As described in the code, I have 26 traits and I would like to use the K+P model. My SNP matrix has 211,260 markers and 309 accessions.
When I run this code for one or two traits it works fine, but when I try to run it with all 26 traits I get the error message:
Error in GWAS(cbind(DW20, PLA07, PLA08, PLA09, PLA10, PLA11, PLA12, PLA13, :
Cube::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD
I searched online and found that this error is related to the package RcppArmadillo.
Following the suggestions here (http://arma.sourceforge.net/docs.html#config_hpp_arma_64bit_word) and
here (Large Matrices in RcppArmadillo via the ARMA_64BIT_WORD define), I tried to enable ARMA_64BIT_WORD by uncommenting the line #define ARMA_64BIT_WORD (below) in the file RcppArmadillo\include\armadillo_bits\config.hpp:
#if !defined(ARMA_64BIT_WORD)
//#define ARMA_64BIT_WORD
//// Uncomment the above line if you require matrices/vectors capable of holding more than 4 billion elements.
//// Note that ARMA_64BIT_WORD is automatically enabled when std::size_t has 64 bits and ARMA_32BIT_WORD is not defined.
#endif
and also by including the following line in the file Makevars.win in RcppArmadillo\skeleton:
PKG_CPPFLAGS = -DARMA_64BIT_WORD=1
Neither suggestion worked, and I keep getting the same error message. My questions are: is there another way to enable ARMA_64BIT_WORD that I am missing? Is it possible to run the GWAS function in the sommer package with as many as 26 traits, or is that too many? Could the error message result from a mistake in my GWAS call?
Thank you very much in advance.
My first take, Ana, is that by using cbind() you are trying to fit an unstructured multivariate model with 26 traits. With your 309 accessions that amounts to 309 x 26 = 8,034 records, which is a bit too big for the direct-inversion algorithm that sommer uses, and the number of parameters to estimate is also large (think of all the covariance parameters: (26*25)/2 = 325). I would suggest fitting a GWAS per trait in a for loop to solve your issue. Unless you have a good justification for running a multivariate GWAS, this is the issue with your analysis rather than the C++ code behind it. For example:
var_cov <- A.mat(m_matrix)  ## additive relationship matrix
## trait names as character strings so they can be pasted into the formula
traits <- c("DW20", "PLA07", "PLA08", "PLA09", "PLA10", "PLA11", "PLA12", "PLA13", "PLA14",
            "PLA15", "PLA16", "PLA17", "PLA18", "RGR07_09", "RGR08_10", "RGR09_11", "RGR10_12",
            "RGR11_13", "RGR12_14", "RGR13_15", "RGR14_16", "RGR15_17", "RGR16_18", "SA", "SL", "SW")
models <- list()  # keep each single-trait fit instead of overwriting `model`
for (itrait in traits) {
  models[[itrait]] <- GWAS(as.formula(paste(itrait, "~ 1")), random = ~ vs(accession, Gu = var_cov),
                           data = pheno2, M = m_matrix, gTerm = "u:accession", n.PC = 5)
}
If it turns out that even a single trait runs into memory issues in arma::cube, then we definitely need to look at why the Armadillo library cannot deal with those dimensions.
Cheers,
Eduardo

Google Cloud Translation API: Creating glossary error

I tried to test the Cloud Translation API using a glossary.
So I created a sample glossary file (.csv) and uploaded it to Cloud Storage.
However, when I ran my test code (copied from the sample code in the official documentation), an error occurred. It seems there is a problem with my sample glossary file, but I cannot find it.
I have attached my code, the error message, and a screenshot of the glossary file.
Could you please tell me how to fix it?
Also, can I use the glossary so that the original term is kept when translating into another language?
Ex) Translation English to Korean
I want to visit California. >>> 나는 California에 방문하고 싶다.
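For reference, an equivalent-term-set glossary is a plain CSV whose first row lists the language codes and whose remaining rows list matching terms; one common way to keep a term such as "California" untranslated is to map it to itself. This is only an illustrative sketch, not the poster's actual file:

en,ko
California,California
Cloud Storage,Cloud Storage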
Sample Code)
from google.cloud import translate_v3 as translate
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "my_service_account_json_file_path"

def create_glossary(
    project_id="YOUR_PROJECT_ID",
    input_uri="YOUR_INPUT_URI",
    glossary_id="YOUR_GLOSSARY_ID",
):
    """
    Create an equivalent term sets glossary. Glossary can be words or
    short phrases (usually fewer than five words).
    https://cloud.google.com/translate/docs/advanced/glossary#format-glossary
    """
    client = translate.TranslationServiceClient()
    # Supported language codes: https://cloud.google.com/translate/docs/languages
    source_lang_code = "ko"
    target_lang_code = "en"
    location = "us-central1"  # The location of the glossary

    name = client.glossary_path(project_id, location, glossary_id)
    language_codes_set = translate.types.Glossary.LanguageCodesSet(
        language_codes=[source_lang_code, target_lang_code]
    )

    gcs_source = translate.types.GcsSource(input_uri=input_uri)
    input_config = translate.types.GlossaryInputConfig(gcs_source=gcs_source)

    glossary = translate.types.Glossary(
        name=name, language_codes_set=language_codes_set, input_config=input_config
    )

    parent = client.location_path(project_id, location)
    # glossary is a custom dictionary Translation API uses
    # to translate the domain-specific terminology.
    operation = client.create_glossary(parent=parent, glossary=glossary)

    result = operation.result(timeout=90)

    print("Created: {}".format(result.name))
    print("Input Uri: {}".format(result.input_config.gcs_source.input_uri))

create_glossary("my_project_id", "file_path_on_my_cloud_storage_bucket", "test_glossary")
Error Message)
Traceback (most recent call last):
File "C:/Users/ME/py-test/translation_api_test.py", line 120, in <module>
create_glossary("my_project_id", "file_path_on_my_cloud_storage_bucket", "test_glossary")
File "C:/Users/ME/py-test/translation_api_test.py", line 44, in create_glossary
result = operation.result(timeout=90)
File "C:\Users\ME\py-test\venv\lib\site-packages\google\api_core\future\polling.py", line 127, in result
raise self._exception
google.api_core.exceptions.GoogleAPICallError: None No glossary entries found in input files. Check your files are not empty. stats = {total_examples = 0, total_successful_examples = 0, total_errors = 3, total_ignored_errors = 3, total_source_text_bytes = 0, total_target_text_bytes = 0, total_text_bytes = 0, text_bytes_by_language_map = []}
Glossary File)
https://drive.google.com/file/d/1RaladmLjgygai3XsZv3Ez4ij5uDH5EdE/view?usp=sharing
I solved my problem by changing the encoding of the glossary file to UTF-8.
I also found that the glossary can be used so that the original term is kept when translating into another language.
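For anyone hitting the same thing, a minimal sketch of re-saving the CSV as UTF-8 before uploading it to Cloud Storage (cp949 is only a guess at the original encoding, a common default on Korean Windows systems; adjust it to whatever your file actually uses):

# Read the glossary with its original encoding and write it back out as UTF-8.
# "cp949" is an assumed source encoding; replace it with the real one if different.
with open("glossary.csv", "r", encoding="cp949") as src:
    content = src.read()

with open("glossary_utf8.csv", "w", encoding="utf-8") as dst:
    dst.write(content)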

GridSearch with XGBoost producing DeprecationWarning and running in an infinite loop

I am trying to do hyperparameter tuning using GridSearchCV on XGBoost, but I'm getting the following warning:
/usr/local/lib/python3.6/dist-packages/sklearn/preprocessing/label.py:151: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
  if diff:
This keeps on running forever. Given below is the code.
classifier = xgb.XGBClassifier()
from sklearn.grid_search import GridSearchCV
n_estimators=[10,50,100,150,200,250,300]
max_depth=[2,3,4,5,6,7,8,9,10]
learning_rate=[0.1,0.01,0.09,0.08,0.07,0.001]
colsample_bytree=[0.5,0.6,0.7,0.8,0.9]
min_child_weight=[1,2,3,4,5,6,7,8,9,10]
gamma=[0.001,0.01,0.1,0.2,0.3,0.4,0.5,1]
subsample=[0.5,0.6,0.7,0.8,0.9]
param_grid = dict(n_estimators=n_estimators, max_depth=max_depth, learning_rate=learning_rate,
                  colsample_bytree=colsample_bytree, min_child_weight=min_child_weight,
                  gamma=gamma, subsample=subsample)
grid = GridSearchCV(classifier, param_grid, cv=10, scoring='accuracy')
grid.fit(X, Y)
grid.grid_scores_
print(grid.best_score_)
print(grid.best_params_)
print(grid.best_estimator_)
# Predicting the Test set results
Y_pred = classifier.predict(X_test)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(Y_test, Y_pred)
I am using Python 3.5; XGBoost and the grid search library have already been preloaded. I am running this on Google Colaboratory.
Please suggest what is going wrong.
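One observation: the grid as written spans 7 x 9 x 6 x 5 x 10 x 8 x 5 = 756,000 parameter combinations, and with cv=10 that means over 7.5 million model fits, which is why the search appears to run forever; the DeprecationWarning is only a warning, not a fatal error. A sketch of one way to keep the search tractable, using RandomizedSearchCV from the newer sklearn.model_selection module (n_iter and cv are placeholder values, not a recommendation from the original post):

from sklearn.model_selection import RandomizedSearchCV

# Sample a fixed number of parameter combinations instead of trying all 756,000.
search = RandomizedSearchCV(
    classifier,
    param_distributions=param_grid,
    n_iter=50,           # placeholder: number of sampled combinations
    cv=3,                # fewer folds than 10 to keep the runtime manageable
    scoring='accuracy',
    random_state=42,
)
search.fit(X, Y)
print(search.best_score_)
print(search.best_params_)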
