How to use run_mlm with unlabeled text - huggingface-transformers

I want to fine-tune a pre-trained Hugging Face model for a particular domain. From this answer I know I can do it using run_mlm.py, but I can't understand which format I should use for my text file. I tried a simple structure with one document per line, and I get the following error:
python run_mlm.py --model_name_or_path "neuralmind/bert-base-portuguese-cased" --train_file ../data/full_corpus.csv --output models/ --do_train
Traceback (most recent call last):
File "run_mlm.py", line 449, in <module>
main()
File "run_mlm.py", line 384, in main
load_from_cache_file=not data_args.overwrite_cache,
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/dataset_dict.py", line 303, in map
for k, dataset in self.items()
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/dataset_dict.py", line 303, in <dictcomp>
for k, dataset in self.items()
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/arrow_dataset.py", line 1260, in map
update_data=update_data,
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/arrow_dataset.py", line 157, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/fingerprint.py", line 163, in wrapper
out = func(self, *args, **kwargs)
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/arrow_dataset.py", line 1529, in _map_single
writer.write_batch(batch)
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/arrow_writer.py", line 278, in write_batch
pa_table = pa.Table.from_pydict(typed_sequence_examples)
File "pyarrow/table.pxi", line 1474, in pyarrow.lib.Table.from_pydict
File "pyarrow/array.pxi", line 322, in pyarrow.lib.asarray
File "pyarrow/array.pxi", line 222, in pyarrow.lib.array
File "pyarrow/array.pxi", line 110, in pyarrow.lib._handle_arrow_array_protocol
File "/mnt/sdb/data-mwon/paperChega/env2/lib/python3.6/site-packages/datasets/arrow_writer.py", line 100, in __arrow_array__
if trying_type and out[0].as_py() != self.data[0]:
File "pyarrow/array.pxi", line 1058, in pyarrow.lib.Array.__getitem__
File "pyarrow/array.pxi", line 540, in pyarrow.lib._normalize_index
IndexError: index out of bounds
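For anyone hitting the same thing, here is a minimal sketch of how the corpus could be prepared as a plain .txt file with exactly one non-empty document per line. It assumes (the traceback does not confirm this) that empty or multi-line rows are part of the problem; the helper name write_line_by_line and the output file name are made up for illustration.

```python
import os
import tempfile


def write_line_by_line(documents, out_path):
    """Write one document per line, skipping entries that are empty.

    Internal newlines and repeated whitespace are collapsed to single
    spaces so that every document really occupies exactly one line.
    Returns the number of documents written.
    """
    kept = 0
    with open(out_path, "w", encoding="utf-8") as handle:
        for doc in documents:
            line = " ".join(doc.split())
            if line:
                handle.write(line + "\n")
                kept += 1
    return kept


# Illustrative input: the second entry is empty, the third spans two lines.
docs = ["primeiro documento", "   ", "segundo\ndocumento"]
corpus_path = os.path.join(tempfile.mkdtemp(), "full_corpus.txt")
written = write_line_by_line(docs, corpus_path)
print(written, "documents written to", corpus_path)
```

As far as I understand the example script, a file like this can be passed with --train_file together with the --line_by_line flag, but check the argument list of the run_mlm.py version you are using.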

Pipeline Loading Models and Tokenizers for Q&A

Hi, I'm trying to use 'fmikaelian/flaubert-base-uncased-squad' for question answering. I understand that I should load the model and the tokenizer, but I'm not sure how I should do this.
My code so far is basically:
from transformers import pipeline, BertTokenizer
nlp = pipeline('question-answering', \
model='fmikaelian/flaubert-base-uncased-squad', \
tokenizer='fmikaelian/flaubert-base-uncased-squad')
Most probably this can be solved with a two-liner.
Many thanks
EDIT
I have also tried to use automodels but it seems those are not there:
OSError: Model name 'flaubert-base-uncased-squad' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased). We assumed 'flaubert-base-uncased-squad' was a path or url to a configuration file named config.json or a directory containing such a file but couldn't find any such file at this path or url.
EDIT II
I tried following the approach suggested with the following code that loads models that have been saved from S3:
tokenizer_ = FlaubertTokenizer.from_pretrained(MODELS)
model_ = FlaubertModel.from_pretrained(MODELS)
p = transformers.QuestionAnsweringPipeline(
    model=transformers.AutoModel.from_pretrained(MODELS),
    tokenizer=transformers.AutoTokenizer.from_pretrained(MODELS)
)
question_="Quel est le montant de la garantie?"
language_="French"
context_="le montant de la garantie est € 1000"
output=p({'question':question_, 'context': context_})
print(output)
Unfortunately I have been getting the following error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
Traceback (most recent call last):
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\spawn.py", line 114, in _main
File "question_extraction.py", line 61, in <module>
prepare(preparation_data)
output=p({'question':question_, 'context': context_}) File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\spawn.py", line 225, in prepare
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\site-packages\transformers\pipelines.py", line 802, in __call__
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\... ...\Box Sync\nlp - 2...\NLP\src\question_extraction.py", line 61, in <module>
output=p({'question':question_, 'context': context_})
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\site-packages\transformers\pipelines.py", line 802, in __call__
for example in examples
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\site-packages\transformers\pipelines.py", line 802, in <listcomp>
for example in examples
for example in examples File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\site-packages\transformers\pipelines.py", line 802, in <listcomp>
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\site-packages\transformers\data\processors\squad.py", line 304, in squad_convert_examples_to_features
for example in examples
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\site-packages\transformers\data\processors\squad.py", line 304, in squad_convert_examples_to_features
with Pool(threads, initializer=squad_convert_example_to_features_init, initargs=(tokenizer,)) as p:with Pool(threads, initializer=squad_convert_example_to_features_init, initargs=(tokenizer,)) as p:
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\context.py", line 119, in Pool
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\context.py", line 119, in Pool
context=self.get_context())context=self.get_context())
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\pool.py", line 174, in __init__
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\pool.py", line 174, in __init__
self._repopulate_pool()self._repopulate_pool()
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\pool.py", line 239, in _repopulate_pool
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\pool.py", line 239, in _repopulate_pool
w.start()
w.start()
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\process.py", line 105, in start
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\context.py", line 322, in _Popen
self._popen = self._Popen(self)
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
return Popen(process_obj) File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)reduction.dump(process_obj, to_child)
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\reduction.py", line 60, in dump
_check_not_importing_main()
File "C:\Users\... ...\AppData\Local\Continuum\Anaconda3\envs\nlp_nlp\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
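The RuntimeError above asks for the main-module idiom. A minimal sketch of that idiom, using only the standard library (the square function and the pool size are placeholders, not part of the original script):

```python
from multiprocessing import Pool


def square(x):
    """Toy worker; it must live at module level so child processes can import it."""
    return x * x


def main():
    # On Windows, child processes are spawned by re-importing this module,
    # so any code that starts processes has to sit behind the guard below.
    with Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    print(main())
```

Without the guard, each spawned child would re-run the top-level code and try to start its own pool, which is what the error message describes.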
EDIT IV
I solved the previous EDIT error by placing the functions inside the "main".
Unfortunately when I run the following code:
tokenizer_ = FlaubertTokenizer.from_pretrained(MODELS)
model_ = FlaubertModel.from_pretrained(MODELS)
def question_extraction(text, question, model, tokenizer, language="French", verbose=False):
    if language=="French":
        nlp = pipeline('question-answering', \
                       model=model, \
                       tokenizer=tokenizer)
    else:
        nlp=pipeline('question-answering')
    output=nlp({'question':question, 'context': text})
    answer, score = output.answer, output.score
    if verbose==True:
        print("Q: ", question ,"\n",\
              "A:", answer,"\n", \
              "Confidence (%):", "{0:.2f}".format(str(score*100) )
        )
    return answer, score
if __name__=="__main__":
    question_="Quel est le montant de la garantie?"
    language_="French"
    text="le montant de la garantie est € 1000"
    answer, score=question_extraction(text, question_, model_, tokenizer_, language_, verbose= True)
I'm getting the following error:
C:\...\NLP\src>python question_extraction.py
OK
OK
convert squad examples to features: 100%|████████████████████████████████████████████████| 1/1 [00:00<00:00, 4.66it/s]
add example index and unique id: 100%|███████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "question_extraction.py", line 77, in <module>
answer, score=question_extraction(text, question_, model_, tokenizer_, language_, verbose= True)
File "question_extraction.py", line 60, in question_extraction
output=nlp({'question':question, 'context': text})
File "C:\...\transformers\pipelines.py", line 818, in __call__
start, end = self.model(**fw_args)
ValueError: not enough values to unpack (expected 2, got 1)
As stated in the source, there is a specific QuestionAnsweringPipeline. The example below is what I used to successfully load the Flaubert model.
import transformers as trf
p = trf.QuestionAnsweringPipeline(model=trf.AutoModel.from_pretrained("fmikaelian/flaubert-base-uncased-squad"), tokenizer=trf.AutoTokenizer.from_pretrained("fmikaelian/flaubert-base-uncased-squad"))
Of course, there is also the alternative of using the pre-trained model class FlaubertForQuestionAnswering directly, since pipelines were only just released in the latest version and might be subject to change.
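As a side note on the ValueError from EDIT IV: the message is what Python produces when a single-element result is unpacked into two names. The sketch below reproduces only that mechanism with stand-in functions. That a base model without a question-answering head returns a single output, while a model with such a head returns start and end scores, is my assumption and is not confirmed by the traceback.

```python
def base_model_output():
    # Stand-in for a model without a QA head: a 1-tuple of hidden states.
    return ("hidden_states",)


def qa_model_output():
    # Stand-in for a model with a QA head: start and end scores.
    return ("start_logits", "end_logits")


# Unpacking two values works when two values are returned ...
start, end = qa_model_output()

# ... and fails in the same way as the pipeline call when only one comes back.
try:
    start, end = base_model_output()
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```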

Pyinstaller no exe created, nothing in dist folder

When I run pyinstaller myFile.py, the build and dist folders are created, along with the file myFile.spec, but the dist folder is empty. In the shell I get a traceback at the end:
File "/home/indiana/anaconda3/bin/pyinstaller", line 11, in <module>
sys.exit(run())
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/__main__.py", line 111, in run
run_build(pyi_config, spec_file, **vars(args))
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/__main__.py", line 63, in run_build
PyInstaller.building.build_main.main(pyi_config, spec_file, **kwargs)
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/building/build_main.py", line 838, in main
build(specfile, kw.get('distpath'), kw.get('workpath'), kw.get('clean_build'))
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/building/build_main.py", line 784, in build
exec(text, spec_namespace)
File "<string>", line 31, in <module>
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/building/api.py", line 411, in __init__
strip_binaries=self.strip, upx_binaries=self.upx,
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/building/api.py", line 196, in __init__
self.__postinit__()
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/building/datastruct.py", line 158, in __postinit__
self.assemble()
File "/home/indiana/anaconda3/lib/python3.6/site-packages/PyInstaller/building/api.py", line 273, in assemble
pylib_name = os.path.basename(bindepend.get_python_library_path())
File "/home/indiana/anaconda3/lib/python3.6/posixpath.py", line 146, in basename
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
Furthermore, I get a warning during the process:
WARNING: Hidden import "PyQt5.sip" not found!
I'm more concerned about the traceback (since it seems to be related to the environment rather than to my code), so I wonder whether anybody has seen this before and has some kind of solution.
Thanks in advance !
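The last frames show os.path.basename() receiving None from bindepend.get_python_library_path(), i.e. no Python library path was found. A small diagnostic sketch, standard library only, that prints what the interpreter itself reports about its library. Whether PyInstaller relies on exactly these build variables is an assumption on my part; the function name python_library_info is made up.

```python
import os
import sysconfig


def python_library_info():
    """Collect build settings that describe the Python library of this interpreter."""
    return {
        "Py_ENABLE_SHARED": sysconfig.get_config_var("Py_ENABLE_SHARED"),
        "LDLIBRARY": sysconfig.get_config_var("LDLIBRARY"),
        "LIBDIR": sysconfig.get_config_var("LIBDIR"),
    }


info = python_library_info()
for name, value in info.items():
    print(f"{name} = {value!r}")

# The final TypeError in the traceback is simply basename() being given None.
try:
    os.path.basename(None)
    basename_error = None
except TypeError as exc:
    basename_error = exc
print(type(basename_error).__name__, basename_error)
```

If LDLIBRARY names a file that does not exist under LIBDIR, that would be consistent with the library lookup returning None, but I have not verified this against the PyInstaller source.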

Biom file v.1.0.0-dev created with biomformat not valid

I created a BIOM file, yyy.biom, using the functions make_biom and write_biom of the R package biomformat, version 1.2.0.
yyy.biom looks like:
"id":{},
"format":["Biological Observation Matrix 1.0.0-dev"],
"format_url":["http://biom-format.org/documentation/format_versions/biom-1.0.html"],
"type":["OTU table"],
"generated_by":["biomformat 1.2.0"],
"date":["2017-09-21 14:36:28"],
"matrix_type":["dense"],
"matrix_element_type":["int"],
"shape":[5,6],
"rows":[
{"id":["GG_OTU_1"],"metadata":["k__Bacteria","p__Proteobacteria","c__Gammaproteobacteria","o__Enterobacteriales","f__Enterobacteriaceae","g__Escherichia","s__"]},
{"id":["GG_OTU_2"],"metadata":["k__Bacteria","p__Cyanobacteria","c__Nostocophycideae","o__Nostocales","f__Nostocaceae","g__Dolichospermum","s__"]},
{"id":["GG_OTU_3"],"metadata":["k__Archaea","p__Euryarchaeota","c__Methanomicrobia","o__Methanosarcinales","f__Methanosarcinaceae","g__Methanosarcina","s__"]},
{"id":["GG_OTU_4"],"metadata":["k__Bacteria","p__Firmicutes","c__Clostridia","o__Halanaerobiales","f__Halanaerobiaceae","g__Halanaerobium","s__Halanaerobiumsaccharolyticum"]},
{"id":["GG_OTU_5"],"metadata":["k__Bacteria","p__Proteobacteria","c__Gammaproteobacteria","o__Enterobacteriales","f__Enterobacteriaceae","g__Escherichia","s__"]}
],
"columns":[
{"id":["Sample1"],"metadata":["CGCTTATCGAGA","CATGCTGCCTCCCGTAGGAGT","gut","human gut"]},
{"id":["Sample2"],"metadata":["CATACCAGTAGC","CATGCTGCCTCCCGTAGGAGT","gut","human gut"]},
{"id":["Sample3"],"metadata":["CTCTCTACCTGT","CATGCTGCCTCCCGTAGGAGT","gut","human gut"]},
{"id":["Sample4"],"metadata":["CTCTCGGCCTGT","CATGCTGCCTCCCGTAGGAGT","skin","human skin"]},
{"id":["Sample5"],"metadata":["CTCTCTACCAAT","CATGCTGCCTCCCGTAGGAGT","skin","human skin"]},
{"id":["Sample6"],"metadata":["CTAACTACCAAT","CATGCTGCCTCCCGTAGGAGT","skin","human skin"]}
],
"data": [[0,0,1,0,0,0],
[5,1,0,2,3,1],
[0,0,1,4,2,0],
[2,1,1,0,0,1],
[0,1,1,0,0,0]]
BIOM validation returns an error:
$ biom validate-table -i yyy.biom
Traceback (most recent call last):
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/bin/biom", line 6, in <module>
sys.exit(biom.cli.cli())
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/biom_format-2.1.5-py3.5-linux-x86_64.egg/biom/cli/table_validator.py", line 49, in validate_table
valid, report = _validate_table(input_fp, format_version, detailed_report)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/biom_format-2.1.5-py3.5-linux-x86_64.egg/biom/cli/table_validator.py", line 62, in _validate_table
detailed_report=detailed_report)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/biom_format-2.1.5-py3.5-linux-x86_64.egg/biom/cli/table_validator.py", line 113, in __call__
detailed_report=detailed_report)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/biom_format-2.1.5-py3.5-linux-x86_64.egg/biom/cli/table_validator.py", line 97, in run
return self._validate_json(**kwargs)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/biom_format-2.1.5-py3.5-linux-x86_64.egg/biom/cli/table_validator.py", line 305, in _validate_json
status_msg = method(table_json)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/biom_format-2.1.5-py3.5-linux-x86_64.egg/biom/cli/table_validator.py", line 496, in _valid_type
if value.lower() not in self.TableTypes:
AttributeError: 'list' object has no attribute 'lower'
I tried importing to QIIME 2 and got this error:
Traceback (most recent call last):
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/bin/qiime", line 6, in <module>
sys.exit(q2cli.__main__.qiime())
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/q2cli/tools.py", line 111, in import_data
view_type=source_format)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/qiime2/sdk/result.py", line 192, in import_data
return cls._from_view(type_, view, view_type, provenance_capture)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/qiime2/sdk/result.py", line 217, in _from_view
result = transformation(view)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/qiime2/core/transform.py", line 59, in transformation
new_view = transformer(view)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/qiime2/core/transform.py", line 207, in wrapped
file_view = transformer(view)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/q2_types/feature_table/_transformer.py", line 128, in _8
data = _parse_biom_table_v100(ff)
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/q2_types/feature_table/_transformer.py", line 46, in _parse_biom_table_v100
table = biom.Table.from_json(json.load(fh))
File "/home/akense/programs/miniconda3/envs/qiime2.env.analysis/lib/python3.5/site-packages/biom_format-2.1.5-py3.5-linux-x86_64.egg/biom/table.py", line 3672, in from_json
dtype = MATRIX_ELEMENT_TYPE[json_table['matrix_element_type']]
TypeError: unhashable type: 'list'
QIIME 2 support suggested that the format 1.0.0-dev might be causing the issue. Could that be it? If so, is there a way to change the format?
Thanks!
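Both tracebacks fail on values that are one-element lists where a plain string is expected ("type":["OTU table"], "matrix_element_type":["int"]). A rough sketch of unwrapping those fields in Python before validating. The helper names are made up, and this only touches the top-level scalar fields and the row/column ids; the "id":{} entry and the list-valued metadata are left alone and may still need attention.

```python
import json

# Top-level fields that appear as one-element lists in the file shown above.
SCALAR_KEYS = ("format", "format_url", "type", "generated_by", "date",
               "matrix_type", "matrix_element_type")


def unbox(value):
    """Return the single element of a one-element list, else the value unchanged."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


def unbox_biom(table):
    """Return a copy of the parsed JSON table with boxed scalars unwrapped."""
    fixed = dict(table)
    for key in SCALAR_KEYS:
        if key in fixed:
            fixed[key] = unbox(fixed[key])
    for axis in ("rows", "columns"):
        fixed[axis] = [dict(entry, id=unbox(entry["id"]))
                       for entry in fixed.get(axis, [])]
    return fixed


# Cut-down sample in the same shape as yyy.biom.
sample = {
    "type": ["OTU table"],
    "matrix_element_type": ["int"],
    "shape": [5, 6],
    "rows": [{"id": ["GG_OTU_1"], "metadata": ["k__Bacteria"]}],
    "columns": [{"id": ["Sample1"], "metadata": ["gut"]}],
}
fixed = unbox_biom(sample)
print(json.dumps(fixed, indent=2))
```

For a real file, the same function could be applied between json.load() and json.dump(); "shape" is deliberately not unwrapped because it is meant to be a two-element list.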

Celery not starting in OS X - dbm.error: db type is dbm.gnu, but the module is not available

I'm trying to run a Celery worker on OS X (Mavericks). I activated a virtual environment (Python 3.4) and tried to start Celery with this command:
celery worker --app=scheduling -linfo
Where scheduling is my celery app.
But I ended up with this error: dbm.error: db type is dbm.gnu, but the module is not available
Complete stacktrace:
Traceback (most recent call last):
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/kombu/utils/__init__.py", line 320, in __get__
return obj.__dict__[self.__name__]
KeyError: 'db'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/other/PhoenixEnv/bin/celery", line 9, in <module>
load_entry_point('celery==3.1.9', 'console_scripts', 'celery')()
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/__main__.py", line 30, in main
main()
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/celery.py", line 80, in main
cmd.execute_from_commandline(argv)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/celery.py", line 768, in execute_from_commandline
super(CeleryCommand, self).execute_from_commandline(argv)))
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/base.py", line 308, in execute_from_commandline
return self.handle_argv(self.prog_name, argv[1:])
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/celery.py", line 760, in handle_argv
return self.execute(command, argv)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/celery.py", line 692, in execute
).run_from_argv(self.prog_name, argv[1:], command=argv[0])
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/worker.py", line 175, in run_from_argv
return self(*args, **options)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/base.py", line 271, in __call__
ret = self.run(*args, **kwargs)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bin/worker.py", line 209, in run
).start()
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/worker/__init__.py", line 100, in __init__
self.setup_instance(**self.prepare_args(**kwargs))
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/worker/__init__.py", line 141, in setup_instance
self.blueprint.apply(self, **kwargs)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bootsteps.py", line 221, in apply
step.include(parent)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bootsteps.py", line 347, in include
return self._should_include(parent)[0]
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/bootsteps.py", line 343, in _should_include
return True, self.create(parent)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/worker/components.py", line 220, in create
w._persistence = w.state.Persistent(w.state, w.state_db, w.app.clock)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/worker/state.py", line 161, in __init__
self.merge()
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/worker/state.py", line 169, in merge
self._merge_with(self.db)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/kombu/utils/__init__.py", line 322, in __get__
value = obj.__dict__[self.__name__] = self.__get(obj)
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/worker/state.py", line 238, in db
return self.open()
File "/Users/other/PhoenixEnv/lib/python3.4/site-packages/celery/worker/state.py", line 165, in open
self.filename, protocol=self.protocol, writeback=True,
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/shelve.py", line 239, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/shelve.py", line 223, in __init__
Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/dbm/__init__.py", line 91, in open
"available".format(result))
dbm.error: db type is dbm.gnu, but the module is not available
Please help.
I switched to python3.5 and got the same error. On Ubuntu I could fix it with:
aptitude install python3.5-gdbm
I got the same error. On a MacBook I could fix it with:
brew install gdbm
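To check what is going on before installing anything, the standard library can report which dbm backend a file was written with and whether the current interpreter can import that backend. A minimal sketch; the function names are made up and the path in the usage line is hypothetical, so substitute the worker's actual state file.

```python
import dbm
import importlib


def can_import(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


def diagnose(path):
    """Return (backend name reported by dbm.whichdb, whether it is importable).

    dbm.whichdb returns None when the file cannot be opened and an empty
    string when the format is not recognised.
    """
    backend = dbm.whichdb(path)
    return backend, bool(backend) and can_import(backend)


for backend in ("dbm.gnu", "dbm.ndbm", "dbm.dumb"):
    print(backend, "available:", can_import(backend))

# Hypothetical path for illustration only.
print(diagnose("/path/to/celery-worker-state"))
```

A result like ('dbm.gnu', False) would match the error in the traceback: the file exists in GNU dbm format, but this Python build has no dbm.gnu module.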

Updated pyramid model directory structure now sessions are broken

background
So I have a Pyramid app with a whole lot of models that relate to each other in different ways. These models were initially kept in a bunch of different files according to their general roles. For example, I had a file called auth_models.py that contained the definitions for User and Group.
I've been battling to deal with imports and suchlike because all the model files relate to each other in such a complex way, so I gave in and placed all of them in the same file. Then I updated all my import statements elsewhere, so everything should work.
Now whenever I try to access any view at all I get an internal server error. It turns out that the error is caused by the fact that auth_models.py no longer exists. The error is coming from a pickle.loads statement, so I figure there is some session info being loaded that is no longer working. The full error message as well as my session settings are included at the end of this question.
question
If my assumption is correct, how would I get Pyramid to 'forget' the last sessions in a safe way?
If my assumption is incorrect, what's the best way to fix this? I don't want to revert to my old directory structure because that causes its own problems...
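To make the suspected mechanism concrete, here is a standard-library sketch of what happens when a pickled object refers to a module that has since been removed, plus one possible workaround (aliasing the old module name). The module names are invented stand-ins, and how this would be wired into the Pyramid session machinery is not shown.

```python
import pickle
import sys
import types

# Invented names standing in for the old and new model modules.
OLD_NAME = "legacy_auth_models_demo"
NEW_NAME = "merged_models_demo"

# Build a class that claims to live in the old module, as User once did.
User = type("User", (), {})
User.__module__ = OLD_NAME
old_module = types.ModuleType(OLD_NAME)
old_module.User = User
sys.modules[OLD_NAME] = old_module

# Pickle stores only the dotted location of the class, not the class body.
payload = pickle.dumps(User())

# Simulate the refactor: the old module disappears and the class moves.
del sys.modules[OLD_NAME]
User.__module__ = NEW_NAME
new_module = types.ModuleType(NEW_NAME)
new_module.User = User
sys.modules[NEW_NAME] = new_module

# Stale data now fails with an ImportError, like the session in the traceback.
try:
    pickle.loads(payload)
    load_error = None
except ImportError as exc:
    load_error = exc
print("stale pickle:", load_error)

# Possible workaround: register the new module under the old name.
sys.modules[OLD_NAME] = new_module
restored = pickle.loads(payload)
print("restored:", type(restored).__name__)
```

The other direction, making the app "forget" stale data, amounts to discarding whatever cannot be unpickled. Since the traceback shows the value being read from a cookie, clearing that cookie in the browser should have the same effect for a single user.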
settings
session.type = file
session.data_dir = %(here)s/data/sessions/data
session.lock_dir = %(here)s/data/sessions/lock
session.key = ******
session.secret = *****
session.cookie_on_exception = true
session.auto = true
session.timeout = 1800
error
2013-04-08 10:24:15,642 ERROR [waitress][Dummy-2] Exception when serving /
Traceback (most recent call last):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/toolbar.py", line 122, in toolbar_tween
response = _handler(request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/panels/performance.py", line 55, in resource_timer_handler
result = handler(request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/tweens.py", line 21, in excview_tween
response = handler(request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_tm-0.7-py3.3.egg/pyramid_tm/__init__.py", line 82, in tm_tween
reraise(*exc_info)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_tm-0.7-py3.3.egg/pyramid_tm/compat.py", line 13, in reraise
raise value
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_tm-0.7-py3.3.egg/pyramid_tm/__init__.py", line 63, in tm_tween
response = handler(request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/router.py", line 161, in handle_request
response = view_callable(context, request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/config/views.py", line 345, in rendered_view
result = view(context, request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/config/views.py", line 462, in _class_requestonly_view
inst = view(request)
File "/home/sheena/WORK/mega-3.3/mega/wsgi/pyramidapp/pyramidapp/views/basic_views.py", line 10, in __init__
BaseView.__init__(self,request)
File "/home/sheena/WORK/mega-3.3/mega/wsgi/pyramidapp/pyramidapp/views/class_base_view.py", line 15, in __init__
BaseView.session_init(request)
File "/home/sheena/WORK/mega-3.3/mega/wsgi/pyramidapp/pyramidapp/views/class_base_view.py", line 62, in session_init
if not request.session.__contains__(sKey):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/decorator.py", line 39, in __get__
val = self.wrapped(inst)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/request.py", line 350, in session
return factory(self)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 204, in __init__
value = signed_deserialize(cookieval, self._secret)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 82, in signed_deserialize
return pickle.loads(pickled)
ImportError: No module named 'pyramidapp.models.auth_models'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/mako_templating.py", line 211, in __call__
result = template.render_unicode(**system)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/template.py", line 421, in render_unicode
as_unicode=True)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 767, in _render
**_kwargs_for_callable(callable_, data))
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 799, in _render_context
_exec_template(inherit, lclcontext, args=args, kwargs=kwargs)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 825, in _exec_template
callable_(context, *args, **kwargs)
File "pyramid_debugtoolbar_templates_toolbar_dbtmako", line 111, in render_body
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/panels/request_vars.py", line 42, in content
if hasattr(self.request, 'session'):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/decorator.py", line 39, in __get__
val = self.wrapped(inst)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/request.py", line 350, in session
return factory(self)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 204, in __init__
value = signed_deserialize(cookieval, self._secret)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 82, in signed_deserialize
return pickle.loads(pickled)
ImportError: No module named 'pyramidapp.models.auth_models'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/waitress-0.8.2-py3.3.egg/waitress/channel.py", line 329, in service
task.service()
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/waitress-0.8.2-py3.3.egg/waitress/task.py", line 173, in service
self.execute()
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/waitress-0.8.2-py3.3.egg/waitress/task.py", line 380, in execute
app_iter = self.channel.server.application(env, start_response)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/router.py", line 251, in __call__
response = self.invoke_subrequest(request, use_tweens=True)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/router.py", line 227, in invoke_subrequest
response = handle_request(request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/toolbar.py", line 135, in toolbar_tween
toolbar.process_response(response)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/toolbar.py", line 56, in process_response
vars, request=request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/renderers.py", line 88, in render
return helper.render(value, None, request=request)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/renderers.py", line 557, in render
result = renderer(value, system_values)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/mako_templating.py", line 219, in __call__
reraise(MakoRenderingException(errtext), None, exc_info[2])
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/compat.py", line 131, in reraise
raise value.with_traceback(tb)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/mako_templating.py", line 211, in __call__
result = template.render_unicode(**system)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/template.py", line 421, in render_unicode
as_unicode=True)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 767, in _render
**_kwargs_for_callable(callable_, data))
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 799, in _render_context
_exec_template(inherit, lclcontext, args=args, kwargs=kwargs)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 825, in _exec_template
callable_(context, *args, **kwargs)
File "pyramid_debugtoolbar_templates_toolbar_dbtmako", line 111, in render_body
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/panels/request_vars.py", line 42, in content
if hasattr(self.request, 'session'):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/decorator.py", line 39, in __get__
val = self.wrapped(inst)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/request.py", line 350, in session
return factory(self)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 204, in __init__
value = signed_deserialize(cookieval, self._secret)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 82, in signed_deserialize
return pickle.loads(pickled)
pyramid.mako_templating.MakoRenderingException:
Traceback (most recent call last):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/mako_templating.py", line 211, in __call__
result = template.render_unicode(**system)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/template.py", line 421, in render_unicode
as_unicode=True)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 767, in _render
**_kwargs_for_callable(callable_, data))
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 799, in _render_context
_exec_template(inherit, lclcontext, args=args, kwargs=kwargs)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/Mako-0.7.3-py3.3.egg/mako/runtime.py", line 825, in _exec_template
callable_(context, *args, **kwargs)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/templates/toolbar.dbtmako", line 60, in render_body
${panel.content()|n}
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid_debugtoolbar-1.0.4-py3.3.egg/pyramid_debugtoolbar/panels/request_vars.py", line 42, in content
if hasattr(self.request, 'session'):
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/decorator.py", line 39, in __get__
val = self.wrapped(inst)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/request.py", line 350, in session
return factory(self)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 204, in __init__
value = signed_deserialize(cookieval, self._secret)
File "/home/sheena/WORK/mega-3.3/lib/python3.3/site-packages/pyramid-1.4-py3.3.egg/pyramid/session.py", line 82, in signed_deserialize
return pickle.loads(pickled)
ImportError: No module named 'pyramidapp.models.auth_models'
You have stored some of your model instances in a session cookie, which uses pickle to serialize and deserialize that data.
Because you moved the model to another module, pickle can no longer load the session data.
You can do two things:
If you don't care about the session data, simply delete your session cookie. Use your browser tools to delete the cookie manually, or delete all cookies for your site.
Create an alias for the model in the old location. Create a pyramidapp.models.auth_models module that simply imports the models that used to be there. This module does not need to be imported by anything else; pickle will import it for you when it needs to load the old data.
Any future sessions will be created with the new location of your models, so this affects only old session data.
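To illustrate why the alias works, here is a minimal, self-contained sketch rather than actual Pyramid code. The module names legacy_auth_models and relocated_models, the User class, and the register helper are all hypothetical stand-ins (for pyramidapp.models.auth_models, the new location of your models, and one of your model classes), and modules are simulated through sys.modules instead of real files. It shows that pickle stores the dotted module path, fails once that path disappears, and succeeds again once a module with the old name re-exports the class:

```python
import pickle
import sys
import types

# Hypothetical names, used only for this demonstration.
OLD_NAME = "legacy_auth_models"  # stands in for pyramidapp.models.auth_models
NEW_NAME = "relocated_models"    # stands in for wherever the model lives now


class User:
    """Hypothetical model class that ended up in the session."""

    def __init__(self, name):
        self.name = name


def register(module_name, **attrs):
    """Simulate an importable module by placing it in sys.modules."""
    module = types.ModuleType(module_name)
    module.__dict__.update(attrs)
    sys.modules[module_name] = module
    return module


# 1. The model lives in the old module when the session data is pickled.
User.__module__ = OLD_NAME
register(OLD_NAME, User=User)
old_payload = pickle.dumps(User("sheena"))

# 2. The model is moved: the old module no longer exists.
del sys.modules[OLD_NAME]
User.__module__ = NEW_NAME
register(NEW_NAME, User=User)

try:
    pickle.loads(old_payload)
    load_error = None
except ImportError as exc:  # same kind of error as in the traceback
    load_error = exc

# 3. The alias: a module under the old name that re-exports the class.
#    In a real project this would be a file containing a single import.
register(OLD_NAME, User=sys.modules[NEW_NAME].User)
restored = pickle.loads(old_payload)

# 4. Newly pickled data refers to the new location only.
new_payload = pickle.dumps(User("new session"))

print(type(load_error).__name__, load_error)
print(restored.name)
```

In a real application the alias would simply be a file at the old module path containing one import of the moved models. It can be removed once all old session cookies have expired.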
