OSError: No space on device for weird reason - huggingface-transformers

I am running some code that wasn't initially mine using GPT2-XL; it was initially implemented for GPT-Neo-1.3B and T5-Large. Running an experiment using GPT2-XL, I got the following error:
File "/data/username/directory", line 631, in to_tsr_gpt_entity_inference
_bleu_score = BLEU.compute(predictions=definition,
File "/data/username/miniconda3/lib/python3.9/site-packages/evaluate/module.py", line 433, in compute
self._finalize()
File "/data/username/miniconda3/lib/python3.9/site-packages/evaluate/module.py", line 372, in _finalize
self.writer.finalize()
File "/data/username/miniconda3/lib/python3.9/site-packages/datasets/arrow_writer.py", line 589, in finalize
self.stream.close()
File "/data/username/miniconda3/lib/python3.9/site-packages/fsspec/implementations/local.py", line 358, in close
return self.f.close()
OSError: [Errno 28] No space left on device
The issue is that I'm not sure what is running out of space. I have checked df -h and df -i, and I tried pointing TMPDIR elsewhere with export TMPDIR=/new_directory, but that didn't help, so I can only assume something else is running out of space. It appears to be related to BLEU from the Hugging Face evaluate module; however, the same code (including the BLEU part) works when running GPT-Neo/T5, so I'm not sure what the issue is.
I'd greatly appreciate any help in answering this question.
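One way to narrow this down is to check free space from inside the same Python process, for every directory that might receive the write. The sketch below uses only the standard library. The candidate list is an assumption: the traceback does not show which path was being written, and ~/.cache/huggingface is only assumed here to be the default cache root (HF_HOME is read in case it has been overridden).

```python
import os
import shutil
import tempfile
from pathlib import Path


def free_space_report(paths):
    """Return {path: free_bytes}, measured on the nearest existing parent directory."""
    report = {}
    for raw in paths:
        p = Path(raw).expanduser()
        # Walk up until something exists, so disk_usage() has a real path to inspect.
        while not p.exists() and p != p.parent:
            p = p.parent
        report[str(raw)] = shutil.disk_usage(p).free
    return report


# Hypothetical candidates; adjust to whatever your environment actually uses.
candidates = [
    tempfile.gettempdir(),                              # honours TMPDIR
    os.environ.get("HF_HOME", "~/.cache/huggingface"),  # assumed cache root
    os.getcwd(),
]

for path, free in free_space_report(candidates).items():
    print(f"{path}: {free / 1e9:.2f} GB free")
```

If one of these filesystems is nearly full while TMPDIR points somewhere else, the write is probably not going through TMPDIR at all.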

Related

'Error compiling Cython file error' from one day to another

I use a lot of special characters from the Hungarian language, and there were no problems previously. Now they all give errors when running the whole script (F9). It still runs perfectly when running a selection (select + F5).
Fúú='bar'
Traceback:
C:\Users\my name\.ipython\cython\_cython_magic_21a3824690cdb52a9fe6a3fa1c63ee73.pyx:1:1: Unrecognized character
Traceback (most recent call last):
File "<ipython-input-6-dcfab52d0ff4>", line 1, in <module>
runfile('E:/Anyagok/Programozas/Python/projekts/elo/mindennap/untitled0.pyx', wdir='E:/Anyagok/Programozas/Python/projekts/elo/mindennap')
File "E:\Download\PROGIK\ANACONDA\lib\site-packages\spyder\utils\site\sitecustomize.py", line 703, in runfile
ipython_shell.run_cell_magic('cython', '', f.read())
File "E:\Download\PROGIK\ANACONDA\lib\site-packages\IPython\core\interactiveshell.py", line 2131, in run_cell_magic
result = fn(magic_arg_s, cell)
File "<decorator-gen-130>", line 2, in cython
File "E:\Download\PROGIK\ANACONDA\lib\site-packages\IPython\core\magic.py", line 187, in <lambda>
call = lambda f, *a, **k: f(*a, **k)
File "E:\Download\PROGIK\ANACONDA\lib\site-packages\Cython\Build\IpythonMagic.py", line 321, in cython
assert len(extensions) == 1
TypeError: object of type 'NoneType' has no len()
If I change to e.g.
Fuu='bar'
it works great. Why the sudden change of heart?
EDIT:
I was messing around with FFMPEG and LIBAV yesterday, because I wanted to download and convert YouTube videos to mp3. But I'm pretty sure I ran scripts with these characters successfully after that.

WindowsError when calling sc.parallelize()

I want to use the sc.parallelize() function, but whenever I try to call it, I get the below error:
File "V:/PyCharmProjects/sample.py", line 9, in <module>
input_data = sc.parallelize(sc.textFile("C:\Users\Spider\Desktop\GM_coding\Sample Data.csv"))
File "V:\spark-2.2.0-bin-hadoop2.7\python\pyspark\context.py", line 497, in parallelize
os.unlink(tempFile.name)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: u'C:\\Users\\Spider\\AppData\\Local\\Temp\\spark-fef6debd-ff91-4fb6-85dc-8c3a1da9690a\\pyspark-6ed523e7-358f-4e3c-ad83-a479fb8ecc52\\tmpxffhfi'
Not sure if it's relevant to your error (and cannot test it in Windows), but you are trying to parallelize something that is already an RDD (i.e. "parallelized"); from the docs:
textFile(name, minPartitions=None, use_unicode=True)
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
You don't need (and shouldn't use) sc.parallelize() here; the output of sc.textFile is already an RDD. You should simply go for
input_data = sc.textFile(r"C:\Users\Spider\Desktop\GM_coding\Sample Data.csv")
See also the example in the quick start guide.

AlignIO gives 'AssertionError' when reading emboss alignment files

I have been stuck on a problem for three days... searched everywhere, posted on Biostar, still waiting for EMBL to respond to emails... would make a bounty if I had more rep.
After aligning sequences with EMBOSSwin needle() (pairwise global alignments) I get alignment files in pair format, with a .needle file extension. I want to use Biopython to read these alignments for later analysis.
I use AlignIO.read(open('alignment.needle'),'emboss') following the instructions in Biopython's AlignIO wiki but I keep getting an AssertionError.
My code:
>>> from Bio import AlignIO
>>> alignment = AlignIO.read(open("data/all/out/pair1_alignment.needle"), "emboss")
My error:
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "C:\Python27\lib\Bio\AlignIO\__init__.py", line 423, in read
first = next(iterator)
File "C:\Python27\lib\Bio\AlignIO\__init__.py", line 370, in parse
for a in i:
File "C:\Python27\lib\Bio\AlignIO\EmbossIO.py", line 150, in __next__
assert seq.replace("-", "") != ""
AssertionError
Example Alignment File:
Download the alignment file here
Versions:
Windows 7
Python version 2.7.3
Biopython version 1.63
EMBOSS version 2.10.0-0.8
Clues:
I suspect this may be related to a warning message I kept getting when making the alignments, which was output by the EMBOSS needle() function:
Warning: Sequence character string not found in ajSeqCvtKS
Duplicate post on BioStars, http://www.biostars.org/p/87226/#87399
This appears to be down to a subtle change in the EMBOSS output. You have an extremely old version, EMBOSS version 2.10.0 (February 2005), and your output file has lines like this:
gag 1288 -------------------------------------------------- 1287
Using a newer version of EMBOSS (e.g. 6.3.0), gives lines like this:
gag 1287 -------------------------------------------------- 1287
The Biopython parser is expecting the latter for alignment sections with no letters (e.g. when one sequence is much longer than the other), where the start and end coordinates agree. Please update your copy of EMBOSS, and then the parser should be happy. The current EMBOSS release is version 6.5.0.
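If upgrading EMBOSS is not immediately possible, the difference described above could in principle be patched in the alignment file before parsing. The following is only a sketch of that idea, not something Biopython or EMBOSS provides. The helper name normalise_gap_only_line is hypothetical, and the sketch assumes that making the start coordinate equal to the end coordinate on gap-only lines is all the parser needs.

```python
import re

# Groups: name + padding, start coordinate, padding, gap-only sequence, padding, end coordinate.
GAP_ONLY = re.compile(r"^(\s*\S+\s+)(\d+)(\s+)(-+)(\s+)(\d+)\s*$")


def normalise_gap_only_line(line):
    """On a gap-only alignment line, rewrite the start coordinate to match the end."""
    body = line.rstrip("\n")
    newline = line[len(body):]
    m = GAP_ONLY.match(body)
    if not m:
        return line  # lines containing residues, markup, or headers are left alone
    prefix, start, sep1, gaps, sep2, end = m.groups()
    if start == end:
        return line
    # Keep the start field the same width so column positions are preserved.
    return f"{prefix}{end.rjust(len(start))}{sep1}{gaps}{sep2}{end}{newline}"


print(normalise_gap_only_line("gag 1288 ---------- 1287"))
```

Each line of the .needle file would be passed through this function and the result written to a new file, which is then given to AlignIO.read. Updating EMBOSS remains the cleaner fix.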
The problem is that you're passing a file in the wrong format to Biopython. An explanation follows.
Formatting
The format of the file you've linked to is srspair (see the header of pair1_aligned.fasta). It's worth noting that this is not the FASTA format - that's an entirely different format.
Delving into the source of Biopython's EmbossIO, we can see that the EmbossIterator (which is called by AlignIO.read when the format is 'emboss') is only meant to handle the formats pair and simple (see Alignment formats for an explanation of the various formats).
Solution
If you export EMBOSS's output in the pair format (then call AlignIO.read as you have before), that should solve your problem.
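To check which format a given file declares before handing it to AlignIO, the header can be inspected directly. This is a minimal sketch that assumes the file's header contains a comment line of the form "# Align_format: <name>"; the function name emboss_align_format and the sample lines are illustrative, not part of Biopython.

```python
def emboss_align_format(lines):
    """Return the value of the '# Align_format:' header line, or None if absent."""
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#") and "Align_format:" in stripped:
            return stripped.split("Align_format:", 1)[1].strip()
    return None


# Illustrative header lines; a real file would be read with open(...).
sample_header = [
    "########################################",
    "# Program: needle",
    "# Align_format: srspair",
    "########################################",
]
print(emboss_align_format(sample_header))
```

With a real file, `emboss_align_format(open("data/all/out/pair1_alignment.needle"))` would return whatever format name the header declares.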

OCaml bugs during why3 usage

I'm trying to compile why3ide (why3-0.81) with krakatoa & jessie (why-2.33) for Windows (Cygwin). Everything went fine, except that I can't get the bottom-right text box to show notations (it is always empty). Moreover, I get the error highlighted in the picture every time I try to select an item to prove.
Image: https://dl.dropboxusercontent.com/u/39984835/why3ide/error_capture.jpg
Here is this error:
Apply transformation introduce_premises
Why3ide callback raised an exception:
anomaly: End_of_file
Backtrace:
Raised at file "format.ml", line 197, characters 41-52
Called from file "format.ml", line 425, characters 8-33
Called from file "format.ml", line 440, characters 6-24
How can I debug this error?
(I'm a newbie to OCaml.)
The format.ml file is here:
cygwin/lib/ocaml/format.ml
Files that refer to the introduce_premises transformation are here:
why3-0.81/drivers/gappa.drv
why3-0.81/src/ide/gmain.ml
why3-0.81/src/transform/introduction.ml
why3-0.81/drivers/mathematica.drv
P.S. I tried to add why3 & why3ide tags for this post, but my reputation is not enough for that yet.

Missing local commands after running zopeskel on windows

I am trying to create an archetype with zopeskel/paster on my newly installed Plone 4.2. I have adjusted buildout.cfg (see below) so that zopeskel.exe and paster.exe are generated in the bin folder.
However, when I run zopeskel as follows (in the develop-eggs folder):
..\bin\zopeskel.exe archetype
I get an IOError (see below for the output).
From what I understand, I should now have local commands when running paster (such as add). However, when I run paster in the develop-eggs/nortek.test03 folder, there are no commands.
Is there a bug/flaw in zopeskel, or am I doing something wrong? How do I proceed?
Traceback (most recent call last):
File "C:\Plone42\bin\zopeskel-script.py", line 16, in <module>
zopeskel.zopeskel_script.run()
File "c:\plone42\eggs\zopeskel-2.21.2-py2.6.egg\zopeskel\zopeskel_script.py", line 397, in run
command.run( [ '-q', '-t', template_name ] + optslist )
File "c:\plone42\eggs\pastescript-1.7.5-py2.6.egg\paste\script\command.py", line 238, in run
result = self.command()
File "c:\plone42\eggs\pastescript-1.7.5-py2.6.egg\paste\script\create_distro.py", line 170, in command
egg_info_dir = pluginlib.egg_info_dir(output_dir, dist_name)
File "c:\plone42\eggs\pastescript-1.7.5-py2.6.egg\paste\script\pluginlib.py", line 135, in egg_info_dir
% ', '.join(all))
IOError: No egg-info directory found (looked in .\nortek.test03\.\nortek.test03.egg-info, .\nortek.test03\CHANGES.txt\nortek.test03.egg-info, .\nortek.test03\CONTRIBUTORS.txt\nortek.test03.egg-info, .\nortek.test03\docs\nortek.test03.egg-info, .\nortek.test03\MANIFEST.in\nortek.test03.egg-info, .\nortek.test03\nortek\nortek.test03.egg-info, .\nortek.test03\README.txt\nortek.test03.egg-info, .\nortek.test03\setup.cfg\nortek.test03.egg-info, .\nortek.test03\setup.py\nortek.test03.egg-info)
My buildout.cfg is identical to the default except for the following:
parts =
    zeo
    instance
    run-instance
    run-zeo
    service
    service-zeo
    zopeskel
[zopeskel]
recipe = zc.recipe.egg
unzip = true
eggs =
    Paste
    PasteScript
    ZopeSkel
[EDIT]
I tried to follow the instructions in the link provided. However, several problems occur:
* no paste script is generated in bin folder
* I still get exactly the same IOError issue
* There are no local commands
I put the output from the different commands I ran at this link:
http://pastie.org/4664202
So please help me, as I still have the same problem.
Please use the updated instructions:
http://collective-docs.readthedocs.org/en/latest/getstarted/paste.html
Please tell us where you found those instructions, and we will try to take the bad ones down.

Resources