I'm using Python 3.3 / PyGObject 3.14 on Windows 7 and I have the following problem: calling gi.repository.GLib.utf8_collate_key with a string that is not ASCII-only always raises a UnicodeDecodeError.
Test case:
>from gi.repository import GLib
>asciiText = "a"
>unicodeText = "á"
>asciiText.encode()
b'a'
>unicodeText.encode()
b'\xc3\xa1'
>GLib.utf8_collate_key(asciiText, -1)
'Aa'
>GLib.utf8_collate_key(unicodeText, -1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 1: unexpected end of data
Expected result (from Linux):
>GLib.utf8_collate_key(asciiText, -1)
'a'
>GLib.utf8_collate_key(unicodeText, -1)
'á'
The Windows system's locale is set to Portuguese (Brazil).
Does anybody know how to solve this? I'm considering rolling my own collating function if I can't get this to work.
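If it comes to rolling your own, the standard library's locale module can produce locale-aware sort keys without going through GLib at all. This is only a minimal stdlib sketch, not a drop-in replacement for GLib.utf8_collate_key: the keys it returns depend on the locale configured on the machine (pt_BR here, per the question), and their exact format differs from GLib's.

```python
import locale

# Use the system locale for collation (e.g. Portuguese (Brazil) on this box).
locale.setlocale(locale.LC_COLLATE, '')

def collate_key(text):
    """Return a locale-aware sort key for `text`, like GLib's collate key."""
    return locale.strxfrm(text)

words = sorted(["zebra", "casa", "ábaco"], key=collate_key)
```

Under a proper pt_BR locale, "ábaco" sorts near "a" rather than after "z"; under the bare "C" locale it falls back to code-point order.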
I have pyarrow 2.0.0 installed. The docs for pyarrow.parquet.write_table state
compression (str or dict) – Specify the compression codec, either on a general basis or per-column. Valid values: {‘NONE’, ‘SNAPPY’, ‘GZIP’, ‘LZO’, ‘BROTLI’, ‘LZ4’, ‘ZSTD’}.
Works fine if compression is a string, but when I try using a dict for per-column specification, I get the following error. What am I doing wrong? I can use a similar dict for compression_level on a per-column basis without error.
(py3) C:\tmp\python>python
Python 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> import pandas as pd
>>>
>>> df = pd.DataFrame([[1,2,3],[4,5,6]],columns=['foo','bar','baz'])
>>> t = pa.Table.from_pandas(df)
>>> pq.write_table(t,'test1.pq',compression=dict(foo='zstd',bar='snappy',baz='brotli'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\app\python\anaconda\3\envs\py3\lib\site-packages\pyarrow\parquet.py", line 1717, in write_table
    with ParquetWriter(
  File "c:\app\python\anaconda\3\envs\py3\lib\site-packages\pyarrow\parquet.py", line 554, in __init__
    self.writer = _parquet.ParquetWriter(
  File "pyarrow\_parquet.pyx", line 1390, in pyarrow._parquet.ParquetWriter.__cinit__
  File "pyarrow\_parquet.pyx", line 1236, in pyarrow._parquet._create_writer_properties
  File "stringsource", line 15, in string.from_py.__pyx_convert_string_from_py_std__in_string
TypeError: expected bytes, str found
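The last frame ("string.from_py") suggests the Cython layer is converting the per-column mapping straight to C++ std::string, which only accepts bytes on Python 3. A possible workaround, sketched below, is to encode the mapping before handing it to write_table; `as_bytes_codecs` is a hypothetical helper, not pyarrow API, and whether this avoids the error in pyarrow 2.0.0 is an untested assumption. Upgrading pyarrow may well be the more reliable fix.

```python
def as_bytes_codecs(compression):
    """Encode a {column: codec} mapping so keys and values are bytes."""
    return {
        (k.encode() if isinstance(k, str) else k):
        (v.encode() if isinstance(v, str) else v)
        for k, v in compression.items()
    }

codecs = as_bytes_codecs(dict(foo='zstd', bar='snappy', baz='brotli'))
# pq.write_table(t, 'test1.pq', compression=codecs)
```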
I want to debug Rust with VS Code. I have installed LLDB 6.0.0 and Rust 1.27.1 but I can't debug Rust code with LLDB:
Display settings: variable format=auto, show disassembly=auto, numeric pointer values=off, container summaries=on.
Internal debugger error:
Traceback (most recent call last):
  File "/home/kwebi/.vscode/extensions/vadimcn.vscode-lldb-0.8.9/adapter/debugsession.py", line 1365, in handle_message
    result = handler(args)
  File "/home/kwebi/.vscode/extensions/vadimcn.vscode-lldb-0.8.9/adapter/debugsession.py", line 385, in DEBUG_setBreakpoints
    file_id = os.path.normcase(from_lldb_str(source.get('path')))
  File "/home/kwebi/.vscode/extensions/vadimcn.vscode-lldb-0.8.9/adapter/__init__.py", line 8, in <lambda>
    from_lldb_str = lambda s: s.decode('utf8', 'replace')
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 12-13: ordinal not in range(128)
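Note that the final error is a Unicode*Encode*Error raised from inside a decode call: that pattern means the path handed to from_lldb_str was already a unicode string, and Python 2 implicitly re-encodes unicode as ASCII before decoding, which fails on the non-ASCII characters in the path. A defensive version of that converter might look like the sketch below; this is only an illustration of the pitfall, not the extension's actual fix, and patching extension files by hand is at your own risk.

```python
def from_lldb_str(s):
    # Only decode when we actually hold bytes; text passes through
    # untouched, avoiding the implicit ASCII round-trip that raises
    # UnicodeEncodeError on Python 2.
    if isinstance(s, bytes):
        return s.decode('utf8', 'replace')
    return s
```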
I am trying to load an MRI, but I keep getting the following error:
Traceback (most recent call last):
  File "F:/Study/Projects/BTSaG/Programs/t3.py", line 2, in <module>
    epi_img = nib.load('someones_epi.nii.gzip')
  File "C:\Users\AnkitaShinde\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nibabel\loadsave.py", line 38, in load
    raise FileNotFoundError("No such file: '%s'" % filename)
FileNotFoundError: No such file: 'someones_epi.nii.gzip'
The code used is as follows:
import nibabel as nib
epi_img = nib.load('someones_epi.nii.gzip')
epi_img_data = epi_img.get_data()
epi_img_data.shape  # (53, 61, 33)

import matplotlib.pyplot as plt

def show_slices(slices):
    """ Function to display row of image slices """
    fig, axes = plt.subplots(1, len(slices))
    for i, slice in enumerate(slices):
        axes[i].imshow(slice.T, cmap="gray", origin="lower")

slice_0 = epi_img_data[26, :, :]
slice_1 = epi_img_data[:, 30, :]
slice_2 = epi_img_data[:, :, 16]
show_slices([slice_0, slice_1, slice_2])
plt.suptitle("Center slices for EPI image")
I have also updated the loadsave.py file in nibabel but it didn't work. Please help.
Edit:
The earlier error was resolved. Now another error has been encountered.
Traceback (most recent call last):
  File "F:\Study\Projects\BTSaG\Programs\t3.py", line 2, in <module>
    epi_img = nib.load('someones_epi.nii.gzip')
  File "C:\Users\AnkitaShinde\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nibabel\loadsave.py", line 47, in load
    filename)
nibabel.filebasedimages.ImageFileError: Cannot work out file type of "someones_epi.nii.gzip"
This is an old question, however I may have the solution for it.
I just figured out that nibabel.save() does not allow dots (.) or dashes (-) in folder names. These can exist in file names, however. In your case, the current path is:
C:\Users\AnkitaShinde\AppData\Local\Programs\Python\Python35-32\Lib\site-packages\nibabel\someones_epi.nii.gzip
I would change it to:
C:\Users\AnkitaShinde\AppData\Local\Programs\Python\Python35_32\Lib\site_packages\nibabel\someones_epi.nii.gzip
This is just an example; of course, I don't mean that you should actually rename these package folders, as that might cause other errors.
The actual solution would be to move the file someones_epi.nii.gzip somewhere under your user directory, something like:
C:\Users\AnkitaShinde\Desktop\nibabel\someones_epi.nii.gzip
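Separately, the second error message ("Cannot work out file type") points at the suffix itself: nibabel infers the on-disk format from the file extension, and .gzip is not a suffix it recognizes; the conventional one for gzipped NIfTI is .nii.gz. A small sketch with a hypothetical helper that normalizes the name before loading (renaming the actual file works just as well):

```python
from pathlib import Path

def normalized_nii_path(path):
    """Map the unrecognized '.nii.gzip' suffix to the standard '.nii.gz'."""
    p = Path(path)
    if p.name.endswith('.nii.gzip'):
        p = p.with_name(p.name[:-len('.gzip')] + '.gz')
    return str(p)

# epi_img = nib.load(normalized_nii_path('someones_epi.nii.gzip'))
```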
Tried to run
wordnet.synsets('table')
from Python 3.6 on Windows and got
  File "<stdin>", line 1, in <module>
  File "C:\Users\Leti\Anaconda3\envs\DeepVis\lib\site-packages\nltk\corpus\reader\wordnet.py", line 1424, in synsets
    for p in pos
  File "C:\Users\Leti\Anaconda3\envs\DeepVis\lib\site-packages\nltk\corpus\reader\wordnet.py", line 1426, in <listcomp>
    for offset in index[form].get(p, [])]
  File "C:\Users\Leti\Anaconda3\envs\DeepVis\lib\site-packages\nltk\corpus\reader\wordnet.py", line 1280, in _synset_from_pos_and_offset
    synset = self._synset_from_pos_and_line(pos, data_file_line)
  File "C:\Users\Leti\Anaconda3\envs\DeepVis\lib\site-packages\nltk\corpus\reader\wordnet.py", line 1381, in _synset_from_pos_and_line
    raise WordNetError('line %r: %s' % (data_file_line, e))
nltk.corpus.reader.wordnet.WordNetError: line 'tted dalmatian \r\n': not enough values to unpack (expected 2, got 1)
On Linux it works just fine!
Does someone know what is happening?
Try converting the file nltk_data\corpora\wordnet\data.noun to Unix line endings with a tool like UltraEdit or Notepad++ (Edit → EOL Conversion → Unix (LF)).
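The same conversion can be scripted instead of done in an editor. A small stdlib sketch; the wordnet path in the comment is an assumption, so adjust it to wherever your nltk_data actually lives:

```python
def convert_to_unix_eol(path):
    """Rewrite a file in place with LF-only (Unix) line endings."""
    with open(path, 'rb') as f:
        data = f.read()
    with open(path, 'wb') as f:
        f.write(data.replace(b'\r\n', b'\n'))

# convert_to_unix_eol(r'C:\nltk_data\corpora\wordnet\data.noun')  # hypothetical path
```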
The purpose of the code is to create a graph for the decision tree model.
The code is given below.
dot_data = StringIO()
tree.export_graphviz(clf, out_file=dot_data)
graph = py.graph_from_dot_data(dot_data.getvalue())
print(graph)
Image.open(graph.create_png(), mode='r')
On execution, it gives the following error:
Traceback (most recent call last):
  File "C:/Ankur/Python36/Python Files/Decision_Tree.py", line 58, in <module>
    Image.open(graph.create_png(), mode='r')
  File "C:\Ankur\Python36\lib\site-packages\PIL\Image.py", line 2477, in open
    fp = builtins.open(filename, "rb")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
I am having a hard time resolving this error because I don't understand it.
create_png() returns a bytes object, while Image.open (from PIL) expects a filename or a file object.
Try:
import io
Image.open(io.BytesIO(graph.create_png()))
and it should work
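The 0x89 in the traceback is in fact the first byte of the PNG signature (b'\x89PNG\r\n\x1a\n'): create_png() hands back the image bytes themselves, and Image.open was trying to use them as a filename. Wrapping them in io.BytesIO gives PIL a file-like object instead. A stdlib-only illustration of what the wrapper does (the payload after the signature is a stand-in, not a real image):

```python
import io

png_bytes = b'\x89PNG\r\n\x1a\n' + b'...'  # shape of what create_png() returns
buf = io.BytesIO(png_bytes)                # file-like wrapper PIL can read from
assert buf.read(4) == b'\x89PNG'           # the very byte the utf-8 decoder choked on
```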