AttributeError, ImportError on scipy.misc image functions (e.g. imread, imresize, imsave, imshow, etc.)

I have come across two kinds of errors when trying to import or directly use any of the image functions included in the scipy.misc module. Here are two examples of the errors, using the imread() function:
>>> from scipy.misc import imread
ImportError: cannot import name 'imread' from 'scipy.misc'
and
>>> import scipy.misc
>>> scipy.misc.imread
AttributeError: module 'scipy.misc' has no attribute 'imread'
What am I doing wrong?

You are not doing anything wrong. The image functions were removed from the scipy.misc module as of SciPy version 1.2.0. I don't know why they were deemed deprecated and removed, but if you still want to use them, you can roll back to an older SciPy version by uninstalling the current one and installing a previous release:
pip uninstall scipy
pip install scipy==1.1.0
Make sure you have Pillow installed too:
pip install Pillow
If you don't want to use an old version of SciPy, then you will need to change your code. According to the official docs of each deprecated function, here is what SciPy suggests:
fromimage(im) -> np.asarray(im)
imfilter() -> Use Pillow filtering functionality directly.
imread() -> imageio.imread()
imsave() -> imageio.imwrite()
imresize() -> numpy.array(Image.fromarray(arr).resize())
imrotate() -> skimage.transform.rotate()
imshow() -> matplotlib.pyplot.imshow()
toimage() -> Image.fromarray()
These replacements assume the following libraries are installed:
pip install numpy Pillow scikit-image imageio matplotlib
and imported as needed:
import numpy as np
from PIL import Image
import skimage.transform
import imageio
import matplotlib.pyplot as plt
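For example, a minimal sketch of how a few of the old calls map onto the suggested replacements (the file names here are hypothetical):
import imageio
import numpy as np
from PIL import Image

img = imageio.imread('photo.png')     # was: scipy.misc.imread('photo.png')
imageio.imwrite('copy.png', img)      # was: scipy.misc.imsave('copy.png', img)

# Note: Pillow's resize() takes (width, height), while imresize() took (rows, cols)
resized = np.array(Image.fromarray(img).resize((128, 128)))  # was: scipy.misc.imresize(img, (128, 128))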
In addition, here are two sources I found that mention the deprecation of the scipy.misc image I/O functionality:
From scipy.github.io:
The following functions in scipy.misc are deprecated: bytescale, fromimage, imfilter, imread, imresize, imrotate, imsave, imshow and toimage. Most of those functions have unexpected behavior (like rescaling and type casting image data without the user asking for that). Other functions simply have better alternatives.
From imageio.readthedocs.io (especially for imread):
Transitioning from Scipy’s imread
Scipy is deprecating their image I/O functionality.
This document is intended to help people coming from Scipy to adapt to
Imageio’s imread function. We recommend reading the user api and
checkout some examples to get a feel of imageio.
Imageio makes use of a variety of plugins to support reading images (and
volumes/movies) from many different formats. Fortunately, Pillow is
the main plugin for common images, which is the same library as used
by Scipy’s imread. Note that Imageio automatically selects a plugin
based on the image to read (unless a format is explicitly specified),
but uses Pillow where possible.
In short terms: For images previously read by Scipy’s imread, imageio
should generally use Pillow as well, and imageio provides the same
functionality as Scipy in these cases. But keep in mind:
Instead of mode, use the pilmode keyword argument.
Instead of flatten, use the as_gray keyword argument.
The documentation for the above arguments is not on imread, but on the docs of the individual formats, e.g. PNG.
Imageio’s functions all return numpy arrays, albeit as a subclass (so that meta data can be attached).
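To illustrate those two keyword arguments, a minimal sketch (the file name is hypothetical):
import imageio

img_rgb = imageio.imread('photo.png', pilmode='RGB')   # was: scipy.misc.imread('photo.png', mode='RGB')
img_gray = imageio.imread('photo.png', as_gray=True)   # was: scipy.misc.imread('photo.png', flatten=True)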


Type aliases in type hints are not preserved

My code (reduced to just a few lines for this demo) looks like this:
from typing import List
AgentAssignment = List[int]
def assignment2str(assignment: AgentAssignment):
    pass
The produced HTML documentation shows List[int] as the type hint instead of the alias name. This seems to be a resolved issue (#6518), but I am still running into it in 2022 with Sphinx version 5.1.1 (and Python 3.8.2). What am I missing?
I am not sure whether Sphinx supports this yet, but you need to use an explicit PEP 613 TypeAlias annotation (added to typing in Python 3.10, and available from typing_extensions for older versions). Otherwise the type resolver cannot differentiate between a normal variable assignment and a type alias. This is a generic Python solution to the type-alias problem, beyond the scope of Sphinx.
from typing import List, TypeAlias  # on Python < 3.10: from typing_extensions import TypeAlias

AgentAssignment: TypeAlias = List[int]
P.S. I am having the same issue with Sphinx.
So, one needs to add at the beginning of the file (yes, before all other imports):
from __future__ import annotations
Then in conf.py:
autodoc_type_aliases = {'AgentAssignment': 'AgentAssignment'}
Not that this identity-transformation dictionary makes any sense to me, but it did the trick...
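Putting the pieces together, a minimal sketch of the module and conf.py (assuming the alias lives in a module documented with autodoc; the names mirror the example above):
# my_module.py
from __future__ import annotations  # must come before all other imports

from typing import List

AgentAssignment = List[int]

def assignment2str(assignment: AgentAssignment) -> str:
    pass

# conf.py
extensions = ['sphinx.ext.autodoc']
autodoc_type_aliases = {'AgentAssignment': 'AgentAssignment'}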
I also had this issue and was very confused about why the accepted answer here (stackoverflow.com/a/73273330/1822018), which mentions adding your type aliases to the autodoc_type_aliases dictionary as explained in the documentation (https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_type_aliases), was not working for me. I solved my problem and am posting it here for the benefit of others.
In my case, I had also installed the Python package sphinx-autodoc-typehints, which extends/overrides certain Sphinx functionality; in particular, it appears to supplant the handling of the autodoc_type_aliases dictionary. To anyone trying to debug this issue, I would suggest removing 'sphinx_autodoc_typehints' from the extensions list in your Sphinx conf.py file.

Why can we use train_test_split without creating an object, while linear models like LinearRegression or StandardScaler() need an instance?

I am new to Data Science in Python, and while importing different libraries I can see that they are used in two different ways:
import statsmodels.api as sm
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
Even within sklearn, train_test_split and StandardScaler are called differently before use.
As you can see from the source code of the scikit-learn library
(link)
train_test_split is defined as a quick utility to split arrays or matrices into random train and test subsets. It does not need many parameters to define its behaviour, and it keeps no state between calls, so it is exposed as a plain function.
More complex utilities, like StandardScaler() or LinearRegression(), are classes composed of multiple methods and attributes; they give the developer a friendlier way to manage the process they are interested in, which is why you create an instance first.
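To make the distinction concrete, here is a minimal sketch with small made-up arrays: train_test_split is a plain function you call directly, while StandardScaler is a class that keeps state (the fitted mean and scale), so you instantiate it first and reuse it.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Plain function: called directly, returns the split arrays, keeps no state
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Class: instantiate, then fit (learn mean/std) and transform (apply what was learned)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuses the statistics learned on X_train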

Parallel processing in Julia throws errors

My understanding is that parallelization is included by default in a base Julia installation.
However, when I try to use it, I am getting errors that the functions and macros are not defined. For example:
nprocs()
Throws an error:
ERROR: UndefVarError: nprocs not defined
Stacktrace:
[1] top-level scope at none:0
Nowhere in any Julia documentation can I find mention of any packages that need to be included in order to use these functions. Am I missing something here?
I am using Julia version 1.0.5 inside the JuliaPro/Atom IDE.
I figured it out. I'll leave this up for anyone else who is having this problem.
The solution is to import the Distributed package using:
using Distributed
Why this is not included in the documentation I do not know.
Once you know that nprocs is what you need, there are a couple of options for finding where it is defined.
A search through the documentation can help: https://docs.julialang.org/en/v1/search/?q=nprocs
Without leaving the Julia REPL, and even before nprocs gets imported in your session, you can use apropos to find out more about it and determine that you need to import the Distributed package:
julia> apropos("nprocs")
Distributed.nprocs
Distributed.addprocs
Distributed.nworkers
Another way of using apropos is via the help REPL mode:
julia> # type `?` when the cursor is right after the prompt to enter help REPL mode
# note the use of double quotes to trigger "apropos" instead of a regular help query
help?> "nprocs"
Distributed.nprocs
Distributed.addprocs
Distributed.nworkers
The previous options work well in the case of nprocs because it is part of the standard library. JuliaHub is another option, which allows looking for things more broadly, in the entire Julia ecosystem. As an example, looking for nprocs in JuliaHub's "Doc Search" tool also returns relevant results: https://juliahub.com/ui/Documentation?q=nprocs

How to load Gensim FastText model in native FastText

I trained a FastText model in Gensim. I want to use it to encode my sentences. Specifically, I want to use this feature from native FastText:
./fasttext print-word-vectors model.bin < queries.txt
How do I save the model in Gensim so that it is in the correct binary format, understood by native FastText?
I am using FastText 0.1.0 and Gensim 3.4.0 under Python 3.4.3.
In essence, I need the inverse of load_binary_data() as described in the Gensim FastText docs.
You probably won't find such functionality in Gensim, as it would mean depending on fastText's internal structure and code, like what you see in the official Python bindings (which use pybind11 to call the internal fastText API directly). Such a heavy dependency on an external library is something the creators of Gensim would like to avoid, and that is probably why they deprecated the functionality for calling the fastText wrapper. Right now, Gensim only aims to provide the fastText algorithm through its own internal implementation. I would suggest you use the Python bindings for fastText instead:
$ git clone https://github.com/facebookresearch/fastText.git
$ cd fastText
$ pip install .
Now run the training in your Python application with fastText:
from fastText import train_unsupervised
model = train_unsupervised(input="pathtotextfile", model='skipgram')
model.save_model('model.bin')
This saves the model in the fastText command-line binary format, so you should now be able to run the following:
$ ./fasttext print-word-vectors model.bin < queries.txt
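If you prefer to stay in Python rather than calling the command-line tool, the same bindings can load the saved model and print the vectors; a minimal sketch, assuming the fastText 0.1.x module installed above (queries.txt is the same hypothetical file as before):
from fastText import load_model

model = load_model('model.bin')
with open('queries.txt') as f:
    for line in f:
        word = line.strip()
        print(word, model.get_word_vector(word))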

Prevalence of UCS2 vs UCS4 Python chars

The documentation of the built-in function unichr says:
The valid range for the argument depends how Python was configured – it may be either UCS2 [0..0xFFFF] or UCS4 [0..0x10FFFF]
and
the documentation of the built-in function ord says:
If a unicode argument is given and Python was built with UCS2 Unicode, then the character’s code point must be in the range [0..65535] inclusive; otherwise the string length is two, and a TypeError will be raised.
Are there any stats on how widely used the two definitions of code-unit are in production python interpreters?
Any idea how prevalent python scripts are that use something like #!/usr/bin/env python and which run with different code-unit definitions based on the environment of the user running it?
Background:
I'm wondering how much work to put into making a parser-generator backend for Python 2.x produce a single library that works for both configurations, given that Python 3 tightened this.
Specifically, am I likely bloating the generated code bundle unnecessarily by doing
# Module my_generated_parser
try:
    unichr(0x10000)
except ValueError:
    from my_generated_parser_ucs2 import *
else:
    from my_generated_parser_ucs4 import *
and including two generated parsers by default?
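For what it's worth, a shorter probe for the same decision (a sketch of an alternative, not part of the original question) is sys.maxunicode, which is 0xFFFF on UCS2 (narrow) builds and 0x10FFFF on UCS4 (wide) builds of Python 2:
# Module my_generated_parser (alternative detection)
import sys

if sys.maxunicode <= 0xFFFF:
    from my_generated_parser_ucs2 import *
else:
    from my_generated_parser_ucs4 import *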
