I'm trying to learn Gensim using its site.
There is a function named 'remove_stopword_tokens' which is useful for my research.
Now, although the function is documented on their website (exact link: link), I can't import it in my Colab notebook.
Note: This is my code:
import gensim
from gensim.parsing.preprocessing import remove_stopword_tokens
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-2-dbd838c83237> in <module>
----> 1 from gensim.parsing.preprocessing import remove_stopword_tokens
ImportError: cannot import name 'remove_stopword_tokens' from 'gensim.parsing.preprocessing' (/usr/local/lib/python3.7/dist-packages/gensim/parsing/preprocessing.py)
updated & corrected answer
You've run into a limitation of Google Colab - it may not have the most-recent version of libraries.
You can see this by checking the value of gensim.__version__. In my check of Google Colab right now (September 2022), it reports 3.6.0 – a version of Gensim that's about 4 years old and lacks later fixes & additions. The remove_stopword_tokens() function was only added recently.
Fortunately, you can update the gensim package backing the Colab notebook yourself, using a shell-escape to run pip. Inside a Colab cell, run:
!pip install gensim -U
If you'd already done an import gensim, it will warn you that you must restart the runtime for the new code to be found.
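A minimal sketch of a check you could run in a cell before and after the upgrade, so a stale install gives a readable hint instead of a traceback. The helper name try_import is hypothetical (it is not part of Gensim); only the module path and function name come from the question.

```python
import importlib


def try_import(module_name, attr_name):
    """Return module_name.attr_name if it can be imported, else None.

    Hypothetical helper for illustration; not part of Gensim.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # the package (or submodule) is not installed at all
    # getattr returns None when the installed version lacks the attribute
    return getattr(module, attr_name, None)


remove_stopword_tokens = try_import(
    "gensim.parsing.preprocessing", "remove_stopword_tokens"
)
if remove_stopword_tokens is None:
    print("remove_stopword_tokens is unavailable; "
          "run `!pip install -U gensim` and restart the runtime.")
```

If the message still prints after upgrading, the runtime most likely has not been restarted yet.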
Note that, for clarity, you might prefer more-specific imports, as many project style guides suggest, rather than doing any broad top-level import gensim at all. Just import the individual classes and/or functions you need, specifically & explicitly. That is, just:
from gensim.parsing.preprocessing import remove_stopword_tokens
# ... other exact class/function/variable imports you'll use...
remove_stopword_tokens(sentence)
On the other hand, if you want things simple-but-sloppy (not recommended), once you import gensim, it has already (via its own custom initialization routines) imported all of its submodules for you. So you could do:
import gensim # parsing & all gensim's other submodules now referenceable!
gensim.parsing.preprocessing.remove_stopword_tokens(sentence)
(Pro Python programmer style tends not to do this latter approach, of prefixing all in-the-actual-code calls with long dot-paths.)
There are three different etcd Go packages:
github.com/coreos/etcd
go.etcd.io/etcd
go.etcd.io/etcd/v3
According to the commit here, all the official code has changed its package path from go.etcd.io/etcd to go.etcd.io/etcd/v3, with the following message:
This change makes the etcd package compatible with the existing Go
ecosystem for module versioning.
But I can't get the go.etcd.io/etcd/v3 package with the go get command.
So what's the difference between these three etcd Go packages, and how do I use them properly?
Thanks in advance.
There is a known issue in the client v3.4 with go get failing. See this issue: https://github.com/etcd-io/etcd/issues/11154
Although the issue has been closed because it is (supposedly) fixed in v3.5, that version is not yet released (at the time of writing).
There are a few workarounds posted in the issue above. The one that worked for us was to circumvent etcd's incorrectly implemented Go module and pin the version to a commit directly in our go.mod file:
require go.etcd.io/etcd v0.0.0-20200520232829-54ba9589114f
The clientv3 is then imported with:
import "go.etcd.io/etcd/clientv3"
The documentation for no. 2 in the question is at this link:
https://pkg.go.dev/go.etcd.io/etcd/clientv3?tab=doc
The package has the following version and commit hash:
v0.5.0 (ae9734e)
The documentation for no. 3 in the question is at this link:
https://pkg.go.dev/go.etcd.io/etcd/v3/clientv3?tab=doc
The package has the following version and commit hash:
v3.3.0 (c20cc05)
etcd presumably made a breaking change in its latest release and therefore changed the module path so that it differs from the old one. This is the convention recommended on the official Go blog.
See this blog post:
https://blog.golang.org/v2-go-modules
Even though both of them point to the same repo, you have to import these versions differently, as shown below. You can find the correct module path in the go.mod file in the root of the repository.
import "go.etcd.io/etcd/clientv3"
import "go.etcd.io/etcd/v3/clientv3"
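The path convention described in that blog post can be sketched as a small rule. This is written in Python purely for illustration; the function names are hypothetical, and the sketch ignores special cases such as gopkg.in paths.

```python
def module_path(base_path, major):
    """Return the Go module path for a given major version.

    Follows the semantic import versioning convention: majors 0 and 1 keep
    the base path, while major N >= 2 appends a "/vN" suffix.
    """
    if major < 0:
        raise ValueError("major version must be non-negative")
    if major < 2:
        return base_path
    return f"{base_path}/v{major}"


def package_import_path(base_path, major, package):
    """Join the versioned module path with a package directory inside it."""
    return f"{module_path(base_path, major)}/{package}"


print(package_import_path("go.etcd.io/etcd", 0, "clientv3"))  # go.etcd.io/etcd/clientv3
print(package_import_path("go.etcd.io/etcd", 3, "clientv3"))  # go.etcd.io/etcd/v3/clientv3
```

The two printed paths correspond to the two import lines above: the v0.x module keeps the plain path, and the v3.x module carries the /v3 suffix.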
I'm attempting to perform a vector identity operation using two shapefiles in geopandas. When I run it, I get the following error.
ModuleNotFoundError: No module named 'geopandas.sindex'
I've done a quick search and there is no module called geopandas.sindex related to vector identity or anything like it. My geopandas is installed under anaconda and it is installed as I am able to import geopandas.
Here is an excerpt from my code.
import geopandas as gpd
geo_df1 = gpd.read_file("shapefile1.shp")
geo_df2 = gpd.read_file("shapefile2.shp")
geo_df3 = gpd.overlay(geo_df1, geo_df2, how="identity")
I strongly suspect something has gone wrong in my installation, as this operation has worked correctly in the past, so any recommendations for fixing the installation are appreciated. The expected result is being able to run the geopandas operation successfully.
I get the following deprecation warning when saving/loading a gensim word embedding:
model.save("mymodel.model")
/home/.../lib/python3.7/site-packages/smart_open/smart_open_lib.py:398:
UserWarning: This function is deprecated, use smart_open.open instead.
See the migration notes for details:
https://github.com/RaRe-Technologies/smart_open/blob/master/README.rst#migrating-to-the-new-open-function
'See the migration notes for details: %s' % _MIGRATION_NOTES_URL
I don't understand what to do following the notes on the page.
So, how should I save and open my models instead?
I use Python 3.7, gensim 3.7.3, and smart_open 1.8.4. I think I did not get the warning when using gensim 3.7.1 and Python 3.5. smart_open should have been 1.8.4.
You can ignore most "deprecation warnings", as they're just an advisory about underlying changes that for now still work, but there's a new preferred way to do things that may be required in the future.
In this case, the warning is about a function inside the smart_open package that the gensim package is using. That is, it's not the .save() you are calling that's deprecated, but something inside .save(). The gensim authors will eventually update .save() to use the newly-preferred variant of what smart_open offers.
You can just keep using .save(), ignoring the message as long as things still work for you – unless you'd like to contribute a fix to gensim that removes the warning from .save(). (It may, however, have already been fixed in the development code, to become available in the next gensim release.)
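If the message is merely noisy, one option is to hide just that warning with the standard-library warnings module. This is a minimal sketch, not something gensim or smart_open provides: the wrapper name save_quietly is hypothetical, and the message pattern is taken from the warning text quoted in the question.

```python
import warnings


def save_quietly(save_fn, *args, **kwargs):
    """Call save_fn while hiding only the smart_open deprecation UserWarning.

    Hypothetical wrapper for illustration. The filter is removed again when
    the with-block exits, so other code is unaffected.
    """
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message="This function is deprecated",  # matched at the start of the text
            category=UserWarning,
        )
        return save_fn(*args, **kwargs)


# Usage, assuming `model` is the gensim model from the question:
# save_quietly(model.save, "mymodel.model")
```

Warnings with any other message still pass through, so a genuinely new problem would not be hidden.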
I'm running a job using the mlxtend library, specifically the sequential_feature_selector, which is parallelized using joblib.Parallel (source). When I run the package on my local computer it uses all the available CPUs, but when I send the job to cloud-ml it only uses one core. It doesn't matter what number I put in the n_jobs parameter. I've also tried different machine types, but the same thing happens.
Does anybody know what the problem might be?
For anyone who might be interested, we solved the problem by pinning the sklearn version in setup.py to 0.20.2. We had sklearn in the packages before, but without a version.
# setup.py
from setuptools import find_packages
from setuptools import setup

# scikit-learn was previously listed without a version; pinning it fixed the issue.
REQUIRED_PACKAGES = ['joblib==0.13.0',
                     'scikit-learn==0.20.2',
                     'mlxtend']
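To confirm that the remote environment actually picked up the pinned versions, a job could log them at startup. This is a minimal sketch assuming Python 3.8 or newer (for importlib.metadata); the helper name installed_versions is hypothetical.

```python
import os
from importlib import metadata  # standard library since Python 3.8


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Log what the worker sees, so local and cloud runs can be compared.
print("CPUs visible to Python:", os.cpu_count())
for name, version in installed_versions(["joblib", "scikit-learn", "mlxtend"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing this output between the local machine and the cloud job shows whether the two environments differ in package versions or in the number of visible CPUs.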
New Go programmer here -- apologies if this is well worn territory, but my google searching hasn't turned up the answer I'm looking for.
Short Version: Can I, as a programmer external to the core Go project, force my packages to be imported with a specific name? If so, how?
Long Version: I recently tried to install the bcrypt package from the following GitHub repository, with the following go get command:
go get github.com/golang/crypto
The package downloaded correctly into my workspace, but when I tried to import it, I got the following error:
$ go run main.go
main.go:10:2: code in directory /path/to/go/src/github.com/golang/crypto/bcrypt expects import "golang.org/x/crypto/bcrypt"
i.e. something told Go this package was supposed to be imported with golang.org/x/crypto/bcrypt. This tipped me off that what I actually wanted was
go get golang.org/x/crypto/bcrypt
I'd like to do something similar in my own packages — is this functionality built into Go packaging? Or are the authors of crypto/bcrypt doing something at runtime to detect and reject invalid package import names?
Yes, it's built in. I can't seem to find the implementation document (the feature was introduced in Go 1.4), but the syntax is:
package name // import "your-custom-path"
Example: https://github.com/golang/crypto/blob/master/bcrypt/bcrypt.go#L7
// edit
The design document for this feature is https://docs.google.com/document/d/1jVFkZTcYbNLaTxXD9OcGfn7vYv5hWtPx9--lTx1gPMs/edit
// edit
@JimB pointed to https://golang.org/cmd/go/#hdr-Import_path_checking and to the go1.4 release notes: https://golang.org/doc/go1.4#canonicalimports
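To make the syntax concrete, here is a rough sketch that extracts the import comment from a Go source string. It is written in Python for illustration only and is not how the go tool is implemented; it handles only the // line-comment form on the same line as the package clause, and the names IMPORT_COMMENT and canonical_import_path are hypothetical.

```python
import re

# Matches a package clause followed by an import comment on the same line,
# e.g.:  package bcrypt // import "golang.org/x/crypto/bcrypt"
IMPORT_COMMENT = re.compile(
    r'^\s*package\s+(\w+)\s*//\s*import\s+"([^"]+)"', re.MULTILINE
)


def canonical_import_path(go_source):
    """Return (package_name, import_path), or None if no import comment exists."""
    match = IMPORT_COMMENT.search(go_source)
    return match.groups() if match else None


source = 'package bcrypt // import "golang.org/x/crypto/bcrypt"\n'
print(canonical_import_path(source))  # ('bcrypt', 'golang.org/x/crypto/bcrypt')
```

A package clause without the comment, such as a plain package main, yields None, which corresponds to a package with no enforced import path.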