Linking external documentation in Sphinx

I am trying to link from our project's extension documentation to the core documentation in Sphinx. I've tried intersphinx, but from what I can see it only supports objects, while our documentation doesn't refer to objects; it's just plain .rst files.
I've added
intersphinx_mapping = {
    'project': ('http://link-to-readthedocs/index.html', None),
}
to conf.py and edited the link to :ref:`Documentation` and later :doc:`Documentation`. Neither worked.
The question:
How do I link from one project's documentation to another in Sphinx for plain .rst files (not objects)?
Edit: I've run make html and found my objects.inv, but now I guess I only have it locally? I'm not sure what I'm doing anymore, but when I try to check the object references, I get:
UserWarning: intersphinx inventory 'http://myproject.com/index.html/objects.inv' not fetchable due to <class 'urllib.error.HTTPError'>: HTTP Error 404: Not Found
'%s: %s' % (inv, err.__class__, err))

The first thing to fix here is the link you've included to the base URL of your project docs:
intersphinx_mapping = {
    'project': ('http://link-to-readthedocs/index.html', None),
}
According to the intersphinx docs:
A dictionary mapping unique identifiers to a tuple (target, inventory). Each target is the base URI of a foreign Sphinx documentation set and can be a local path or an HTTP URI. The inventory indicates where the inventory file can be found: it can be None (at the same location as the base URI) or another local or HTTP URI.
Thus, the error is in having the index.html at the end of your target. It should instead look something like this:
intersphinx_mapping = {
    'project': ('http://project.readthedocs.io/en/latest', None),
}
If desired, replace en with the preferred docs language, and latest with the preferred RtD build of the docs.
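Once the target is the documentation root, plain documents and arbitrary labels (not just Python objects) can be cross-referenced through intersphinx's std domain. A minimal sketch, assuming the other project has an install.rst document and defines a label called some-label (both names are hypothetical):
See :doc:`project:install` or :ref:`the install guide <project:some-label>`.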

Load a pre-trained model from disk with Huggingface Transformers

From the documentation for from_pretrained, I understand I don't have to download the pretrained vectors every time; I can save them and load them from disk with this syntax:
- a path to a `directory` containing vocabulary files required by the tokenizer, for instance saved using the :func:`~transformers.PreTrainedTokenizer.save_pretrained` method, e.g.: ``./my_model_directory/``.
- (not applicable to all derived classes, deprecated) a path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (e.g. Bert, XLNet), e.g.: ``./my_model_directory/vocab.txt``.
So, I went to the model hub:
https://huggingface.co/models
I found the model I wanted:
https://huggingface.co/bert-base-cased
I downloaded it from the link they provided to this repository:
Pretrained model on English language using a masked language modeling
(MLM) objective. It was introduced in this paper and first released in
this repository. This model is case-sensitive: it makes a difference
between english and English.
Stored it in:
/my/local/models/cased_L-12_H-768_A-12/
Which contains:
./
../
bert_config.json
bert_model.ckpt.data-00000-of-00001
bert_model.ckpt.index
bert_model.ckpt.meta
vocab.txt
So, now I have the following:
PATH = '/my/local/models/cased_L-12_H-768_A-12/'
tokenizer = BertTokenizer.from_pretrained(PATH, local_files_only=True)
And I get this error:
> raise EnvironmentError(msg)
E OSError: Can't load config for '/my/local/models/cased_L-12_H-768_A-12/'. Make sure that:
E
E - '/my/local/models/cased_L-12_H-768_A-12/' is a correct model identifier listed on 'https://huggingface.co/models'
E
E - or '/my/local/models/cased_L-12_H-768_A-12/' is the correct path to a directory containing a config.json file
Similarly, when I link to the config.json directly:
PATH = '/my/local/models/cased_L-12_H-768_A-12/bert_config.json'
tokenizer = BertTokenizer.from_pretrained(PATH, local_files_only=True)
if state_dict is None and not from_tf:
try:
state_dict = torch.load(resolved_archive_file, map_location="cpu")
except Exception:
raise OSError(
> "Unable to load weights from pytorch checkpoint file. "
"If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. "
)
E OSError: Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
What should I do differently to get huggingface to use my local pretrained model?
Update to address the comments
YOURPATH = '/somewhere/on/disk/'
name = 'transfo-xl-wt103'
tokenizer = TransfoXLTokenizerFast(name)
model = TransfoXLModel.from_pretrained(name)
tokenizer.save_pretrained(YOURPATH)
model.save_pretrained(YOURPATH)
>>> Please note you will not be able to load the save vocabulary in Rust-based TransfoXLTokenizerFast as they don't share the same structure.
('/somewhere/on/disk/vocab.bin', '/somewhere/on/disk/special_tokens_map.json', '/somewhere/on/disk/added_tokens.json')
So all is saved, but then....
YOURPATH = '/somewhere/on/disk/'
TransfoXLTokenizerFast.from_pretrained('transfo-xl-wt103', cache_dir=YOURPATH, local_files_only=True)
"Cannot find the requested files in the cached path and outgoing traffic has been"
ValueError: Cannot find the requested files in the cached path and outgoing traffic has been disabled. To enable model look-ups and downloads online, set 'local_files_only' to False.
Where is the file located relative to your model folder? I believe it has to be a relative PATH rather than an absolute one. So if the file where you are writing the code is located in 'my/local/', then your code should be like so:
PATH = 'models/cased_L-12_H-768_A-12/'
tokenizer = BertTokenizer.from_pretrained(PATH, local_files_only=True)
You just need to specify the folder where all the files are, not the files directly. I think this is definitely a problem with the PATH. Try changing the style of slashes ("/" vs. "\"), since these differ between operating systems. Also try a leading ".", like so: ./models/cased_L-12_H-768_A-12/.
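Before blaming the slash style, a quick sanity check (my own suggestion, not from the original post) is to list the directory from the same Python process and confirm it is readable and actually contains a config.json, since that is what the error message asks for:
import os

PATH = '/my/local/models/cased_L-12_H-768_A-12/'
print(os.path.isdir(PATH))  # True if the path resolves
print(os.listdir(PATH))     # should include the config and vocab files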
I had this same need and just got this working with TensorFlow on my Linux box, so I figured I'd share.
My requirements.txt file for my code environment:
tensorflow==2.2.0
Keras==2.4.3
scikit-learn==0.23.1
scipy==1.4.1
numpy==1.18.1
opencv-python==4.5.1.48
seaborn==0.11.1
tensorflow-hub==0.12.0
nltk==3.6.2
tqdm==4.60.0
transformers==4.6.0
ipywidgets==7.6.3
I'm using Python 3.6.
I went to this site here which shows the directory tree for the specific huggingface model I wanted. I happened to want the uncased model, but these steps should be similar for your cased version. Also note that my link is to a very specific commit of this model, just for the sake of reproducibility - there will very likely be a more up-to-date version by the time someone reads this.
I manually downloaded the following files (or had to copy/paste them into Notepad++, because in some cases the download button took me to a raw version of the txt/json... odd):
config.json
tf_model.h5
tokenizer_config.json
tokenizer.json
vocab.txt
NOTE: Once again, all I'm using is TensorFlow, so I didn't download the PyTorch weights. If you're using PyTorch, you'll likely want to download those weights instead of the tf_model.h5 file.
I then put those files in this directory on my Linux box:
/opt/word_embeddings/bert-base-uncased/
It's probably a good idea to make sure there are at least read permissions on all of these files as well, with a quick ls -la (my permissions on each file are -rw-r--r--). I also have execute permissions on the parent directory (the one listed above) so people can cd to this dir.
From there, I'm able to load the model like so:
tokenizer:
# python
from transformers import BertTokenizer
# tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
tokenizer = BertTokenizer.from_pretrained("/opt/word_embeddings/bert-base-uncased/")
layer/model weights:
# python
from transformers import TFAutoModel
# bert = TFAutoModel.from_pretrained("bert-base-uncased")
bert = TFAutoModel.from_pretrained("/opt/word_embeddings/bert-base-uncased/")
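From there, a quick hedged smoke test ties the tokenizer and model together (the sentence and the shape check are just examples, not from the original post):
# python
inputs = tokenizer("hello world", return_tensors="tf")
outputs = bert(inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768) for a bert-base model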
This should be quite easy on Windows 10 using a relative path. Assuming your pretrained (PyTorch-based) transformer model is in a 'model' folder in your current working directory, the following code can load your model.
from transformers import AutoModel
model = AutoModel.from_pretrained('.\model',local_files_only=True)
Please note the 'dot' in '.\model'. Omitting it will make the load fail.
In addition to the config file and vocab file, you need to add the TF/Torch model (which has a .h5/.bin extension) to your directory.
In your case, the torch and tf models may be located at these URLs:
torch model: https://cdn.huggingface.co/bert-base-cased-pytorch_model.bin
tf model: https://cdn.huggingface.co/bert-base-cased-tf_model.h5
You can also find all the required files in the "Files and versions" section of your model: https://huggingface.co/bert-base-cased/tree/main
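One caveat if you download weights by hand: from_pretrained expects the standard file names inside a local directory, so the downloaded weights would need to be saved as pytorch_model.bin (or tf_model.h5 for TensorFlow) next to config.json and vocab.txt. A sketch with a hypothetical local path:
from transformers import AutoModel, AutoTokenizer

path = '/my/local/models/bert-base-cased/'  # contains config.json, vocab.txt, pytorch_model.bin
tokenizer = AutoTokenizer.from_pretrained(path, local_files_only=True)
model = AutoModel.from_pretrained(path, local_files_only=True)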
The bert model folder contains these files:
config.json
tf_model.h5
tokenizer_config.json
tokenizer.json
vocab.txt
But what if, instead of these, we have the original checkpoint files:
bert_config.json
bert_model.ckpt.data-00000-of-00001
bert_model.ckpt.index
bert_model.ckpt.meta
vocab.txt
How do we load those?
Here is a short answer.
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained('path/to/vocab.txt', local_files_only=True)
model = BertForMaskedLM.from_pretrained('/path/to/pytorch_model.bin', config='../config.json', local_files_only=True)
Usually config.json need not be supplied explicitly if it resides in the same directory.
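For the original TensorFlow checkpoint layout asked about above (bert_config.json plus bert_model.ckpt.*), the error message quoted in the question already hints at the other option: from_tf=True. A sketch, assuming TensorFlow is installed alongside transformers; point from_pretrained at the .ckpt.index file and supply the config explicitly:
from transformers import BertConfig, BertForMaskedLM, BertTokenizer

path = '/my/local/models/cased_L-12_H-768_A-12/'
config = BertConfig.from_json_file(path + 'bert_config.json')
tokenizer = BertTokenizer.from_pretrained(path + 'vocab.txt', local_files_only=True)
# from_tf=True tells transformers to read the TF 1.x checkpoint
model = BertForMaskedLM.from_pretrained(path + 'bert_model.ckpt.index', from_tf=True, config=config)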
You can use the simpletransformers library. Check out the link for a more detailed explanation.
model = ClassificationModel(
    "bert", "dir/your_path"
)
Here I used ClassificationModel as an example. You can use it for many other tasks as well, like question answering.
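A slightly fuller hedged sketch (binary classification is the default task here; use_cuda=False just avoids assuming a GPU):
from simpletransformers.classification import ClassificationModel

model = ClassificationModel(
    "bert", "dir/your_path", use_cuda=False
)
predictions, raw_outputs = model.predict(["an example sentence"])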

Swift and terminal: Using Google Endpoints in an iOS Client

I am following the tutorial at
https://cloud.google.com/appengine/docs/java/endpoints/calling-from-ios
and when I get to step 5, "Open a new Terminal window to invoke ServiceGenerator", I get this error message in my terminal:
Barrys-MacBook-Pro:~ barrymadej$ /Users/barrymadej/Library/Developer/Xcode/DerivedData/ServiceGenerator-avaeguyitgyhxpcnaejpgzvxezei/Build/Products/Debug/ServiceGenerator \
/Users/barrymadej/Documents/AndroidStudioProjects/StudentProgressTrackerDatabaseAndCloud/backend/build/discovery-docs/myApi-v2-rpc.discovery /
ERROR: An output directory is required.
Usage: ServiceGenerator [FLAGS] [ARGS]

Required Flags:
  --outputDir PATH
    The destination directory for writing the generated files.

Optional Flags:
  --discoveryService URL
    Instead of discovery's default URL, use the specified URL as the
    location to send the JSON-RPC requests. This is useful for running
    against a custom or prerelease server.
  --gtlFrameworkName NAME
    Will generate sources that include GTL's headers as if they are in a
    framework with the given name. If you are using GTL via CocoaPods,
    you'll likely want to pass "GoogleAPIClient" as the value for this.
  --apiLogDir DIR
    Write out a file into DIR for each JSON API description processed. These
    can be useful for reporting bugs if generation fails with an error.
  --httpLogDir PATH
    Turn on the HTTP fetcher logging and set it to write to PATH. This can
    be useful for diagnosing errors on discovery fetches.
  --generatePreferred
    Causes the list of services to be collected, and all preferred services
    to be generated.
  --httpHeader NAME:VALUE
    Causes the given NAME/VALUE pair to be added as an HTTP header on *all*
    HTTP requests made by the generator. Can be used repeatedly to provide
    additional header pairs.
  --formattedName SERVICE:VERSION=NAME
    Causes the given SERVICE:VERSION pair to override its service name in
    files, classes, etc. with NAME. If :VERSION is omitted the override is
    for any version of the service. Can be used repeatedly to provide
    several maps when generating a few things in a single run.
  --addServiceNameDir yes|no  Default: no
    Causes the generator to add a directory with the service name in the
    outputDir for the files. This is useful for generating multiple
    services.
  --generatedDir yes|no  Default: no
    Causes a directory in outputDir called "Generated" to be created and
    used to contain the generated files.
  --removeUnknownFiles yes|no  Default: no
    By default, the generator will report unknown files in the output
    directory, as commonly happens when classes go away in a new API
    version. This option causes the generator to also remove the unknown
    files.
  --rootURLOverrides yes|no  Default: yes
    Causes any API root URL for a Google sandbox server to be replaced with
    the googleapis.com root instead.
  --verbose
    Generate more verbose output. Can be used more than once.

Arguments:
  Multiple arguments can be given on the command line.
  service:version
    The description of the given [service]/[version] pair is fetched and the
    files for it are generated. When using --generatePreferred version can
    be '-' to skip generating the name service.
  http[s]://url/to/rpc_description_json
    A URL to download containing the description of a service to generate.
  path/to/rpc_description.json
    The path to a text file containing the description of a service to
    generate.
ServiceGenerator path:
/Users/barrymadej/Library/Developer/Xcode/DerivedData/ServiceGenerator-avaeguyitgyhxpcnaejpgzvxezei/Build/Products/Debug/ServiceGenerator
ERROR: There was one or more errors; check the full output for details.
Barrys-MacBook-Pro:~ barrymadej$ --outputDir
-bash: --outputDir: command not found
Barrys-MacBook-Pro:~ barrymadej$ /Users/barrymadej/Documents/AndroidStudioProjects/StudentProgressTrackerDatabaseAndCloud/API
You should generate a REST discovery document and use the new Objective-C client instead; the client library you're trying to use is deprecated anyway. That said, it looks like the command failed because you specified the flag without the rest of the command.
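For reference, the required flag has to be part of the same invocation rather than typed afterwards on its own line; something like this, where ~/generated-api is a hypothetical output directory of your choosing:
/Users/barrymadej/Library/Developer/Xcode/DerivedData/ServiceGenerator-avaeguyitgyhxpcnaejpgzvxezei/Build/Products/Debug/ServiceGenerator \
  /Users/barrymadej/Documents/AndroidStudioProjects/StudentProgressTrackerDatabaseAndCloud/backend/build/discovery-docs/myApi-v2-rpc.discovery \
  --outputDir ~/generated-api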

How does Chef include files generated at runtime as a template source

Using a Chef recipe, I first generate a .erb file dynamically, based on inputs from a CSV file, and then want to use that .erb file as a template source. Unfortunately, the changes made to the .erb file are not picked up while the recipe is converging the resources. I also tried lazy evaluation, but I can't figure out how to use it for the template source.
Quoting the template documentation:
source
Ruby Types: String, Array
The location of a template file. By default, the chef-client looks for
a template file in the /templates directory of a cookbook. When the
local property is set to true, use it to specify the path to a template
on the local node. This property may also be used to distribute
specific files to specific platforms. See “File Specificity” below for
more information. Default value: the name of the resource block. (See
“Syntax” section above for more information.)
And
local
Ruby Types: TrueClass, FalseClass
Load a template from a local path. By default, the chef-client loads
templates from a cookbook’s /templates directory. When this property
is set to true, use the source property to specify the path to a
template on the local node. Default value: false.
so what you can do is:
# generate the local .erb file first, let's say source.erb
template "/path/to/file" do
  source "/path/to/source.erb"
  local true
end
Your question sounds like an XY problem: reading a CSV file to make a template sounds counter-productive and could probably be done with attributes, taking advantage of the variables property of the template resource.
Assuming you know how to capture the values from the CSV file as local variables in the recipe.
Examples:
csv_hostname
csv_fqdn
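For completeness, here is a minimal sketch of one way to do that (the file name and column names are hypothetical): read one row from the CSV at the top of the recipe and keep the values in local variables.
require 'csv'

row = CSV.read('/etc/chef/hosts.csv', headers: true).first
csv_hostname = row['hostname']
csv_fqdn = row['fqdn']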
Here is what you do to create a template with lazily evaluated attributes. The following example creates a config file.
example.erb file
# Dynamically generated by awesome Chef so don't alter by hand.
HOSTNAME=<%= @host_name %>
FQDN=<%= @fqdn %>
recipe.rb file
template '/path/to/example.config' do
  source 'example.erb'
  variables lazy {
    {
      :host_name => csv_hostname,
      :fqdn => csv_fqdn
    }
  }
end
If you need it to run at compile time, add the action to the block.
template 'xxx' do
  # blah blah
end.run_action(:create)

How to create a link to external file section with Sphinx?

I want to create a link that refers to a section defined in another file.
I have found a similar question, "Python-Sphinx: Link to Section in external File", and noticed there is an extension called intersphinx.
So I tried this extension, but it doesn't work (probably my usage is wrong).
I have tried the following.
conf.py
extensions = ['sphinx.ext.todo', 'sphinx.ext.intersphinx']
...
intersphinx_mapping = {'myproject': ('../build/html', None)}
foo.rst
...
****************
Install Bar
****************
Please refer to :ref:`Bar Installation Instruction <myproject:bar_installation>`
I want to create a link like 'Bar Installation Instruction' with above markup.
bar.rst
...
**************************
Installation Instruction
**************************
.. _bar_installation:
some text...
When I run make html, I get the following warning and the link is not created.
foo.rst: WARNING: undefined label: myproject:bar_installation (if the link has no caption the label must precede a section header)
Thanks in advance.
Looks like it's not able to find your inventory file. The first part of the tuple serves as the base URL for your links, while the second part is the path to the inventory file. I believe the auto-downloading of inventory files (when you pass None) only works with URIs, not with file paths.
In this example, I can build the documentation locally, but the links will point to http://example.com/docs/bar.html:
'myproject': (
    'http://example.com/docs/',
    '../html/objects.inv'
)
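As a separate hedged note, the warning's parenthetical ("the label must precede a section header") suggests moving the label above the section title in bar.rst, so the reference can also resolve without an explicit caption:
.. _bar_installation:

**************************
Installation Instruction
**************************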

OpenVRML in Snow Leopard (from MacPorts)

Hey, I just downloaded OpenVRML from MacPorts
(port install openvrml)
Now I have a sample program (pretty_print.cpp, from openvrml at SourceForge) that begins like this:
# ifdef HAVE_CONFIG_H
# include <config.h>
# endif
# include <openvrml/vrml97_grammar.h>
# include <openvrml/browser.h>
# include <fstream>
...
Then, in Xcode, I added the following path and checked "recursive" for the Header Search Path and the Library Search Path:
/opt/local/var/macports/software
And all the '***.h file not found' errors disappeared, but now I have the following two:
complex.h 943 '__pow_helper' is not a member of std
c++locale.h 71 'vsnprintf' is not a member of std
/Developer/SDKs/MacOSX10.6.sdk/usr/include/c++/4.2.1/complex: In function 'std::complex<_Tp> std::pow(const std::complex<_Tp>&, int)':
/Developer/SDKs/MacOSX10.6.sdk/usr/include/c++/4.2.1/complex:943: error: '__pow_helper' is not a member of 'std'
both errors come from system files.
I wonder what is causing these errors...
Can anyone advise me on how to use the openvrml samples on Macs?
Thanks in advance.
I've had a similar problem. I had set the "recursive" flag for the '/opt/local/include' path, which pulled in some strange C++ headers from Boost compatibility includes.
In general, you do not want "recursive" flag on your include paths.
Try unchecking "recursive" from your paths.
If you put "recursive" on a path containing Boost headers, you'll pick up some random Boost headers, which are likely designed for a different environment and/or a different compiler, instead of the standard C++ headers. That means, for example, you'll include a TR1 header instead of the standard header. This is likely the cause of your problem (it happened to me too).
Just locate the directory that contains the headers you need and put only that in the header search path, instead of being lazy and using the "recursive" flag, since there are a lot of header files that share the same name and differ only in location.
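If Xcode keeps misbehaving, a sanity check (my own suggestion, assuming MacPorts' default /opt/local prefix and a library named libopenvrml) is to build the sample straight from a terminal:
g++ pretty_print.cpp -I/opt/local/include -L/opt/local/lib -lopenvrml -o pretty_print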
