sphinx:No content update in html content from docstring - python-sphinx

I have been working on a project on to interface with Senna which is tool used in NLP processing using Python. For easy generation of documentation I followed reStructuredText documentation style which is pretty easy one.
On calling make html, few time(and sometime no warning) there has been shown warning like
docstring of pntl.tools.Annotator.test:2: WARNING: Field list ends without a blank line; unexpected unindent and one more thing what is the use of this number 2 displayed in the working.
def test(senna_path="/media/jawahar/jon/ubuntu/senna", sent="", dep_model="", batch=False,
jar_path="/media/jawahar/jon/ubuntu/practNLPTools-lite/pntl"):
"""please replace the path of yours environment(accouding to OS path)
:parama str senna_path: path for senna location
:parama str dep_model: stanford dependency parser model location
:parama str or list sent: the sentense to process with Senna
:parama bool batch: makeing as batch process with one or more sentense passing
:parama str jar_path: location of stanford-parser.jar file
"""
and Image of the built result is been attach to show error in the content of html. For detail view of my project follow this link

The error indicates that you have incorrect syntax, specifically no blank lines around the description and the field list, and indentation is not correct. White space matters.
Spelling matters, too. You probably meant :param blah blah: thing not :parama blah blah: thing:
See Info field lists for more information.
Edit
The following example should fix the issue. Note the correct spelling of "param", and the necessary line break separating the parameter list from the description in the docstring. Additionally, to avoid PEP8 warnings in your code (reStructuredText does not really care in this case), you should wrap long lines as noted in the method definition. There is another new line wrapping in the parameter list so that Sphinx will render it correctly as well as avoid the PEP8 warning.
def test(senna_path="/media/jawahar/jon/ubuntu/senna", sent="", dep_model="",
batch=False,
jar_path="/media/jawahar/jon/ubuntu/practNLPTools-lite/pntl"):
"""
please replace the path of yours environment(accouding to OS path)
:param str senna_path: path for senna location
:param str dep_model: stanford dependency parser model location
:param str or list sent: the sentense to process with Senna
:param bool batch: makeing as batch process with one or more sentense
passing
:param str jar_path: location of stanford-parser.jar file
"""

Related

Pylint 2.12.1 different behaviour from 2.11.1

We in the process of upgrading from Pylint 2.11.1 to 2.12.1 (released recently) and we're seeing strange behaviour in checking code that passes with the older version. Specifically, we have the following method (sans code but otherwise exactly the code we have so that there can be no doubt there's a cut'n'paste error):
async def run_callback(callback: common_types.AnyCallback) -> None:
"""Run the callback, handles sync and async functions.
This WILL block the event loop if a sync function is called this way.
IF a sync callback needs to be called, it should be wrapped in an
async function and then called with run in executor. This cannot be
done at this level because run_in_executor is a separate thread.
Most async stuff is not thread safe and vice versa, so this is the
minimal abstraction which won't introduce race conditions, the
developer needs to handle it by manually doing a run_in_executor.
Example:
def sync_thing():
pass
async def async_thing():
pass
from cmnlibpy import utils
await util.run_callback(sync_thing)
await util.run_callback(async_thing)
Args:
callback:
sync or async function which will be called
"""
While this is perfectly acceptable to 2.11.1, the newer version barfs on the parameter despite the fact it's in the docstring:
************* Module blah.blah.blah
blah/blah/blah.py:20:0: W9015: "callback" missing in parameter documentation (missing-param-doc)
I've tried moving it within the docstring, renaming it, and various other things to no avail. If you're concerned about the type, it's defined in an imported item from common_types.py in the same package:
from typing import Union, Awaitable, Callable
SyncCallback = Callable[[], None]
AsyncCallback = Callable[[], Awaitable[None]]
AnyCallback = Union[SyncCallback, AsyncCallback]
Why is the new PyLint complaining about this? Has it decided to deprecate Google-style parameter specifications?
Further investigation: it appears that the problem doesn't manifest itself when parameter and description are on a single line, such as:
callback: description goes here.
A diff of the GoogleDocstring regular expressions for parameter lines in extensions/_check_docs_utils.py seems to indicate it should either be a colon with the description on the same line, or no colon with description on following line:
- \s* (\w+) # identifier
- \s* :
- \s* (?:({GoogleDocstring.re_multiple_type})(?:,\s+optional)?)? # optional type declaration
- \n # description starts on a new line
- \s* (.*) # description
+ \s* (\*{{0,2}}\w+)(\s?(:|\n)) # identifier with potential asterisks
+ \s* (?:({GoogleDocstring.re_multiple_type})(?:,\s+optional)?\n)? # optional type declaration
+ \s* (.*) # optional description
However, the multi-line version didn't seem to work even if I left the : off the end of the line.
At the request of PyLint developer, I have opened a bug report (see https://github.com/PyCQA/pylint/issues/5452 for detail).
It turns out there was an issue with the handling of parameter specifications that crossed multiple lines.
The code uses a regex that always delivers the three components (name, type, and description) but the code had some strangeness that resulted in a situation where neither type nor description was set, meaning the parameter wasn't recorded.
The code responsible was in extensions/_check_docs_utils.py:
re_only_desc = re.search(":\n", entry)
if re_only_desc:
param_name = match.group(1)
param_desc = match.group(2)
param_type = None
else:
param_name = match.group(1)
param_type = match.group(2)
param_desc = match.group(3)
I'm not sure why the presence of:\n is used to decide that the type isn't available but that doesn't seem right. Its presence means that the description is on the following line (nothing to do with missing type) but the fact that you're using \s means line breaks are treated the same as any other white space in the section so it's irrelevant.
Because the regex used for this captures all three groups whether the type is there or not, if it's not there, the second group (the parameter type) is set to None and the third group is still the description (it doesn't move to group 2). So the first part of the if sets both type and description to None, stopping the variable from being added.
If the code is instead changed into (no if needed):
param_name = match.group(1)
param_type = match.group(2)
param_desc = match.group(3)
then the first test case below still works, and the second one starts working:
Args:
first_param: description
second_param:
description
Full details at the bug report and pull request, the maintainers were a pleasure to work with to get the fix done and it will hopefully be in 2.12.2, whenever that appears.
As a workaround, you can just ensure that the entire parameter specification exists on one line, such as with:
""" Doc string goes here.
Args:
param: The parameter.
"""

Bullet list style for single parameter functions in RTD theme using autodoc and Sphinx?

I've noticed that when I use autodoc with the ReadTheDoc theme, if I have multiple arguments in my functions they are listed in a bullet list style:
arg1
arg2
...
but if there is only 1 argument then it is not using the bullet list style which is a bit silly to me since it breaks the continuity of the design.
I've found how to remove the disc via CSS to make things more uniform but I actually want to do the opposite and have the disk for the single argument functions.
At this point, I'm not sure it is a CSS change and I do not know how to do that.
I've also noticed the same thing in different docs.
Here is the rendered html:
Here are the 2 methods:
def add_attribute(self, name, index):
"""
:param name: The name attached to the attribute.
:param index: The position of the attribute within the list of attributes. """
print("")
def delete_attribute(self, name):
"""
:param name: The name of the attribute to delete."""
print("")
Here is the my .rst:
API
----------------
.. automodule:: my_module
:members:
Here is the conf.py
extensions = [
'sphinx_rtd_theme',
'sphinx.ext.autodoc',
'sphinx.ext.napoleon',
'sphinx.ext.coverage',
'sphinx.ext.autosummary',
]
templates_path = ['_templates']
language = 'python'
exclude_patterns = []
html_theme = "sphinx_rtd_theme"
html_static_path = ['_static']
autosummary_generate = True
Any idea?
Cheers!
After a lot of digging, I've found a partial workaround for this.
My solution involves manually editing the produced HTML files to insert the missing bullet points.
Required conf.py changes:
# Register hook to run when build is complete
def setup(app):
app.connect('build-finished', on_build_finished)
# Hook implementation
def on_build_finished(app, exception):
add_single_param_bullets("_build/html/index.html")
# Function to actually add the bullet points by overwriting the given HTML file
def add_single_param_bullets(file_path):
print('Add single parameter bullets in {:s}'.format(file_path))
if not os.path.exists(file_path):
print(' File not found, skipping...')
return
lines_enc = []
with open(file_path, 'rb') as f:
for l in f.readlines():
# Check for html that indicates single parameter function
if b'<dd class="field-odd"><p><strong>' in l:
# Work out the encoding if not defined
enc = None
if enc is None:
import chardet
enc = chardet.detect(l)['encoding']
# Decode html and get the parameter information that needs adding
l_dec = l.decode(enc)
l_insert = l_dec.replace('<dd class="field-odd">', '').replace('\r\n', '')
# Add new encoded lines to output
lines_enc.append('<dd class="field-odd"><ul class="simple">'.encode('utf=8'))
lines_enc.append('<li>{:s}</li>'.format(l_insert).encode(enc))
lines_enc.append('</ul>'.encode('utf=8'))
else:
lines_enc.append(l)
# Overwrite the original file with the new changes
with open(file_path, 'wb') as f:
for l in lines_enc:
f.write(l)
In my case, I only have single argument functions in index.html. However, you can register additional files in on_build_finished.
A few things to note:
This only edits the produced HTML files, and doesn't actually solve the underlying problem. I dug through the source for a bit but couldn't find why the bullet points aren't added for single parameter function.
The problem is not just for the RTD theme. It seems to occur with the basic theme as well. So I suspect it's a deeper problem with Sphinx rather than the RTD theme.
The code above somewhat deals with different encodings in the original HTML.
This does not work on the RTD website. As the HTML files are edited in place, and the RTD build outputs the HTML files to a different directory, this solution doesn't seem to work on the RTD website. This is quite annoying. A solution would be to somehow change the RTD build process, or tell RTD to use pre-built HTML sources rather than building its own, but I don't know how to do so.
After spending a few hours working all this out, I actually think it looks better without the bullet points...

Module variable documentation error

I get the following error while documenting a module variable json_class_index (See source), which does not have a docstring.
The generated documentation seems to be fine. What is a good fix?
reading sources... [100%] sanskrit_data_schema_common
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:3: WARNING: Unexpected indentation.
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:4: WARNING: Block quote ends without a blank line; unexpected unindent.
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:7: WARNING: Unexpected indentation.
/home/vvasuki/sanskrit_data/sanskrit_data/schema/common.py:docstring of sanskrit_data.schema.common.json_class_index:8: WARNING: Inline strong start-string without end-string.
Edit:
PS: Note that removing the below docstring makes the error disappear, so it seems to be the thing to fix.
.. autodata:: json_class_index
:annotation: Maps jsonClass values to Python object names. Useful for (de)serialization. Updated using update_json_class_index() calls at the end of each module file (such as this one) whose classes may be serialized.
The warning messages indicate that the reStructuredText syntax of your docstrings is not valid and needs to be corrected.
Additionally your source code does not comply with PEP 8. Indentation should be 4 spaces, but your code uses 2, which might cause problems with Sphinx.
First make your code compliant with PEP 8 indentation.
Second, you must have two lines separating whatever precedes info field lists and the info field lists themselves.
Third, if the warnings persist, then look at the line numbers in the warnings—3, 4, 7, and 8—and the warnings themselves. It appears that the warnings correspond to this block of code:
#classmethod
def make_from_dict(cls, input_dict):
"""Defines *our* canonical way of constructing a JSON object from a dict.
All other deserialization methods should use this.
Note that this assumes that json_class_index is populated properly!
- ``from sanskrit_data.schema import *`` before using this should take care of it.
:param input_dict:
:return: A subclass of JsonObject
"""
Try this instead, post-PEP-8-ification, which should correct most of the warnings caused by faulty white space in your docstring:
#classmethod
def make_from_dict(cls, input_dict):
"""
Defines *our* canonical way of constructing a JSON object from a dict.
All other deserialization methods should use this.
Note that this assumes that json_class_index is populated properly!
- ``from sanskrit_data.schema import *`` before using this should take care of it.
:param input_dict:
:return: A subclass of JsonObject
"""
This style is acceptable according to PEP 257. The indentation is visually and vertically consistent, where the triple quotes vertically align with the left indentation. I think it's easier to read.
The fix was to add a docstring for the variable as follows:
#: Maps jsonClass values to Python object names. Useful for (de)serialization. Updated using update_json_class_index() calls at the end of each module file (such as this one) whose classes may be serialized.
json_class_index = {}

Separate YAML and plain text on the same document

While building a blog using django I realized that it would be extremely practical to store the text of an article and all the related informations (title, author, etc...) together in a human-readable file format, and then charge those files on the database using a simple script.
Now that said, YAML caught my attention for his readability and ease of use, the only downside of the YAML syntax is the indentation:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
text:|
This is the text of the article. I can write whatever I want
but I need to be careful with the indentation...and this is a
bit boring.
---
I believe that's not the best solution (especially if the files are going to be written by casual users). A format like this one could be much better
---
title: Title of the article
author: Somebody
# Other stuffs here ...
---
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
Is there any solution? Preferably using python.
Other file formats propositions are welcome as well!
Unfortunately this is not possible, what one would think could work is using | for a single scalar in the separate document:
import ruamel.yaml
yaml_str = """\
title: Title of the article
author: Somebody
---
|
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
for d in ruamel.yaml.load_all(yaml_str):
print(d)
print('-----')
but it doesn't because | is the block indentation indicator. And although at the top level an indentation of 0 (zero) would easily work, ruamel.yaml (and PyYAML) don't allow this.
It is however easy to parse this yourself, which has the advantage over using the front matter package that you can use YAML 1.2 and are not restricted to using YAML 1.1 because of frontmaker using the PyYAML. Also note that I used the more appropriate end of document marker ... to separate YAML from the markdown:
import ruamel.yaml
combined_str = """\
title: Title of the article
author: Somebody
...
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
with open('test.yaml', 'w') as fp:
fp.write(combined_str)
data = None
lines = []
yaml_str = ""
with open('test.yaml') as fp:
for line in fp:
if data is not None:
lines.append(line)
continue
if line == '...\n':
data = ruamel.yaml.round_trip_load(yaml_str)
continue
yaml_str += line
print(data['author'])
print(lines[2])
which gives:
Somebody
I want...
(the round_trip_load allows dumping with preservation of comments, anchor names etc).
I found Front Matter does exactly what I want to do.
There is also a python package.

SPSS syntax for naming individual analyses in output file outline

I have created syntax in SPSS that gives me 90 separate iterations of general linear model, each with slightly different variations fixed factors and covariates. In the output file, they are all just named as "General Linear Model." I have to then manually rename each analysis in the output, and I want to find syntax that will add a more specific name to each result that will help me identify it out of the other 89 results (e.g. "General Linear Model - Males Only: Mean by Gender w/ Weight covariate").
This is an example of one analysis from the syntax:
USE ALL.
COMPUTE filter_$=(Muscle = "BICEPS" & Subj = "S1" & SMU = 1 ).
VARIABLE LABELS filter_$ 'Muscle = "BICEPS" & Subj = "S1" & SMU = 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0). FILTER BY filter_$.
EXECUTE.
GLM Frequency_Wk6 Frequency_Wk9
Frequency_Wk12 Frequency_Wk16
Frequency_Wk20
/WSFACTOR=Time 5 Polynomial
/METHOD=SSTYPE(3)
/PLOT=PROFILE(Time)
/EMMEANS=TABLES(Time)
/CRITERIA=ALPHA(.05)
/WSDESIGN=Time.
I am looking for syntax to add to this that will name this analysis as: "S1, SMU1 BICEPS, GLM" Not to name the whole output file, but each analysis within the output so I don't have to do it one-by-one. I have over 200 iterations at times that come out in a single output file, and renaming them individually within the output file is taking too much time.
Making an assumption that you are exporting the models to Excel (please clarify otherwise).
There is an undocumented command (OUTPUT COMMENT TEXT) that you can utilize here, though there is also a custom extension TEXT also designed to achieve the same but that would need to be explicitly downloaded via:
Utilities-->Extension Bundles-->Download And Install Extension Bundles--->TEXT
You can use OUTPUT COMMENT TEXT to assign a title/descriptive text just before the output of the GLM model (in the example below I have used FREQUENCIES as an example).
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
oms /select all /if commands=['output comment' 'frequencies'] subtypes=['comment' 'frequencies']
/destination format=xlsx outfile='C:\Temp\ExportOutput.xlsx' /tag='ExportOutput'.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - jobcat".
freq jobcat.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - gender".
freq gender.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - minority".
freq minority.
omsend tag=['ExportOutput'].
You could use TITLE command here also but it is limited to only 60 characters.
You would have to change the OMS tags appropriately if using TITLE or TEXT.
Edit:
Given the OP wants to actually add a title to the left hand pane in the output viewer, a solution for this is as follows (credit to Albert-Jan Roskam for the Python code):
First save the python file "editTitles.py" to a valid Python search path (for example (for me anyway): "C:\ProgramData\IBM\SPSS\Statistics\23\extensions")
#editTitles.py
import tempfile, os, sys
import SpssClient
def _titleToPane():
"""See titleToPane(). This function does the actual job"""
outputDoc = SpssClient.GetDesignatedOutputDoc()
outputItemList = outputDoc.GetOutputItems()
textFormat = SpssClient.DocExportFormat.SpssFormatText
filename = tempfile.mktemp() + ".txt"
for index in range(outputItemList.Size()):
outputItem = outputItemList.GetItemAt(index)
if outputItem.GetDescription() == u"Page Title":
outputItem.ExportToDocument(filename, textFormat)
with open(filename) as f:
outputItem.SetDescription(f.read().rstrip())
os.remove(filename)
return outputDoc
def titleToPane(spv=None):
"""Copy the contents of the TITLE command of the designated output document
to the left output viewer pane"""
try:
outputDoc = None
SpssClient.StartClient()
if spv:
SpssClient.OpenOutputDoc(spv)
outputDoc = _titleToPane()
if spv and outputDoc:
outputDoc.SaveAs(spv)
except:
print "Error filling TITLE in Output Viewer [%s]" % sys.exc_info()[1]
finally:
SpssClient.StopClient()
Re-start SPSS Statistics and run below as a test:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
title="##Model##: jobcat".
freq jobcat.
title="##Model##: gender".
freq gender.
title="##Model##: minority".
freq minority.
begin program.
import editTitles
editTitles.titleToPane()
end program.
The TITLE command will initially add a title to main output viewer (right hand side) but then the python code will transfer that text to the left hand pane output tree structure. As mentioned already, note TITLE is capped to 60 characters only, a warning will be triggered to highlight this also.
This editTitles.py approach is the closest you are going to get to include a descriptive title to identify each model. To replace the actual title "General Linear Model." with a custom title would require scripting knowledge and would involve a lot more code. This is a simpler alternative approach. Python integration required for this to work.
Also consider using:
SPLIT FILE SEPARATE BY <list of filter variables>.
This will automatically produce filter labels in the left hand pane.
This is easy to use for mutually exclusive filters but even if you have overlapping filters you can re-run multiple times (and have filters applied to get as close to your desired set of results).
For example:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
sort cases by jobcat minority.
split file separate by jobcat minority.
freq educ.
split file off.

Resources