Separate YAML and plain text on the same document

While building a blog with Django, I realized that it would be extremely practical to store the text of an article and all the related information (title, author, etc.) together in a human-readable file format, and then load those files into the database using a simple script.
That said, YAML caught my attention for its readability and ease of use. The only downside of the YAML syntax is the indentation:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
text: |
  This is the text of the article. I can write whatever I want
  but I need to be careful with the indentation...and this is a
  bit boring.
---
I believe that's not the best solution (especially if the files are going to be written by casual users). A format like this one would be much better:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
---
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
Is there any solution? Preferably using Python.
Suggestions for other file formats are welcome as well!

Unfortunately this is not possible. One might think it could work by using | for a single scalar in a separate document:
import ruamel.yaml
yaml_str = """\
title: Title of the article
author: Somebody
---
|
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
for d in ruamel.yaml.load_all(yaml_str):
    print(d)
    print('-----')
but it doesn't, because | is the literal block scalar indicator and the content that follows it has to be indented. Although an indentation of 0 (zero) could easily be made to work at the top level, ruamel.yaml (and PyYAML) don't allow this.
It is, however, easy to parse this yourself. This has the advantage over using the front matter package that you can use YAML 1.2 and are not restricted to YAML 1.1, which is what you get because front matter uses PyYAML. Also note that I used the more appropriate end-of-document marker ... to separate the YAML from the Markdown:
import ruamel.yaml
combined_str = """\
title: Title of the article
author: Somebody
...
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
with open('test.yaml', 'w') as fp:
    fp.write(combined_str)
data = None
lines = []
yaml_str = ""
with open('test.yaml') as fp:
    for line in fp:
        if data is not None:
            lines.append(line)
            continue
        if line == '...\n':
            data = ruamel.yaml.round_trip_load(yaml_str)
            continue
        yaml_str += line
print(data['author'])
print(lines[2])
which gives:
Somebody
I want...
(the round_trip_load allows dumping with preservation of comments, anchor names etc).
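The splitting step can also be kept separate from whichever YAML library does the loading. The following is a minimal sketch using only the standard library; the helper name split_front_matter and its marker parameter are made up for illustration and are not part of ruamel.yaml or any front matter package. The returned YAML part would then be handed to the YAML loader of your choice.

```python
def split_front_matter(text, marker="..."):
    """Split text into (yaml_part, body) at the first line equal to marker.

    If the marker never appears, the whole text is treated as body.
    """
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        # Compare without the line ending so "\n" and "\r\n" both work.
        if line.rstrip("\r\n") == marker:
            return "".join(lines[:index]), "".join(lines[index + 1:])
    return "", text


combined = (
    "title: Title of the article\n"
    "author: Somebody\n"
    "...\n"
    "Here is the text of the article, with **Markdown** or <html>...\n"
)
yaml_part, body = split_front_matter(combined)
print(yaml_part)
print(body)
```

Passing marker="---" instead would split on the document-start marker used in the question's proposed format, provided the file does not also begin with a --- line.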

I found Front Matter does exactly what I want to do.
There is also a python package.

Bullet list style for single parameter functions in RTD theme using autodoc and Sphinx?

I've noticed that when I use autodoc with the Read the Docs theme, functions with multiple arguments have them listed in a bullet-list style:
arg1
arg2
...
but if there is only one argument, the bullet-list style is not used, which seems a bit silly to me since it breaks the continuity of the design.
I've found how to remove the disc via CSS to make things more uniform, but I actually want to do the opposite and have the disc for single-argument functions.
At this point, I'm not sure this can be done with a CSS change, and I do not know how to do it.
I've also noticed the same thing in different docs.
Here is the rendered html:
Here are the 2 methods:
def add_attribute(self, name, index):
    """
    :param name: The name attached to the attribute.
    :param index: The position of the attribute within the list of attributes. """
    print("")

def delete_attribute(self, name):
    """
    :param name: The name of the attribute to delete."""
    print("")
Here is my .rst:
API
----------------
.. automodule:: my_module
   :members:
Here is the conf.py
extensions = [
    'sphinx_rtd_theme',
    'sphinx.ext.autodoc',
    'sphinx.ext.napoleon',
    'sphinx.ext.coverage',
    'sphinx.ext.autosummary',
]
templates_path = ['_templates']
language = 'python'
exclude_patterns = []
html_theme = "sphinx_rtd_theme"
html_static_path = ['_static']
autosummary_generate = True
Any idea?
Cheers!
After a lot of digging, I've found a partial workaround for this.
My solution involves manually editing the produced HTML files to insert the missing bullet points.
Required conf.py changes:
import os

# Register hook to run when build is complete
def setup(app):
    app.connect('build-finished', on_build_finished)

# Hook implementation
def on_build_finished(app, exception):
    add_single_param_bullets("_build/html/index.html")

# Function to actually add the bullet points by overwriting the given HTML file
def add_single_param_bullets(file_path):
    print('Add single parameter bullets in {:s}'.format(file_path))
    if not os.path.exists(file_path):
        print(' File not found, skipping...')
        return
    lines_enc = []
    with open(file_path, 'rb') as f:
        for l in f.readlines():
            # Check for html that indicates single parameter function
            if b'<dd class="field-odd"><p><strong>' in l:
                # Work out the encoding of this line (chardet is a third-party package)
                import chardet
                enc = chardet.detect(l)['encoding']
                # Decode html and get the parameter information that needs adding
                l_dec = l.decode(enc)
                l_insert = l_dec.replace('<dd class="field-odd">', '').replace('\r\n', '')
                # Add new encoded lines to output
                lines_enc.append('<dd class="field-odd"><ul class="simple">'.encode('utf-8'))
                lines_enc.append('<li>{:s}</li>'.format(l_insert).encode(enc))
                lines_enc.append('</ul>'.encode('utf-8'))
            else:
                lines_enc.append(l)
    # Overwrite the original file with the new changes
    with open(file_path, 'wb') as f:
        for l in lines_enc:
            f.write(l)
In my case, I only have single argument functions in index.html. However, you can register additional files in on_build_finished.
A few things to note:
This only edits the produced HTML files and doesn't solve the underlying problem. I dug through the source for a bit but couldn't find why the bullet points aren't added for single-parameter functions.
The problem is not limited to the RTD theme; it seems to occur with the basic theme as well, so I suspect it's a deeper problem with Sphinx rather than with the RTD theme.
The code above somewhat deals with different encodings in the original HTML.
This does not work on the RTD website: the HTML files are edited in place, and the RTD build outputs the HTML files to a different directory. This is quite annoying. A solution would be to somehow change the RTD build process, or tell RTD to use pre-built HTML sources rather than building its own, but I don't know how to do so.
After spending a few hours working all this out, I actually think it looks better without the bullet points...
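Registering additional files, as mentioned above, could be automated by walking the build output directory instead of naming each file. The following is a minimal sketch, not a tested fix: html_files_under and patch_all_html are hypothetical helper names, the patch function is passed in as an argument so the block stays self-contained, and it assumes the Sphinx application object exposes its output directory as app.outdir. Whether this is enough to make the workaround run on the RTD website has not been checked.

```python
from pathlib import Path


def html_files_under(outdir):
    """Return every .html file below outdir, in a stable order."""
    return sorted(Path(outdir).rglob("*.html"))


def patch_all_html(outdir, patch_file):
    """Call patch_file(path_as_str) for each HTML file and return the paths."""
    patched = []
    for path in html_files_under(outdir):
        patch_file(str(path))
        patched.append(path)
    return patched


# Possible wiring in conf.py (assumption: add_single_param_bullets is the
# function from the answer above and app.outdir is the HTML output folder):
#
# def on_build_finished(app, exception):
#     if exception is None:
#         patch_all_html(app.outdir, add_single_param_bullets)
```

Using the output directory reported by the build, rather than the hard-coded "_build/html" path, avoids depending on where the build happens to write its files.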

Generate structured documentation from commented code

How do I get Asciidoc(tor) to generate, e.g., a nice overall function description out of several code comments and some code, including the function signature, without butchering my code with tags?
AFAIK Asciidoc only supports external includes in its Asciidoc file via surrounding tags in the code like
# tag::mytag[]
<CODE TO INCLUDE HERE>
# end::mytag[]
which would be quite noisy around every describing comment within a single function body and around every function signature.
Maybe there is an exotic, less verbose way like marking the single line comments like #! and single line tags that tells Asciidoctor to read only a single line relative to these tags.
Consider this tiny example.
def uber_func(to_uber: str) -> str:
    """
    This is an overall description. Delivers some context.
    """
    # Trivial code here
    # To uber means <include code below>
    result = to_uber + " IS SOOO " + to_uber + "!!!"
    # Trivial code here
    # Function only returns upper case.
    return result.upper()
My naive Asciidoc approach to include all meaningful comments, the docstring, and the function signature from the code above would look awful. Plus, Asciidoc doesn't recognize and remove comment marks, so the resulting documentation might not be so pretty either.
Instead of this very ugly
# tag::uber_func[]
def uber_func(to_uber: str) -> str:
    """
    This is an overall description. Delivers some context.
    """
    # end::uber_func[]
    # Trivial code here
    # tag::uber_func[]
    # To uber means
    result = to_uber + " IS SOOO " + to_uber + "!!!"
    # end::uber_func[]
    # Trivial code here
    # tag::uber_func[]
    # Function only returns upper case.
    # end::uber_func[]
    return result.upper()
I would like to use something like (pseudo):
def uber_func(to_uber: str) -> str:
    # tag::uber_func[readline:-1,ignore-comment-marks,doc-comment:#!]
    #! This is an overall description. Delivers some context.
    # Trivial code here
    #! To uber means
    # tag::uber_func[readline:+1]
    result = to_uber + " IS SOOO " + to_uber + "!!!"
    # Trivial code here
    #! Function only returns upper case.
    return result.upper()
    # end::uber_func[]
I think the general issue is that Asciidoc is merely a text formatting tool, which means that if I want it to generate structured documentation mostly from my code, I would need to provide this structure in my code and in my .adoc file.
Documentation generators like Doxygen, on the other hand, recognize this structure and the documenting comments automatically.
I value this feature very much: some generators allow you to write code and pretty documentation side by side, which lowers the overall effort a lot.
If Asciidoc doesn't allow me to do this in a reasonable way, I will have to look for something else.
I think you would have to write a scraper that puts the comments into a structure, then pull that structure into your AsciiDoc. This way the comments can be internally formatted with AsciiDoc markup, and you can output it in Asciidoctor-generated documents, but you won't need Asciidoctor to read the source files directly.
I would try a system of using one # for non-publishing comments and ## for ones you wish to publish, or vice versa, or append a # to the ones that are for docs publishing. As well as those denoted by the """ notation. Then your scraper can read the block name (uber_func or whatever portion is important) and then scrape the keeper comments and all the literal code, arranging them all in a file. The below file has seen most comments tagged as text, non-keeper comments dropped, and non-comment content as code:
# tag::function__uber_func[]
# tag::function__uber_func_form[]
uber_func(to_uber: str) -> str:
# end::function__uber_func_form[]
# tag::function__uber_func_desc[]
This is an overall description. Delivers some context.
# end::function__uber_func_desc[]
# tag::function__uber_func_body[]
# tag::function__uber_func_text[]
To uber means
# end::function__uber_func_text[]
# tag::function__uber_func_code[]
----
result = to_uber + " IS SOOO " + to_uber + "!!!"
----
# end::function__uber_func_code[]
# tag::function__uber_func_text[]
Function only returns upper case.
# end::function__uber_func_text[]
# tag::function__uber_func_code[]
----
return result.upper()
----
# end::function__uber_func_code[]
# end::function__uber_func[]
I know this looks hideous, but it is super useful to an AsciiDoc template. For instance, use just:
uber_func::
include::includes/api-stuff.adoc[tags="function__uber_func_form"]
+
include::includes/api-stuff.adoc[tags="function__uber_func_desc"]
+
include::includes/api-stuff.adoc[tags="function__uber_func_body"]
This would be even better if you parse it to a data format (like JSON or YAML) and then press it into AsciiDoc template dynamically. But you could maintain something like the above if it was not too massive. At a certain size (20+ such records?) you need an intermediary datasource (an ephemeral data file produced by the scraping), and at a certain larger scale (> 100 code blocks/endpoints?), you likely need a system that specializes in API documentation, such as Doxygen, et al.
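The scraping step described above could start out very small. The following is a minimal sketch under stated assumptions: it follows the "## for publishing, # for non-publishing" convention suggested in this answer, scrape_function is a hypothetical helper name, docstrings are not treated specially, and the JSON output stands in for the intermediary data file mentioned above.

```python
import json


def scrape_function(source):
    """Turn function source into a list of publishable records.

    Lines starting with '##' become text records, plain '#' comments are
    dropped, and every other non-empty line is kept as code.
    """
    records = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            records.append({"kind": "text", "value": line[2:].strip()})
        elif line.startswith("#"):
            continue  # non-publishing comment
        else:
            records.append({"kind": "code", "value": line})
    return records


SAMPLE = '''\
def uber_func(to_uber: str) -> str:
    # Trivial code here
    ## To uber means
    result = to_uber + " IS SOOO " + to_uber + "!!!"
    ## Function only returns upper case.
    return result.upper()
'''

records = scrape_function(SAMPLE)
print(json.dumps(records, indent=2))
```

From records like these, a template step could emit the tagged AsciiDoc file shown earlier, or the records could be fed to a templating engine directly.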

How to add link to source code in Sphinx

class torch.FloatStorage[source]
byte()
Casts this storage to byte type
char()
Casts this storage to char type
I'm trying to get some documentation done. I have managed to get the format like the one shown above, but I'm not sure how to add that [source] link at the end of the signature.
The link takes the reader to the file which contains the code, but I'm not sure how to do it.
This is achieved thanks to one of the built-in Sphinx extensions.
The one you are looking for is sphinx.ext.viewcode. To enable it, add the string 'sphinx.ext.viewcode' to the extensions list in your conf.py file.
In summary, you should see something like this in conf.py:
extensions = [
    # other extensions that you might already use
    # ...
    'sphinx.ext.viewcode',
]
I'd recommend looking at the linkcode extension too. It allows you to build a full HTTP link to the code on GitHub or a similar host. This is sometimes a better option than including the code within the documentation itself (e.g. the code may have stronger permissions on it than the docs themselves).
You write a little helper function in your conf.py file, and it does the rest.
What I really like about linkcode is that it creates links for enums, enum values, and data elements, which I could not get to be linked with viewcode.
I extended the link building code to use #:~:text= to cause the linked-to page to scroll to the text. Not perfect, as it will only scroll to the first instance, which may not always be correct, but likely 80~90% of the time it will be.
from urllib.parse import quote

def linkcode_resolve(domain, info):
    # print(f"domain={domain}, info={info}")
    if domain != 'py':
        return None
    if not info['module']:
        return None
    filename = quote(info['module'].replace('.', '/'))
    if not filename.startswith("tests"):
        filename = "src/" + filename
    if "fullname" in info:
        anchor = info["fullname"]
        anchor = "#:~:text=" + quote(anchor.split(".")[-1])
    else:
        anchor = ""
    # github
    result = "https://<github>/<user>/<repo>/blob/master/%s.py%s" % (filename, anchor)
    # print(result)
    return result
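To see what such a resolver produces, the URL-building logic can be tried outside Sphinx. This is only an illustrative sketch: build_source_url is a hypothetical helper that mirrors the steps above, and the base URL, module name, and object name are made-up placeholders rather than a real repository.

```python
from urllib.parse import quote


def build_source_url(base, module, fullname=None):
    """Mirror the resolver's logic: module path, optional text-fragment anchor."""
    filename = quote(module.replace(".", "/"))
    if not filename.startswith("tests"):
        filename = "src/" + filename
    anchor = ""
    if fullname:
        # Scroll to the first occurrence of the last name component.
        anchor = "#:~:text=" + quote(fullname.split(".")[-1])
    return "%s/%s.py%s" % (base, filename, anchor)


BASE = "https://example.invalid/user/repo/blob/master"  # placeholder host
url = build_source_url(BASE, "pkg.storage", "FloatStorage.byte")
print(url)
```

Because the anchor is a text fragment, the browser scrolls to the first match of the name on the page, which is why the result is only correct most of the time.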

Sphinx: No content update in html content from docstring

I have been working on a project to interface with Senna, a tool used in NLP processing, using Python. For easy generation of documentation I followed the reStructuredText documentation style, which is pretty easy.
On calling make html, a warning like the following is sometimes shown (and sometimes there is no warning at all):
docstring of pntl.tools.Annotator.test:2: WARNING: Field list ends without a blank line; unexpected unindent
One more thing: what is the meaning of the number 2 displayed in the warning?
def test(senna_path="/media/jawahar/jon/ubuntu/senna", sent="", dep_model="", batch=False,
         jar_path="/media/jawahar/jon/ubuntu/practNLPTools-lite/pntl"):
    """please replace the path of yours environment(accouding to OS path)
    :parama str senna_path: path for senna location
    :parama str dep_model: stanford dependency parser model location
    :parama str or list sent: the sentense to process with Senna
    :parama bool batch: makeing as batch process with one or more sentense passing
    :parama str jar_path: location of stanford-parser.jar file
    """
An image of the built result is attached to show the error in the HTML content. For a detailed view of my project, follow this link.
The error indicates that you have incorrect syntax, specifically no blank lines around the description and the field list, and indentation is not correct. White space matters.
Spelling matters, too. You probably meant :param blah blah: thing not :parama blah blah: thing:
See Info field lists for more information.
Edit
The following example should fix the issue. Note the correct spelling of "param", and the necessary line break separating the parameter list from the description in the docstring. Additionally, to avoid PEP8 warnings in your code (reStructuredText does not really care in this case), you should wrap long lines as noted in the method definition. There is another new line wrapping in the parameter list so that Sphinx will render it correctly as well as avoid the PEP8 warning.
def test(senna_path="/media/jawahar/jon/ubuntu/senna", sent="", dep_model="",
         batch=False,
         jar_path="/media/jawahar/jon/ubuntu/practNLPTools-lite/pntl"):
    """
    please replace the path of yours environment(accouding to OS path)

    :param str senna_path: path for senna location
    :param str dep_model: stanford dependency parser model location
    :param str or list sent: the sentense to process with Senna
    :param bool batch: makeing as batch process with one or more sentense
        passing
    :param str jar_path: location of stanford-parser.jar file
    """
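Misspelled field names such as :parama are easy to miss by eye. The following is a minimal sketch of a check that could be run over docstrings before building; unknown_fields is a hypothetical helper, not part of Sphinx, and the KNOWN_FIELDS set is an assumption based on the field names listed in the Sphinx "Info field lists" documentation.

```python
import re

# Assumed set of field names recognised by Sphinx's Python domain.
KNOWN_FIELDS = {
    "param", "parameter", "arg", "argument", "key", "keyword", "type",
    "raises", "raise", "except", "exception", "var", "ivar", "cvar",
    "vartype", "returns", "return", "rtype", "meta",
}

# A field line looks like ":name ...:" at the start of the line.
FIELD_RE = re.compile(r"^\s*:([A-Za-z]+)(?=[\s:])")


def unknown_fields(docstring):
    """Return (line_number, field_name) pairs for unrecognised field names."""
    problems = []
    for lineno, line in enumerate(docstring.splitlines(), start=1):
        match = FIELD_RE.match(line)
        if match and match.group(1) not in KNOWN_FIELDS:
            problems.append((lineno, match.group(1)))
    return problems


doc = """Replace the paths with the ones for your environment.

:parama str senna_path: path for senna location
:param bool batch: run as a batch process
"""
problems = unknown_fields(doc)
print(problems)
```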

How to use YAML front and back matter in Ruby?

I have heard of the term "front matter" and "back matter" to refer to some YAML parsing at the beginning or end of a non-YAML file. However, I can't seem to find any examples/documentation of how to implement this. Maybe this isn't a standard YAML feature. How can I make use of this feature in my Ruby project?
FYI: The reason I want to do this is to be able to require some ruby files at the top, and assume the rest is YAML. I don't think this is normally allowed in a YAML file.
I just came across a nice example of something similar to what I am trying to do. It isn't necessarily an example of "front/back matter" but it might help someone in the future:
Using the __END__ keyword, you can stop Ruby from parsing the rest of the file. The rest of the file is made available through the DATA constant, which is actually a File object:
#!/usr/bin/env ruby
%w(yaml pp).each { |dep| require dep }
obj = YAML::load(DATA)
pp obj
__END__
---
-
  name: Adam
  age: 28
  admin: true
-
  name: Maggie
  age: 28
  admin: false
Source
