I need to build HTML from RST with sphinx-build. Currently I use this command:
os.system("sphinx-build -b singlehtml -T -D html_add_permalinks=None -D extensions='sphinx.ext.autodoc' -D master_doc='index' -C /my/doc /tmp/sphinx")
But the result is complicated HTML with CSS and JS. I need only a single HTML page that combines all the RST files, maybe even without a table of contents, or, if possible, with a table of contents that works without JS.
I searched the official documentation for such an option but did not find what I need.
Please help if you know how to do this.
To convert to pure HTML, you are better off using pandoc: https://pandoc.org/
Alternatively, you could post-process the Sphinx-generated HTML to remove all CSS and JavaScript.
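The post-processing route could look roughly like the sketch below. It uses only the standard library's html.parser and drops <script> elements, <style> elements, and <link rel="stylesheet"> tags. The names StripAssets and strip_assets are made up for this example, and it has not been tuned against real Sphinx output, so treat it as a starting point rather than a finished tool.

```python
from html.parser import HTMLParser


class StripAssets(HTMLParser):
    """Re-emit HTML while dropping scripts, styles, and stylesheet links."""

    def __init__(self):
        # Keep entities such as &amp; untouched so they can be re-emitted as-is.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    @staticmethod
    def _is_stylesheet_link(tag, attrs):
        rel = dict(attrs).get("rel") or ""
        return tag == "link" and "stylesheet" in rel.lower()

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif self._is_stylesheet_link(tag, attrs):
            return
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <link ... /> or <br/>.
        if tag in ("script", "style") or self._is_stylesheet_link(tag, attrs):
            return
        if self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if self.skip_depth == 0:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_assets(html_text):
    """Return html_text without JavaScript and CSS references."""
    parser = StripAssets()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Inline style="..." attributes and on* event handlers are left in place here; removing those would need extra handling in handle_starttag.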
I'm converting docx files using pandoc 1.16.0.2 and everything works great, except that right after each image the size attributes show up as text in the output:
![](./media/media/image4.png){width="3.266949912510936in"
height="2.141852580927384in"}
So it shows the image fine in the md but also the size tag as plain text right behind/after/below each image. The command I'm using is:
pandoc --extract-media ./media2 -s word.docx -t markdown -o exm_word2.md
I've read the manual as best I can but don't see any flags to control this. Also, most searches turn up people who want to keep the attributes and control them.
Any suggestions to kill the size attributes or is my markdown app (MarkdownPad2 - v-2.5.x) reading this md wrong?
Use -w gfm as an argument on the command line to omit the image dimensions.
You could write a filter to do this. You'll need to install panflute. Save this as remove_img_size.py:
    import panflute as pf

    def change_md_link(elem, doc):
        # Drop the size attributes from every image element.
        if isinstance(elem, pf.Image):
            elem.attributes.pop('width', None)
            elem.attributes.pop('height', None)
        return elem

    if __name__ == "__main__":
        pf.run_filter(change_md_link)
Then compile with
pandoc word.docx -F remove_img_size.py -o exm_word2.md
There are two ways to do this: either remove all image attributes with a Lua filter or choose an output format that doesn't support attributes on images.
Output format
The easiest (and most standard-compliant) method is to convert to commonmark. However, CommonMark allows raw HTML snippets, so pandoc tries to be helpful and creates an HTML <img> element for images with attributes. We can prevent that by disabling the raw_html format extension:
pandoc --to=commonmark-raw_html ...
If you intend to publish the document on GitHub, then GitHub Flavored Markdown (gfm) is a good choice.
pandoc --to=gfm-raw_html ...
For pandoc's Markdown, we have to also disable the link_attributes extension:
pandoc --to=markdown-raw_html-link_attributes ...
This last method is the only one that works with older (pre-2.0) pandoc versions; all other suggestions here require newer versions.
Lua filter
The filter is straightforward: it simply removes all attributes from all images.
    function Image (img)
      img.attr = pandoc.Attr{}
      return img
    end
To apply the filter, we need to save the above into a file no-img-attr.lua and pass that file to pandoc with
pandoc --lua-filter=no-img-attr.lua ...
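If re-running pandoc with different options is not practical, the attribute blocks could also be stripped from the generated Markdown afterwards. This is a different technique from the filters above: a plain regular-expression pass over the text, not a pandoc feature. The function name strip_image_attributes is made up for this sketch, and the pattern assumes alt text without nested brackets and does not skip code blocks.

```python
import re

# An inline image, ![alt](target), immediately followed by a {...} attribute
# block. [^}]* also matches newlines, so a block wrapped over two lines
# (as in the question) is removed as well.
IMAGE_WITH_ATTRS = re.compile(r"(!\[[^\]]*\]\([^)]*\))\{[^}]*\}")


def strip_image_attributes(markdown_text):
    """Remove the {width=... height=...} block that follows each image."""
    return IMAGE_WITH_ATTRS.sub(r"\1", markdown_text)
```

Applied to the two-line snippet from the question, this leaves only the ![](./media/media/image4.png) part.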
I'm manually converting a MS Word document to asciidoc format.
By doing so I ran into an issue that I can't work around yet.
There is an example where I want to show the reader what the syntax of a file link should look like.
So I used this as an example:
file:///<Path>/<to>/<Keytab>
Asciidoc now renders this pseudo link into an actual link and warns me about this while converting my asciidoc document into HTML and PDF.
Usually, I would simply use the [source] element to prevent the link rendering. But the file link is part of a table.
[options="header,footer",cols="15%,85%"]
|=======================
|parameter|usage
|keyTabLocation |file:///<Path>/<to>/<Keytab>
|=======================
Is there a way to prevent the rendering/conversion of the file link?
Okay, I found the solution. I had to escape the whole macro using a \ at the beginning.
So this did the trick:
[options="header,footer",cols="15%,85%"]
|=======================
|parameter|usage
|keyTabLocation |\file:///<Path>/<to>/<Keytab>
|=======================
I want to have a template variable pre-processed in a markdown doc.
I tried renaming the file to file.html.md.eco, but it just comes out as plain text, i.e. the markdown plugin doesn't seem to get applied.
The file just as html.md renders fine.
Do I need to add the plugins to docpad.coffee to make sure they're applied when using multiple passes?
The FAQ states how to use multiple processors:
http://docpad.org/docs/faq
... Alternatively, we can get pretty inventive and do something like this: .html.md.eco which means process this with Eco, then Markdown and finally render it as HTML.
I only started looking into pandoc a few hours ago, so there's still a lot I don't know. I know that if I modify a generated docx and add a footer and page numbers, I can use that as a template, but I'm wondering if it's possible to use pandoc without a template and generate a footer and page numbers?
I was thinking this would be possible
pandoc <args> --page_numbers --footer="Created by John Smith"
Or is that only doable with a template?
There is currently no default template for the docx format, so you cannot pass a footer file on the command line.
You can use the reference-docx option to provide a footer, but you cannot override the footer variable:
pandoc -f markdown --reference-docx=template.docx -t docx input.md -o output.docx
Edit: Adding page numbers to the template (from Word) does work.
I am trying to convert from HTML to PDF with Pandoc. The output is pretty nice, but with the command pandoc index.html -o output.pdf I lose all my internal links (from the table of contents to chapters, from text to footnotes, etc.).
In my HTML this is the outgoing link
<p class="calibre18"><span class="calibre8">CHAPTER ONE</span><br class="calibre19"></br>The Ever Expanding Domain of Computation</p>
which then lands here
Chapter 1 makes the case that because of...
and here
<p class="calibre18"><span class="calibre8">CHAPTER ONE</span><br class="calibre19"></br>The Ever Expanding Domain of Computation</p>...
Is there any way to keep all the links also in the output?
The Pandoc User's Guide section on Internal Links says
Internal links are currently supported for HTML formats (including HTML slide shows and EPUB), LaTeX, and ConTeXt.
This suggests that internal links aren't currently supported for PDF output, even though the PDF output is generated via LaTeX.
Internal links should work straightforwardly in PDF. However, for printing purposes, the default is not to color them. Have you tried clicking on the text that should be a link?
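To make such links visible, pandoc's LaTeX-based PDF output accepts the colorlinks and linkcolor template variables. A small sketch of how the command could be assembled from Python follows; the helper name pdf_command is made up here, the file names are the ones from the question, and the color is an arbitrary choice.

```python
def pdf_command(source, output, link_color="blue"):
    """Build a pandoc argument list that colors internal links in the PDF."""
    return [
        "pandoc", source,
        "-o", output,
        "-V", "colorlinks=true",          # color link text instead of leaving it plain
        "-V", f"linkcolor={link_color}",  # color used for internal links
    ]
```

The resulting list can be passed to subprocess.run, for example subprocess.run(pdf_command("index.html", "output.pdf"), check=True), assuming pandoc and a LaTeX engine are installed.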