How can I prevent pandoc from inserting an <h1> element of my title in the content - pandoc

I have a book made of several ordered Markdown files. I am using Pandoc to convert those into an epub file, and things are mostly okay. I can embed the font I like and provide my own CSS, etc. The problem is that the output file contains an element that is not present in the Markdown (as a "#" header element). This element is then being picked up by the ToC function and inserted into the Table of Contents. I didn't ask for the element to be present, and I can't find an option to turn it off.
Here's how to reproduce, with a much simpler case than my actual one, but it's sufficient to demonstrate the problem. I have the following file structure:
- pandoctest/
  - src/
    - file1.md
    - file2.md
  - epub.yml
The contents are as follows:
file1.md:
Here is some text.
file2.md:
# Chapter one
The chapter goes here.
epub.yml:
---
title:
- type: main
  text: A Book
creator:
- role: author
  text: Some Dude
---
And the pandoc command I'm running is:
pandoc -o output.epub epub.yml --toc src/*
The end result is something like this:
Page 1: An appropriate title page using the title and author elements from epub.yml
Page 2: The table of contents page. At the top, the title from epub.yml. Beneath that are two ToC entries. The first is the title of the book and refers to the element I don't want present on the next page. The second is "Chapter One" which refers to the # Chapter One element from my Markdown (this is appropriate).
Page 3: First, the undesired element, which, in the raw XML looks like this:
<h1 class="unnumbered" data-number="">A Book</h1>
Then, "Here is some text", a paragraph that I did indeed tell it to put there.
Page 4: A correctly rendered "Chapter One" page.
The question here is how to get pandoc to not render the "unnumbered" header element that is not present in the Markdown. It screws up the Table of Contents and I never asked for it to be there.
For reference, here is the epub that is rendered from my little test here: https://www.dropbox.com/s/dj4jo08g7q4f9i2/output.epub?dl=0

Related

Include standard reST label reference in toctree [duplicate]

I have a .. toctree as part of a sphinx page, which includes relative links to other rst files in my package. How can I include a link to a subsection of a given page, rather than the full page itself?
I took a stab at
.. toctree::

   page#section
But that didn't work. Any help is great.
After much hackery, I've come to the following solution, but I should first state that my goal was to:
have the heading NOT appear in the content body
have the heading appear in the TOC
So basically linking from the TOC to an arbitrary but invisible part of a document.
I needed this in order to be able to link to methods in some source code documentation rendered with Sphinxcontrib PHPDomain - these methods generate section links of their own, but do not get added into the TOC by default.
Step 1:
At the top of your RST file which needs this linking functionality, add a new role as such:
.. role:: hidden
   :class: hidden
Step 2:
Somewhere in the content, use this role as such:
:hidden:`My Arbitrary Location`
"""""""""""""""""""""""""""""""
Step 3:
Add new CSS to the project (usually done by adding a CSS file to _static, or by defining a custom style sheet):
.rst-content .hidden {
    display: none;
}
nav .hidden {
    display: unset;
}
This forces the heading to be hidden in the content, but shown in the TOC.
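For Step 3, one way to register the style sheet is through the project's conf.py. The following is only a minimal sketch, assuming the CSS above is saved as _static/hidden.css (a hypothetical file name) and that the Sphinx version in use supports html_css_files (1.8 or later):

```python
# conf.py (excerpt)

# Directories whose contents are copied into the built HTML's _static folder.
html_static_path = ["_static"]

# Extra style sheets to load on every page, relative to html_static_path.
# "hidden.css" is an assumed name for the file holding the rules above.
html_css_files = ["hidden.css"]
```

Note that the .rst-content selector looks specific to the Read the Docs theme; with another theme the selector probably has to be adjusted.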
Then, reuse the role as necessary in other documents.
Note that if your goal is to link to arbitrary locations in the document and still have the headings show up in the content, just change the CSS to style the headings to your liking rather than hide them.
When creating the ToC, Sphinx includes all headings and subheadings of the referenced files within the configured tree depth. So you can simply not start the page with a heading, and instead insert the heading at the point you want the ToC to point to, e.g.:
.. _my-rst-file:

**You can use bold print here if you want. This will not appear in the ToC**

.. rubric:: Or the "rubric" directive

And here some more text, normal font weight.

Here comes the heading that will appear in the ToC
""""""""""""""""""""""""""""""""""""""""""""""""""

And so on...
You need to include the page reference in the ToC as usual.
So in the ToC, you have:
.. toctree::

   my_rst_file
In our example, the build result (HTML, PDF, whatever) will only have a reference to "Here comes the heading that will appear in the ToC" in the ToC.

Using fancyhdr in YAML metadata produces multiple page numbers with Pandoc

I'm using Pandoc to generate a PDF from markdown. When specifying header/footer information in the YAML metadata (as below), I continue to get a page number in the center of my footer (with the text of \fancyfoot[L] written overtop), in addition to the page number in footer on the right that I've specified with \fancyfoot[R].
How can I remove the default page number in the footer at center? If I use \pagenumbering{gobble} it just removes all page numbers, at center and on right.
---
title: Test Title
author: Author Name
header-includes:
- \usepackage{fancyhdr}
- \pagestyle{fancy}
- \fancyhead[L]{Author Name}
- \fancyhead[R]{Test Title}
- \fancyfoot[L]{Extra text here}
- \fancyfoot[R]{\thepage}
---
Currently using Pandoc 1.17.2 on OSX 10.11.6.
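This should work if you give the center field of the footer empty content. The default fancy page style of fancyhdr places the page number in the center of the footer, so that field has to be cleared explicitly. That is how it works in LaTeX, and it should carry over to pandoc when the line is added to header-includes: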
\fancyfoot[C]{}

How to go up a level once a sub-section has been added to the document?

I have a reStructuredText document with several hierarchical sections, such as:
Main title
##########
Text under the main title.
Sub-section
===========
Text under the sub-section.
This works great, I get the correct HTML formatting when I compile it using Sphinx.
My question is: how can I go up a hierarchy level so I can add more text after a few sub-sections?
For example:
Main title
##########
Text under the main title.
Sub-section
===========
Text under the sub-section.
In my CSS, sub-section is indented.
I want this paragraph to be rendered as part of the Main title section,
not the sub-section.
I'm basically looking for a way to go up a level in the hierarchy.
Is this possible?
Thanks!
It's not possible to go "up" the hierarchy without starting a new section at the level you desire. Document section structure doesn't work that way. Changes in section level must be preceded by corresponding section titles. Using indentation to signal a section-like structure should only be used in a limited local scope.
What you're describing isn't a subsection, it's a topic. Docutils has a directive for that:
.. topic:: title

   Topic text.
   May include multiple body elements (paragraphs, lists, etc.).
There is also a "sidebar" directive for a similar structure, but typically for a parallel topic that's off to the side. The "rubric" directive may also be of interest.
See http://docutils.sourceforge.net/docs/ref/rst/directives.html for details.

How to set the position of the table of contents when using wkhtmltopdf

I'm using wkhtmltopdf to generate pdf from html pages.
My question is: how do I set the position of the table of contents page? It seems to be generated automatically at the beginning, before the first page. In addition, how do I set the CSS of the table of contents?
There's an --xsl-style-sheet (file) parameter to wkhtmltopdf, detailed thusly in the extended command line --help (or -H):
A table of content can be added to the document by adding a toc object to
the command line. For example:
wkhtmltopdf toc http://doc.trolltech.com/4.6/qstring.html qstring.pdf
The table of content is generated based on the H tags in the input
documents. First a XML document is generated, then it is converted to
HTML using XSLT.
The generated XML document can be viewed by dumping it to a file using
the --dump-outline switch. For example:
wkhtmltopdf --dump-outline toc.xml http://doc.trolltech.com/4.6/qstring.html qstring.pdf
The XSLT document can be specified using the --xsl-style-sheet switch.
For example:
wkhtmltopdf toc --xsl-style-sheet my.xsl http://doc.trolltech.com/4.6/qstring.html qstring.pdf
The --dump-default-toc-xsl switch can be used to dump the default
XSLT style sheet to stdout. This is a good start for writing your
own style sheet
wkhtmltopdf --dump-default-toc-xsl
The XML document is in the namespace
http://code.google.com/p/wkhtmltopdf/outline
it has a root node called "outline" which contains a number of
"item" nodes. An item can contain any number of item. These are the
outline subsections to the section the item represents. A item node
has the following attributes:
- "title" the name of the section
- "page" the page number the section occurs on
- "link" a URL that links to the section.
- "backLink" the name of the anchor the the section will link back to.
The remaining TOC options only affect the default style sheet
so they will not work when specifying a custom style sheet.
So you define your own XSLT, possibly based on their default, and pass it in. No problemo.
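The workflow above can be sketched in Python. This is a rough illustration, not a tested recipe: it assumes wkhtmltopdf is on the PATH, and the helper names and file names are made up for the example. It also relies on wkhtmltopdf laying out objects in the order they appear on the command line, so the position of the toc object decides where the TOC pages end up.

```python
import subprocess
from typing import List

WKHTMLTOPDF = "wkhtmltopdf"  # assumed to be on PATH


def dump_default_xsl_cmd() -> List[str]:
    """Command that prints the default TOC style sheet to stdout."""
    return [WKHTMLTOPDF, "--dump-default-toc-xsl"]


def build_pdf_cmd(pages_before: List[str], pages_after: List[str],
                  xsl_path: str, output_pdf: str) -> List[str]:
    """Command that places the TOC between two groups of pages.

    Objects are emitted in command-line order, so putting "toc" after
    pages_before moves the TOC behind those pages.
    """
    return ([WKHTMLTOPDF] + pages_before
            + ["toc", "--xsl-style-sheet", xsl_path]
            + pages_after + [output_pdf])


def save_default_xsl(xsl_path: str) -> None:
    """Write the default style sheet to xsl_path as a starting point for edits."""
    result = subprocess.run(dump_default_xsl_cmd(), check=True,
                            capture_output=True, text=True)
    with open(xsl_path, "w", encoding="utf-8") as fh:
        fh.write(result.stdout)
```

A typical run would call save_default_xsl("my.xsl"), edit my.xsl by hand, and then pass build_pdf_cmd(["cover.html"], ["chapter1.html"], "my.xsl", "out.pdf") to subprocess.run.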
If you want, you can even build a customized TOC from an HTML file. For example, to build a TOC from the names of the HTML files that will go into the PDF (note that you need to know these names in advance), pass an HTML file such as user_toc.html. In this file you can put all your captions, CSS, etc., plus a placeholder for the file name. Server-side code then parses the file and fills the file name into the placeholder. The modified file can then be used as the TOC.
Example code in Perl:
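use strict;
use warnings;
my $TOCPage = "user_toc.html";
my $file1 = "Testfile1.html";
my $file2 = "Testfile2.html";
# Assumed values: the original snippet used these two variables without defining them
my $file_name_placeholder = "FILE_NAME_PLACEHOLDER";
my $pdf_manual_name = "manual.pdf";
# Open the user TOC for parsing and read it into a buffer
open(my $in, '<', $TOCPage) or die "Cannot read $TOCPage: $!";
my $toc_file_lines = do { local $/; <$in> };
close($in);
# Substitute the placeholder with the actual file name (\Q...\E matches it literally)
$toc_file_lines =~ s/\Q$file_name_placeholder\E/$file1/;
# Open the same file again and write back the buffer
open(my $out, '>', $TOCPage) or die "Cannot write $TOCPage: $!";
print {$out} $toc_file_lines;
close($out);
# Linux path for wkhtmltopdf installation
my $wkhtmltopdf_path = '/usr/bin/wkhtmltopdf-i386';
# Build the command on a single line; embedded newlines would split it in the shell
my $command = "$wkhtmltopdf_path --margin-top 20mm --margin-bottom 15mm "
    . "--margin-left 15mm --margin-right 15mm "
    . "$TOCPage $file1 $file2 $pdf_manual_name";
my $output = `$command 2>&1`;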
Moreover, you can add other information to the TOC, such as the chapter number or the total number of pages in each chapter.
Hope this helps.
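The same placeholder idea can be sketched in Python. This is only an illustration: the {{FILE_NAME}} marker, the render_toc helper, and the output file name are assumptions, not part of the original answer.

```python
from pathlib import Path

# Hypothetical marker; use whatever string your TOC template actually contains.
PLACEHOLDER = "{{FILE_NAME}}"


def render_toc(template_path: str, output_path: str, file_name: str) -> str:
    """Fill the placeholder in a TOC template and write the result to a new file."""
    template = Path(template_path).read_text(encoding="utf-8")
    rendered = template.replace(PLACEHOLDER, file_name)
    Path(output_path).write_text(rendered, encoding="utf-8")
    return rendered
```

Writing the result to a separate file, for example render_toc("user_toc.html", "toc_filled.html", "Testfile1.html"), keeps the template reusable. Overwriting the template in place removes the placeholder after the first run.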
