wkhtmltopdf from list of files generates wrong TOC based on H tags

I am using python-pdfkit as follows to generate a PDF:
pdfkit.from_file(list_of_files, toc=toc, cover=cover, cover_first=True, options=default_options)
My problem is regarding the table of contents that is generated:
The table of contents is generated based on the H tags in the input documents.
If my html files are like:
index1.html
<h1>Title</h1>
...
[content]
index2.html
<h2>Subtitle</h2>
...
[content]
index3.html
<h3>Sub-subtitle</h3>
...
[content]
Since these are three separate files, the generated TOC is flat:
Title --------------------- Page x
Subtitle ------------------ Page y
Sub-subtitle -------------- Page z
Instead of:
Title --------------------- Page x
    Subtitle -------------- Page y
        Sub-subtitle ------ Page z
I have tried merging all the HTML files into one, but that causes a lot of problems with internal links: links that point to files have to be rewritten to point to HTML #IDs once everything lives in a single merged file.
Any idea how to tell wkhtmltopdf to respect the H tags hierarchy without resetting it per file?
Thanks!
Edit:
After some discussion in the wkhtmltopdf GitHub issues, the only easy way to achieve this is to pre-parse the HTML files and merge them all together.
See the following link for more details: https://github.com/wkhtmltopdf/wkhtmltopdf/issues/4310
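A minimal sketch of that pre-parsing step, using only the Python standard library. It is not taken from the linked issue: the function name merge_html, the "<file stem>-<id>" naming scheme, and the wrapper div are all assumptions made for illustration, and the regular expressions only handle double-quoted id and href attributes. Each file's ids are prefixed with the file name so they cannot collide, and links to other input files are rewritten to the matching in-document anchor.

```python
import re
from pathlib import PurePosixPath


def merge_html(pages):
    """Merge HTML fragments into one document with rewritten internal links.

    pages: list of (filename, html_fragment) tuples in reading order.
    Hypothetical helper: the id naming scheme is an illustrative choice.
    """
    names = {name for name, _ in pages}
    merged = []
    for name, html in pages:
        prefix = PurePosixPath(name).stem

        # Prefix every id with the file stem so ids from different files
        # cannot collide after merging.
        html = re.sub(
            r'(?<![\w-])id="([^"]+)"',
            lambda m, p=prefix: f'id="{p}-{m.group(1)}"',
            html,
        )

        def fix_link(m, p=prefix):
            target, _, frag = m.group(1).partition("#")
            if not target:
                # Same-file link such as "#section".
                return f'href="#{p}-{frag}"' if frag else m.group(0)
            if target in names:
                # Link to another input file, with or without a fragment.
                stem = PurePosixPath(target).stem
                return f'href="#{stem}-{frag}"' if frag else f'href="#{stem}"'
            # External link: leave untouched.
            return m.group(0)

        html = re.sub(r'(?<![\w-])href="([^"]+)"', fix_link, html)

        # Wrap each file so that a bare link to "file.html" has a target.
        merged.append(f'<div id="{prefix}">\n{html}\n</div>')
    return "\n".join(merged)
```

The merged string can then be written to a single file and passed to pdfkit.from_file in place of the list, so that wkhtmltopdf sees one heading hierarchy. For real-world HTML, a proper parser would be more robust than regular expressions.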

Related

Issue with xsl-fo :footnote when generating pdf/ua-1 document with fop.: "tagged PDF note id is missing"

I have an issue with <fo:footnote> when generating a PDF/UA-1 document with FOP.
The resulting PDF displays the footnote correctly on the page but does not pass PDF/UA validation. A severe error, “id is missing”, is raised on the PDF Note tag, so the document is not conformant. I'm using PAC3 for the conformance test.
In the example below I have extracted the basic <fo:footnote> element which has a unique id.
How can I generate the missing Id attribute in the pdf tagged Note element?
Here is the xsl-fo really simple footnote. Note that I used an id to reference the footnote.
some text...
<fo:footnote id="FNE0001">
<fo:inline font-size="6pt" baseline-shift="super">E0001</fo:inline>
<fo:footnote-body>
<fo:block>
<fo:inline>E0001</fo:inline><fo:inline > JO L 139 du 29.5.2002, p. 9.</fo:inline>
</fo:block>
</fo:footnote-body>
</fo:footnote> some text...
Apache FOP has been set to generate PDF/UA through the conf file as follows:
<renderers>
<renderer mime="application/pdf">
<!-- Before setting the pdf-ua-mode, we must insert metadata Title in FO declaration -->
<pdf-ua-mode>PDF/UA-1</pdf-ua-mode>
....
PAC3 is checking against the failure conditions in the Matterhorn Protocol; latest edition at https://www.pdfa.org/resource/the-matterhorn-protocol/.
PAC3 is probably reporting on failure condition 19-003, ID entry of the tag is not present.
fo:footnote-body can have an id property. You could try adding an ID to that. In the fo:inline for the footnote marker, you might also need to add an fo:basic-link that refers to that ID.
FWIW, your footnote does not cause that error when checking PDF/UA generated by AH Formatter, and I'm only guessing at what FOP would be doing differently. (PAC3 and I generally disagree when it complains about a footnote being a possibly incorrect use of a Note tag, but that's another story: PAC3 tries to automate checking the conditions that should be checked by a human, and it doesn't always get it right.)
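As a sketch of the suggestion above (an id on fo:footnote-body plus an fo:basic-link in the marker), the following Python snippet builds such a footnote with the standard library. The helper name build_footnote and the id value FNB0001 are hypothetical, and whether FOP then writes the ID entry on the Note tag is untested here.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_footnote(marker, text, body_id):
    """Build an fo:footnote whose marker links to an id on the footnote body.

    Hypothetical helper for illustration; body_id must be unique per footnote.
    """
    footnote = ET.Element(fo("footnote"))

    # Footnote marker in the running text, wrapped in a link to the body.
    inline = ET.SubElement(
        footnote, fo("inline"), {"font-size": "6pt", "baseline-shift": "super"}
    )
    link = ET.SubElement(inline, fo("basic-link"), {"internal-destination": body_id})
    link.text = marker

    # Footnote body carrying the id that the link refers to.
    body = ET.SubElement(footnote, fo("footnote-body"), {"id": body_id})
    block = ET.SubElement(body, fo("block"))
    block.text = f"{marker} {text}"

    return ET.tostring(footnote, encoding="unicode")


print(build_footnote("E0001", "JO L 139 du 29.5.2002, p. 9.", "FNB0001"))
```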

Is there any way for a section to link back to the table of contents in pandoc?

So I'm able to create a nice table of contents with pandoc's --toc option, but I was wondering if there is any way of linking the header, or a symbol near the heading, back to the table of contents. For example, when you create a footnote in pandoc, it links the superscript number to the note at the bottom of the page. At the end of the note, there is a little sign (↩︎) with a link for going back to the line where the footnote was referenced. I'd like to do the same with my table of contents for each header. I don't mind not using --toc and writing out the table of contents manually instead, but I'm not sure whether this particular feature is available. Any tips would be very helpful!
A Lua filter can be used to add a link back to the TOC.
local link_to_toc = pandoc.Link({pandoc.Str '↑'}, '#TOC')
function Header (h)
  h.content = h.content .. {pandoc.Space(), link_to_toc}
  return h
end
Save the above into a file and pass it to pandoc via the --lua-filter (or -L) command line option.
Linking to a specific line in the TOC is not possible though.

Generating table of contents for multi-file asciidoc book

I am a beginner at asciidoc. I have structured my project into modular files so it is easier to manage. And I am able to generate the pdf using asciidoctor. However, the toc does not include the list of files it gets through the include directive.
Here is the main file:
= Booktitle
Vinay <email>
:sectnums:
:toc:
:toclevels:
:leveloffset: 1
include::chapters/chapter_00.adoc
include::chapters/chapter_01.adoc
include::chapters/chapter_02.adoc
:leveloffset: 0
Index
======
And here is chapter_01.adoc:
= The First Chapter
This is the first chapter.
The table of contents only includes a link to the Index. What am I doing wrong?
The command I used is: asciidoctor-pdf book.adoc
Your include directives are missing a pair of square brackets. For a book that has a title page, you might want to set the doctype attribute to book. The toclevels attribute should be set to a number indicating the heading levels you want to list in your table of contents. If you leave it empty, the table of contents will be empty.
Tested with Asciidoctor PDF 1.5.3 using Asciidoctor 2.0.10, the following worked for me:
= Booktitle
Vinay <email>
:sectnums:
:toc:
:toclevels: 2
:doctype: book
:leveloffset: 1
include::chapters/chapter_00.adoc[]
include::chapters/chapter_01.adoc[]
include::chapters/chapter_02.adoc[]
:leveloffset: 0
[Index]
= Index
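Since a missing pair of brackets is easy to overlook in a book with many chapters, a small check can flag such lines. This is only a sketch: find_broken_includes is a hypothetical helper, not part of Asciidoctor, and it simply reports include:: lines that do not end with a closing bracket.

```python
def find_broken_includes(text):
    """Return (line_number, line) pairs for include directives without [...].

    Hypothetical helper: it only checks that each include:: line ends with "]".
    """
    broken = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("include::") and not stripped.endswith("]"):
            broken.append((lineno, stripped))
    return broken


book = """\
:leveloffset: 1
include::chapters/chapter_00.adoc
include::chapters/chapter_01.adoc[]
:leveloffset: 0
"""

for lineno, line in find_broken_includes(book):
    print(f"line {lineno}: missing [] after {line}")
```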

How to number code listings in asciidoc?

I'm writing a blog in asciidoc and would like the code listings to be automatically numbered, e.g.
Listing 1.3 The Hello World code
...
Listing 1.4 Some other code sample
Is there an attribute that I can set for the entire text so asciidoc would automatically number code listings?
It is not that clear in the asciidoctor user manual (Listing Blocks), but in order for asciidoctor to number the listings you need to:
add a caption to each block
set the listing-caption attribute at the beginning of your document
Example:
:listing-caption: Listing
.The Hello World code
----
//TODO: add hello world code
----
.Some other code sample
----
//TODO: Some other code sample
----
Controlling the way the listing blocks are numbered (1.1, 1.2 in your example) is not that easy. There is a discussion on GitHub for that.

Doxygen: How to hide certain page in treeview

I have an issue I could not resolve by myself. Help please.
I have (conditionally):
/** @mainpage A
@subpage B
*/
/** @page B
@subpage C
*/
/** @page C */
Doxygen builds a tree in which all the pages also appear at the root level:
+A/
|---B/
|------C
|---B <- WANT TO HIDE
|---C <- WANT TO HIDE
but I need only the top-level page (A here, with B and C nested under it) to be visible, i.e. the tree should be organized according to the @subpage tags.
I also tried to set visible to 'no' in DoxygenLayout.xml, but that hides all the pages; only 'files' and 'classes' are left.
Thanx in advance.
Your code generates the required tree view (only nested pages without separate entries at the root level) when the page/subpage files belong to most of the supported formats like *.c, *.cpp, *.dox etc. The only exception that I could find (in Doxygen 1.8.6) is the markdown format (*.md or *.markdown), for which separate root level entries are generated as well.
Until markdown files are treated like the other file formats, a workaround would be to use one of the other file formats (like *.dox) instead of *.md for the pages/subpages. Currently, the markdown format can be used, without generating root level entries, only for the mainpage.
