Remove Title from Table of Contents in Pandoc epub output - pandoc

I have an html document which I am converting to epub using pandoc as follows:
pandoc -f html -t epub --css pandoc.css --standalone --toc -o book.epub book.html
The generated epub book has a title page added, and this becomes the first item in the table of contents. For example, if the html is as follows:
<title>My Book</title>
<h1>Chapter 1</h1>
<p>One</p>
<h1>Chapter 2</h1>
<p>Two</p>
<h1>Chapter 3</h1>
<p>Three</p>
Then the contents will be:
Contents
1. My Book
2. Chapter 1
3. Chapter 2
4. Chapter 3
I have been able to remove the title page from the document by modifying the default.epub3 template to not include this section - however it creates a blank page and generates an entry for it in the TOC.
Are there any ways to prevent the title being added to the TOC?

Related

How to start page numbering from 2 rather than 1

I cannot figure out how to start page numbering from 2 rather than 1 (i.e. 2, 3, 4, ...) in Pandoc when converting to PDF.
Pandoc relies on LaTeX for PDF generation, and you can write inline/raw TeX. So try inserting the following at the beginning of your document:
\setcounter{page}{2}
Pandoc produces pdf through latex. You need to add \setcounter{page}{2} to your file. You could also create an option that allows you to set the starting page number in your yaml header.
Edit ~/.pandoc/templates/default.latex (or create it: pandoc -D latex > ~/.pandoc/templates/default.latex)
Add the following lines in the header:
$if(start-page)$
\setcounter{page}{$start-page$}
$endif$
Add the following to your document YAML header:
---
start-page: 2
---
Compile with the usual options, e.g. pandoc mydoc.md -o mydoc.pdf
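The template edit described above can also be scripted. This is a minimal sketch, not part of pandoc: add_start_page is a hypothetical helper, and it assumes the template has \begin{document} at the start of a line. It prints the template with the start-page conditional inserted just before that line, so the snippet ends up in the preamble.

```shell
# add_start_page TEMPLATE: print TEMPLATE with the start-page conditional
# inserted once, immediately before the first line starting with \begin{document}.
# Hypothetical helper for illustration; not a pandoc feature.
add_start_page() {
  awk '
    index($0, "\\begin{document}") == 1 && !done {
      print "$if(start-page)$"
      print "\\setcounter{page}{$start-page$}"
      print "$endif$"
      done = 1
    }
    { print }
  ' "$1"
}

# Usage (requires pandoc; file names are examples):
#   mkdir -p ~/.pandoc/templates
#   pandoc -D latex > default.latex.orig
#   add_start_page default.latex.orig > ~/.pandoc/templates/default.latex
```

Documents without start-page in their YAML header are unaffected, because the $if(start-page)$ guard leaves the counter untouched.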

Pandoc: use variables in custom latex preamble

I have the file test.md which contains:
---
footertext: some text for the footer
headertext: this is in the header
---
here is the text body.
And the file format.tex which contains:
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhead[L]{$headertext$}
\fancyfoot[L]{$footertext$}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
\setlength{\headsep}{0.25in}
I run the command:
pandoc -H format.tex test.md -o test.pdf
You can see what I want to do. I am trying to get the text "this is in the header" to show up in the header, but it does not; only the string "headertext" appears (same problem for the footer).
What am I doing wrong?
Edit: OK, I think I understand. Apparently variables are only available in templates, not in included begin or end code blocks (like I am using), or in the md itself. So new question: Why is this? It is unintuitive, inconvenient, and poorly documented.
You can easily modify a pandoc template. Access the default template with
pandoc -D latex > new_template.latex
Paste the content of your format.tex in the preamble. If you want to use this template for more than one document, use $if$ to check that each variable exists before using it:
\usepackage{fancyhdr}
\pagestyle{fancy}
$if(headertext)$\fancyhead[L]{$headertext$}$endif$
$if(footertext)$\fancyfoot[L]{$footertext$}$endif$
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
\setlength{\headsep}{0.25in}
Then compile with:
pandoc test.md -o test.pdf --template=new_template.latex
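If you would rather keep using -H, the placeholders have to be filled in before pandoc reads the file, because pandoc includes -H files verbatim. The following is a rough sketch of that alternative, not the template approach above: fill_var is a hypothetical helper, the values are passed on the command line rather than read from the YAML header, and they are inserted as raw LaTeX without escaping.

```shell
# fill_var FILE NAME VALUE: print FILE with every literal $NAME$ replaced by VALUE.
# Hypothetical helper, not part of pandoc; VALUE is inserted as raw LaTeX.
fill_var() {
  KEY="\$$2\$" VAL="$3" awk '
    BEGIN { key = ENVIRON["KEY"]; val = ENVIRON["VAL"] }
    {
      line = $0; out = ""
      # Literal (non-regex) search, so $ [ ] { } in the key need no escaping.
      while ((i = index(line, key)) > 0) {
        out = out substr(line, 1, i - 1) val
        line = substr(line, i + length(key))
      }
      print out line
    }
  ' "$1"
}

# Usage (requires pandoc and a LaTeX engine; tmp.tex and filled.tex are example names):
#   fill_var format.tex headertext 'this is in the header' > tmp.tex
#   fill_var tmp.tex footertext 'some text for the footer' > filled.tex
#   pandoc -H filled.tex test.md -o test.pdf
```

The drawback is that the header and footer text now live outside test.md, which is why modifying the template is usually the cleaner route.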

Header and footer in YAML metablock for rmarkdown and pandoc

Is it possible to specify a PDF header and/or footer in the YAML metablock? Something you could set to appear on every page. This is in rmarkdown and will then be rendered to a PDF (using pandoc and knitr). I have not been able to find anything (header: and footer: sadly did not work!)
---
title: testpdf | test
footer: "test footer"
theme: "zenburn"
date: "`r Sys.Date()`"
output: "pdf_document"
fig_width: 12
fig_height: 6
fig_caption: true
---
There is no native pandoc-markdown support of headers and footers. However, since pandoc generates pdf documents through latex, you can tweak your latex template to accommodate those.
Create a new template from the default one:
pandoc -D latex > footer_template.latex
Add the following lines in the LaTeX preamble (this is a very basic example that should produce a centered footer; see the fancyhdr user manual if you need something more advanced):
$if(footer)$
\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyfoot[C]{$footer$}
$endif$
Finally, add this to your document YAML header:
output:
  pdf_document:
    template: footer_template.latex
Your template should either be in ~/.pandoc/templates or in the same directory as test.rmd for this to work.

Render .adoc file in HTML template using asciidoctor

My goal is to use asciidoctor to render an html file, including an html template, and an .adoc file that can be easily edited by a non-technical worker. I can currently get the html template to render, but am not sure how I can wire up an adoc file to place text inside of specific tags, i.e. inside of the template divs/paragraphs.
Currently running this command from terminal:
rm assets/templates/about/digitization.html && asciidoctor -a stylesheet! -T \
assets/templates/asciidoc/about/templates/ -o assets/templates/about/digitization.html \
assets/templates/asciidoc/about/digitization.adoc
With this command, currently anything inside of digitization.adoc is not showing up (nor do I understand how to get text to render within the correct places in the html template).

How to extract links behind a text tag of web page (using either curl,wget or userscript)

I'm trying to extract the href links behind a text tag.
Refer to the attachment. I want to save all links under the tag "PDF".
http://tinypic.com/r/2n9erdj/8 Sorry, I'm not allowed to upload pictures yet.
Specifically, the href details appear as arnumber=60940cc, as shown in the red circle.
Can someone suggest how to implement this? I intend to use either a userscript or bash commands.
The HTML element details relevant to a single PDF are shown below.
<a aria-label="Download or View the PDF: IEEE Transactions on Power Electronics publication information" href="/stampPDF/getPDF.jsp?tp=&arnumber=6094072"><img class="button" src="http://staticieeexplore.ieee.org/assets/img/iconPdf.png" alt="PDF file icon" title="Download or View the PDF">PDF</a>
The web page I'm testing is
http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=6088512
The objective is to filter the content named as "pdf" and its urls.
Try this. Tweak the sed part if you don't want "http://ieeexplore.ieee.org" prepended before the /stamp path, and so on.
wget "http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=6088512" -O file.html
grep -o "href.*stamp.*\"><" file.html |sed 's#"#"http://ieeexplore.ieee.org#;s#><##'
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6094070"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6094072"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6094110"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6088513"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5680978"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5985544"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5723758"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5716681"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5936741"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5934597"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5734858"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5756244"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5759746"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5958614"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5999721"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6021380"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5961632"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5951783"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5983448"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5934423"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5957306"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5898425"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5959991"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5776690"
href="http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5953525"
OR
grep -o "href.*stamp.*\"><" file.html |sed 's#href="#ieeexplore.ieee.org#;s#"><##'
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6094070
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6094072
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6094110
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6088513
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5680978
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5985544
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5723758
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5716681
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5936741
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5934597
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5734858
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5756244
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5759746
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5958614
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5999721
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6021380
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5961632
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5951783
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5983448
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5934423
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5957306
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5898425
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5959991
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5776690
ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5953525
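The grep pattern above is greedy, so it can swallow more than one link when several anchors share a line. A slightly tighter sketch matches only the href attribute itself. It assumes, as in the sample element from the question, that PDF links are site-relative paths containing "stamp" and ending in arnumber=<digits>. extract_pdf_links is a hypothetical helper name.

```shell
# extract_pdf_links: read HTML on stdin, print one absolute PDF URL per line.
# Assumes PDF hrefs contain "stamp" and end in arnumber=<digits>.
extract_pdf_links() {
  grep -o 'href="[^"]*stamp[^"]*arnumber=[0-9]*"' |
    sed -e 's#^href="##' \
        -e 's#"$##' \
        -e 's#^/#http://ieeexplore.ieee.org/#'
}

# Usage (needs network access):
#   wget -q -O - "http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=6088512" | extract_pdf_links
```

Because [^"]* cannot cross a closing quote, each match stops at the end of its own attribute, and other attributes such as aria-label or src are ignored.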
