Combining yaml metadata (header-includes) with Pandoc - yaml

If I have these files:
text.md:
---
header-includes:
- \usepackage{pgf-pie}
---
\begin{tikzpicture}
\pie{50/, 50/}
\end{tikzpicture}
settings.yaml:
variables:
header-includes:
- \pagecolor{black}
and I compile them with pandoc with the command:
pandoc text.md -d settings -o text.pdf
...the header-includes value in the defaults file settings.yaml will overwrite the metadata block in text.md, thus failing to compile.
Is there a way to get pandoc to combine the two header-includes lists instead?

Combining these header-includes lists is not possible.
There are two reasons for this. First, values from the defaults file always take precedence. Second, if a name is used both as a variable and in the metadata, the variable is used.
For additional info and discussions of this topic, see these pandoc GitHub issues:
Command-line options --css and --include-in-header override corresponding metadata fields instead of accumulating
Allow defaults to be folded into YAML metadata
A possible workaround would be to use a "new style" custom writer, as these provide write access to both metadata and variables:
function Writer (doc, opts)
  local includes_tmpl = pandoc.template.compile('$header-includes$')
  local vars = {['header-includes'] = opts.variables['header-includes'] or ''}
  -- Write header-includes, once with variables, once without (thus
  -- allowing metadata values to be used instead)
  opts.variables['header-includes'] =
    pandoc.write(doc, 'latex', {template=includes_tmpl, variables=vars}) ..
    '\n' ..
    pandoc.write(doc, 'latex', {template=includes_tmpl})
  return pandoc.write(doc, 'latex', opts)
end
Note, however, that this currently requires the development version, so you'd need to use a nightly build. You'll also need to explicitly specify the template and PDF engine, e.g., --template=default.latex --pdf-engine=xelatex.
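If a nightly build is not an option, another way around the override is to merge the two lists yourself before pandoc runs. The Python sketch below is only an illustration and not something taken from the pandoc documentation. It assumes the front matter uses the simple one-item-per-line list form shown in text.md, and the function name merged_header_includes is made up for this example.

```python
def merged_header_includes(markdown_text, extra_includes):
    """Collect '- item' entries under header-includes in the front matter
    and append extra_includes, returning one LaTeX snippet."""
    lines = markdown_text.splitlines()
    items = []
    # Front matter must start on the first line with '---'.
    if lines and lines[0].strip() == "---":
        in_list = False
        for line in lines[1:]:
            stripped = line.strip()
            if stripped in ("---", "..."):      # end of the metadata block
                break
            if stripped == "header-includes:":
                in_list = True
            elif in_list and stripped.startswith("- "):
                items.append(stripped[2:].strip())
            else:
                in_list = False                 # any other line ends the list
    return "\n".join(items + list(extra_includes)) + "\n"


doc = "---\nheader-includes:\n- \\usepackage{pgf-pie}\n---\n\\begin{tikzpicture}\n"
print(merged_header_includes(doc, ["\\pagecolor{black}"]))
```

The returned text could be written to a file and passed with --include-in-header, with header-includes removed from settings.yaml. Since that option overrides the metadata field (see the first issue above), the merged file has to contain both lists.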

Related

Remove \hypertarget from pandoc LaTeX output

I am using pypandoc to convert a markdown file to LaTeX. My markdown file has a header, for example:
# Header Text #
When pypandoc renders the file as a .tex file, this appears as:
\hypertarget{header-text}{%
\section{Header Text}\label{header-text}}
While this is a nice feature to make it easy to link back to section headers, I don't necessarily want that and would prefer in this case for pypandoc to just generate:
\section{Header Text}
Is there a pandoc setting, or a pypandoc setting, that can be used to turn off the \hypertarget{} feature? I have reviewed the documentation for pandoc and didn't see it anywhere.
I had the same need, and I am using the -auto_identifiers switch,
pandoc -r markdown-auto_identifiers -w latex test.md -o test.tex
That will remove both
\hypertarget{header-text}{%
and
\label{header-text}}
leaving only
\section{Header Text}
like you requested.
There is no such switch. If you want different output, you'd either have to use a pandoc filter or, as @mb21 already noted, post-process the output.
Neither of these options is very good: using a filter to manually define header output will lose you all kinds of other pandoc features, like --top-level-division and support for unnumbered headers. Post-processing, on the other hand, tends to be brittle and difficult to get right.
Anyway, below is a panflute filter, which will replace headers with a custom command. Save it to a file and pass it to pypandoc via the filters option; this should give you the desired output.
from panflute import Header, Plain, RawInline, run_filter

# LaTeX sectioning commands, indexed by header level - 1
sectionTypes = ["section", "subsection", "subsubsection",
                "paragraph", "subparagraph"]

def reduce_header(elem, doc):
    if isinstance(elem, Header):
        # Wrap the header text in a bare sectioning command
        cmd = "\\%s{" % sectionTypes[elem.level - 1]
        inlines = [RawInline(cmd, "tex")]
        inlines.extend(elem.content)
        inlines.append(RawInline("}", "tex"))
        return Plain(*inlines)

if __name__ == "__main__":
    run_filter(reduce_header)
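For comparison, the post-processing route mentioned above could look roughly like this. It is a sketch that assumes the wrapper has exactly the two-line shape shown in the question; the pattern and the function name strip_hypertargets are my own, and anything pandoc emits in a different shape will pass through untouched, which is the brittleness already mentioned.

```python
import re

# Matches pandoc's wrapper around a heading:
#   \hypertarget{id}{%
#   \section{Title}\label{id}}
HYPERTARGET = re.compile(
    r"\\hypertarget\{[^}]*\}\{%\s*\n"
    r"(\\(?:sub)*(?:section|paragraph)\*?\{.*\})"
    r"\\label\{[^}]*\}\}"
)


def strip_hypertargets(latex):
    """Replace each wrapped heading with the bare sectioning command."""
    # A function is used as the replacement so that backslashes in the
    # matched LaTeX are not interpreted as regex escapes.
    return HYPERTARGET.sub(lambda match: match.group(1), latex)


sample = "\\hypertarget{header-text}{%\n\\section{Header Text}\\label{header-text}}"
print(strip_hypertargets(sample))  # prints \section{Header Text}
```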

Use fmpp command line parameter in template

I have some configuration templates which use FMPP to generate the
real runtime config files based upon info in a csv and properties
file (defined in config.fmpp).
I want to be able to configure a second cluster server for the same task using the same set of templates and config.fmpp information. However, there are slight differences needed in the generated runtime config and I can do this if I know which server instance I am on ("serverA" or "serverB") using a standard fmpp variable like ${myserver}.
But there must only be one set of templates and FMPP config files so I need to somehow get the value of "myserver" from the runtime
environment in each server.
Some of the options I might have are:
pass value of myserver on the command line tool invocation (best way); or
get it from an environment variable.
Does anyone have an example of the code to do any of these and any suggestions of the best approach? Online reference would be great.
fmpp -S /home/me/sample-project/src -Param myserver:serverA
Environment settings:
fmpp v0.9.14
freemarker v2.3.19
Use the -D command line option (see --help):
-D, --data=<TDD> Creates shared data that all templates will see. <TDD> is the
Textual Data Definition, e.g.:
-D "properties(style.properties), onLine:true"
Note that paths like "style.properties" are relative to the
data root directory.
Like:
fmpp -S /home/me/sample-project/src -D myserver:serverA
Note that there's a space after the -D. (It's not like the Java command-line syntax, but rather like the standard GNU command-line syntax.)
This -D has nothing to do with Java's -D option.
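For the environment-variable option from the question, a small wrapper script can read the value and hand it to -D. This is just a sketch: the variable name MYSERVER, the fallback serverA, and the helper fmpp_command are placeholders I picked, and it assumes the value is a plain word that needs no quoting inside the TDD expression.

```python
import os


def fmpp_command(source_root, env_var="MYSERVER", default="serverA"):
    """Build the fmpp argument list, taking myserver from the environment."""
    server = os.environ.get(env_var, default)
    # -D takes a TDD expression; a plain name:value pair is enough here.
    return ["fmpp", "-S", source_root, "-D", f"myserver:{server}"]


print(fmpp_command("/home/me/sample-project/src"))
```

The resulting list can be passed to subprocess.run(..., check=True) on each server, so the templates and config.fmpp stay identical and only the environment differs.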
The documentation shows onLine:true, but such Boolean values are legacy and no longer accepted. Use online:yes to parse Boolean values.
For example:
fmpp \
-S /path/ \
--verbose \
-D "online:yes"
Then, within the template:
<p>
online: ${online}
</p>
Will result in:
online: yes
The --verbose command-line parameter is useful to show any errors when parsing the template.

Specifying metadata for input formats other than Markdown

Pandoc allows you to include metadata at the beginning of a Markdown document using a header like
---
title: The Song That Never Ends
subtitle: It Goes On and On My Friends
author: Abraham Lincoln
lang: en_US
---
Is there any way to convey this information to Pandoc when the input format is not Markdown? I’m specifically interested in HTML input. I tried calling Pandoc with --from=html+yaml_metadata_block, but this didn’t seem to change the behavior at all—the YAML block is just interpreted as HTML.
(It is possible to include some metadata in the “percent format” shown in the “pandoc_title_block” section of the manual, but there doesn’t seem to be a way to give a separate title and subtitle with that syntax. It’s also possible to include the YAML header before the HTML and to force Pandoc to interpret the input as Markdown, but this seems hacky, and if you try to convert that to “real” Markdown then the output is full of HTML tags instead of Markdown formatting characters.)
You can use the --metadata (short -M) or --metadata-file options to supply metadata on the command line, for example:
pandoc -M title="The Song That Never Ends"
A simple solution would be to use Lua filters to augment the metadata read from the HTML file as described in the Lua filters doc. Below is an updated version:
-- file: additional-metadata.lua
function read_file_as_markdown_yaml (filename)
  -- read metadata file into string
  local metafile = io.open(filename, 'r')
  local content = metafile:read('*a')
  metafile:close()
  -- get metadata
  return pandoc.read(content, 'markdown').meta
end

function Meta (meta)
  -- read YAML file and add its content to the metadata
  local yaml_meta = read_file_as_markdown_yaml(meta.default_meta_file)
  for k, v in pairs(yaml_meta) do
    -- use YAML metadata as fallback
    meta[k] = meta[k] or v
  end
  return meta
end
Use with
pandoc --lua-filter additional-metadata.lua \
  --metadata default_meta_file:YOUR-FILE-HERE.yaml \
  your-input-file.html
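If the conversion is driven from a script anyway, the --metadata-file option mentioned above can be fed from a generated file. As far as I know the manual describes that option as accepting YAML or JSON, so this sketch writes JSON using only the standard library. The helper name and the file names song.html and song.pdf are invented for the example.

```python
import json
import os
import tempfile


def pandoc_args_with_metadata(html_path, metadata, output_path):
    """Write metadata to a JSON file and build the pandoc argument list."""
    fd, meta_path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, ensure_ascii=False, indent=2)
    return ["pandoc", "--from=html", f"--metadata-file={meta_path}",
            html_path, "-o", output_path]


args = pandoc_args_with_metadata(
    "song.html",
    {"title": "The Song That Never Ends",
     "subtitle": "It Goes On and On My Friends",
     "author": "Abraham Lincoln",
     "lang": "en_US"},
    "song.pdf",
)
print(args)
```

This keeps title and subtitle as separate fields without touching the HTML input. The list is meant to be passed to subprocess.run, and the temporary file should be deleted afterwards.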

How can I specify pandoc's markdown extensions using a YAML block?

Background
Pandoc's markdown lets you specify extensions for how you would like your markdown to be handled:
Markdown syntax extensions can be individually enabled or disabled by appending +EXTENSION or -EXTENSION to the format name. So, for example, markdown_strict+footnotes+definition_lists is strict markdown with footnotes and definition lists enabled, and markdown-pipe_tables+hard_line_breaks is pandoc’s markdown without pipe tables and with hard line breaks.
My specific question
For a given pandoc conversion where, say, I use grid tables in my source:
pandoc myReport.md --from markdown+pipe_tables --to latex -o myReport.pdf
How can I write a pandoc YAML block to accomplish the same thing (specifying that my source contains grid tables?)
A generalized form of my question
How can I turn extensions on and off using pandoc YAML?
Stack Overflow Questions that I don't think completely answer my question
Can I set command line arguments using the YAML metadata - This one deals with how to specify output options, but I'm trying to tell pandoc about the structure of my input
What can I control with YAML header options in pandoc? - Answerers mention pandoc's templates, but neither the latex output template nor the markdown template indicate any sort of option for grid_tables. So, it's not clear to me from these answers how knowing about the templates will help me figure out how to structure my YAML.
There may also not be a way to do this
It's always possible that pandoc isn't designed to let you specify those extensions in the YAML. Although, I'm hoping it is.
You can use Markdown Variants to do this in an Rmarkdown document. Essentially, you enter your extensions into a variant option in the YAML header block at the start of your .Rmd file.
For example, to use grid tables, you have something like this in your YAML header block:
---
title: "Habits"
author: John Doe
date: March 22, 2005
output:
  md_document:
    variant: markdown+grid_tables
---
Then you can compile to a PDF directly in pandoc by typing in your command line something like:
pandoc yourfile.md -o yourfile.pdf
For more information on markdown variants in RStudio: http://rmarkdown.rstudio.com/markdown_document_format.html#markdown_variants
For more information on Pandoc extensions in markdown/Rmarkdown in RStudio:
http://rmarkdown.rstudio.com/authoring_pandoc_markdown.html#pandoc_markdown
You can specify pandoc markdown extensions in the YAML header using the md_extensions argument, which each output format accepts.
---
title: "Your title"
output:
  pdf_document:
    md_extensions: +grid_tables
---
This will activate the extension. See Rmarkdown Definitive Guide for details.
Outside the scope of Rmarkdown, you can use Pandocomatic to do it, or Paru for Ruby.
---
title: My first pandocomatic-converted document
pandocomatic_:
  pandoc:
    from: markdown+footnotes
    to: html
...
As Merchako noted, the accepted answer is specific to rmarkdown. In Atom, for instance, md_extensions: does not work.
A more general approach would be to put the extensions in the command line options. This example works fine:
---
title: "Word document with emojis"
author: me
date: June 9, 2021
output:
  word_document:
    pandoc_args: ["--standalone", "--from=markdown+emoji"]
---
For people stumbling across this in or after 2021, this can be done without Rmarkdown. You can specify a YAML "defaults" file, which basically includes anything you could want to configure.
In order to do what OP wanted, all you'd need to do is
from: markdown+pipe_tables
in the defaults file, then pass it when you compile.
You can also specify the input and output files, so you can end up with the very minimal command
pandoc --defaults=defaults.yaml
and have it handle the rest for you. See https://pandoc.org/MANUAL.html#extensions for more.
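To make that concrete, here is a rough Python sketch that composes the format string from a list of extensions and renders a minimal defaults file. The helper names format_spec and defaults_yaml are invented; the keys from, to, input-files and output-file are the defaults-file options as I understand them from the manual, so please check them against your pandoc version.

```python
def format_spec(base, enable=(), disable=()):
    """Compose a pandoc format string such as markdown+grid_tables-pipe_tables."""
    return (base
            + "".join(f"+{name}" for name in enable)
            + "".join(f"-{name}" for name in disable))


def defaults_yaml(source, target, reader, writer):
    """Render a minimal defaults file naming the input, output, and formats."""
    return (
        f"from: {reader}\n"
        f"to: {writer}\n"
        "input-files:\n"
        f"- {source}\n"
        f"output-file: {target}\n"
    )


reader = format_spec("markdown", enable=["grid_tables"], disable=["pipe_tables"])
print(defaults_yaml("myReport.md", "myReport.pdf", reader, "latex"))
```

Saving the printed text as defaults.yaml should be enough for the one-line pandoc --defaults=defaults.yaml call above.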

dblatex ignore --texstyle or -s command

I want to write an asciidoc document and convert it into a pdf document. However, I want to use a format style different than the default ones. To do so I convert the txt file to docbook using asciidoc and then try to convert the resulting docbook xml to a pdf file using dblatex.
The idea is to set a particular tex style for dblatex to obtain the desired pdf result. I've copied the existing docbook.sty style, as recommended here, to make a small style modification. The only change made in the ./docbook.sty file is from \setlength{\textwidth}{18cm} to \setlength{\textwidth}{12cm}. However, when I run the command
dblatex --texstyle=./docbook.sty test.txt
Or the command
dblatex -s ./docbook.sty test.txt
Both produce the same result: no style change at all. No matter what I modify in the ./docbook.sty file, the modifications are not applied to the output; I always get a pdf with the default formatting. Does anyone have an idea what the problem is?
Thanks in advance.
I would recommend:
Copy the Dblatex docbook.sty to a new filename in your working directory which is "obviously yours" (e.g., mydbstyle.sty).
Continue to supply a full or relative path argument to the --texstyle option (e.g., /path/to/mydbstyle.sty or ./mydbstyle.sty). Failing to do so requires that mydbstyle.sty be in a directory enumerated by the TEXINPUTS environment variable (which you likely have not explicitly set).
Within mydbstyle.sty, use the following directives to initialize your style:
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{mydbstyle}[2013/02/15 DocBook Style]
\RequirePackageWithOptions{docbook}
% ...
% your LaTeX commands here
Pass a DocBook 4.5 XML file as an argument to Dblatex (in your example you are passing test.txt which makes me uncertain whether you're passing an AsciiDoc source file).
dblatex --texstyle=./mydbstyle.sty mybook.xml
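Put together, the whole build is two steps: AsciiDoc to DocBook, then DocBook to PDF. The sketch below only assembles the two command lines. It assumes the classic asciidoc tool with its docbook backend (-b) and output-file (-o) options, and the helper name pdf_build_commands is mine.

```python
from pathlib import Path


def pdf_build_commands(asciidoc_source, texstyle):
    """Return the asciidoc and dblatex commands for a two-step PDF build."""
    xml_path = str(Path(asciidoc_source).with_suffix(".xml"))
    return [
        # Step 1: AsciiDoc source -> DocBook XML
        ["asciidoc", "-b", "docbook", "-o", xml_path, asciidoc_source],
        # Step 2: DocBook XML -> PDF, using the custom LaTeX style
        ["dblatex", f"--texstyle={texstyle}", xml_path],
    ]


for command in pdf_build_commands("test.txt", "./mydbstyle.sty"):
    print(" ".join(command))
```

Each list can be run in order with subprocess.run(command, check=True). The point is that dblatex receives the generated .xml file rather than the original .txt.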
