Citeproc and Pandoc Fignos conflicts with `@` in Pandoc? - pandoc

I am using the latest version of Pandoc to convert MD to LaTeX to PDF, with citeproc: true in the defaults file. Additionally, I am using pandoc-xnos to reference figures.
The problem appears to be their similar syntax. Nearly any @ seems to trigger citeproc, and pandoc-xnos requires @fig:id to reference figures. Everything generates, although citeproc emits a warning for each xnos reference ([WARNING] Citeproc: citation fig:id not found) and surrounds each figure reference with [] as it does links.
Has anyone found a way to make these two work together better? Ideally, citeproc would only act on [@cite] and xnos only on {@cite}, or citeproc would recognize that @fig: is not a typical citation, or the like, but after reading the documentation of both I cannot find an option or solution.

Adding citeproc: true to the defaults file runs citeproc as one of the first filters. You can control the filter order in the defaults file by removing the citeproc setting and defining the filter sequence like so:
# these filters run in the defined order
filters:
  - type: json
    path: pandoc-xnos
  - type: citeproc
Older versions of pandoc contain a small bug that requires adding an arbitrary path to the citeproc entry:
  - type: citeproc
    path: does not matter
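The same ordering can also be expressed on the command line, because pandoc applies filters and citeproc in the order in which they are given. Below is a minimal Python sketch of that idea. The helper name build_pandoc_command and the file names are hypothetical, and it assumes both pandoc and pandoc-xnos are installed and on the PATH.

```python
import shlex


def build_pandoc_command(source, output):
    """Return a pandoc command that runs pandoc-xnos before citeproc.

    The order of --filter and --citeproc on the command line is the
    order in which pandoc applies them.
    """
    return [
        "pandoc", source,
        "--filter", "pandoc-xnos",  # resolve figure references first
        "--citeproc",               # then process the remaining citations
        "-o", output,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    print(shlex.join(build_pandoc_command("paper.md", "paper.pdf")))
```

The command list can be passed to subprocess.run once the file names match your project.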

Related

Combining yaml metadata (header-includes) with Pandoc

If I have these files:
text.md:
---
header-includes:
- \usepackage{pgf-pie}
---
\begin{tikzpicture}
\pie{50/, 50/}
\end{tikzpicture}
settings.yaml:
variables:
  header-includes:
    - \pagecolor{black}
and I compile them with pandoc with the command:
pandoc text.md -d settings -o text.pdf
...the header-includes value in the defaults file settings.yaml will overwrite the metadata block in text.md, thus failing to compile.
Is there a way to get pandoc to combine the two header-includes lists instead?
Combining these header-includes lists is not possible.
There are two reasons for this. First, the values from the defaults file always take precedence. Second, if a name is used both as a variable and in the metadata, the variable is used.
For additional info and discussions of this topic, see these pandoc GitHub issues:
Command-line options --css and --include-in-header override corresponding metadata fields instead of accumulating
Allow defaults to be folded into YAML metadata
A possible workaround would be to use a "new style" custom writer, as these provide write access to both metadata and variables:
function Writer (doc, opts)
  local includes_tmpl = pandoc.template.compile('$header-includes$')
  local vars = {['header-includes'] = opts.variables['header-includes'] or ''}
  -- Write header-includes, once with variables, once without (thus
  -- allowing metadata values to be used instead)
  opts.variables['header-includes'] =
    pandoc.write(doc, 'latex', {template=includes_tmpl, variables=vars}) ..
    '\n' ..
    pandoc.write(doc, 'latex', {template=includes_tmpl})
  return pandoc.write(doc, 'latex', opts)
end
Note, however, that this currently requires the development version, so you'd need to use a nightly build. You'll also need to explicitly specify the template and PDF engine, e.g., --template=default.latex --pdf-engine=xelatex.

How to include multiple rows of LaTeX code via the YAML header (header-includes field) in RMarkdown?

I need to include the following code in a .tex file that is generated from a custom template via RMarkdown, in order to get rid of an error. However, if I try it as below in the YAML heading:
header-includes:
  \newenvironment{CSLReferences}%
    {}%
    {\par}
it gets parsed into the .tex file as a single line, like \newenvironment{CSLReferences}% {}% {\par}, thus commenting out everything after the first %. So how can I change the YAML part so that it is correctly interpreted as 3 different lines?
Instead of worrying about the markdown parsing, you can write the command in a single line:
header-includes:
  \newenvironment{CSLReferences}{}{\par}
Alternatively, avoid all these annoying problems with markdown parsing and put your definition in a .tex file, which you can include via
includes:
  in_header: header.tex
After some trials & searching this works (found a solution while writing the question):
header-includes:
  - "\\newenvironment{CSLReferences}%"
  - "{}%"
  - "{\\par}"
Interestingly, I couldn't find much in the official documentation.
EDIT:
As @samcarter mentioned in the comments and in an answer, in this particular case a single line would have been enough:
header-includes:
  \newenvironment{CSLReferences}{}{\par}
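Why the three-line version breaks can be illustrated with a small Python sketch. It only models the two effects described in the question, namely that the multi-line value ends up on one line and that LaTeX ignores everything after an unescaped %. The function names are hypothetical and this is not how pandoc or a YAML parser is implemented.

```python
import re


def fold_plain_scalar(lines):
    """Join the lines of a multi-line plain YAML scalar into one line,
    turning each line break into a single space."""
    return " ".join(line.strip() for line in lines)


def strip_latex_comment(line):
    """Drop everything from the first unescaped % to the end of the line."""
    return re.sub(r"(?<!\\)%.*", "", line)


yaml_lines = [
    r"\newenvironment{CSLReferences}%",
    r"  {}%",
    r"  {\par}",
]

folded = fold_plain_scalar(yaml_lines)
print(folded)                       # \newenvironment{CSLReferences}% {}% {\par}
print(strip_latex_comment(folded))  # \newenvironment{CSLReferences}
```

Once the value is on one line, LaTeX only sees \newenvironment{CSLReferences}, which is why either the single-line form without % or the quoted list form is needed.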

pandoc does not produce bibliography when biblio file is in YAML-metadata only

I assume that inserting a reference to a BibTeX bibliography in the YAML metadata is sufficient for the references to be produced. This is similar to the question "pandoc does not print references when .bib file is in YAML", which was perhaps misunderstood and which has no accepted answer yet.
I have the example input file:
---
title: Ontologies of what?
author: auf
date: 2010-07-29
keywords: homepage
abstract: |
  What are the objects ontologists talk about.
  Inconsistencies become visible if one models real objects (cats) and children playthings.
bibliography: "BibTexExample.bib"
---
An example post. With a reference to [@Frank2010a] and more.
## References
I invoke the conversion to latex with :
pandoc -f markdown -t pdf postWithReference.markdown -s --verbose -o postWR.pdf -w latex
The PDF is produced, but it contains no references, and the text is rendered as "With a reference to [@Frank2010a] and more.", demonstrating that the reference file was not used. The title and author are inserted in the PDF, so the YAML metadata is read. If I add the reference file on the command line, the output is correctly produced with the reference list.
What am I doing wrong? I want to avoid specifying the bibliography file on the command line (to avoid duplication, DRY). Is there a general switch to demand bibliography processing while leaving the selection of the bibliography file to the document's YAML metadata?
More recent versions of pandoc require --citeproc instead of --filter=pandoc-citeproc.
The bibliography is inserted by the pandoc-citeproc filter. It is run automatically when the bibliography is set via the command line, but has to be run manually in cases such as yours. Adding --filter=pandoc-citeproc will make it work as expected.
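The choice between the two flags can be sketched in Python as follows. The helper names are hypothetical, and the version cutoff is an assumption on my part: as far as I know, --citeproc became available with pandoc 2.11, while earlier releases rely on the external pandoc-citeproc filter. The bibliography file itself stays in the YAML metadata.

```python
def citeproc_args(pandoc_version):
    """Return the citation-processing flag for a (major, minor) version tuple.

    Assumption: --citeproc exists from pandoc 2.11 on; earlier releases
    use the external pandoc-citeproc filter instead.
    """
    if pandoc_version >= (2, 11):
        return ["--citeproc"]
    return ["--filter=pandoc-citeproc"]


def build_command(source, output, pandoc_version):
    """Build a pandoc command without naming the bibliography file,
    which is read from the document's YAML metadata."""
    return ["pandoc", source, "-s", *citeproc_args(pandoc_version), "-o", output]


print(build_command("postWithReference.markdown", "postWR.pdf", (2, 11)))
```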

How can I specify pandoc's markdown extensions using a YAML block?

Background
Pandoc's markdown lets you specify extensions for how you would like your markdown to be handled:
Markdown syntax extensions can be individually enabled or disabled by appending +EXTENSION or -EXTENSION to the format name. So, for example, markdown_strict+footnotes+definition_lists is strict markdown with footnotes and definition lists enabled, and markdown-pipe_tables+hard_line_breaks is pandoc’s markdown without pipe tables and with hard line breaks.
My specific question
For a given pandoc conversion where, say, I use grid tables in my source:
pandoc myReport.md --from markdown+pipe_tables --to latex -o myReport.pdf
How can I write a pandoc YAML block to accomplish the same thing (specifying that my source contains grid tables?)
A generalized form of my question
How can I turn extensions on and off using pandoc YAML?
Stack Overflow Questions that I don't think completely answer my question
Can I set command line arguments using the YAML metadata - This one deals with how to specify output options, but I'm trying to tell pandoc about the structure of my input
What can I control with YAML header options in pandoc? - Answerers mention pandoc's templates, but neither the latex output template nor the markdown template indicate any sort of option for grid_tables. So, it's not clear to me from these answers how knowing about the templates will help me figure out how to structure my YAML.
There may also not be a way to do this
It's always possible that pandoc isn't designed to let you specify those extensions in the YAML. Although, I'm hoping it is.
You can use Markdown variants to do this in an Rmarkdown document. Essentially, you enter your extensions into a variant option in the YAML header block at the start of your .Rmd file.
For example, to use grid tables, you have something like this in your YAML header block:
---
title: "Habits"
author: John Doe
date: March 22, 2005
output:
  md_document:
    variant: markdown+grid_tables
---
Then you can compile to a PDF directly in pandoc by typing in your command line something like:
pandoc yourfile.md -o yourfile.pdf
For more information on markdown variants in RStudio: http://rmarkdown.rstudio.com/markdown_document_format.html#markdown_variants
For more information on Pandoc extensions in markdown/Rmarkdown in RStudio:
http://rmarkdown.rstudio.com/authoring_pandoc_markdown.html#pandoc_markdown
You can specify pandoc markdown extensions in the YAML header using the md_extensions argument, which is available in each output format.
---
title: "Your title"
output:
  pdf_document:
    md_extensions: +grid_tables
---
This will activate the extension. See the Rmarkdown Definitive Guide for details.
Outside the scope of Rmarkdown, you can use Pandocomatic to do it, or Paru for Ruby.
---
title: My first pandocomatic-converted document
pandocomatic_:
  pandoc:
    from: markdown+footnotes
    to: html
...
As Merchako noted, the accepted answer is specific to rmarkdown. In, for instance, Atom md_extensions: does not work.
A more general approach would be to put the extensions in the command line options. This example works fine:
---
title: "Word document with emojis"
author: me
date: June 9, 2021
output:
  word_document:
    pandoc_args: ["--standalone", "--from=markdown+emoji"]
---
For people stumbling across this in or after 2021, this can be done without Rmarkdown. You can specify a YAML "defaults" file, which basically includes anything you could want to configure.
In order to do what OP wanted, all you'd need to do is
from: markdown+pipe_tables
in the defaults file, then pass it when you compile.
You can also specify the input and output files, so you can end up with the very minimal command
pandoc --defaults=defaults.yaml
and have it handle the rest for you. See https://pandoc.org/MANUAL.html#extensions for more.
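As a rough illustration of such a defaults file, the Python sketch below renders one that covers the input format, the input file, and the output file. The function name render_defaults is hypothetical, the file names are taken from the question, and the sketch assumes the names are simple enough not to need YAML quoting. The keys from, input-files, and output-file are the ones I understand the pandoc manual to document for defaults files.

```python
def render_defaults(from_format, input_files, output_file):
    """Render a minimal pandoc defaults file as YAML text.

    Assumes the file names are simple enough not to need YAML quoting.
    """
    lines = [f"from: {from_format}", "input-files:"]
    lines += [f"  - {name}" for name in input_files]
    lines.append(f"output-file: {output_file}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Save this text as defaults.yaml, then run: pandoc --defaults=defaults.yaml
    print(render_defaults("markdown+pipe_tables", ["myReport.md"], "myReport.pdf"))
```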

How to set Sphinx's `exclude_patterns` from the command line?

I'm using Sphinx on Windows.
Most of my documentation is for regular users, but there are some sub-pages with content for administrators only.
So I want to build two versions of my documentation: a complete version, and a second version with the "admin" pages excluded.
I used the exclude_patterns in the build configuration for that.
So far, it works. Every file in every subfolder whose name contains "admin" is ignored when I put this into the conf.py file:
exclude_patterns = ['**/*admin*']
The problem is that I'd like to run the build once to get both versions.
What I'm trying to do right now is running make.bat twice and supply different parameters on each run.
According to the documentation, I can achieve this by setting the BUILDDIR and SPHINXOPTS variables.
So now I have a build.bat that looks like this:
path=%path%;c:\python27\scripts
rem BUILD ADMIN DOCS
set SPHINXOPTS=
set BUILDDIR=c:\build\admin
call make clean
call make html
rem BUILD USER DOCS
set SPHINXOPTS=-D exclude_patterns=['**/*admin*']
set BUILDDIR=c:\build\user
call make clean
call make html
pause
The build in the two different directories works when I delete the line set BUILDDIR=build from the sphinx-generated make.bat file.
However, the exclude pattern does not work.
The batch file listed above outputs this for the second build (the one with the exclude pattern):
Making output directory...
Running Sphinx v1.1.3
loading translations [de]... done
loading pickled environment... not yet created
Exception occurred:
  File "C:\Python27\lib\site-packages\sphinx-1.1.3-py2.7.egg\sphinx\environment.py", line 495, in find_files
    ['**/' + d for d in config.exclude_dirnames] +
TypeError: coercing to Unicode: need string or buffer, list found
The full traceback has been saved in c:\users\myusername\appdata\local\temp\sphinx-err-kmihxk.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
Either send bugs to the mailing list at <http://groups.google.com/group/sphinx-dev/>,
or report them in the tracker at <http://bitbucket.org/birkenfeld/sphinx/issues/>.
What am I doing wrong?
Is the syntax for exclude_patterns in the sphinx-build command line different than in the conf.py file?
Or is there a better way to build two different versions in one step?
My first thought was that this was a quoting issue, quoting being notoriously difficult to get right on the Windows command line. However, I wasn't able to come up with any combination of quoting that changed the behavior at all. (The problem is easy to replicate.)
Of course it could still just be some quoting issue I'm not smart enough to figure out, but I suspect this is a Sphinx bug of some kind, and hope you will report it to the Sphinx developers.
In the meantime, here's an alternate solution:
Quoting from the Sphinx documentation:
There is a special object named tags available in the config file. It can be used to query and change the tags (see Including content based on tags). Use tags.has('tag') to query, tags.add('tag') and tags.remove('tag') to change.
This allows you to essentially pass flags into the conf.py file from the command line, and since the conf.py file is just Python, you can use if statements to set the value of exclude_patterns conditionally based on the tags you pass in.
For example, you could pass Sphinx options like:
set SPHINXOPTS=-t foradmins
to pass the "foradmins" tag, and then check for it in your conf.py like so:
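# By default, exclude the admin pages (pattern taken from the question)
exclude_patterns = ['**/*admin*']
# When built with "-t foradmins", include everything
if tags.has('foradmins'):
    exclude_patterns = []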
That should allow you to do what you want. Good Luck!
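Building both versions in one step could then look roughly like the Python sketch below, which only assembles the two sphinx-build command lines. The function name and directory layout are hypothetical. It assumes a conf.py that checks tags.has('foradmins') as described above, and it uses the sphinx-build options -b (builder) and -t (tag).

```python
import shlex


def sphinx_build_commands(source_dir, build_root):
    """Return one sphinx-build command per documentation variant.

    The admin build passes the 'foradmins' tag, which conf.py is assumed
    to check with tags.has('foradmins'); the user build passes no tag.
    """
    variants = {"admin": ["-t", "foradmins"], "user": []}
    return {
        name: ["sphinx-build", "-b", "html", *extra,
               source_dir, f"{build_root}/{name}/html"]
        for name, extra in variants.items()
    }


if __name__ == "__main__":
    # Hypothetical directories; print the commands instead of running them.
    for name, cmd in sphinx_build_commands("source", "build").items():
        print(name, "->", shlex.join(cmd))
```

Each command list can be handed to subprocess.run, which avoids the shell quoting problems mentioned above because no shell is involved.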
