Pandoc, Markdown to Doc, how to use variables? - syntax

AFAIK, variables can be defined in a YAML external file or inside the Markdown file in a header.
Then they can be used in the document. I have found examples with two different sytaxes:
$variable$ will convert variable to math mode, which is great (i.e. I want to keep that behaviour).
#{variable} does nothing.
Questions:
Is it possible to use variables in the pandoc conversion from markdown to .docx?
If so, how?

Pandoc variables can only be used in pandoc templates, not the document itself (there's an open issue about that).
For that you should check out a preprocessor like gpp or use a pandoc filter like pandoc-mustache or this lua-filter.

Related

How to customize Asciidoctor's DocBook converter?

For example, when the goal is to represent several lines of a program's source code, then the AsciiDoc block markup choices are using
a literal block (....), which is converted to the literallayout DocBook tag
a listing block (----), which is converted to the screen DocBook tag
These are solid defaults, but the proper semantic markup would be programlisting in this case.
Using passthrough blocks is one solution, but at the cost of polymorphism (or convenience, if I decide to pepper my documents with ifdefs / ifndefs):
WARNING
Using passthroughs to pass content (without substitutions) can couple your content to a specific output format, such as HTML. In these cases, you should use conditional preprocessor directives to route passthrough content for different output formats based on the current backend.
I don't mind using the conditionals, but simply wondering if I am missing something obvious or straightforward? Such as passing semantic tag names to a block attrlist or something. Read through the entire AsciiDoc manual, the asciidoctor man page, the Generate DocBook from AsciiDoc article, and tried keyword searches, but couldn't find a thing. (It is highly probably though that I missed it..:)

Rinohtype/Spinx - How to use python variables in stylesheet

I use sphinx to generate HTML and PDF documentation, and was using latex until now to generate PDF, but now looking at swapping for rinohtype.
I'm looking at setting up some custom headers and footers, but would like to include variable text into them, for the version number for example, which comes from a sphinx python plugin. I have rst substitutions, for example |version|, that I use in various places in the document, but if I add it to the header via a stylesheet it doesn't get substituted. I also have python variables, for example version, in my conf.py so I also tried to use {version} in my stylesheet, but the builder complains that the variable doesn't exists.
FYI, here is how I tried to define my header :
[contents_page]
header_text = '|document_id| |version| |shortdate|' (header)
[contents_page]
header_text = '{document_id} {version} {shortdate}' (header)
Any idea how to get around that issue ?
Thanks
rinohtype does support including reStructuredText substitutions (|subst|) in document templates, but this is one of the features that isn't documented yet.
To use substitutions in a page header or footer, you need to include them as follows:
[contents_page]
header_text = '{UserStrings.document_id} {UserStrings.version} {UserStrings.shortdate}' (header)
(Actually, I don't remember why I require the UserStrings. prefix. Perhaps there is a good reason for that. If not, that prefix might not be needed in the future.)

With Pandoc, how to converting between different formats with additional rules?

I have some existing Mediawiki format texts that contain categories tokens like
[[Category:XXX]]
[[Category:YYY]]
I'd like to convert them to Markdown texts. The basic command for doing that with Pandoc is
pandoc -f mediawiki -t markdown -s mytext.mediawiki -o mytext.md
The resultant Markdown text is mostly usable except that it converts the category tokens to
<Category:XXX> <Category:YYY>
which isn't really what I need. Instead, I need
[[!tag XXX YYY]]
because I'm using the resultant Markdown files as source files in a special content management system called Ikiwiki which has its idiosyncratic format for tags. How to do that with Pandoc?
It's probably easiest to do this as a second step with a search and replace on <Category:XXX>. Note that pandoc without the -o option writes to standard-out, so you can pipe it directly to some custom post-processing script.
[[Category:XXX]] is converted by pandoc internally to a link along the lines of Category:XXX (try pandoc -f mediawiki -t native).
So generally, additional rules for elements are implemented through custom scripts that match on Pandoc's internal data types, see Pandoc scripting. So you could match on those kind of links. It's more work (the first time), but makes quite sure you don't replace false positives.

Convert HTML and inline Mathjax math to LaTeX with pandoc ruby

I'm building a Rails app and I'm looking for a way to convert database entries with html and inline MathJax math (TeX) to LaTeX for pdf creation.
I found similar questions like mine:
Convert html mathjax to markdown with pandoc
How to convert HTML with mathjax into latex using pandoc?
and I see two options here:
Create a Haskell executable which leaves stuff like \(y=f(x)\) alone when converting html to LaTeX
Write a ruby method which does the following things:
Take the string and split it into an array with a regex (string.split(regex))
loop through the created array and if content matches regex convert the parts to LaTeX which do not include inline math with PandocRuby.html(string).to_latex
concatenate everything back together (array.join)
I would prefer the ruby method solution because I'm hosting my application on Heroku and I don't like to checkin binaries into git.
Note: the pandoc binary is implemented this way http://www.petekeen.net/introduction-to-heroku-buildpacks)
So my question is: what should the regex look like to split the string by \(math\).
E.g. string can look like this: text \(y=f(x) \iff \log_{10}(b)\) and \(a+b=c\) text
And for the sake of completeness, how should the Haskell script be written to leave \(math\) alone when converting to LaTeX and the ruby method is not a possible solution?
Get the very latest version of pandoc (1.12.2). Then you can do
pandoc -f html+tex_math_dollars+tex_math_single_backslash -t latex

Markup for Sphinx formatting of argparse

I am using Sphinx/reStructuredText to create HTML and PDF docs for a project and am including the output of the argparse help for the command line tools. At the moment I am pasting the output in manually but plan at some point to switch to autogenerated output.
The problem is that while the formatting is fine for named parameters (like -x or --xray) it doesn't work well on positional parameters. It looks like the absence of a leading '-' on the parameter name is confusing it. The output looks like normal text without the neat indentation etc.
So my question is, does there exist markup that would force formatting the positional parameters as if they had leading '-' characters? If not, could someone suggest where in the docs or code I should start looking to put something together myself?
You may use sphinx-argparse extension.
It does not use output of argparse help, but it instead introspects argparse parser itself and provides proper sphinx markup for both positional arguments and options, formatting them as definition list with proper labels.
Sub-commands are also easily handled.
http://sphinx-argparse.readthedocs.org/en/latest/
It will also cover you needs in:
plan at some point to switch to autogenerated output

Resources