Customize the Pandoc writer for the ConTeXt output format?

I'm currently trying to customize the standard writer built into Pandoc to produce output in the ConTeXt format from Markdown input. Unfortunately, the documentation for creating a custom writer found on the Pandoc website doesn't give me much information apart from how to write a custom HTML writer. So, I would like to ask for help with some fundamental ideas:
What would be the preferable way to add some (probably) very simple functionality to the ConTeXt writer? E.g., I would like to rewrite the sequence of characters " - " (in a Markdown document) as another sequence "~-- " (in the resulting ConTeXt document).
If I understood correctly, I'm supposed to base my custom writer on the standard (built-in) writers... But where can I find these? There doesn't seem to be anything in /usr/share/pandoc/ (I'm working on Linux).
The website mentions the "classic style" and the "new style". Apart from one obviously being newer, what style am I supposed to use?
I know that these questions may sound rather simple, but there doesn't seem to be a lot of information available beyond the usual basic stuff. Any help would be much appreciated.

Pandoc has another feature that is similar to custom writers, called "Lua filters". Filters are quite likely a simpler and better choice in this case: they allow you to modify the internal document representation. For example:
function Inlines (inlines)
  -- iterate backwards through the list so removals don't shift
  -- the indices we still have to visit
  for i = #inlines, 3, -1 do
    if inlines[i-2].t == 'Space' and inlines[i-1] == pandoc.Str '-'
        and inlines[i].t == 'Space' then
      -- replace the leading space and the dash with raw ConTeXt,
      -- keeping the trailing space
      inlines[i-2] = pandoc.RawInline('context', '~--')
      inlines:remove(i-1)
    end
  end
  return inlines
end
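With this filter, Markdown input like foo - bar should come out as foo~-- bar in the resulting ConTeXt output.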
The above would be used by writing it to a file, and then passing that file to pandoc via --lua-filter FILENAME.lua
The documentation for Lua filters is also more complete, and hopefully more approachable, than the docs for custom writers.

Related

How to create a custom ConTeXt writer for Pandoc?

Following up on another question, I would like to add a few modifications to the ConTeXt writer that comes with Pandoc. It was suggested that I use Lua filters instead of trying to customize the ConTeXt writer for the problem in the aforementioned question, and that worked fine. However, I suspect filters won't be of much help with the following (but do correct me if I'm wrong)...
In particular, I would like to modify the way Pandoc's built-in ConTeXt writer handles some of the markup. E.g., converting something like *italic* in Markdown results in {\em italic} in ConTeXt when using a command like pandoc -t context <markdownfile>.md -o <contextfile>.tex. This isn't exactly wrong, but it should ideally be written as \emph{italic}.
How would I approach the problem of creating a custom ConTeXt writer for Pandoc without having to build the whole thing from scratch? The question from my previous post still applies here: if I understood correctly, I'm supposed to base my custom writer on the standard (built-in) writers... But where can I find these? There doesn't seem to be anything in /usr/share/pandoc/ (I'm working on Linux).
Any help would be much appreciated.
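One way to avoid building the whole thing from scratch (a sketch, assuming pandoc 2.17 or later): a "new style" custom writer can delegate to the stock ConTeXt writer via pandoc.write and only adjust the document beforehand, so there is no need to locate and copy the built-in writer's source. For the \emph case it could look like this (file and variable names are illustrative):

function Writer (doc, opts)
  -- rewrite Emph elements to raw ConTeXt before delegating,
  -- so *italic* becomes \emph{italic} instead of {\em italic}
  local adjusted = doc:walk {
    Emph = function (em)
      local inlines = pandoc.Inlines { pandoc.RawInline('context', '\\emph{') }
      inlines:extend(em.content)
      inlines:insert(pandoc.RawInline('context', '}'))
      return inlines
    end
  }
  -- hand the adjusted document to the built-in ConTeXt writer
  return pandoc.write(adjusted, 'context', opts)
end

Saved as, say, customcontext.lua (a hypothetical name), it would be invoked with pandoc -t customcontext.lua <markdownfile>.md -o <contextfile>.tex.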

How do file converters work in general, like Word to PDF, XML to JSON, Word to TXT, etc.?

I've used many types of file converters: Word to PDF, XML to JSON, Word to TXT, etc.
How do they work in the backend? Are there specific guidelines that each of them follows? Are there similarities in the way they are implemented?
I tried searching, but most of the articles take me to web apps that can convert documents; none of them gives clarity on how it's actually done.
All of them work by parsing the first document into a data structure, then generating a document in the other format from that data structure using recursion.
Parsing itself is a giant topic that people take courses on in computer science. But long story short, it proceeds by breaking the document into tokens, and then fitting the tokens into a parse tree using one of a standard set of methods. They have all sorts of fancy names like Recursive Descent and LALR(1). That's where most of the theory you'd want to learn is.
For example, if you're writing a JSON to XML converter, you'd first need to parse that JSON. A JSON Parser shows how you could write that, from scratch, using recursive descent. Once that's written, you just need to write a recursive function that takes each data type and does something appropriate with it to generate text in the format that you want.
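To make the generation half concrete, here is a minimal Lua sketch of such a recursive function. It assumes the JSON has already been parsed into a Lua table, and it deliberately ignores escaping, attributes, and key ordering:

-- recursively turn an already-parsed value into XML text
local function to_xml(value, tag)
  if type(value) == 'table' then
    local parts = {}
    for key, child in pairs(value) do
      -- string keys become element names; array indices fall back to <item>
      local name = type(key) == 'string' and key or 'item'
      parts[#parts + 1] = to_xml(child, name)
    end
    return string.format('<%s>%s</%s>', tag, table.concat(parts), tag)
  end
  -- leaf values (strings, numbers, booleans) become text content
  return string.format('<%s>%s</%s>', tag, tostring(value), tag)
end

print(to_xml({ name = 'widget', sizes = { 'S', 'M' } }, 'product'))
-- possible output (pairs order is unspecified):
-- <product><name>widget</name><sizes><item>S</item><item>M</item></sizes></product>

The parsing half would be the recursive-descent parser described above; the two halves meet at the shared data structure.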
Incidentally you can also write a "document converter" that converts from a document format to the same document format. Why would someone want to do that? The two most common use cases are to prettify or minify code. Despite the fact that only one format is being dealt with, the principles of how you do it are exactly the same.

Rename the latexmath macro in AsciiDoc

Is there a way I can rename or alias latexmath in an AsciiDoc document?
In an ideal world, I'd like to set up an AsciiDoc document such that $...$ is interpreted as inline LaTeX math, and
$$...$$ is interpreted as a block equation. In general, I'm just trying to reduce the number of characters involved in defining a math block, since
where $c$ is the speed of light and $m_0$ is the rest mass
is significantly more readable (to someone who's used LaTeX for years) than
where latexmath:[$c$] is the speed of light and latexmath:[$m_0$] is the rest mass
The use case I have is that I'm writing technical documentation for upload to a GitLab repository. I'd like to be able to exploit GitLab's ability to automatically render AsciiDoc format files. However, these documents are math heavy, so I find the large numbers of latexmath:[...] blocks hard to read while editing.
latexmath is the name of the macro that handles parsing the LaTeX markup. If you don't specify latexmath, Asciidoctor doesn't know to pass control to an alternate parser.
You could achieve what you're after by writing a pre-processing step that identifies contiguous blocks of LaTeX markup and wraps that markup in latexmath:[...]. The updated markup can then be processed by Asciidoctor like normal (assuming that your LaTeX markup identification is accurate). How you go about implementing that is up to you.
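For instance, a rough sketch of such a preprocessor in Lua (deliberately naive: it doesn't handle escaped dollar signs, $$...$$ blocks, math spanning multiple lines, or text that already uses latexmath:[...]):

-- wrap every bare $...$ span in latexmath:[...]
local function wrap_math(line)
  -- '%$(.-)%$' matches the shortest span between two dollar signs
  return (line:gsub('%$(.-)%$', 'latexmath:[$%1$]'))
end

for line in io.lines(arg[1]) do
  print(wrap_math(line))
end

Run as something like lua wrap-math.lua input.adoc > preprocessed.adoc (file names illustrative) and feed the result to Asciidoctor.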
Another way, assuming you have some Ruby skills, would be to modify the extension that implements the latexmath macro so that it is called, say, L. Then your markup would be the more concise:
where L:[$c$] is the speed of light and L:[$m_0$] is the rest mass

Rainmeter: How to concatenate strings

I am getting data from a broken RSS feed that gives me a wrong link. I wanted to fix this link, so I made this regex:
<link.*>(.*)&.*tid(.*)</link>
and the link could be like:
www.somedomain.com/?value=50&burrrdurrrr;tid=120
But the real working link is in this form:
www.somedomain.com/?value=50&tid=120
What I'm asking is: if my measure looks like this:
[FeedURL]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=[Feed]
StringIndex=2 ;now I only get www.somedomain.com/?value=50
Substitute=#SubstituteFeed#
How am I supposed to concatenate the strings together to complete the URL?
I'm guessing that rather than &burrrdurrrr;, the link has &amp;, which is how you have to write & in an HTML or XML file.
If that's the case, you just need to set the DecodeCharacterReference option, as described in this handy-looking tutorial. Another option mentioned there is Substitute, which would be able to strip it out even if it really was &burrrdurrrr;.
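For example, adding it to the measure above (an untested sketch; see the WebParser documentation for the exact values the option accepts):

[FeedURL]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=[Feed]
StringIndex=2
Substitute=#SubstituteFeed#
DecodeCharacterReference=1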
None of this is a particularly sensible way of dealing with HTML or XML - a much better approach would be a plugin which actually parsed the document structure and let you reference nodes using XPath or CSS rules - but you work with what you've got, I guess. (I've never heard of this "Rainmeter" before, despite its claim to be "the best known and most popular desktop customization program for Windows"; maybe because nobody else calls their program that, instead almost universally using the word "widget"?)

Internationalizing content-heavy pages into message files seems cumbersome in Play 2?

Internationalization in Play 2 can be done with Messages.get("home.title") and language files. What about when you internationalize a page full of textual content, and not just one specific header or link?
For example, writing a message file for a long page representing, e.g., product info:
_First header_
Some paragraphs of text
...
_Tenth header_
Tenth paragraph and more text
Message file:
a)
product.info = "<many paragraphs of text including headers>"
or splitting one page into html elements
b)
product.info.h1 = "<first header>"
product.info.p1 = "<first para>"
product.info.p2 = "<2nd para>"
To me, neither solution sounds right. In the first, having a vast value for a single key seems like bad convention; in the latter, splitting a single page into dozens of keys doesn't sound good either.
Big websites often follow the convention www.site.com/en-us/product/1 of having the language in the URL. So the question is: how do I do it this way, and is it a better way at all? I could easily end up not just translating into a dozen languages but also making a dozen rounds of layout changes.
I could use global code snippets with message files for elements that have little text and don't change often, e.g. navigation (/view/global/header/somenavbar.scala.html), but then I only end up with a complex folder structure.
Is there another way, a best practice, for internationalization in Play 2 besides message files?
Take a look at Joscha Feth's solution in the play_authenticate Java sample.
There are templates for emails in 3 languages for email confirmation, password resetting, etc.
The template for each 'type' of email and each language is kept in a single file, i.e.:
_password_reset_en.scala.html
_password_reset_de.scala.html
_password_reset_pl.scala.html
_verify_email_en... etc
And for each 'type' there is a 'parent' template, which contains a condition (a common Scala match; check the Tags section of the template documentation) that returns the rendered view depending on the detected language:
password_reset.scala.html
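That parent could look roughly like this (a sketch of the pattern with illustrative names, not the actual file from play_authenticate):

@(user: User)(implicit lang: Lang)

@lang.language match {
  case "de" => { @_password_reset_de(user) }
  case "pl" => { @_password_reset_pl(user) }
  case _ => { @_password_reset_en(user) }
}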
Finally, yes, at the beginning I also thought it was some kind of madness, but believe me, that technique can be useful. There is room for further improvement, I think. Maybe it would be better to move the language conditioning to the controller; I think that depends on many factors, and it would be great if you find the time to investigate this topic.
