Pandoc: complete pass-through from TeX to HTML?

I'm using pandoc to convert TeX files into HTML files (to be used with Jekyll).
I want to insert a raw block directly into the TeX file in such a way that it survives the conversion from TeX to HTML without any alteration.
For instance, I might want to add something like this:
{% highlight python %}
def func(ok):
    return ok
{% endhighlight %}
I can do this from md to HTML by using {=html}, but what about the TeX->md part?

This requires the use of a filter, as @mb21 already pointed out.
You'll probably want the input document to remain valid LaTeX, so a good method would be to use a specially marked verbatim environment, like so:
\begin{verbatim}
%%%HTML
<aside>Embedding raw HTML can be helpful</aside>
\end{verbatim}
Pandoc will read this as a normal code block, but we can use a filter to convert it into a raw HTML block:
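function CodeBlock(cb)
  -- '%%' matches a literal '%' in a Lua pattern, so this looks for a leading '%%%HTML' line
  local rawHtml = cb.text:match('^%s*%%%%%%HTML\n(.*)')
  if rawHtml then
    -- replace the code block with its contents as raw HTML
    return pandoc.RawBlock('html', rawHtml)
  end
  -- returning nothing leaves all other code blocks unchanged
end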
Save the above into a file and use it as the argument of pandoc's --lua-filter option.
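For anyone who would rather not write Lua, the same idea can be sketched as a JSON filter in Python. This is only an illustration under assumptions, not part of the answer above: it assumes pandoc's JSON AST layout, in which a code block is {"t": "CodeBlock", "c": [attr, text]} and a raw block is {"t": "RawBlock", "c": [format, text]}. The names MARKER, to_raw_html, walk and run_filter are made up for this sketch.

```python
import json
import re

# Optional leading whitespace, the %%%HTML marker line, then everything after it.
# re.DOTALL lets '.' span newlines, as '.' does in a Lua pattern.
MARKER = re.compile(r"^\s*%%%HTML\n(.*)", re.DOTALL)


def to_raw_html(node):
    """Return a RawBlock for a marked CodeBlock, or the node unchanged."""
    if isinstance(node, dict) and node.get("t") == "CodeBlock":
        _attr, text = node["c"]
        match = MARKER.match(text)
        if match:
            return {"t": "RawBlock", "c": ["html", match.group(1)]}
    return node


def walk(node):
    """Recursively apply to_raw_html to every element of the AST."""
    if isinstance(node, list):
        return [walk(item) for item in node]
    if isinstance(node, dict):
        node = to_raw_html(node)
        return {key: walk(value) for key, value in node.items()}
    return node


def run_filter(stream_in, stream_out):
    """Read a pandoc JSON document, transform it, and write it back out."""
    json.dump(walk(json.load(stream_in)), stream_out)
```

To use it as a filter, the script would end with a call such as run_filter(sys.stdin, sys.stdout) (after importing sys) and be passed to pandoc's --filter option instead of --lua-filter. The Lua version is usually the simpler choice, since it needs no interpreter besides pandoc itself.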

Related

Customising Pandoc writer element output

Is it possible to customise element outputs for a pandoc writer?
Given reStructuredText input
.. topic:: Topic Title

   Content in the topic
Using the HTML writer, Pandoc will generate
<div class="topic">
<p><strong>Topic Title</strong></p>
<p>Content in the topic</p>
</div>
Is there a supported way to change the HTML output? Say, changing <strong> to <mark>, or adding another class to the parent <div>.
edit: I've assumed the formatting is the responsibility of the writer, but it's also possible it's decided when the AST is created.
This is what pandoc filters are for. Possibly the easiest way is to use Lua filters, as those are built into pandoc and don't require additional software to be installed.
The basic idea is that you'd match on an AST element created from the input, and produce raw output for your target format. So if all Strong elements were to be output as <mark> in HTML, you'd write
function Strong (element)
  -- the result will be the element's contents, which will no longer be 'strong'
  local result = element.content
  -- wrap contents in `<mark>` element
  result:insert(1, pandoc.RawInline('html', '<mark>'))
  result:insert(pandoc.RawInline('html', '</mark>'))
  return result
end
You'd usually want to inspect pandoc's internal representation by running pandoc --to=native YOUR_FILE.rst. This makes it easier to write a filter.
There is a similar question on the pandoc-discuss mailing list; it deals with LaTeX output, but is also about handling of custom rst elements. You might find it instructional.
Nota bene: the above can be shortened by using a feature of pandoc that outputs spans and divs with a class of a known HTML element as that element:
function Strong (element)
  return pandoc.Span(element.content, {class = 'mark'})
end
But I think it's easier to look at the general case first.
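To make the AST rewrite itself concrete, here is a rough Python sketch of the same Strong-to-Span replacement applied to pandoc's JSON representation. It assumes the JSON AST layout in which Strong is {"t": "Strong", "c": inlines} and Span is {"t": "Span", "c": [attr, inlines]}, with attr being [identifier, classes, key-value pairs]. The function name mark_strong and the sample paragraph are invented for illustration.

```python
def mark_strong(node):
    """Recursively replace Strong inlines with Span elements of class 'mark'."""
    if isinstance(node, list):
        return [mark_strong(item) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "Strong":
            # Span content is [attr, inlines]; attr is [identifier, classes, key-value pairs].
            return {"t": "Span", "c": [["", ["mark"], []], mark_strong(node["c"])]}
        return {key: mark_strong(value) for key, value in node.items()}
    return node


# A hand-written paragraph fragment standing in for real pandoc JSON output.
sample = {"t": "Para", "c": [{"t": "Strong", "c": [{"t": "Str", "c": "Topic"}]}]}
print(mark_strong(sample))
```

The Lua filter remains the more convenient route; the sketch only shows which part of the tree changes.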

Jekyll / Liquid include files dynamically

I'm currently working on a new site with Jekyll and have run into a problem. I have a layout page where I can define the background image with variables from each page.
Layout:
class="background background-{{ page.header_bg }}"
Page:
---
header_bg: storm
---
But now I want to include some file dynamically, depending on the variable value. Well, I can do it with some if or case statements, but actually I want to do something like
{% include page.header_bg %}
But this does not work, because Jekyll looks for a file that is literally called "page.header_bg" rather than a file named after the variable's value.
Can someone help me, please?
According to the docs, you need to put the variable name inside additional {{ }}.
Quote from the link:
ProTip™: Use variables as file name
The name of the file you wish to embed can be literal (as in the
example above), or you can use a variable, using liquid-like variable
syntax as in {% include {{my_variable}} %}.
I know this question is 6 years old but here's a clear answer for posterity. You need to assign page.header_bg to a variable first and then use the variable in the include statement.
{% assign headerBg = page.header_bg %}
{% include headerBg %}
You will get an error Liquid Exception: Invalid syntax for include tag if you attempt to use page.header_bg directly in the include statement.

Modifying parsed Atom feed in Ruby

I need to modify the contents of an Atom feed parsed using the standard RSS library. I've found documentation for parsing and generating feeds, but nothing about the correct way to modify existing structure (maybe it's deemed obvious?).
Specifically, I'm trying to add a content element to each entry, with a type of 'html' and wrapped in a CDATA section.
This is what I have so far:
feed = RSS::Parser.parse open(some_uri), true
feed.items.each do |item|
  item.content = RSS::Atom::Feed::Entry::Content.new
  item.content.type = 'html'
  item.content.content = '<html>my content that i have</html>'
end
Is that the preferred way, and if so, how do I add the CDATA tag?

Spread Liquid `include` over multiple lines

I have an "included" template with several parameters. The contents of the parameters get a bit muddled if I cram them all into a single line, so I would prefer something like this:
{% include product_details
weight= "5.8lbs (2.6 kg)"
width= "22" (56cm)"
length= "49" (125cm)"
thickness= "1¼" (3cm)"
case= "MT51413"
%}
However, this gives me the following error when generating the site:
error: Tag '{%' was not properly terminated with regexp: /\%}/.
Is there any way to spread a Liquid include over several lines?
Sorry, you can't. The Liquid parser expects the whole statement on one line. This is true for all statements, not only for include: try to split an if statement and you will receive the same error.

How to write code blocks using Maruku

How can I write code blocks in Maruku for Ruby and JavaScript?
Currently I am using the technique below, but my first line moves to the left.
hash["test"] = "test"
hash.merge!("test" => "test")
h=HashWithIndifferentAccess.new
h.update(:test => "test")
{:lang=ruby html_use_syntax=true}
I'm not sure I fully understand the question, but Maruku is just a Markdown interpreter.
To produce a code block in Markdown, simply indent every line of the block by at least 4 spaces or 1 tab. For example, given this input:
This is a normal paragraph:

    This is a code block.
Markdown will generate:
<p>This is a normal paragraph:</p>
<pre><code>This is a code block.
</code></pre>
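If the first line of a block ends up flush left, a likely cause is that it was not indented like the rest. A small helper can apply the four-space rule to every line of a snippet before it is pasted into a Markdown file. This is just a sketch using Python's standard textwrap module; the function name as_markdown_code_block is made up for illustration.

```python
import textwrap


def as_markdown_code_block(snippet):
    """Strip any common indentation, then indent every non-blank line by four spaces."""
    return textwrap.indent(textwrap.dedent(snippet), "    ")


ruby = 'hash["test"] = "test"\nhash.merge!("test" => "test")\n'
print(as_markdown_code_block(ruby))
```

Every line of the result, including the first, starts with four spaces, so Markdown treats the whole snippet as one code block.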
I add this answer because I ended up here searching for a solution to code blocks with Maruku using Jekyll. For anyone else in the same boat, use the Liquid tags for code blocks instead of the Markdown syntax:
{% highlight java %}
System.out.println("Hello, Maruku.");
{% endhighlight %}
Also see this question/answer: Highlight with Jekyll and pygments doesn't work

Resources