Formatting a reveal.js presentation with pandoc: how do I set .fragment on some list items?

I'm using Pandoc 2.17 to produce a reveal.js presentation. I'd like to produce output like:
<section>
<h2>Slide title</h2>
<ul>
<li>This appears initially</li>
<li class="fragment">And this appears later.
</li>
</ul>
</section>
Without the fragment I can get that easily with
# Slide Title
- This appears initially.
- This appears later
I can use fenced divs
::: {.fragment}
- This appears later
:::
That introduces an extra div (and breaks my unordered list into two lists) and applies .fragment to the div rather than to the list item.
I can also insert the li element manually.
It looks like if I use the commonmark_x input format rather than pandoc flavored markdown I can do something like
- This appears later. {.fragment}
But this GitHub issue suggests there is a standard syntax for incremental slides:
pandoc now includes a uniform syntax for transitions, which gets output as \pause in beamer and using fragment divs in revealjs. Of course, you can also just use a in the markdown source, but this won't be portable if you decide to switch to beamer.
I can't find that syntax though.
I'd also like to learn if there's something like the commonmark attributes extension I can use in pandoc flavored markdown.

So, I have found the syntax for incremental slides:
## Slide Title
- This appears initially.
::: incremental
- And this appears later.
:::
That introduces a second bulleted list (the first and second bullets end up in different <ul> elements) and wraps the text of the second bullet in paragraph elements.
It also introduces an extra div.

Related

Pandoc 2.x renders images' alternative texts in an inaccessible fashion

Since I upgraded from Pandoc v1.19 to 2.9, decorative images are not exported as expected anymore.
First of all, when generating HTML from ![](test.jpg), in v1.19 a <p class="figure"> structure was wrapped around the image, but now it's only a <p>:
<p>
<img src="test.jpg">
</p>
This makes it harder to style in line with other images that have an alternative text.
But what's really a problem here: there's no alt="" attribute produced anymore! This means that e.g. screen readers will not recognise this as a decorative image anymore.
So let's see what happens to an image with an actual alternative text, e.g. when generating HTML from ![Hello](test.jpg):
<div class="figure">
<img src="test.jpg" alt="">
<p class="caption">Hello</p>
</div>
Here we get a class="figure" in the surrounding element, but now it's a <div> instead of a <p> (I don't bother too much about this, but again, it makes it harder to style everything the same).
What again is a big problem though is the fact that the alt attribute is now set empty: this prevents screen readers from perceiving them at all, which is horribly wrong! I guess that Pandoc concludes that having alternative text and caption would be redundant, which is correct, and that the caption below would be the right thing to show - which it is not.
The right structure would look something like this:
<div class="figure">
<img src="test.jpg" alt="Hello"><!-- Leave the alternative text on the image -->
<p class="caption" aria-hidden="true">Hello</p><!-- Hide the redundant visual alternative text from screen readers -->
</div>
Any reason why this behaviour would make sense? Can it be changed somehow? Otherwise I will have to fiddle around with some post-processing JavaScript...
The ![](test.jpg) example is no longer treated as a figure, because pandoc now requires that the image is the only element in a paragraph and that it has a caption.
Wrapping of figures with <div> happens when exporting to HTML4. Using the latest pandoc 2.9.2.1 and running pandoc -t html5 on the input ![Hello](test.jpg) gives:
<figure>
<img src="test.jpg" alt="" /><figcaption>Hello</figcaption>
</figure>
The rationale for emitting an empty alt attribute is that screen readers would read the caption twice: first the alt, then the figcaption. Your suggestion seems much better, please open an issue.
If you can't wait for a new release, then use a Lua filter to create figures the way you like:
-- Turn a paragraph that contains only an image into a custom figure div.
function Para (p)
  if #p.content == 1 and p.content[1].t == "Image" then
    local image = p.content[1]
    local figure_content = pandoc.List{}
    -- Keep the image itself as the first element of the figure.
    figure_content:insert(image)
    -- Repeat the caption visually, but hide it from screen readers.
    figure_content:insert(
      pandoc.RawInline('html', '\n<p class=caption aria-hidden="true">'))
    figure_content:extend(image.caption)
    figure_content:insert(pandoc.RawInline('html', '</p>'))
    local attr = pandoc.Attr("", {"figure"})
    return pandoc.Div({pandoc.Plain(figure_content)}, attr)
  end
end

How do I do strikethrough (line-through) in asciidoc?

How do I render a strikethrough (or line-through) in an adoc file?
Let's presume I want to write "That technology is -c-r-a-p- not perfect."
That technology is [line-through]#crap# not perfect.
According to the AsciiDoc manual, [line-through] is deprecated.
Comment from Dan Allen
It's important to understand that line-through is just a CSS role. Therefore, it needs support from the stylesheet in order to appear as though it is working.
If I run the following through Asciidoctor (or Asciidoctor.js):
[.line-through]#strike#
I get:
<span class="line-through">strike</span>
The default stylesheet has a rule for this:
.line-through{text-decoration:line-through}
If you use your own stylesheet, it needs an equivalent rule.
It is possible to customize the HTML that is generated using custom templates (Asciidoctor.js supports Jade templates). In that case, you'd override the template for inline_quoted, check for the line-through role and produce either an <s> or, preferably, a <del> instead of the span.
If you're only targeting the HTML backend, you can insert HTML code verbatim via a passthrough context. This can be done inline by wrapping the parts in +++:
That technology is +++<del>+++crap+++</del>+++ not perfect.
This won't help you for PDF, DocBook XML, or other output formats, though.
If the output is intended for HTML you can pass HTML.
The <s> HTML element renders text with a strikethrough, or a line
through it. Use the element to represent things that are no longer
relevant or no longer accurate.
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/s
To render as:
Example text.
use:
1. Pass inline:
Example +++<s>text</s>+++.
2. Pass-through macro:
Example pass:[<s>text</s>].
3. Pass block:
++++
Example <s>text</s>.
++++

Regex encapsulate full line and surround it

I can find examples of surrounding a line but not surrounding and replacing, and I'm a bit new to Regex.
I'm trying to ease up my markdown, so that I do not need to add in html just to get it to center images.
With pandoc, I apparently need to surround an image with DIV tags to get it to be centered, right justified, or whatever.
Instead of typing that every time, I'd like to just preprocess my markdown with a ruby script and have ruby add in the DIV's for me.
So I can type:
center![](image.jpg)
and then run a ruby script that will change it to
<div class="center">
![](image.jpg)
</div>
I want the regex to find "center!" and get rid of the word "center" and surround the rest with DIV tags.
How would I accomplish this?
A little example using gsub:
s = "a\ncenter![](image.jpg)\nb\n"
puts s.gsub(/^center(.*)$/, "<div class=\"center\">\n\\1\n</div>")
Result is:
a
<div class="center">
![](image.jpg)
</div>
b
That should get you started. The (.*) captures the content after center, and \\1 adds it back into the replacement. In this example I assumed that the item was on a line by itself: ^ indicates the start of a line and $ indicates the end of a line. If that isn't the case, you'll need to determine what makes your match unique so that the regex doesn't replace any random usage of "center" in your text.
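One way to make the match more specific, as a rough sketch rather than a definitive answer: require the keyword to be followed directly by Markdown image syntax, and capture the keyword so the same rule can cover other alignments. The ALIGNMENTS list and the wrap_aligned_images name are made up for this illustration; the question only mentions center and right justification.

```ruby
# Alignment keywords recognised as prefixes (an assumed set for illustration).
ALIGNMENTS = %w[center left right].freeze

# Matches a whole line made of an alignment keyword immediately followed by
# a Markdown image, e.g. "center![](image.jpg)".
ALIGNED_IMAGE = /^(#{ALIGNMENTS.join('|')})(!\[.*\]\(.*\))$/

# Hypothetical helper: wraps every matching line in a div whose class is the
# keyword, and leaves all other lines untouched.
def wrap_aligned_images(markdown)
  markdown.gsub(ALIGNED_IMAGE) do
    "<div class=\"#{$1}\">\n#{$2}\n</div>"
  end
end

puts wrap_aligned_images("a\ncenter![](image.jpg)\nThe center of town\n")
```

Because the pattern insists on ![ right after the keyword, a line such as "The center of town" passes through unchanged.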

Applying CSS and roles for text blocks instead of inline spans in Sphinx

There is a previous question that explains how to add a color span to some reStructuredText.
To recap the procedure:
First, you have the role.
.. role:: red
An example of using :red:`interpreted text`
It translates into the following.
<p>An example of using <span class="red">interpreted text</span></p>
Now that you have the red class, you can use CSS to change colors.
.red {
color:red;
}
How do you do this if you want text that spans multiple lines? For example:
.. role:: red
:red:`paragraph 1
paragraph 2
paragraph 3`
Where paragraph 1, 2, & 3 would all be "red". If I try to do this I get the warning message:
WARNING: Inline interpreted text or phrase reference start-string without end-string.
It doesn't create the span and inserts ":red:" into the text. It just doesn't interpret this as a string (as the warning suggests).
Basically, can this be done in reStructuredText, and if it can, how?
I'm using Sphinx 1.1.3.
There are a number of ways to do this, but one of them is to use the class directive:
.. class:: red
This is a paragraph.
This is another paragraph.
Most docutils HTML writers will put that into html output as a class html attribute, which you can then style with CSS.
In Sphinx, in particular, however, you may need to use rst-class instead of class in at least some cases. See: https://www.sphinx-doc.org/en/2.0/usage/restructuredtext/basics.html
Also, many block-level elements in reStructuredText take a :class: option, which does pretty much the same thing.

Convert HTML to plain text and maintain structure/formatting, with ruby

I'd like to convert html to plain text. I don't want to just strip the tags though, I'd like to intelligently retain as much formatting as possible. Inserting line breaks for <br> tags, detecting paragraphs and formatting them as such, etc.
The input is pretty simple, usually well-formatted html (not entire documents, just a bunch of content, usually with no anchors or images).
I could put together a couple of regexes that get me 80% of the way there, but figured there might be some existing solutions with more intelligence.
First, don't try to use regex for this. The odds are really good you'll come up with a fragile/brittle solution that will break with changes in the HTML or will be very hard to manage and maintain.
You can get part of the way there very quickly using Nokogiri to parse the HTML and extract the text:
require 'nokogiri'
html = '
<html>
<body>
<p>This is
some text.</p>
<p>This is some more text.</p>
<pre>
This is
preformatted
text.
</pre>
</body>
</html>
'
doc = Nokogiri::HTML(html)
puts doc.text
>> This is
>> some text.
>> This is some more text.
>>
>> This is
>> preformatted
>> text.
The reason this works is Nokogiri is returning the text nodes, which are basically the whitespace surrounding the tags, along with the text contained in the tags. If you do a pre-flight cleanup of the HTML using tidy you can sometimes get a lot nicer output.
The problem is when you compare the output of a parser, or any means of looking at the HTML, with what a browser displays. The browser is concerned with presenting the HTML in as pleasing a way as possible, ignoring the fact that the HTML can be horribly malformed and broken. The parser is not designed to do that.
You can massage the HTML before extracting the content: first remove extraneous line breaks, like "\n" and "\r", then replace <br> tags with line breaks. There are many questions here on SO explaining how to replace tags with something else. I think the Nokogiri site also has that as one of the tutorials.
If you really want to do it right, you'll need to figure out what you want to do for <li> tags inside <ul> and <ol> tags, along with tables.
An alternate attack method would be to capture the output of one of the text browsers like lynx. Several years ago I needed to do text processing for keywords on websites that didn't use Meta-Keyword tags, and found one of the text-browsers that let me grab the rendered output that way. I don't have the source available so I can't check to see which one it was.
