Example for use of `citeproc` in Haskell code needed? - pandoc

I am trying to upgrade a program that converts blog entries with pandoc from pandoc-citeproc to the new citeproc. I have not found a simple usage example for citeproc and am having difficulty using it.
Specifically, I do not see how to construct the list of references and the list of citations.
For the list of references, I assume I should process the bib file as was done before with, e.g.,
parseBibTex :: String -> IO [Entry.T]
What is the corresponding function?
I also cannot see how to extract the list of citations or how to produce the formatted file.
Perhaps I misunderstood, and citeproc is not a replacement for pandoc-citeproc. It would be extremely useful to see a simple, complete example of how a text converted to pandoc format and a reference bib file are processed to obtain a formatted text file. I think I could work from such an example.
Thank you!

The functions you want for processing a Pandoc document with citeproc are not in citeproc itself but in Pandoc's Citeproc module:
https://hackage.haskell.org/package/pandoc-2.16.2/docs/Text-Pandoc-Citeproc.html
These functions handle all of the details for you.
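As a minimal sketch of what that looks like in practice, the program below reads a Markdown post, runs processCitations from Text.Pandoc.Citeproc, and writes HTML. It assumes a pandoc 2.16-era API and that the post names its bibliography in its YAML metadata (for example `bibliography: refs.bib`); the file name post.md and the helper name renderWithCitations are made up for illustration.

```haskell
module Main (main) where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.Pandoc
import Text.Pandoc.Citeproc (processCitations)

-- Parse pandoc Markdown, resolve citations, and render HTML.
-- processCitations looks up the `bibliography`, `csl`, and `references`
-- metadata fields of the document itself, so no reference list is
-- passed in by hand.
renderWithCitations :: T.Text -> IO T.Text
renderWithCitations src = do
  result <- runIO $ do
    doc   <- readMarkdown def { readerExtensions = pandocExtensions } src
    cited <- processCitations doc
    writeHtml5String def cited
  handleError result

main :: IO ()
main = do
  -- "post.md" is a hypothetical input file.
  src  <- TIO.readFile "post.md"
  html <- renderWithCitations src
  TIO.putStrLn html
```

Run inside runIO, the bibliography file is read from disk relative to the working directory, so the program should be started from the directory that contains the bib file, or the metadata should use a path that resolves from there.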

Related

Writing thanks and keywords from YAML header in Markdown file to docx document through Pandoc conversion

After reading the online Pandoc manual and browsing pages such as knitr-pandoc-article-template-supporting-keywords and Keywords in Pandoc 2, I haven't figured out yet how to write the values of the thanks and keywords YAML fields from the header of a Markdown file to a docx document through Pandoc conversion. My working version of Pandoc is 2.18.
I have thought that a Lua filter might be the way to proceed, but my knowledge of both Lua and the Pandoc framework at the programmatic level is quite limited.
Any help in this regard would be greatly appreciated.
Although my actual setup is more complex, the following Markdown lines with a YAML header should do for an MWE:
---
title: The Title
author: The Author
thanks: |
  The author wishes to thank certain support.
abstract: |
  This is the abstract.
keywords: [one, two, three, four]
---
# A Heading
Text body.
The answer to this depends a little on how you'd want the thanks to be viewed. E.g., if you'd like it to be presented as a footnote to the author, you'd use a Lua filter like this:
-- Append the `thanks` text as a footnote to the author metadata.
function Meta (meta)
  meta.author = meta.author .. {pandoc.Note(meta.thanks)}
  return meta
end
The approach can be adapted to match different requirements.

Pandoc - How to insert variables in header in docx with pandoc_title_block extension?

I would like to generate a DOCX with variables in the page header that are filled from values set in the md file, such as the title of the document, the version, and the date of publication.
I was able to do this through the yaml_metadata_block extension and custom fields in the reference.docx file, but I would prefer not to use form fields.
I understand (but I could be wrong) that this can be done with the pandoc_title_block extension, but I don't understand how it works and I can't find examples online that I can study.
Is what I said correct?
If so, could someone share a simple example that I can study and understand?
Thank you.

Reformat Markdown files to a specific code style

I'm working on a book which had a couple of people writing and editing the text. Everything is Markdown. Unfortunately, there is a mix of different styles and line widths. Technically this isn't a problem, but it's not nice in terms of aesthetics.
What is the best way to reformat those files in e.g. GitHub markdown style? Is there a shell script for this job?
You might want to look at Pandoc; it understands several flavors of Markdown.
pandoc -f markdown -t gfm foobar.md
Having written a markup converter years ago in Perl, I would not want to approach such a task without a decent lexical analyzer, which is a bit beyond shell scripting.
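The same round trip can be scripted against the pandoc library instead of the command line. The following is a rough sketch, assuming a pandoc 2.x API in which the gfm output format corresponds to writeCommonMark with githubMarkdownExtensions; the function name reformatGfm and the 80-column width are arbitrary choices made for illustration.

```haskell
module Main (main) where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.Pandoc

-- Parse pandoc-flavoured Markdown and write it back out as
-- GitHub-flavoured Markdown, re-wrapped to a fixed column width.
-- runPure is enough here because no files are touched.
reformatGfm :: T.Text -> Either PandocError T.Text
reformatGfm src = runPure $ do
  doc <- readMarkdown def { readerExtensions = pandocExtensions } src
  writeCommonMark
    def { writerExtensions = githubMarkdownExtensions
        , writerColumns    = 80   -- assumed target line width
        }
    doc

main :: IO ()
main = do
  -- Filter style: Markdown on stdin, normalized Markdown on stdout.
  src <- TIO.getContents
  out <- handleError (reformatGfm src)
  TIO.putStr out
```

Because the text passes through pandoc's document model, constructs that pandoc does not represent (unusual inline HTML, exact spacing choices) may come out differently from the input, so it is worth diffing a few chapters before converting the whole book.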
I wrote a tool called tidy-markdown that will reformat any Markdown (including GFM) according to this styleguide.
$ tidy-markdown < ./ugly-markdown.md > ./clean-markdown.md
It handles conversion of inline HTML to Markdown, normalization of syntactic elements like code blocks (converting them to fenced), lists, block-quotes, front-matter, headers, and will even attempt to standardize code-block language identifiers.

Pandoc citations without appending the references bibliography

Main Question:
Is there a way to flag Pandoc to turn off appending the bibliography but still have it insert the correct inline citations?
I am writing a Markdown / Knitr document that has a main file (article.Rmd) and several "child" files that are included in the main file using Knitr's "child=" chunk option.
The child files are basically sections of the main article file, just separated for easier editing and management. Throughout these child section files, I use citations in the Markdown text (e.g. "@author_title_1999") to cite various papers. The main file and each child file have a YAML header that provides the BibTeX file location, e.g.:
---
bibliography: mybibfile.bib
...
(Including this YAML entry more than once is not a problem; see the readme on metadata-blocks.)
When I compile the entire document using Knitr, a big Markdown document is created. I then use Pandoc with the --filter pandoc-citeproc option to manage the citations. Pandoc inserts nice citations and appends a list of the cited papers as references/bibliography. Cool.
As I write and edit the individual child sections, I use the same citation compiling which produces the correct inline citations, but unfortunately also appends the references at the end, even though it's just a section of the larger document. I would like to compile these small child sections with inline citations, but without the bibliography at the end.
I think this is possible with the suppress-bibliography metadata field first introduced in pandoc-citeproc 0.7 (released in May 2015). From the current pandoc-citeproc man page:
pandoc-citeproc will look for the following metadata fields in the input:
...
suppress-bibliography : If this has a true value, the bibliography will be left off. Otherwise a bibliography will be inserted into each Div element with id refs. If there is no such Div, one will be created at the end of the document.
(As a workaround, you can also quite easily create a custom CSL style that doesn't produce a bibliography, by deleting the cs:bibliography child element of the style. See http://docs.citationstyles.org/en/stable/specification.html#child-elements-of-cs-style.)
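If the child sections are compiled from Haskell rather than from the command line, the same metadata field can be set in code before citations are processed. This is only a sketch under assumptions: it uses pandoc's built-in Text.Pandoc.Citeproc module (which the pandoc manual documents as also reading suppress-bibliography) instead of the pandoc-citeproc filter discussed above, and the file name section.md and the function name citeWithoutBibliography are hypothetical.

```haskell
module Main (main) where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.Pandoc
import Text.Pandoc.Builder (setMeta)
import Text.Pandoc.Citeproc (processCitations)

-- Resolve inline citations but leave the reference list off by setting
-- the `suppress-bibliography` metadata field to true before the
-- citation processor runs.
citeWithoutBibliography :: T.Text -> IO T.Text
citeWithoutBibliography src = do
  result <- runIO $ do
    doc <- readMarkdown def { readerExtensions = pandocExtensions } src
    let doc' = setMeta (T.pack "suppress-bibliography") True doc
    cited <- processCitations doc'
    writeMarkdown def cited
  handleError result

main :: IO ()
main = do
  -- "section.md" stands in for one child section with its own
  -- `bibliography:` entry in the YAML header.
  src <- TIO.readFile "section.md"
  out <- citeWithoutBibliography src
  TIO.putStr out
```

Setting the field in code keeps the child files themselves unchanged, so the full article can still be compiled with its bibliography.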

Markdown to plain text in Ruby?

I'm currently using BlueCloth to process Markdown in Ruby and show it as HTML, but in one location I need it as plain text (without some of the Markdown). Is there a way to achieve that?
Is there a markdown-to-plain-text method? Is there an html-to-plain-text method that I could feed the result of BlueCloth to?
The Redcarpet gem has a Redcarpet::Render::StripDown renderer which "turns Markdown into plaintext".
Copy and modify it to suit your needs.
Or use it like this:
Redcarpet::Markdown.new(Redcarpet::Render::StripDown).render(markdown)
Converting HTML to plain text with Ruby is not a problem, but of course you'll lose all markup. If you only want to get rid of some of the Markdown syntax, it probably won't yield the result you're looking for.
The bottom line is that unrendered Markdown is intended to be used as plain text, therefore converting it to plain text doesn't really make sense. All Ruby implementations that I have seen follow the same interface, which does not offer a way to strip syntax (only including to_html, and text, which returns the original Markdown text).
It's not Ruby, but one of the formats Pandoc now writes is 'plain'. Here's some arbitrary markdown:

# My Great Work

## First Section

Here we discuss my difficulties with [Markdown](http://wikipedia.org/Markdown)

## Second Section

We begin with a quote:

> We hold these truths to be self-evident ...

then some code:

    #! /usr/bin/bash

That's *all*.

(Not sure how to turn off the syntax highlighting!) Here's the associated 'plain':

My Great Work
=============

First Section
-------------

Here we discuss my difficulties with Markdown

Second Section
--------------

We begin with a quote:

  We hold these truths to be self-evident ...

then some code:

    #! /usr/bin/bash

That's all.

You can get an idea of what it does with the different elements it parses out of documents from the definition of plainify in pandoc/blob/master/src/Text/Pandoc/Writers/Markdown.hs in the GitHub repository; there is also a tutorial that shows how easy it is to modify the behavior.
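For completeness, the 'plain' writer can also be called directly from Haskell. The sketch below assumes a pandoc 2.x API; markdownToPlain is a name chosen here for illustration, and the sample text is taken from the example above.

```haskell
module Main (main) where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.Pandoc

-- Convert Markdown to plain text: parse with the Markdown reader, then
-- render with the plain writer, which drops most inline markup.
markdownToPlain :: T.Text -> Either PandocError T.Text
markdownToPlain src = runPure $ do
  doc <- readMarkdown def { readerExtensions = pandocExtensions } src
  writePlain def doc

main :: IO ()
main = do
  let sample = T.pack "# My Great Work\n\nThat's *all*.\n"
  out <- handleError (markdownToPlain sample)
  TIO.putStr out
```

From Ruby, the equivalent would be to shell out to the pandoc executable with the plain output format, since this library interface is only available to Haskell programs.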
