pandoc generates InvalidUrlException when a link contains a full file path

Pandoc is a great tool for transforming files from one format to another. Among the many functions it provides, one interesting one is generating a self-contained, portable HTML file. This is very useful when you want to share your HTML files with your colleagues. However, it fails when a link contains a full file path. For example, the HTML file original.html contains the following HTML items:
<a href="file:///media/distribution/file_num.png" target="_blank" />
<img src="file:///media/distribution/file_num.png" /></a>
When I use pandoc original.html --self-contained -o transformed.html to generate a portable HTML file, the following error message is given:
pandoc: Could not fetch file:///media/distribution/file_num.png
InvalidUrlException "file:///media/distribution/file_num.png" "Invalid scheme"
Any ideas? Thanks.
EDIT:
I also tried to use pypandoc,
output = pypandoc.convert_file('data.html', 'html', outputfile="ddd.html", extra_args=['--self-contained'])
but the same error happens:
'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
RuntimeError: Pandoc died with exitcode "61" during conversion:
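One possible workaround, sketched here rather than confirmed against any particular pandoc version, is to rewrite the file:// URLs into plain local paths before handing the HTML to pandoc. The helper name file_urls_to_paths is hypothetical, and the sketch assumes POSIX-style paths such as /media/distribution/file_num.png:

```python
import re
from urllib.parse import unquote, urlparse

# Matches src="file://..." or href="file://..." with single or double quotes.
_FILE_URL_ATTR = re.compile(r'''\b(src|href)=(["'])(file://[^"']*)\2''')


def file_urls_to_paths(html: str) -> str:
    """Replace file:// URLs in src/href attributes with plain local paths.

    Hypothetical helper: assumes POSIX paths; other URL schemes are untouched.
    """

    def _swap(match: re.Match) -> str:
        attr, quote, url = match.groups()
        # urlparse keeps only the path part; unquote decodes %20 and friends.
        path = unquote(urlparse(url).path)
        return f"{attr}={quote}{path}{quote}"

    return _FILE_URL_ATTR.sub(_swap, html)
```

The rewritten text could then be saved to a temporary file and passed to pandoc with --self-contained as before.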

Related

How to include a latex document into a markdown using yaml metadata `include-after-body`

I want to include a LaTeX document in a Markdown .md file using the YAML metadata. I have two files in a directory:
markdown.md
---
title: test fuer pdf eingeschlossen
author: AUF
keywords: test
abstract: Versuch
publish: True
include-after-body: testlatex.tex
---
A simple example for a test.
and testlatex.tex
some text without sense
\begin{itemize}
\tightlist
\item
firstly, a fundamental human need is;
\item
secondly, a cost-effective technical mean,
\end{itemize}
I can include the testlatex.tex file into the body of the markdown.md on the commandline with
pandoc --pdf-engine=lualatex --toc -o test.pdf markdown.md --include-after-body=testlatex.tex
but the equivalent value in the YAML metadata seems to have no effect (author and title, however, are used). I thought that a value given on the command line and the same value in the YAML metadata would be equivalent. I checked the pandoc LaTeX template and see an include for include-after there, but I also wonder where the filename is converted to its content.
If I put
include-after: testlatex.tex
in the YAML metadata the name of the file is printed in the output, but the file content is not used!
Thank you for your help!
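The behaviour described above suggests that the file is only read when --include-after-body is given on the command line, while a metadata value is inserted as literal text. A minimal sketch of a wrapper that keeps the filename in the front matter and turns it into the command-line option might look like this. The function name include_after_body_args is hypothetical, and the parsing is deliberately naive (flat key: value lines only, no real YAML parser):

```python
def include_after_body_args(markdown_text: str) -> list[str]:
    """Return ['--include-after-body=<file>'] if the front matter names one.

    Hypothetical helper: only handles simple 'key: value' lines in a
    front matter block that starts on the first line with '---'.
    """
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter at the top of the file
    for line in lines[1:]:
        if line.strip() in ("---", "..."):
            break  # end of the YAML block
        key, sep, value = line.partition(":")
        if sep and key.strip() == "include-after-body":
            return [f"--include-after-body={value.strip()}"]
    return []
```

The returned list can be appended to the pandoc invocation shown above, for example through subprocess.run(["pandoc", "--pdf-engine=lualatex", "--toc", "-o", "test.pdf", "markdown.md", *args]).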

pandoc does not produce bibliography when biblio file is in YAML-metadata only

I assume that inserting a reference to a BibTeX bibliography in the YAML metadata is sufficient for the references to be produced. This is similar to the question "pandoc does not print references when .bib file is in YAML", which was perhaps misunderstood and has no accepted answer yet.
I have the example input file:
---
title: Ontologies of what?
author: auf
date: 2010-07-29
keywords: homepage
abstract: |
  What are the objects ontologists talk about.
  Inconsistencies become visible if one models real objects (cats) and children playthings.
bibliography: "BibTexExample.bib"
---
An example post. With a reference to [#Frank2010a] and more.
## References
I invoke the conversion to LaTeX with:
pandoc -f markdown -t pdf postWithReference.markdown -s --verbose -o postWR.pdf -w latex
The PDF is produced, but it contains no references, and the text is rendered as With a reference to [#Frank2010a] and more. This demonstrates that the reference file was not used. The title and author are inserted in the PDF, so the YAML metadata is read. If I add the reference file on the command line, the output is correctly produced with the reference list.
What am I doing wrong? I want to avoid specifying the bibliography file on the command line (it is duplication and violates DRY). Is there a general switch that demands bibliography processing and leaves the selection of the bibliography file to the document's YAML metadata?
More recent versions of pandoc (2.11 and later) require --citeproc instead of --filter=pandoc-citeproc.
The bibliography is inserted by the pandoc-citeproc filter. It is run automatically when the bibliography is set via the command line, but it has to be run manually in cases such as yours. Adding --filter=pandoc-citeproc will make it work as expected.
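Since the right option depends on the pandoc version, a small sketch can pick it automatically. The function name citeproc_args is hypothetical; the version string is assumed to be what the caller obtained elsewhere (for instance from the first line of pandoc --version), and 2.11 is taken as the release in which --citeproc replaced the external filter:

```python
def citeproc_args(pandoc_version: str) -> list[str]:
    """Choose the citation-processing option for a given pandoc version.

    Hypothetical helper: expects a dotted version string such as '2.9.2.1'.
    """
    # Compare only major and minor, e.g. '2.11.4' -> (2, 11).
    major_minor = tuple(int(part) for part in pandoc_version.split(".")[:2])
    if major_minor >= (2, 11):
        return ["--citeproc"]
    return ["--filter=pandoc-citeproc"]


def pandoc_command(source: str, output: str, pandoc_version: str) -> list[str]:
    """Build an argument list; the bibliography itself stays in the YAML."""
    return ["pandoc", source, "-s", *citeproc_args(pandoc_version), "-o", output]
```

With this, the bibliography file is named only once, in the document's metadata, and the command line merely switches citation processing on.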

How to split Pandoc template files into sub files

I would like to be able to split Pandoc template files into sub files using the \input{path-to-file} in the template file. When I use the \input command, I get the following error message when running pandoc -o output.pdf --template=default.latex --latex-engine=lualatex:
! LaTeX Error: Missing \begin{document}.
How do I properly split pandoc template files?
I don't think this is possible to achieve with Pandoc. I chose to switch to Panzer, a project which combines Pandoc with styles.

use pandoc to embed images into a docx file that are in a HTML

Is it possible to embed images into a docx file that are embedded in a HTML file?
I am trying and it's not working for me, and perhaps I am not adding some extra parameter when I am running pandoc.
pandoc -f html -t docx -o testdoc.docx image.html
Thank you very much!
I managed to solve this by executing the following command:
pandoc -s file_name.html -o file_name.docx;
There are actually two important points that you need to consider:
The quality of the output file depends largely on how pandoc interprets your HTML file, so if the source is fairly complex you should not expect high-quality output. For instance, the <hr/> tag is not recognized by pandoc, while the <p> tag is.
The path of the image must not be an HTTP path but a full disk path, meaning:
This is NO good:
<img src="http://www.example.com/images/img.jpg" />
And this is what pandoc can really read:
<img src="/var/www/example.com/images/img.jpg" />
HTH

dblatex ignore --texstyle or -s command

I want to write an AsciiDoc document and convert it into a PDF document. However, I want to use a formatting style different from the default ones. To do so, I convert the txt file to DocBook using asciidoc and then try to convert the resulting DocBook XML to a PDF file using dblatex.
The idea is to set a particular TeX style for dblatex to obtain the desired PDF result. I've copied the existing docbook.sty style, as recommended here, to make a small style modification. The only change made in the ./docbook.sty file is from \setlength{\textwidth}{18cm} to \setlength{\textwidth}{12cm}. However, when I run the command
dblatex --texstyle=./docbook.sty test.txt
Or the command
dblatex -s ./docbook.sty test.txt
Both produce the same result: no style change at all. No matter what modification I make to the ./docbook.sty file, it is not applied to the output. I always obtain the same result, a PDF with the default formatting. Do you have any idea where the problem is?
Thanks in advance.
I would recommend:
Copy the Dblatex docbook.sty to a new filename in your working directory which is "obviously yours" (e.g., mydbstyle.sty).
Continue to supply a full or relative path argument to the --texstyle option (e.g., /path/to/mydbstyle.sty or ./mydbstyle.sty). Failing to do so requires that mydbstyle.sty be in a directory enumerated by the TEXINPUTS environment variable (which you likely have not explicitly set).
Within mydbstyle.sty, use the following directives to initialize your style:
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{mydbstyle}[2013/02/15 DocBook Style]
\RequirePackageWithOptions{docbook}
% ...
% your LaTeX commands here
Pass a DocBook 4.5 XML file as an argument to Dblatex (in your example you are passing test.txt which makes me uncertain whether you're passing an AsciiDoc source file).
dblatex --texstyle=./mydbstyle.sty mybook.xml
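The two-step conversion described in the question (AsciiDoc source to DocBook XML, then DocBook XML to PDF with a custom style) can be sketched as a pair of command lines. The function name pdf_pipeline is hypothetical, and the sketch only builds the argument lists; it assumes the asciidoc and dblatex executables are on the PATH when the commands are eventually run:

```python
from pathlib import Path


def pdf_pipeline(source: str, style: str) -> list[list[str]]:
    """Return the asciidoc and dblatex commands for one source file.

    Hypothetical helper: the intermediate XML sits next to the source
    and shares its base name.
    """
    xml = str(Path(source).with_suffix(".xml"))
    return [
        # Step 1: AsciiDoc source -> DocBook XML.
        ["asciidoc", "-b", "docbook", "-o", xml, source],
        # Step 2: DocBook XML -> PDF, with the custom LaTeX style.
        ["dblatex", f"--texstyle={style}", xml],
    ]
```

Each list can be passed to subprocess.run(cmd, check=True) in order. The key point is that dblatex receives the XML file rather than the original .txt source.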
