I am trying to convert HTML to AsciiDoc using pandoc, but pandoc converts <br> tags into +\n instead of \n, as shown below. I also tried asciidoc-escaped_line_breaks, but nothing changed.
Terminal Command:
`pandoc +RTS -K100000000 -RTS --wrap=preserve -f html -t asciidoc-escaped_line_breaks "input.html" -o "output.asciidoc"`
input.html
s
<br>
s
output.asciidoc
s +
s
Expected Output:
s
s
Version: pandoc 1.19.2.4
The escaped_line_breaks extension is currently only implemented for markdown, not for AsciiDoc.
You could use a pandoc Lua filter like the following to strip all LineBreak elements from the document:
function LineBreak()
  return {}
end
Save this to e.g. strip-linebreaks.lua. Note that pandoc 1.19.2.4 is really old; Lua filters need pandoc 2.0 or newer. Then:
pandoc -f html --lua-filter strip-linebreaks.lua -t asciidoc
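The filter above removes the breaks entirely, so adjacent text may run together. If the goal is the expected output from the question (a plain newline where the <br> was), one possible variant is to replace each LineBreak with a SoftBreak and keep --wrap=preserve. This is a sketch under that assumption, not something checked against every pandoc release; the file name softbreak.lua and the temporary directory are made up for illustration.

```shell
# Sketch: write a variant filter and a sample input to a temporary directory.
workdir=$(mktemp -d)

cat > "$workdir/softbreak.lua" <<'EOF'
-- Assumed variant: turn every hard line break into a soft break
-- instead of deleting it.
function LineBreak()
  return pandoc.SoftBreak()
end
EOF

printf 's\n<br>\ns\n' > "$workdir/input.html"

# Lua filters need pandoc 2.0 or newer; skip the run when pandoc is missing.
if command -v pandoc >/dev/null 2>&1; then
  pandoc --wrap=preserve -f html -t asciidoc \
    --lua-filter "$workdir/softbreak.lua" "$workdir/input.html" \
    || echo "pandoc run failed (Lua filters need pandoc 2.0+)" >&2
fi
```

With --wrap=preserve, pandoc writes soft breaks as newlines, which is why that option from the original command is kept here.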
While other markdown implementations have a switch to escape HTML, I couldn't find one for Pandoc.
I want Pandoc to convert HELLO <blink>WORLD</blink> to <p>HELLO &lt;blink>WORLD&lt;/blink></p>.
Kramdown and Maruku don't seem to support this, how about Pandoc?
You can disable the raw_html extension by converting with this command:
pandoc -f markdown-raw_html -t html
Note that the output does not exactly match your expected output, because it also transforms > to &gt;.
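To see the difference, the same input can be run through pandoc with and without the extension. This is a small sketch; the variable names input and escaped are only for illustration, and the commands are skipped if pandoc is not installed.

```shell
# Compare pandoc's default HTML output with raw_html disabled.
input='HELLO <blink>WORLD</blink>'
escaped=''

if command -v pandoc >/dev/null 2>&1; then
  # Default markdown reader: the <blink> tags pass through as raw HTML.
  printf '%s\n' "$input" | pandoc -f markdown -t html

  # raw_html disabled: the tags are treated as plain text and escaped.
  escaped=$(printf '%s\n' "$input" | pandoc -f markdown-raw_html -t html)
  printf '%s\n' "$escaped"
fi
```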
This is my problem: I have a ".csv" file whose delimiter is a tab and whose content is "Little-endian UTF-16 Unicode text". If I open it with the LibreOffice GUI, it imports successfully.
If I try via the shell with
unoconv -f ods -e FilterOptions="9,34,UNICODE,1" [FILE]
the result is a file with no column separation. What's wrong?
And what is the best shell command to convert this ods file to a well-formed csv (Unicode UTF-8, comma separated, etc.)?
This is my final solution:
iconv -f UTF-16 -t UTF-8 /original/folder/file.csv > /tmp/file.csv
unoconv -f ods -i FilterOptions="9,34,UNICODE,1" /tmp/file.csv
unoconv -f csv -o /original/folder/file.csv -i FilterOptions="9,34,UNICODE,1" /tmp/file.ods
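The iconv step can be checked on its own before unoconv is involved. The sketch below builds a tiny UTF-16LE, tab-delimited sample (the sample bytes and the temporary paths are made up for illustration) and confirms that the converted file is plain UTF-8 with the tab still in place.

```shell
# Build a tiny UTF-16LE sample: BOM (FF FE), then "a<TAB>b<LF>" as 2-byte units.
tmpdir=$(mktemp -d)
printf '\377\376a\000\t\000b\000\n\000' > "$tmpdir/utf16.csv"

# Same iconv call as in the solution, applied to the sample file.
iconv -f UTF-16 -t UTF-8 "$tmpdir/utf16.csv" > "$tmpdir/utf8.csv"

# Dump the result byte by byte: one byte per character, tab preserved.
od -c "$tmpdir/utf8.csv"
```

If the od output still shows \0 bytes between the characters, the conversion did not happen and unoconv would see the same input as before.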
Bullet point 18 of http://pandoc.org/demos.html#examples shows how to change the syntax highlighter used by giving an argument to --highlight-style. For example:
pandoc code.text -s --highlight-style pygments -o example18a.html
pandoc code.text -s --highlight-style kate -o example18b.html
pandoc code.text -s --highlight-style monochrome -o example18c.html
pandoc code.text -s --highlight-style espresso -o example18d.html
pandoc code.text -s --highlight-style haddock -o example18e.html
pandoc code.text -s --highlight-style tango -o example18f.html
pandoc code.text -s --highlight-style zenburn -o example18g.html
I am wondering if these are the only color schemes available. If not, how can I load a different syntax highlighter? Can I define my own?
Since pandoc 2.0.5, you can also use --print-highlight-style to output a theme file and edit it.
To me, the best way to use this option is to:
1. Pick a pleasant available style
2. Output its theme file
3. Edit the theme file
4. Use it!
1. Available Styles
Pick a style among the existing ones; pandoc --list-highlight-styles prints their names.
2. Output its theme file
Once you have decided which style is closest to your needs, you can output its theme file (here for pygments, the default style):
pandoc --print-highlight-style pygments
so that you can store this style in a file, using, e.g.,
pandoc --print-highlight-style pygments > my_style.theme
With some shells, especially on Windows, using redirected output can lead to encoding problems. If that happens, use this instead:
pandoc -o my_style.theme --print-highlight-style pygments
3. Edit the file
Using the Skylighting JSON Themes guide, edit the file according to your needs and taste.
4. Use the file
From the folder that contains the theme file, just use
pandoc my_file.md --highlight-style my_style.theme -o doc.html
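Steps 2 and 4 can be strung together in a small script. This is only a sketch: the temporary directory, sample.md and its contents are placeholders, and step 3 (editing the theme) still has to be done by hand.

```shell
workdir=$(mktemp -d)
theme="$workdir/my_style.theme"

if command -v pandoc >/dev/null 2>&1; then
  # Step 2: dump the built-in pygments theme (needs pandoc 2.0.5 or newer).
  pandoc -o "$theme" --print-highlight-style pygments \
    || echo "could not print the theme; is pandoc older than 2.0.5?" >&2
fi

if [ -s "$theme" ]; then
  # Step 3 would happen here: edit "$theme" in a text editor.

  # Step 4: render a small sample document with the theme file.
  printf '%s\n' '~~~python' 'print("hello")' '~~~' > "$workdir/sample.md"
  pandoc "$workdir/sample.md" -s --metadata title=sample \
    --highlight-style "$theme" -o "$workdir/doc.html"
  echo "wrote $workdir/doc.html"
fi
```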
If your pandoc --version indicates a release of 1.15.1 (from Oct 15, 2015) or newer, then you can check if the --bash-completion parameter works for you to get a full list of available built-in highlighting styles.
Run
pandoc --bash-completion
If it works, you'll see a lot of output. And it will be useful well beyond the original question above...
If --bash-completion works, then put this line towards the end of your ${HOME}/.bashrc file (on Mac OS X or Linux -- doesn't work on Windows yet):
eval "$(pandoc --bash-completion)"
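If the same .bashrc is shared between machines where pandoc may be missing or too old, the line can be wrapped in a guard. The function name enable_pandoc_completion is hypothetical; this is one possible way to do it, not something pandoc itself prescribes.

```shell
# Hypothetical guarded variant for ~/.bashrc: only enable completion when
# bash is the running shell and this pandoc supports --bash-completion.
enable_pandoc_completion() {
  [ -n "${BASH_VERSION:-}" ] || return 1
  command -v pandoc >/dev/null 2>&1 || return 1
  pandoc --bash-completion >/dev/null 2>&1 || return 1
  eval "$(pandoc --bash-completion)"
}

# Failing quietly keeps the rest of the startup file working.
enable_pandoc_completion || true
```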
Once you open a new terminal, you can use the pandoc command with "tab completion":
pandoc --h[tab]
will yield
--help --highlight-style --html-q-tags
pandoc --hi[tab]
will yield
pandoc --highlight-style
Answer to original question:
Now punch the [tab] key one more time, and you'll see
espresso haddock kate monochrome pygments tango zenburn
It's the list of all available highlighting styles. To shorten the procedure, you could also type
pandoc --hi[tab][tab]
to get the same result.
Usefulness of Pandoc's tab completion beyond original question:
Pandoc's bash tab completion also works for all other command line switches:
pandoc -h[tab]
yields this -- a list of all possible command line parameters:
Display all 108 possibilities? (y or n)
--ascii --indented-code-classes --template
--asciimathml --jsmath --title-prefix
--atx-headers --katex --to
--base-header-level --katex-stylesheet --toc
--bash-completion --latex-engine --toc-depth
--biblatex --latex-engine-opt --trace
--bibliography --latexmathml --track-changes
--chapters --listings --variable
--citation-abbreviations --mathjax --verbose
--columns --mathml --version
--csl --metadata --webtex
--css --mimetex --wrap
--data-dir --natbib --write
--default-image-extension --no-highlight -A
--dpi --no-tex-ligatures -B
--dump-args --no-wrap -D
--email-obfuscation --normalize -F
--epub-chapter-level --number-offset -H
--epub-cover-image --number-sections -M
--epub-embed-font --old-dashes -N
--epub-metadata --output -R
--epub-stylesheet --parse-raw -S
--extract-media --preserve-tabs -T
--file-scope --print-default-data-file -V
--filter --print-default-template -c
--from --read -f
--gladtex --reference-docx -h
--help --reference-links -i
--highlight-style --reference-odt -m
--html-q-tags --section-divs -o
--id-prefix --self-contained -p
--ignore-args --slide-level -r
--include-after-body --smart -s
--include-before-body --standalone -t
--include-in-header --tab-stop -v
--incremental --table-of-contents -w
One interesting use case for Pandoc's tab completion is this:
pandoc --print-default-d[tab][tab]
lists the possible completions for pandoc --print-default-data-file. This list gives you a unique insight into which data files your instance of Pandoc loads when it does its work. For example, you could investigate a detail of Pandoc's default ODT (OpenDocument Text file) output styling like this:
pandoc --print-default-data-file odt/content.xml \
| tr " " "\n" \
| tr "<" "\n" \
| grep --color "style"
The Pandoc README says:
--highlight-style=STYLE|FILE
Specifies the coloring style to be used in highlighted source code.
Options are pygments (the default), kate, monochrome,
breezeDark, espresso, zenburn, haddock, and tango.
For more information on syntax highlighting in pandoc, see
Syntax highlighting, below. See also
--list-highlight-styles.
Instead of a STYLE name, a JSON file with extension
.theme may be supplied. This will be parsed as a KDE
syntax highlighting theme and (if valid) used as the
highlighting style. To see a sample theme that can be
modified, pandoc --print-default-data-file default.theme.
The library skylighting (in older versions highlighting-kate) is used for the highlighting. If you don't like any of the provided color schemes, you can:
specify a .theme file as mentioned above,
when exporting to HTML, style the generated <span> tags with your custom CSS, or
when exporting to LaTeX/PDF, use a custom Pandoc LaTeX template and replace the $highlighting-macros$ part with your custom color definitions, as described in this issue.
If you are using Pandoc version 1.18 (released in October 2016) or later, a new answer is possible:
pandoc --list-highlight-languages
and
pandoc --list-highlight-styles
will give you all the info you were asking for.
Other new informational command line parameters added in v1.18 are:
pandoc --list-input-formats
pandoc --list-output-formats
pandoc --list-extensions
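With --list-highlight-styles, the per-style commands from the question no longer need to be typed out one by one. The loop below is a sketch that renders one HTML file per available style; the sample file contents and the example-STYLE.html naming are made up for illustration.

```shell
outdir=$(mktemp -d)
printf '%s\n' '~~~python' 'print("hello")' '~~~' > "$outdir/code.text"

if command -v pandoc >/dev/null 2>&1; then
  # --list-highlight-styles prints one style name per line (pandoc 1.18+).
  pandoc --list-highlight-styles 2>/dev/null | while IFS= read -r style; do
    pandoc "$outdir/code.text" -s --metadata title="$style" \
      --highlight-style "$style" -o "$outdir/example-$style.html"
  done
  ls "$outdir"
fi
```

Opening the generated files side by side in a browser is a quick way to compare the styles.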
How do I convert RTF (say from stdin) to Markdown with a command line tool under UNIX/OSX?
I am looking for something like pandoc. However pandoc itself does not allow RTF as an input format. :-( So, I'd be happy either with a similar tool to pandoc or a pointer to an external RTF reader for pandoc.
On Mac OSX I can use the pre-installed textutil command for the RTF-to-HTML conversion, then convert via pandoc to markdown. So a command line which takes RTF from stdin and writes markdown to stdout looks like this:
textutil -stdin -convert html -stdout | pandoc --from=html --to=markdown
Using Ted and pandoc together, you should be able to do this:
Ted --saveTo text.rtf text.html
pandoc --from=html --to=markdown --output=text.md < text.html
Pandoc now supports RTF as an input format, so you can use:
cat file.rtf | pandoc --from=rtf --to=markdown
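Since the right command depends on the installed pandoc version, a small wrapper can pick one at run time. This is a sketch: the function name rtf_to_markdown is made up, and it assumes that pandoc --list-input-formats (available since 1.18) lists rtf when the reader exists, falling back to the textutil pipeline from the other answer otherwise.

```shell
# Hypothetical helper: read RTF on stdin, write Markdown to stdout.
rtf_to_markdown() {
  if command -v pandoc >/dev/null 2>&1 &&
     pandoc --list-input-formats 2>/dev/null | grep -qx 'rtf'; then
    # This pandoc has a native RTF reader.
    pandoc --from=rtf --to=markdown
  elif command -v textutil >/dev/null 2>&1 &&
       command -v pandoc >/dev/null 2>&1; then
    # macOS fallback: RTF to HTML with textutil, then HTML to Markdown.
    textutil -stdin -convert html -stdout | pandoc --from=html --to=markdown
  else
    echo "no RTF-capable converter found" >&2
    return 1
  fi
}

# Try it on a minimal RTF document; the failure case is tolerated here.
printf '%s' '{\rtf1\ansi Hello}' | rtf_to_markdown || true
```

Typical use would be rtf_to_markdown < file.rtf > file.md.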