I'm trying to convert certain pages of a docx to pdf using pandoc, but I can't find any sources hinting at where I should start. After looking through the pandoc documentation I still couldn't figure it out, so I assumed that pandoc doesn't support this.
This might just serve as confirmation for future readers: does pandoc support converting a page range?
Pandoc has no concept of pages.
Putting text on pages happens during rendering with Word and LaTeX, but pandoc does not render the text before converting. Therefore it cannot know on which page a specific letter will be placed.
I am using the Typora Markdown text editor, in which text is highlighted with the ==[...]== syntax. The same is true of many other Markdown editors such as Obsidian, Quilt, iA Writer, etc.
Is there a way to make pandoc convert the == highlighting when converting to a PDF file?
Sample.md
==Testing==
Then performing
pandoc Sample.md -o test.pdf
produces a pdf with "==Testing=="
The short answer is that there isn't one: highlighting syntax is currently not supported by pandoc. For more details, refer to the related discussion on the pandoc mailing list.
The long answer is that you could write a Lua filter or even a custom Markdown parser to add support for various features, but that's non-trivial in this case.
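As a rough alternative to a filter, the source could be preprocessed before pandoc sees it. The sketch below is not a pandoc feature; it assumes a hypothetical helper `convert_highlights` that rewrites `==text==` into raw LaTeX (`\hl{...}`, which needs the `soul` package in the LaTeX preamble) or into an HTML `<mark>` element. It is deliberately naive: it works on single lines only and does not skip code blocks.

```python
import re

# Matches ==text== on a single line where the text neither starts nor ends
# with whitespace, so comparisons such as "a == b" are left alone.
HIGHLIGHT = re.compile(r"==(?=\S)(.+?)(?<=\S)==")

# Output markup per target format. "\hl" assumes \usepackage{soul} is loaded.
TEMPLATES = {
    "latex": "\\hl{{{}}}",
    "html": "<mark>{}</mark>",
}


def convert_highlights(markdown, target="latex"):
    """Rewrite ==text== spans into markup that pandoc passes through as raw content."""
    template = TEMPLATES[target]
    return HIGHLIGHT.sub(lambda m: template.format(m.group(1)), markdown)


if __name__ == "__main__":
    print(convert_highlights("==Testing=="))          # LaTeX form for PDF output
    print(convert_highlights("==Testing==", "html"))  # HTML form
```

The rewritten Markdown could then be passed to pandoc as usual, with a header file containing `\usepackage{soul}` included via `-H` for PDF output. Because the highlighted text ends up inside raw LaTeX, any Markdown formatting inside the `==...==` span would not be processed.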
I'm playing around with pandoc to see whether it can reliably convert all aspects of a Word document to .md. It seems to handle a lot of things pretty well, such as tables of contents, images, etc. However, I would like to know whether it can also understand a diagram in a Word document that was made by combining multiple Word shapes, e.g. a diagram like the one below:
When I run "pandoc --extract-media=. my.docx -o my.md" to convert to .md, the Markdown document contains nothing related to the Word shapes. It looks like pandoc does not understand them. Is there any way to make pandoc smart enough to understand Word shapes?
No, pandoc cannot handle these. There are two issues for this on the pandoc issue tracker, #4735, and #2792.
In LaTeX one could for example have a nice inline equation like $x^2=4$, which in docx format I would be glad to have as italic text.
Is there a way to tell Pandoc to use one of these solutions depending on the output format?
While searching for a possible solution, I realized that pandoc has filters and templates, but I do not really understand which direction to follow.
I would really like to arrive at a more general solution that would also work for analogous tasks such as smaller spaces between a number and its unit: in LaTeX this is a straightforward $\;$, but including this in my Markdown document does not give a satisfactory result in DOCX or ODT output.
This is what I found in the pandoc manual:
For docx output, styles will be defined in the output file as inheriting from normal
text, if the styles are not yet in your reference.docx. If they are already defined,
pandoc will not alter the definition.
Please also read the --reference-doc=FILE part of the manual:
--reference-doc=FILE
Use the specified file as a style reference in producing a docx or ODT file.
...
How to use --reference-doc in pandoc:
Create an empty docx file and give it a name (e.g. refer.docx).
Define the styles you want to use in that file.
Add "--reference-doc=(refer.docx path)" to your pandoc command line.
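If the conversion is driven from a script, the steps above might be wrapped as follows. This is only a sketch: `pandoc_docx_command` is a hypothetical helper, and the file names `paper.md` and `paper.docx` are placeholders; only `--reference-doc` and `-o` come from pandoc itself.

```python
import shlex


def pandoc_docx_command(source, output, reference_doc):
    """Build the pandoc argument list for a docx conversion with a style reference."""
    return ["pandoc", source, f"--reference-doc={reference_doc}", "-o", output]


if __name__ == "__main__":
    cmd = pandoc_docx_command("paper.md", "paper.docx", "refer.docx")
    # Print the command as it would be typed in a shell; to actually run it,
    # pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Keeping the arguments as a list avoids quoting problems when the reference document path contains spaces.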
Is there any way to render MathML in an rst file with Sphinx?
I enabled the mathjax extension in conf.py. It works very well with LaTeX, for example:
However, if I replace it with MathML, it does not render it but instead displays all the XML code. For example,
produces
In Sphinx math is rendered by a mathjax extension. On https://www.mathjax.org/ mathjax claims they support MathML.
It's difficult to answer without knowing your desired output, but according to the documentation:
The "math" directive inserts blocks with mathematical content (display
formulas, equations) into the document. The input format is LaTeX math
syntax with support for Unicode symbols
So there is no way to use MathML with the .. math:: directive.
If your output is HTML, you can try the raw or literal directives instead to leave the block as is. The block will then be displayed correctly in a MathML-compliant browser. If you include the MathJax libraries with a correct configuration, MathJax will also process it.
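For sources that are generated by a script, the raw-directive approach could look roughly like this. `mathml_to_raw_directive` is a hypothetical helper, not part of Sphinx or docutils; it only wraps a MathML fragment in a `.. raw:: html` block so that it reaches HTML output unchanged.

```python
import textwrap


def mathml_to_raw_directive(mathml):
    """Wrap a MathML fragment in a reStructuredText raw-HTML directive."""
    # Directive content must be indented and separated from the directive
    # line by a blank line.
    body = textwrap.indent(mathml.strip(), "   ")
    return f".. raw:: html\n\n{body}\n"


if __name__ == "__main__":
    fragment = "<math><msup><mi>x</mi><mn>2</mn></msup></math>"
    print(mathml_to_raw_directive(fragment))
```

Content emitted this way is skipped by non-HTML builders, so it only helps when HTML is the target.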
I currently have a script that produces a large number of 3.5-inch-square SVG images. What I need is to be able to put these SVGs in a layout which can be easily and accurately printed.
I have tried using an HTML template, but HTML/CSS does not have sufficiently robust printing support.
What document layout language is most appropriate for handling SVG images, and how could this be implemented in a scripting language?
I use Ruby to generate my SVGs, and although preferable, it is not required that Ruby also be the language used to generate the print layout.
I'd suggest compiling all SVGs into a larger SVG, placing everything where you want it, and converting that to PDF using one of several options:
Using Inkscape on the command line, like
inkscape -f in.svg -A out.pdf
Using Batik
java -jar batik-rasterizer.jar -m application/pdf -d out.pdf in.svg
Using librsvg, like
rsvg-convert -f pdf -o out.pdf in.svg
(Probably the most lightweight option)
You might also be able to use the rsvg2 Ruby gem with a Cairo PDF surface. Documentation seems scarce or scattered, though.
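The "larger SVG" step might be sketched as below. Everything here is an assumption made for illustration: a US Letter page, a 2 by 3 grid with equal gaps, and a hypothetical helper `layout_page` that references each tile through an `<image>` element. Whether a given converter follows such external references can depend on its security settings, so inlining the tile contents may be necessary instead.

```python
from xml.sax.saxutils import quoteattr

TILE_IN = 3.5  # tile edge length in inches, as stated in the question


def layout_page(svg_paths, page_w=8.5, page_h=11.0, cols=2, rows=3):
    """Return a page-sized SVG document that places the given tiles on a grid."""
    if len(svg_paths) > cols * rows:
        raise ValueError("too many tiles for one page")
    # Distribute the leftover space evenly as margins and gutters.
    gap_x = (page_w - cols * TILE_IN) / (cols + 1)
    gap_y = (page_h - rows * TILE_IN) / (rows + 1)
    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" '
        'xmlns:xlink="http://www.w3.org/1999/xlink" '
        f'width="{page_w:g}in" height="{page_h:g}in" '
        f'viewBox="0 0 {page_w:g} {page_h:g}">'  # one user unit equals one inch
    ]
    for i, path in enumerate(svg_paths):
        row, col = divmod(i, cols)
        x = gap_x + col * (TILE_IN + gap_x)
        y = gap_y + row * (TILE_IN + gap_y)
        parts.append(
            f'  <image x="{x:g}" y="{y:g}" width="{TILE_IN:g}" '
            f'height="{TILE_IN:g}" xlink:href={quoteattr(path)}/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(layout_page(["a.svg", "b.svg", "c.svg"]))
```

The resulting page SVG can be written to a file and then handed to any of the converters listed above.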
If you have a budget, Prince is what you want. Since you've already tried to use CSS+HTML to get it working, you may have a working solution almost ready. Just generate the HTML, SVG, and CSS (use the CSS3 paged media extensions for best control), then pass it off to Prince to generate the PDF. I've used this for several projects, and it works great.
There are free options that work like Prince, notably wkhtmltopdf, but they might not respect your paging options as much as Prince.
Otherwise, you might be able to hack something together using Cairo, by creating a page-sized SVG image and laying it out, adding links to the multiple external SVG files.
Either of these options will end up generating a PDF, which is the only way to ensure that it will print the same no matter which browser or OS is being used.