Pandoc HTML to DOCX using template - pandoc

I am trying to convert HTML to a DOCX file using Pandoc, and I have created a reference DOCX file as a template to apply styles to the output.
During conversion, the hr tag is rendered as a double line in the DOCX output, since DOCX applies a double line to a horizontal rule by default.
I need to define a style for the horizontal rule in the DOCX template so that it is applied in the output document.
Any help is greatly appreciated.

Related

Trying to create rich text link to source citation for inline citation with pandoc citeproc

I am trying to produce output in which the URL part of a citation appears as a clickable rich-text link to the source file.
So far I have created a custom CSL file that outputs (TITLE, DATE) ([SOURCE LINK](URL)), hoping that pandoc would turn it into a rich-text link when converting to DOCX, HTML, PDF, etc.
However, when I run the following command, the link is emitted as escaped Markdown, which yields raw text rather than a rich link.
pandoc input.md -o output.md --citeproc --csl chicago-custom.csl --bibliography references.bib
I also tried HTML output, but that creates a link only for the URL, not for the "SOURCE LINK" text. The result looks like /[SOURCE LINK]/(URL) for Markdown output, or a linked URL for HTML output.
Is there a different approach than custom CSL to do this?

Is there a way to produce documentation only in HTML format with doxygen?

I know that doxygen creates LaTeX and HTML documentation. Is there a way to create only HTML-formatted documentation?

How to include images from the epub template with pandoc?

I've altered the epub template to display more information. It works fine, except when I specify images that refer to a local file. e.g. <img src = "my_file.png">. The code is there in the epub, but the image file isn't.
Pandoc does not parse the template as HTML, so it misses the <img> element when collecting media elements for inclusion in the EPUB. A quick and simple work-around is to list the missing images in some unused metadata field. E.g.,
---
missing-images: |
  ![](my_file.png)
---
Store the above in a file and pass it to pandoc via --metadata-file. This makes pandoc aware of the file, forcing its inclusion.
One could automate it by letting pandoc parse the template and extract the image information, e.g. with a pandoc Lua filter, but that's likely to be more trouble than it's worth.
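As an alternative to the Lua filter mentioned above, the metadata file can also be generated by a small script outside pandoc. The following is a minimal sketch, not a pandoc feature: it uses Python's standard html.parser to collect img sources from the template text. The function names, the decision to skip remote URLs, data URIs and template variables (anything containing "$"), and the field name default are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags are also routed here by default.
        if tag == "img":
            src = dict(attrs).get("src")
            if src and src not in self.sources:
                self.sources.append(src)


def is_local(src):
    """Assumption: only plain relative/local paths need to be listed."""
    remote = src.startswith(("http://", "https://", "//", "data:"))
    return not remote and "$" not in src  # "$" marks a template variable


def build_metadata(template_text, field="missing-images"):
    """Return YAML metadata listing each local image as a Markdown image."""
    parser = ImgSrcCollector()
    parser.feed(template_text)
    parser.close()
    lines = ["---", f"{field}: |"]
    lines += [f"  ![]({src})" for src in parser.sources if is_local(src)]
    lines.append("---")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical template snippet, used only for demonstration.
    sample = '<body><img src = "my_file.png"><img src="$cover$"></body>'
    print(build_metadata(sample), end="")
```

Write the returned text to a file and pass that file to pandoc via --metadata-file, as described above. The script has to be re-run whenever the images referenced in the template change.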

Is there any way to convert ckeditor html content to ms word document?

I have a Laravel project in which I need to create a doc/docx document from user input in CKEditor. I have previously used PHPWord to convert simple text input to a docx document. The problem is that CKEditor produces HTML with inline CSS (which I need), and PHPWord cannot convert that to docx.
I also tried converting the HTML to Word via XML, with no luck. I know there is a paid tool called phpdocx, but I am looking for a free solution.
Note that I can already convert the HTML to PDF, but I have found no way to go from PDF to doc.
So, is there any way to convert HTML to Word, or PDF to Word?
Thanks

convert pdf to html using abcpdf

I am looking for a way to convert a PDF document into a corresponding HTML document using ABCpdf. Kindly let me know if this is feasible. FYI, my PDF document has rich text along with images.
You can. Try this; hopefully it will work.
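// pdfBytes (a byte[] holding the PDF) and htmlPath are placeholders you supply.
var doc = new WebSupergoo.ABCpdf10.Doc();
doc.Read(pdfBytes);   // load the PDF from the byte array
doc.Save(htmlPath);   // a path ending in ".html" selects HTML export
doc.Clear();
doc.Dispose();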
For documentation please have a look at the note section
http://www.websupergoo.com/helppdfnet/source/5-abcpdf/doc/1-methods/save.htm
To export as XPS, PostScript, DOCX or HTML you need to specify a file path with an appropriate extension - ".xps", ".ps", ".docx", ".htm", ".html" or ".swf". If the file extension is unrecognized then the default PDF format will be used.
You can definitely convert HTML to PDF, but I am not sure the inverse is possible with ABCpdf.
Perhaps you could try iText (iTextSharp).
