Sphinx with rst2odt.py - python-sphinx

I really like a lot of the features of Sphinx, but I also require the final output to be docx. I've had great luck with .rst -> .odt -> .docx using rst2odt.py (docutils) and LibreOffice. I noticed that Sphinx can generate "Docutils XML". In my mind, I should be able to use that and then go to .odt through the same mechanism as rst2odt.py does. However, I'm not sure how I would go about that. I realize that there is a package sphinxcontrib-docxbuilder, but it uses python-docx internally which has rather limited table support based on my experiments. I am using rst specifically because of the ability to do column/row spanning in a very clean way.
The alternative that I'm currently considering is to use something like jinja2 to do all the stuff I would need Sphinx for and stick stick with rst2odt.py.

Related

How to add listings or images to the table of contents (TOC)

I have a couple of examples (all with titles) and I'd like to create an index/list out of them automatically.
An example can be seen in the chunked AsciiDoc User Guide table of contents (or beneath):
The asciidoc source of the AsciiDoc User Guide does not show anything specific to me for Asciidoc itself, I could find the following hint to Docbook:
DocBook toolchains will normally automatically number examples and generate a 'List of Examples' backmatter section.
I'm looking for the (asciidoctor?) standard html5 rendering, but I'm open for different suggestions.
Adding the :doctype: book attribute alone does not do it. So I merely hit dead ends not knowing if it is possible at all. Also I'm new to Asciidoc so I might just miss some pointers, too.
The Python Asciidoc repo includes the a2x tool, which is a wrapper around a DocBook toolchain. It is DocBook that is producing these entries in the table of contents. Neither Python Asciidoc, nor asciidoctor, can do this out of the box.
You would need to curate the lists manually, or create a macro that does the curation for you. This thread might prove helpful: https://github.com/asciidoctor/asciidoctor-extensions-lab/issues/111

Conversion between knitr and sweave

This might have been asked before, but until now I couldn't find a really helpful answer for me.
I am using R Studio with knitr and a colleague of mine who I need to cooperate with uses the sweave format. Is there a good way to convert a script back and forth between these two?
I have already found "Sweave2knitr" and hoped this would have an .rmd as output with all chunks changed (<<>> to {} etc.) but this is not the case. My main problem is that I would also need the option to convert from .rmd back to .rnw so that my colleague can also re-edit my work-over.
Thanks a lot!
To process the code chunks and convert the .Rnw file to .tex, you use the knit() function in the knitr package rather than Sweave().
R -e 'library(knitr);knit("my_file.Rnw")'
Sweave2knitr() is for converting old Sweave-based .Rnw files to the knitr syntax.
In Program defaults change :
Weave Rnw files using Sweave or knitr
The Rnw format is really LaTeX with some modifications, whereas the Rmd format is Markdown with some modifications. There are two main flavours of Rnw, the one used by Sweave being the original, and the one used by knitr being a modification of it, but they are very similar.
It's not hard to change Sweave flavoured Rnw to knitr flavoured Rnw (that's what Sweave2knitr does), but changing either one to Rmd would require extensive changes, and probably isn't feasible: certainly I'd expect a lot of manual work after the change.
So for your joint work with a co-author, I would recommend that you settle on a single format, and just use that. I would choose Rmd for this: it's much easier for your co-author to learn Markdown than for you to learn LaTeX. (If you already know LaTeX, that might push the choice the other way.)

How can one create a polyglot PDF?

I like reading the PoC||GTFO issues and one thing I found remarkable when I first discovered it, was the "polyglot" nature of their PDF files.
Let met explain: when you consider for example their 8th issue, you may unzip files from it; execute the encryption they are talking about by running it as a script and even better(worse?) with their 9th issue you can even play it as a music file!
I'm currently in the process of writing small scripts every week and writing each time a little one page PDF in LaTeX to explain the said scripts. So I would really enjoy being able to create the same kind of PDF files. Sadly they explained (partly) in their first issue how to include zip files, but they did so through three small sketches of cmd lines without actual explanations.
So my question is basically :
how can one create such a polyglot PDF file containing stuff like a zip as well as being a shell script which may be run using arguments just like normal scripts?
I'm asking here about the process of creation, not just an explanation of how this is possible. The ideal way for me would that there are already some scripts or programs allowing to create easily such PDF files.
I've tried to search the net for the keywords "polyglot files" and others of the kind and wasn't able to find any useful matches. Maybe this process has another name?
I've already read the presentation by Julia Wolf which explains how things works, but I sadly haven't had time to apply the knowledge there to real world, because I'm sadly not used to play with file headers and the way a PDF is constructed.
EDIT:
Okay, I've read more and found the 7th edition of PoC||GTFO to be really informative concerning this subject. I may end up being able to create my own scripts to do such polyglot PDF files if I have some more time to consider it.
I played around with polyglots myself after attending Ange's talks and also talking to him in person. You really need to understand the file formats to be able to nest them into each other.
However, long story short, here are some links I found extremely useful for creating polyglots:
Some older Google Code Trunk
PoC of the polyglot stuff
Especially the second link (to github) will help you creating polyglots, but also understanding how they are working and how they are implemented. Since it is mostly Python stuff and very well / clean written, it is very useful and easy to follow.
I feel dissecting some file formats would be a good place to start. You can find many file format specifications for different file types through Google, but they can be a tough read and will likely take you some time to translate into whatever language you are using.
PDF: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
ELF: https://www.cs.cmu.edu/afs/cs/academic/class/15213-s00/doc/elf.pdf
ZIP: http://kat.sdf.org/zip_file_format.txt
The language(s) you select will need a way to read and write raw bytes (not just ascii alphanumeric), so perhaps C would be good for more direct access to memory. Some Python tricks could help with open sourcing the scripts easily.
To dissect the files, you may want to build a tool kinda like https://github.com/kvesel/zipbrk/ to take them apart, then put them all back together in a polyglot format. For example, zip does not require the section headers to be at the start (or even contiguous for that matter), and PDF magic number can appear in multiple places within the file as well. I also believe I recall a polyglot tool being included in one of the PoC||GTFO publishings (maybe issue 8 or 2??) as a polyglot in the pdf file.
Don't forget the hackers bible! :)
https://nostarch.com/gtfo

How to get list of figures in Asciidoc

I am using asciid for an article. In the end of my document I want to have a list of figures. How to I create a list of figures? Did not find something useful in the documentation for me.
Nope there isn't one at the time of answer. I checked the docs (which you indicated you did as well) and I also grepped the codebase. There is good news though! You should be able to do this with an extension.
Extensions can be written in any JVM language if you're using asciidoctorj, or in Ruby if you're using the core asciidoctor (I'm not sure about JavaScript for asciidoctorjs). You'll need to create two extensions probably: a TreeProcessor extension to go through the whole AST looking for images and pulling them out into a storage structure. Then you'll also need to create either an inline or block macro to actually place it within the page.
I strongly recommend examining the API for the nodes and functions you'll want to make use of. There are some other examples of processors that may also be helpful to examine.

What simple syntax can be used for rich text?

I want in an application with a simple text input, enriched with some marks to include formatting or semantic labeling. I want the syntax as easy as possible and I want to include self-defined labels.
Example:
[bold]Stackoverflow[/bold] is a [tag]good[/tag] resource for programmers.
Tables would be needed too.
HTML/XML and LaTeX are mighty enough to allow this, but too complicated. Wiki-Syntax seems simple, but uses another symbol for each markup, has unclear quoting and every Wiki seems to have another syntax. For tables and similar stuff Wiki becomes very complicated.
Exists a language/syntax, that matches my needs or can be slightly changed to do so? Or do I have to invent something myself? In that case, do you have suggestions?
Definitely do NOT invent your own. There are plenty of simple markup languages already, and users HATE learning new ones. Trust me on this!
I would suggest using one of the following:
Textile
Markdown
BBCode
Make your decision based on your userbase, as well as what tools and parsers are available in your chosen language. For my site, we went with Textile, but I've found that BBCode tends to be the language that most people already know. However, this will vary with different user demographics.
StackOverflow, along with several other sites, uses Markdown. I think it will give you the best balance between features and simplicity.
Let me add ReStructuredText to the list.
An additional benefit of using it is given by the availability of ReStructuredText to Anything service that makes extremely easy to create HTML or PDF versions of the document.
As already pointed out there are a lot of lightweight markup languages (many are listed here: wikipedia article), there should be no need of creating your own.

Resources