I'm trying to convert my existing asciidoc documentation into pdf. Asciidoctor-pdf seems quite easy and I'm able to convert single files into pdf.
asciidoctor-pdf -a pdf-theme='./theme/styles.yml' -a pdf-fontsdir='GEM_FONTS_DIR, theme/fonts/' 01-intro.adoc
But my docs are spread across many files. I want to create a single pdf from all those files. Does anyone know how to do this?
Secondly, I don't want the generated pdf to be located next to the adoc file. I want to specify a target path.
I'd appreciate every hint. Thanks and best regards. Sebastian
(Dec 26, 2021)
The easiest and most convenient way is to use the VSCode editor with the AsciiDoc extension installed. This extension is developed by the same team that develops the Asciidoctor text processor. It is a GUI-based approach that covers both of your problems, so I'm pretty sure you're going to love it.
(Step 1) After the extension is installed, use the keyboard shortcut Cmd + , to go to the settings, enter asciidoc.use_asciidoctorpdf in the search bar and tick the check box.
(Step 2) To create a single PDF from multiple .adoc files, pull all of them into one .adoc file, using one include::path/to/file.adoc[] line per file.
(Step 3) Press F1, type "as pdf" and hit Enter to export this single .adoc file as a single PDF file. The export dialog lets you specify the target directory for the PDF. Be patient and wait a few seconds; the editor will inform you as soon as the export is complete.
Have you considered working with includes?
Just add this line to your document "01-intro.adoc" at any position:
include::02-next-file.adoc[]
When you build 01-intro.adoc with your regular command, the contents of 02-next-file.adoc are inserted at the position of the include line. Using this method, we create one file with many includes and just build that file. We're very happy with that.
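If there are many files, the master document can be generated instead of written by hand. Below is a minimal sketch, assuming all chapter files sit in one directory and should be included in name order. The function names, master.adoc and the output path are made up for illustration; the only asciidoctor-pdf option used is -o, which sets the output file and so also covers the "target path" part of the question.

```python
from typing import List


def build_master(chapters: List[str], title: str = "Documentation") -> str:
    """Return the text of a master document that includes each chapter in name order."""
    lines = ["= " + title, ""]
    for name in sorted(chapters):
        lines.append("include::" + name + "[]")
        lines.append("")  # blank line keeps the end of one file apart from the next
    return "\n".join(lines)


def pdf_command(master: str, out_pdf: str) -> List[str]:
    """Build the asciidoctor-pdf call; -o sets the path of the generated PDF."""
    return ["asciidoctor-pdf", "-o", out_pdf, master]
```

Write the returned text to a file such as master.adoc next to the chapters (includes are resolved relative to the including file), then hand the command list to subprocess.run. The -a pdf-theme=... and -a pdf-fontsdir=... attributes from the question can be added to the list in the same way.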
I'm downloading some newspapers as pdf (for posterity). One title is a pain: it includes URI links in the pdf itself, and if you accidentally click one it opens a browser tab to a page that 500s. It's not so bad on a desktop computer, but a pain in the butt if someone is reading it on a tablet. Each issue has approximately 200 of these links.
For a different title, it was as simple as using QPDF, like so:
qpdf --qdf --object-streams=disable file temp-file
This puts the temp version into postscript mode or something, and I was able to nuke the links with something like this:
s/obj\n<<\n( \/A <<\n \/S \/URI.+?)>>\nendobj/"obj\n<<\n" . " " x length($1). ">>\nendobj"/sge
This still works. However, a 15 meg original pdf now becomes a 108 meg "fixed" pdf. I can accept some bloat, but ending up at 720% of the original size is a bit absurd (I think the growth was more like 10% on the other title). Whenever I google for how to do this, I get results for Acrobat Reader and how you can click around in 20 menus to do it... does no one who uses Adobe products ever want to automate this stuff? There are between 180 and 300 links in a typical issue, spread across 45-150 pages (Sunday editions).
Are there any tools that can do this? Are there any clever arguments to qpdf that will make this more reasonable?
PS Yes I know it's hacky as hell to just overwrite the URIs with spaces, but I've never managed to figure out how to remove the objects entirely since their references also have to be removed.
You can do this with the community edition of cpdf: https://community.coherentpdf.com/
To remove all links in a PDF (well, to replace them with an empty link):
cpdf -replace-dict-entry /URI cpdfmanual.pdf -replace-dict-entry-value '""' -o out.pdf
This does not remove the annotations - it just makes sure that clicking on them won't go anywhere. It leaves the annotation in place, but with an empty link. You could replace with a working URL too, of course:
cpdf -replace-dict-entry /URI cpdfmanual.pdf -replace-dict-entry-value '"https://www.google.com/"' -o out.pdf
(You can also use -replace-dict-entry-search to replace only certain URLs - see the manual.)
Or, if you just want rid of all the annotations (link and non-link):
cpdf -remove-annotations in.pdf -o out.pdf
You can use HexaPDF (you need to have Ruby installed and then use gem install hexapdf to install HexaPDF) and the following small script to remove the links:
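require 'hexapdf'

# The PDF to process is passed as the first command-line argument
HexaPDF::Document.open(ARGV[0]) do |doc|
  doc.pages.each do |page|
    # Collect the link annotations first, then delete them from the page's /Annots array
    page.each_annotation.select {|annot| annot[:Subtype] == :Link}.each do |annot|
      page[:Annots].delete(annot)
    end
  end
  # Write the result next to the input file
  doc.write(ARGV[0] + '_processed.pdf', optimize: true)
end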
Then batch execute the script for all the files you want the links removed.
Note that this will remove all links.
Just to round off the options: I would suggest the best choice is probably a dedicated PDF command line tool, such as cpdf in the answer by johnwhitington, or a dedicated library like iText.
There are several alternative methods touted for the kind of batch text editing you are doing with qpdf.
"temp version into postscript mode or something,"
That mode converts the pdf into QDF, a plain, decompressed text/pdf hybrid, so that you can run sed or a similar stream editor on it. The key difference is that the edited file is still an editable QDF-1.0 file, so it needs converting back to a conventional PDF, in which the streams are binary and recompressed.
1) qpdf
At the end of an edit that bloats the file, the idea is to convert back to a regular application/pdf file using
fix-qdf file-temp.pdf>out.pdf
to tidy up the offsets and the cross-reference table, and then
qpdf --compress-streams=y out.pdf outfixed.pdf
to recompress the streams, which gives you outfixed.pdf.
Other cross platform means are using
2) pdftk
$ pdftk infile.pdf output outfile.pdf uncompress
edit with vim or whatever sed-style scripting method you prefer, then
$ pdftk outfile.pdf output fixedfile.pdf compress
3) mutool
mutool clean -d [options] input.pdf [output.pdf] [pages]
-d Decompress streams. This will make the output file larger, but provides easy access for reading and editing the contents with a text editor.
-i Toggle decompression of image streams. Use in conjunction with -d to leave images compressed.
-f Toggle decompression of font streams. Use in conjunction with -d to leave fonts compressed.
-a ASCII Hex encode binary streams. Use in conjunction with -d and -i or -f to ensure that although the images and/or fonts are compressed, the resulting file can still be viewed and edited with a text editor.
Whichever options you use need to be reversed when recompressing.
NOTE
Text editors can corrupt binary fonts and binary images, so watch for corruption whenever the editor changes the encoding or the line feeds. Even where pdftk has decompressed an image stream into simple text, beware: any change of end-of-line characters by the editor would break up that stream.
Additionally, text edits that are not a simple byte-wise "find and replace" can corrupt the xref table too badly for it to be reindexed during recompression. When using a text edit method, try to overwrite with the same number of characters.
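As a sketch of such a same-length overwrite, here is roughly the substitution from the question written in Python on raw bytes. It assumes the file is already in QDF form, that /A is the first key of each link annotation dictionary (as in the question's regex), and that the object ends with ">>" followed by "endobj"; the function name is made up for illustration. Working on bytes rather than text avoids the encoding and end-of-line changes warned about above.

```python
import re

# A dictionary object whose first key is a /URI action, as in the question's regex.
# Assumption: the file is in QDF form, so each object starts with "obj\n<<\n".
LINK_ACTION = re.compile(
    rb"(obj\n<<\n)(\s*/A <<\s*/S /URI.+?)(>>\nendobj)", re.DOTALL
)


def blank_uri_actions(data: bytes) -> bytes:
    """Overwrite the body of each matching object with spaces of equal length."""
    def repl(m):
        # Keep the opening and closing delimiters, blank everything in between
        return m.group(1) + b" " * len(m.group(2)) + m.group(3)
    return LINK_ACTION.sub(repl, data)
```

Because every match is replaced by exactly as many bytes as it removes, the offsets of all other objects stay where they were. Read and write the file in binary mode ("rb" / "wb") so nothing else gets touched.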
SIDE NOTE
Even if you remove the actions and the external hyperlinks, as long as the URL text is still present the reader will still offer that exploitable action. It is the same as with https://google.com written here, except that html will usually highlight it with a blue underline.
Hence, make sure the reader's security settings are switched on.
This question is motivated by the answer given in this question
Using the animate package without adobe
I want to create latex beamer presentations without relying on Adobe, as it is a pain.
I followed the instructions given in the post's answer, and when compiling the given example code, the output was 4 .svg files, and I have no idea what to do with them.
Something tells me they should be embedded into an html file that produces a slide presentation, but I'm a complete noob in html and I've not been able to find an answer on how to achieve this.
No additional wrapper for the individual .svg files is necessary. Simply open the first .svg file in your browser and use the little arrows at the top right for navigation. They automatically link to the next slide.
I need to maintain some slides in both latex beamer and in powerpoint. (This is to make the slides available to instructors elsewhere, too, 90% of whom do not know how to use latex and are unwilling to learn it. And I am a latex guy on linux.)
I have tried the route via Libreoffice (and opendocument), but this did not come out well. Right now, the best method that I have found is to author the pdf in beamer, then run it through a Nuance OCR program to get MS Word... and not even go all the way to Powerpoint (which is where I really need to be).
If I only had a markup language that produced nice Powerpoint, I could probably code a perl translator from markdown to this intermediate markup language. (Going from markdown to latex beamer is relatively easy.)
I don't think this exists, but hope springs eternal. After all, it is almost 2014 now. Does anyone know of a solution?
One solution is to use odpdown: it converts markdown to the OpenDocument presentation format (ODP, used by OpenOffice Impress), which can be imported into PowerPoint.
It is not yet complete, e.g. table support is missing, and it possibly does not run on certain Windows setups, but nevertheless it could be a start. Presumably you have Linux running, where it seems to work.
Steve Rindsberg's answer in the comments works on PP 2007! Let me repeat it here:
I suspect that PowerPoint is the likeliest solution. ;-) But what sort
of slides are you creating? If they're simple heading and bullet point
slides, all you need to produce is a simple text file. Any text that
starts in the left column will be the heading of a new slide. Indent
one tab and it becomes a first-level bullet point under the current
heading; indent two tabs, it becomes a second level bullet point and
so on. Simply use File | Open on the text file to pull it into PPT.
Steve: Is this all that PP converts? Or is there a reference of other "sneaky" markup that PP knows about?
(pandoc: unfortunately, the conversion from libreoffice to powerpoint is pretty poor when I tried it last. I also tried to save and understand the powerpoint xml format, but that was REAL bad.)
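Since the starting point here is markdown, the tab-indented outline that Steve describes can be generated from it. Below is a minimal sketch, assuming a deliberately tiny markdown subset: "# " lines are slide headings, and "- " or "* " bullets are nested with two spaces per level. The function name is made up, and anything else in the markdown is silently dropped.

```python
def markdown_to_outline(markdown: str) -> str:
    """Convert '# ' headings and nested '- ' bullets into a tab-indented outline."""
    out = []
    for raw in markdown.splitlines():
        if not raw.strip():
            continue  # blank lines carry no meaning in the outline
        if raw.startswith("# "):
            out.append(raw[2:].strip())  # left column: heading of a new slide
            continue
        stripped = raw.lstrip(" ")
        if stripped.startswith(("- ", "* ")):
            # Two spaces of markdown indentation correspond to one bullet level
            depth = (len(raw) - len(stripped)) // 2 + 1
            out.append("\t" * depth + stripped[2:].strip())
    return "\n".join(out) + "\n"
```

Save the result as a plain .txt file and pull it in with File | Open as described in the quote. This only covers headings and bullets, which is all the quoted trick claims to handle.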
The easiest way to handle this is to work with:
RStudio (and R if not already installed)
RMarkdown
Pandoc 2.0.5 (minimum)
Install those 3 (or 4) items, then read: https://bookdown.org/yihui/rmarkdown/powerpoint-presentation.html
The installation time is worth the time saved copy-pasting everything from scratch.
I am also a Linux guy and I also use LaTeX engines to create nice documents. Based on my experience, here's what you should do:
Stop writing directly in LaTeX and start using org-mode to write documents instead (I spent years writing in LaTeX and now that's over, except when I use the moderncv package)
Org supports latex math formulas and .org files are easily exported in .tex files
Org can also be easily exported in markdown
Once you have your markdown, there are several tools that will allow you to create a PowerPoint. Two of them are pandoc and md2pptx
For example if I have a business card design done in InDesign and now I need to provide print ready PDF for printers containing multiple copies of the business card. How would you do that? Are there any specific tools?
InDesign doesn't do imposition (placing of pages on one output page in a particular order).
You have to buy/find a tool, a plugin. Like croptima dot com.
Or on this page, there's some interesting stuff:
http://www.adobe.com/cfusion/exchange/index.cfm?l=6&s=5&o=desc&exc=19&cat=223&event=producthome
Alternatively do it by hand, or use a pdf imposition tool.
Good luck!
Do an export to PDF (with any marks you need). Get the file path. Open a text file and type in:
file
/myFile.pdf
/myFile.pdf
/myFile.pdf
/myFile.pdf
/myFile.pdf
/myFile.pdf
/myFile.pdf
…
Once that is done, go to InDesign, draw a frame that will host the pdf and run a data merge. You will get your imposition pretty much for free ;)
Loic
My bad, you need to mark the field as one that places image files by giving its name a leading arobase (@):
@pdfs
"/myFile.pdf"
"/myFile.pdf"
"/myFile.pdf"
…
And specify the absolute path to the file.
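Typing the same path dozens of times gets old quickly, so the data source file can be generated. Below is a minimal sketch, assuming the layout shown above: a header line with a leading @ followed by one quoted absolute path per copy. The function name and the default field name are made up for illustration, and /myFile.pdf stays a placeholder for your real absolute path.

```python
def data_merge_source(pdf_path: str, copies: int, field: str = "pdfs") -> str:
    """Return data merge source text: an @ image-field header plus one row per copy."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    rows = ["@" + field]  # leading @ marks the field as an image field
    rows += ['"' + pdf_path + '"'] * copies  # same file once per card on the sheet
    return "\n".join(rows) + "\n"
```

For example, data_merge_source("/myFile.pdf", 10) gives the text for a sheet of ten cards; save it as a .txt file and select it as the data source in the Data Merge panel.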
How many cards do you need to lay out? If only a few, you could just place the indd file into another document and duplicate the frames.
I didn't test it, but maybe you could draw a grid of frames and then place the InDesign file. Best case, if the grid is selected, the file is placed in every frame.
Loic
Does anyone know how to extract the character images from a font (ttf) file?
TTF is a vector format, so there are no character images stored in it, really. Load the font, select it into a device context (a memory one), render a character, and grab the bitmap.
Relevant APIs: AddFontResource, CreateFont, CreateDC, CreateBitmap, SelectObject, TextOut (or DrawText).
You can use GetGlyphOutline with GGO_BEZIER to get the shape of a single character.
For the sake of completeness I'd like to add a GUI and Python way to this pretty old thread.
If the goal is to extract images (e.g. as png) from a .ttf file, I found two pretty straightforward ways, which both involve the open-source program FontForge (see Link 1):
GUI Way (suitable for extracting a handful of characters): Open the .ttf file in FontForge and click on the character you want to export. Then: File -> Export -> Format: png
CLI / Python Way (suitable for automation): FontForge has a scripting API for Python 2.7, which allows you to automate the extraction of the images. Refer to the superuser thread (Link 2) for a complete script.
Link 1: https://fontforge.org/en-US/
Link 2: https://superuser.com/questions/1337567/how-do-i-convert-a-ttf-into-individual-png-character-images