Can I output multiple PDFs for different page ranges (or using some sort of delimiter) in RStudio?
Here's a trick I use in case you can't find an easier way (on Ubuntu, after installing pdftk):
Alongside the Rmd file, I create an R script that takes the PDF generated from the Rmd file and splits it into smaller PDFs.
Example:
# 1 KNIT THE RMD FILE AND GENERATE A SINGLE PDF WITH ALL THE PAGES
rmarkdown::render('~/my_rmd_file.Rmd')
# 2 CUT THE FIRST 5 PAGES OF THE PDF
# 2.1 make up a name for the smaller pdf:
name_for_the_top5pages_pdf <- "my_rmd_file_top5.pdf"
# 2.2 compose the command that edits the pdf (run it from the folder containing my_rmd_file.pdf):
cmd_extract_first_5_pages <- paste0("pdftk my_rmd_file.pdf cat 1-5 output ", name_for_the_top5pages_pdf)
# 2.3 run the command
system(cmd_extract_first_5_pages)
This keeps the original PDF and creates another one containing the first 5 pages.
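To get several PDFs for different page ranges, the same pdftk call can be repeated once per range. Here is a minimal shell sketch, assuming pdftk is on the PATH; the helper names range_output_name and split_by_ranges and the output naming scheme are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: derive an output name such as my_rmd_file_1-5.pdf
# from an input PDF name and a pdftk page range.
range_output_name() {
    input=$1
    range=$2
    printf '%s_%s.pdf\n' "${input%.pdf}" "$range"
}

# Hypothetical helper: write one PDF per page range.
# The original file is left untouched.
split_by_ranges() {
    input=$1
    shift
    for range in "$@"; do
        pdftk "$input" cat "$range" output "$(range_output_name "$input" "$range")"
    done
}

# Example call (assumes my_rmd_file.pdf exists in the current folder):
# split_by_ranges my_rmd_file.pdf 1-5 6-12 13-end
```

From R, the same idea should work by calling system() once per range instead of once in total.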
Related
I'm using the following command to convert 5 files (1 Markdown, 4 HTML) to PDF using Pandoc:
pandoc --toc --latex-engine=xelatex ${SOURCE_DIR}/* -o ${DST_DIR}/${DST}.pdf
It successfully does so, but in whatever order it wants. Is there any way to specify the order in which these files are added to the single PDF file?
The shell expands the * glob alphabetically by file name, and pandoc processes its input files in the order they are given, so renaming the files is a workaround.
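Since pandoc reads its inputs in argument order, another option is to list the files explicitly instead of relying on the glob. Here is a rough shell sketch; render_in_order is a hypothetical helper, the PANDOC variable is an invented override that allows a dry run with echo, and the file names in the example call are placeholders. Note that pandoc 2.0 and later renamed --latex-engine to --pdf-engine, so the flag may need adjusting for your version:

```shell
#!/bin/sh
# Hypothetical helper: render the named files from a source directory
# into one PDF, in exactly the order they are listed.
render_in_order() {
    src=$1
    dst=$2
    shift 2
    # Prefix each name with the source directory, preserving argument order.
    for name in "$@"; do
        set -- "$@" "$src/$name"
        shift
    done
    # PANDOC is an assumed override (e.g. PANDOC=echo for a dry run).
    "${PANDOC:-pandoc}" --toc --latex-engine=xelatex "$@" -o "$dst"
}

# Example call with placeholder file names:
# render_in_order "$SOURCE_DIR" "$DST_DIR/$DST.pdf" intro.md part1.html part2.html
```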
I am diffing two CSV files and writing the result to a single HTML file with the command below:
vimdiff Sheet1.csv Sheet2.csv -c TOhtml -c 'w! CsvResult.html' -c 'qa!'
In the command above, the two CSV files (Sheet1 and Sheet2) are on my Windows desktop, and CsvResult.html is the file that shows the diff of both CSVs in HTML format.
The generated HTML file (CsvResult.html) is hard to read because of its colors.
How can I change the foreground, background, and text colors of the generated HTML file? I tried commands for changing the text color, but they are not applied to the generated file.
I recently discovered that concatenating text to the end of a PDF file does not change properties of the PDF file. This may be a very silly question, but if a program were concatenated to the PDF file, could it somehow be executed?
For example, opening this PDF file would create a text file in the home directory with the words "hello world" in it.
*pdf contents*...
trailer^M
<</Size 219/Root 186 0 R/Info 177 0 R/ID[<5990BFFB4DF3DB26CE6A92829BB5C41B> <B35E036CA0E7BA4CBF39B3D74DCE4CAF>]/Prev 4494028 >>^M
startxref^M
4663747^M
%%EOF^M
#!/bin/bash
echo "hello world" > ~/hello.txt
Would this work with a different file format? Does the embedded code need to be a binary executable?
Fortunately, that's not part of the standard, so you can't do that.
Unfortunately, the standard does support "launch actions", which execute arbitrary code after user confirmation. These are now disabled by default and can't execute embedded blobs, but if enabled they could be used to run arbitrary code that finds and executes the code embedded in the PDF.
The standard also supports JavaScript, which executes sandboxed, but a reader-specific bug may allow escaping the sandbox.
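Appended data like the script in the question sits after the final %%EOF marker, where a reader does not execute it. One rough way to spot such a payload is to count the bytes that follow the last marker. This is a sketch assuming a grep that supports -a, -b and -o (GNU grep does); trailing_bytes is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print how many bytes follow the last %%EOF marker.
# Returns non-zero if the file contains no %%EOF at all.
trailing_bytes() {
    file=$1
    size=$(wc -c < "$file")
    # Byte offset of the last %%EOF occurrence in the file.
    last=$(grep -abo '%%EOF' "$file" | tail -n 1 | cut -d: -f1)
    [ -n "$last" ] || return 1
    # Subtract the offset and the 5 bytes of the marker itself.
    echo $((size - last - 5))
}

# Example call:
# trailing_bytes suspicious.pdf
```

A result of 1 or 2 is normal (a line ending after the marker); a larger number suggests something was concatenated to the file.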
On OSX, how can I extract a sequence of pages from a PDF into a single file using the command line? I see that there are commands to split a PDF into separate pages... but I'd like to specify a page range (e.g. pages 24-31) and output a single PDF file.
With the pdfbox command line utilities of the pdfbox-app jar file:
http://pdfbox.apache.org/download.cgi
http://pdfbox.apache.org/2.0/commandline.html
java -jar pdfbox-app-x.y.z.jar PDFSplit -startPage 24 -endPage 31 yourfile.pdf
A result file named yourfile-1.pdf will be created.
I'm using Docsplit to split a PDF into pages using
Docsplit.extract_pages("my.pdf").
But I want to limit the output to the first 4 pages. I tried
Docsplit.extract_pages("my.pdf", :pages => 1..4)
which is not working.
Can anyone suggest what to do?
Install pdftk on your machine if not already done and set your PATH accordingly.
Remove the ESCAPEs from line 18 of lib/docsplit/page_extractor.rb, like so:
pdftk #{ESCAPE[pdf]} burst output #{ESCAPE[page_path]} 2>&1"
Change to:
pdftk #{pdf} burst output #{page_path} 2>&1"
By default, the gem ignores the page range you give and creates one PDF file per page. If you're happy with this, the output pages are created in the same folder as your input file.
However, the easiest solution IMHO would be to just use the pdftk binary directly, which is quite straightforward. To extract pages 1-4, you could use this snippet:
in_file = 'IN.pdf'
range = 1..4
range_s = range.to_s.gsub('..', '-')
cmd = "pdftk.exe #{in_file} cat #{range_s} output pages#{range_s}.pdf"
res = `#{cmd}`.chomp
This works, provided that the pdftk executable is in your PATH.