How to convert PDF version to 1.7 ExtensionLevel 8 using Ghostscript - ghostscript

I can change the PDF version from 1.5 to 1.7 with the Ghostscript command below, but how do I produce "PDF version 1.7 ExtensionLevel 8"?
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dFastWebView=true -dCompatibilityLevel=1.7 -sOutputFile=output.pdf input.pdf

What exactly are you expecting to happen here?
Your command line doesn't actually do anything except write a different version number into the PDF header. The same would be true of writing an extension version; all it does is change the 'version', it doesn't affect the content of the PDF file.
Ghostscript's pdfwrite device doesn't even use the features of PDF 1.5 (with a couple of minor exceptions), so what do you expect to gain even by producing a PDF 1.7 file?
Lying about the minimum required version (which is what you are doing when you change the version like this) simply means that older PDF consumers might fail to open the file (or give a warning) because they believe it will use features which they don't support. Since the PDF file doesn't use those features, you are actually making the file less portable by doing this.
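If you want to see what version a file declares, reading its first few bytes is enough. The following is a minimal sketch, assuming a POSIX-like shell with head and sed; the function name pdf_header_version is made up for illustration. It only looks at the header, and a /Version entry in the document Catalog (allowed since PDF 1.4) can override that, which this sketch ignores.

```shell
# Print the version declared in a PDF header, e.g. "1.7" for "%PDF-1.7".
# pdf_header_version is a hypothetical helper name, not part of Ghostscript.
pdf_header_version() {
    # The header sits at the very start of the file; 8 bytes cover "%PDF-x.y".
    head -c 8 "$1" | sed -n 's/^%PDF-\([0-9]\.[0-9]\).*/\1/p'
}

# Example (file names taken from the question):
#   pdf_header_version input.pdf
#   pdf_header_version output.pdf
```

Running it on the input and the output of the command in the question should show that only the declared number differs.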
FWIW Ghostscript's pdfwrite device can now produce PDF 2.0 files.
If you absolutely insist on doing this, you can probably add an Extensions dictionary to the document Catalog using pdfmarks, but I'm not 100% confident.
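For what it's worth, the pdfmark idea might look something like the sketch below. It is untested, and it assumes that pdfwrite honours a /PUT pdfmark on the {Catalog} named object. The helper name write_extensions_pdfmark and the file name extensions.ps are made up for illustration, and the position of the pdfmark file on the command line is also an assumption.

```shell
# Write a pdfmark file that asks pdfwrite to merge an Extensions dictionary
# into the document Catalog. write_extensions_pdfmark is a hypothetical name.
write_extensions_pdfmark() {
    cat > "$1" <<'EOF'
% Declares "PDF 1.7, Adobe ExtensionLevel 8" in the Catalog.
[ {Catalog} << /Extensions << /ADBE << /BaseVersion /1.7 /ExtensionLevel 8 >> >> >> /PUT pdfmark
EOF
}

# Untested usage: pass the pdfmark file alongside the input PDF.
#   write_extensions_pdfmark extensions.ps
#   gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.7 \
#      -sOutputFile=output.pdf input.pdf extensions.ps
```

As with the header version, this only changes what the file claims about itself; the page content stays the same.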

Related

Converting .tif to pdf/A 1.4 or 1.5

Using a shell command I have been able to convert .tif files to PDF files. Unfortunately the output files are PDF/A version 1.3 and I need PDF/A version 1.4 or 1.5.
This is my command (output in v1.3):
convert test.tif test.pdf
I think the command uses the ImageMagick tool (which uses Ghostscript) to do the conversion.
So I tried this (still 1.3, but the PDF/A is not valid):
convert test.tif pdfa:test.pdf
Then I tried to convert the PDF from 1.3 to 1.5 using Ghostscript:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dNOPAUSE -dQUIET -dBATCH -sOutputFile=new-test1.5.pdf test.pdf
This works just fine. Do you think it is possible to convert a .tif file directly to PDF 1.4 or 1.5?
I tried to check the Ghostscript files, but I was not able to make any modification leading to my expected result.
Thank you for your help
RFlow
You have not stated which version of PDF/A you require. If you need PDF/A-1a or PDF/A-1b, then you cannot produce a PDF greater than PDF 1.4, as PDF/A-1 is based on PDF 1.4 and anything newer is forbidden by the specification.
Why do you need PDF 1.4 or 1.5? The PDF version is simply a statement of the minimum feature set used by this PDF file. PDF consumers are backward-compatible: a PDF 1.4 consumer can, by definition, also read PDF 1.3 files. Since you started with TIFF, I cannot see any reason your PDF file would need to be of a higher level than PDF 1.3.
Your Ghostscript command line will produce a PDF file which states that it requires a consumer of at least level 1.5 (which is a lie ;-), but the resulting PDF will not be PDF/A compliant.
So you really need to explain why you need a PDF/A file, and why it has to be PDF version 1.4 or greater.

Count pages in PCL with GhostPDL 9.x

I used to count pages in a PCL file using GhostPDL (pcl6.exe) version 8.71:
pcl6.exe -dNOPAUSE -sDEVICE=nullpage -g10x10 -C file.prn
which produced a
%%PageCount: 10
for instance. Now I have updated to version 9.10 and found that my usual command no longer works: the command-line switch "-C" is gone.
I searched the documentation, but it brought me nowhere close to a solution. Converting PCL to PDF and then counting the PDF pages with Ghostscript does not seem like a very good approach.
Any suggestions?
Why is converting to a PDF and then using pdfinfo not a good solution?
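A rough sketch of that route, assuming pdfinfo (from Poppler or Xpdf) is installed and that your GhostPDL binary accepts the usual pdfwrite switches. The name gpcl6 is an assumption standing in for whatever your build is called (pcl6.exe in the question), and pages_from_pdfinfo is a made-up helper name.

```shell
# Extract the page count from pdfinfo's report, read on standard input.
# pages_from_pdfinfo is a hypothetical helper name.
pages_from_pdfinfo() {
    awk '$1 == "Pages:" { print $2; exit }'
}

# Untested usage; gpcl6 is an assumed binary name (pcl6.exe in the question):
#   gpcl6 -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=file.pdf file.prn
#   pdfinfo file.pdf | pages_from_pdfinfo
```

The helper only parses text, so it can be checked without either tool by piping in a line such as "Pages: 10".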

Wkhtmltopdf version, first page and TOC

Some questions for this very nifty tool, unfortunately lacking many usage examples.
The manual mentions a possible “Reduced Functionality” build of wkhtmltopdf. I have version wkhtmltox-0.11.0_rc1-installer.exe; when I run wkhtmltopdf --version, what should I look for to tell whether my build is the reduced one or not?
Currently I like wkhtmltopdf for webpages I want to read later and/or store. To mirror webpages I use httrack, then I generate the PDF with wkhtmltopdf *.html offline.pdf. How can I set/specify the first PDF page from the *.html list? Currently they seem to be converted in alphabetical order.
If I run wkhtmltopdf toc http://qt-project.org/doc/qt-4.8/qstring.html qstring.pdf I simply get a leading blank page, no TOC. What’s wrong?
Thanks for helping
EDIT:
@Nenotlep:
Your TOC trick works perfectly.
As for the first page, I don’t need an actual cover.
What I need is a way to download/convert a given page www.site.com/foo.html and all the linked pages (A.html, B.html ...) up to a certain depth level. Then I want a single PDF starting with foo.html and containing also the pages A.html, B.html ... (with relative links).
I don’t think there is an option to download and insert the linked pages in the final PDF (please, correct me if I am wrong). So I use httrack.com to download and wkhtmltopdf to convert. Given the alphabetical behaviour of wkhtmltopdf, the best option now seems to be renaming the target page, downloaded with httrack, to something like !foo.html.
Please, let me know of possible alternatives.
For part 3 of the question (the blank TOC): the latest stable version, 0.12.5, does not generate it either. The pre-release version 0.12.6-dev has fixed this problem on Mac.
I think all available precompiled wkhtmltopdf builds are compiled with the patched Qt, so they are not reduced. Reduced functionality means that it was compiled without the specially patched version of Qt. I use the Windows version and it isn't reduced.
I think the cover command line argument would work for you. I can't test at the moment, but try a command like wkhtmltopdf cover derpy.html toc --xsl-style-sheet default.xsl rarity.html twilight.html spike.html equestriadaily.pdf
At least on Linux, I think the shell simply expands the asterisk in *.html into all the HTML files before the command runs, so if you select one HTML file for the cover and then use *.html in the same folder, you will get that file twice. Getting around this might need some command line sorcery, a batch file or some other trickery.
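One possible bit of such sorcery, as a minimal sketch for a POSIX-like shell: list the page you want first, then every other .html file, skipping the duplicate. The helper name ordered_pages is made up for illustration, and the usage line assumes file names without whitespace, newlines or glob characters.

```shell
# List the chosen first page, then every other .html file in the current
# directory, one name per line. ordered_pages is a hypothetical helper name.
ordered_pages() {
    first=$1
    printf '%s\n' "$first"
    for f in *.html; do
        # An unmatched glob stays literal; skip it if no such file exists.
        [ -e "$f" ] || continue
        # Skip the first page so the glob does not add it a second time.
        [ "$f" = "$first" ] || printf '%s\n' "$f"
    done
}

# Untested usage (relies on word splitting, hence the file name assumption):
#   wkhtmltopdf $(ordered_pages foo.html) offline.pdf
```

This avoids renaming the downloaded page, since the order no longer depends on how the names sort.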
This is a bug in wkhtmltopdf. The workaround is to manually set a tocfile. You can get the default tocfile with wkhtmltopdf.exe --dump-default-toc-xsl. Then you can save the output as a file and use it like wkhtmltopdf.exe toc --xsl-style-sheet default.xsl www.stackoverflow.com so.pdf.

GhostScript on CentOS 5.3 - Unable to process JPXDecode data

I'm trying to get our server to convert PDFs to image files. It's a CentOS 5.3 system, and the latest version of Ghostscript available for it (8.70) has been installed.
When I try to convert a PDF I get the following error repeated for each page, and the result is a load of blank images.
**** ERROR: Unable to process JPXDecode data. Page will be missing data.
So, I found an answer on here that seemed to answer that question:
iText PDF; howto convert jpeg2000 to jpg using Java
Following that I downloaded iText 5.3.4 and jai_imageio-1.1.jar and compiled the supplied script on my local machine. When I run the final conversion command on my PDF I get:
java.lang.NullPointerException
at com.itextpdf.text.pdf.parser.PdfImageObject.decodeImageBytes(PdfImageObject.java:296)
at com.itextpdf.text.pdf.parser.PdfImageObject.<init>(PdfImageObject.java:199)
at com.itextpdf.text.pdf.parser.PdfImageObject.<init>(PdfImageObject.java:158)
at PDFConverter.hasJpeg2000(PDFConverter.java:36)
at PDFConverter.main(PDFConverter.java:15)
Doesn't contain any JPEG2000 images: Nothing to be done...
I'm not sure whether that's definitely saying that the PDF doesn't have any JPEG2000 images, or whether I've done something wrong when I compiled the script. Perhaps I've got the wrong version of iText since no links were provided in the answer to that other question.
So now I either need help to convert my PDFs to remove any JPEG2000 images, or I need help to get our server running ghostscript properly.

additional settings for wkhtmltopdf?

I am converting some docs to PDF using wkhtmltopdf (currently using Perl and the command-line versions). Is it possible to change the "PDF Producer", "PDF Version" and "Fast Web View" fields? The current defaults are "wkhtmltopdf", "1.4 (Acrobat 5.x)", and "No", respectively. I didn't see anything in the wiki page.
Pass --extended-help on the command line to see the supported options.
Not sure if those specific params are supported or not.
I patched wkhtmltopdf to support an additional flag recently, and it would be quite easy to add parameters to change those. I don't believe they are supported currently, though.
PDF Producer: Nope. Most apps want folks to know that particular app generated the PDF.
PDF Version: Nope, but trivial. The version number at the beginning of the file is just a courtesy really. What exactly are you after with this? Chances are you don't really need it. The PDF generated isn't going to acquire any features automagically just because the PDF claims to be this version or that. It's only really used so a viewer opening a newer PDF can say something like "I don't support this version, some stuff might not work". Because everything will work regardless (unless someone happens to have a VERY old version of Acrobat/Reader), I don't see the issue.
Fast Web View: Nope, and decidedly non-trivial. "Fast Web View" means everything needed to display the first page of the PDF is sorted to the front of the file, and there are various "hints" on where an app downloading the PDF can find this or that. It's not just a flag, not by a long stretch.
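If you post-process the file with another tool to get Fast Web View (the Ghostscript command in the first question above passes -dFastWebView=true, for instance), a quick way to check the result is to look for the linearization dictionary near the start of the file. This is only a sketch, assuming a POSIX-like shell with head and grep; is_linearized is a made-up helper name, and finding the key does not prove the hint tables are valid.

```shell
# Succeed if the file looks linearized ("Fast Web View" in Acrobat's terms).
# is_linearized is a hypothetical helper name. A linearized PDF carries a
# dictionary with a /Linearized key near the start of the file, so only the
# first 1024 bytes are inspected.
is_linearized() {
    head -c 1024 "$1" | LC_ALL=C grep -q '/Linearized'
}

# Example:
#   if is_linearized so.pdf; then echo "Fast Web View: Yes"; else echo "Fast Web View: No"; fi
```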
Zero for three. Sorry.

Resources