dompdf 0.6 renders much larger PDFs

I upgraded from dompdf 0.5.2 to 0.6.0 beta3, and the same HTML rendered to PDF is now 3-4 times the file size it was previously.
I have played with all the config vars, and nothing seems to make them smaller.
NOTE: I have NO external fonts, only specified font is helvetica. The config file does list the "default_font" as "serif", but changing that to helvetica has no effect on filesize.
Is there another reason this update would produce so much larger PDFs?
What is a better tool to convert html to pdf?

Related

Effect on existing images after changing JPEG quality (GD toolkit) in Drupal 7

On our Drupal 7 website with thousands of images, JPEG quality in GD toolkit was set at 100%.
This caused well-optimized images to be 150-200% larger if 'image styles' were used instead of 'original images'. But we need to use styles to keep image aspect ratios consistent. CSS 'object-fit' is not an option for cross-browser reasons.
What will happen to the quality of existing images if we reduce quality to 60%?
Update: As far as I tested, it should not have any effect on the existing images.
A new image style derivative (thumbnail) is generated only if an existing one is not found. There is no point in regenerating thumbnails every time they are displayed, since that would require too many server resources.
If you want a thumbnail regenerated, it should be enough to just delete the thumbnail file. The next time it is requested, it will be generated again.
There is also a module for that: https://www.drupal.org/project/imagestyleflush
Check on this thread to see other options:
https://drupal.stackexchange.com/questions/12864/rebuild-images-from-image-style
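The delete-and-regenerate approach from this answer can be sketched in a few lines of Python. This is only an illustration, not part of Drupal or the module linked above: the function name flush_style is hypothetical, and it assumes the default Drupal 7 layout in which derivatives live under <files directory>/styles/<style name>/. Original images sit outside that directory and are not touched.

```python
from pathlib import Path


def flush_style(files_dir, style_name):
    """Delete every generated derivative of one image style.

    Returns the number of files removed. Drupal recreates each derivative,
    using the current toolkit quality setting, the next time it is requested.
    """
    style_dir = Path(files_dir) / "styles" / style_name
    if not style_dir.is_dir():
        return 0  # nothing has been generated for this style yet
    removed = 0
    for path in style_dir.rglob("*"):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed
```

Calling flush_style("sites/default/files", "thumbnail") would clear one style; the path and style name here are placeholders for your own site.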

Create small high quality PDF embedding optimized PNG?

I'm trying to create a small PDF file, embedding one optimized PNG image displayed as a header and footer on a 3 page PDF (same image must appear 6x in the PDF)
My optimized PNG image is only 2.3KB. It looks very sharp.
Failed with libreoffice
When I insert just one instance of the 2.3KB PNG image into a LibreOffice Writer doc containing only text and then export it as PDF, I can see that the image gets recompressed to JPG, and the resulting PDF file grows by about 40KB. It also loses quality: the PNG gets fuzzy JPG edges.
If I right-click the image and select compression, there is no way to disable recompressing the image (it's already optimized better than LibreOffice could do it). I've tried compression levels of 0, 1, 9, etc., as well as choosing JPG, no resize, lossless, and so on, but there was no improvement.
Failed with wkhtmltopdf
I also tried making a test page and used wkhtmltopdf, but it did the same thing. Adding the low quality flag made no difference.
PDF Spec suggests PNG is supported?
From skimming the PDF spec, it looks like PNG images are supported.
Even plain text PDF files are surprisingly large
The disappointing thing is also when I take a 7KB HTML file which is basically just <html><body><p>foo...</p><p>bar...</p> (only about 15 paragraphs) with no CSS. The resulting 2 page PDF file is 30KB. Why should a 7kb (almost plain text) file become 30kb as a PDF?
Suggestions?
Can someone please suggest how to make a small PDF file in Linux?
I need to include 7KB of text and repeat one PNG image 6 times.
Manually or programmatically. I'll take whatever I can get at this point.
PDF Spec suggests PNG is supported?
PNG isn't supported per se; PDF allows embedding JPEG images as-is, but not PNG images. PDF does borrow a set of features of the PNG format, however.
rinohtype (full disclosure: I'm the author) tries to embed as much as possible from PNG images as-is into the PDF. This does involve some bit-juggling to separate the alpha channel from the color data for example, but no reencoding of the image is performed. It does not (yet) support interlaced PNGs.
rinohtype should be able to do what you want to achieve. But please note that it currently is in a beta stage, so you might encounter some bugs.
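To illustrate what embedding PNG data "as-is" can look like, here is a minimal sketch. It is not rinohtype's code. It relies on the fact that PDF's FlateDecode filter accepts the PNG predictor functions (/Predictor 15), so the concatenated IDAT bytes of a non-interlaced PNG can be reused as the image stream without recompression. The function names are made up for this example, and the sketch only handles 8-bit-or-less grayscale and RGB images; alpha channels and palettes need the extra handling mentioned above. make_png exists only to produce a test input.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunks(data):
    """Yield (type, payload) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def png_to_image_xobject(data):
    """Return a PDF image XObject body that reuses the PNG's IDAT bytes."""
    header, idat = None, b""
    for ctype, payload in png_chunks(data):
        if ctype == b"IHDR":
            header = struct.unpack(">IIBBBBB", payload)
        elif ctype == b"IDAT":
            idat += payload  # IDAT chunks form one continuous zlib stream
    if header is None:
        raise ValueError("PNG has no IHDR chunk")
    width, height, depth, color_type, _comp, _filt, interlace = header
    if interlace != 0 or color_type not in (0, 2) or depth > 8:
        raise ValueError("sketch handles only non-interlaced gray/RGB PNGs")
    colors = 3 if color_type == 2 else 1
    space = b"/DeviceRGB" if colors == 3 else b"/DeviceGray"
    dictionary = (
        b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
        b"/ColorSpace %s /BitsPerComponent %d /Filter /FlateDecode "
        b"/DecodeParms << /Predictor 15 /Colors %d "
        b"/BitsPerComponent %d /Columns %d >> /Length %d >>"
        % (width, height, space, depth, colors, depth, width, len(idat))
    )
    return dictionary + b"\nstream\n" + idat + b"\nendstream"


def make_png(width, height, rgb_rows):
    """Build a tiny 8-bit RGB PNG in memory (test helper only)."""
    def chunk(ctype, payload):
        crc = struct.pack(">I", zlib.crc32(ctype + payload))
        return struct.pack(">I", len(payload)) + ctype + payload + crc

    raw = b"".join(b"\x00" + row for row in rgb_rows)  # filter type 0 per row
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (PNG_SIGNATURE + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw, 9)) + chunk(b"IEND", b""))
```

The returned bytes still have to be wrapped in a numbered object and referenced from a page's resources; the point here is only that the compressed pixel data is copied rather than re-encoded.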
Even plain text PDF files are surprisingly large
To keep the PDF size as small as possible, make sure not to embed/subset any of the fonts. Use only the fonts from the base 14 PDF fonts which are provided by PDF readers.
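As a rough illustration of how little overhead a text-only PDF needs when no font is embedded, here is a hand-rolled sketch. It refers to Helvetica by name only (no font file stream), compresses the page content with zlib, and writes the cross-reference table itself. The function names are hypothetical, the page is fixed at US Letter, and there is no line wrapping or pagination: each string becomes one line, and only Latin-1 text with plain ASCII is expected to display reliably.

```python
import zlib


def serialize(objects):
    """Write numbered objects plus a cross-reference table and trailer."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)
    xref_at = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1, xref_at)
    return bytes(out)


def text_pdf(lines):
    """One page of left-aligned text set in the base-14 font Helvetica."""
    ops = [b"BT /F1 11 Tf 14 TL 72 720 Td"]
    for text in lines:
        # Backslashes and parentheses must be escaped inside PDF strings.
        safe = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        ops.append(b"(%s) Tj T*" % safe.encode("latin-1"))
    ops.append(b"ET")
    content = zlib.compress(b"\n".join(ops), 9)
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [4 0 R] /Count 1 >>",
        # Name-only font dictionary: the viewer supplies the glyphs.
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 3 0 R >> >> /Contents 5 0 R >>",
        b"<< /Length %d /Filter /FlateDecode >>\nstream\n%s\nendstream"
        % (len(content), content),
    ]
    return serialize(objects)
```

Writing the result with open("out.pdf", "wb").write(text_pdf(["foo...", "bar..."])) gives a file whose size is the compressed text plus a few hundred bytes of structure, which is the effect the advice above is aiming for.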
What you want is certainly achievable. Regarding the image quality, I would recommend making your image twice the size that you want it to actually display at in the PDF to keep it looking sharp.
As to the size, I've just modified a test in my PDF writer module (WIP..) to include a 7.2K png, 200px x 70px, in a PDF twice and the PDF came out at 6.8K 8). There's not much text included, but more text will only add what it's worth + a small percentage.
You can see the module and original test here.. https://github.com/DoccaPDF/docca-pdf-writer/blob/master/src/tests/writer.js#L40
That test adds ~112K of images to the PDF and results in a 103K PDF.
Of course not all images are created equal, so your mileage may vary.
*The images are only actually added to the PDF once, but are displayed multiple times.
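The "added once, displayed multiple times" point can be sketched as follows. This is not the docca-pdf-writer code; it is a hand-written illustration with hypothetical function names. The image is stored as a single XObject, and every page paints it twice (header and footer) by name with the Do operator, so extra pages cost only a small page dictionary each. It assumes raw 8-bit RGB pixel data as input and draws the image at one point per pixel.

```python
import zlib


def serialize(objects):
    """Write numbered objects plus a cross-reference table and trailer."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)
    xref_at = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1, xref_at)
    return bytes(out)


def letterhead_pdf(pixels, width, height, pages=3):
    """Store one RGB image and paint it as header and footer on every page."""
    data = zlib.compress(pixels, 9)
    image = (
        b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
        b"/ColorSpace /DeviceRGB /BitsPerComponent 8 /Filter /FlateDecode "
        b"/Length %d >>\nstream\n%s\nendstream"
        % (width, height, len(data), data)
    )
    # cm scales the unit square to the image size and moves it into place.
    draw = (
        b"q %d 0 0 %d 72 %d cm /Im0 Do Q\n" % (width, height, 792 - 36 - height)
        + b"q %d 0 0 %d 72 36 cm /Im0 Do Q\n" % (width, height)
    )
    content = b"<< /Length %d >>\nstream\n%s\nendstream" % (len(draw), draw)
    kids = b" ".join(b"%d 0 R" % (5 + i) for i in range(pages))
    # Every page shares the same image (object 3) and content (object 4).
    page = (
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /XObject << /Im0 3 0 R >> >> /Contents 4 0 R >>"
    )
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [%s] /Count %d >>" % (kids, pages),
        image,
        content,
    ] + [page] * pages
    return serialize(objects)
```

With three pages the image appears six times, yet its data is stored only once. To follow the earlier advice about sharpness, you would pass an image with twice the pixel dimensions and halve the scale factors given to cm.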

Base64 img not showing after Winnovative PDFConverter HTML to PDF

I’ve been using the PDFConverter for years with no issues. And there are still no issues converting a large HTML form to PDF, except certain images aren’t showing.
I programmatically fill an HTML img element with a base64 string, like so:
imgSignature.Src = "data:image/jpg;base64," + Convert.ToBase64String(SignatureImage);
where SignatureImage is a byte[] array.
I've observed that if the byte[] array size is more than around 7K (I'm not sure of the exact threshold), the image will not render to PDF (at least it's not visible anyway). Anything under that displays fine. Note: the image displays in HTML just fine. It's when converting to PDF that it disappears if the byte array is too large.
I've tried adjusting the size of the img, the container it's in, everything I can think of.
Currently still going through Winnovative support docs but no luck so far.
Thanks for any advice.
In my case, changing all the image formats from JPEG to PNG made it work. It's worth mentioning that my images are between 6 KB and 8 KB (they are barcodes).

Any Tricks to Use in wkhtmltopdf and pdftk to Reduce File Size?

I'm using wkhtmltopdf on OS X, and while it has been generally working as intended, the size of the files it generates is larger than I had hoped for. My goal is to essentially save a screenshot of the text content webpage as a pdf, and I don't really care about the images, hyperlinks, and other features on the page. I've been using the tool in conjunction with pdftk to save the first page of a website as a pdf, and below is an example of my code for the desired webpage (http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702):
/usr/local/bin/wkhtmltopdf http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702 --zoom 0.65 /Users/dwm8/Desktop/test.pdf
/usr/local/bin/pdftk /Users/dwm8/Desktop/test.pdf cat 1 output /Users/dwm8/Desktop/test2.pdf dont_ask
The size of the final file test2.pdf is 487 KB, which is larger than I would prefer. Are there any tricks I can use in wkhtmltopdf or pdftk to reduce the file size? Thanks for the help!
Well, if you don't care about hyperlinks or images, the obvious thing to do is suppress them using --disable-external-links and --no-images. If you are really only interested in the text, which is black and white, you may as well only generate a greyscale PDF too:
/usr/local/bin/wkhtmltopdf --disable-external-links --no-images --zoom 0.65 --grayscale http://espn.go.com/mens-college-basketball/boxscore?gameId=400589702 result.pdf
which gets the file size down from 500kB to 70kB on my system - a fairly useful 86% space saving!
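If you are scripting this, the same options can be passed from Python. The sketch below is just a thin wrapper around the command shown above; the function names are hypothetical, and it assumes wkhtmltopdf is on the PATH. Passing the arguments as a list also avoids any shell interpretation of the ? in the URL.

```python
import shutil
import subprocess


def text_only_command(url, output, zoom=0.65, binary="wkhtmltopdf"):
    """Build the argument list for a text-focused, grayscale render."""
    return [
        binary,
        "--disable-external-links",
        "--no-images",
        "--grayscale",
        "--zoom", str(zoom),
        url,
        output,
    ]


def render(url, output):
    """Run wkhtmltopdf if it is installed; return True on success."""
    command = text_only_command(url, output)
    if shutil.which(command[0]) is None:
        return False  # binary not found, nothing was rendered
    return subprocess.run(command, check=False).returncode == 0
```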
You could also pass --lowquality, which is intended to shrink the generated PDF. It is a plain switch and does not take a value.
More information on options can be found here http://wkhtmltopdf.org/usage/wkhtmltopdf.txt

Change quality of PDF rendered using PDFsharp

Can we reduce the quality of PDF rendered using PDFsharp?
I have to render files containing thousands of pages. Due to this, the resulting files sometimes occupy more than 100 MB memory space which is inconvenient.
A PDF is rendered from an Excel sheet. There are no images or charts. Only the number of rows in the Excel is very high. Can anyone please help?
Thanks in advance
With the current version of PDFsharp it is up to you to reduce the resolution of images or the JPEG quality to optimize the file size.
.NET provides functions that can accomplish that.

Resources