Can CGPDFDataFormatJPEG2000 be used for something other than a JPEG2000 image? - cocoa

Using the Quartz 2D PDF routines, can the CGPDFDataFormat format of a CGPDFStreamRef PDF stream be equal to CGPDFDataFormatJPEG2000 in any case other than for an XObject image with a filter of /JPXDecode?
In other words, is the CGPDFDataFormatJPEG2000 format ever used for anything other than JPEG2000 image streams? The reasonable answer would be no, but there can always be a difference between common usage and what's theoretically possible.

The JPXDecode filter expects a complete JPEG2000 image file to be stored in the image XObject, not just compressed raw data. I can say with confidence that in practice it is always used for image XObjects. Theoretically, though, nothing stops you from wrapping raw content stream data as a JPEG2000 image and then using the JPXDecode filter on a regular content stream. It is just not practical.
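If you want to double-check what a stream reported as CGPDFDataFormatJPEG2000 actually holds, one option is to look at the leading bytes of the data (for example, the bytes returned by CGPDFStreamCopyData). The sketch below only illustrates the signature check and is written in JavaScript for brevity. The helper names are made up for this example, and it assumes the data starts either with the JP2 signature box or with a raw codestream SOC/SIZ marker pair.

```javascript
// JP2 signature box: box length 12, box type 'jP  ', contents 0D 0A 87 0A.
const JP2_SIGNATURE = [0x00, 0x00, 0x00, 0x0c, 0x6a, 0x50, 0x20, 0x20, 0x0d, 0x0a, 0x87, 0x0a];
// Raw JPEG2000 codestream: SOC marker (FF 4F) followed by SIZ marker (FF 51).
const J2K_CODESTREAM = [0xff, 0x4f, 0xff, 0x51];

// True when `bytes` begins with every byte of `prefix`.
function startsWith(bytes, prefix) {
  return bytes.length >= prefix.length && prefix.every((b, i) => bytes[i] === b);
}

// Hypothetical helper: classify stream data by its leading signature only.
function classifyJpeg2000(bytes) {
  if (startsWith(bytes, JP2_SIGNATURE)) return 'jp2-file';
  if (startsWith(bytes, J2K_CODESTREAM)) return 'raw-codestream';
  return 'not-jpeg2000';
}
```

A result of 'not-jpeg2000' on a stream with a /JPXDecode filter would be the unusual case the question asks about.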

Related

Converting PDF to images of original size

I have a PDF file made of photographs of a book, combined into a single PDF. I'm trying to convert it back to individual images in PNG format. Every tool I tried asks me to set a DPI, which alters the size of the resulting images. Is there a way to get images of exactly the same pixel size as the originals?
Most PDFs of books contain a single image per page and depending on the scanner these images can basically be in three different formats: JPEG, JPEG2000 or TIFF. JPEG2000 is rarely used, so your PDF probably contains JPEG and/or TIFF images.
The good thing about JPEG (and JPEG2000) images is that they can be embedded as-is into a PDF! So you can extract the images as they are stored in the PDF. With TIFF this is also sometimes possible (but I don't think always...).
As mentioned by Tim Roberts you should try using pdfimages or hexapdf images to view and extract the images stored in the PDF. This will give you the best result.
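If you do have to render pages instead of extracting the embedded images, the DPI setting and the pixel size are tied together by the page size. A rough sketch of the arithmetic, assuming the default PDF unit of 1/72 inch and an image that fills the whole page (the function names are just for illustration):

```javascript
const POINTS_PER_INCH = 72; // default PDF user-space unit is 1/72 inch

// Pixel length a renderer produces for a page dimension at a given DPI.
function renderedPixels(pagePoints, dpi) {
  return Math.round((pagePoints / POINTS_PER_INCH) * dpi);
}

// DPI at which a full-page image is rendered at its original pixel size.
function effectiveDpi(imagePixels, pagePoints) {
  return (imagePixels * POINTS_PER_INCH) / pagePoints;
}
```

For example, a 2550-pixel-wide scan placed on a 612-point-wide page corresponds to 300 DPI, so rendering at any other DPI resamples the image.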

Convert image to Blob

I want to upload image data to a PHP script on the server. I have a URL for an image source (PNG; the image might be located on a different server). I load this into a JavaScript image, draw it into a canvas and use the canvas.toBlob() method (or a polyfill, as it is not widely supported yet) to generate a blob holding the image data. This works fine, but I noticed that the resulting blob is much bigger than the original image data.
In contrast, if I use an HTML file input and let the user select an image on the client, the resulting blob has the same size as the original image. Can I get image data from a canvas that is equal in size to the original image?
I guess the reason is that I lose the PNG (or any other image) compression when using the canvas.toBlob() polyfill:
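// Polyfill: define canvas.toBlob() where the browser lacks it.
if (!HTMLCanvasElement.prototype.toBlob) {
  Object.defineProperty(HTMLCanvasElement.prototype, 'toBlob', {
    value: function (callback, type, quality) {
      // Re-encode the canvas pixels as a data URL, then decode its base64 payload.
      var binStr = atob(this.toDataURL(type, quality).split(',')[1]),
          len = binStr.length,
          arr = new Uint8Array(len);
      for (var i = 0; i < len; i++) {
        arr[i] = binStr.charCodeAt(i);
      }
      callback(new Blob([arr], {type: type || 'image/png'}));
    }
  });
}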
I am confused by so many conversion steps via image, canvas, blob - so maybe there is an alternative to get the image data from a given URL and finally append it to FormData to send it to the server?
When encoding PNG, toDataURL uses only one of the formats PNG supports: compressed RGBA with 8 bits per channel (32 bits per pixel). There are no options to select any of the other formats, so you may be forced to include redundant data when you save as PNG. PNG also has 24-bit and 8-bit formats, as well as several compression options, though I am unsure which are used by each browser.
In most cases it is best to send the original image. If you need to modify the image and do not use the alpha channel (no transparency) but still want the quality to be high send it as a jpeg with quality set to 1 (max).
You may also consider the use of a custom encoder for PNG that gives you access to more of the PNG encoding options, or even try one of the many other formats available, or make up your own format, though you will be hard pushed to improve on jpeg and webp.
You could also consider compressing the data on the server when you store it; even jpeg and webp have a little room for more compression. For transport you should not worry, as most data these days is compressed as it leaves the page and is most definitely compressed by the time it leaves the client's ISP.
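To send the original image without going through a canvas at all, one possibility is to fetch the bytes directly and append the resulting Blob to FormData. This is only a sketch: buildUploadForm, the 'image' field name and the 'image.png' file name are placeholders, and for an image on another server it assumes that server allows cross-origin requests (CORS).

```javascript
// Fetch the image bytes as-is (no decode/re-encode) and wrap them in FormData.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function buildUploadForm(imageUrl, fetchImpl = fetch) {
  const response = await fetchImpl(imageUrl);
  if (!response.ok) {
    throw new Error('Image request failed with status ' + response.status);
  }
  const blob = await response.blob(); // same bytes the server sent
  const form = new FormData();
  form.append('image', blob, 'image.png'); // placeholder field and file name
  return form;
}

// Possible usage (the upload URL is hypothetical):
// buildUploadForm(sourceUrl).then(function (form) {
//   return fetch('/upload.php', { method: 'POST', body: form });
// });
```

Because the blob is never drawn to a canvas, its size matches the original file, just as with a file input.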

Decode JPEG image stripped from inside a PDF file

I have code that decompresses JPEGs into bitmaps, which works fine for JPEG files. However, when I feed the code a JPEG I have stripped directly from a PDF's XObject, I get errors.
Adobe Reader displays the image fine, so I don't believe it's corrupted. I have read through the JPEG and PDF documentation and don't find any obvious problems.
My question is this: is there anything different between the "JPEG" embedded inside a PDF stream and a normal JPEG? And if so, what is it?
Note: I can manually open the PDF, copy the image, paste it into Paint, and save. When I do this everything works, but I need this automated.
When my code parses the PDF, strips out the image stream, dumps the binary to a file, and then I try to open this file, it does not work. What am I missing?
My errors seem to be occurring in the Huffman decoding process; the cdt and Huffman tables appear to be read in fine.
Pardon my using the answer section but I overflowed the comment section:
My questions:
1. What code is failing to decode the JPEG? You say you "have code", but where did it come from? Why do you think it is reliable?
2. What is the file format of the JPEG stream? JFIF, Adobe, Exif, none specified? Could there be something in the file format that your decoder cannot handle? Does your decoder check for different types of APPn markers?
3. What is the JPEG format? What type of SOF (start-of-frame) marker? Does the decoder source handle all the normal formats: baseline, extended sequential, progressive? If you have a progressive JPEG and a decoder that only does baseline, you are going to have a problem.
4. How many components does the JPEG stream have? Some Adobe files have 4 components, and decoders may only be able to handle 1 or 3.
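Most of these questions can be answered by walking the marker segments of the dumped stream. Here is a rough sketch of such a check; inspectJpeg is a made-up name, and it assumes the whole header is in memory and that only length-prefixed segments appear before the first SOS marker.

```javascript
// Walk JPEG marker segments up to the first SOS and report what was found.
function inspectJpeg(bytes) {
  if (bytes.length < 4 || bytes[0] !== 0xff || bytes[1] !== 0xd8) {
    throw new Error('Missing SOI marker: not a JPEG stream');
  }
  const processNames = { 0: 'baseline', 1: 'extended sequential', 2: 'progressive' };
  const info = { appMarkers: [], sof: null, process: null, width: null, height: null, components: null };
  let pos = 2;
  while (pos + 4 <= bytes.length) {
    if (bytes[pos] !== 0xff) throw new Error('Expected a marker at offset ' + pos);
    const marker = bytes[pos + 1];
    if (marker === 0xff) { pos += 1; continue; } // fill byte before a marker
    if (marker === 0xda) break;                  // SOS: entropy-coded data follows
    const length = (bytes[pos + 2] << 8) | bytes[pos + 3]; // includes the 2 length bytes
    const end = pos + 2 + length;
    if (end > bytes.length) throw new Error('Truncated segment at offset ' + pos);
    if (marker >= 0xe0 && marker <= 0xef) {
      // APPn: read the ASCII identifier up to its NUL terminator (JFIF, Exif, Adobe, ...)
      let id = '';
      for (let i = pos + 4; i < end && bytes[i] !== 0; i++) id += String.fromCharCode(bytes[i]);
      info.appMarkers.push({ marker: 'APP' + (marker - 0xe0), id: id });
    } else if (marker >= 0xc0 && marker <= 0xcf && marker !== 0xc4 && marker !== 0xc8 && marker !== 0xcc) {
      // SOFn: precision (1 byte), height (2), width (2), component count (1)
      info.sof = marker - 0xc0;
      info.process = processNames[info.sof] || 'other (SOF' + info.sof + ')';
      info.height = (bytes[pos + 5] << 8) | bytes[pos + 6];
      info.width = (bytes[pos + 7] << 8) | bytes[pos + 8];
      info.components = bytes[pos + 9];
    }
    pos = end;
  }
  return info;
}
```

If this reports a progressive process, 4 components, or an APP14 "Adobe" segment, that is a good hint about why a baseline-only, 1-or-3-component decoder fails.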

Render images progressively in an MFC-based application

Browsers can render progressive images progressively, but an image can only be progressively decoded if it was progressively encoded, e.g. GIF or PNG images saved with the "interlaced" option, or JPEG images saved with the "progressive" option.
I want to render progressive images in my MFC-based application just like a browser does.
Windows Imaging Component provides the IWICProgressiveLevelControl interface to decode images progressively.
But I can't find any example showing how to stream and display an image progressively at the same time using IWICProgressiveLevelControl.
Any advice would be appreciated. Thanks.
There's a good sample here:
https://code.msdn.microsoft.com/Windows-Imaging-Component-3af3cd49
Once you've used IWICProgressiveLevelControl::SetCurrentLevel to select the scan, the decoder will behave normally but only use the scans up to and including the one you selected. So any call to CopyPixels or any IWICBitmapSource components in your chain will receive the fully decoded image at the selected scan level.
The trick, as demonstrated in the sample, is that you can't use IWICProgressiveLevelControl::GetLevelCount and select the max level immediately if you don't know the complete file is available. As the documentation for the sample states,
IWICProgressiveLevelControl allows you to control which progressive level of detail to use on the frame decode. It also allows you to query the total number of progressive levels in the file; however it is not recommended to use this method on JPEG images because the total count is not known until the entire image has been downloaded, defeating the purpose of progressive decode. Instead, this sample demonstrates the recommended practice of iteratively requesting increasing levels of detail until WIC returns WINCODEC_ERR_INVALIDPROGRESSIVELEVEL.

TIFF image file format

I am working on TIFF images for image compression.
I want to know how the actual raw image data, i.e. the R, G, B components, is organised/stored in the TIFF file.
Is it stored as G0B0R0G1B1R1... (1 byte for each color component, all components interleaved),
or is it stored some other way, e.g. in a planar format or something else?
Thank you.
-AD.
TIFF specifies:
How attributes are associated with a page
How multiple pages (and their attributes) are packed into a single file
Page attributes include properties such as:
Dimensions
Encoding scheme
In other words, a TIFF file may contain data that is encoded using any of many different encoding schemes.
The TIFF file can store various image types:
Bilevel (B/W)
Grayscale
Palette-color
RGB full-color
The storing of actual image data is done differently for each image type.
The specification is not the scariest I have seen, but it is definitely not trivial!
The TIFF specification can be found here: http://partners.adobe.com/public/developer/tiff/index.html
I have been doing the same with TIFF files, looking at multi-resolution TIFFs.
Adobe has TIFF 6 documentation on its website.
You should be able to use P/Invoke on LibTiff with C# or VB.NET.
There are many types of compression, some of them proprietary.
Looking at the doc supplied by tomassao, I see that uncompressed RGB is just one of the possible tiff encodings.
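It looks like the data is interleaved by default: the PlanarConfiguration tag defaults to 1 ("chunky", i.e. RGBRGB...), while a value of 2 stores each component in its own plane. You can also specify more than 3 samples per pixel (RGB is 3), and you can specify different numbers of bits per sample (8,8,8 is common).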
I assume you already know about how the headers work. The document covers it if you don't.
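To make the two layouts concrete: TIFF's PlanarConfiguration tag (284) is 1 for "chunky" data (RGBRGB..., the default) and 2 for separate planes. The sketch below computes where a sample lives in each case. It assumes uncompressed 8-bit samples in a single strip, and for the planar case it additionally assumes the planes are stored back to back, which a real file does not guarantee since each plane has its own strip offsets. The function names are illustrative.

```javascript
// Byte offset of sample `s` of pixel (x, y) when samples are interleaved
// (PlanarConfiguration = 1): R0 G0 B0 R1 G1 B1 ...
function chunkyOffset(x, y, s, width, samplesPerPixel) {
  return (y * width + x) * samplesPerPixel + s;
}

// Byte offset when each component has its own plane (PlanarConfiguration = 2),
// ASSUMING the planes are laid out contiguously one after another.
function planarOffset(x, y, s, width, height) {
  return s * width * height + y * width + x;
}
```

For a 4x2 RGB image, the blue sample (s = 2) of pixel (1, 1) sits at offset 17 in the chunky layout and at offset 21 in the contiguous planar layout.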

Resources