How can an Interlaced .png file's size be smaller than the original file? - image

Ok, so I tried to use the imagemagick command:
"convert picA.png -interlace line picB.png"
to make an interlace version of my .png images. Most of the time, I got the resulting image is larger than the original one, which is kinda normal. However, on certain image, the resulting image size is smaller.
So I just wonder why does that happen? I really don't want my new image to lose any quality because of the command.
Also, is there any compatibility problem with interlaced .png image?
EDIT: I guess my problem is that the original image was not compressed as best as it could be.

The following only applies to the cases where the pixel size is >= 8 bits. I didn't investigate for other cases but I expect similar outcomes.
A content-identical interlaced PNG image file will almost always be greater because of the additional data for filter type descriptions required to handle the passes scanlines. This is what I explained in details in this web page based on the PNG RFC RFC2083.
In short, this is because the sum of the below number of bytes for interlaced filter types description per interlacing pass is almost always greater than the image height (which is the number of filter types for non-interlaced images):
nb_pass1_lines = CEIL(height/8)
nb_pass2_lines = (width>4?CEIL(height/8):0)
nb_pass3_lines = CEIL((height-4)/8)
nb_pass4_lines = (width>2?CEIL(height/4):0)
nb_pass5_lines = CEIL((height-2)/4)
nb_pass6_lines = (width>1?CEIL(height/2):0)
nb_pass7_lines = FLOOR(height/2)
Though, theoretically, it can be that the data entropy/complexity accidentally gets lowered enough by the Adam7 interlacing so that, with the help of filtering, the usually additional space needed for filter types with interlacing may be compensated through the deflate compression used for the PNG format. This would be a particular case to be proven as the entropy/complexity is more likely to increase with interlacing because the image data is made less consistent through the interlacing deconstruction.
I used the word "accidentally" because reducing the data entropy/complexity is not the purpose of the Adam7 interlacing. Its purpose is to allow the progressive loading and display of the image through a passes mechanism. While, reducing the entropy/complexity is the purpose of the filtering for PNG.
I used the word "usually" because, as shown in the explanation web page, for example, a 1 pixel image will be described through the same length of uncompressed data whether interlaced or not. So, in this case, no additional space should be needed.
When it comes to the PNG file size, a lower size for interlaced can be due to:
Different non-pixel encoding related content embedded in the file such as palette (in the case of color type =! 3) and non-critical chunks such as chromaticities, gamma, number of significant bits, default background color, histogram, transparency, physical pixel dimensions, time, text, compressed text. Note that some of those non-pixel encoding related content can lead to different display of the image depending on the software used and the situation.
Different pixel encoding related content (which can change the image quality) such as bit depth, color type (and thus the use of palette or not with color type = 3), image size,... .
Different compression related content such as better filtering choices, accidental lower data entropy/complexity due to interlacing as explained above (theoretical particular case), higher compression level (as you mentioned)
If I had to check whether 2 PNG image files are equivalent pixel wise, I would use the following command in a bash prompt:
diff <( convert non-interlaced.png rgba:- ) <( convert interlaced.png rgba:- )
It should return no difference.
For the compatibility question, if the PNG encoder and PNG decoder implement the mandatory aspects of the PNG RFC, I see no reason for the interlacing to lead to a compatibility issue.
Edit 2018 11 13:
Some experiments based on auto evolved distributed genetic algorithms with niche mechanism (hosted on https://en.oga.jod.li ) are explained here:
https://jod.li/2018/11/13/can-an-interlaced-png-image-be-smaller-than-the-equivalent-non-interlaced-image/
Those experiments show that it is possible for equivalent PNG images to have a smaller size interlaced than non-interlaced. The best images for this are tall, they have a one pixel width and have pixel content that appear random. Though, the shape is not the only important aspect for the interlaced image to be smaller than the non-interlaced image as random cases with the same shape lead to different size differences.
So, yes, some PNG images can be identical pixel wise and for non-pixel related content but have a smaller size interlaced than non-interlaced.

So I just wonder why does that happen?
From section Interlacing and pass extraction of the PNG spec.
Scanlines that do not completely fill an integral number of bytes are padded as defined in 7.2: Scanlines.
NOTE If the reference image contains fewer than five columns or fewer than five rows, some passes will be empty.
I would assume the behavior your experiencing is the result of the Adam7 method requiring additional padding.

Related

Please clarify the gif image format's intended behavior

If I have a gif89a which has multiple image blocks that are identical (and small, say 40x40 or 1600 pixels in size), should these continue to increase the final size of the gif file (assuming a sane encoder)?
I'm trying to understand how the LZW compression works. According to the W3C spec, I thought the entire data stream itself (consisting of multiple image blocks) should be be compressed, and thus repeating the same image frame multiple times would incur very little overhead (just the size of the symbol for the the repeated image block). This does not seem to be the case, and I've tested with several encoders (Gimp, Photoshop).
Is this to be expected with all encoders, or are these two just doing it poorly?
With gimp, my test gif was 23k in size when it had 240 identical image blocks, and 58k in size with 500 image blocks, which seems less impressive than my intuition is telling me (my intuition's pretty dumb, so I won't be shocked if/when someone tells me it's incredibly wrong).
[edit]
I need to expand on what it is I'm getting at, I think, to receive a proper answer. I am wanting to handcraft a gif image (and possibly write an encoder if I'm up to it) that will take advantage of some quirks to compress it better than would happen otherwise.
I would like to include multiple sub-images in the gif that are used repeatedly in a tiling fashion. If the image is large (in this case, 1700x2200), gif can't compress the tiles well because it doesn't see them as tiles, it rasters from the top left to the bottom right, and at most a 30 pixel horizontal slice of any given tile will be given a symbol and compressed, and not the 30x35 tile itself.
The tiles themselves are just the alphabet and some punctuation in this case, from a scan of a magazine. Of course in the original scan, each "a" is slightly different than every other, which doesn't help for compression, and there's plenty of noise in the scan too, and that can't help.
As each tile will be repeated somewhere in the image anywhere from dozens to hundreds of times, and each is 30 or 40 times as large as any given slice of a tile, it looks like there are some gains to be had (supposing the gif file format can be bent towards my goals).
I've hand-created another gif in gimp, that uses 25 sub-images repeatedly (about 700 times, but I lost count). It is 90k in size unzipped, but zipping it drops it back down to 11k. This is true even though each sub-image has a different top/left coordinate (but that's only what, 4 bytes up in the header of the sub-image).
In comparison, a visually identical image with a single frame is 75k. This image gains nothing from being zipped.
There are other problems I've yet to figure out with the file (it's gif89a, and treats this as an animation even though I've set each frame to be 0ms in length, so you can't see it all immediately). I can't even begin to think how you might construct an encoder to do this... it would have to select the best-looking (or at least one of the better-looking) versions of any glyph, and then figure out the best x,y to overlay it even though it doesn't always line up very well.
It's primary use (I believe) would be for magazines scanned in as cbr/cbz ebooks.
I'm also going to embed my hand-crafted gif, it's easier to see what I'm getting at than to read my writing as I stumble over the explanation:
LZW (and GIF) compression is one-dimensional. An image is treated as a stream of symbols where any area-to-area (blocks in your terminology) symmetry is not used. An animated GIF image is just a series of images that are compressed independently and can be applied to the "main" image with various merging options. Animated GIF was more like a hack than a standard and it wasn't well thought out for efficiency in image size.
There is a good explanation for why you see smaller files after ZIP'ing your GIF with repeated blocks. ZIP files utilize several techniques which include a "repeated block" type of compression which could do well with small (<32K) blocks (or small distances separating) identical LZW data.
GIF-generating software can't overcome the basic limitation of how GIF images are compressed without writing a new standard. A slightly better approach is used by PNG which uses simple 2-dimensional filters to take advantage of horizontal and vertical symmetries and then compresses the result with FLATE compression. It sounds like what you're looking for is a more fractal or video approach which can have the concept of a set of compressed primitives that can be repeated at different positions in the final image. GIF and PNG cannot accomplish this.
GIF compression is stream-based. That means to maximize compression, you need to maximize the repeatability of the stream. Rather than square tiles, I'd use narrow strips to minimize the amount of data that passes before it starts repeating then keep the repeats within the same stream.
The LZW code size is capped at 12 bits, which means the compression table fills up relatively quickly. A typical encoder will output a clear code when this happens so that the compression can start over, giving good adaptability to fresh content. If you do your own custom encoder you can skip the clear code and keep reusing the existing table for higher compression results.
The GIF spec does not specify the behavior when a delay time of 0 is given, so you're at the mercy of the decoder implementation. For consistent results you should use a delay of 1 and accept that the entire image won't show up immediately.

Compression method effectiveness varies hugely

I decided I'd attempt an image compression (From pixel RGBs) idea I had for a bit of extra credit in class. I've finished, but I find the levels of compression I get varies HUGELY from image to image. With this image, I'm getting a file size 1.25x the size of the coresponding PNG. With this image however, I'm getting a file size 22.5x the size of the PNG.
My compression works by first assigning each color in the image with an int(starting from 0), then using that int rather than the actual color in the file. The file is formatted as:
0*qj8c1*50i2p2*pg93*9zlds4*2rk5*4ok4r6*8mv1w7*2r25l8*3m89o9*9yp7c10*111*2clz112*g1j13*2w34z14*auq15*3zhg616*mmhc17*5lgsi18*25lw919*7ip84+0!0!0!0!0!0!0!0!0!0!0!0!0!1!1!1!#2!2!2!2!2!2!2!2!2!2!2!2!2!3!3!3!#4!4!4!4!4!4!4!4!4!4!4!4!4!5!5!5!#6!6!6!6!6!6!6!6!6!6!6!6!6!0!0!0!#3!3!3!3!3!3!3!3!3!3!3!3!3!2!2!2!#1!1!1!1!1!1!1!1!1!1!1!1!1!4!4!4!#7!7!7!7!7!7!7!7!7!7!7!7!7!6!6!6!#5!5!5!5!5!5!5!5!5!5!5!5!5!8!8!8!#9!9!9!9!9!9!9!9!9!9!9!9!9!10!10!10!#8!8!8!8!8!8!8!8!8!8!8!8!8!11!11!11!#12!12!12!12!12!12!12!12!12!12!12!12!12!7!7!7!#13!13!13!13!13!13!13!13!13!13!13!13!13!14!14!14!#15!15!15!15!15!15!15!15!15!15!15!15!15!16!16!16!#17!17!17!17!17!17!17!17!17!17!17!17!17!13!13!13!#18!18!18!18!18!18!18!18!18!18!18!18!18!15!15!15!#10!10!10!10!10!10!10!10!10!10!10!10!10!19!19!19!#
Where the first bit (With the *s and base36 numbers) is the dictionary defining the colors, and the second part seperated by !s is the actual image.
Why am I seeing the level of compression vary so hugely image from image? Is there a flaw in my compression algorithm?
Edit: The fact that the actual level of compression is poor compared to JPEG or PNG isn't an issue, I wasn't expecting to rival any major formats.
Thanks

Match two images in different formats

I'm working on a software project in which I have to compare a set of 'input' images against another 'source' set of images and find out if there is a match between any of them. The source images cannot be edited/modified in any way; the input images can be scaled/cropped in order to find a match. The images can be in BMP,JPEG,GIF,PNG,TIFF of any dimensions.
A constraint: I'm not allowed to use any external libraries. ImageMagick is an exception and can be used.
I intend to use Java/Python. The software is purely command-line based.
I was reading on SO and some common image comparing algorithms. I'm planning to take 2 approaches.
1. I could use Histograms/buckets to find out the RGB values of the 2 images being compared.
2. Use SIFT/SURF to fin keypoint descriptors and find the euclidean distance between them and output the result based on the resultant distance.
The 2 images in comparison can be in different formats. An intuitive thought is that before analysis/comparison, the 2 images must be converted to a common format.I reasoned that the image should be converted to the one with lesser quality e.g. if the 2 input images are BMP and JPEG, convert the BMP to JPEG. This can be thought of as a pre-processing step.
My question:
Is image conversion to a common format required? Can 2 images of different formats be compared? IF they have to be converted before comparison, is my assumption of comparing from higher quality(BMP) to lower(JPEG) correct? It'd also be helpful if someone can suggest some algorithms for image conversion.
EDIT
A match is said to be found if the pattern image is found in the source image.
Say for example the source image consists of a football field with one player. If the pattern image contains the player EXACTLY as he is in the source image, then its a match.
No, conversion to a common format on disk is not required, and likely not helpful. If you extract feature descriptors from an image (SIFT/SURF, for example), it matters much less how the original images were stored on disk. The feature descriptors should be invariant to small compression artifacts.
A bit more...
Suppose you have a BMP that is an image of object X in your source dataset.
Then, in your input/query dataset, you have another image of object X, but it has been saved as a JPEG.
You have no idea how what noise was introduced in the encoding process that produced either of these images. There is lighting differences, atmospheric effects, lens effects, sensor noise, tone-mapping, gammut-mapping. Some of these vary from image to image, others vary from camera to camera. All this is done before the image even gets saved to storage in the camera. Yes, there are also JPEG compression artifacts, but to assume the BMP is "higher" quality and then degrade it through JPEG compression will not help. Perhaps the BMP has even gone through JPEG compression before being saved as a BMP.

Matlab imwrite function changes the pixel values

I tried to change some pixel values of a Grayscale image and save it using imwrite in matlab.
no problem with saving.
the problem is when I read it back, some pixel values have been changed. not exactly the same values I assigned to pixels before saving it.
I'm trying to hash images so 1unit difference will effect the hash numbers.
As mentioned by mmgp, JPG can be lossy. That means that some of the information in your image will be lost in favor of storage efficiency.
The rationale behind JPG is somewhat like that behind MP3 -- changes in hues etc. that the human eye is not particularly well-adapted to distinguish will be simplified or removed altogether, thus decreasing the amount of information in the image. The information in a JPG represents a similar-looking, but in fact very different image. This is probably what you're experiencing.
In Matlab, have a look at the output of help imwrite. You can give a parameter to the jpg write called 'Quality', which is a number between 0 and 100, 100 meaning (near-)lossless compression.
Although the JPEG standard does allow for (near-)lossless compression, it is not often used in practice (at least, in my field). More popular lossless image formats are PNG, JPEG2000 and TIFF. Read more about it here.
All of these are also available in Matlab's imwrite function.

which format of png should I use?

Which format of PNG should I use PNG 8 or PNG 24? Which one is better for a website. I am confused about these. What is main different between PNG 8 and PNG 2?
Png-24. Png-24 has alpha transparency (where Png-8 only has on/off transparency).
Png-8 is indexed. Png-24 is loss-less.
Png-24 is better in almost every way.
http://www.elated.com/articles/understanding-image-formats/
PNG has several modes which can be used. It may contain:
Greyscale
Indexed colour, usually meant by PNG-8
Greayscale with alpha
Truecolour (RGB)
Truecolour with alpha (RGBA), usually meant by PNG-24
Indexed colour is different from the others that it is a palette of maximum 256 colours, from which indexes are used to denote the colour of specific pixels. It can contain transparency via an auxillary chunk. So every pixel is denoted by a byte-wide value or even less if palette isn't that big. If you use truecolour, there will be more data per pixel, depending on whether you use an alpha-channel.
So in a large image indexed colour will save you a lot of data per each pixel. However, if you use more than 256 colours, some colour data will be lost, which is also more probable in a large image. I would advise to save your image in both formats and see if the loss is worth the gain in smaller file size. Though if you are designing your image for normal web site, not for mobile phones, you should better use PNG-24 anyway, since no one will notice the difference in the size.
I would say that it depends on the image you want to store as PNG*, but in case you've doubts, PNG-24 is better: "true color" (8bits per channel), so that the image must not be dithered and don't "loose" "exact" color match, and optionally PNG-32 if with the alpha channel (transparency) too. PNG-8 images are limited (256 colors chosen from a 24bit palette) and allow only for a mask, you can use it e.g. if you convert from a GIF image; if you convert from other "true color" formats, you "loose" "exact" color match as said (the program try to reduce the real number of used colors into only 256 colors, and other tricks to give resemblance with the original). Some style of icons do not need "true colors", they are "described" well by 256 "fixed" colors or less, and so PNG-8 is ok; as said, if you have GIF images as source, go for PNG-8... if you convert from JPG, go for PNG-24; if you create you image by yourself directly "in" PNG, you know if you can "crunch" the result in PNG-8 or not, but if you are not able to evaluate or use tools to evaluate, PNG-24/32 is ok in any case.
I've written article about it: PNG that works, which outlines all major variants of PNG and their trade-offs/compatibility.
In short, use PNG8 (paletted) whenever you can, as it has much smaller file size. You can have full alpha transparency in PNG8 if you use good tools (Photoshop is not good for PNG).

Resources