Similar images compression - image

I have a bunch of similar images. They contain different noise, but their edges and histograms are very similar. I need to compress these images losslessly.
Is there an algorithm that can exploit the similarity between images for more efficient compression?
I have tried improving compression via prediction (a modified version of the MED predictor from LOCO), but the gain was only about 0.4%.

What do you mean exactly by "similar"? Having similar histograms isn't going to help much. Do the images look the same?
You could simply try subtracting the previous image from this image, pixel by pixel and color by color, and see if the difference image is more compressible.
The next step would be to make the series of images a video, and use video compression, which can exploit more complex correlations between successive images.
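The subtraction idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name DeltaDemo, the helper names, and the synthetic "images" (random bytes plus a small change) are made up for the example, and deflate is used purely as a rough measure of compressibility. The difference is taken modulo 256 so that it still fits in one byte per sample and can be undone exactly. Note that if the noise in the two images is independent, the difference image carries the noise of both, so the gain on real data may be much smaller than on this synthetic pair.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.zip.Deflater;

final class DeltaDemo {
    /** Per-sample difference modulo 256, so each value still fits in one byte. */
    static byte[] delta(byte[] current, byte[] previous) {
        byte[] d = new byte[current.length];
        for (int i = 0; i < d.length; i++) {
            d[i] = (byte) (current[i] - previous[i]);
        }
        return d;
    }

    /** Inverse of delta(): adds the previous image back, again modulo 256. */
    static byte[] restore(byte[] delta, byte[] previous) {
        byte[] c = new byte[delta.length];
        for (int i = 0; i < c.length; i++) {
            c[i] = (byte) (delta[i] + previous[i]);
        }
        return c;
    }

    /** Size in bytes after deflate, used only as a rough compressibility probe. */
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** Hypothetical test data: a random texture and a copy changed by -2..+2 per sample. */
    static byte[][] syntheticPair(long seed, int size) {
        Random random = new Random(seed);
        byte[] first = new byte[size];
        random.nextBytes(first);
        byte[] second = new byte[size];
        for (int i = 0; i < size; i++) {
            second[i] = (byte) (first[i] + random.nextInt(5) - 2);
        }
        return new byte[][] { first, second };
    }

    public static void main(String[] args) {
        byte[][] pair = syntheticPair(42L, 1 << 16);
        byte[] d = delta(pair[1], pair[0]);
        System.out.println("round trip ok: " + Arrays.equals(restore(d, pair[0]), pair[1]));
        System.out.println("second image deflated: " + deflatedSize(pair[1]) + " bytes");
        System.out.println("delta image deflated:  " + deflatedSize(d) + " bytes");
    }
}
```

For real images the byte arrays would be the raw pixel samples of two frames of identical dimensions and pixel format.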

Set Redundancy Compression (SRC) may be what you are looking for. It provides lossless and lossy compression techniques for sets of similar images. The Min-Max differential technique might be the one you seek.

Try running "composite -compose difference image1 image2 diff" on sequential images (or order the images arbitrarily if you don't already have an order). The 'diff' image might be very small. Note that 'difference' stores the absolute difference, so the sign is lost; to recover image2 exactly from diff and image1 you need a variation that wraps around instead (ImageMagick's ModulusSubtract and ModulusAdd compose methods are candidates).

Are the images grayscale or color? If they are grayscale, I have an application that I developed 7 years ago, and I can try it for you.
It is based on a technique called Set Redundancy Compression.
If they are color, you can try Lagarith (a lossless video codec).

Related

Is there an algorithm to compress multiple images that are not much different apart?

I have a set of 100+ images that differ by only a few pixels. They also don't have many different colors. Is there a way to compress them together by taking advantage of that?
There's an algorithm: calculate the difference between each image and some reference image. Is there a ready-made application? Probably not. One approach could be to combine the images into one very big image (by interleaving them or placing them next to each other) and save it as PNG.
If this is for archiving and you don't need random access to the images, you can concatenate them into a zip or tar archive (with zero compression) and compress the whole thing. bzip2 (based on the Burrows-Wheeler Transform) can find similarities over a much larger span than the deflate algorithm used by PNG: it works on blocks of up to 900 KB, versus a 32 KB sliding window for deflate. If the images are large enough, that span will limit the compression in both algorithms; this would have to be combated by interleaving the images or delta compressing between them.
Delta compression is used regularly in some video compression applications, and in e.g. screen capture and virtual desktop applications where lossless compression is required.
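The window-size limitation, and how interleaving works around it, can be illustrated with Java's built-in deflate. This is a sketch under assumptions: WindowDemo and its helpers are hypothetical names, the "images" are random bytes standing in for incompressible textured rows, and the second image is an almost exact copy of the first. When the two images are simply concatenated, the matching data lies a whole image apart, far outside deflate's 32 KB window. When they are interleaved row by row, each row of the second image sits right after the matching row of the first.

```java
import java.util.Random;
import java.util.zip.Deflater;

final class WindowDemo {
    /** Size in bytes after deflate at the highest compression level. */
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** Hypothetical stand-in for raw pixel data of one image. */
    static byte[] randomImage(long seed, int size) {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data);
        return data;
    }

    /** Image A followed by image B: matching rows are a whole image apart. */
    static byte[] concatenate(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    /** Row 0 of A, row 0 of B, row 1 of A, ...: matching rows are one row apart.
     *  Assumes both arrays have the same length, a multiple of rowBytes. */
    static byte[] interleaveRows(byte[] a, byte[] b, int rowBytes) {
        byte[] out = new byte[a.length + b.length];
        int rows = a.length / rowBytes;
        for (int r = 0; r < rows; r++) {
            System.arraycopy(a, r * rowBytes, out, 2 * r * rowBytes, rowBytes);
            System.arraycopy(b, r * rowBytes, out, (2 * r + 1) * rowBytes, rowBytes);
        }
        return out;
    }

    public static void main(String[] args) {
        int rowBytes = 1000;
        byte[] a = randomImage(1L, 200 * rowBytes);
        byte[] b = a.clone();
        b[12345] ^= 1; // the two images differ in a single sample
        System.out.println("concatenated: " + deflatedSize(concatenate(a, b)) + " bytes");
        System.out.println("interleaved:  " + deflatedSize(interleaveRows(a, b, rowBytes)) + " bytes");
    }
}
```

A compressor with a larger window or block size narrows the gap for the concatenated layout, which is the point made about bzip2 above.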
String them together as a movie and use video compression. That is exactly what video compression does.
Nothing ready-made. If you want lossless compression, you could store just a reference image plus the delta images (the pixel-level differences between images), each one encoded as PNG. But you would have to write the delta transformation yourself.
You could also take advantage of the delta compression mode of the MNG format, an extension of PNG for animation (each of your images would be an animation frame). But the format is not widely supported.
You could also take the same approach (one image = one video frame) and use any standard video format (MPEG) but this would be lossy.

How to scale JPEG image down so that text is clear as possible?

I have some JPEG images that I need scale down to about 80% of original size. Original image dimension are about 700px × 1000px. Images contain some computer generated text and possibly some graphics (similar to what you would find in corporate word documents).
How can I scale the image so that the text is as legible as possible? Currently we are scaling the image down using bicubic interpolation, but that makes the text blurry and foggy.
Two options:
Use a different resampling algorithm. Lanczos gives you a much less blurry result.
You might use an advanced JPEG library that resamples the 8x8 blocks to 6x6 pixels.
If you are not set on exactly 80% you can try getting and building djpeg from http://www.ijg.org/ as it can decompress your jpeg to 6/8ths (75%) or 7/8ths (87.5%) size and the text quality will still be pretty good:
[Example images from the original post: the original, the 7/8-scale result, and the 3/4-scale result. The site scaled them when showing them inline.]
There may be a scaling algorithm out there that works similarly, but this is an easy off the shelf solution.
There is always a loss involved in scaling down, but how much it matters depends on your trade-offs.
Blurring and artifacts are normal for JPEG images, so it's recommended that you generate images at the correct size the first time.
Lanczos is a fine solution, but it has its trade-offs.
If it's just the text you are concerned about, you could try a dilation filter over the resampled image. This would correct some blurriness but may also affect the graphics. If you can live with that, good. Alternatively, if you can identify the areas of text, you can apply dilation to just those areas.

How to detect subjective image quality

For an image-upload tool I want to detect the (subjective) quality of an image automatically, resulting in a rating of the quality.
I have the following idea to realize this heuristically:
Obviously incorporate the resolution into the rating.
Compress it to JPG (75%), decompress it and compare jpg-size vs. decompressed size to gain a ratio. The blurrier the image is, the higher the ratio.
Obviously my approach would use up a lot of cycles and memory if large images are rated, although this would do in my scenario (fat server, not many uploads), and I could always build in a "short circuit" around the more expensive steps if the image exceeds a certain resolution.
Is there something else I can try, or is there a way to do this more efficiently?
Assessing image quality (the same goes for sound or video) is not an easy task, and there are numerous publications tackling the problem.
Much depends on the nature of the image - different set of criteria is appropriate for artificially created images (i.e. diagrams) or natural images (i.e. photographs). There are subtle effects that have to be taken into consideration - like color masking, luminance masking, contrast perception. For some images a given compression ratio is perfectly adequate, while for other it will result in significant loss of quality.
Here is a free-access publication giving a brief introduction to the subject of image quality evaluation.
The method you mentioned - compressing the image and comparing the result with the original is far from perfect. What will be the metric that you plan to use? MSE? MSE per block? For sure it is not too difficult to implement, but the results will be difficult to interpret (consider images with high-frequency components and without them).
And if you want to delve more into the area of image quality assessment, there is also a lot of research done by the machine learning community.
You could try looking in the EXIF tags of the image (using something like exiftool), what you get will vary a lot. On my SLR, for example, you even get which of the focus points were active when the image was taken. There may also be something about compression quality.
The other thing to check is the image histogram - watch out for images biased to the left, which suggests under-exposure or lots of saturated pixels.
For image blur you could look at the high-frequency components of the Fourier transform; this probably amounts to looking at parameters related to the JPG compression anyway.
This is a bit of a tricky area because most "rules" you might be able to implement could arguably be broken for artistic effect.
I'd like to shoot down the "obviously incorporate resolution" idea. Resolution tells you nothing. I can scale an image by a factor of 2, quadrupling the number of pixels. This adds no information whatsoever, nor does it improve quality.
I am not sure about the "compress to JPG" idea. JPG is a photo-oriented algorithm. Not all images are photos. Besides, a blue sky compresses quite well. Uniformly grey even better. Do you think exact cloud types determine the image quality?
Sharpness is a bad idea, for similar reasons. Depth of Field is not trivially related to image quality. Items photographed against a black background will have a lot of pixels with quite low intensity, intentionally. Again, this does not signal underexposure, so the histogram isn't a good quality indicator by itself either.
But what if the photos are "commercial?" Does the value of the existing technology work if the photos are of every-day objects and purposefully non-artistic?
If I hire hundreds of people to take pictures of park benches I want to quickly know which pictures are of better quality (in-focus, well-lit) and which aren't. I don't want pictures of kittens, people, sunsets, etc.
Or what if the pictures are supposed to be of items for a catalog? No models, just garments. Would image-quality processing help there?
I'm also really interested working out how blurry a photograph is.
What about this:
measure the byte size of the image when compressed as JPEG
downscale the image to 1/4th
upscale it 4x, using some kind of basic interpolation
compress that version using JPEG
compare the sizes of the two compressed images.
If the size did not go down a lot (past some percentage threshold), then downscaling and upscaling did not lose much information, therefore the original image is the same as something that has been zoomed.
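The steps above can be sketched with the standard ImageIO and Java2D classes. Treat this as a rough illustration, not a tuned detector: the class name BlurEstimate and its helpers are hypothetical, "1/4th" is read here as a factor of 4 per side, bilinear interpolation is an arbitrary choice of "basic interpolation", and the JPEG writer's default quality is used. The returned ratio is the JPEG size after the down/up round trip divided by the JPEG size of the input; a ratio close to 1 suggests the input was already blurry.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import javax.imageio.ImageIO;

final class BlurEstimate {
    /** Size in bytes of the image encoded as JPEG with the writer's default quality. */
    static int jpegSize(BufferedImage img) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(img, "jpg", out)) {
            throw new IOException("no JPEG writer available");
        }
        return out.size();
    }

    /** Scales src to w x h using bilinear interpolation. */
    static BufferedImage resize(BufferedImage src, int w, int h) {
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return dst;
    }

    /** JPEG size after a 4x down/up round trip, divided by the JPEG size of the input. */
    static double sizeRatio(BufferedImage img) throws IOException {
        int w = img.getWidth();
        int h = img.getHeight();
        BufferedImage small = resize(img, Math.max(1, w / 4), Math.max(1, h / 4));
        BufferedImage roundTrip = resize(small, w, h);
        return (double) jpegSize(roundTrip) / jpegSize(img);
    }

    /** Hypothetical test input: per-pixel random colors, i.e. maximal fine detail. */
    static BufferedImage noiseImage(long seed, int size) {
        Random random = new Random(seed);
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, random.nextInt(0x1000000));
            }
        }
        return img;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage sharp = noiseImage(1L, 256);
        BufferedImage blurry = resize(resize(sharp, 64, 64), 256, 256);
        System.out.println("ratio for detailed image: " + sizeRatio(sharp));
        System.out.println("ratio for blurred image:  " + sizeRatio(blurry));
    }
}
```

Any threshold on the ratio would have to be found empirically on the kind of images being rated.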

Ruthlessly compressing large images for the web

I have a very large background image (about 940x940 pixels) and I'm wondering if anyone has tips for compressing a file this large further than Photoshop can handle? The best compression without serious loss of quality from Photoshop is PNG 8 (250 KB); does anyone know of a way to compress an image down further than this (maybe compress a PNG after it's been saved)?
I don't normally deal with optimizing images this large, so I was hoping someone would have some pointers.
It will first depend on what kind of image you are trying to compress. The two basic categories are:
Picture
Illustration
For pictures (such as photographs), a lossy compression format like JPEG will be best, as it will remove details that aren't easily noticed by human visual perception. This will allow very high compression rates for the quality. The downside is that excessive compression will result in very noticeable compression artifacts.
For illustrations that contain large areas of the same color, using a lossless compression format like PNG or GIF will be the best approach. Although not technically correct, you can think of PNG and GIF as compressing repetitions of the same color very well, similar to run-length encoding (RLE).
Now, as you've mentioned PNG specifically, I'll go into that discussion from my experience of using PNGs.
First, compressing a PNG further is not a very viable option, as data that has already been compressed well has little left to squeeze out. This is true of any data compression: removing the redundancy from the source data (basically, repeating patterns which can be represented in more compact ways) is what decreases the amount of space needed to store the information. PNG already employs methods to efficiently compress images in a lossless fashion.
That said, there is at least one possible way to drop the size of a PNG further: by reducing the number of colors stored in the image. By using "indexed colors" (basically embedding a custom palette in the image itself), you may be able to reduce the size of the file. However, if the image has many colors to begin with (such as color gradients or a photographic image), then you may not be able to reduce the number of colors used in the image without perceptible loss of quality.
Basically it will come down to some trial and error to see how the changes to the image affect image quality and file size.
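As a rough illustration of the indexed-color idea, the sketch below converts an image that already has at most 256 distinct colors into a palette image without changing any pixel, and measures both versions as PNG. The class name PaletteDemo, its helpers, and the random 16-color test image are assumptions made for this example; images with more than 256 colors would first need a color-reduction (quantization) step, which is lossy and not shown here.

```java
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;
import javax.imageio.ImageIO;

final class PaletteDemo {
    /** Size in bytes of the image encoded as PNG. */
    static int pngSize(BufferedImage img) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.size();
    }

    /** Lossless conversion to a palette image; fails if more than 256 colors are used. */
    static BufferedImage toIndexed(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        Map<Integer, Integer> palette = new LinkedHashMap<>(); // color -> palette index
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y) & 0xFFFFFF;
                if (!palette.containsKey(rgb)) {
                    if (palette.size() == 256) {
                        throw new IllegalArgumentException("more than 256 colors");
                    }
                    palette.put(rgb, palette.size());
                }
            }
        }
        int n = palette.size();
        byte[] r = new byte[n];
        byte[] g = new byte[n];
        byte[] b = new byte[n];
        for (Map.Entry<Integer, Integer> e : palette.entrySet()) {
            int rgb = e.getKey();
            int i = e.getValue();
            r[i] = (byte) (rgb >> 16);
            g[i] = (byte) (rgb >> 8);
            b[i] = (byte) rgb;
        }
        IndexColorModel model = new IndexColorModel(8, n, r, g, b);
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_INDEXED, model);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Store the palette index directly, bypassing any color matching
                dst.getRaster().setSample(x, y, 0, palette.get(src.getRGB(x, y) & 0xFFFFFF));
            }
        }
        return dst;
    }

    /** Hypothetical test input: every pixel picks one of `colors` random colors. */
    static BufferedImage fewColorImage(long seed, int size, int colors) {
        Random random = new Random(seed);
        int[] table = new int[colors];
        for (int i = 0; i < colors; i++) {
            table[i] = random.nextInt(0x1000000);
        }
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, table[random.nextInt(colors)]);
            }
        }
        return img;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage rgb = fewColorImage(3L, 128, 16);
        System.out.println("24-bit PNG:  " + pngSize(rgb) + " bytes");
        System.out.println("indexed PNG: " + pngSize(toIndexed(rgb)) + " bytes");
    }
}
```

On images with large flat areas the difference can be much smaller than on this deliberately noisy test image, which is why trying it on the actual file is the only reliable check.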
The comment by Paul Fisher reminded me that I also probably wouldn't recommend using GIF either. Paul points out that PNG compresses static line art better than GIF for nearly every situation.
I'd also point out that GIF only supports 8-bit images, so if an image has more than 256 colors, you'll have to reduce the colors used.
Also, as Kent Fredric's comment notes, reducing the color depth has in some situations caused an increase in file size. Although this is speculation, it may be that dithering makes the image less compressible by introducing more entropy into it (dithering introduces pixels of different colors to simulate a certain other color, kind of like mixing paint pigments of different colors to end up with another color).
Have a look at http://www.irfanview.com/; it's an oldie but a goodie.
I have found it does multipass PNG compression pretty well, and it does batch processing way faster than PS.
There is also PNGOUT available here http://advsys.net/ken/utils.htm, which is apparently very good.
Here's a point the other posters may not have noticed, which I found out experimentally:
On some installations, the default behaviour is to save a full copy of the image's colour profile along with the image.
That is, the device calibration map, usually sRGB or something similar, that tells user agents how best to map the image's device-dependent colours to real-world colours.
This profile is, however, quite large, and can make files you would expect to be very small turn out very large: for instance, a 1px by 1px image consuming a massive 25 KB. Even a plain (uncompressed) BMP can represent 1 pixel in less.
This profile is generally not needed for the web, so when saving your Photoshop images, make sure to export them without it, and you'll notice a marked size improvement.
You can strip this data using another tool such as GIMP, but it can be a little time-consuming if there are many files.
pngcrush can further compress PNG files without any data loss; it applies different combinations of the encoding and compression options to see which one works best.
If the image is photographic in nature, JPEG will compress it far better than PNG8 for the same loss in quality.
Smush.It claims to go "beyond the limitations of Photoshop". And it's free and web-based.
It depends a lot on the type of image. If it has a lot of solid colors and patterns, then PNG or GIF are probably your best bet. But if it's a photo-realistic image then JPG will be better - and you can crank down the quality of JPG to the point where you get the compression / quality tradeoff you're looking for (Photoshop is very good at showing you a preview of the final image as you adjust the quality).
The "compress a PNG after it's been saved" part looks like a deep misunderstanding to me. You cannot magically compress beyond a certain point without information loss.
The first point to consider is whether the resolution has to be this big. Reducing the resolution by 10% in both directions cuts the pixel count by 19% (0.9 × 0.9 = 0.81), and the file size by roughly as much.
Next, try several different compression algorithms with different grades of compression versus information/quality loss. If the image is sketchy, you might get away with quite rigorous JPEG compression.
I would tile it, unless you are absolutely sure that your audience has the bandwidth.
Next is JPEG 2000 (jpeg2k).
To get more out of a JPEG file you can use the 'Modified Quality Setting' of the "Save as Web" dialog:
1. Create a mask/selection that is white where you want to keep the most detail, e.g. around text. You can use Quick-Mask to draw the mask with a brush. It helps to feather the selection; this results in a nice white-to-black transition in the next step.
2. Save this mask/selection as a channel and give the channel a name.
3. Use File->Save as Web.
4. Select JPEG as the file format.
5. Next to the Quality box there is a small button with a circle on it. Click that, select the channel saved in step 2, and play with the quality settings for the white and black parts of the channel content.
http://www.jpegmini.com is a new service that creates standard jpgs with an impressively small filesize. I've had good success with it.
For best quality single images, I highly recommend RIOT. You can see the original image, aside from the changed one.
The tool is free and really worth trying out.
JPEG2000 gives compression ratios on photographic quality images that are significantly higher than JPEG (or PNG). Also, JPEG2000 has both "lossy" and "lossless" compression options that can be tuned quite nicely to your individual needs.
I've always had great luck with jpeg. Make sure to configure photoshop to not automatically save thumbnails in jpegs. In my experience I get the greatest bang/buck ratio by using 3 pass progressive compression, though baseline optimized works pretty well. Choose very low quality levels (e.g. 2 or 3) and experiment until you've found a good trade off.
PNG images are already compressed internally, in a manner that doesn't benefit from more compression much (and may actually expand if you try to compress it).
You can:
Reduce the resolution from 940x940 to something smaller like 470x470.
Reduce the color depth
Compress using a lossy compression tool like JPEG
edit: Of course 250KB is large for a web background. You might also want to rethink the graphic design that requires this.
Caesium is the best tool I have ever seen.

Image compression for webcomics

Like probably many people around here, I read a few webcomics. Drowtales is my favorite, but that's beside the point.
For a long time a thought has been nagging me at the back of my head: webcomics are drawn pictures. They are not photographs. There should be a lot of redundancy (less colors, more flat colored areas, etc.) and thus they should be easily compressible at quite high rates while still maintaining lossless quality. Still it seems that the best tool to compress them is the same old lossy JPEG.
How so? Are there not better things invented? I'm not an expert in data compression, so my own meager attempts at finding some better algorithm have been fruitless. Best I could find was Pngcrush, but it still is way behind JPEG in terms of compression.
I would like to hear an expert opinion on this. Is this idea of mine foolish and doomed to failure? Or is there perhaps some way that people have found or that I could look into?
This, of course, comes from the selfish desire to decrease load times. :)
Added: Some people seem to miss the point, so I'll clarify:
Webcomic images should have a lot of redundancy in them so they should be easily compressible. Is it not possible to somehow compress them so that they would be both lossless AND smaller than JPEG? Or at the very least compress them better than JPEG while still retaining the quality.
Since they would be for web the specialized compressor should still probably emit PNG or JPEG - just compressed with a modified algorithm for better results.
No question, it's a balancing act between appearance and performance. Barring a custom compression algorithm specifically for comics, I think the best you can do is experiment with JPEG compression levels until you get one that's a reasonable size, but still looks good for the particular comic.
From lbrandy.com
The problem with comics is that a lot of graduated colouring is used. A common technique when colouring a comic on computer using Photoshop, for example, is to start by blocking out areas in solid colour as you mentioned. However, these solid areas are then refined using various techniques, from hand touching using airbrush tools to overlaying graduated fills, dodging and burning tools, etc.
The result is an image which is more like a natural image - which is what comic artists are striving for of course - and thus it compresses better with a lossy algorithm such as that used by JPEG.
A completely different approach would be to render the comic images using a vector format like SVG. That would capture the essence of the drawing (fill here, arc here, line here, etc) without having to try to raster-compress the resulting images.
Your assumptions aren't borne out by my data. My favorite webcomic is already distributed as PNG. Converting a 167K PNG file to JPEG using the default compression quality yields a 199K JPEG file. Break-even is somewhere between -quality 60 and -quality 65, which is quite a low quality for a JPEG. So, Questionable Content is already compressed lossless and smaller than JPEG.
A few things I've picked up on doing images for web use -
Use jpegtran -optimise on JPEGs - it recompresses them losslessly and can shave a good few percent off poorly compressed images.
I run PNG files through pngnq (make them 8 bit) and then optipng -i0 (recompress and remove any interlacing). I know you said you don't like lossy, but pngnq does an amazingly good job of converting images to a palette - best thing to do is try it yourself and see if the output is good enough.
Under certain circumstances, JPEG images will be larger than PNG images.
For example, in cases where there is a very simple image, PNG may end up compressing the image better and giving better image quality.
Here's an example with some Java code:
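import java.awt.Color;
import java.awt.Graphics;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class PngVsJpeg {
    public static void main(String[] args) {
        // 256 x 256 white canvas with a black "X" drawn from corner to corner
        BufferedImage img = new BufferedImage(256, 256, BufferedImage.TYPE_INT_RGB);
        Graphics g = img.getGraphics();
        g.setColor(Color.white);
        g.fillRect(0, 0, 256, 256);
        g.setColor(Color.black);
        g.drawLine(0, 0, 255, 255);
        g.drawLine(255, 0, 0, 255);
        g.dispose();

        try {
            // Write the same pixels in both formats so the file sizes can be compared
            ImageIO.write(img, "jpg", new File("output.jpg"));
            ImageIO.write(img, "png", new File("output.png"));
        } catch (IOException e) {
            e.printStackTrace(); // Don't ignore exceptions in real code
        }
    }
}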
The above code produces an image with the dimensions of 256 x 256 pixels, and draws two intersecting diagonal lines in the form of an "X".
The 256 x 256 image was used to keep the image size a multiple of 8, as JPEG compression performs a 2D DCT on 8 x 8 pixel sections of the image. Keeping the image size and the location of the lines aligned with the 8 x 8 pixel sections reduces the amount of compression artifacts and improves the quality of the image.
(Choosing 256 x 256 was empirical -- I at first used 100 x 100 and noticed that the JPEG image was horrible, so I tried 64 x 64 and it looked better, so I made it larger to simulate a more realistic image size.)
After drawing the image, the program generates a JPEG file and a PNG file. (The Java ImageIO library uses a default JPEG compression quality of 0.75f.)
Results:
output.png : 1,308 bytes
output.jpg : 3,049 bytes
Taking a look at the image itself, the JPEG has a little bit of artifacting, but it wasn't very noticeable until I zoomed in with an image editor. Of course, the PNG image is lossless, so it was an exact representation of the original.
To conclude, whether an image is smaller with PNG or JPEG is really up to the source -- there are cases where JPEG can be larger than a PNG and yet the PNG can be better quality. Of course, in practice, generally PNG will be larger than JPEG for a given image.
You may want to cut down on how many colours you are encoding in your image. Try saving your comic with only 256 colours and watch the size decrease a lot. Depending on your specific drawing style, that may be enough.
I've drawn a number of large hand-illustrated circuit diagrams which I scan in as grayscale for use in computerized documents; LZW-compressed TIFF always wins hands down over JPEG, both in viewable quality and file size, I think because TIFF can take advantage of RLE encoding for whitespace. I'm not sure whether PNG can do this too, or whether RLE can be extended to multicolor images and not just black/white.
edit: I just tried one of my grayscale hand drawings; TIFF can beat PNG by about 2:1 (43K vs. 83K using ImageMagick convert to go from original TIFF -> PNG -> TIFF again to double-check that ImageMagick is producing both file formats and ensure that my original program didn't do a bad job producing the TIFF) but only because TIFF uses 8bits/pixel (grayscale) and PNG uses 24bits/pixel (RGB).
edit 2: never mind, I just was able to use pngcrush -c 0 to ensure the image is grayscale. PNGcrush got the RGB version down to 67K and the grayscale down to 34K. Nice!
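The 8 bits/pixel versus 24 bits/pixel effect described in these edits can be reproduced with a small Java sketch. The names GrayPngDemo, toGray and grayNoiseAsRgb are hypothetical, the test image is synthetic noise rather than a scanned drawing, and the conversion assumes the RGB input really is gray (R = G = B for every pixel), so only the red channel is copied.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import javax.imageio.ImageIO;

final class GrayPngDemo {
    /** Size in bytes of the image encoded as PNG. */
    static int pngSize(BufferedImage img) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.size();
    }

    /** Copies one channel into an 8-bit grayscale image; assumes R = G = B everywhere. */
    static BufferedImage toGray(BufferedImage rgb) {
        int w = rgb.getWidth();
        int h = rgb.getHeight();
        BufferedImage gray = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Write the raw sample so no color-space conversion alters the value
                gray.getRaster().setSample(x, y, 0, rgb.getRGB(x, y) & 0xFF);
            }
        }
        return gray;
    }

    /** Hypothetical test input: gray noise stored as a 24-bit RGB image. */
    static BufferedImage grayNoiseAsRgb(long seed, int size) {
        Random random = new Random(seed);
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                int v = random.nextInt(256);
                img.setRGB(x, y, (v << 16) | (v << 8) | v);
            }
        }
        return img;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage rgb = grayNoiseAsRgb(5L, 128);
        System.out.println("RGB PNG:       " + pngSize(rgb) + " bytes");
        System.out.println("grayscale PNG: " + pngSize(toGray(rgb)) + " bytes");
    }
}
```

Noise is a worst case for deflate; on a clean scan the two sizes may be closer, as the 67K versus 34K figures above suggest.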
edit 3: Just a point of procedure: It seems to me that it would make a heck of a lot more sense to pick a number of different images of this type to choose as standard benchmarks, and just try different techniques across the benchmark set, rather than just a bunch of stack-overfloids pontificating. This seems like a problem that needs a well-tested empirical solution.
No matter how great a lossless compression is, a lossy compression can always do better, because it simply has fewer constraints.
Imagine that one day someone invents a lossless compression that beats JPEG for comics; the next day someone will modify it to compress even more, at the cost of losing some information.
Between anti-aliasing and gradients, there are probably more colors in the image than you think.
Drawn vs. not drawn, web comics vs. any other type of image... that's not relevant. The specifics of how web comics are drawn or the colors are laid out or whatever is something you're perceiving as different. But you can bet that decades of graphics research and development have that fully taken into account, and the people that do graphics optimizations for a living have pushed the envelope.
If there was a better compression algorithm than JPEG, GIF, PNG, etc. then don't you think it would be in wide-spread use? If you're looking for fairly recent breakthroughs then I think you're probably wasting your time, as 1) you'd have to expend quite a bit of effort to make your front-end compression compatible with whatever viewer people use (like browsers) and 2) if it had significant gains from current formats then it would become wide-spread fairly quickly.
If I'm getting down-voted I must not have explained myself very well.
Thinking that web comics are in some special domain because they're hand-drawn or have lots of color repetition is a bit silly. Finding large blocks of the same color is one of the absolute basics of image compression.
Get yourself a good graphics program, and using your specific image, see which of its export formats yields the smallest image size while retaining the quality you desire. It is going to be different for different images.
For pen-and-inkish images, the compression scheme in GIF can work wonders.
JPEG compression is ill-suited to that kind of image.
As someone who has done a lot of colouring work for cartoons, as well as photo-manipulation work I can safely say that there is often a lot going on inside the average web-comic when compared to a normal photo.
Assuming that the image is done in Photoshop or Painter (usually from a Tablet) there are often a number of filters or layers at work in the average web-comic. Shading, reflection, opacity, background images and far more come into the equation and with many of these being straight from filters or layer overlays there are often many colours in place.
A lot of the time you have to think of your audience. Is it really worth optimising your images if you get 20 visitors a day? I'd probably argue that it is completely down to the size and content of your web-comic. If you can get away with PNG then I'd stick with it. More often than not in web-comics there is little going on to warrant using JPG.
I use OPTIPNG to find the best filter (at a sane level) and then I run ADVDEF -4 -z (http://advancemame.sourceforge.net/comp-readme.html) to optimize the deflate stream.
(Not Advpng, because Advpng removes the filters.)
Also you can try pngout http://www.advsys.net/ken/utils.htm
It has a plugin for Irfanview.
It uses the same deflate implementation as Kzip, which is usually even better than 7-zip but much slower.
EDIT: results for okcancel20031003.gif (from "What's your favorite "programmer" cartoon?"), 256 colors, 147 KB:
Method                           Size      Time
PNG (Paint)                      126 KB
PNG (Irfanview)                  120 KB
PNG (Irfanview) + Optipng -o5    120 KB    9s (525 bytes smaller)
Optipng + ADVDEF                 114 KB    9s + 0.9s
PngOut                           114 KB    6s
BMP                              273 KB
BMP + 7z (LZMA -fb 273)          107 KB
RAR (Best)                       116 KB
BMF -S                           90 KB     0.3s
Paq8o10t -4                      79 KB     35s
I think the missing piece of information here is Image Compression is Tied to the Format. It's certainly possible that someone could come up with a compression algorithm that was/is well suited for the kind of images that web cartoonists create. However, once you took the new uber-comic-image format and emitted a PNG, JPG or GIF, the color information would be subject to the rules of the PNG, JPG or GIF compression mechanism and you'd lose all the benefit of your new image format.
Here's another way to think about it.
Save a photo as a low quality JPEG
Note the file size
Take that low quality jpeg and save-as a 24/32 bit PNG
Note the larger file size
The same thing would happen to this mythical uber-comic-image format.
The alternative would be getting the major browser vendors to support uber-comic-image natively. I'll leave the reasons why that wouldn't work as an exercise for the reader.

Resources