tiff lzw compression 10 times larger than the original jpeg compression - image

i converted a few jpeg compressed files into tiff lzw compression but to my surprise their sizes were larger than the original jpeg ones. I checked the format of the converted files using imagemagick's identify tool and found that they were using LZW compression.
What could be the reason behind this? Any one having a idea ?
i also tried it using faststone image converter and still with the same result.

That is entirely expected and normal. JPEG is a lossy conversion that allows significant compression by accepting some degradation in the image. TIFF LZW compression is lossless, and so by definition cannot change the image at all for the purposes of compression, greatly limiting how far it can get. Furthermore, LZW is not a particularly good compression scheme, even for lossless. There are better ones. The most common is PNG, but there are better ones depending on the type of image, such as JBIG for bilevel or images or JPEG 2000 lossless for natural images.

Related

JPG vs compressed JPG vs WEBP - why WEBP isn't the smallest one?

I have this image (photo taken by me on SGS 9 plus): Uncompressed JPG image. Its size is 4032 x 3024 and its weight is around 3MB. I compressed it with TinyJPG Compressor and its weight was 1.3MB. For PNG images I used Online-Convert and I saw webp images much more smaller even than compressed with TinyPNG. I expected something similar, especially that I read an article JPG to WebP – Comparing Compression Sizes where WEBP is much smaller that compressed JPG.
But when I convert my JPG to WEBP format in various online image convertion tools, I see 1.5-2MB size, so file is bigger than my compressed JPG. Am I missing something? WEBP should not be much smaller than compressed JPG? Thank you in advance for every answer.
These are lossy codecs, so their file size mostly depends on quality setting used. Comparing just file sizes from various tools doesn't say anything without ensuring images have the same quality (otherwise they're incomparable).
There are a couple of possibilities:
JPEG may compress better than WebP. WebP has problems with blurring out of the details, low-resolution color, and using less than full 8 bits of the color space. In the higher end of quality range, a well-optimized JPEG can be similar or better than WebP.
However, most of file size differences in modern lossy codecs are due to difference in quality. The typical difference between JPEG and WebP at the same quality is 15%-25%, but file sizes produced by each codec can easily differ by 10× between low-quality and high-quality image. So most of the time when you see a huge difference in file sizes, it's probably because different tools have chosen different quality settings (and/or recompression has lost fine details in the image, which also greatly affects file sizes). Even visual difference too small for human eye to notice can cause noticeable difference in file size.
My experience is that lossy WebP is superior below quality 70 (in libjpeg terms) and JPEG is often better than WebP at quality 90 and above. In between these qualities it doesn't seem to matter much.
I believe WebP qualities are inflated about 7 points, i.e., to match JPEG quality 85 one needs to use WebP quality 92 (when using the cwebp tool). I didn't measure this well, this is based on rather ad hoc experiments and some butteraugli runs.
Lossy WebP has difficulties compressing complex textures such as leafs of trees densely, whereas JPEGs difficulties are with thin lines against flat borders, like a telephone line hanging against the sky or computer graphics.

Image format vs image compression algorithm vs codec

I'm still confused with concepts of image format, image compression algorithm or method and codec and relationships between them.
In my understanding format is something image is saved in, so it could contain information about what compression algorithm or method (are these two synonyms?) to use. Or does a specific format always use the same algorithm? Also, these algorithms can then use multiple codecs but I don't see a difference between job done by a compression algorithm and a codec.
Am I right in my assumptions? Can you elaborate definition and relationships of these concepts?
An image format is the specification of how the image data is stored on disk.
Since storage sizes for images can be quite large, images are often stored using a compression algorithm which can reduce the storage space needed to store a representation of the image.
A codec is an encoder/decoder pair. So a codec is a compression algorithm, and the reverse de-compression algorithm too.
One place to start learning more is the documentation for the NetPBM format and library. This is one of the simplest image formats because it does not use compression internally.
The following are examples of formats - PNG, GIF, TIFF, JPEG, BMP, TGA, PCX.
The following are examples of compression algorithms - LZW (Lempel Ziv Welch), RLE (Run Length Encoding), DEFLATE.
For the most part, each format generally uses the same compression, e.g. PNG format uses DEFLATE compression , whereas TGA and PCX format always use RLE. Some formats however can accommodate different types of compression, e.g. TIFF format can accommodate LZW, JPEG, Packbits, CCITT compression types.
A codec is more than a compression algorithm, it understands all aspects of the format... where to find the height and width, palettes, padding, compression, byte-ordering, transparency, meta-data and so on.

LZW or JBIG is better lossless compression algorithm for images?

Which lossless compression algorithm [between LZW or JBIG] is better for compressing data sets consisting of images (colored and monochrome) ?
I have implemented both and tested on smaller data sets [each containing 100 images] and have found inconclusive results.
Please Note:: I cannot use Lossy compressions like Jpeg because the data after decompression has to be identical to that of the source. Neither I can other lossless algorithms like PNG as they are not supported by the firmware which is responsible for the decompression.
Neither LZW or JBIG are optimal, although JBIG (JBIG2) should give you better results.
LZW is not designed for images (e.g., it does not exploit 2D correlation), and JBIG. JBIG (perhaps you mean JBIG2?) does exploit the 2D correlation, although it is designed for monochrome images such as fax pages.
Of course, results will depend on your particular dataset, so if results are inconclusive the best thing you can do is to test on more images (and perhaps differenciate between color and grayscale images).
If your firmware supports it, I would also test JPEG-LS (https://jpeg.org/jpegls/), which in my experience gives good overall lossless compression performance.
JPEG-LS or JPEG 2000 would give better results. You can think about WebP or JPEG XR as well.
Note: If you want to render compressed image to browser then you may need to take the browser support into account. e.g. JPEG 2000 supported by safari, WebP supported by chrome and android browsers, JPEG-XR supported by IE11 & Edge likewise.

Lossless JPEG file writing

I have a question with regard to JPEG file writing. Suppose I have a PNG file example.png, and I want to change the file format to JPEG without any information loss. For the now being, I have two solutions:
Solution 1: perform the file formatting transformation with MATLAB
I = imread('example.png');
imwrite(I,'example.jpg','Mode','lossless');
II = imread('example.jpg');
differ = I-II;
max(differ(:))
This solution can produce lossless JPEG files. However, the problem with this solution is that
some information in the original image such as the DPI resolution may be lost. Moreover, the
produced output image cannot be viewed by popular image viewers such as IrfanView and Windows
Paint.
Solution 2: use IrfanView software.
Use the "Save as" function of IrfanView program, we can change the file format very easily. However, although I have set 'best quality 100' option when saving the JPEG file, the output image also show some information loss. The difference between these two images are not zero for all the pixels.
I am therefore wondering what I should do in order to solve the problem. Any ideas will be appreciated.
This problem has no solution (yet, as of 2018).
You can't avoid use of lossy compression if you want the JPEG file to be usable in majority of image viewers.
The commonly supported version of JPEG is based on DCT compression, which - by definition - performs transformation and rounding that causes some precision loss.
The alternative, lossless JPEG compression method, JPEG-LS, is rarely supported.
There's also JPEG-XT extension that's a combination of lossy image + layer for reconstruction of the lossless original. It fails gracefully in JPEG image viewers, but it's even newer, and I don't know if it's implemented anywhere yet.
If you really need lossless, use PNG. With JPEG the best you can get is minimally lossy JPEG in RGB colorspace at quality=100 (which isn't literally 100%).

What is the difference between "JPG" / "JPEG" / "PNG" / "BMP" / "GIF" / "TIFF" Image?

I have seen many types of image extensions but have never understood the real differences between them. Are there any links out there that clearly explain their differences?
Are there standards to consider when choosing a particular type of image to use in an application? What do we use for web applications?
Yes. They are different file formats (and their file extensions).
Wikipedia entries for each of the formats will give you quite a bit of information:
JPEG (or JPG, for the file extension; Joint Photographic Experts Group)
PNG (Portable Network Graphics)
BMP (Bitmap)
GIF (Graphics Interchange Format)
TIFF (or TIF, for the file extension; Tagged Image File Format)
Image formats can be separated into three broad categories:
lossy compression,
lossless compression,
uncompressed,
Uncompressed formats take up the most amount of data, but they are exact representations of the image. Bitmap formats such as BMP generally are uncompressed, although there also are compressed BMP files as well.
Lossy compression formats are generally suited for photographs. It is not suited for illustrations, drawings and text, as compression artifacts from compressing the image will standout. Lossy compression, as its name implies, does not encode all the information of the file, so when it is recovered into an image, it will not be an exact representation of the original. However, it is able to compress images very effectively compared to lossless formats, as it discards certain information. A prime example of a lossy compression format is JPEG.
Lossless compression formats are suited for illustrations, drawings, text and other material that would not look good when compressed with lossy compression. As the name implies, lossless compression will encode all the information from the original, so when the image is decompressed, it will be an exact representation of the original. As there is no loss of information in lossless compression, it is not able to achieve as high a compression as lossy compression, in most cases. Examples of lossless image compression is PNG and GIF. (GIF only allows 8-bit images.)
TIFF and BMP are both "wrapper" formats, as the data inside can depend upon the compression technique that is used. It can contain both compressed and uncompressed images.
When to use a certain image compression format really depends on what is being compressed.
Related question: Ruthlessly compressing large images for the web
You should be aware of a few key factors...
First, there are two types of compression: Lossless and Lossy.
Lossless means that the image is made smaller, but at no detriment to the quality. Lossy means the image is made (even) smaller, but at a detriment to the quality. If you saved an image in a Lossy format over and over, the image quality would get progressively worse and worse.
There are also different colour depths (palettes): Indexed color and Direct color.
With Indexed it means that the image can only store a limited number of colours (usually 256) that are chosen by the image author, with Direct it means that you can store many thousands of colours that have not been chosen by the author.
BMP - Lossless / Indexed and Direct
This is an old format. It is Lossless (no image data is lost on save) but there's also little to no compression at all, meaning saving as BMP results in VERY large file sizes. It can have palettes of both Indexed and Direct, but that's a small consolation. The file sizes are so unnecessarily large that nobody ever really uses this format.
Good for: Nothing really. There isn't anything BMP excels at, or isn't done better by other formats.
GIF - Lossless / Indexed only
GIF uses lossless compression, meaning that you can save the image over and over and never lose any data. The file sizes are much smaller than BMP, because good compression is actually used, but it can only store an Indexed palette. This means that there can only be a maximum of 256 different colours in the file. That sounds like quite a small amount, and it is.
GIF images can also be animated and have transparency.
Good for: Logos, line drawings, and other simple images that need to be small. Only really used for websites.
JPEG - Lossy / Direct
JPEGs images were designed to make detailed photographic images as small as possible by removing information that the human eye won't notice. As a result it's a Lossy format, and saving the same file over and over will result in more data being lost over time. It has a palette of thousands of colours and so is great for photographs, but the lossy compression means it's bad for logos and line drawings: Not only will they look fuzzy, but such images will also have a larger file-size compared to GIFs!
Good for: Photographs. Also, gradients.
PNG-8 - Lossless / Indexed
PNG is a newer format, and PNG-8 (the indexed version of PNG) is really a good replacement for GIFs. Sadly, however, it has a few drawbacks: Firstly it cannot support animation like GIF can (well it can, but only Firefox seems to support it, unlike GIF animation which is supported by every browser). Secondly it has some support issues with older browsers like IE6. Thirdly, important software like Photoshop have very poor implementation of the format. (Damn you, Adobe!) PNG-8 can only store 256 colours, like GIFs.
Good for: The main thing that PNG-8 does better than GIFs is having support for Alpha Transparency.
Important Note: Photoshop does not support Alpha Transparency for PNG-8 files. (Damn you, Photoshop!) There are ways to convert Photoshop PNG-24 to PNG-8 files while retaining their transparency, though. One method is PNGQuant, another is to save your files with Fireworks.
PNG-24 - Lossless / Direct
PNG-24 is a great format that combines Lossless encoding with Direct color (thousands of colours, just like JPEG). It's very much like BMP in that regard, except that PNG actually compresses images, so it results in much smaller files. Unfortunately PNG-24 files will still be much bigger than JPEGs, GIFs and PNG-8s, so you still need to consider if you really want to use one.
Even though PNG-24s allow thousands of colours while having compression, they are not intended to replace JPEG images. A photograph saved as a PNG-24 will likely be at least 5 times larger than a equivalent JPEG image, which very little improvement in visible quality. (Of course, this may be a desirable outcome if you're not concerned about filesize, and want to get the best quality image you can.)
Just like PNG-8, PNG-24 supports alpha-transparency, too.
Generally these are either:
Lossless compression
Lossless compression algorithms reduce file size without losing image quality, though they are not compressed into as small a file as a lossy compression file. When image quality is valued above file size, lossless algorithms are typically chosen.
Lossy compression
Lossy compression algorithms take advantage of the inherent limitations of the human eye and discard invisible information. Most lossy compression algorithms allow for variable quality levels (compression) and as these levels are increased, file size is reduced. At the highest compression levels, image deterioration becomes noticeable as "compression artifacting". The images below demonstrate the noticeable artifacting of lossy compression algorithms; select the thumbnail image to view the full size version.
Each format is different as described below:
JPEG
JPEG (Joint Photographic Experts Group) files are (in most cases) a lossy format; the DOS filename extension is JPG (other OS might use JPEG). Nearly every digital camera can save images in the JPEG format, which supports 8 bits per color (red, green, blue) for a 24-bit total, producing relatively small files. When not too great, the compression does not noticeably detract from the image's quality, but JPEG files suffer generational degradation when repeatedly edited and saved. Photographic images may be better stored in a lossless non-JPEG format if they will be re-edited, or if small "artifacts" (blemishes caused by the JPEG's compression algorithm) are unacceptable. The JPEG format also is used as the image compression algorithm in many Adobe PDF files.
TIFF
The TIFF (Tagged Image File Format) is a flexible format that normally saves 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-bit totals, respectively, using either the TIFF or the TIF filenames. The TIFF's flexibility is both blessing and curse, because no single reader reads every type of TIFF file. TIFFs are lossy and lossless; some offer relatively good lossless compression for bi-level (black&white) images. Some digital cameras can save in TIFF format, using the LZW compression algorithm for lossless storage. The TIFF image format is not widely supported by web browsers. TIFF remains widely accepted as a photograph file standard in the printing business. The TIFF can handle device-specific colour spaces, such as the CMYK defined by a particular set of printing press inks.
PNG
The PNG (Portable Network Graphics) file format was created as the free, open-source successor to the GIF. The PNG file format supports truecolor (16 million colours) while the GIF supports only 256 colours. The PNG file excels when the image has large, uniformly coloured areas. The lossless PNG format is best suited for editing pictures, and the lossy formats, like JPG, are best for the final distribution of photographic images, because JPG files are smaller than PNG files. Many older browsers currently do not support the PNG file format, however, with Internet Explorer 7, all contemporary web browsers fully support the PNG format. The Adam7-interlacing allows an early preview, even when only a small percentage of the image data has been transmitted.
GIF
GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the GIF format suitable for storing graphics with relatively few colors such as simple diagrams, shapes, logos and cartoon style images. The GIF format supports animation and is still widely used to provide image animation effects. It also uses a lossless compression that is more effective when large areas have a single color, and ineffective for detailed images or dithered images.
BMP
The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows OS. Typically, BMP files are uncompressed, hence they are large; the advantage is their simplicity, wide acceptance, and use in Windows programs.
Use for Web Pages / Web Applications
The following is a brief summary for these image formats when using them with a web page / application.
PNG is great for IE6 and up (will require a small CSS patch to get transparency working well). Great for illustrations and photos.
JPG is great for photos online
GIF is good for illustrations when you do not wish to move to PNG
BMP shouldn't be used online within web pages - wastes bandwidth
Source: Image File Formats
Since others have covered the differences, I'll hit the uses.
TIFF is usually used by scanners. It makes huge files and is not really used in applications.
BMP is uncompressed and also makes huge files. It is also not really used in applications.
GIF used to be all over the web but has fallen out of favor since it only supports a limited number of colors and is patented.
JPG/JPEG is mainly used for anything that is photo quality, though not for text. The lossy compression used tends to mar sharp lines.
PNG isn't as small as JPEG but is lossless so it's good for images with sharp lines. It's in common use on the web now.
Personally, I usually use PNG everywhere I can. It's a good compromise between JPG and GIF.
JPG > Joint Photographic Experts Group
1 JPG images support 16 million colors and are best suited for photographs and complex graphics
2 JPGs do not support transparency.
PNG > Portable Network Graphics
1 It's used as an alternative to the GIF file format when the GIF technology was copyrighted and required permission to use.
2 PNGs allow for 5 to 25 percent greater compression than GIFs, and with a wider range of colors. PNGs use two-dimensional interlacing, which makes them load twice as fast as GIF images.”
3 Image that has a lot of colors or requires advanced variable transparency, PNG is the preferred file type.
GIF > Graphics Interchange Format
1 Reduces the number of colors in an image to 256.
2 GIFs also support transparency.
3 GIFs have the unique ability to display a sequence of images, similar to videos, called an animated GIF.
4 If the image has few colors and does not require any advanced alpha transparency effect, GIF is the way to go.
SVG > Scalable Vector Graphics
1 SVGs are a web standard based on XML that describe both static images and animations in two dimensions.
2 SVG allows you to create very high-quality graphics and animations that do not lose detail as their size increases/decreases.
These names refers to different ways to encode pixel image data (JPG and JPEG are the same thing, and TIFF may just enclose a jpeg with some additional metadata).
These image formats may use different compression algorithms, different color representations, different capability in carrying additional data other than the image itself, and so on.
For web applications, I'd say jpeg or gif is good enough. Jpeg is used more often due to its higher compression ratio, and gif is typically used for light weight animation where a flash (or something similar) is an over kill, or places where transparent background is desired. PNG can be used too, but I don't have much experience with that. BMP and TIFF probably are not good candidates for web applications.
What coobird and Gerald said.
Additionally, JPEG is the file format name. JPG is commonly used abbreviated file extension for this format, as you needed to have a 3-letter file extension for earlier Windows systems. Likewise with TIFF and TIF.
Web browsers at the moment only display JPEG, PNG and GIF files - so those are the ones that can be shown on web pages.
PNG supports alphachannel transparency.
TIFF can have extended options I.e.
Geo referencing for GIS applications.
I recommend only ever using JPEG for photographs, never for images like clip art, logos, text, diagrams, line art.
Favor PNG.
The named ones are all raster graphics, but beside that don't forget the more and more important vectorgraphics.
There are compressed and uncompressed types (in a more or less way), but they're all lossless. Most important are:
SVG / SVGZ
EPS
EMF / (WMF)
The file extension tells you how the image is saved. Some of those formats just save the bits as they are, some compress the image in different ways, including lossless and lossy methods. The Web can tell you, although I know some of the patient responders will outline them here.
The web favors gif, jpg, and png, mostly. JPEG is the same (or very close) to jpg.
For the specified difference and usage between the varies of image formats have a good discussion above already.
However, I want to add something for the overall process of capturing a picture and storing them.
The capturing process
Or you can say the construct process(as we can draw or make pictures with computers now). If you take a photograph with a camera, you are already using lots of sensors(CCD or CMOS) and algorithms(Bayer Pattern Filter, Sub-sampling and quantization, etc.) Also there are stuff like Pixel Format and Color Space. After you got the basic pixel information, there must be a way for storing them.
The basic image file structure
For storing the pixels info a file, we need a convention and related algorithms. For saving space, there are compression, but basically problem is encoding the pixels to bytes and decoding the bytes to pixels for display.
A typical image file may be consisted by several parts, basically two:meta data or file header and pixel data section. The meta data tells about the image itself, maybe height and width, file format, etc. And the pixel data section is the real section who deals with the real picture.
Storing and Displaying
As we said earlier, files are stored in hard disk and are in bytes/bits. So image files have no priority but also bytes stream actually. For displaying, maybe we should get something to know how monitor works. Typical PC monitors use RGB model for displaying.
Hope this helps:-)

Resources