What are the state-of-the-art algorithms when it comes to compressing digital images (say, for instance, color photos of around 800×480 pixels)?
Some of the formats that are frequently discussed as possible JPEG successors are:
JPEG XR (aka HD Photo, Windows Media Photo). According to a study by the Graphics and Media Lab at Moscow State University (MSU), image quality is comparable to JPEG 2000 and significantly better than JPEG, and compression efficiency is comparable to JPEG 2000.
WebP is already tested in the wild, mainly on Google properties, where the format is served exclusively to Chrome users (if you connect with a different browser, you get PNG or JPG images instead). It's very web-oriented.
HEVC-MSP. In a study by Mozilla Corporation (October 2013), HEVC-MSP performed best in most tests, and in the tests where it was not best, it came in second to the original JPEG format (but the study only looked at image compression efficiency and not at other metrics that matter: feature sets, run-time performance, licensing...).
JPEG 2000. The most computationally intensive to encode/decode. Compared with the regular JPEG format, it offers advantages such as support for higher bit depths, more advanced compression, and a lossless compression option. It is the standard term of comparison for the others, but it has been a bit slow to gain acceptance.
Anyway, JPEG encoders still haven't reached their full compression potential even after 20+ years. Even within the constraints of strong compatibility requirements, there are projects (e.g. Mozilla's mozjpeg or Google's Guetzli) that can produce smaller JPG files without sacrificing quality.
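If you just want to get a feel for how much headroom an ordinary encoder still has, a quick experiment is to re-save an existing JPEG with optimized Huffman tables and progressive scan order. The sketch below uses Pillow rather than mozjpeg or Guetzli (so it only illustrates the general idea, and the decode/re-encode step is itself a small lossy generation); the file names are placeholders.

```python
from PIL import Image  # Pillow; not mozjpeg or Guetzli, just the same general idea

im = Image.open("photo.jpg")            # hypothetical input JPEG
im.save("photo_smaller.jpg", "JPEG",
        quality=85,                     # pick whatever quality matches your needs
        optimize=True,                  # per-image optimized Huffman tables
        progressive=True)               # progressive scans often shrink large photos
```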
It would depend on what you need to do with the encoded images of course. For webpages and small sizes, lossy compression systems may be suitable, but for satellite images, medical images etc. lossless compression may be required.
None of the formats mentioned above satisfy both situations. Not all of the above formats support every pixel format either, so they cannot be compared like for like.
I've been doing my own research into lossless compression for high bit depth images, and what I've found so far is that a Huffman coder with suitable reversible pre-compression filtering beats JPEG 2000 and JPEG XR in terms of file size by 56% on average (i.e., it makes files less than half the size) on real-world cinematic footage, and it is faster. It also beats FFV1 in the limited tests I've conducted, producing files under half the size even after FFV1 has truncated the source pixel depth from 16 bits to 10 bits. Really surprising.
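For anyone curious what "reversible pre-compression filtering" means in practice, here is a minimal sketch of the general idea (not my actual coder): a horizontal delta filter applied before a generic entropy coder, with zlib standing in for the Huffman stage and a synthetic gradient standing in for real footage.

```python
import zlib
import numpy as np

def delta_filter(img: np.ndarray) -> np.ndarray:
    """Reversible horizontal prediction: each pixel becomes the difference
    from its left neighbour (wrapping modulo 2^16 for uint16 samples)."""
    out = img.copy()
    out[:, 1:] = img[:, 1:] - img[:, :-1]
    return out

def inverse_delta_filter(filtered: np.ndarray) -> np.ndarray:
    # cumulative sum undoes the differencing, wrapping the same way
    return np.cumsum(filtered, axis=1, dtype=filtered.dtype)

# synthetic 16-bit gradient as a stand-in for a smooth real-world frame
y, x = np.mgrid[0:480, 0:800]
frame = ((x * 70 + y * 40) % 65536).astype(np.uint16)

raw = zlib.compress(frame.tobytes(), 9)
filtered = zlib.compress(delta_filter(frame).tobytes(), 9)
print(f"raw: {len(raw)} bytes, filtered: {len(filtered)} bytes")

assert np.array_equal(inverse_delta_filter(delta_filter(frame)), frame)  # lossless round trip
```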
For lossless compression ratios, FLIF is ranked number one for me, but encoding times are astronomical. I've never made a file smaller than the corresponding FLIF file, so good things come to those who wait. FLIF uses machine learning to achieve its compression ratios. Applying a lossy pre-compression filter to images before FLIF compression (something the encoder enables) creates visually lossless images that compete with the best lossy encoders, but with the advantage that repeatedly re-encoding the output files won't further reduce quality (as the encoder is lossless).
One thing that is obvious to me: nothing is really state of the art currently. Most formats use old technology, designed at a time when memory and processing power were at a premium. As far as lossless compression goes, FLIF is a big jump forward, but it's an area of research that is wide open. Most research seems to go into lossy compression systems.
Related
I have this image (a photo taken by me on an SGS 9 Plus): an uncompressed JPG image. Its dimensions are 4032 × 3024 and it weighs around 3 MB. I compressed it with TinyJPG Compressor and it came out at 1.3 MB. For PNG images I used Online-Convert, and there I saw WebP images much smaller even than those compressed with TinyPNG. I expected something similar here, especially since I read an article, JPG to WebP – Comparing Compression Sizes, where WebP is much smaller than compressed JPG.
But when I convert my JPG to WebP format in various online image conversion tools, I see sizes of 1.5-2 MB, so the file is bigger than my compressed JPG. Am I missing something? Shouldn't WebP be much smaller than compressed JPG? Thank you in advance for every answer.
These are lossy codecs, so their file size mostly depends on the quality setting used. Comparing file sizes from various tools doesn't tell you anything unless you ensure the images have the same quality (otherwise they're simply not comparable).
There are a couple of possibilities:
JPEG may compress better than WebP. WebP has problems with blurring out details, low-resolution color, and using less than the full 8 bits of the color space. At the higher end of the quality range, a well-optimized JPEG can be similar to or better than WebP.
However, most of the file size difference between modern lossy codecs comes down to quality. The typical difference between JPEG and WebP at the same quality is 15%-25%, but the file sizes produced by each codec can easily differ by 10× between a low-quality and a high-quality version of the same image. So most of the time when you see a huge difference in file sizes, it's probably because the tools have chosen different quality settings (and/or recompression has lost fine details in the image, which also greatly affects file size). Even a visual difference too small for the human eye to notice can cause a noticeable difference in file size.
My experience is that lossy WebP is superior below quality 70 (in libjpeg terms) and JPEG is often better than WebP at quality 90 and above. In between these qualities it doesn't seem to matter much.
I believe WebP qualities are inflated by about 7 points, i.e., to match JPEG quality 85 one needs to use WebP quality 92 (when using the cwebp tool). I didn't measure this carefully; it's based on rather ad hoc experiments and some butteraugli runs.
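If you want to sanity-check that kind of offset on your own photos, a simple approach is to encode the same lossless source at both settings and compare the sizes (then judge the visual quality yourself or with a metric such as butteraugli). The snippet assumes a Pillow build with WebP support; the file name and quality values are just placeholders.

```python
import io
from PIL import Image  # assumption: Pillow was built with WebP support

def encoded_size(img: Image.Image, fmt: str, **opts) -> int:
    buf = io.BytesIO()
    img.save(buf, fmt, **opts)          # encode in memory, no temp files
    return buf.tell()

im = Image.open("photo.png")            # hypothetical lossless original
jpeg_bytes = encoded_size(im, "JPEG", quality=85, optimize=True)
webp_bytes = encoded_size(im, "WEBP", quality=92)   # ~7-point offset suggested above
print(f"JPEG q85: {jpeg_bytes} bytes, WebP q92: {webp_bytes} bytes")
```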
Lossy WebP has difficulty compressing complex textures densely, such as the leaves of trees, whereas JPEG's difficulties are with thin lines against flat backgrounds, like a telephone line hanging against the sky, or computer graphics.
Which lossless compression algorithm [LZW or JBIG] is better for compressing data sets consisting of images (color and monochrome)?
I have implemented both and tested them on smaller data sets [each containing 100 images], and the results were inconclusive.
Please note: I cannot use lossy compression like JPEG because the data after decompression has to be identical to the source. Nor can I use other lossless algorithms like PNG, as they are not supported by the firmware that is responsible for decompression.
Neither LZW nor JBIG is optimal, although JBIG (JBIG2) should give you better results.
LZW is not designed for images (e.g., it does not exploit 2D correlation). JBIG (perhaps you mean JBIG2?) does exploit the 2D correlation, although it is designed for monochrome images such as fax pages.
Of course, results will depend on your particular dataset, so if the results are inconclusive, the best thing you can do is to test on more images (and perhaps differentiate between color and grayscale images).
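A harness like the following may help keep that experiment honest: run every candidate codec over the whole dataset and report the average ratio separately for color and monochrome images. zlib is only a placeholder here; swap in wrappers around your actual LZW and JBIG implementations, and the "dataset" folder name is hypothetical.

```python
import zlib
from pathlib import Path
from PIL import Image

def ratio(compress, pixels: bytes) -> float:
    return len(pixels) / len(compress(pixels))

# zlib is a stand-in; add wrappers around your LZW and JBIG codecs here
compressors = {"zlib (placeholder)": lambda b: zlib.compress(b, 9)}

results = {}
for path in sorted(Path("dataset").iterdir()):      # hypothetical folder containing only test images
    im = Image.open(path)
    kind = "monochrome" if im.mode in ("1", "L") else "color"
    for name, comp in compressors.items():
        results.setdefault((name, kind), []).append(ratio(comp, im.tobytes()))

for (name, kind), ratios in sorted(results.items()):
    print(f"{name:20s} {kind:10s} avg ratio {sum(ratios) / len(ratios):.2f}:1")
```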
If your firmware supports it, I would also test JPEG-LS (https://jpeg.org/jpegls/), which in my experience gives good overall lossless compression performance.
JPEG-LS or JPEG 2000 would give better results. You can think about WebP or JPEG XR as well.
Note: If you want to render the compressed image in a browser, you may need to take browser support into account, e.g. JPEG 2000 is supported by Safari, WebP by Chrome and Android browsers, and JPEG XR by IE11 & Edge.
I'm looking for a way to compress DICOM files and send them to a remote server (Node.js in my case).
I tried bz2 compression and it seems to work very well on large DICOM files (I tested it with a 10 MB file, which gave me a 5 MB compressed file).
When it comes to small files (like a 250 KB file), the size is reduced by only a few KB (5 to 10 KB in most cases), which isn't worth it.
Can someone please explain why bz2 works so well on large DICOM files, and is there a better way to compress DICOM files so I can send them over the internet?
Thanks in advance.
If you want to compress a DICOM dataset with images, the recommendation is to use one of the compression types supported by the DICOM Standard. These include lossy and lossless JPEG, JPEG 2000, JPEG-LS, and RLE, to name a few. The Standard also supports encoding extended grayscale (12-16 bit grayscale) using these standard-based compression techniques.
The Transfer Syntax element (0002,0010) will indicate whether the image in the DICOM dataset is already compressed. For example, recompressing an already compressed image will appear to achieve a lower compression ratio than compressing the original. So the best way to measure is to compare against the original uncompressed image. If your original image is already compressed, you can calculate the uncompressed image size as (Rows × Columns × Bits Allocated / 8 × Samples Per Pixel × Number of Frames). Also, the compression ratio will vary based on the image type (color vs. grayscale) and the compression technique used. Typically, you will get much better compression when dealing with a true-color image than with a grayscale image such as an X-ray.
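As a sketch of that calculation (assuming the pydicom package is available; the file name is a placeholder):

```python
import pydicom  # assumption: the pydicom package is installed

ds = pydicom.dcmread("image.dcm")                       # hypothetical DICOM file
print("Transfer Syntax:", ds.file_meta.TransferSyntaxUID)

frames = int(getattr(ds, "NumberOfFrames", 1))          # single-frame files omit this element
uncompressed = ds.Rows * ds.Columns * (ds.BitsAllocated // 8) * ds.SamplesPerPixel * frames
print("Uncompressed pixel data size:", uncompressed, "bytes")
```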
As for using HTTP to upload the file, you can also use a DICOM-standard-defined service such as the DICOMweb STOW-RS REST service.
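For example, assuming the dicomweb-client and pydicom packages and a hypothetical STOW-RS endpoint, an upload could look roughly like this (treat the API usage as an assumption and check it against the library's documentation):

```python
import pydicom
from dicomweb_client.api import DICOMwebClient   # assumption: dicomweb-client is installed

client = DICOMwebClient(url="https://pacs.example.com/dicomweb")  # hypothetical endpoint
ds = pydicom.dcmread("image.dcm")                                 # hypothetical file
client.store_instances(datasets=[ds])                             # STOW-RS store request
```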
I work for LEAD Technologies, and if you want to test various compressions on your DICOM files, we have a demo exe (Transfer Syntax) that ships with our free 60-day evaluation SDK that you can use for testing. There is also a demo for testing the DICOMweb REST services. You can download the evaluation copy from our web site.
There is not one solution that perfectly fits all...
BZ2 is based on the principle that "colours" (or gray values, but I will use "colours" in this explanation) which occur frequently in the image are encoded with fewer bits than colours which are rare. Thus, as a rule of thumb: the bigger the image, the better the compression ratio.
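An easy way to see how strongly the ratio depends on file size for your own data is to compress a range of sample files and print the ratios, e.g. (the folder name is a placeholder, and bz2 here is Python's standard library module):

```python
import bz2
from pathlib import Path

# Compress each sample file and print size vs. ratio, smallest file first.
for path in sorted(Path("dicom_samples").glob("*.dcm"), key=lambda p: p.stat().st_size):
    data = path.read_bytes()
    compressed = bz2.compress(data, 9)
    print(f"{path.name}: {len(data)} -> {len(compressed)} bytes "
          f"(ratio {len(data) / len(compressed):.2f}:1)")
```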
JPEG is a different approach which decomposes the image into small blocks (8×8 pixel tiles) and optimizes the encoding for each block. Thus the compression ratio is less dependent on the image size than it is for BZ2. JPEG comes in different flavors (lossy, lossless, and JPEG 2000, which can create different serializations of the compressed data for different purposes, e.g. progressive refinement).
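To illustrate what "per-block" means, here is a rough sketch of the lossy JPEG idea (not a real encoder, and it assumes NumPy and SciPy): each 8×8 block gets a 2-D DCT and coarse quantization, and the savings come from how many coefficients end up zero in each block, regardless of how large the whole image is.

```python
import numpy as np
from scipy.fft import dctn   # assumption: SciPy is available

y, x = np.mgrid[0:480, 0:800].astype(np.float32)
img = 128.0 * np.sin(x / 50.0) * np.cos(y / 60.0)   # smooth placeholder image

q = 16.0          # single flat quantization step, a stand-in for JPEG's quantization table
zeros = total = 0
for by in range(0, img.shape[0], 8):
    for bx in range(0, img.shape[1], 8):
        coeffs = dctn(img[by:by + 8, bx:bx + 8], norm="ortho")  # 2-D DCT of one 8x8 block
        quantized = np.round(coeffs / q)                        # coarse quantization drops detail
        zeros += np.count_nonzero(quantized == 0)
        total += quantized.size

print(f"{100.0 * zeros / total:.1f}% of quantized coefficients are zero")
```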
Less popular compression algorithms which are valid in DICOM but not widely supported by DICOM products are:
RLE (Run Length Encoding) - the pixel data is described by pairs of colour and number of pixels, so it compresses very well when you have large homogeneous areas in the image. In all other cases it tends to increase the size of the "compressed" image (see the sketch after this list).
JPEG-LS - I do not know how it works internally, but it provides a lossless algorithm and a lossy algorithm in which you can control the loss of information (the maximum difference of a pixel value after compression from the original value). It is said to achieve better ratios than traditional JPEG, but as it is not widely supported, I have not used it in practice yet.
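Here is a toy sketch of the run-length idea referenced above (plain value/run pairs, not the exact DICOM RLE transfer syntax, which works on byte segments with literal and repeat runs):

```python
def rle_encode(pixels: bytes) -> list:
    """Encode a byte string as (value, run_length) pairs."""
    runs = []
    for b in pixels:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((b, 1))               # start a new run
    return runs

def rle_decode(runs) -> bytes:
    return b"".join(bytes([value]) * length for value, length in runs)

flat = bytes([0]) * 1000 + bytes([255]) * 1000   # large homogeneous areas: only 2 runs
noisy = bytes(range(256))                        # no repeated neighbours: one run per byte
print(len(rle_encode(flat)), len(rle_encode(noisy)))
assert rle_decode(rle_encode(flat)) == flat      # lossless round trip
```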
If you do not want to select the compression algorithm depending on the image type, JPEG-Lossless is probably a good compromise for you. In typical medical images, it achieves an average compression ratio of roughly 1:, a bit more with JPEG-2000.
I'd like to batch convert my entire photo library from NEF/RAW format to a more suitable format for storage. By that I mean I would like to keep the higher bit-depth data, potentially for future 'developing', but with a smaller file footprint. I realize I could just zip them into an archive, but I would prefer they stay in a browsable format.
I'm currently considering going with JPEG XR (i.e. HD Photo) since it supports HDR bit depths (giving me some good room to change exposures in the future) and decent enough lossy and lossless compression (though I'm not sure HDR will work with lossy). I'm also aware of the WebP format, but while its compression and quality are phenomenal, it will not work for storing my HDR data intact. I realize the demosaicing data is gone if I don't use NEF/RAW, but that's a compromise I'm willing to make as long as I can keep the higher bit-depth data. I've considered TIFF as well but ruled it out because it only supports lossless compression.
Can anyone recommend an alternative to these formats or perhaps comment on their own experience with the JXR format, specifically using the MS JXRlib?
I have a good understanding of pros and cons of different image formats for web use.
However, I'm trying to decide what format to use for a desktop application.
I have a potentially large number of high-resolution images (with no transparency) to deploy. I'm mainly weighing JPG vs. PNG, but am open to other formats.
My understanding:
JPG is more compressed, which means smaller file size, but probably lower image quality. Because they are more compressed, they take more time to decompress.
PNG files are larger, but maintain image quality. Because they are less compressed, they decompress faster.
Both occupy the same amount of RAM once loaded and decompressed.
Seems that PNG is a better option, given that HD space (i.e. application size) is not an issue, because it will decompress and appear on-screen faster, and maintain higher image quality.
Are my assumptions generally correct? Are there any nuances I'm overlooking? Any other image file formats worth considering?
Your assumptions are roughly correct.
Because [JPG] are more compressed, they take more time to decompress.
Not exactly. A JPG supports distinct levels of compression, and the time to decompress depends on the algorithm itself, which is slightly more complex than PNG's. However, decompression speed is rarely an issue, and in any case it depends wildly on the decoder implementation.
Seems that PNG is a better option, given that HD space (i.e. application size) is not an issue, because it will decompress and appear on-screen faster, and maintain higher image quality.
Maybe. PNG is definitely better if your program is going to read-modify-write the images; JPG is not advisable in this scenario (unless you use lossless JPG). If, instead, the images are read-only, the difference is less important. Notice that for high-resolution photographic images the compression ratios can be quite different; and even if you are not worried about HD space, bigger files can be slower to read because of I/O performance.
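If you want to see the read-modify-write problem for yourself, a small, purely illustrative experiment (file name is a placeholder) is to re-encode the same image repeatedly and watch the error against the original grow:

```python
import io
from PIL import Image, ImageChops

original = Image.open("original.png").convert("RGB")   # hypothetical lossless source
im = original
for generation in range(1, 11):
    buf = io.BytesIO()
    im.save(buf, "JPEG", quality=85)    # one lossy save per generation
    buf.seek(0)
    im = Image.open(buf).convert("RGB")
    worst = max(high for _low, high in ImageChops.difference(original, im).getextrema())
    print(f"generation {generation}: max per-channel error {worst}")
```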
I would go with JPEG.
The size is small compared to other formats, and you can compress it in high-quality mode so it would be very hard to notice any JPEG artifacts. Regarding decompression: since most of the decompression procedure is math, and the CPU runs much faster than memory, you may be surprised to hear that in many cases decompressing a JPEG is faster than reading a PNG from disk and displaying it.
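A quick way to check this on your own assets is a rough micro-benchmark like the one below (assuming Pillow and two placeholder files containing the same picture; note that after the first pass the OS file cache hides most of the disk I/O, so cold-start numbers will differ):

```python
import time
from PIL import Image

def decode_time(path: str) -> float:
    start = time.perf_counter()
    with Image.open(path) as im:
        im.load()                       # force the actual decode; open() is lazy
    return time.perf_counter() - start

for path in ("photo.jpg", "photo.png"):            # hypothetical exports of the same picture
    times = [decode_time(path) for _ in range(20)]
    print(f"{path}: best of 20 runs {min(times) * 1000:.1f} ms")
```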