Estimating JPEG file size

I'm writing an application that needs to compress and scale an image before sending it as an MMS, which has a maximum file size of 600 KB. I'm looking for an algorithm that will help estimate the maximum surface area (in pixels) of a JPEG with a file size under 600 KB, accounting for a reduction in quality by some percentage. Thanks.

The compression ratio achieved at a specific quality level is pretty consistent. For example, you may find that at a quality level of 80 with color subsampling, you get 14:1 compression. You can run some tests with your JPEG encoder at various quality levels to see the compression ratio achieved, and then work backwards to find the largest image size you can compress at a given quality level.
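A calibration sketch along these lines in Python with Pillow (sample.jpg stands in for a representative image from your own application, and 4:2:0 subsampling is requested explicitly):

from io import BytesIO
from PIL import Image

MAX_BYTES = 600 * 1024  # MMS limit

img = Image.open("sample.jpg").convert("RGB")
pixels = img.width * img.height

for quality in (90, 80, 70, 60, 50):
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=quality, subsampling=2)  # 2 = 4:2:0
    bytes_per_pixel = buf.tell() / pixels
    print(f"quality {quality}: {bytes_per_pixel:.3f} bytes/pixel, "
          f"max area ~{int(MAX_BYTES / bytes_per_pixel):,} pixels")

Once you know the bytes-per-pixel figure for your chosen quality level, the largest width * height you can send is simply the 600 KB budget divided by that figure.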

Related

Image file size to area ratio

I'm writing a tool to detect images on our website that should be flagged for manual intervention to reduce file size. If a "large" image is 100K that might be fine, but if a "small" image is 100K, someone forgot to flatten it or compress it.
I'm looking at the "file density" of an image as the ratio filesize/(height x width). Is there a term for this? Is there some guidance about what a reasonable range for this density should be, so that I can flag images? Or am I thinking about this wrong?
Yes, if the file size is given in bits, then that fraction is known as the bitrate in bits per pixel (bpp), as sascha points out. For example, an uncompressed image is usually 24-bit (8 bits/channel * 3 channels (R, G, B)). Anything at this or higher bitrates is (most often) not compressed.
In general, lossless compression can be achieved at bitrates of about 12bpp (a 2:1 compression ratio). Normally you can aim at much lower bitrates (e.g., 1 bit per pixel, 24:1 compression ratio) and expect decent quality, but it'll depend on the images you're dealing with.
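As a concrete sketch of the flagging idea in Python with Pillow (the file names and the 1.0 bpp threshold are illustrative placeholders, not a standard):

import os
from PIL import Image

def bits_per_pixel(path):
    with Image.open(path) as img:
        return os.path.getsize(path) * 8 / (img.width * img.height)

for path in ("hero.jpg", "thumbnail.png"):  # placeholder file names
    bpp = bits_per_pixel(path)
    if bpp > 1.0:  # tune this threshold to your own content
        print(f"{path}: {bpp:.2f} bpp - flag for manual review")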

Uncertainty in L,a,b space of compressed JPEG images

My team wish to calculate the contrast between two photographs taken in a wet environment.
We will calculate contrast using the formula
Contrast = SQRT((ΔL)^2 + (Δa)^2 + (Δb)^2)
where ΔL is the difference in luminosity, Δa is the difference in (redness-greenness) and Δb is the difference in (yellowness-blueness), which are the dimensions of Lab space.
Our (so far successful) approach has been to convert each pixel from RGB to Lab space and to take the mean values of the relevant sections of the image as our A and B variables.
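For concreteness, here is a minimal sketch of this approach in Python, assuming scikit-image for the RGB-to-Lab conversion (the file name and region slices are placeholders):

import numpy as np
from skimage import io
from skimage.color import rgb2lab

lab = rgb2lab(io.imread("photo.jpg")[..., :3])  # drop any alpha channel

region_a = lab[100:200, 100:200]  # placeholder regions of interest
region_b = lab[300:400, 300:400]

mean_a = region_a.reshape(-1, 3).mean(axis=0)  # mean (L, a, b) of region A
mean_b = region_b.reshape(-1, 3).mean(axis=0)  # mean (L, a, b) of region B

contrast = np.sqrt(((mean_a - mean_b) ** 2).sum())
print(f"Contrast = {contrast:.2f}")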
However the environment limits us to using a (waterproof) GoPro camera which compresses images to JPEG format, rather than saving as TIFF, so we are not using a true-colour image.
We now need to quantify the uncertainty in the contrast - for which we need to know the uncertainty in A and B and by extension the uncertainties (or mean/typical uncertainty) in each a and b value for each RGB pixel. We can calculate this only if we know the typical/maximum uncertainty produced when converting from true-colour to JPEG.
Therefore we need to know the maximum possible difference in each of the RGB channels when saving in JPEG format.
E.g. if the true-colour RGB pixel (5, 7, 9) became (2, 9, 13) after compression, the uncertainty in each channel would be (+/- 3, +/- 2, +/- 4).
We believe that the camera subsamples the colour channels in the ratio 4:2:0 - is there a way to test this?
However our main question is; is there any way of knowing the maximum possible error in each channel, or calculating the uncertainty from the compressed RGB result?
Note: We know it is impossible to convert back from JPEG to TIFF as JPEG compression is lossy. We merely need to quantify the extent of this loss on colour.
In short, it is not possible to absolutely quantify the maximum possible difference in digital counts in a JPEG image.
You highlight one of these points well already. When image data is encoded using the JPEG standard, it is first converted to the YCbCr color space.
Once in this color space, the chroma channels (Cb and Cr) are downsampled, because the human visual system is less sensitive to artifacts in chroma information than it is to lightness information.
The error introduced here is content-dependent; an area of very rapidly varying chroma and hue will have considerably more content loss than an area of constant hue/chroma.
Even knowing the 4:2:0 subsampling, which describes the amount and geometry of downsampling, the content still dictates the error introduced at this step.
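As for testing the 4:2:0 assumption: the sampling factors are recorded in the file's Start-of-Frame segment, so one option is to read them directly. A minimal marker-walking sketch in Python (it ignores some uncommon encodings, and gopro_frame.jpg is a placeholder file name):

import struct

def sampling_factors(path):
    with open(path, "rb") as f:
        data = f.read()
    i = 2  # skip the SOI marker (FF D8)
    while i < len(data) - 1:
        if data[i] != 0xFF or data[i + 1] == 0xFF:  # not a marker / fill byte
            i += 1
            continue
        marker = data[i + 1]
        if marker in (0xC0, 0xC1, 0xC2):  # baseline/extended/progressive SOF
            n = data[i + 9]  # number of components
            factors = {}
            for c in range(n):
                comp_id = data[i + 10 + 3 * c]
                sampling = data[i + 11 + 3 * c]
                factors[comp_id] = (sampling >> 4, sampling & 0x0F)  # (H, V)
            return factors
        length = struct.unpack(">H", data[i + 2:i + 4])[0]
        i += 2 + length
    return None

print(sampling_factors("gopro_frame.jpg"))

If component 1 (Y) reports (2, 2) while components 2 and 3 (Cb, Cr) report (1, 1), the camera is indeed producing 4:2:0 files.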
Another problem is the quantization performed in JPEG compression.
The resulting information is encoded using a Discrete Cosine Transform. In the transformed space, the results are again quantized depending on the desired quality. This quantization is set at the time of file generation, which is performed in-camera. Again, even if you knew the exact DCT quantization being performed by the camera, the actual effect on RGB digital counts is ultimately content-dependent.
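To make the quantization step concrete, here is a small numerical illustration in Python with NumPy/SciPy. The table is the example luminance table from Annex K of the JPEG spec; a camera will typically use its own scaled tables, so treat this only as a demonstration of the mechanism:

import numpy as np
from scipy.fft import dctn, idctn

Q_LUMA = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

block = np.random.default_rng(0).integers(0, 256, (8, 8)) - 128  # centred samples
coeffs = dctn(block.astype(float), norm="ortho")
quantized = np.round(coeffs / Q_LUMA)                # the lossy step
restored = idctn(quantized * Q_LUMA, norm="ortho")
print("max error:", np.abs(block - restored).max())  # content-dependent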
Yet another difficulty is noise created by DCT block artifacts, which (again) is content dependent.
These scene dependencies make the algorithm very good for visual image compression, but very difficult to characterize absolutely.
However, there is some light at the end of the tunnel. JPEG compression will cause significantly more error in areas of rapidly changing image content. Areas of constant color and texture will have significantly less compression error and artifacts. Depending on your application you may be able to leverage this to your benefit.

What are the steps in which loss takes place in jpeg compression?

JPEG is a lossy image compression method which can give a high compression ratio.
As far as I know, information loss takes place in JPEG during quantization.
Are there any other steps in JPEG compression where the loss takes place or can take place?
If it takes place, then where?
There are 3 aspects of JPEG compression which affect the quality and accuracy of images:
1) Loss of precision takes place during the quantization stage. Accuracy of the colors is lost in order to reduce the amount of data generated.
2) Errors are introduced during the conversion to/from the RGB/YCC color spaces (see the round-trip sketch after this list).
3) Errors are introduced during the transformation to/from the frequency domain. The Discrete Cosine Transform converts pixels into the frequency domain. This conversion incurs errors in both directions.
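A small demonstration of point 2 in Python with Pillow and NumPy: converting uint8 RGB to YCbCr and back does not always return the original values, because the intermediate results are rounded to integers:

import numpy as np
from PIL import Image

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)

img = Image.fromarray(rgb, "RGB")
round_trip = np.asarray(img.convert("YCbCr").convert("RGB"))

# A non-zero result shows the round trip is not exact.
print("max per-channel error:", np.abs(rgb.astype(int) - round_trip.astype(int)).max())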
Another place where loss can take place in JPEG compression is the chroma subsampling stage.
My understanding is that most JPEG-compressed images use 4:2:0 color subsampling: after converting each pixel from RGB to YCbCr, the Cb values for a 2x2 block of pixels are averaged to a single value, and the Cr values for that 2x2 block of pixels are also averaged to a single value.
The JPEG standard also supports 4:4:4 (no downsampling).
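The 2x2 averaging for 4:2:0 can be sketched in a few lines of NumPy (even dimensions assumed for simplicity; real encoders may average or simply sample, depending on the implementation):

import numpy as np

def subsample_420(chroma):
    h, w = chroma.shape
    blocks = chroma.reshape(h // 2, 2, w // 2, 2).astype(float)
    return blocks.mean(axis=(1, 3))  # one value per 2x2 block

cb = np.arange(16, dtype=np.uint8).reshape(4, 4)
print(subsample_420(cb))  # 2x2 array of block averages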

How do I prevent ImageMagick from doubling the image size during rotation?

I have an optimally compressed png that I'm rotating by 1 degree using ImageMagick -
convert -rotate 1 crab.png crab-rotated.png
The size goes from 74 KB to 167 KB. How do I minimize that increase?
The increase in file size is probably due to less efficient compression of the rotated image. You won't be able to do anything about that unless you adjust the PNG compression settings (the -quality option) or use a more efficient but lossy compression method (e.g. JPEG).
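For what it's worth, for PNG output ImageMagick's -quality value is not a lossy setting: the tens digit selects the zlib compression level and the ones digit the filter type (worth verifying against your ImageMagick version's documentation). So something like this asks for maximum compression with adaptive filtering:

convert -rotate 1 -quality 95 crab.png crab-rotated.png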
Here's the reason why I think this happens (I hope somebody can correct me if I am wrong). By rotating the image, you are introducing spatial frequencies that were not present in the original image. If these frequencies are not suppressed during compression, then the file size will inevitably increase. However, suppressing these frequencies may degrade the quality of your image. It's a delicate balance.
The amount of increase (or decrease) in filesize after rotation depends on the frequencies already present in the image, i.e. it is image-specific.

How to estimate the size of JPEG image which will be scaled down

For example, I have a 1024*768 JPEG image. I want to estimate the size of that image after it is scaled down to 800*600 or 640*480. Is there any algorithm to calculate the size without generating the scaled image?
I took a look at the resize dialog in Photoshop. The size it shows is basically (width in pixels * height in pixels * bits per pixel), which differs hugely from the actual file size.
I have a mobile image browser application which allows the user to send images through email, with options to scale down the image. We provide checkboxes for the user to choose a down-scale resolution along with its estimated size. For large images (> 10 MB), we have 3 down-scale sizes to choose from. If we generate a cached image for each option, it may hurt memory usage. We are trying to find the best solution that avoids this memory consumption.
I have successfully estimated the scaled size based on the DQT - the quality factor.
I conducted some experiments and found that if we use the same quality factor as in the original JPEG image, the scaled image will have a size roughly equal to (scale factor * scale factor) times the original file size. The quality factor can be estimated from the DQT defined in every JPEG image; an algorithm can be defined to estimate it based on the standard quantization tables shown in Annex K of the JPEG spec.
Although other factors, like color subsampling, a different compression algorithm, and the image content itself, will contribute to error, the estimation is pretty accurate.
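A sketch of the resulting estimate in plain Python (the numbers are illustrative):

def estimate_scaled_size(orig_bytes, orig_w, orig_h, new_w, new_h):
    # same quality factor assumed for the re-encode
    return orig_bytes * (new_w * new_h) / (orig_w * orig_h)

# e.g. a 2.0 MB 1024x768 JPEG scaled down to 800x600:
print(estimate_scaled_size(2_000_000, 1024, 768, 800, 600))  # ~1.22 MB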
P.S. Examining JPEGSnoop and its source code helped me a lot :-)
Cheers!
Like everyone else said, the best algorithm to determine what sort of JPEG compression you'll get is the JPEG compression algorithm.
However, you could also calculate the Shannon entropy of your image, in order to try and understand how much information is actually present. This might give you some clues as to the theoretical limits of your compression, but is probably not the best solution for your problem.
This concept will help you measure the differences in information between an all-white image and that of a crowd, which is related to its compressibility.
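A quick way to compute that entropy in Python with Pillow and NumPy (photo.jpg is a placeholder; the figure is the per-pixel entropy of the grayscale histogram):

import numpy as np
from PIL import Image

img = np.asarray(Image.open("photo.jpg").convert("L"))
hist = np.bincount(img.ravel(), minlength=256)
p = hist[hist > 0] / hist.sum()
entropy = -(p * np.log2(p)).sum()
print(f"{entropy:.2f} bits/pixel")  # 0 for all-white, near 8 for noise-like content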
-Brian J. Stinar-
Why estimate what you can measure?
In essence, it's impossible to provide any meaningful estimate due to the fact that different types of images (in terms of their content) will compress very differently using the JPEG algorithm. (A 1024x768 pure white image will be vastly smaller than a photograph of a crowd scene for example.)
As such, if you're after an accurate figure it would make sense to simply carry out the re-size.
Alternatively, you could just provide a range such as "40KB to 90KB", based on an "average" set of images.
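If memory is the concern (as in the question), note that the measurement can be done against an in-memory buffer rather than a cached file. A sketch with Pillow (the file name, size and quality are placeholders):

from io import BytesIO
from PIL import Image

img = Image.open("original.jpg")
buf = BytesIO()
img.resize((800, 600)).save(buf, format="JPEG", quality=80)
print(f"estimated size at 800x600: {buf.tell()} bytes")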
I think what you want is tricky and difficult to do: at the same JPEG compression level, some images are simply heavier than others in terms of file size.
My hunch for JPEG images: given two images at the same resolution, compressed at the same quality ratio, the image taking less memory will compress more (in general) when its resolution is reduced.
Why? From experience: many times when working with a set of images, I have seen that if a thumbnail occupies significantly more memory than most others, reducing its resolution makes almost no change in its size (memory). On the other hand, reducing the resolution of one of the average-size thumbnails reduces the size significantly (all parameters, like original/final resolution and JPEG quality, being the same in the two cases).
Roughly speaking: the higher the entropy, the smaller the impact on the image's size from changing resolution (at the same JPEG quality).
If you can verify this with experiments, maybe you can use it as a quick method to estimate the size. If my language is confusing, I can explain with some mathematical notation / pseudo-formula.
An 800*600 image file should be roughly (800*600)/(1024*768) times as large as the 1024*768 image file it was scaled down from. But this is really a rough estimate, because the compressibility of the original and scaled versions of the image might differ.
Before I attempt to answer your question, I'd like to join the ranks of people that think it's simpler to measure rather than estimate. But it's still an interesting question, so here's my answer:
Look at the block DCT coefficients of the input JPEG image. Perhaps you can find some sort of relationship between the number of higher frequency components and the file size after shrinking the image.
My hunch: all other things (e.g. quantization tables) being equal, the more high-frequency components you have in your original image, the bigger the difference in file size between the original and the shrunken image will be.
I think that by shrinking the image, you will reduce some of the higher frequency components during interpolation, increasing the possibility that they will be quantized to zero during the lossy quantization step.
If you go down this path, you're in luck: I've been playing with JPEG block DCT coefficients and put some code up to extract them.
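Lacking a coefficient-level decoder, one rough proxy is to recompute 8x8 block DCTs from the decoded pixels and measure how much energy sits in the higher-frequency positions. A sketch in Python with Pillow and NumPy/SciPy (this approximates, rather than reads, the file's actual coefficients; input.jpg is a placeholder):

import numpy as np
from PIL import Image
from scipy.fft import dctn

img = np.asarray(Image.open("input.jpg").convert("L"), dtype=float)
h, w = (d - d % 8 for d in img.shape)  # crop to whole 8x8 blocks
blocks = img[:h, :w].reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)

coeffs = dctn(blocks, axes=(2, 3), norm="ortho")
mask = np.add.outer(np.arange(8), np.arange(8)) >= 8  # high-frequency positions
ratio = np.abs(coeffs[..., mask]).sum() / np.abs(coeffs).sum()
print(f"high-frequency energy fraction: {ratio:.3f}")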
