I am wondering what data structure is used to store images with HDR data. I understand how regular (RGBA) images and cubemaps are stored. I doubt it's as simple as storing multiple images at different exposures inside the same file.
You've probably moved on long ago, but I thought it worth posting references for anyone else who happened upon this question.
Here is an old reference for the Radiance .pic (now .hdr) file format. The useful info starts at the bottom of page 29.
http://radsite.lbl.gov/radiance/refer/filefmts.pdf
excerpt:
The basic idea is to store a 1-byte mantissa for each of three
primaries, and a common 1-byte exponent. The accuracy of these values
will be on the order of 1% (+/-1 in 200) over a dynamic range from
10^-38 to 10^38.
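To make the shared-exponent idea concrete, here is a minimal Python sketch of the RGBE packing described in that excerpt (the rounding is simplified; Radiance's own code differs in small details):

    import math

    def float_to_rgbe(r, g, b):
        # Encode a linear RGB triple into 4 bytes: three mantissas and a shared exponent.
        v = max(r, g, b)
        if v < 1e-38:                       # too small to represent: store black
            return (0, 0, 0, 0)
        frac, exp = math.frexp(v)           # v == frac * 2**exp, with frac in [0.5, 1)
        scale = frac * 256.0 / v
        return (int(r * scale), int(g * scale), int(b * scale), exp + 128)

    def rgbe_to_float(rm, gm, bm, e):
        # Decode 4 RGBE bytes back to linear floats.
        if e == 0:
            return (0.0, 0.0, 0.0)
        f = math.ldexp(1.0, e - (128 + 8))  # undo the bias and the 8-bit mantissa scale
        return (rm * f, gm * f, bm * f)

    print(rgbe_to_float(*float_to_rgbe(1.0, 0.5, 0.25)))   # -> (1.0, 0.5, 0.25)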
And here is a more recent reference for JPEG HDR format: http://www.anyhere.com/gward/papers/cic05.pdf
It's generally a matter of increasing the range of representable values (in an HSV sense), so you can use e.g. RGB[A] where each element is a 16-bit int, 32-bit int, float, double, etc. instead of a JPEG-quality 8-bit int. There's a trade-off between increasing the range represented, retaining fine gradations within that range, and whether particular intensity levels are given priority via some non-linearity in the mapping (e.g. storing the log of the value).
The raw file from the camera normally stores the 12-14 bit values from the Bayer mask - so effectively a greyscale image. These are sometimes compressed losslessly (Canon, Nikon) or stored as 16-bit values (Olympus). The header also contains the white balance and gain calibrations for the red, green and blue masked pixels so you can generate a color image.
Once you have a color image you can store it however you want; 16-bit RGB is normally the easiest.
Here is some information on the Radiance file format, used for HDR images. It packs each pixel into 32 bits: an 8-bit mantissa for each primary plus a shared 8-bit exponent.
First, I am not sure there is a public format for storing multiple images at different exposures in one file, because that usage is rare. Those multiple exposures are one way of sourcing HDR data, but they are not themselves HDR; they are just normal LDR (low dynamic range) or SDR (standard dynamic range) images, encoded like the JPEGs from digital cameras.
It is more common to store the resulting image in an HDR format, and the point is, just as everyone mentioned, floating point.
There are some HDR formats:
OpenEXR
TIFF
Radiance
...
You can get more info from Wikipedia.
If there is a two-dimensional array of booleans, would there be an efficient way to represent it as an image and save it using minimal space? So just one bit for every pixel, with each pixel being either black or white?
So you could put a bitmap header on a buffer and display it that way. You won't save any memory, but you will be able to view it. If you are looking to save space, there are lots of lossless encoding techniques: Huffman coding and LZW are some methods, and some of those methods get grouped into formats like zip, bzip2, gzip, deflate, etc.
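As a rough illustration of the one-bit-per-pixel idea, here is a minimal Python sketch (the helper name and per-row byte flushing are my choices; a real 1-bpp BMP additionally pads each row to a 4-byte boundary, which I skip):

    def pack_rows(bool_rows):
        # Pack a 2D list of booleans into bytes, one bit per pixel,
        # flushing each row to a whole byte (most significant bit first).
        packed = bytearray()
        for row in bool_rows:
            byte = 0
            for i, bit in enumerate(row):
                if bit:
                    byte |= 0x80 >> (i % 8)
                if i % 8 == 7:              # byte is full: flush it
                    packed.append(byte)
                    byte = 0
            if len(row) % 8:                # pad out the last partial byte of the row
                packed.append(byte)
        return bytes(packed)

    # A 4x10 image packs into 2 bytes per row -> 8 bytes instead of 40 one-byte pixels.
    print(len(pack_rows([[True, False] * 5 for _ in range(4)])))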
My guess is no, because in an image the color attribute of each pixel is represented by at least 8 bits. So you are in effect using an 8-bit byte to store a value which can be represented by one solitary bit (0 or 1).
In addition, there are other attributes that can describe each pixel in an image, such as an alpha (opacity) channel, and so on.
So, in short, although it may be visually pleasing to use images to store binary data, it would in fact be using way more storage space.
Most programming languages have native support for binary data, and these provide much more efficient storage for it.
OK, so I tried to use the ImageMagick command:
"convert picA.png -interlace line picB.png"
to make interlaced versions of my .png images. Most of the time, the resulting image is larger than the original one, which is kind of expected. However, for certain images, the resulting file is smaller.
So I just wonder why does that happen? I really don't want my new image to lose any quality because of the command.
Also, are there any compatibility problems with interlaced .png images?
EDIT: I guess my problem is that the original image was not compressed as well as it could have been.
The following only applies to cases where the pixel size is >= 8 bits. I didn't investigate the other cases but I expect similar outcomes.
A content-identical interlaced PNG image file will almost always be larger because of the additional data for the filter-type descriptions required for the scanlines of each pass. This is what I explained in detail on this web page, based on the PNG RFC, RFC 2083.
In short, this is because the sum of the per-pass scanline counts below, each scanline needing one filter-type byte, is almost always greater than the image height (which is the number of filter-type bytes for a non-interlaced image); a small sketch comparing the two counts follows the list:
nb_pass1_lines = CEIL(height/8)
nb_pass2_lines = (width>4?CEIL(height/8):0)
nb_pass3_lines = CEIL((height-4)/8)
nb_pass4_lines = (width>2?CEIL(height/4):0)
nb_pass5_lines = CEIL((height-2)/4)
nb_pass6_lines = (width>1?CEIL(height/2):0)
nb_pass7_lines = FLOOR(height/2)
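Here is the small sketch mentioned above. It just adds up the per-pass scanline counts in Python (the function name is mine; the max() calls merely clamp the near-zero terms for very small heights):

    from math import ceil, floor

    def interlaced_filter_bytes(width, height):
        # Sum of the per-pass scanline counts above; each scanline costs one filter-type byte.
        return (ceil(height / 8)
                + (ceil(height / 8) if width > 4 else 0)
                + max(ceil((height - 4) / 8), 0)
                + (ceil(height / 4) if width > 2 else 0)
                + max(ceil((height - 2) / 4), 0)
                + (ceil(height / 2) if width > 1 else 0)
                + floor(height / 2))

    # A non-interlaced image needs exactly `height` filter-type bytes, one per scanline.
    print(interlaced_filter_bytes(256, 256), "vs", 256)    # 480 vs 256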
Theoretically, though, it can happen that the data entropy/complexity is accidentally lowered enough by the Adam7 interlacing that, with the help of filtering, the additional space usually needed for filter types with interlacing is compensated for by the deflate compression used in the PNG format. This would be a particular case that remains to be proven, as the entropy/complexity is more likely to increase with interlacing, because the interlacing deconstruction makes the image data less consistent.
I used the word "accidentally" because reducing the data entropy/complexity is not the purpose of Adam7 interlacing; its purpose is to allow progressive loading and display of the image through a mechanism of passes. Reducing the entropy/complexity is, on the other hand, the purpose of PNG filtering.
I used the word "usually" because, as shown on the explanation web page, a 1-pixel image, for example, is described by the same amount of uncompressed data whether interlaced or not, so in that case no additional space is needed.
When it comes to the PNG file size, a smaller size for the interlaced file can be due to:
Different non-pixel-encoding-related content embedded in the file, such as a palette (in the case of color type != 3) and non-critical chunks such as chromaticities, gamma, number of significant bits, default background color, histogram, transparency, physical pixel dimensions, time, text and compressed text. Note that some of this non-pixel-encoding-related content can lead to the image being displayed differently depending on the software used and the situation.
Different pixel-encoding-related content (which can change the image quality), such as bit depth, color type (and thus whether a palette is required, as with color type = 3), image size, etc.
Different compression-related content, such as better filtering choices, accidentally lower data entropy/complexity due to interlacing as explained above (the theoretical particular case), or a higher compression level (as you mentioned).
If I had to check whether 2 PNG image files are equivalent pixel-wise, I would use the following command at a bash prompt:
diff <( convert non-interlaced.png rgba:- ) <( convert interlaced.png rgba:- )
It should return no difference.
For the compatibility question, if the PNG encoder and PNG decoder implement the mandatory aspects of the PNG RFC, I see no reason for the interlacing to lead to a compatibility issue.
Edit 2018-11-13:
Some experiments based on auto-evolved distributed genetic algorithms with a niche mechanism (hosted at https://en.oga.jod.li) are explained here:
https://jod.li/2018/11/13/can-an-interlaced-png-image-be-smaller-than-the-equivalent-non-interlaced-image/
Those experiments show that it is possible for equivalent PNG images to have a smaller size interlaced than non-interlaced. The best images for this are tall: one pixel wide, with pixel content that appears random. The shape is not the only important aspect, though, since random cases with the same shape lead to different size differences.
So, yes, some PNG images can be identical pixel-wise and in their non-pixel-related content but still be smaller interlaced than non-interlaced.
So I just wonder why does that happen?
From the section Interlacing and pass extraction of the PNG spec:
Scanlines that do not completely fill an integral number of bytes are padded as defined in 7.2: Scanlines.
NOTE If the reference image contains fewer than five columns or fewer than five rows, some passes will be empty.
I would assume the behavior you're experiencing is the result of the Adam7 method requiring additional padding.
I decided I'd attempt an image compression idea I had (from pixel RGBs) for a bit of extra credit in class. I've finished, but I find the level of compression I get varies HUGELY from image to image. With one image, I'm getting a file size 1.25x the size of the corresponding PNG. With another image, however, I'm getting a file size 22.5x the size of the PNG.
My compression works by first assigning each color in the image an int (starting from 0), then using that int rather than the actual color in the file. The file is formatted as:
0*qj8c1*50i2p2*pg93*9zlds4*2rk5*4ok4r6*8mv1w7*2r25l8*3m89o9*9yp7c10*111*2clz112*g1j13*2w34z14*auq15*3zhg616*mmhc17*5lgsi18*25lw919*7ip84+0!0!0!0!0!0!0!0!0!0!0!0!0!1!1!1!#2!2!2!2!2!2!2!2!2!2!2!2!2!3!3!3!#4!4!4!4!4!4!4!4!4!4!4!4!4!5!5!5!#6!6!6!6!6!6!6!6!6!6!6!6!6!0!0!0!#3!3!3!3!3!3!3!3!3!3!3!3!3!2!2!2!#1!1!1!1!1!1!1!1!1!1!1!1!1!4!4!4!#7!7!7!7!7!7!7!7!7!7!7!7!7!6!6!6!#5!5!5!5!5!5!5!5!5!5!5!5!5!8!8!8!#9!9!9!9!9!9!9!9!9!9!9!9!9!10!10!10!#8!8!8!8!8!8!8!8!8!8!8!8!8!11!11!11!#12!12!12!12!12!12!12!12!12!12!12!12!12!7!7!7!#13!13!13!13!13!13!13!13!13!13!13!13!13!14!14!14!#15!15!15!15!15!15!15!15!15!15!15!15!15!16!16!16!#17!17!17!17!17!17!17!17!17!17!17!17!17!13!13!13!#18!18!18!18!18!18!18!18!18!18!18!18!18!15!15!15!#10!10!10!10!10!10!10!10!10!10!10!10!10!19!19!19!#
where the first part (with the *s and base-36 numbers) is the dictionary defining the colors, and the second part, separated by !s, is the actual image.
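A simplified sketch of the scheme described above (reconstructed from the sample output, so the exact separator and base-36 handling is a guess):

    def encode(pixels):
        # pixels: rows of packed-RGB ints. Build a color -> index dictionary,
        # then write indices instead of colors, as described above.
        palette = {}
        for row in pixels:
            for color in row:
                palette.setdefault(color, len(palette))

        def base36(n):
            digits = "0123456789abcdefghijklmnopqrstuvwxyz"
            out = ""
            while True:
                n, r = divmod(n, 36)
                out = digits[r] + out
                if n == 0:
                    return out

        # dictionary part: index, '*', base-36 color code, concatenated
        header = "".join("%d*%s" % (i, base36(c)) for c, i in palette.items())
        # image part: '!'-separated indices, one row per '#'
        body = "".join("!".join(str(palette[c]) for c in row) + "!#" for row in pixels)
        return header + "+" + body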
Why am I seeing the level of compression vary so hugely from image to image? Is there a flaw in my compression algorithm?
Edit: The fact that the actual level of compression is poor compared to JPEG or PNG isn't an issue; I wasn't expecting to rival any major formats.
Thanks
I'm working on a software project in which I have to compare a set of 'input' images against another 'source' set of images and find out if there is a match between any of them. The source images cannot be edited/modified in any way; the input images can be scaled/cropped in order to find a match. The images can be BMP, JPEG, GIF, PNG or TIFF, of any dimensions.
A constraint: I'm not allowed to use any external libraries. ImageMagick is an exception and can be used.
I intend to use Java/Python. The software is purely command-line based.
I was reading on SO about some common image-comparison algorithms. I'm planning to take 2 approaches:
1. I could use histograms/buckets to compare the RGB values of the 2 images.
2. Use SIFT/SURF to find keypoint descriptors, compute the Euclidean distance between them, and output the result based on the resulting distance.
The 2 images being compared can be in different formats. An intuitive thought is that before analysis/comparison, the 2 images must be converted to a common format. I reasoned that the image should be converted to the one with lower quality, e.g. if the 2 input images are BMP and JPEG, convert the BMP to JPEG. This can be thought of as a pre-processing step.
My question:
Is image conversion to a common format required? Can 2 images of different formats be compared? If they have to be converted before comparison, is my assumption of converting from higher quality (BMP) to lower (JPEG) correct? It'd also be helpful if someone could suggest some algorithms for image conversion.
EDIT
A match is said to be found if the pattern image is found in the source image.
Say, for example, the source image consists of a football field with one player. If the pattern image contains the player EXACTLY as he is in the source image, then it's a match.
No, conversion to a common format on disk is not required, and likely not helpful. If you extract feature descriptors from an image (SIFT/SURF, for example), it matters much less how the original images were stored on disk. The feature descriptors should be invariant to small compression artifacts.
A bit more...
Suppose you have a BMP that is an image of object X in your source dataset.
Then, in your input/query dataset, you have another image of object X, but it has been saved as a JPEG.
You have no idea what noise was introduced in the capture and encoding process that produced either of these images. There are lighting differences, atmospheric effects, lens effects, sensor noise, tone mapping, gamut mapping. Some of these vary from image to image, others vary from camera to camera, and all of this happens before the image even gets saved to storage in the camera. Yes, there are also JPEG compression artifacts, but assuming the BMP is "higher" quality and then degrading it through JPEG compression will not help. Perhaps the BMP has even gone through JPEG compression before being saved as a BMP.
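If it helps, here is a minimal Python sketch of that idea, using only ImageMagick (which the question allows) to decode both files to raw pixels in memory regardless of the on-disk format; the forced resize and the file names are my own assumptions for illustration:

    import subprocess

    def raw_pixels(path, width, height):
        # Decode any format ImageMagick understands straight to raw RGBA bytes in memory,
        # so the comparison code never sees the on-disk format. The forced resize is a
        # simplification so the two buffers are directly comparable.
        return subprocess.run(
            ["convert", path, "-resize", "%dx%d!" % (width, height), "rgba:-"],
            capture_output=True, check=True).stdout

    a = raw_pixels("source.bmp", 256, 256)   # hypothetical file names
    b = raw_pixels("input.jpg", 256, 256)
    # From here, build histograms or feature descriptors on a and b directly.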
When using the function D3DXSaveTextureToFile and passing in D3DXIFF_BMP to create a BMP, I've noticed that the values seem to be approximated rather than written out exactly.
Correct me if I'm wrong, but a floating-point texture can store any float in any given texel, which puts it outside the range of a BMP, which is limited to 8 bits per channel (0-255). So what the function seems to be doing is simply taking the highest and lowest values in the texture and normalizing everything between that range.
So my question is: is it possible to grab the values exactly as they are in memory, including when the colors are outside the spectrum of the computer monitor?
Don't use BMP. Use a format that supports the data type you want. For DX textures, it seems the D3DXIFF_PFM format is what you need. It's described like so:
Portable float map format. A raw floating point image format, without any compression. The file header specifies image width, height, monochrome or color, and machine word order. Pixel data is stored as 32-bit floating point values, with 3 values per pixel for color, and one value per pixel for monochrome.
Note that images will be large, though. A 256x256 texture in this format should weigh in at around 768 KB.
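If you need to get those floats back out yourself, the header is simple enough to parse by hand. Here is a minimal Python sketch of a PFM reader (not tested against D3DX's output, so treat the details as assumptions):

    import struct

    def read_pfm(path):
        # Minimal PFM reader: 'PF' (color) or 'Pf' (monochrome) header, a width/height
        # line, then a scale whose sign gives the byte order, then raw 32-bit floats.
        # Note: PFM scanlines are conventionally stored bottom-to-top; this sketch ignores that.
        with open(path, "rb") as f:
            kind = f.readline().decode("ascii").strip()
            channels = 3 if kind == "PF" else 1
            width, height = map(int, f.readline().split())
            scale = float(f.readline().decode("ascii"))
            endian = "<" if scale < 0 else ">"          # negative scale = little-endian
            count = width * height * channels
            pixels = struct.unpack("%s%df" % (endian, count), f.read(count * 4))
        return width, height, channels, pixels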
Update: You should be able to use ImageMagick's display command to view images in this format. HDRView also supports the PFM format. A third choice might be fv.