Determine the number of bytes necessary to store an uncompressed RGB color image of size 640 ×
480 pixels using 8, 10, 12, and 14 bits per color channel?
I know how to calculate the size of an image using Size = (rows * columns * bpp), but I cannot understand what "bits per color channel" means in this question.
Bits per color channel is the number of bits used to store one color component of a single pixel.
RGB color space has 3 channels: Red, Green and Blue.
The "bits per color channel" (bpc) is the number of bits that are used for storing each component (e.g 8 bits for red, 8 bits for green, 8 bits for blue).
The dynamic range of 8 bits is [0, 255] (255 = 2^8-1).
8 bpc implies 24 bits per pixel (bpp).
The number of bits per pixel defines the Color Depth of the image.
24 bpp can represent 2^24 = 16,777,216 different colors.
More bits give a larger range: the 12-bit range is [0, 4095] (4095 = 2^12-1), so a much wider variety of colors can be coded in each pixel.
12 bpc implies 36 bpp, and can represent 2^36 = 68,719,476,736 different colors.
For more information, refer to the BIT DEPTH TUTORIAL.
Remark: the bits per channel are not directly tied to memory storage (e.g. it's common to store a 12-bit sample in 2 bytes [16 bits] in memory).
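To make the totals concrete, here is a minimal Python sketch (my own illustration, not from the question). It computes the storage for the 640 x 480 RGB case both tightly packed and with each sample rounded up to whole bytes, following the remark above about 12-bit samples commonly occupying 16 bits:

    width, height, channels = 640, 480, 3
    for bpc in (8, 10, 12, 14):
        total_bits = width * height * channels * bpc
        packed = total_bits // 8                                      # samples packed tightly, no padding
        byte_aligned = width * height * channels * ((bpc + 7) // 8)   # each sample rounded up to whole bytes
        print(f"{bpc} bpc: {packed} bytes packed, {byte_aligned} bytes byte-aligned")

For example, at 12 bpc this prints 1382400 bytes packed and 1843200 bytes when each 12-bit sample occupies 2 bytes.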
As you probably know, an image is built as a matrix of pixels.
Figure: the structure of an RGB image.
Figure: a pixel with 8 bits per color channel.
Figure: a pixel with 10 bits per color channel.
Figure: a pixel with 12 bits per color channel.
The subject is much wider than that, but I think that's enough...
This is all of the information I was provided in the practice question. I am trying to figure out how to calculate it when prompted to do so on an exam...
How to determine the number of bytes necessary to store an uncompressed grayscale image of size 8000 × 3400 pixels?
I am also curious how the calculation changes if the image is a compressed binary image.
"I am trying to figure out how to calculate it when prompted to do so on an exam."
There are 8 bits in 1 byte, so once you know how many bits per pixel (bpp) you have, this is a very simple calculation.
For 8 bits per pixel greyscale, just multiply the width by the height.
8000 * 3400 = 27200000 bytes.
For 1 bit per pixel black&white, multiply the width by the height and then divide by 8.
(8000 * 3400) / 8 = 3400000 bytes.
It's critical that the image is uncompressed, and that there's no padding at the end of each raster line. Otherwise the count will be off.
The first thing to work out is how many pixels you have. That is easy, it is just the width of the image multiplied by the height:
N = w * h
So, in your case:
N = 8000 * 3400 = 27200000 pixels
Next, in general you need to work out how many samples (S) you have at each of those 27200000 pixel locations in the image. That depends on the type of the image:
if the image is greyscale, you will have a single grey value at each location, so S=1
if the image is greyscale and has transparency as well, you will have a grey value plus a transparency (alpha) value at each location, so S=2
if the image is colour, you will have three samples for each pixel - one Red sample, one Green sample and one Blue sample, so S=3
if the image is colour and has transparency as well, you will get the 3 RGB values plus a transparency (alpha) value for each pixel, so S=4
there are others, but let's not get too complicated
The final piece of the jigsaw is how big each sample is, or how much storage it takes, i.e. the bytes per sample (B).
8-bit data takes 1 byte per sample, so B=1
16-bit data takes 2 bytes per sample, so B=2
32-bit floating point or integer data take 4 bytes per sample, so B=4
there are others, but let's not get too complicated
So, the actual answer for an uncompressed greyscale image is:
storage required = w * h * S * B
and in your specific case:
storage required = 8000 * 3400 * 1 * 1 = 27200000 bytes
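As a sanity check, here is a tiny Python sketch of that formula (the helper name is my own):

    def uncompressed_bytes(w, h, S, B):
        return w * h * S * B          # pixels x samples per pixel x bytes per sample

    print(uncompressed_bytes(8000, 3400, 1, 1))   # greyscale, 8-bit: 27200000 bytes
    print(uncompressed_bytes(8000, 3400, 4, 2))   # RGBA, 16-bit: 217600000 bytes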
If the image were compressed, the only thing you should hope and expect is that it takes less storage. The actual amount required will depend on:
how repetitive/predictable the image is - the more predictable the image is, in general, the better it will compress
how many colours the image contains - fewer colours generally means better compression
which image file format you require (PNG, JPEG, TIFF, GIF)
which compression algorithm you use (RLE, LZW, DCT)
how long you are prepared to wait for compression and decompression - the longer you can wait, the better you can compress in general
what losses/inaccuracies you are prepared to tolerate to save space - if you are prepared to accept a lower quality version of your image, you can get a smaller file
I am processing a 12-bit image which is unfortunately stored as a 16-bit TIFF. However, I do not know which 4 of the 16 bits are useless, so I tried three methods: mask each pixel with 0xFFF0, 0x0FFF, or 0x0FF0. It appears to me that the resulting images of these three methods look just the same, but their md5 values are different. Why does this happen? Are there any differences if I use any of these three images for other purposes later?
Computer monitors typically display only 256 distinct brightness levels per channel. A 12-bit image therefore has its lower 4 bits ignored for display, so you see no difference whether you zero out those bits or not.
When a 12-bit image is stored in a 16-bit integer, the upper 4 bits are usually left at zero, so there is no difference when you zero them or not. [Sometimes the pixel value is scaled to occupy the full 16 bit range, but this is not usually the case.]
So my recommendation is: don't mask out any bits. Zeroing out the lower 4 bits just reduces the precision of the values in the image, making it equivalent to an 8-bit image. Masking the upper 4 bits is pointless because they are already zero.
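If you want to see why the three masks look the same but hash differently, here is a small Python illustration (the sample value is hypothetical; it assumes the 12 bits sit in the low bits of the 16-bit word, as described above):

    sample = 0x0ABC                  # a hypothetical 12-bit pixel value stored in a uint16
    print(hex(sample & 0x0FFF))      # 0xabc -> keeps all 12 bits (upper 4 are already zero)
    print(hex(sample & 0xFFF0))      # 0xab0 -> discards the lowest 4 bits
    print(hex(sample & 0x0FF0))      # 0xab0 -> discards both ends

On an 8-bit display only the top 8 of the 12 bits (0xAB here) are shown, so all three versions render identically even though the stored data, and therefore the md5, differs.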
I know how to find the maximum possible capacity of an image given the resolution, image type, and bits per pixel for hiding. How do I find the estimated message size?
Say the image is 100 x 200 pixels and is an 8-bit paletted image, and we are hiding with 4-bit LSB. What would the estimated message size be?
The estimated message length is the total length of 1s and 0s you will embed. This is composed of the header (optional) and message stream.
Size of message (message stream)
This depends on the size of the message and how you hide it. Generally, you want to ask what form your message takes when you convert it to 1s and 0s (message stream). The numbers 0-255 and ASCII characters can be represented by 8 bits each. The most straightforward examples include:
plain text: number of characters x 8
binary (black and white) image: height x width
grayscale image: height x width x 8
colour image: height x width x 24
You can also decide to compress your message before embedding with, for example, Huffman coding. This should convert your message to fewer bits than the above examples, but you will also need to include your decoding table in your message so that the recipient can decode it. Overall, Huffman compression results in a shorter length, including the table, for even just a few hundred characters.
Metadata (header)
Speaking of metadata, in many cases, embedding just your message stream is not enough. For a message stream which is shorter than the maximum message capacity, you have 3 options on what to do with the remaining space:
fill it with random 1s and 0s (this introduces the most distortion to the cover image),
do nothing, or
stretch the message so that it takes up as much of the maximum message capacity as possible, e.g. matrix encoding in the F5 steganographic system.
If you decide to do nothing, you may need to tell the recipient how many pixels to read to extract the whole message, so as not to carry on reading gibberish. You can either give the total number of bits of the message stream, or say how many pixels to read until all the information is extracted. For the former option, the tendency is to devote 32 bits to the message length, but that can be overkill. You can either set a more practical limit, or adopt a more dynamic approach.
A practical limit would be 24 bits, if you assume you will never use a bigger cover than a 1920x1200 grayscale image with 4-bit LSB embedding (1920x1200x4 = 9216000 < 2^24 maximum storage capacity).
A more dynamic approach would be to estimate the minimum number of bits to represent the message length, e.g. 8 bits for a message length of up to 256, 9 bits for up to 512, etc. Then encode this number as a 5-bit value, followed by the message length. For example, if the message length is 3546 bits, using 32 bits to encode the length it becomes 00000000000000000000110111011010. But with the dynamic approach it is 01100 110111011010, where 01100 is binary for 12, which says to read the 12 following bits to obtain the message length.
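Here is a short Python sketch of that dynamic length field (the function name is my own): a 5-bit count of how many bits follow, then the message length written in exactly that many bits.

    def dynamic_length_field(message_length):
        n = message_length.bit_length()            # minimum number of bits for the length
        return format(n, '05b') + format(message_length, 'b')

    print(dynamic_length_field(3546))              # '01100' + '110111011010'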
If your program handles both text and images as the secret message, you'll also need to tell the recipient the type of the secret. If you are ever only going to use the four above types, I would encode that using two bits: plain text = 00, binary image = 01, grayscale image = 10, colour image = 11.
If the secret is an image, you'll also need to provide the height and width. 16x2 bits is the general tendency here, but similarly to the message length above, you can use something more practical. For example, if you can expect no image to have a width or height of more than 2047 pixels, 11x2 bits will be enough to encode this information.
If you embed your secret in more than the last LSB, your message length may not be divisible by that number, for example a message length of 301 when you embed in the 4-bit LSB. In this case, you need to pad the message with 3 more junk 1s or 0s so that it becomes divisible by 4. Now 304 is your reported message stream length, but after you extract it you can discard the last 3 bits. It is logical to assume you will never embed in more than the 7-bit LSB, so devoting 3 bits to the padding should be more than enough.
Depending on what you choose to include in the metadata, you can stitch all of these together and call them header.
Examples
Let's do a couple of examples to see this in action. We let the header format be in the order: message length, secret type, padding, height, width (the last two only if necessary).
Embed the string 'Hello World' using 4-bit LSB
The message stream is 11 x 8 = 88 bits.
88 mod 4 = 0, so the padding is 000 (3 bits).
The message length is 88 = 00111 1011000 (12 bits).
Secret type is text, so 00 (2 bits).
Estimated message length: Header + message stream = (12 + 2 + 3) + 88 = 105 bits.
Embed a grayscale image of size 151 x 256 using 3-bit LSB
The message stream is 151 x 256 x 8 = 309248 bits.
309248 mod 3 = 2, so the padding is 3-2 = 1 = 001 (3 bits).
The message length is 309249 = 10011 1001011100000000001 (24 bits).
Secret type is grayscale image, so 10 (2 bits).
Secret is image, so adding the width and height using 2 16-bit numbers (32 bits).
Estimated message length: Header + message stream = (24 + 2 + 3 + 32) + 309249 = 309310 bits.
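Both examples can be reproduced with a small Python helper (my own sketch; the field sizes follow the header format chosen above):

    def estimated_message_length(stream_bits, lsb_bits, is_image=False):
        padding = (-stream_bits) % lsb_bits           # junk bits so the stream divides evenly
        total_stream = stream_bits + padding
        length_field = 5 + total_stream.bit_length()  # dynamic message-length field
        header = length_field + 2 + 3 + (32 if is_image else 0)
        return header + total_stream

    print(estimated_message_length(11 * 8, 4))                         # 105
    print(estimated_message_length(151 * 256 * 8, 3, is_image=True))   # 309310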
Does anybody know the difference between GL_UNSIGNED_SHORT_4_4_4_4 and GL_UNSIGNED_SHORT_5_6_5 data types in OpenGL ES?
They are both 16 bit textures.
When using a 32-bit texture you have 8 bits for each of the color components plus 8 bits for alpha, giving maximum quality and alpha control, so 8888.
With 16 bits there's always a compromise; if you only need color and not alpha, use 565. Why 565? Because 16 bits can't be divided evenly by 3, and our eyes are better at seeing in the green spectrum, so you get better quality by giving the leftover bit to green.
If you need alpha but your images don't use gradients in alpha, use 5551: 5 bits for each color and 1 bit for alpha.
If your image has some gradient alpha, then you can use 4444: 4 bits for each color and 4 bits for alpha.
4444 has the worst color quality, but it retains some alpha to mix. I use this for my font textures, for example: lighter than 32 bits, and since the fonts are monochromatic they fit well in 4 bits.
I am not an OpenGL expert, but:
GL_UNSIGNED_SHORT_4_4_4_4 stands for GL_UNSIGNED_SHORT_R_G_B_A, where each of the RGBA components gets 4 bits (that is, 2^4 = 16 possible values each).
GL_UNSIGNED_SHORT_5_6_5 stands for GL_UNSIGNED_SHORT_R_G_B. You can see that there is no alpha value available here, so that is a major difference. The RGB components can also take larger values, since they get 5, 6 and 5 bits respectively.
Well, when using GL_UNSIGNED_SHORT_4_4_4_4 as the type in a pixel specification command (glTexImage2D or glReadPixels), the data is assumed to be laid out in system memory as one 16-bit value per pixel, with the individual components each taking up 4 consecutive bits. It can only be used with a format of GL_RGBA.
Whereas GL_UNSIGNED_SHORT_5_6_5 also assumes individual pixels as 16-bit values, but with the red and blue components taking up 5 bits each and the green component having 6 bits (there is no alpha channel). It can only be used with a format of GL_RGB.
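For what it's worth, the bit layouts themselves are easy to illustrate. This is plain bit packing in Python, not OpenGL API usage, and the function names are my own:

    def pack_565(r, g, b):            # r, b in 0..31; g in 0..63
        return (r << 11) | (g << 5) | b

    def pack_4444(r, g, b, a):        # each component in 0..15
        return (r << 12) | (g << 8) | (b << 4) | a

    print(hex(pack_565(31, 63, 31)))     # 0xffff -> white, no alpha available
    print(hex(pack_4444(15, 0, 0, 15)))  # 0xf00f -> fully opaque red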
At the moment I am working on an on screen display project with black, white and transparent pixels. (This is an open source project: http://code.google.com/p/super-osd; that shows the 256x192 pixel set/clear OSD in development but I'm migrating to a white/black/clear OSD.)
Since each pixel is black, white or transparent I can use a simple 2 bit/4 state encoding where I store the black/white selection and the transparent selection. So I would have a truth table like this (x = don't care):
B/W T
x 0 pixel is transparent
0 1 pixel is black
1 1 pixel is white
However as can be clearly seen this wastes one bit when the pixel is transparent. I'm designing for a memory constrained microcontroller, so whenever I can save memory it is good.
So I'm trying to think of a way to pack these 3 states into some larger unit (say, a byte.) I am open to using lookup tables to decode and encode the data, so a complex algorithm can be used, but it cannot depend on the states of the pixels before or after the current unit/byte (this rules out any proper data compression algorithm) and the size must be consistent; that is, a scene with all transparent pixels must be the same as a scene with random noise. I was imagining something on the level of densely packed decimal which packs 3 x 4-bit (0-9) BCD numbers in only 10 bits with something like 24 states remaining out of the 1024, which is great. So does anyone have any ideas?
Any suggestions? Thanks!
In a byte (256 possible values) you can store 5 of your three-state values. One way to look at it: three to the fifth power is 243, slightly less than 256. The fact that it's only slightly less also shows that you're hardly wasting any fraction of a bit.
For encoding five of your 3-state "digits" into a byte, think of taking a number in base 3 made from your five "digits" in succession; the resulting value is guaranteed to be less than 243 and therefore directly storable in a byte. Similarly, for decoding, do the base-3 conversion of a byte's value.
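A minimal Python sketch of that scheme (my own; it assumes the mapping 0 = transparent, 1 = black, 2 = white):

    def pack5(trits):                  # five values in 0..2 -> one byte (0..242)
        value = 0
        for t in trits:
            value = value * 3 + t
        return value

    def unpack5(byte):                 # one byte -> the five original values
        trits = []
        for _ in range(5):
            trits.append(byte % 3)
            byte //= 3
        return trits[::-1]

    print(pack5([1, 0, 2, 2, 0]))      # 105
    print(unpack5(105))                # [1, 0, 2, 2, 0]

A 256-entry lookup table built from unpack5 gives constant-time decoding on the microcontroller.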