(Quick version: jump to paragraph next to the last one - the one beginning with "But")
I was happy in my ignorance believing that PVRTC images were 4 or 2 bits per channel. That sounded plausible. It would give 4+4+4+4 (16 bit) or 2+2+2+2 (8 bit) textures, that would have 2^16 (65536) and 2^8 (256) color depth respectively. But reading through some documents about PVRTC, I suddenly realized that it said 4 bpp (and 2 bpp), i.e. 4 bits per pixel. Confusion and madness entered my world.
What?! 4 bits? Per pixel? But that's just 1 bit per channel! (And don't even get me started on the 2 bit one, that one was far too weird for my brain to grasp at the moment.) Some moments into this agonizing reality, I came to understand this wasn't so real after all. Apparently, "4 bpp" refers to the compressed storage rate, not the color depth. Phew, I wasn't mad after all.
But then I started to wonder: what color depth do these images have then, after decompression? I tried to look this information up, but apparently it's not considered important to mention (or I'm just bad at finding info).
The fact that PVRTC compressed images don't seem to give any visible artifacts in OpenGL ES with the pixel format RGBA4444 would suggest they're 16 bit (using 32 bit PNG images with the pixel format RGBA4444 in OpenGL ES on the iPhone gives very visible artifacts).
According to the paper http://web.onetel.net.uk/~simonnihal/assorted3d/fenney03texcomp.pdf the final output of the decompressor is 8 bits per channel.
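To make the distinction concrete, here is a rough sketch of what a bits-per-pixel figure says about storage. The function name and the 512x512 texture size are made up for illustration, and real PVRTC data may have minimum-size or header rules that this ignores. The rate only describes how much memory the compressed texture occupies, not the precision of the decoded colors.

```python
def texture_size_bytes(width, height, bits_per_pixel):
    """Storage needed for a texture at a fixed bits-per-pixel rate.

    Hypothetical helper: ignores headers, mipmaps and any minimum
    block-size rules a real format may impose.
    """
    return width * height * bits_per_pixel // 8


# Illustrative 512x512 texture (size chosen arbitrarily).
size_pvrtc_4bpp = texture_size_bytes(512, 512, 4)    # 131072 bytes
size_pvrtc_2bpp = texture_size_bytes(512, 512, 2)    # 65536 bytes
size_rgba4444 = texture_size_bytes(512, 512, 16)     # 524288 bytes
size_rgba8888 = texture_size_bytes(512, 512, 32)     # 1048576 bytes
```

So a 4 bpp texture takes one eighth of the memory of the same image stored uncompressed as RGBA8888, even though the decompressor's output is 8 bits per channel.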
I have trouble understanding these two class styles. The docs say that they align the window on a byte boundary, but I don't understand what that means.
I have tried to use them and yes, the position of the window upon creation is different, but what they do and why I would use them is unclear to me.
What do they do and why would I use them?
With modern display technology and GPUs, they (probably) do very little in terms of performance.
In older times, though, a (potentially slow) CPU would need to write blocks of RAM directly to display memory. In such cases, where a display and/or bitmap has a "colour depth" of less than one byte – like monochrome (1 bit-per-pixel) and low colour (say, 4 bpp) – windows and their clients could be aligned such that each row was not 'aligned' to an actual byte boundary; thus, block-copy operations (like BitBlt) would be very slow, because the first few pixels in each row would have to be set by manually setting some of the bits in the display memory according to some of the bits in the first bytes of the source (RAM). These slow operations would also be propagated along each row.
Forcing the display (be it the client area or the entire window) to have its x-origin (those flags/styles only affect the x-position) aligned to a true byte boundary allows much faster copying, because there would then be a direct correspondence between bytes in the source (RAM) and bytes in the target (display); thus, simple block-copying of a row of bytes can be performed (with something akin to memcpy), without the need for any manipulation of individual bits from different bytes.
As a vague analogy, consider the difference (in speed and simplicity) between: (a) copying one array of n bytes to another of the same size; and (b) replacing each byte in the second array with the combination of the lower 4 bits of one source element with the higher 4 bits of the following source element.
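The two cases in that analogy can be sketched as follows. This is only an illustration of the difference in work per byte; the function names are made up, and it is not how any real display driver is written.

```python
def copy_aligned(src):
    """Case (a): source and target line up, so bytes are copied as-is."""
    return bytes(src)


def copy_shifted_by_nibble(src):
    """Case (b): the target is offset by half a byte.

    Each output byte combines the lower 4 bits of one source byte
    with the higher 4 bits of the following source byte.
    """
    out = bytearray(len(src) - 1)
    for i in range(len(src) - 1):
        out[i] = ((src[i] & 0x0F) << 4) | (src[i + 1] >> 4)
    return bytes(out)


row = bytes([0x12, 0x34, 0x56])
aligned = copy_aligned(row)             # b'\x12\x34\x56'
shifted = copy_shifted_by_nibble(row)   # b'\x23\x45'
```

Case (a) can be handed to a single block-copy routine, while case (b) needs a mask, a shift and an OR for every byte in the row.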
From Why did Windows 95 keep window coordinates at multiples of 8? by Raymond Chen:
The screen itself is a giant bitmap, and this means that copying data to the screen goes much faster if x-coordinate of the destination resides on a full byte boundary. And the most common x-coordinate is the left edge of a window’s contents (known as its client area).
Applications can request that Windows position their windows so that their client area began at these advantageous coordinates by setting the CS_BYTEALIGNCLIENT style in their window class. And pretty much all applications did this because of the performance benefit it produced.
So what happened after Windows 95 that made this optimization go away?
Oh, the optimization is still there. You can still set the CS_BYTEALIGNCLIENT style today, and the system will honor it.
The thing that changed wasn’t Windows. The thing that changed was your video card.
In the Windows 95 era, predominant graphics cards were the VGA (Video Graphics Array) and EGA (Enhanced Graphics Adapter). Older graphics cards were also supported, such as the CGA (Color Graphics Adapter) and the monochrome HGC (Hercules Graphics Card).
All of these graphics cards had something in common: They used a pixel format where multiple pixels were represented within a single byte,¹ and therefore provided an environment where byte alignment causes certain x-coordinates to become ineligible positions.
Once you upgraded your graphics card and set the color resolution to “256 colors” or higher, every pixel occupies at least a full byte,² so the requirement that the x-coordinate be byte-aligned is vacuously satisfied. Every coordinate is eligible.
Nowadays, all graphics cards use 32-bit color formats, and the requirement that the coordinate be aligned to a byte offset is satisfied by all x-coordinates.³ The multiples of 8 are no longer special.
There should be a formula to come up with my answer but I can't seem to find a definitive one. Is it 300 dots per inch, meaning 300 pixels per inch, therefore making my file size be what (if I have a letter-size image scanned, i.e., 8" x 11")? What's the method to find this out? I suppose the file format matters too....and I want one that has the least loss of fidelity (how about *.JPG?)
PS:
I found a pretty neat calculator at http://jan.ucc.nau.edu/lrm22/pixels2bytes/calculator.htm that seems to help me figure out my number (and help me understand also that bit color depth matters too). It says there that, "DPI affects the size and quality of the printed image, but not the size of the file or how it looks on screen." Hmmm...that's a bit counter-intuitive to me. And I am still curious as to how to calculate my file size. Imagine I want 24 bit color. What's the formula?
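For what it's worth, the usual back-of-the-envelope calculation for the raw (uncompressed) pixel data looks like the sketch below. The function name is made up, and the result ignores file headers and any compression the chosen format applies, so an actual JPG or PNG will normally be smaller. When scanning, the DPI setting decides how many pixels you get from a page, which is why it matters here even though it doesn't change the size of an image that already has a fixed pixel count.

```python
def raw_image_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of a scan: pixel count times bytes per pixel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# The 8" x 11" page from the question, scanned at 300 DPI in 24-bit color:
# 2400 x 3300 pixels x 3 bytes per pixel.
size = raw_image_size_bytes(8, 11, 300, 24)   # 23760000 bytes (about 22.7 MiB)
```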
Typically, the most common RGB format seems to be 24-bit RGB (8-bits for each channel). However, historically, RGB has been represented in many other formats, including 3-bit RGB (1-bit per channel), 6-bit RGB (2-bits per channel), 9-bit RGB (3-bits per channel), etc.
When an N-bit RGB file has a value of N that is not a multiple of 8, how are these bitmaps typically represented in memory? For example, if we have 6-bit RGB, it means each pixel is 6 bits, and thus each pixel is not directly addressable by a modern computer without using bitwise operations.
So, is it common practice to simply convert N-bit RGB files into bitmaps where each pixel is of an addressable size (e.g. convert 6-bit to 8-bit)? Or is it more common to simply use bitwise operations to manipulate bitmaps where the pixel size is not addressable?
And what about disk storage? How is, say, a 6-bit RGB file stored on disk, when the last byte of the bitmap may not even contain a full pixel?
Images are often heavy, and bandwidth is critical when transferring them, so a 6-bit-per-channel image is a reasonable choice if some loss in chrominance is acceptable (it is usually unnoticeable with textures and photos).
"How is, say, a 6-bit RGB file stored on disk, when the last byte of the bitmap may not even contain a full pixel?"
If the smallest unit of storage is a Byte then yes you need to add some padding to be 8 bit aligned. This is fine because the space saving compared to an 8 bit per channel image can be considerable.
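A minimal sketch of that padding, assuming pixels are packed most-significant-bit first and the unused bits of the last byte are set to zero (both are assumptions; a real file format defines its own bit order and padding rules, and the function names are made up):

```python
def packed_size_bytes(pixel_count, bits_per_pixel):
    """Bytes needed for tightly packed pixels, rounded up to a whole byte."""
    return (pixel_count * bits_per_pixel + 7) // 8


def pack_pixels(values, bits_per_pixel):
    """Pack integer pixel values MSB-first, zero-padding the final byte."""
    mask = (1 << bits_per_pixel) - 1
    acc = 0        # bits waiting to be written
    nbits = 0      # how many bits are waiting in acc
    out = bytearray()
    for v in values:
        acc = (acc << bits_per_pixel) | (v & mask)
        nbits += bits_per_pixel
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1
    if nbits:
        # Left-align the leftover bits; the low bits are padding.
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


packed = pack_pixels([0b111111, 0b000000, 0b111111], 6)
# 18 bits of pixel data occupy 3 bytes: FC 0F C0 (the last 6 bits are padding)
```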
No power of 2 is divisible by 6 (or by 3), so three equal-width channels can never exactly fill a 16- or 32-bit word. A better choice is 5 bits each for the red and blue channels and 6 bits for the green channel, for a total of 16 bits per pixel. R5G6B5 is a very common pixel format.
Apologies for the archeological dig, but I just couldn't resist, as there's no good answer imho.
In the old days memory was the most expensive part of a computer. These days memory is dirt-cheap, so the most sensible way to handle N-bit channel images in modern hardware is to blow up every channel to the number of bits that fits your API or hardware (usually 8 bits).
For files, you could do the same, as disk space is also cheap these days. (and you can use one of the many compressed file formats out there.)
That said, the way it worked in the old days when these formats were common, is this:
The total number of bits per pixel was not a multiple of 8, but the number of pixels per scan line always was a multiple of 8. In this case you can store your scan lines "a bit at a time" and not waste any memory space when storing it in bytes.
So if your pixels were 9 bits per pixel, and a scan line was 320 pixels, you would have 320/8 = 40 bytes containing bit #0 of each pixel, followed by 40 bytes containing all bit #1's etc. up to and including bit #8. Hence all pixel info for your scan line would be exactly 360 bytes.
The video chips would have a different hardware wiring to the memory, so rendering such scan lines was fast. In fact, this is the easiest way to implement hardware support for a variable number of bits per pixel: by pulling bits from N addresses at once.
Note that this method does not change the amount of 'bitshifting' required to find the bits for pixel number X in a scanline, based on the total number of bits you use. You just read fewer addresses ahead at once.
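The layout described above can be sketched like this. The function names are made up, and putting the leftmost pixel in the most significant bit of each byte is an assumption; actual hardware differed in such details.

```python
def to_bitplanes(pixels, bits_per_pixel):
    """Split one scan line into bit planes.

    Plane k holds bit #k of every pixel, 8 pixels per byte.
    """
    assert len(pixels) % 8 == 0, "scan line length must be a multiple of 8"
    planes = []
    for bit in range(bits_per_pixel):
        plane = bytearray(len(pixels) // 8)
        for x, value in enumerate(pixels):
            if (value >> bit) & 1:
                plane[x // 8] |= 0x80 >> (x % 8)   # leftmost pixel in the MSB
        planes.append(bytes(plane))
    return planes


def read_pixel(planes, x):
    """Reassemble pixel x by pulling one bit from each plane."""
    value = 0
    for bit, plane in enumerate(planes):
        if plane[x // 8] & (0x80 >> (x % 8)):
            value |= 1 << bit
    return value


# A 320-pixel scan line of 9-bit values (arbitrary test pattern).
line = [(x * 7) % 512 for x in range(320)]
planes = to_bitplanes(line, 9)
total_bytes = sum(len(p) for p in planes)   # 9 planes x 40 bytes = 360
```

Note that the byte and bit position for pixel x (`x // 8` and `x % 8`) are the same in every plane, whatever the number of planes is.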
Does anybody know the difference between GL_UNSIGNED_SHORT_4_4_4_4 and GL_UNSIGNED_SHORT_5_6_5 data types in OpenGL ES?
They are both 16 bit textures.
When using a 32 bit texture you have 8 bits for each of the color components plus 8 bits for alpha: maximum quality and alpha control, so 8888.
With 16 bits there's always a compromise. If you only need color and not alpha, then use 565. Why 565? Because 16 bits can't be divided evenly by 3 and our eyes are better at seeing in the green spectrum, so you get better quality by giving the leftover bit to green.
If you need alpha but your images don't use gradients in alpha, use 5551: 5 bits for each color and 1 bit for alpha.
If your image has some gradient alpha, then you can use 4444: 4 bits for each color and 4 bits for alpha.
4444 has the worst color quality but it retains some alpha to mix. I use this for my font textures, for example: they are lighter than 32 bit, and since the fonts are monochromatic they fit well in 4 bits.
I am not an OpenGL expert, but:
GL_UNSIGNED_SHORT_4_4_4_4 stands for GL_UNSIGNED_SHORT_R_G_B_A, where each of the R, G, B and A components gets 4 bits (that is, 2^4 = 16 possible values).
GL_UNSIGNED_SHORT_5_6_5 stands for GL_UNSIGNED_SHORT_R_G_B. You can see that there is no alpha value available here, so that is a major difference. The RGB components also have more precision, since they get 5, 6 and 5 bits respectively.
Well, when using GL_UNSIGNED_SHORT_4_4_4_4 as type in a pixel specification command (glTexImage2D or glReadPixels), the data is assumed to be laid out in system memory as one 16bit value per-pixel, with individual components each taking up 4 consecutive bits. It can only be used with a format of GL_RGBA.
Whereas GL_UNSIGNED_SHORT_5_6_5 also assumes individual pixels as 16bit values, but with the red and blue components taking up 5 bits each and the green component having 6 bits (there is no alpha channel). It can only be used with a format of GL_RGB.
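As a rough illustration of those layouts, the sketch below reduces 8-bit components to 16-bit pixel values, with the first component in the most significant bits as the OpenGL packed types specify. The function names are made up, and simply dropping the low bits is just one way to reduce precision.

```python
def pack_4444(r, g, b, a):
    """8-bit RGBA -> 16 bits: R in bits 15-12, G 11-8, B 7-4, A 3-0."""
    return ((r >> 4) << 12) | ((g >> 4) << 8) | ((b >> 4) << 4) | (a >> 4)


def pack_565(r, g, b):
    """8-bit RGB -> 16 bits: R in bits 15-11, G 10-5, B 4-0 (no alpha)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def pack_5551(r, g, b, a):
    """8-bit RGBA -> 16 bits: R 15-11, G 10-6, B 5-1, 1-bit alpha in bit 0."""
    return ((r >> 3) << 11) | ((g >> 3) << 6) | ((b >> 3) << 1) | (a >> 7)


magenta_half_alpha = pack_4444(255, 0, 255, 128)   # 0xF0F8
pure_red = pack_565(255, 0, 0)                     # 0xF800
opaque_blue = pack_5551(0, 0, 255, 255)            # 0x003F
```

An array of such 16-bit values is the kind of data that would be passed with a format of GL_RGBA and a type of GL_UNSIGNED_SHORT_4_4_4_4, or GL_RGB with GL_UNSIGNED_SHORT_5_6_5, as described above.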
At the moment I am working on an on screen display project with black, white and transparent pixels. (This is an open source project: http://code.google.com/p/super-osd; that shows the 256x192 pixel set/clear OSD in development but I'm migrating to a white/black/clear OSD.)
Since each pixel is black, white or transparent I can use a simple 2 bit/4 state encoding where I store the black/white selection and the transparent selection. So I would have a truth table like this (x = don't care):
B/W  T   Meaning
x    0   pixel is transparent
0    1   pixel is black
1    1   pixel is white
However as can be clearly seen this wastes one bit when the pixel is transparent. I'm designing for a memory constrained microcontroller, so whenever I can save memory it is good.
So I'm trying to think of a way to pack these 3 states into some larger unit (say, a byte.) I am open to using lookup tables to decode and encode the data, so a complex algorithm can be used, but it cannot depend on the states of the pixels before or after the current unit/byte (this rules out any proper data compression algorithm) and the size must be consistent; that is, a scene with all transparent pixels must be the same as a scene with random noise. I was imagining something on the level of densely packed decimal which packs 3 x 4-bit (0-9) BCD numbers in only 10 bits with something like 24 states remaining out of the 1024, which is great. So does anyone have any ideas?
Any suggestions? Thanks!
In a byte (256 possible values) you can store 5 of your three-state values. One way to look at it: three to the fifth power is 243, slightly less than 256. The fact that it's only slightly less also shows that you're wasting hardly any space: five three-state values carry about 7.92 bits of information, and you spend 8.
To encode five of your three-state "digits" into a byte, treat the five digits in succession as a number in base 3 -- the resulting value is guaranteed to be less than 243 and is therefore directly storable in a byte. Similarly, to decode, convert the byte's value back to base 3.
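A minimal sketch of that base-3 conversion is shown below. The function names and the mapping of 0, 1 and 2 to transparent, black and white are assumptions made for illustration.

```python
def pack5(states):
    """Pack five values in {0, 1, 2} into one byte (result is 0..242)."""
    assert len(states) == 5 and all(0 <= s <= 2 for s in states)
    value = 0
    for s in states:
        value = value * 3 + s   # append one base-3 digit
    return value


def unpack5(byte):
    """Recover the five base-3 digits from a packed byte."""
    states = []
    for _ in range(5):
        byte, s = divmod(byte, 3)
        states.append(s)
    return states[::-1]         # digits come out least significant first


# Assumed mapping: 0 = transparent, 1 = black, 2 = white.
packed = pack5([2, 0, 1, 1, 0])   # 174
restored = unpack5(packed)        # [2, 0, 1, 1, 0]
```

On a microcontroller the division can be avoided by precomputing a 243-entry lookup table from `unpack5`. For a 256x192 frame (the size mentioned in the question), this takes 9831 bytes instead of the 12288 bytes needed at 2 bits per pixel.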