Debayering Bayer-encoded raw images

I have an image which I need to write a debayer for, but I can't figure out how the data is packed.
The information I have about the image:
original bpp: 64;
PNG bpp: 8;
columns: 242;
rows: 3944;
data size: 7635584 bytes.
PNG https://drive.google.com/file/d/1fr8Tg3OvhavsgYTwjJnUG3vz-kZcRpi9/view?usp=sharing
SRC data: https://drive.google.com/file/d/1O_3tfeln76faqgewAknYKJKCbDq8UjEz/view?usp=sharing
I was told that it should be BGGR, but it doesn't look like any ordinary Bayer BGGR image to me. I also received a txt file along with the image, which contains this text:
Camera resolution: 1280x944
Camera type: LVDS
Could the image be compressed somehow?
I'm completely lost here, I would appreciate any help.
[Image: Bayer pattern of the image in 8 bpp representation]

Looks like there are 4 images, and the pixels are stored in some kind of "packed 12" format.
Please note that "reverse engineering" the format is challenging, and the solution below probably has a few mistakes.
The 4 images are stored in steps of 4 rows:
aaaaaaaaaaaaa
bbbbbbbbbbbbb
ccccccccccccc
ddddddddddddd
aaaaaaaaaaaaa
bbbbbbbbbbbbb
ccccccccccccc
ddddddddddddd
...
aaa... marks the first image.
bbb... marks the second image.
ccc... marks the third image.
ddd... marks the fourth image.
There are about 168 rows at the top that we have to ignore.
Getting 1280 pixels out of 1936 bytes in each row:
Each row has 16 bytes we have to ignore.
Out of 1936 bytes, only 1920 bytes are relevant (assume we have to remove 8 bytes from each side).
The 1920 bytes represent 1280 pixels.
Every 2 pixels are stored in 3 bytes (every pixel is 12 bits).
The two 12-bit elements in each 3-byte group are packed as follows: byte 0 holds the 8 MSBs of one pixel, byte 1 the 8 MSBs of the next pixel, and byte 2 the two 4-bit LSB nibbles:
######## ######## ####|####
It's hard to tell how the LSB nibbles are divided between the two pixels (the LSBs are mainly "noise").
After unpacking the pixels and extracting one image out of the 4, the format looks like a GRBG Bayer pattern (by changing the size of the margins we may get BGGR).
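A rough Python sketch of this unpacking for a single 3-byte group (the assignment of the two LSB nibbles is an assumption, as noted above):

```python
def unpack12(triplet):
    """Unpack two 12-bit pixels from a 3-byte group.

    Assumes byte 0 holds the 8 MSBs of the even pixel, byte 1 the 8 MSBs
    of the odd pixel, and byte 2 the two 4-bit LSB nibbles, with the even
    pixel's nibble in the high half -- the nibble order is a guess.
    """
    b0, b1, b2 = triplet
    even = (b0 << 4) | (b2 >> 4)    # 8 MSBs plus high nibble of byte 2
    odd = (b1 << 4) | (b2 & 0x0F)   # 8 MSBs plus low nibble of byte 2
    return even, odd
```

For example, unpack12((0x12, 0x34, 0x5A)) gives (0x125, 0x34A).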
MATLAB code sample for extracting one image:
f = fopen('test.img', 'r'); % Open file (as binary file) for reading
T = fread(f, [1936, 168], 'uint8')'; % Skip the first 168 rows (header at the top)
I = fread(f, [1936, 944*4], 'uint8')'; % Read 944*4 rows
fclose(f);
% Convert from packed 12 to uint16 (also skip rows in steps of 4, and ignore 8 bytes from each side):
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
A = uint16(I(1:4:end, 8+1:3:end-8)); % MSB of even pixels (convert to uint16)
B = uint16(I(1:4:end, 8+2:3:end-8)); % MSB of odd pixels (convert to uint16)
C = uint16(I(1:4:end, 8+3:3:end-8)); % 4 bits are LSB of even pixels and 4 bits are LSB of odd pixels
I1 = A*16 + bitshift(C, -4); % Add the 4 LSB bits to the even pixels (the nibble order may be wrong)
I2 = B*16 + bitand(C, 15); % Add the other 4 LSB bits to the odd pixels (the nibble order may be wrong)
I = zeros(size(I1, 1), size(I1, 2)*2, 'uint16'); % Allocate 944x1280 uint16 elements.
I(:, 1:2:end) = I1; % Copy even pixels
I(:, 2:2:end) = I2; % Copy odd pixels
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
J = demosaic(I*16, 'grbg'); % Apply demosaic (multiply by 16, because MATLAB assumes the 12 bits are in the upper bits).
figure;imshow(lin2rgb(J));impixelinfo % Show the output image (lin2rgb applies gamma correction).
Result (converted to 8 bit):

Related

Xlib RGB hex strings: have I understood "scaling in bits"?

The xlib docs present two string formats for specifying RGB colours:
RGB Device String Specification
An RGB Device specification is
identified by the prefix “rgb:” and conforms to the following syntax:
rgb:<red>/<green>/<blue>
<red>, <green>, <blue> := h | hh | hhh | hhhh
h := single hexadecimal digits (case insignificant)
Note that h indicates the value scaled in 4 bits, hh the value scaled in 8 bits,
hhh the value scaled in 12 bits, and hhhh the value scaled in 16 bits,
respectively.
Typical examples are the strings “rgb:ea/75/52” and “rgb:ccc/320/320”,
but mixed numbers of hexadecimal digit strings (“rgb:ff/a5/0” and
“rgb:ccc/32/0”) are also allowed.
For backward compatibility, an older syntax for RGB Device is
supported, but its continued use is not encouraged. The syntax is an
initial sharp sign character followed by a numeric specification, in
one of the following formats:
#RGB (4 bits each)
#RRGGBB (8 bits each)
#RRRGGGBBB (12 bits each)
#RRRRGGGGBBBB (16 bits each)
The R, G, and B represent single hexadecimal digits. When fewer than 16 bits each are specified, they
represent the most significant bits of the value (unlike the “rgb:”
syntax, in which values are scaled). For example, the string “#3a7” is
the same as “#3000a0007000”.
https://www.x.org/releases/X11R7.7/doc/libX11/libX11/libX11.html#RGB_Device_String_Specification
So there are strings like either rgb:c7f1/c7f1/c7f1 or #c7f1c7f1c7f1
The part that confuses me is:
When fewer than 16 bits each are specified, they represent the most significant bits of the value (unlike the “rgb:” syntax, in which values are scaled)
Let's say we have 4-bit colour strings, i.e. one hexadecimal digit per component.
C is 12 in decimal, out of 16.
If we want to scale it up to 8-bits, I guess you have 12 / 16 * 256 = 192
192 in hexadecimal is C0
So this seems to give exactly the same result as "the most significant bits of the value" i.e. padding with 0 to the right-hand side.
Since the docs make a distinction between the two formats I wondered if I have misunderstood what they mean by "scaling in bits" for the rgb: format?
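One way to compare the two readings numerically (a sketch; it assumes the commonly implemented rule that under the rgb: syntax an n-digit value v maps to v * 65535 / (16^n - 1), whereas the # syntax pads the digits with zeros on the right):

```python
def scale_to_16bit(value, ndigits):
    """'rgb:' syntax: scale an n-hex-digit value onto the full 16-bit range.
    Note the divisor is 16**n - 1, not 16**n."""
    return value * 0xFFFF // (16 ** ndigits - 1)

def pad_to_16bit(value, ndigits):
    """'#' syntax: the digits are the most significant bits; pad with zeros."""
    return value << (4 * (4 - ndigits))

# For the single digit "c" (12): scaled gives 0xCCCC, padded gives 0xC000,
# so the two formats do differ for values with fewer than four digits.
```

Under this rule the two interpretations only coincide at 0 and at the maximum digit value, which would explain why the docs draw the distinction.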

swap dimensions of 4-D image array

I have a 4-D array of images of shape [32 32 3 1000]. In Matlab, how can I change this so that the image number (1000) is the first index instead of the last, i.e. shape [1000 32 32 3]?
I tried A = permute(A, [1000 32 32 3]);, but it says:
Error using permute ORDER contains an invalid permutation index.
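The error arises because permute expects a permutation of the dimension indices, not the dimension sizes, so the intended call is permute(A, [4 1 2 3]). A pure-Python sketch of the same reordering on a nested list (illustrative only):

```python
def move_last_axis_first(a):
    """Reorder a 4-D nested list from shape [d0][d1][d2][d3] to
    [d3][d0][d1][d2] -- the analogue of MATLAB's permute(A, [4 1 2 3])."""
    d0, d1, d2, d3 = len(a), len(a[0]), len(a[0][0]), len(a[0][0][0])
    return [[[[a[i][j][k][n] for k in range(d2)]
              for j in range(d1)]
             for i in range(d0)]
            for n in range(d3)]
```

Every element satisfies result[n][i][j][k] == a[i][j][k][n].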

How many bits (maximum number of bits) can be embedded in a pixel?

I have a grayscale image of size 512x512, so each pixel contains 8 bits. Can I embed a total of 8 bits into each pixel I wish to embed data in? Is this possible? (I need the image only for embedding data.) If I want to embed data in 10,000 pixels out of the total 512*512 pixels, can I then embed a total of 80,000 bits of data, i.e. 10 kB of data?
A standard grayscale image with 256 levels for each pixel requires 8 bits per pixel. This is because 8 bits are required to encode 256 different levels. If you have an image with dimensions 512 x 512 then the total number of pixels in the entire image is 262,144 pixels. So, the entire image contains 8 bits * 262,144 = 2,097,152 bits worth of information.
If you were to take a subset of these pixels and encode 8 bits of "different" information, note that the resulting image would likely change in appearance. The 8 bits of information at each pixel coordinate previously encoded the pixel intensity (from 0 to 255). If you are replacing this value with some other value then the intensity will be different and the overall image will appear different.
If you want to embed 10KiB of data in a 512x512 image, where the bit depth is 8 bits, I'd recommend just storing 1 bit of data in every second pixel by changing the LSB of each.
Changing just 1 bit of data from every other pixel allows you to store (512*512*1)/2 bits of data, or 16KiB of data. This way you can store all of the data that you need to while only changing the image in a very limited way.
As an example, here's an image with varying amounts of white noise embedded within it (by embedding n bits per pixel); you can see how much noise (data) is embedded in the table below:
X | Y | bits used | data(KiB)
0 | 0 | 0 | 0
1 | 0 | 1 | 32
0 | 1 | 2 | 64
1 | 1 | 3 | 96
0 | 2 | 4 | 128
1 | 2 | 5 | 160
0 | 3 | 6 | 192
1 | 3 | 7 | 224
_ | _ | 8 | 256 (image omitted as just white noise)
As can be seen, embedding up to 64KiB of data into a 512x512x8 image is perfectly reasonable, with little noticeable change in the image, by editing the 2 LSBs of each pixel, so that a pixel is encoded as:
XXXX XXYY
Where X came from the original image, and Y is 2 bits of the stored data.
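A minimal Python sketch of the XXXX XXYY scheme above (the traversal order and the packing of data bytes into 2-bit chunks are illustrative choices):

```python
def embed_2lsb(pixels, data):
    """Embed the bytes in `data` into the 2 LSBs of each 8-bit pixel.
    Four pixels carry one data byte, so capacity is len(pixels) // 4 bytes."""
    assert len(data) * 4 <= len(pixels), "not enough pixels for this payload"
    out = list(pixels)
    for i, byte in enumerate(data):
        for p in range(4):                          # 4 two-bit chunks per byte
            chunk = (byte >> (6 - 2 * p)) & 0b11    # MSB chunk first
            out[i * 4 + p] = (out[i * 4 + p] & 0b11111100) | chunk
    return out

def extract_2lsb(pixels, nbytes):
    """Recover `nbytes` bytes from the 2 LSBs of the pixels."""
    data = []
    for i in range(nbytes):
        byte = 0
        for p in range(4):
            byte = (byte << 2) | (pixels[i * 4 + p] & 0b11)
        data.append(byte)
    return data
```

Each pixel changes by at most 3 intensity levels, which is why the visual impact stays small.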

Why does an image loaded in memory (in libGDX) take up much more space than the file?

I have an image of size 2048 x 2048 that is 437 KB on disk. But when libGDX loads it, my memory usage rises sharply (4 MB). It seems that OpenGL ignores the file type and compression and only sees the bitmap in memory.
The problem is that when loading many assets at once on Android, the program exits without an error.
My game is a strategy game, with backgrounds, buildings and many characters. They are all needed in the scene at the same time, so there is no possibility of loading them on demand (games like Clash of Clans).
The question now is: how can I load this many large images into memory and still run on Android phones with low RAM?
It's because OpenGL stores images in uncompressed formats in memory. Memory use can be reduced using S3TC (DXT) texture compression algorithms. To enable it you just have to specify a compressed internal format in your glTexImage2D call:
glTexImage2D(GL_TEXTURE_2D, 0, GL_COMPRESSED_RGB_S3TC_DXT1_EXT, width, height, 0, externalFormat, GL_UNSIGNED_BYTE, data);
You can also check if texture is compressed calling:
int iFlag;
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_COMPRESSED, &iFlag);
Or load already compressed DDS textures by calling:
void glCompressedTexImage2D(GLenum target, GLint level, GLenum internalFormat,
GLsizei width, GLsizei height,
GLint border, GLsizei imageSize, void *data);
Some more information on S3 Texture Compress (S3TC) is provided below (taken from https://www.opengl.org/wiki/S3_Texture_Compression):
S3TC is a technique for compressing images for use as textures. Standard image compression techniques like JPEG and PNG can achieve greater compression ratios than S3TC. However, S3TC is designed to be implemented in high-performance hardware. JPEG and PNG decompress images all-at-once, while S3TC allows specific sections of the image to be decompressed independently.
S3TC is a block-based format. The image is broken up into 4x4 blocks. For non-power-of-two images that aren't a multiple of 4 in size, the other colors of the 4x4 block are taken to be black. Each 4x4 block is independent of any other, so it can be decompressed independently.
There are 3 forms of S3TC accepted by OpenGL. These forms are named after the old Direct3D names for these formats: DXT1, DXT3 and DXT5.
DXT1 Format
A DXT1-compressed image is an RGB image format. As such, the alpha of any color is assumed to be 1. Each 4x4 block takes up 64-bits of data, so compared to a 24-bit RGB format, it provides 6:1 compression. You can get a DXT1 image by using the GL_COMPRESSED_RGB_S3TC_DXT1_EXT as the internal format of the image.
Each 4x4 block stores color data as follows. There are 2 16-bit color values, color0 followed by color1. Following this is a 32-bit unsigned integer containing values that describe how the two colors are combined to determine the color for a given pixel.
The 2 16-bit color values are stored in little-endian format, so the low byte of the 16-bit color comes first in each case. The color values are stored in RGB order (from high bit to low bit) in 5_6_5 bits.
The 32-bit unsigned integer is also stored in little-endian format. Every 2 bits of the integer represent a pixel; the 2 bits are a code that defines how to combine color0 and color1 to produce the color of that pixel. In order from highest bit to lowest bit (after the little-endian conversion), the pixels are stored in row-major order. Every 8 bits, 4 2-bit codes, is a single row of the image.
Here is a diagram of the setup:
63 55 47 39 31 23 15 7 0
| c0-low | c0-hi | c1-low | c1-hi | codes0 | codes1 | codes2 | codes3 |
-------------------------------------------------------------------------
c0-low is the low byte of the 16-bit color 0; similarly, c0-hi is the high byte of color 0. To reconstitute color 0, simply do this: ((bytes[1] << 8) + bytes[0]), where bytes is an array containing the above sequence of bytes. Color 1 would come from ((bytes[3] << 8) + bytes[2]).
Similarly, the codes are the bytes that make up the 32-bit integer of bitcodes. They have to be rebuilt in reverse order, so codes3 is the left-most byte.
Once rebuilt into their proper order, you can get the individual 2-bit values like this. The pixel values are in cXY form, where Y goes bottom to top as is standard for OpenGL:
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
| c00 | c10 | c20 | c30 | c01 | c11 | c21 | c31 | c02 | c12 | c22 | c32 | c03 | c13 | c23 | c33
| codes3 | codes2 | codes1 | codes0
------------------------------------------------------------------------------------------------
The interpretation of the 2-bit values depends on how color0 and color1 compare to each other. If the integer value of color0 is greater than color1, then the 2-bit values mean something different than if color0 is less than or equal to color1. The meaning of the 2-bit values is as follows:
code | color0 > color1 | color0 <= color1
----------------------------------------------------
0 | color0 | color0
1 | color1 | color1
2 | (2*color0 + color1) / 3 | (color0 + color1) / 2
3 | (color0 + 2*color1) / 3 | Black
The arithmetic operations are done per-component, not on the integer value of the colors. And the value "Black" is simply R=G=B=0.
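The block layout and the table above can be sketched as a small decoder in Python (a sketch only: it assumes the common convention that the lowest two bits of the little-endian 32-bit integer hold the first pixel of the first row, and it uses integer division rather than the exact rounding a conformant decoder must use):

```python
import struct

def rgb565(c):
    """Expand a 16-bit 5:6:5 color to an (r, g, b) tuple of 8-bit values."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    # Replicate the high bits into the low bits to fill the 8-bit range.
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

def decode_dxt1_block(block):
    """Decode one 8-byte DXT1 block into a 4x4 grid (list of rows) of RGB."""
    c0, c1, codes = struct.unpack('<HHI', block)   # two colors + 32-bit codes
    p0, p1 = rgb565(c0), rgb565(c1)
    if c0 > c1:                                    # 4-color mode
        palette = [p0, p1,
                   tuple((2 * a + b) // 3 for a, b in zip(p0, p1)),
                   tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))]
    else:                                          # 3-color mode + black
        palette = [p0, p1,
                   tuple((a + b) // 2 for a, b in zip(p0, p1)),
                   (0, 0, 0)]
    return [[palette[(codes >> (2 * (4 * row + col))) & 0b11]
             for col in range(4)]
            for row in range(4)]
```

If the decoded image comes out flipped, the bit traversal order is the first thing to revisit against a known reference texture.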
DXT1 with 1-bit Alpha
There is a form of DXT1 available that provides a simple on/off alpha value. This format therefore uses an RGBA base format. To get this format, use the GL_COMPRESSED_RGBA_S3TC_DXT1_EXT internal format.
The format of the data is identical to the above case, which is why this is still DXT1 compression. The interpretation differs slightly. You always get an alpha value of 1 unless the pixel uses the code for Black in the above table. In that case, you get an alpha value of 0.
Note that this means that the RGB colors will also be 0 on any pixel with a 0 alpha. This also means that bilinear filtering between neighboring texels will result in colors combined with black. If you are using premultipled alpha blending, this is what you want. If you aren't, then it almost certainly is not what you want.
When using OpenGL to compress a texture, the GL implementation will assume any pixel with an alpha value < 0.5 should have an alpha of 0. This is another reason to manually compress images.
DXT3 Format
The DXT3 format is an RGBA format. Each 4x4 block takes up 128 bits of data. Thus, compared to a 32-bit RGBA texture, it offers 4:1 compression. You can get this with the GL_COMPRESSED_RGBA_S3TC_DXT3_EXT internal format.
Each block of 128 bits is broken into 2 64-bit chunks. The second chunk contains the color information, compressed almost as in the DXT1 case; the difference being that the 2-bit codes are always interpreted as in the color0 > color1 (four-color) case, regardless of how the two color values actually compare. The first chunk contains the alpha information.
The alpha 64-bit chunk is stored as a little-endian 64-bit unsigned integer. The alpha values are stored as 4-bit-per-pixel alpha values. The alpha values are stored in row-major order, from the highest bit of the 64-bit unsigned integer.
DXT5 Format
The DXT5 format is an alternate RGBA format. As in the DXT3 case, each 4x4 block takes up 128 bits. So it provides the same 4:1 compression as in the DXT3 case. You can get this with the GL_COMPRESSED_RGBA_S3TC_DXT5_EXT format.
Just as for the DXT3 format, there are two 64-bit chunks of data per block: an RGB chunk compressed as for DXT1 (with the same caveat as for DXT3), and an alpha chunk. Again the second chunk is the color chunk; the first is the alpha.
Where DXT3 and DXT5 differ is how the alpha chunk is compressed. DXT5 compresses the alpha using a compression scheme similar to DXT1.
The alpha data is stored as 2 8-bit alpha values, alpha0 and alpha1, followed by a 48-bit unsigned integer that describes how to combine these two reference alpha values to achieve the final alpha value. The 48-bit integer is also stored in little-endian order.
The 48-bit unsigned integer contains 3-bit codes that describe how to compute the final alpha value. These codes are stored in the identical order as the codes in DXT1; they simply are 3 bits in size rather than 2.
Just as in the DXT1 case, the codes have different meanings depending on how alpha0 and alpha1 compare to one another. Here is the table of codes and computations:
code | alpha0 > alpha1         | alpha0 <= alpha1
---------------------------------------------------
 0   | alpha0                  | alpha0
 1   | alpha1                  | alpha1
 2   | (6*alpha0 + 1*alpha1)/7 | (4*alpha0 + 1*alpha1)/5
 3   | (5*alpha0 + 2*alpha1)/7 | (3*alpha0 + 2*alpha1)/5
 4   | (4*alpha0 + 3*alpha1)/7 | (2*alpha0 + 3*alpha1)/5
 5   | (3*alpha0 + 4*alpha1)/7 | (1*alpha0 + 4*alpha1)/5
 6   | (2*alpha0 + 5*alpha1)/7 | 0.0
 7   | (1*alpha0 + 6*alpha1)/7 | 1.0
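A minimal Python sketch of this alpha-palette construction (mapping the table's 0.0 and 1.0 entries to 0 and 255 for 8-bit alpha, with integer division standing in for the spec's exact rounding):

```python
def dxt5_alpha_palette(alpha0, alpha1):
    """Build the 8-entry alpha lookup table for one DXT5 block, indexed by
    the 3-bit codes stored in the block's 48-bit integer."""
    if alpha0 > alpha1:
        # 8 interpolated alphas between the two endpoints.
        return [alpha0, alpha1] + [
            ((7 - i) * alpha0 + i * alpha1) // 7 for i in range(1, 7)]
    # 6 interpolated alphas, plus fully transparent and fully opaque.
    return [alpha0, alpha1] + [
        ((5 - i) * alpha0 + i * alpha1) // 5 for i in range(1, 5)] + [0, 255]
```

Each pixel's final alpha is then palette[code] for its 3-bit code.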
More info at S3 Texture Compression
There are also more modern texture compression algorithms
For OpenGL 4.2 / OpenGL ES 3.0, Ericsson Texture Compression (ETC):
GL_COMPRESSED_RGB8_ETC2
GL_COMPRESSED_SRGB8_ETC2
GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2
GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2
GL_COMPRESSED_RGBA8_ETC2_EAC
GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC
GL_COMPRESSED_R11_EAC
GL_COMPRESSED_SIGNED_R11_EAC
GL_COMPRESSED_RG11_EAC
GL_COMPRESSED_SIGNED_RG11_EAC
And Adaptive Scalable Texture Compression (ASTC).

Direct Table & Lookup Table

How do you measure the memory size of an image in the direct-coded 24-bit RGB color model and in a 24-bit 256-entry look-up table representation? For example: given an image of resolution 800*600, how much space is required to save the image using direct coding and using a look-up table?
For a regular 24-bit RGB representation you most probably just have to multiply the number of pixels by the number of bytes per pixel. 24 bits = 3 bytes, so the size is 800 * 600 * 3 bytes = 1,440,000 bytes ≈ 1.37 MiB. In some cases rows of an image may be aligned on some boundary in memory, usually 4, 8 or 32 bytes. But since each row (800 * 3 = 2400 bytes) is already a multiple of 32, this changes nothing: still 1.37 MiB.
Now, for a look-up table, you have 1 byte per pixel, since you only have to address one entry in the table. This yields 800 * 600 * 1 = 480,000 bytes ≈ 0.46 MiB. Plus the table itself: 256 colors at 24 bits (3 bytes) each, 256 * 3 = 768 bytes, which is negligible compared to the size of the image.
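The arithmetic above, as a quick sanity check:

```python
width, height = 800, 600

direct_bytes = width * height * 3       # 24-bit RGB: 3 bytes per pixel
lut_image_bytes = width * height * 1    # 1-byte index per pixel
lut_table_bytes = 256 * 3               # 256 palette entries of 3 bytes each

direct_mib = direct_bytes / 2**20                       # ~1.37 MiB
lut_mib = (lut_image_bytes + lut_table_bytes) / 2**20   # ~0.46 MiB
```

The look-up-table representation is roughly a third of the direct-coded size for this image.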
