Monochrome Bitmap (1 bpp) Padding and extra 0xF0 byte - algorithm

I am working with a Monochrome Bitmap image, 1 bit per pixel.
When I examine the file with a hexadecimal editor, I notice that each row ends with the following hexadecimal sequence: f0 00 00 00.
Having studied the problem a little bit, I concluded that the three last bytes 00 00 00 correspond to the row padding.
Question 1:
I would like to know whether the following algorithm for determining the number of padding bytes in a 1 bpp BMP image is correct:
if(((n_width % 32) == 0) || ((n_width % 32) > 24))
{
    n_nbPaddingBytes = 0;
}
else if((n_width % 32) <= 8)
{
    n_nbPaddingBytes = 3;
}
else if((n_width % 32) <= 16)
{
    n_nbPaddingBytes = 2;
}
else
{
    n_nbPaddingBytes = 1;
}
n_width is the width in pixels of the BMP image.
For example, if n_width = 100 px then n_nbPaddingBytes = 3.
Question 2:
Apart from the padding (00 00 00), there is an F0 byte preceding the three padding bytes on every row. It results in a black vertical line, 4 pixels wide, on the right side of the image.
Note 1: I am manipulating the image prior to printing it on a Zebra printer (I am flipping the image vertically and inverting the colors: a black pixel becomes a white one and vice versa).
Note 2: When I open the original BMP image with Paint, it has no such black vertical line on its right side.
Is there any reason why this byte 0xF0 is present at the end of each row?
Thank you for helping.
Best regards.

The bits representing the bitmap pixels are packed in rows. The size of each row is rounded up to a multiple of 4 bytes (a 32-bit DWORD) by padding.
RowSize = [(BitsPerPixel * ImageWidth + 31) / 32] * 4 (division is integer)
(BMP file format)
A monochrome image with width = 100 has a row size of 16 bytes (128 bits). The 100 pixels occupy 12.5 bytes, so the remaining 3.5 bytes are padding: the low nibble of F0 plus 00 00 00. The F nibble holds the rightmost 4 columns of the image (white with the usual 0/1 palette).
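As a rough check of the formula above, here is a minimal Python sketch. The function names are my own and do not come from any BMP library. It computes the stored row size, the number of whole padding bytes, and the number of unused bits in the last data byte of a 1 bpp row.

```python
def bmp_row_size(width_px, bits_per_pixel=1):
    """Bytes per stored row, rounded up to a multiple of 4."""
    return (bits_per_pixel * width_px + 31) // 32 * 4


def bmp_row_layout(width_px, bits_per_pixel=1):
    """Return (data_bytes, padding_bytes, unused_bits_in_last_data_byte)."""
    used_bits = bits_per_pixel * width_px
    data_bytes = (used_bits + 7) // 8          # bytes holding at least one pixel bit
    padding_bytes = bmp_row_size(width_px, bits_per_pixel) - data_bytes
    unused_bits = data_bytes * 8 - used_bits   # low-order bits of the last data byte
    return data_bytes, padding_bytes, unused_bits


def last_byte_mask(width_px):
    """Mask that keeps only the real pixel bits of the last data byte (1 bpp)."""
    unused_bits = bmp_row_layout(width_px)[2]
    return (0xFF << unused_bits) & 0xFF


if __name__ == "__main__":
    print(bmp_row_size(100))         # 16
    print(bmp_row_layout(100))       # (13, 3, 4)
    print(hex(last_byte_mask(100)))  # 0xf0
```

If the colors are inverted byte by byte, the unused bits are flipped along with the real pixels. A consumer that treats each row as a whole number of bytes may then draw them as an extra column, so forcing those bits to the background value with a mask like the one above is one possible way to avoid that.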

Related

Turn off sw_scale conversion to planar YUV 32 byte alignment requirements

I am experiencing artifacts on the right edge of scaled and converted images when converting into planar YUV pixel formats with sw_scale. I am reasonably sure (although I cannot find it anywhere in the documentation) that this is because sw_scale uses an optimization for 32-byte-aligned lines in the destination. However, I would like to turn this off because I am using sw_scale for image composition, so even though the destination lines may be 32-byte aligned, the output image may not be.
Example.
Full output frame is 1280x720 yuv422p10le. (this is 32 byte aligned)
However into the top left corner I am scaling an image with an outwidth of 1280 / 3 = 426.
426 in this format is not 32 byte aligned, but I believe sw_scale sees that the output linesize is 32 byte aligned and overwrites the width of 426 putting garbage in the next 22 bytes of data thinking this is simply padding when in my case this is displayable area.
This is why I need to actually disable this optimization or somehow trick sw_scale into believing it does not apply while keeping intact the way the program works, which is otherwise fine.
I have tried adding extra padding to the destination lines so they are no longer 32-byte aligned, but this did not help as far as I can tell.
Edit with code example. Rendering omitted for ease of use.
Also, here is a similar issue; unfortunately, as I stated, their fix will not work for my use case. https://github.com/obsproject/obs-studio/pull/2836
Use the commented-out line of code to swap between an output width that is and is not 32-byte aligned.
#include "libswscale/swscale.h"
#include "libavutil/imgutils.h"
#include "libavutil/pixelutils.h"
#include "libavutil/pixfmt.h"
#include "libavutil/pixdesc.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* memset */
int main(int argc, char **argv) {
/// Set up a 1280x720 window, and an item with 1/3 width and height of the window.
int window_width, window_height, item_width, item_height;
window_width = 1280;
window_height = 720;
item_width = (window_width / 3);
item_height = (window_height / 3);
int item_out_width = item_width;
/// This line sets the item width to be 32 byte aligned uncomment to see uncorrupted results
/// Note %16 because outformat is 2 bytes per component
//item_out_width -= (item_width % 16);
enum AVPixelFormat outformat = AV_PIX_FMT_YUV422P10LE;
enum AVPixelFormat informat = AV_PIX_FMT_UYVY422;
int window_lines[4] = {0};
av_image_fill_linesizes(window_lines, outformat, window_width);
uint8_t *window_planes[4] = {0};
window_planes[0] = calloc(1, window_lines[0] * window_height);
window_planes[1] = calloc(1, window_lines[1] * window_height);
window_planes[2] = calloc(1, window_lines[2] * window_height); /// Fill the window with all 0s, this is green in yuv.
int item_lines[4] = {0};
av_image_fill_linesizes(item_lines, informat, item_width);
uint8_t *item_planes[4] = {0};
item_planes[0] = malloc(item_lines[0] * item_height);
memset(item_planes[0], 100, item_lines[0] * item_height);
struct SwsContext *ctx;
ctx = sws_getContext(item_width, item_height, informat,
item_out_width, item_height, outformat, SWS_FAST_BILINEAR, NULL, NULL, NULL);
/// Check a block in the normal region
printf("Pre scale normal region %d %d %d\n", (int)((uint16_t*)window_planes[0])[0], (int)((uint16_t*)window_planes[1])[0],
(int)((uint16_t*)window_planes[2])[0]);
/// Check a block in the corrupted region (should be all zeros) These values should be out of the converted region
int corrupt_offset_y = (item_out_width + 3) * 2; ///(item_width + 3) * 2 bytes per component Y PLANE
int corrupt_offset_uv = (item_out_width + 3); ///(item_width + 3) * (2 bytes per component rshift 1 for horiz scaling) U and V PLANES
printf("Pre scale corrupted region %d %d %d\n", (int)(*((uint16_t*)(window_planes[0] + corrupt_offset_y))),
(int)(*((uint16_t*)(window_planes[1] + corrupt_offset_uv))), (int)(*((uint16_t*)(window_planes[2] + corrupt_offset_uv))));
sws_scale(ctx, (const uint8_t**)item_planes, item_lines, 0, item_height,window_planes, window_lines);
/// Perform the same tests after scaling
printf("Post scale normal region %d %d %d\n", (int)((uint16_t*)window_planes[0])[0], (int)((uint16_t*)window_planes[1])[0],
(int)((uint16_t*)window_planes[2])[0]);
printf("Post scale corrupted region %d %d %d\n", (int)(*((uint16_t*)(window_planes[0] + corrupt_offset_y))),
(int)(*((uint16_t*)(window_planes[1] + corrupt_offset_uv))), (int)(*((uint16_t*)(window_planes[2] + corrupt_offset_uv))));
return 0;
}
Example Output:
//No alignment
Pre scale normal region 0 0 0
Pre scale corrupted region 0 0 0
Post scale normal region 400 400 400
Post scale corrupted region 512 36865 36865
//With alignment
Pre scale normal region 0 0 0
Pre scale corrupted region 0 0 0
Post scale normal region 400 400 400
Post scale corrupted region 0 0 0
I believe sw_scale sees that the output linesize is 32 byte aligned and overwrites the width of 426 putting garbage in the next 22 bytes of data thinking this is simply padding when in my case this is displayable area.
That's actually correct; swscale does indeed do that. Good analysis. There are two ways to get rid of this:
1. Disable all SIMD code using av_set_cpu_flags_mask(0).
2. Write the rescaled 426xN image into a temporary buffer and then manually copy the pixels into the unpadded destination plane.
The reason ffmpeg/swscale overwrite the destination is for performance. If you don't care about runtime and want the simplest code, use the first solution. If you do want performance and don't mind slightly more complicated code, use the second solution.
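To illustrate the second solution, here is a minimal Python sketch of the copy step only. It does not call FFmpeg, and `blit_plane` and its parameters are hypothetical names of my own. The assumption is that the scaler has already written into a temporary plane whose rows may contain garbage past the visible width, and that only the visible bytes of each row should reach the destination plane. In C this would typically be one `memcpy` per row, per plane.

```python
def blit_plane(dst, dst_stride, src, src_stride, row_bytes, rows, dst_offset=0):
    """Copy `rows` rows of `row_bytes` bytes from src into dst.

    Only the visible bytes of each row are copied, so anything the scaler
    wrote into the temporary buffer's padding never reaches dst.
    """
    for y in range(rows):
        s = y * src_stride
        d = dst_offset + y * dst_stride
        dst[d:d + row_bytes] = src[s:s + row_bytes]


if __name__ == "__main__":
    # Toy numbers: a 5-byte-wide item in a temp buffer with an 8-byte stride,
    # composed into a destination plane with a 16-byte stride.
    tmp_stride, dst_stride, width, rows = 8, 16, 5, 2
    tmp = bytearray([1] * 5 + [9] * 3) * rows  # the 9s stand in for overwritten padding
    dst = bytearray(dst_stride * rows)
    blit_plane(dst, dst_stride, tmp, tmp_stride, width, rows)
    print(9 in dst)  # False: the padding bytes were not copied
```

For a format with 2 bytes per component, `row_bytes` would be the plane's visible width in samples times 2, computed separately for the luma and chroma planes.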

Matlab: Reorder 5 bytes to 4 times 10 bits RAW10

I would like to import a RAW10 file into Matlab. The raw data is attached directly to the JPEG file provided by the Raspberry Pi camera.
Four pixels are saved as 5 bytes.
The first four bytes contain bits 9-2 of each of the four pixels.
The last byte contains the missing least significant bits.
sizeRAW = 6404096;
sizeHeader =32768;
I = ones(1944,2592);
fin=fopen('0.jpeg','r');
off1 = dir('0.jpeg');
offset = off1.bytes - sizeRAW + sizeHeader;
fseek(fin, offset,'bof');
pixel = ones(1944,2592);
I=fread(fin,1944,'ubit10','l');
for col=1:2592
I(:,col)=fread(fin,1944,'ubit8','l');
col = col+4;
end
fclose(fin);
This is as far as I have gotten so far, but it is not right.
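The packing described above can be sketched in Python as follows. This is a hedged illustration rather than a tested reader for Raspberry Pi files. `unpack_raw10` is a name of my own, and the position of each pixel's two low bits inside the fifth byte is an assumption that should be checked against the sensor documentation.

```python
def unpack_raw10(packed):
    """Unpack RAW10 data: every 5 bytes hold 4 pixels of 10 bits each.

    ASSUMPTION: in the fifth byte, bits 1:0 belong to the first pixel,
    bits 3:2 to the second, and so on. Verify this for your sensor.
    """
    if len(packed) % 5 != 0:
        raise ValueError("RAW10 data length must be a multiple of 5")
    pixels = []
    for i in range(0, len(packed), 5):
        low_bits = packed[i + 4]
        for k in range(4):
            high = packed[i + k] << 2           # bits 9..2 of pixel k
            low = (low_bits >> (2 * k)) & 0b11  # bits 1..0 of pixel k
            pixels.append(high | low)
    return pixels


if __name__ == "__main__":
    group = bytes([0xFF, 0x00, 0xAA, 0x55, 0b11100100])
    print(unpack_raw10(group))  # [1020, 1, 682, 343]
```

Note that 6404096 - 32768 = 6371328 = 3264 x 1952, which suggests that each row is padded beyond the 3240 bytes that 2592 ten-bit pixels need. That is an inference from the sizes in the question, not something the question states, so the row stride should be confirmed before the rows are reshaped.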

Algorithm to Generate All Possible Black and White Pixel Images in 640 x 360 Dimensions?

I have very minimal programming experience.
I would like to write a program that will generate and save as a gif image every possible image that can be created using only black and white pixels in 640 by 360 px dimensions.
In other words, each pixel can be either black or white. 640 x 360 = 230,400 pixels. So I believe total of 460,800 images are possible to be generated (230,400 x 2 for black/white).
I would like a program to do this automatically.
Please help!
First, to answer your questions: yes, there will be writing on "some" pictures. In fact, every text ever written by a human that fits in 640x360 pixels will show up, along with every other text (text not yet written, or text that never will be written). You will also see pictures of every human who is, was, or will be alive. See the Infinite Monkey Theorem for further information.
The code to create the GIF you want is fairly easy. I used Java for this. Note that you need an extra class: AnimatedGifEncoder. The code is not memory-bound because the AnimatedGifEncoder writes each image to disk as soon as it is computed. But make sure that you have enough disk space available.
import java.awt.Color;
import java.awt.image.BufferedImage;
public class BigPicture {
private final int width;
private final int height;
private final int WHITE = Color.WHITE.getRGB();
private final int BLACK = Color.BLACK.getRGB();
public BigPicture(int width, int height) {
this.width = width;
this.height = height;
}
public void process(String outFile) {
AnimatedGifEncoder gif = new AnimatedGifEncoder();
gif.setSize(width, height);
gif.setTransparent(null); // no transparency
gif.setRepeat(-1); // play only once
gif.setDelay(0); // 0 ms delay between images,
// 'cause ain't nobody got time for that!
gif.start(outFile);
BufferedImage bufferedImage = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);
// set the image to all white
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
bufferedImage.setRGB(x, y, WHITE);
}
}
// add white image
gif.addFrame(bufferedImage);
// add all other combinations
while (increase(bufferedImage)) {
gif.addFrame(bufferedImage);
}
gif.finish();
}
/**
* @param bufferedImage
* the image to increase
* @return false if the last pixel was set to black => the image is completely black
*/
private boolean increase(BufferedImage bufferedImage) {
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
if (bufferedImage.getRGB(x, y) == WHITE) {
bufferedImage.setRGB(x, y, BLACK);
return true;
}
bufferedImage.setRGB(x, y, WHITE);
}
}
return false;
}
public static void main(String[] args) {
new BigPicture(640, 360).process("C:\\temp\\bigpicture.gif");
System.out.println("finished.");
}
}
Please be aware that this will take some time. So don't bother waiting and enjoy your life instead! ;)
EDIT: Since my solution is a bit unclear, I will explain the algorithm.
I have defined a method called increase. This method takes the BufferedImage and changes the bit pattern of the image so that the next bit pattern appears. The method is just a bit addition. The method will return false if the image encounters the last bit pattern (all pixels are set to black).
As long as it is possible to increase the bit pattern (i.e. increase() returns true) we will save the image as new frame and increase the image again.
How the increase() method works: The method runs over the image first in x-direction then in y-direction. I assume that white pixels are 0 and black pixels are 1. So, we want to take the bit pattern of the image and add 1. We inspect the first pixel: if it is white (0) we can add 1 without an overflow so we turn the pixel to black (0 + 1 = 1 => black pixel). After that we return from the method because we want to increase only one position. It returns true because an increase was possible. If we encounter a black pixel we have an overflow (1 + 1 = 2 or in binary 10). So we have to set the current pixel to white and add the 1 to the next pixel. This will continue until we find the first white pixel.
example:
First we create a print method that prints the image as a binary number. Note that the number is reversed: the most significant bit is on the right side.
public void print(BufferedImage bufferedImage) {
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
if (bufferedImage.getRGB(x, y) == WHITE) {
System.out.print(0); // white pixel
} else {
System.out.print(1); // black pixel
}
}
}
System.out.println();
}
now we modify our main-while loop:
print(bufferedImage); // this one prints the empty image
while (increase(bufferedImage)) {
print(bufferedImage);
}
and now set some short example to test:
new BigPicture(1, 5).process("C:\\temp\\bigpicture.gif");
and finally the output:
00000 // 0 this is the first print before the loop -> "white image"
10000 // 1 the first white pixel is set to black
01000 // 2 the first overflow, so the second pixel is set to black "2"
11000 // 3
00100 // 4
10100 // 5
01100
11100
00010 // 8
10010
01010
11010
00110
10110
01110
11110
00001 // 16
10001
01001
11001
00101
10101
01101
11101
00011
10011
01011
11011
00111
10111
01111
11111 // 31 == 2^5 - 1
finished.
In other words, each pixel can be either black or white. 640 x 360 =
230,400 pixels. So I believe total of 460,800 images are possible to
be generated (230,400 x 2 for black/white).
There is a little flaw in your belief. You are right about the number of pixels: 230,400. Unfortunately, this means there are not 2 * 230,400, but 2 ^ 230,400 possible pictures, which is a number with more than 60,000 digits (longer than the allowed answer size, I am afraid). For comparison, the diameter of the observable universe expressed in centimeters (a centimeter being roughly the width of a pinkie) is a number with only about 29 digits.
To understand why your computation of the number of pictures is wrong, consider this example: if your pictures contained only three pixels, you could have 8 different pictures (2 ^ 3), rather than 6 (2 * 3). Here are all of them: BBB, BBW, BWB, BWW, WBB, WBW, WWB, WWW. Adding another pixel doubles the number of possible pictures, because you can have it white for all the 3-pixel cases or black for all the 3-pixel cases. Doubling 1 (which is the number of pictures you can have with 0 pixels) 230,400 times gives you 2 ^ 230,400.
It's great that there is a bounty for the question, but it is rather distracting and counter-productive if it was just an April Fools' joke.
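The counting argument can be checked with a short Python sketch. The helper names are my own. It enumerates every image for a tiny pixel count and estimates the number of decimal digits of 2 ^ 230,400 from a logarithm instead of printing the number.

```python
import math
from itertools import product


def count_images(num_pixels, colors=2):
    """Number of distinct images: each pixel independently takes one of `colors` values."""
    return colors ** num_pixels


def decimal_digits_of_power_of_two(exponent):
    """Number of decimal digits of 2**exponent, without building the string."""
    return math.floor(exponent * math.log10(2)) + 1


def all_images(num_pixels):
    """Every black/white image of `num_pixels` pixels, as strings of 'B' and 'W'."""
    return ["".join(p) for p in product("BW", repeat=num_pixels)]


if __name__ == "__main__":
    print(all_images(3))    # the 8 three-pixel pictures listed above
    print(count_images(3))  # 8, not 2 * 3 = 6
    print(decimal_digits_of_power_of_two(640 * 360) > 60000)  # True
```

Enumeration like this is only practical for a handful of pixels, since each added pixel doubles the length of the list.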
I'm going to go ahead and pinch some code from a related question, just for fun.
from itertools import product
# One 0/1 value per pixel: 640 * 360 = 230400 positions, hence 2 ** 230400 tuples.
for matrix in product([0, 1], repeat=640 * 360):
    pass  # render and save your .gif
As all the comments have already stated, good luck!
On a more serious note, if you don't need to be absolutely sure that you have every permutation, you could generate a random 640x360 matrix and store it as an image.
Do this, say, 100k times and you'll at least have an interesting set of pictures to look at, but it's infeasible to get every possible permutation.
You could then delete all identical files to reduce the set to just the unique images.

Indexing pixels in a monochrome FreeType glyph buffer

I want to translate a monochrome FreeType glyph to an RGBA unsigned byte OpenGL texture. The colour of the texture at pixel (x, y) would be (255, 255, alpha), where
alpha = glyph->bitmap.buffer[pixelIndex(x, y)] * 255
I load my glyph using
FT_Load_Char(face, glyphChar, FT_LOAD_RENDER | FT_LOAD_MONOCHROME | FT_LOAD_TARGET_MONO)
The target texture has dimensions of glyph->bitmap.width * glyph->bitmap.rows. I've been able to index a greyscale glyph (loaded using just FT_Load_Char(face, glyphChar, FT_LOAD_RENDER)) with
glyph->bitmap.buffer[(glyph->bitmap.width * y) + x]
This does not appear to work on a monochrome buffer, though, and the characters in my final texture are scrambled.
What is the correct way to get the value of pixel (x, y) in a monochrome glyph buffer?
Based on this thread I started on Gamedev.net, I've come up with the following function to get the filled/empty state of the pixel at (x, y):
bool glyphBit(const FT_GlyphSlot &glyph, const int x, const int y)
{
int pitch = abs(glyph->bitmap.pitch);
unsigned char *row = &glyph->bitmap.buffer[pitch * y];
char cValue = row[x >> 3];
return (cValue & (128 >> (x & 7))) != 0;
}
I had a similar question some time ago, so I will try to help you.
The target texture has dimensions of glyph->bitmap.width * glyph->bitmap.rows
This is a very specific dimension for OpenGL. It would be better to round it up to a power of two.
The usual approach is a loop over every glyph, then a loop over every row from 0 to glyph->bitmap.rows, then a loop over every byte (unsigned char) in the row from 0 to glyph->bitmap.pitch. Each byte is read as glyph->bitmap.buffer[pitch * row + i] (i is the index of the inner loop and row is the index of the outer one). For example:
if(s[i] == ' ') left += 20; else
for (int row = 0; row < g->bitmap.rows; ++row) {
if(kerning)
for(int b = 0; b < pitch; b++){
if(data[left + 64*(strSize*(row + 64 - g->bitmap_top)) + b] + g->bitmap.buffer[pitch * row + b] < UCHAR_MAX)
data[left + 64*(strSize*(row + 64 - g->bitmap_top)) + b] += g->bitmap.buffer[pitch * row + b];
else
data[left + 64*(strSize*(row + 64 - g->bitmap_top)) + b] = UCHAR_MAX;
} else
std::memcpy(data + left + 64*(strSize*(row + 64 - g->bitmap_top)) , g->bitmap.buffer + pitch * row, pitch);
}
left += g->advance.x >> 6;
This code applies to an 8-bit bitmap (the standard FT_Load_Char(face, glyphChar, FT_LOAD_RENDER)).
Now I tried to use the monochrome flag and it caused me trouble. So my answer is not a solution to your problem. If you just want to display the letter then you should see my question.
The following Python function unpacks a FT_LOAD_TARGET_MONO glyph bitmap into a more convenient representation where each byte in the buffer maps to one pixel.
I've got some more info on monochrome font rendering with Python and FreeType plus additional example code on my blog: http://dbader.org/blog/monochrome-font-rendering-with-freetype-and-python
def unpack_mono_bitmap(bitmap):
    """
    Unpack a freetype FT_LOAD_TARGET_MONO glyph bitmap into a bytearray where each
    pixel is represented by a single byte.
    """
    # Allocate a bytearray of sufficient size to hold the glyph bitmap.
    data = bytearray(bitmap.rows * bitmap.width)
    # Iterate over every byte in the glyph bitmap. Note that we're not
    # iterating over every pixel in the resulting unpacked bitmap --
    # we're iterating over the packed bytes in the input bitmap.
    for y in range(bitmap.rows):
        for byte_index in range(bitmap.pitch):
            # Read the byte that contains the packed pixel data.
            byte_value = bitmap.buffer[y * bitmap.pitch + byte_index]
            # We've processed this many bits (=pixels) so far. This determines
            # where we'll read the next batch of pixels from.
            num_bits_done = byte_index * 8
            # Pre-compute where to write the pixels that we're going
            # to unpack from the current byte in the glyph bitmap.
            rowstart = y * bitmap.width + byte_index * 8
            # Iterate over every bit (=pixel) that's still a part of the
            # output bitmap. Sometimes we're only unpacking a fraction of a byte
            # because glyphs may not always fit on a byte boundary. So we make sure
            # to stop if we unpack past the current row of pixels.
            for bit_index in range(min(8, bitmap.width - num_bits_done)):
                # Unpack the next pixel from the current glyph byte.
                bit = byte_value & (1 << (7 - bit_index))
                # Write the pixel to the output bytearray. We ensure that `off`
                # pixels have a value of 0 and `on` pixels have a value of 1.
                data[rowstart + bit_index] = 1 if bit else 0
    return data

How "bytesPerRow" is calculated from an NSBitmapImageRep

I would like to understand how "bytesPerRow" is calculated when building up an NSBitmapImageRep (in my case from mapping an array of floats to a grayscale bitmap).
Clarifying this detail will help me to understand how memory is being mapped from an array of floats to a byte array (0-255, unsigned char; neither of these arrays are shown in the code below).
The Apple documentation says that this number is calculated "from the width of the image, the number of bits per sample, and, if the data is in a meshed configuration, the number of samples per pixel."
I had trouble following this "calculation" so I setup a simple loop to find the results empirically. The following code runs just fine:
int Ny = 1; // Ny is arbitrary, note that BytesPerPlane is calculated as we would expect = Ny*BytesPerRow;
for (int Nx = 0; Nx<320; Nx+=64) {
// greyscale image representation:
NSBitmapImageRep *dataBitMapRep = [[NSBitmapImageRep alloc]
initWithBitmapDataPlanes: nil // allocate the pixel buffer for us
pixelsWide: Nx
pixelsHigh: Ny
bitsPerSample: 8
samplesPerPixel: 1
hasAlpha: NO
isPlanar: NO
colorSpaceName: NSCalibratedWhiteColorSpace // 0 = black, 1 = white
bytesPerRow: 0 // 0 means "you figure it out"
bitsPerPixel: 8]; // bitsPerSample must agree with samplesPerPixel
long rowBytes = [dataBitMapRep bytesPerRow];
printf("Nx = %d; bytes per row = %ld \n",Nx, rowBytes);
}
and produces the result:
Nx = 0; bytes per row = 0
Nx = 64; bytes per row = 64
Nx = 128; bytes per row = 128
Nx = 192; bytes per row = 192
Nx = 256; bytes per row = 256
So we see that the bytes/row jumps in 64 byte increments, even when Nx incrementally increases by 1 all the way to 320 (I didn't show all of those Nx values). Note also that Nx = 320 (max) is arbitrary for this discussion.
So from the perspective of allocating and mapping memory for a byte array, how are the "bytes per row" calculated from first principles? Is the result above so the data from a single scan-line can be aligned on a "word" length boundary (64 bit on my MacBook Pro)?
Thanks for any insights, having trouble picturing how this works.
Passing 0 for bytesPerRow: means more than you said in your comment. From the documentation:
If you pass in a rowBytes value of 0, the bitmap data allocated may be padded to fall on long word or larger boundaries for performance. … Passing in a non-zero value allows you to specify exact row advances.
So you're seeing it increase by 64 bytes at a time because that's how AppKit decided to round it up.
The minimum requirement for bytes per row is much simpler. It's bytes per pixel times pixels per row. That's all.
For a bitmap image rep backed by floats, you'd pass sizeof(float) * 8 for bitsPerSample, and bytes-per-pixel would be sizeof(float) * samplesPerPixel. Bytes-per-row follows from that; you multiply bytes-per-pixel by the width in pixels.
Likewise, if it's backed by unsigned bytes, you'd pass sizeof(unsigned char) * 8 for bitsPerSample, and bytes-per-pixel would be sizeof(unsigned char) * samplesPerPixel.
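The arithmetic above can be sketched in Python as follows. The function names are my own. The 64-byte alignment is an assumption taken from the increments the question observed, not a documented AppKit rule.

```python
def min_bytes_per_row(pixels_wide, bits_per_sample, samples_per_pixel):
    """Smallest row size for meshed (non-planar) data: bytes per pixel times width."""
    bits = pixels_wide * bits_per_sample * samples_per_pixel
    return (bits + 7) // 8


def padded_bytes_per_row(pixels_wide, bits_per_sample, samples_per_pixel, alignment=64):
    """Round the minimum row size up to a multiple of `alignment` bytes.

    ASSUMPTION: alignment=64 mirrors the 64-byte steps seen in the question's
    output; the value a framework actually picks may differ.
    """
    minimum = min_bytes_per_row(pixels_wide, bits_per_sample, samples_per_pixel)
    return (minimum + alignment - 1) // alignment * alignment


if __name__ == "__main__":
    print(min_bytes_per_row(100, 8, 1))     # 100 bytes for 8-bit grayscale
    print(padded_bytes_per_row(100, 8, 1))  # 128 with the assumed 64-byte alignment
    print(min_bytes_per_row(100, 32, 1))    # 400 bytes for 32-bit float samples
```

Passing an explicit non-zero bytesPerRow corresponds to using the unpadded minimum here, so each row starts exactly where the previous one ends.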
