C code to convert 16-bit (RBG565) to a grayscale image

I am trying to convert a 16-bit (RBG565) image to grayscale. I tried various formulas suggested on the internet, and they all work well for the 24-bit RGB888 format. When I try them with 16-bit RBG565, the image has blue and red pixels, and I am unable to create a proper grayscale image. Please help.
Formula 1 works better than Formula 2:
Formula 1:
unsigned char gray = (red * 77+( (green )* 150)/2 + blue * 29+128) / 256;
Formula 2:
unsigned char gray = red * 0.212 + green * 0.715 + blue * 0.072;

Since the RGB components tend to be a scale from no-colour to full-colour for each primary colour (primary in the general sense, meaning colours combined to form others), I would probably convert to RGB up front, then use the normal grey-scaling algorithm (any of the ones you state that "work good for 24 bit RGB888").
That first bit could probably be done with (upper case is the eight-bit value, lower case the 5/6-bit value):
R = r * 8 ; 256/32, 8-bit from 5-bit
B = b * 4 ; 256/64, 8-bit from 6-bit
G = g * 8 ; 256/32, 8-bit from 5-bit
Just ensure you get the order right when extracting the bits, because you're converting RBG to RGB.
For example here's a complete C program (since you didn't provide a language tag) containing a function which will do that, except that it uses slightly different scaling which relies on percentages rather than fixed values that need 0..255 inputs:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
uint16_t greyScale(uint16_t rbg565) {
    // Get the values (32-bit to hold larger intermediate values).
    uint32_t r = (rbg565 >> 11) & 31U; // XXXXX___________
    uint32_t g = rbg565 & 31U;         // ___________XXXXX
    uint32_t b = (rbg565 >> 5) & 63U;  // _____XXXXXX_____
    // Scale each of them up to the range 0..500.
    g = g * 500 / 31;
    b = b * 500 / 63;
    r = r * 500 / 31;
    // Use example RGB weights of 299/587/114 (summing to 1,000).
    // Range then 0..500,000 so divide by 500 to get grey 0..1,000.
    uint32_t grey = (r * 299 + g * 587 + b * 114);
    grey /= 500;
    return (uint16_t)grey;
}
int main(int argc, char *argv[])
{
    for (int i = 1; i < argc; i++) {
        uint16_t rbg = atoi(argv[i]);
        printf("%5d -> %d\n", rbg, greyScale(rbg));
    }
    return 0;
}
If you run it with some sample values, you'll see the greyscale values come out (comments are added by me):
pax:~> ./prog 0 65535 31 $((63*32)) $((31*2048)) $((16*2048+32*32+16))
    0 -> 0      # No colour components.
65535 -> 1000   # Max RGB (white).
   31 -> 587    # Max green, nothing else.
 2016 -> 114    # Max blue, nothing else.
63488 -> 299    # Max red, nothing else.
33808 -> 514    # A little more than half of everything.
If you want to use weightings different from the ones I provided (29.9%, 58.7%, and 11.4%), just change this line to match, making sure they still sum to a thousand:
uint32_t grey = (r * 299 + g * 587 + b * 114);
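If an 8-bit grey value is needed rather than the 0..1,000 scale, the same idea (expand each channel to 8 bits first, then apply an ordinary RGB888 formula) could be sketched as below. This is only a sketch, not the code from the answer above: the names `expand5`, `expand6` and `rbg565_to_grey8` are made up for illustration, the expansion copies the top bits into the low bits so that full scale maps to 255, and the weights 77/150/29 are the integer weights from Formula 1 in the question, without its extra division of green by two.

```c
#include <assert.h>
#include <stdint.h>

/* Expand a 5-bit channel to 8 bits by copying its top 3 bits into the low bits. */
static uint8_t expand5(unsigned v)
{
    assert(v < 32);
    return (uint8_t)((v << 3) | (v >> 2));
}

/* Expand a 6-bit channel to 8 bits by copying its top 2 bits into the low bits. */
static uint8_t expand6(unsigned v)
{
    assert(v < 64);
    return (uint8_t)((v << 2) | (v >> 4));
}

/* Hypothetical helper: RBG565 (red high, blue middle, green low) to 8-bit grey. */
uint8_t rbg565_to_grey8(uint16_t rbg565)
{
    unsigned r = expand5((rbg565 >> 11) & 31u);
    unsigned b = expand6((rbg565 >> 5) & 63u);
    unsigned g = expand5(rbg565 & 31u);
    /* The weights sum to 256; adding 128 rounds to nearest before the shift. */
    return (uint8_t)((r * 77u + g * 150u + b * 29u + 128u) >> 8);
}
```

With this sketch, white (65535) maps to 255, black to 0, and a pure-green input of 31 maps to 149.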

Related

Compare 16 bpp to 32 bpp bitmap conversions

I got a 16 bpp bitmap that I converted to 32 bpp via code below:
void Rgb555ToRgb8(const UChar* bitmapData, UInt32 width, UInt32 height, UChar* buf)
{
    UInt32 dst_bytes_per_row = width * 4;
    UInt32 src_bytes_per_row = ((width * 16 + 31) / 32) * 4;
    UInt16 red_mask = 0x7C00;
    UInt16 green_mask = 0x3E0;
    UInt16 blue_mask = 0x1F;
    for (UInt32 row = 0; row < height; ++row)
    {
        UInt32 dstCol = 0, srcCol = 0;
        do
        {
            UInt16 rgb = *(UInt16*)(bitmapData + row * src_bytes_per_row + srcCol);
            UChar red_value = (rgb & red_mask) >> 10;
            UChar green_value = (rgb & green_mask) >> 5;
            UChar blue_value = (rgb & blue_mask);
            buf[row*dst_bytes_per_row + dstCol] = blue_value << 3;
            buf[row*dst_bytes_per_row + dstCol + 1] = green_value << 3;
            buf[row*dst_bytes_per_row + dstCol + 2] = red_value << 3;
            buf[row*dst_bytes_per_row + dstCol + 3] = rgb >> 15;
            srcCol += 2;
            dstCol += 4;
        } while (srcCol < src_bytes_per_row);
    }
}
Here is the conversion result: https://i.stack.imgur.com/1ajO7.png
I also tried to convert this image via GdiPlus:
Gdiplus::Bitmap* bmp = new Gdiplus::Bitmap(w,h,PixelFormat32bppRGB);
The resulting image: [image: GdiPlus conversion result]
Notice that the 2 results don't look exactly the same (e.g., the background in GdiPlus result is white). How can I modify my code to match GdiPlus result?
There are two issues that need to be addressed:
Unused bits
When moving from 5 bits of information to 8 bits of information you gain an additional 3 bits. As implemented, the code doesn't make use of that additional range, and is biased towards darker color components. This is an illustration of what blue_value << 3 actually does:
5 bits per channel 8 bits per channel
bbbbb -> bbbbb000
To address this, the least significant 3 bits need to grow as the channel value gets higher. A simple (yet somewhat inaccurate) approach would be to just copy the most significant 3 bits down to the least significant 3 bits, i.e.
buf[row*dst_bytes_per_row + dstCol] = (blue_value << 3) | (blue_value >> 2);
buf[row*dst_bytes_per_row + dstCol + 1] = (green_value << 3) | (green_value >> 2);
buf[row*dst_bytes_per_row + dstCol + 2] = (red_value << 3) | (red_value >> 2);
The exact mapping would be a bit more involved, something like
blue_value = static_cast<UChar>((blue_value * 255.0) / 31.0 + 0.5);
That converts from 5 bits to the respective 8 bit value that's nearest to the ideal value, including the 4 values that were 1/255th off in the bit-shifting solution above.
If you opt for the latter, you can build a lookup table that stores the mapped values. This table is only 32 entries of one byte each, so it fits into a single cache-line.
Alpha channel
Assuming that the MSB of your source image is indeed interpreted as an alpha value, you're going to have to move that into the destination as well. Since the source is only 1 bit of information, the raw transformation is trivial:
buf[row*dst_bytes_per_row + dstCol + 3] = rgb & (1 << 15) ? 255 : 0;
That may or may not be all that's needed. Windows assumes premultiplied alpha, i.e. the stored values of the color channels must be premultiplied by the alpha value (see BLENDFUNCTION for reference).
If the alpha value is 255, the color channel values are already correct. If the alpha value is 0, all color channels need to be multiplied by zero (or simply set to 0). The translation doesn't produce any other alpha values.
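The lookup table and the 1-bit alpha handling described above could be combined roughly as follows. This is a minimal sketch in plain C rather than the question's C++ types; `lut5to8`, `init_lut5to8` and `pixel1555_to_bgra` are hypothetical names, and the sketch assumes the MSB really is alpha and that premultiplied output is wanted.

```c
#include <assert.h>
#include <stdint.h>

/* 32-entry table mapping a 5-bit channel to the nearest 8-bit value. */
static uint8_t lut5to8[32];

void init_lut5to8(void)
{
    for (int v = 0; v < 32; ++v)
        lut5to8[v] = (uint8_t)((v * 255.0) / 31.0 + 0.5);
}

/* Hypothetical helper: one 16 bpp pixel (MSB = alpha, then 5 bits each of
   red, green and blue) to 4 bytes in B, G, R, A order, premultiplied. */
void pixel1555_to_bgra(uint16_t px, uint8_t out[4])
{
    assert(lut5to8[31] == 255);          /* init_lut5to8() must run first */
    uint8_t a = (px & 0x8000u) ? 255 : 0;
    uint8_t r = lut5to8[(px >> 10) & 31];
    uint8_t g = lut5to8[(px >> 5) & 31];
    uint8_t b = lut5to8[px & 31];
    /* With 1-bit alpha, premultiplying means either "keep" or "zero". */
    out[0] = a ? b : 0;
    out[1] = a ? g : 0;
    out[2] = a ? r : 0;
    out[3] = a;
}
```

If the MSB turns out to be unused padding rather than alpha, the premultiplication step would have to be dropped and the alpha byte set to a constant instead.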

Assembly language using signed int multiplication math to perform shifts

This is a bit of a turn around.
Usually one is attempting to use shifts to perform multiplication and not the other way around.
On the Hitachi/Motorola 6309 there is no shift by n bits. There is only shift by 1 bit.
However there is a 16 bit x 16 bit signed multiply (provides a 32 bit signed result).
(EDIT) Using this is no problem for a 16-bit shift (left); however, I'm trying to use two 16x16 signed multiplies to do a 32-bit shift. The high-order word of the result for the low-order word shift is the problem. (Does that make sense?)
Some pseudo code might help:
result.highword = low word of (val.highword * shiftmulttable[shift])
temp = val.lowword * shiftmulttable[shift]
result.lowword = temp.lowword
result.highword = or (result.highword, temp.highword)
(with some magic on temp.highword to consider signed values)
I have been exercising my logic in an attempt to use this instruction to perform the shifts but so far I have failed.
I can easily achieve any positive value shifts by 0 to 14 but when it comes to shifting by 15 bits (mult by 0x8000) or shifting any negative values certain combinations of values require either:
complementing the result by 1
complementing the result by 2
adding 1 to the result
doing nothing to the result
And I just can't see any pattern to these values.
Any ideas appreciated!
Best I can tell from the problem description, implementing the 32-bit shift would work as desired by using an unsigned 16x16->32 bit multiply. This can easily be synthesized from a signed 16x16->32 multiply instruction by exploiting the two's complement integer representation. If the two factors are a and b, adding b to the high-order 16 bits of the signed product when a is negative, and adding a to the high-order 16 bits of the signed product when b is negative will give us the unsigned multiplication result.
The following C code implements this approach and tests it exhaustively:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
/* signed 16x16->32 bit multiply. Hardware instruction */
int32_t mul16_wide (int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}
/* unsigned 16x16->32 bit multiply (synthetic) */
uint32_t umul16_wide (int16_t a, int16_t b)
{
    /* accumulate in uint32_t: left-shifting a negative int is undefined in C */
    uint32_t p = (uint32_t)mul16_wide (a, b); // signed 16x16->32 bit multiply
    if (a < 0) p = p + ((uint32_t)b << 16); // add 'b' to upper 16 bits of product
    if (b < 0) p = p + ((uint32_t)a << 16); // add 'a' to upper 16 bits of product
    return p;
}
/* unsigned 16x16->32 bit multiply (reference) */
uint32_t umul16_wide_ref (uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}
/* test synthetic unsigned multiply exhaustively */
int main (void)
{
    int16_t a, b;
    uint32_t res, ref;
    uint64_t count = 0;
    a = -32768;
    do {
        b = -32768;
        do {
            res = umul16_wide (a, b);
            ref = umul16_wide_ref ((uint16_t)a, (uint16_t)b);
            count++;
            if (res != ref) {
                printf ("!!!! a=%d b=%d res=%08" PRIx32 " ref=%08" PRIx32 "\n", a, b, res, ref);
                return EXIT_FAILURE;
            }
            if (b == 32767) break;
            b = b + 1;
        } while (1);
        if (a == 32767) break;
        a = a + 1;
    } while (1);
    printf ("test cases passed: %" PRIx64 "\n", count);
    return EXIT_SUCCESS;
}
I am not familiar with the Hitachi/Motorola 6309 architecture. I assume it uses a special 32-bit register to hold the result of a wide multiply, from which high and low half can be extracted into 16-bit general-purpose registers, and the conditional corrections can then be applied to the register holding the upper 16 bits.
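Once an unsigned product is available, the pseudo-code from the question can be modelled in C to check the word-combining step. The sketch below is only a model of the arithmetic, not 6309 code; `shl32_by_mul` is a hypothetical name, and plain C multiplication stands in for the corrected multiply instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit left shift by 0..15 built from two unsigned
   16x16->32 multiplies by a power of two (the "shift table" entry). */
uint32_t shl32_by_mul(uint32_t val, unsigned shift)
{
    assert(shift < 16);
    uint32_t m  = 1u << shift;            /* table entry: 1, 2, 4, ..., 0x8000 */
    uint32_t hi = (val >> 16) * m;        /* only its low word is kept */
    uint32_t lo = (val & 0xFFFFu) * m;    /* its high word carries into the top */
    uint32_t res_hi = (hi & 0xFFFFu) | (lo >> 16);
    return (res_hi << 16) | (lo & 0xFFFFu);
}
```

The OR is safe because the low `shift` bits of the shifted high word are always zero, while the carry from the low word occupies only those bits.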
Are you using fixed-point multiplicative inverses to use the high half result for a right shift?
If you're just left-shifting, multiply by 0x8000 should work. The low half of an NxN => 2N-bit multiply is the same whether inputs are treated as signed or unsigned. Or do you need a 32-bit shift result from your 16-bit input?
Is the multiply instruction actually faster than a few 1-bit shifts for small shift counts? (I wouldn't be surprised if compile-time-constant counts of 2 or 3 would be faster with just a chain of 2 or 3 add same,same or left-shift instructions.)
Anyway, for a compile-time-constant shift count of 15, maybe just multiply by 1<<14 and then do the last count with a 1-bit shift (add same,same).
Or if your ISA has rotates, rotate right by 1 and mask away the low bits, skipping the multiply. Or zero a register, right-shift the low bit into the carry flag, then rotate-through-carry into the top of the zeroed register.
(The latter might be useful on an ISA that doesn't have large immediates and couldn't "mask away all the low bits" in one instruction. Or an ISA that only has RCR not ROR. I don't know 6309 at all)
If you're using a runtime count to look up a multiplier from a table, maybe branch for that case, or adjust your LUT so every entry needs an extra 1-bit shift, so you can do mul(lut[count]) and an unconditional extra shift.
(Only works if you don't need to support a shift-count of zero.)
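The adjusted-table idea can be modelled in C for the 16-bit case. This is a sketch under stated assumptions: `mul16_wide` stands in for the hardware's signed multiply, `half_shift_lut` and `shl16_via_mul` are hypothetical names, and the cast of `val` to `int16_t` relies on the usual two's complement wrap-around, which is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware's signed 16x16->32 multiply. */
static int32_t mul16_wide(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Entry n holds 1 << (n - 1), so the largest entry is 0x4000 and no
   multiplier is ever negative. Entry 0 is unused. */
static const int16_t half_shift_lut[16] = {
    0x0000, 0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0020, 0x0040,
    0x0080, 0x0100, 0x0200, 0x0400, 0x0800, 0x1000, 0x2000, 0x4000
};

/* 16-bit left shift by 1..15: multiply by half the factor, then shift once. */
uint16_t shl16_via_mul(uint16_t val, unsigned count)
{
    assert(count >= 1 && count <= 15);
    /* The low half of the product does not depend on signedness. */
    int32_t p = mul16_wide((int16_t)val, half_shift_lut[count]);
    uint16_t low = (uint16_t)p;
    return (uint16_t)(low << 1);          /* unconditional extra 1-bit shift */
}
```

As noted above, a shift count of zero cannot be expressed this way and would need a separate branch.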
Not that there would be many interested people who would want to see the 6309 code, but here it is:
Compliant with OS9 C ABI.
Pointer to result and arguments pushed on stack right to left.
U,PC,val(4bytes),shift(2bytes),*result(2bytes)
0 2 4 8 10
:
* 10,s pointer to long result
* 4,s 4 byte value
* 8,s 2 byte shift
* x = pointer to result
pshs u
ldx 10,s * load pointer to result
ldd 8,s * load shift
* if shift amount is greater than 31 then
* just return zero. OS9 C standard.
cmpd #32
blt _10x
ldq #0
stq 4,s
bra _13x
* if shift amount is 16 or more then
* move bottom word of value into top word
* and clear bottom word
_10x
cmpb #16
blt _1x
ldu 6,s
stu 4,s
clr 6,s
clr 7,s
_1x
* setup pointer u and offset e into mult table _2x
leau _2x,pc
andb #15
* if there is no shift value just return value
beq _13x
aslb * need to double shift to use as word table offset
stb 8,s * save double shft
tfr b,e
* shift top word q = val.word.high * multtab[shft]
ldd 4,s
muld e,u
stw ,x * result.word.high = low word of mult
* shift bottom word q = val.word.low * multtab[shft]
lde 8,s * reload double shft
ldd 6,s
muld e,u
stw 2,x * result.word.low = low word of mult
* The high word of mult needs to be corrected for sign
* if val is negative then muld will return negated results
* and need to un negate it
lde 8,s * reload double shift
tst 4,s * test top byte of val for negative
bge _11x
addd e,u * add the multtab[shft] again to top word
_11x
* if multtab[shft] is negative (shft is 15 or shft<<1 is 30)
* also need to un negate result
cmpe #30
bne _12x
addd 6,s * add val.word.low to top word
_12x
* combine top and bottom and save bottom half of result
ord ,x
std ,x
bra _14x
* this is only reached if the result is in value (let result = value)
_13x
ldq 4,s * load value
stq ,x * result = value
_14x
puls u,pc
_2x fdb $01,$02,$04,$08,$10,$20,$40,$80,$0100,$0200,$0400,$0800
fdb $1000,$2000,$4000,$8000

Why does golang RGBA.RGBA() method use | and <<?

In the golang color package, there is a method to get r,g,b,a values from an RGBA object:
func (c RGBA) RGBA() (r, g, b, a uint32) {
    r = uint32(c.R)
    r |= r << 8
    g = uint32(c.G)
    g |= g << 8
    b = uint32(c.B)
    b |= b << 8
    a = uint32(c.A)
    a |= a << 8
    return
}
If I were to implement this simple function, I would just write this
func (c RGBA) RGBA() (r, g, b, a uint32) {
    r = uint32(c.R)
    g = uint32(c.G)
    b = uint32(c.B)
    a = uint32(c.A)
    return
}
What's the reason r |= r << 8 is used?
From the excellent "The Go image package" blog post:
[...] the channels have a 16-bit effective range: 100% red is represented by
RGBA returning an r of 65535, not 255, so that converting from CMYK or
YCbCr is not as lossy. Third, the type returned is uint32, even though
the maximum value is 65535, to guarantee that multiplying two values
together won't overflow.
and
Note that the R field of an RGBA is an 8-bit alpha-premultiplied color in the range [0, 255]. RGBA satisfies the Color interface by multiplying that value by 0x101 to generate a 16-bit alpha-premultiplied color in the range [0, 65535]
So if we look at the bit representation of a color with the value c.R = 10101010 then this operation
r = uint32(c.R)
r |= r << 8
effectively copies the first byte to the second byte.
00000000000000000000000010101010 (r)
| 00000000000000001010101000000000 (r << 8)
--------------------------------------
00000000000000001010101010101010 (r |= r << 8)
This is equivalent to a multiplication with the factor 0x101 and distributes all 256 possible values evenly across the range [0, 65535].
The color.RGBA type implements the RGBA method to satisfy the color.Color interface:
type Color interface {
// RGBA returns the alpha-premultiplied red, green, blue and alpha values
// for the color. Each value ranges within [0, 0xffff], but is represented
// by a uint32 so that multiplying by a blend factor up to 0xffff will not
// overflow.
//
// An alpha-premultiplied color component c has been scaled by alpha (a),
// so has valid values 0 <= c <= a.
RGBA() (r, g, b, a uint32)
}
Now the RGBA type represents the colour channels with the uint8 type, giving a range of [0, 0xff]. Simply converting these values to uint32 would not extend the range up to [0, 0xffff].
An appropriate conversion would be something like:
r = uint32((float64(c.R) / 0xff) * 0xffff)
However, they want to avoid the floating point arithmetic. Luckily 0xffff / 0xff is 0x0101, so we can simplify the expression (ignoring the type conversions for now):
r = c.R * 0x0101
= c.R * 0x0100 + c.R
= (c.R << 8) + c.R # multiply by power of 2 is equivalent to shift
= (c.R << 8) | c.R # equivalent, since bottom 8 bits of first operand are 0
And that's essentially what the code in the standard library is doing.
Converting a value in the range 0 to 255 (an 8-bit RGB component) to a value in the range 0 to 65535 (a 16-bit RGB component) would be done by multiplying the 8-bit value by 65535/255; 65535/255 is exactly 257, which is hex 101, so multiplying a one-byte value by 65535/255 can be done by shifting that byte value left 8 bits and ORing it with the original value.
(There's nothing Go-specific about this; similar tricks are done elsewhere, in other languages, when converting 8-bit RGB/RGBA components to 16-bit RGB/RGBA components.)
To convert from 8 to 16 bits per RGB component, copy the byte into both the high and the low byte of the 16-bit value, e.g., 0x03 becomes 0x0303 and 0xFE becomes 0xFEFE, so that the 8-bit values 0 through 255 (0xFF) produce 16-bit values 0 to 65,535 (0xFFFF) with an even distribution of values.
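Since the trick is not Go-specific, it can be checked in a few lines of C. The sketch below uses hypothetical names (`expand8to16`, `check_expand8to16`) and confirms for every 8-bit input that shift-and-OR, multiplication by 0x101 and exact scaling by 65535/255 all give the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the byte into both halves of a 16-bit value: 0xAB -> 0xABAB. */
uint16_t expand8to16(uint8_t v)
{
    return (uint16_t)((v << 8) | v);
}

/* Check all 256 inputs against the two equivalent formulations. */
void check_expand8to16(void)
{
    for (unsigned v = 0; v < 256; ++v) {
        uint16_t e = expand8to16((uint8_t)v);
        assert(e == v * 0x101u);            /* multiply by 257 */
        assert(e == v * 65535u / 255u);     /* exact rescale, no remainder */
    }
}
```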

Trouble understanding how to make the matrix match the AR marker

My goal is to learn how to create my own marker and use it.
I'm having trouble understanding how the matrix matches the AR marker PNG.
I'd really love someone to explain how the matrix and the PNG work together.
Actually, I'm a bit embarrassed: on further reading, it is not Hamming code but something based on Hamming code. Still, someone familiar with Hamming code might be able to help. This is what the tutorial says
(the whole tutorial link is at the bottom of the post):
The main difference with the hamming code is that the first bit (parity of bits 3 and 5) is inverted. So, ID 0 (which in hamming code is 00000) becomes 10000 in our code. The idea is to prevent a completely black rectangle from being a valid marker ID, with the goal of reducing the likelihood of false positives with objects of the environment.
As there are four possible orientations of the marker picture, we have to find the correct marker position. Remember, we introduced three parity bits for each two bits of information. With their help we can find the hamming distance for each possible marker orientation. The correct marker position will have zero hamming distance error, while the other rotations won't.
Here is a code snippet that rotates the bit matrix four times and finds the correct marker orientation:
//check all possible rotations
cv::Mat rotations[4];
int distances[4];
rotations[0] = bitMatrix;
distances[0] = hammDistMarker(rotations[0]);
std::pair<int,int> minDist(distances[0],0);
for (int i=1; i<4; i++)
{
    //get the hamming distance to the nearest possible word
    rotations[i] = rotate(rotations[i-1]);
    distances[i] = hammDistMarker(rotations[i]);
    if (distances[i] < minDist.first)
    {
        minDist.first = distances[i];
        minDist.second = i;
    }
}
This code finds the orientation of the bit matrix in such a way that it gives minimal error for the hamming distance metric. This error should be zero for correct marker ID; if it's not, it means that we encountered a wrong marker pattern (corrupted image or false-positive marker detection).
This is the code that I think relates to the marker PNG shown.
Can anyone help me understand the matrix so I can use it?
All diagrams, thoughts and explanations are happily accepted, to help a non-maths person understand this quite complex problem ;P
[Image: the working AR marker when viewed from an iPad]
//
// Marker.cpp
// Example_MarkerBasedAR
//
// Created by Ievgen Khvedchenia on 3/13/12.
// Copyright (c) 2012 Ievgen Khvedchenia. All rights reserved.
//
#include "Marker.hpp"
#include "DebugHelpers.hpp"
Marker::Marker()
: id(-1)
{
}
bool operator<(const Marker &M1,const Marker&M2)
{
    return M1.id<M2.id;
}
cv::Mat Marker::rotate(cv::Mat in)
{
    cv::Mat out;
    in.copyTo(out);
    for (int i=0;i<in.rows;i++)
    {
        for (int j=0;j<in.cols;j++)
        {
            out.at<uchar>(i,j)=in.at<uchar>(in.cols-j-1,i);
        }
    }
    return out;
}
int Marker::hammDistMarker(cv::Mat bits)
{
    int ids[4][5]=
    {
        {1,0,0,0,0},
        {1,0,1,1,1},
        {0,1,0,0,1},
        {0,1,1,1,0}
    };
    int dist=0;
    for (int y=0;y<5;y++)
    {
        int minSum=1e5; //hamming distance to each possible word
        for (int p=0;p<4;p++)
        {
            int sum=0;
            //now, count
            for (int x=0;x<5;x++)
            {
                sum += bits.at<uchar>(y,x) == ids[p][x] ? 0 : 1;
            }
            if (minSum>sum)
                minSum=sum;
        }
        //do the and
        dist += minSum;
    }
    return dist;
}
int Marker::mat2id(const cv::Mat &bits)
{
    int val=0;
    for (int y=0;y<5;y++)
    {
        val<<=1;
        if ( bits.at<uchar>(y,1)) val|=1;
        val<<=1;
        if ( bits.at<uchar>(y,3)) val|=1;
    }
    return val;
}
int Marker::getMarkerId(cv::Mat &markerImage,int &nRotations)
{
    assert(markerImage.rows == markerImage.cols);
    assert(markerImage.type() == CV_8UC1);
    cv::Mat grey = markerImage;
    // Threshold image
    cv::threshold(grey, grey, 125, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
#ifdef SHOW_DEBUG_IMAGES
    cv::showAndSave("Binary marker", grey);
#endif
    //Markers are divided in 7x7 regions, of which the inner 5x5 belongs to marker info
    //the external border should be entirely black
    int cellSize = markerImage.rows / 7;
    for (int y=0;y<7;y++)
    {
        int inc=6;
        if (y==0 || y==6) inc=1; //for first and last row, check the whole border
        for (int x=0;x<7;x+=inc)
        {
            int cellX = x * cellSize;
            int cellY = y * cellSize;
            cv::Mat cell = grey(cv::Rect(cellX,cellY,cellSize,cellSize));
            int nZ = cv::countNonZero(cell);
            if (nZ > (cellSize*cellSize) / 2)
            {
                return -1;//can not be a marker because the border element is not black!
            }
        }
    }
    cv::Mat bitMatrix = cv::Mat::zeros(5,5,CV_8UC1);
    //get information(for each inner square, determine if it is black or white)
    for (int y=0;y<5;y++)
    {
        for (int x=0;x<5;x++)
        {
            int cellX = (x+1)*cellSize;
            int cellY = (y+1)*cellSize;
            cv::Mat cell = grey(cv::Rect(cellX,cellY,cellSize,cellSize));
            int nZ = cv::countNonZero(cell);
            if (nZ> (cellSize*cellSize) /2)
                bitMatrix.at<uchar>(y,x) = 1;
        }
    }
    //check all possible rotations
    cv::Mat rotations[4];
    int distances[4];
    rotations[0] = bitMatrix;
    distances[0] = hammDistMarker(rotations[0]);
    std::pair<int,int> minDist(distances[0],0);
    for (int i=1; i<4; i++)
    {
        //get the hamming distance to the nearest possible word
        rotations[i] = rotate(rotations[i-1]);
        distances[i] = hammDistMarker(rotations[i]);
        if (distances[i] < minDist.first)
        {
            minDist.first = distances[i];
            minDist.second = i;
        }
    }
    nRotations = minDist.second;
    if (minDist.first == 0)
    {
        return mat2id(rotations[minDist.second]);
    }
    return -1;
}
void Marker::drawContour(cv::Mat& image, cv::Scalar color) const
{
    float thickness = 2;
    cv::line(image, points[0], points[1], color, thickness, CV_AA);
    cv::line(image, points[1], points[2], color, thickness, CV_AA);
    cv::line(image, points[2], points[3], color, thickness, CV_AA);
    cv::line(image, points[3], points[0], color, thickness, CV_AA);
}
AR tutorial I'm working from
https://www.packtpub.com/books/content/marker-based-augmented-reality-iphone-or-ipad
I don't have an exhaustive answer, but I think I can explain enough to help your confusion since I am familiar with Hamming distance as well as matrix rotations and other transformations.
From what I can tell:
The matrix you have isn't the bitmap for that marker. It is an array of rotations.
In the algorithm as implemented in the article, the hamming distances are computed with 4 rotations of a marker. So the reason you have 4 rows in the matrix is it is 4 rotations.
I could be wrong, or have oversimplified, maybe this answer will trigger discussion and someone better will see it. I'll look closer at the algorithm and article to see if I can understand it. I made an A in Matrix Theory, but that was 18 years ago and I've frankly forgotten how to transform matrices.
I just found a clue, but I am not very sure about it. Here is my thought:
I found the explanation of "Hamming code", and of how to encode with it, on Wikipedia: http://en.wikipedia.org/wiki/Hamming_code (I don't have the privilege to insert pictures, so please visit the link).
Here is the code from the book《Mastering OpenCV ...》:
int Marker::mat2id(const cv::Mat &bits)
{
    int val=0;
    for (int y=0;y<5;y++)
    {
        val<<=1;
        if ( bits.at<uchar>(y,1)) val|=1;
        val<<=1;
        if ( bits.at<uchar>(y,3)) val|=1;
    }
    return val;
}
I think only bits 1 and 3 of each row are data, so I took a look at the matrix:
int ids[4][5]=
{
    {1,0,0,0,0}, // 0,0 -> 0
    {1,0,1,1,1}, // 0,1 -> 1
    {0,1,0,0,1}, // 1,0 -> 2
    {0,1,1,1,0}  // 1,1 -> 3
};
//     ^   ^
//     |   |   columns 1 and 3 hold the data bits
So these 4 rows should be the Hamming codes of the values 0, 1, 2 and 3 (everything that 2 bits can encode).
The following is my explanation (it may be incorrect):
(1) 'd' represents data bits and 'p' represents parity bits. Of [d1..d4], only [d1] and [d2] are used; [d3] and [d4] are always 0.
(2) Interleave [d1][d2] with the parity bits [p1][p2][p4], which are calculated following the circle diagram on the Wikipedia page.
(3) Next, write the word starting from the high bit, i.e. reverse the bit order.
(4) Finally, according to what the book explains:
The main difference with the hamming code is that the first bit (parity of bits
3 and 5) is inverted. So, ID 0 (which in hamming code is 00000) becomes 10000
in our code. The idea is to prevent a completely black rectangle from being a valid
marker ID, with the goal of reducing the likelihood of false positives with objects of
the environment.
I invert [p4], which is now the first bit, and I get the matrix above.
Num   d1 d2 d3 d4 p1 p2 p4      p1 d1 p2 d2 p4      p4 d2 p2 d1 p1      p4 d2 p2 d1 p1
00     0  0  0  0  0  0  0  =>   0  0  0  0  0  =>   0  0  0  0  0  =>   1  0  0  0  0
01     1  0  0  0  1  1  0  =>   1  1  1  0  0  =>   0  0  1  1  1  =>   1  0  1  1  1
10     0  1  0  0  1  0  1  =>   1  0  0  1  1  =>   1  1  0  0  1  =>   0  1  0  0  1
11     1  1  0  0  0  1  1  =>   0  1  1  1  1  =>   1  1  1  1  0  =>   0  1  1  1  0
              (1)                     (2)                 (3)                 (4)
I am not sure whether this is right, but I got the same result.
If it is true, then we can make our own markers with this matrix.
@Natalie and @mrjoltcola, I hope you see this.
(This is my first reply on this forum; if anything is improper, I'd be glad to receive advice :) ^_^ )
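If the derivation above is right, building a custom marker amounts to splitting a 10-bit ID into five 2-bit pairs and writing one 5-bit word per row. The following is a minimal C sketch of that idea, not code from the book: `marker_word`, `marker_encode` and `marker_decode` are hypothetical names, and the decoder simply mirrors `mat2id` by reading columns 1 and 3. The printed marker would still need the black border cells that `getMarkerId` checks for.

```c
#include <assert.h>
#include <stdint.h>

/* One row word in the layout p4' d2 p2 d1 p1, where the parities are the
   Hamming(7,4) parities with d3 = d4 = 0 and p4' is p4 inverted. */
void marker_word(unsigned d2, unsigned d1, uint8_t word[5])
{
    assert(d1 < 2 && d2 < 2);
    unsigned p1 = d1 ^ d2;
    unsigned p2 = d1;
    unsigned p4 = d2;
    word[0] = (uint8_t)(p4 ^ 1u);   /* inverted so an all-black row is invalid */
    word[1] = (uint8_t)d2;
    word[2] = (uint8_t)p2;
    word[3] = (uint8_t)d1;
    word[4] = (uint8_t)p1;
}

/* Encode a 10-bit id into the inner 5x5 cells, most significant pair first. */
void marker_encode(unsigned id, uint8_t bits[5][5])
{
    assert(id < 1024);
    for (int y = 0; y < 5; ++y) {
        unsigned pair = (id >> (2 * (4 - y))) & 3u;
        marker_word(pair >> 1, pair & 1u, bits[y]);
    }
}

/* Same bit order as mat2id: column 1 then column 3 of each row. */
unsigned marker_decode(uint8_t bits[5][5])
{
    unsigned val = 0;
    for (int y = 0; y < 5; ++y) {
        val = (val << 1) | (bits[y][1] ? 1u : 0u);
        val = (val << 1) | (bits[y][3] ? 1u : 0u);
    }
    return val;
}
```

For the four possible pairs, `marker_word` reproduces the four rows of the `ids` table, which is what `hammDistMarker` compares each row against.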

How does one convert 16-bit RGB565 to 24-bit RGB888?

I’ve got my hands on a 16-bit rgb565 image (specifically, an Android framebuffer dump), and I would like to convert it to 24-bit rgb888 for viewing on a normal monitor.
The question is, how does one convert a 5- or 6-bit channel to 8 bits? The obvious answer is to shift it. I started out by writing this:
puts("P6 320 480 255");
uint16_t buf;
/* stop on end of input, a short read, or an error */
while (read(0, &buf, sizeof buf) == (ssize_t)sizeof buf) {
    unsigned char red = (buf & 0xf800) >> 11;
    unsigned char green = (buf & 0x07e0) >> 5;
    unsigned char blue = buf & 0x001f;
    putchar(red << 3);
    putchar(green << 2);
    putchar(blue << 3);
}
However, this doesn’t have one property I would like, which is for 0xffff to map to 0xffffff, instead of 0xf8fcf8. I need to expand the value in some way, but I’m not sure how that should work.
The Android SDK comes with a tool called ddms (Dalvik Debug Monitor) that takes screen captures. As far as I can tell from reading the code, it implements the same logic; yet its screenshots are coming out different, and white is mapping to white.
Here’s the raw framebuffer, the smart conversion by ddms, and the dumb conversion by the above algorithm. Note that the latter is slightly darker and greener.
(By the way, this conversion is implemented in ffmpeg, but it’s just performing the dumb conversion listed above, leaving the LSBs at all zero.)
I guess I have two questions:
What’s the most sensible way to convert rgb565 to rgb888?
How is DDMS converting its screenshots?
You want to map each of these from a 5/6 bit space to an 8 bit space.
5 bits = 32 values
6 bits = 64 values
8 bits = 256 values
The code you're using takes the naive approach that x8 = x5 * 256/32, where 256/32 = 8 and multiplying by 8 is a left shift by 3. As you say, this doesn't fill the new number space "correctly": the maximum 5-bit value 31 should map to the maximum 8-bit value 255, but 31 << 3 is only 248. Therein lies your clue to the solution:
x8 = 255/31 * x5
x8 = 255/63 * x6
where x5, x6 and x8 are 5, 6 and 8 bit values respectively.
Now there is a question about the best way to implement this. It does involve division and with integer division you will lose any remainder result (round down basically) so the best solution is probably to do floating point arithmetic and then round half up back to an integer.
This can be sped up considerably by simply using this formula to generate a lookup table for each of the 5 and 6 bit conversions.
My few cents:
If you care about a precise mapping and still want a fast algorithm, you can consider this:
R8 = ( R5 * 527 + 23 ) >> 6;
G8 = ( G6 * 259 + 33 ) >> 6;
B8 = ( B5 * 527 + 23 ) >> 6;
It uses only MUL, ADD and SHR, so it is pretty fast!
On the other hand, it is 100% compatible with the floating-point mapping with proper rounding:
// R8 = (int) floor( R5 * 255.0 / 31.0 + 0.5);
// G8 = (int) floor( G6 * 255.0 / 63.0 + 0.5);
// B8 = (int) floor( B5 * 255.0 / 31.0 + 0.5);
Some extra cents:
If you are interested in 888 to 565 conversion, this works very well too:
R5 = ( R8 * 249 + 1014 ) >> 11;
G6 = ( G8 * 253 + 505 ) >> 10;
B5 = ( B8 * 249 + 1014 ) >> 11;
The constants were found using a brute-force search with some early rejections to speed things up a bit.
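The claim about the constants can be checked exhaustively, since there are only 32 and 64 inputs. The sketch below wraps the formulas in small functions with made-up names (`c5_to_8`, `c6_to_8`, `c8_to_5`, `c8_to_6`) and counts the inputs for which the multiply-and-shift result differs from the rounded floating-point value, or for which converting up and back down does not return the original value.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t c5_to_8(unsigned v) { assert(v < 32);  return (uint8_t)((v * 527u + 23u) >> 6); }
static uint8_t c6_to_8(unsigned v) { assert(v < 64);  return (uint8_t)((v * 259u + 33u) >> 6); }
static uint8_t c8_to_5(unsigned v) { assert(v < 256); return (uint8_t)((v * 249u + 1014u) >> 11); }
static uint8_t c8_to_6(unsigned v) { assert(v < 256); return (uint8_t)((v * 253u + 505u) >> 10); }

/* Number of inputs where the shortcut disagrees with round-to-nearest
   floating point, or where 5/6 -> 8 -> 5/6 fails to round-trip. */
int count_mismatches(void)
{
    int bad = 0;
    for (unsigned v = 0; v < 32; ++v) {
        if (c5_to_8(v) != (unsigned)(v * 255.0 / 31.0 + 0.5)) ++bad;
        if ((unsigned)c8_to_5(c5_to_8(v)) != v) ++bad;
    }
    for (unsigned v = 0; v < 64; ++v) {
        if (c6_to_8(v) != (unsigned)(v * 255.0 / 63.0 + 0.5)) ++bad;
        if ((unsigned)c8_to_6(c6_to_8(v)) != v) ++bad;
    }
    return bad;
}
```

The sketch checks only the expanding direction against floating point; the reducing direction is checked just for the round trip.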
You could shift and then or with the most significant bits; i.e.
Red 10101 becomes 10101000 | 101 => 10101101
12345 12345--- 123 12345123
This has the property you seek, but it's not the most linear mapping of values from one space to the other. It's fast, though. :)
Cletus' answer is more complete and probably better. :)
iOS vImage Conversion
The iOS Accelerate Framework documents the following algorithm for the vImageConvert_RGB565toARGB8888 function:
Pixel8 alpha = alpha
Pixel8 red = (5bitRedChannel * 255 + 15) / 31
Pixel8 green = (6bitGreenChannel * 255 + 31) / 63
Pixel8 blue = (5bitBlueChannel * 255 + 15) / 31
For a one-off conversion this will be fast enough, but if you want to process many frames you want to use something like the iOS vImage conversion or implement this yourself using NEON intrinsics.
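Written out in scalar C, the documented formulas might look like the sketch below. It is not the vImage implementation; `rgb565_to_argb8888` is a hypothetical name, and because RGB565 carries no alpha the caller is assumed to supply a constant alpha value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert n RGB565 pixels to A, R, G, B bytes using the integer formulas
   (5-bit * 255 + 15) / 31 and (6-bit * 255 + 31) / 63. */
void rgb565_to_argb8888(const uint16_t *src, uint8_t *dst, size_t n, uint8_t alpha)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i) {
        unsigned r5 = (src[i] >> 11) & 31u;
        unsigned g6 = (src[i] >> 5) & 63u;
        unsigned b5 = src[i] & 31u;
        dst[4 * i + 0] = alpha;
        dst[4 * i + 1] = (uint8_t)((r5 * 255u + 15u) / 31u);
        dst[4 * i + 2] = (uint8_t)((g6 * 255u + 31u) / 63u);
        dst[4 * i + 3] = (uint8_t)((b5 * 255u + 15u) / 31u);
    }
}
```

Adding half the divisor before dividing makes the integer division round to nearest, so white maps to (255, 255, 255).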
From ARM's Community Forum tutorial:
First, we will look at converting RGB565 to RGB888. We assume there are eight 16-bit pixels in register q0, and we would like to separate reds, greens and blues into 8-bit elements across three registers d2 to d4.
vshr.u8 q1, q0, #3 # shift red elements right by three bits,
# discarding the green bits at the bottom of
# the red 8-bit elements.
vshrn.i16 d2, q1, #5 # shift red elements right and narrow,
# discarding the blue and green bits.
vshrn.i16 d3, q0, #5 # shift green elements right and narrow,
# discarding the blue bits and some red bits
# due to narrowing.
vshl.i8 d3, d3, #2 # shift green elements left, discarding the
# remaining red bits, and placing green bits
# in the correct place.
vshl.i16 q0, q0, #3 # shift blue elements left to most-significant
# bits of 8-bit color channel.
vmovn.i16 d4, q0 # remove remaining red and green bits by
# narrowing to 8 bits.
The effects of each instruction are described in the comments above, but in summary, the operation performed on each channel is:
Remove color data for adjacent channels using shifts to push the bits off either end of the element.
Use a second shift to position the color data in the most-significant bits of each element, and narrow to reduce element size from 16 to eight bits.
Note the use of element sizes in this sequence to address 8 and 16 bit elements, in order to achieve some of the masking operations.
A small problem
You may notice that, if you use the code above to convert to RGB888 format, your whites aren't quite white. This is because, for each channel, the lowest two or three bits are zero, rather than one; a white represented in RGB565 as (0x1F, 0x3F, 0x1F) becomes (0xF8, 0xFC, 0xF8) in RGB888. This can be fixed using shift with insert to place some of the most-significant bits into the lower bits.
For an Android specific example I found a YUV-to-RGB conversion written in intrinsics.
Try this:
red5 = (buf & 0xF800) >> 11;
red8 = (red5 << 3) | (red5 >> 2);
This will map all zeros into all zeros, all 1's into all 1's, and everything in between into everything in between. You can make it more efficient by shifting the bits into place in one step:
redmask = (buf & 0xF800);
rgb888 = (redmask << 8) | ((redmask<<3)&0x070000) | /* green, blue */
Do likewise for green and blue (for 6 bits, shift left 2 and right 4 respectively in the top method).
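One possible way to spell out all three channels in the single-step style is sketched below. This is an illustration, not the answerer's code: `rgb565_to_rgb888` is a hypothetical name, and the result is assumed to be packed as 0x00RRGGBB.

```c
#include <assert.h>
#include <stdint.h>

/* RGB565 -> 0x00RRGGBB, copying each channel's top bits into its low bits. */
uint32_t rgb565_to_rgb888(uint16_t buf)
{
    uint32_t r = buf & 0xF800u;   /* red in bits 15..11 */
    uint32_t g = buf & 0x07E0u;   /* green in bits 10..5 */
    uint32_t b = buf & 0x001Fu;   /* blue in bits 4..0 */
    uint32_t rgb888 =
        (r << 8) | ((r << 3) & 0x070000u) |   /* red:   5 bits + top 3 again */
        (g << 5) | ((g >> 1) & 0x000300u) |   /* green: 6 bits + top 2 again */
        (b << 3) | (b >> 2);                  /* blue:  5 bits + top 3 again */
    assert(rgb888 <= 0xFFFFFFu);
    return rgb888;
}
```

With this, 0xFFFF maps to 0xFFFFFF and 0x0000 to 0x000000, which is the property asked for in the question.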
The general solution is to treat the numbers as binary fractions - thus, the 6 bit number 63/63 is the same as the 8 bit number 255/255. You can calculate this using floating point math initially, then compute a lookup table, as other posters suggest. This also has the advantage of being more intuitive than bit-bashing solutions. :)
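There is an error in jleedev's code. The masks in these two lines are wrong for RGB565 (green loses its lowest bit and blue picks up a green bit):
unsigned char green = (buf & 0x07c0) >> 5;
unsigned char blue = buf & 0x003f;
The correct code is: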
unsigned char green = (buf & 0x07e0) >> 5;
unsigned char blue = buf & 0x001f;
Cheers,
Andy
I used the following and got good results. It turned out my Logitech camera produced 16-bit RGB555, and converting to 24-bit RGB888 as shown allowed me to save the image as a JPEG using the Smaller Animals IJG code. Thanks for the hint found here on Stack Overflow.
// Convert a 16 bit inbuf array to a 24 bit outbuf array
BOOL JpegFile::ByteConvert(BYTE* inbuf, BYTE* outbuf, UINT width, UINT height)
{ UINT row_cnt, pix_cnt;
ULONG off1 = 0, off2 = 0;
BYTE tbi1, tbi2, R5, G5, B5, R8, G8, B8;
if (inbuf==NULL)
return FALSE;
for (row_cnt = 0; row_cnt < height; row_cnt++) // '<', not '<=': rows are 0..height-1
{ off1 = row_cnt * width * 2;
off2 = row_cnt * width * 3;
for(pix_cnt=0; pix_cnt < width; pix_cnt++)
{ tbi1 = inbuf[off1 + (pix_cnt * 2)];
tbi2 = inbuf[off1 + (pix_cnt * 2) + 1];
B5 = tbi1 & 0x1F;
G5 = (((tbi1 & 0xE0) >> 5) | ((tbi2 & 0x03) << 3)) & 0x1F;
R5 = (tbi2 >> 2) & 0x1F;
R8 = ( R5 * 527 + 23 ) >> 6;
G8 = ( G5 * 527 + 23 ) >> 6;
B8 = ( B5 * 527 + 23 ) >> 6;
outbuf[off2 + (pix_cnt * 3)] = R8;
outbuf[off2 + (pix_cnt * 3) + 1] = G8;
outbuf[off2 + (pix_cnt * 3) + 2] = B8;
}
}
return TRUE;
}
Here's the code:
namespace convert565888
{
inline uvec4_t const _c0{ { { 527u, 259u, 527u, 1u } } };
inline uvec4_t const _c1{ { { 23u, 33u, 23u, 0u } } };
} // end ns
uvec4_v const __vectorcall rgb565_to_888(uvec4_v const rgba) {
return(uvec4_v(_mm_srli_epi32(_mm_add_epi32(_mm_mullo_epi32(rgba.v,
uvec4_v(convert565888::_c0).v), uvec4_v(convert565888::_c1).v), 6)));
}
and for rgb 888 to 565 conversion:
namespace convert888565
{
inline uvec4_t const _c0{ { { 249u, 509u, 249u, 1u } } };
inline uvec4_t const _c1{ { { 1014u, 253u, 1014u, 0u } } };
} // end ns
uvec4_v const __vectorcall rgb888_to_565(uvec4_v const rgba) {
return(uvec4_v(_mm_srli_epi32(_mm_add_epi32(_mm_mullo_epi32(rgba.v,
uvec4_v(convert888565::_c0).v), uvec4_v(convert888565::_c1).v), 11)));
}
For an explanation of where all these numbers come from, and specifically how I calculated the optimal multiplier and bias for green, see this Desmos graph:
Desmos graph -
https://www.desmos.com/calculator/3grykboay1
The graph isn't the greatest, but it shows the actual value vs. the error. Play around with the interactive sliders to see how different values affect the output. The graph also applies to calculating the red and blue values. Typically green is shifted by 10 bits, and red and blue by 11 bits.
For this to work, the intrinsic _mm_srli_epi32 / _mm_srl_epi32 requires all components to be shifted by the same amount. So everything is shifted by 11 bits in this version (rgb888_to_565), and the green multiplier and bias are scaled to compensate for the change. Fortunately, it scales perfectly!
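For readers without the uvec4_v wrapper types, here is a scalar sketch of the same multiply, add, shift scheme. The constants are copied from the vector code above and have only been checked here at the endpoints. The function names, the packed 0x00RRGGBB layout, and the three-argument signature are assumptions made for illustration.

```c
#include <stdint.h>

/* RGB565 -> 0x00RRGGBB: every channel uses the same right shift of 6. */
uint32_t rgb565_to_888_scaled(uint16_t p)
{
    uint32_t r5 = (p >> 11) & 0x1F;
    uint32_t g6 = (p >> 5)  & 0x3F;
    uint32_t b5 =  p        & 0x1F;

    uint32_t r8 = (r5 * 527u + 23u) >> 6;
    uint32_t g8 = (g6 * 259u + 33u) >> 6;
    uint32_t b8 = (b5 * 527u + 23u) >> 6;

    return (r8 << 16) | (g8 << 8) | b8;
}

/* RGB888 -> RGB565: every channel uses the same right shift of 11;
   the green multiplier and bias are the rescaled values from above. */
uint16_t rgb888_to_565_scaled(uint8_t r8, uint8_t g8, uint8_t b8)
{
    uint32_t r5 = (r8 * 249u + 1014u) >> 11;
    uint32_t g6 = (g8 * 509u + 253u)  >> 11;
    uint32_t b5 = (b8 * 249u + 1014u) >> 11;

    return (uint16_t)((r5 << 11) | (g6 << 5) | b5);
}
```

Keeping one shift per direction is what lets the vector version apply a single _mm_srli_epi32 to all lanes.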
I had this difficulty too, and the most faithful way I found was to replace the 16-bit value with the original 24-bit value. Now the colors on the ILI9341 screen are visually consistent with my notebook's screen. I thought of just using the 24-bit color table, but then the display routines would have to convert to 565, and that would make the program even slower.
If the color palette is fixed, as in my case, this might be the most viable option. I tried combining the 3 MSBs with the 3 LSBs, but the result wasn't very good.
I got the colors I used on the ILI9341 display from this website (note: I chose the 24-bit 888 color and read off the 16-bit 565 color; the website offers no way to go in the other direction):
http://www.barth-dev.de/online/rgb565-color-picker/
For example, I read the pixel color of the ILI9341 display and save it to a USB Disk, in a file, in BMP format. As the display operates with 16-bit or 18-bit, I have no way to retrieve 24-bit information directly from the GRAM memory.
#define BLACK_565 0x0000
#define BLUE_565 0x001F
#define RED_565 0xF800
#define GREEN_565 0x07E0
#define CYAN_565 0x07FF
#define MAGENTA_565 0xF81F
#define YELLOW_565 0xFFE0
#define WHITE_565 0xFFFF
#define LIGHTGREY_565 0xC618
#define ORANGE_565 0xFD20
#define GREY_565 0x8410
#define DARKGREY_565 0x2104
#define DARKBLUE_565 0x0010
#define DARKGREEN_565 0x03E0
#define DARKCYAN_565 0x03EF
#define DARKYELLOW_565 0x8C40
#define BLUESKY_565 0x047F
#define BROWN_565 0xC408
#define BLACK_888 0x000000
#define BLUE_888 0x0000FF
#define RED_888 0xFF0000
#define GREEN_888 0x04FF00
#define CYAN_888 0x00FFFB
#define MAGENTA_888 0xFF00FA
#define YELLOW_888 0xFBFF00
#define WHITE_888 0xFFFFFF
#define LIGHTGREY_888 0xC6C3C6
#define ORANGE_888 0xFFA500
#define GREY_888 0x808080
#define DARKGREY_888 0x202020
#define DARKBLUE_888 0x000080
#define DARKGREEN_888 0x007D00
#define DARKCYAN_888 0x007D7B
#define DARKYELLOW_888 0x898A00
#define BLUESKY_888 0x008CFF
#define BROWN_888 0xC08240
I did the test (using an STM32F407 microcontroller) with a chain of if statements, but it can also be done with a switch statement or another form of comparison.
uint16_t buff1; // pixel color value read from GRAM
uint8_t buff2[3];
uint32_t color_buff; // to save to USB disk
if (buff1 == BLUE_565) color_buff = BLUE_888;
else if (buff1 == RED_565) color_buff = RED_888;
else if (buff1 == GREEN_565) color_buff = GREEN_888;
else if (buff1 == CYAN_565) color_buff = CYAN_888;
else if (buff1 == MAGENTA_565) color_buff = MAGENTA_888;
else if (buff1 == YELLOW_565) color_buff = YELLOW_888;
else if (buff1 == WHITE_565) color_buff = WHITE_888;
else if (buff1 == LIGHTGREY_565) color_buff = LIGHTGREY_888;
else if (buff1 == ORANGE_565) color_buff = ORANGE_888;
else if (buff1 == GREY_565) color_buff = GREY_888;
else if (buff1 == DARKGREY_565) color_buff = DARKGREY_888;
else if (buff1 == DARKBLUE_565) color_buff = DARKBLUE_888;
else if (buff1 == DARKGREEN_565) color_buff = DARKGREEN_888;
else if (buff1 == DARKCYAN_565) color_buff = DARKCYAN_888;
else if (buff1 == DARKYELLOW_565) color_buff = DARKYELLOW_888;
else if (buff1 == BLUESKY_565) color_buff = BLUESKY_888;
else if (buff1 == BROWN_565) color_buff = BROWN_888;
else color_buff = BLACK_888; // BLACK_565 and any unknown color
RGB separation for saving to 8-bit variables:
buff2[0] = color_buff; // Blue
buff2[1] = color_buff >> 8; // Green
buff2[2] = color_buff >> 16; // Red
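The same fixed-palette mapping could also be written as a table search instead of a comparison chain. The sketch below is only an illustration: the type and function names are hypothetical, and the table holds just a few of the pairs listed above.

```c
#include <stddef.h>
#include <stdint.h>

/* One palette entry: the 565 value read from GRAM and its 888 original. */
typedef struct {
    uint16_t c565;
    uint32_t c888;
} palette_entry;

/* A subset of the pairs defined above; extend with the remaining colors. */
static const palette_entry palette[] = {
    { 0x0000, 0x000000 },  /* BLACK   */
    { 0x001F, 0x0000FF },  /* BLUE    */
    { 0xF800, 0xFF0000 },  /* RED     */
    { 0x07E0, 0x04FF00 },  /* GREEN   */
    { 0xFFFF, 0xFFFFFF },  /* WHITE   */
    { 0xFD20, 0xFFA500 },  /* ORANGE  */
};

/* Return the 24-bit color for a known 565 value, or black if unknown. */
uint32_t palette_565_to_888(uint16_t c565)
{
    for (size_t i = 0; i < sizeof palette / sizeof palette[0]; i++) {
        if (palette[i].c565 == c565)
            return palette[i].c888;
    }
    return 0x000000;
}
```

Adding a color then means adding one table row, and the fallback to black for unknown values matches the final else branch of the chain above.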

Resources