Why are image values different in Matlab and OpenCV?

I have an original image:
I then read it, create a PSF, and blur it in Matlab:
lenawords1=imread('lenawords.bmp');
%create PSF
sigma=6;
PSFgauss=fspecial('gaussian', 8*sigma+1, sigma);
%blur it
lenablur1=imfilter(lenawords1, PSFgauss, 'conv');
lenablurgray1=mat2gray(lenablur1);
PSFgaussgray = mat2gray(PSFgauss);
and I saved the blurred image:
imwrite(lenablurgray1, 'lenablur.bmp');
When I display some values in it, I get
disp(lenablurgray1(91:93, 71:75))
0.5556 0.5778 0.6000 0.6222 0.6444
0.6000 0.6444 0.6667 0.6889 0.6889
0.6444 0.6889 0.7111 0.7333 0.7333
I then open that blurred image in OpenCV and display its values at the same indices:
Mat img = imread("lenablur.bmp");
for (int r = 91; r < 94; r++) {
for (int c = 71; c < 76; c++) {
cout << img.at<double>(r, c) << " ";
}
cout << endl;
}
cout << endl;
The result I get doesn't match the values above:
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
Why is this?
EDIT: img.at<unsigned int>(r, c) gives
1903260029 1533437542 ...
2004318088 ...
....
If I save the blurred image as a png file:
imwrite(lenablurgray1, 'lenablur.png');
Then when I read it in OpenCV:
Mat img = imread("lenablur.png");
img.convertTo(img, CV_64F);
then img.at<double>(r, c) gives
17 11 11 11 6
17 11 11 11 6
17 11 11 11 11
which still doesn't match the values from Matlab
EDIT2: I now see the values are wrong for the kernel. In Matlab, I get
imwrite(PSFgaussgray, 'PSFgauss.bmp');
disp(PSFgaussgray(7:9, 7:9)*256)
.0316 .0513 .0812
.0513 ...
...
whereas in OpenCV:
Mat kernel = imread("PSFgauss.bmp");
cvtColor(kernel, kernel, cv::COLOR_BGR2GRAY);
kernel.convertTo(kernel, CV_64F);
for (int r = 6; r < 9 ; r++) {
for (int c = 6; c < 9; c++) {
cout << kernel.at<double>(r, c) << " ";
}
cout << endl;
}
cout << endl;
The result I get doesn't match the values above:
0 0 0
0 0 0
0 0 0

To understand the discrepancy you see, you need to know how MATLAB saves images to a BMP or PNG file, and how OpenCV reads them.
MATLAB assumes, if the image is of type double as in this case, that its intensity range is [0,1]. That is, pixel values below 0 and above 1 are not expected. Such images are multiplied by 255 and converted to 8-bit integers (which have a range of [0,255]) when saved to a file.
Thus, if
>> disp(lenablurgray1(91:93, 71:75))
0.5556 0.5778 0.6000 0.6222 0.6444
0.6000 0.6444 0.6667 0.6889 0.6889
0.6444 0.6889 0.7111 0.7333 0.7333
what is saved is
>> uint8( lenablurgray1(91:93, 71:75) * 255 )
142 147 153 159 164
153 164 170 176 176
164 176 181 187 187
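The same scaling can be mimicked outside MATLAB. This is only a minimal Python sketch, assuming the conversion amounts to multiplying by 255, rounding to the nearest integer, and clipping to [0, 255]; the helper name to_uint8 is made up for this illustration.

```python
def to_uint8(value):
    """Map a double in [0, 1] to an 8-bit integer (assumed imwrite behavior)."""
    scaled = int(value * 255 + 0.5)      # multiply by 255, round half up
    return max(0, min(255, scaled))      # clip anything outside [0, 255]

# First row of the values MATLAB displayed in the question.
row = [0.5556, 0.5778, 0.6000, 0.6222, 0.6444]
print([to_uint8(v) for v in row])  # [142, 147, 153, 159, 164]
```

The printed row matches the first row of the uint8 output above.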
Next, OpenCV will read this file as RGB (or rather BGR, OpenCV's awkward color order) and as 8-bit unsigned integer (CV_8U). To display these data, either extract one of the color channels, or convert to gray value using
cvtColor(img, img, cv::COLOR_BGR2GRAY);
Then, read the 8-bit unsigned values with
img.at<uchar>(r, c)
If you read them with img.at<double>(), groups of 8 consecutive pixels will be regarded as a single pixel value (a double has 8 bytes).
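To see what such a reinterpretation does, here is a small Python sketch using the struct module. The eight byte values are an assumption: they were picked so that reading them four at a time reproduces the first two numbers from the EDIT in the question, and the real bytes in your file may differ.

```python
import struct

# Eight hypothetical 8-bit values as they would sit next to each other in memory.
pixels = bytes([0x7D, 0x71, 0x71, 0x71, 0x66, 0x66, 0x66, 0x5B])

as_uchar = list(pixels)                          # correct: one value per byte
as_uint = struct.unpack("<I", pixels[:4])[0]     # 4 bytes fused into one number
as_uint2 = struct.unpack("<I", pixels[4:])[0]    # the next 4 bytes
as_double = struct.unpack("<d", pixels)[0]       # all 8 bytes fused into one number

print(as_uchar)   # [125, 113, 113, 113, 102, 102, 102, 91]
print(as_uint)    # 1903260029
print(as_uint2)   # 1533437542
print(as_double)  # some meaningless floating-point value
```

Only the first line shows pixel values; the others are artifacts of reading the memory with the wrong element type.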
Next, remember that MATLAB's indexing starts at 1, whereas OpenCV's starts at 0. So your loop should look like this:
for (int r = 90; r < 93; r++) {      // matches MATLAB's 91:93 indexing
    for (int c = 70; c < 75; c++) {  // matches MATLAB's 71:75 indexing
        cout << (int)img.at<uchar>(r, c) << " ";
    }
    cout << '\n';
}
cout << '\n';
Finally, in the case of your kernel, note that its values, when multiplied by 255 are still much smaller than unity: .0316 .0513 .0812. These values will be written as 0 to the BMP or PNG file. If you want to save these values, you need to scale the kernel so its maximum value is 1:
PSFgauss = PSFgauss / max(PSFgauss(:));
imwrite(PSFgauss, 'PSFgauss.bmp');
(Note that this kernel is already a grey-value image, you don't need to use mat2gray on it.)

Compare 16 bpp to 32 bpp bitmap conversions

I got a 16 bpp bitmap that I converted to 32 bpp via code below:
void Rgb555ToRgb8(const UChar* bitmapData, UInt32 width, UInt32 height, UChar* buf)
{
UInt32 dst_bytes_per_row = width * 4;
UInt32 src_bytes_per_row = ((width * 16 + 31) / 32) * 4;
UInt16 red_mask = 0x7C00;
UInt16 green_mask = 0x3E0;
UInt16 blue_mask = 0x1F;
for (UInt32 row = 0; row < height; ++row)
{
UInt32 dstCol = 0, srcCol = 0;
do
{
UInt16 rgb = *(UInt16*)(bitmapData + row * src_bytes_per_row + srcCol);
UChar red_value = (rgb & red_mask) >> 10;
UChar green_value = (rgb & green_mask) >> 5;
UChar blue_value = (rgb & blue_mask);
buf[row*dst_bytes_per_row + dstCol] = blue_value << 3;
buf[row*dst_bytes_per_row + dstCol + 1] = green_value << 3;
buf[row*dst_bytes_per_row + dstCol + 2] = red_value << 3;
buf[row*dst_bytes_per_row + dstCol + 3] = rgb >> 15;
srcCol += 2;
dstCol += 4;
} while (srcCol < src_bytes_per_row);
}
}
Here is the conversion result: https://i.stack.imgur.com/1ajO7.png
I also tried to convert this image via GdiPlus:
Gdiplus::Bitmap* bmp = new Gdiplus::Bitmap(w,h,PixelFormat32bppRGB);
The resultant image is not reproduced here.
Notice that the 2 results don't look exactly the same (e.g., the background in GdiPlus result is white). How can I modify my code to match GdiPlus result?

There are two issues that need to be addressed:
Unused bits
When moving from 5 bits of information to 8 bits of information you gain an additional 3 bits. As implemented, the code doesn't make use of that additional range, and is biased towards darker color components. This is an illustration of what blue_value << 3 actually does:
5 bits per channel 8 bits per channel
bbbbb -> bbbbb000
To address this, the least significant 3 bits need to grow as the channel value gets higher. A simple (yet somewhat inaccurate) approach would be to just copy the most significant 3 bits down to the least significant 3 bits, i.e.
buf[row*dst_bytes_per_row + dstCol] = (blue_value << 3) | (blue_value >> 2);
buf[row*dst_bytes_per_row + dstCol + 1] = (green_value << 3) | (green_value >> 2);
buf[row*dst_bytes_per_row + dstCol + 2] = (red_value << 3) | (red_value >> 2);
The exact mapping would be a bit more involved, something like
blue_value = static_cast<UChar>((blue_value * 255.0) / 31.0 + 0.5);
That converts from 5 bits to the respective 8 bit value that's nearest to the ideal value, including the 4 values that were 1/255th off in the bit-shifting solution above.
If you opt for the latter, you can build a lookup table that stores the mapped values. This table is only 32 entries of one byte each, so it fits into a single cache-line.
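As a rough illustration of the lookup table idea, the following Python sketch builds the 32-entry table and compares it against the bit-shifting variant. The function and table names are made up here; the arithmetic is the same as in the two snippets above.

```python
def expand_shift(v5):
    """Bit-shifting mapping: copy the top 3 bits into the low 3 bits."""
    return (v5 << 3) | (v5 >> 2)

def expand_exact(v5):
    """Nearest 8-bit value to v5 * 255 / 31."""
    return int(v5 * 255.0 / 31.0 + 0.5)

# 32-entry lookup table for the exact mapping, one byte per entry.
LUT_5_TO_8 = bytes(expand_exact(v) for v in range(32))

# 5-bit inputs where the two mappings disagree.
differing = [v for v in range(32) if expand_shift(v) != LUT_5_TO_8[v]]
print(differing)  # [3, 7, 24, 28]
```

In the conversion loop each channel then becomes a single table access, e.g. LUT_5_TO_8[blue_value].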
Alpha channel
Assuming that the MSB of your source image is indeed interpreted as an alpha value, you're going to have to move that into the destination as well. Since the source is only 1 bit of information, the raw transformation is trivial:
buf[row*dst_bytes_per_row + dstCol + 3] = rgb & (1 << 15) ? 255 : 0;
That may or may not be all that's needed. Windows assumes premultiplied alpha, i.e. the stored values of the color channels must be premultiplied by the alpha value (see BLENDFUNCTION for reference).
If the alpha value is 255, the color channel values are already correct. If the alpha value is 0, all color channels need to be multiplied by zero (or simply set to 0). The translation doesn't produce any other alpha values.
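Putting both parts together for a single pixel could look like the sketch below. It is written in Python for brevity, assumes bit 15 really is an alpha flag, and uses the simple bit-copy expansion; rgb555_to_bgra is a hypothetical helper name, not anything from GDI+.

```python
def rgb555_to_bgra(pixel16):
    """Convert one 1-5-5-5 pixel to (blue, green, red, alpha) bytes."""
    def expand(v5):
        # Copy the top 3 bits into the low 3 bits so 31 maps to 255.
        return (v5 << 3) | (v5 >> 2)

    red = expand((pixel16 >> 10) & 0x1F)
    green = expand((pixel16 >> 5) & 0x1F)
    blue = expand(pixel16 & 0x1F)
    alpha = 255 if pixel16 & 0x8000 else 0
    if alpha == 0:
        # Premultiplied alpha: a fully transparent pixel carries no color.
        blue = green = red = 0
    return blue, green, red, alpha

print(rgb555_to_bgra(0xFFFF))  # (255, 255, 255, 255)
print(rgb555_to_bgra(0x7FFF))  # (0, 0, 0, 0)
```

Whether the zeroing step is wanted depends on how the destination bitmap is consumed; if the top bit is just padding, you would drop the alpha handling entirely.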

Decoding a .xwd image in Julia

I am trying to write a decoder for a .xwd image (X Window Dump), since ImageMagick is quite slow.
The only specifications I found are:
http://www.opensource.apple.com/source/X11/X11-0.40.80/xc/include/XWDFile.h?txt
https://formats.kaitai.io/xwd/index.html
From which I managed to read the header:
xwd_data = read(`xwd -id $id`)
function get_header(data)
args = [reinterpret(Int32, reverse(data[4*i-3:4*i]))[1] for i in 1:25]
xwd = XwdHeader(args...)
return xwd
end
struct XwdHeader
header_size::Int32
file_version::Int32
pixmap_format::Int32
pixmap_depth::Int32
pixmap_width::Int32
pixmap_height::Int32
xoffset::Int32
byte_order::Int32
bitmap_unit::Int32
bitmap_bit_order::Int32
bitmap_pad::Int32
bits_per_pixel::Int32
bytes_per_line::Int32
visual_class::Int32
red_mask::Int32
green_mask::Int32
blue_mask::Int32
bits_per_rgb::Int32
colormap_entries::Int32
ncolors::Int32
window_width::Int32
window_height::Int32
window_x::Int32
window_y::Int32
window_bdrwidth::Int32
end
and the colormap, which is stored in blocks of 12 bytes and in little-endian byte order:
function read_colormap_entry(n, data, header)
offset = header.header_size + 1
poff = 12*n
px = Pixel(reinterpret(UInt32, reverse(data[offset+poff:offset+poff+3]))[1],
reinterpret(UInt16, reverse(data[offset+poff+4:offset+poff+5]))[1],
reinterpret(UInt16, reverse(data[offset+poff+6:offset+poff+7]))[1],
reinterpret(UInt16, reverse(data[offset+poff+8:offset+poff+9]))[1],
reinterpret(UInt8, data[offset+poff+10])[1],
reinterpret(UInt8, data[offset+poff+11])[1])
println("Pixel number ", px.entry_number >> 16)
println("R ", px.red >> 8)
println("G ", px.green >> 8)
println("B ", px.blue >> 8)
println("flags ", px.flags)
println("padding ",px.padding)
end
struct Pixel
entry_number::UInt32
red::UInt16
green::UInt16
blue::UInt16
flags::UInt8
padding::UInt8
end
julia> read_colormap_entry(0, data, header)
Pixel number 0
R 0
G 0
B 0
flags 7
padding 0
julia> read_colormap_entry(1, data, header)
Pixel number 1
R 1
G 1
B 1
flags 7
padding 0
julia> read_colormap_entry(2, data, header)
Pixel number 2
R 2
G 2
B 2
flags 7
padding 0
Now I have the actual image data stored in 4 byte blocks per pixel in the "Direct Color" visual class. Does anybody know how to extract the RGB values from this?
edit:
By playing around with the data I found out how to extract the R and G values
function read_pixel(i, j, data, header::XwdHeader)
w = header.window_width
h = header.window_height
offset = header.header_size + header.colormap_entries * 12 + 1
poff = 4*((i-1)*w + (j-1))
px = reinterpret(UInt32, reverse(data[offset+poff:offset+poff+3]))[1]
println("Px value ", px)
r = (px & xwd.red_mask) >> 16
g = (px & xwd.green_mask) >> 8
b = (px & xwd.blue_mask)
println("r ", r)
println("g ", g)
println("b ", b)
end
which gives the correct R and G values, but the B value should be non zero.
julia> read_pixel(31, 31, data, xwd_header)
Px value 741685248
r 53
g 56
b 0
I basically have no idea what I am doing with the color masks and the bit-shifts. Can anyone explain this? Thanks!

Subset sum with maximum equal sums and without using all elements

You are given a set of integers and your task is the following: split them into 2 subsets with an equal sum in such way that these sums are maximal. You are allowed not to use all given integers, that's fine. If it's just impossible, report error somehow.
My approach is rather straightforward: at each step, we pick a single item, mark it as visited, update current sum and pick another item recursively. Finally, try skipping current element.
It works on simpler test cases, but it fails one:
T = 1
N = 25
Elements: 5 27 24 12 12 2 15 25 32 21 37 29 20 9 24 35 26 8 31 5 25 21 28 3 5
One can run it as follows:
1 25 5 27 24 12 12 2 15 25 32 21 37 29 20 9 24 35 26 8 31 5 25 21 28 3 5
I expect the sum to be equal to 239, but the algorithm fails to find such a solution.
I've ended up with the following code:
#include <iostream>
#include <unordered_set>
using namespace std;
unordered_set<uint64_t> visited;
const int max_N = 50;
int data[max_N];
int p1[max_N];
int p2[max_N];
int out1[max_N];
int out2[max_N];
int n1 = 0;
int n2 = 0;
int o1 = 0;
int o2 = 0;
int N = 0;
void max_sum(int16_t &sum_out, int16_t sum1 = 0, int16_t sum2 = 0, int idx = 0) {
if (idx < 0 || idx > N) return;
if (sum1 == sum2 && sum1 > sum_out) {
sum_out = sum1;
o1 = n1;
o2 = n2;
for(int i = 0; i < n1; ++i) {
out1[i] = p1[i];
}
for (int i = 0; i < n2; ++i) {
out2[i] = p2[i];
}
}
if (idx == N) return;
uint64_t key = (static_cast<uint64_t>(sum1) << 48) | (static_cast<uint64_t>(sum2) << 32) | idx;
if (visited.find(key) != visited.end()) return;
visited.insert(key);
p1[n1] = data[idx];
++n1;
max_sum(sum_out, sum1 + data[idx], sum2, idx + 1);
--n1;
p2[n2] = data[idx];
++n2;
max_sum(sum_out, sum1, sum2 + data[idx], idx + 1);
--n2;
max_sum(sum_out, sum1, sum2, idx + 1);
}
int main() {
int T = 0;
cin >> T;
for (int t = 1; t <= T; ++t) {
int16_t sum_out;
cin >> N;
for(int i = 0; i < N; ++i) {
cin >> data[i];
}
n1 = 0;
n2 = 0;
o1 = 0;
o2 = 0;
max_sum(sum_out);
int res = 0;
int res2 = 0;
for (int i = 0; i < o1; ++i) res += out1[i];
for (int i = 0; i < o2; ++i) res2 += out2[i];
if (res != res2) cerr << "ERROR: " << "res1 = " << res << "; res2 = " << res2 << '\n';
cout << "#" << t << " " << res << '\n';
visited.clear();
}
}
I have the following questions:
Could someone help me to troubleshoot the failing test? Are there any obvious problems?
How could I get rid of unordered_set for marking already visited sums? I prefer to use plain C.
Is there a better approach? Maybe using dynamic programming?

Another approach is to consider every number in the range [1, 2^N-2].
Treat each bit position as the index of an element. Iterate over all numbers in [1, 2^N-2] and check each one.
If a bit is set, put the corresponding element in set1; otherwise put it in set2. Then check whether the sums of both sets are equal. This enumerates all possible splits; if you want just one, break as soon as you find it.
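A minimal Python sketch of this enumeration is given below; find_equal_split is a made-up name. Note that, as described, every element lands in one of the two sets, so this does not cover the case where some elements are left out, and the 2^N loop is only practical for small N.

```python
def find_equal_split(values):
    """Return (set1, set2) with equal sums using all elements, or None."""
    n = len(values)
    for mask in range(1, (1 << n) - 1):      # every number in [1, 2^n - 2]
        set1 = [v for i, v in enumerate(values) if (mask >> i) & 1]
        set2 = [v for i, v in enumerate(values) if not (mask >> i) & 1]
        if sum(set1) == sum(set2):
            return set1, set2                # stop at the first hit
    return None

print(find_equal_split([1, 2, 3]))  # ([1, 2], [3])
print(find_equal_split([1, 2, 4]))  # None
```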

1) Could someone help me to troubleshoot the failing test? Are there any obvious problems?
The only issue I could see is that you have not set sum_out to 0.
When I tried running the program it seemed to work correctly for your test case.
2) How could I get rid of unordered_set for marking already visited sums? I prefer to use plain C.
See the answer to question 3
3) Is there a better approach? Maybe using dynamic programming?
You are currently keeping track of whether you have seen each combination of (sum of the first subset, sum of the second subset, position in the array).
If instead you keep track of the difference between the values then the complexity significantly reduces.
In particular, you can use dynamic programming to store an array A[diff] that for each value of the difference either stores -1 (to indicate that the difference is not reachable), or the greatest value of subset1 when the difference between subset1 and subset2 is exactly equal to diff.
You can then iterate over the entries in the input and update the array based on either assigning each element to subset1/subset2/ or not at all. (Note you need to make a new copy of the array when computing this update.)
In this form there is no need for unordered_set because you can simply use a plain C array. Subset1 and subset2 are also interchangeable, so you only need to keep non-negative differences.
Example Python Code
from collections import defaultdict

data = list(map(int, "5 27 24 12 12 2 15 25 32 21 37 29 20 9 24 35 26 8 31 5 25 21 28 3 5".split()))
A = defaultdict(int)  # Map from difference to best value of subset sum 1
A[0] = 0  # We start with a difference of 0
for a in data:
    A2 = defaultdict(int)

    def add(s1, s2):
        # Keep s1 as the smaller sum so only non-negative differences are stored
        if s1 > s2:
            s1, s2 = s2, s1
        d = s2 - s1
        if d in A2:
            A2[d] = max(A2[d], s1)
        else:
            A2[d] = s1

    for diff, sum1 in A.items():
        sum2 = sum1 + diff
        add(sum1, sum2)      # skip this element
        add(sum1 + a, sum2)  # put it in subset 1
        add(sum1, sum2 + a)  # put it in subset 2
    A = A2
print(A[0])
This prints 239 as the answer.
For simplicity I haven't bothered with the optimization of using a linear array instead of the dictionary.
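For what it's worth, the linear-array variant might look roughly like this. It is a sketch under my own naming (max_equal_sum, best, UNREACHABLE), sized by the total of all inputs, and it follows the same update rule as the dictionary version.

```python
def max_equal_sum(values):
    """Largest S such that two disjoint subsets of values both sum to S."""
    total = sum(values)
    UNREACHABLE = -1
    # best[d] = largest sum of the smaller subset when the difference is d
    best = [UNREACHABLE] * (total + 1)
    best[0] = 0
    for a in values:
        nxt = best[:]  # the copy covers the "skip this element" choice
        for d, low in enumerate(best):
            if low == UNREACHABLE:
                continue
            # Give a to the larger subset: the difference grows by a.
            if d + a <= total:
                nxt[d + a] = max(nxt[d + a], low)
            # Give a to the smaller subset: it may overtake the larger one.
            if a <= d:
                nd, nlow = d - a, low + a
            else:
                nd, nlow = a - d, low + d
            nxt[nd] = max(nxt[nd], nlow)
        best = nxt
    return best[0]

data = [5, 27, 24, 12, 12, 2, 15, 25, 32, 21, 37, 29, 20,
        9, 24, 35, 26, 8, 31, 5, 25, 21, 28, 3, 5]
print(max_equal_sum(data))  # 239
```

The array has one slot per possible difference, so the same layout carries over directly to a fixed-size int array in C.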

A very different approach would be to use a constraint or mixed integer solver. Here is a possible formulation.
Let
x(i,g) = 1 if value v(i) belongs to group g
0 otherwise
The optimization model can look like:
max s
s = sum(i, x(i,g)*v(i)) for all g
sum(g, x(i,g)) <= 1 for all i
For two groups we get:
---- 31 VARIABLE s.L = 239.000
---- 31 VARIABLE x.L
g1 g2
i1 1
i2 1
i3 1
i4 1
i5 1
i6 1
i7 1
i8 1
i9 1
i10 1
i11 1
i12 1
i13 1
i14 1
i15 1
i16 1
i17 1
i18 1
i19 1
i20 1
i21 1
i22 1
i23 1
i25 1
We can easily do more groups. E.g. with 9 groups:
---- 31 VARIABLE s.L = 52.000
---- 31 VARIABLE x.L
g1 g2 g3 g4 g5 g6 g7 g8 g9
i2 1
i3 1
i4 1
i5 1
i6 1
i7 1
i8 1
i9 1
i10 1
i11 1
i12 1
i13 1
i14 1
i15 1
i16 1
i17 1
i19 1
i20 1
i21 1
i22 1
i23 1
i24 1
i25 1
If there is no solution, the solver will select zero elements in each group with a sum s=0.

Go << and >> operators

Could someone please explain to me the usage of << and >> in Go? I guess it is similar to some other languages.

The super (possibly over) simplified definition is just that << is used for "times 2" and >> is for "divided by 2" - and the number after it is how many times.
So n << x is "n times 2, x times". And y >> z is "y divided by 2, z times".
For example, 1 << 5 is "1 times 2, 5 times" or 32. And 32 >> 5 is "32 divided by 2, 5 times" or 1.

From the spec at http://golang.org/doc/go_spec.html, it seems that at least with integers, it's a binary shift. for example, binary 0b00001000 >> 1 would be 0b00000100, and 0b00001000 << 1 would be 0b00010000.
Go versions before 1.13 don't accept the 0b notation for binary integers; I was just using it for the example. In decimal, 8 >> 1 is 4, and 8 << 1 is 16. Shifting left by one is the same as multiplication by 2, and shifting right by one is the same as dividing by two, discarding any remainder.
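The bit patterns are easy to print. The snippet below uses Python rather than Go purely for illustration; for these small non-negative values the shift operators give the same results in both languages.

```python
n = 0b00001000  # 8

right = n >> 1
left = n << 1

print(format(n, "08b"), n)          # 00001000 8
print(format(right, "08b"), right)  # 00000100 4
print(format(left, "08b"), left)    # 00010000 16
```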

The << and >> operators are Go Arithmetic Operators.
<<    left shift     integer << unsigned integer
>>    right shift    integer >> unsigned integer
The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. The shift count must be an unsigned integer. There is no upper limit on the shift count. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity.
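The "truncated towards negative infinity" part only matters for negative operands. A quick Python illustration follows (Python's >> on integers is also an arithmetic shift, so it serves as a stand-in here); the contrast is with division that rounds toward zero, which is what Go's / does for integers.

```python
for x in (7, -7):
    shifted = x >> 1          # arithmetic shift
    floored = x // 2          # division rounding toward negative infinity
    truncated = int(x / 2)    # division rounding toward zero
    print(x, shifted, floored, truncated)
# 7 3 3 3
# -7 -4 -4 -3
```

So for -7 the shift yields -4, while a toward-zero division yields -3.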

They are basically arithmetic operators, and they work the same way in other languages. Here is a basic PHP, C, and Go example:
GO
package main

import (
    "fmt"
)

func main() {
    var t, i uint
    t, i = 1, 1
    for i = 1; i < 10; i++ {
        fmt.Printf("%d << %d = %d \n", t, i, t<<i)
    }
    fmt.Println()
    t = 512
    for i = 1; i < 10; i++ {
        fmt.Printf("%d >> %d = %d \n", t, i, t>>i)
    }
}
C
#include <stdio.h>
int main()
{
int t = 1 ;
int i = 1 ;
for(i = 1; i < 10; i++) {
printf("%d << %d = %d \n", t, i, t << i);
}
printf("\n");
t = 512;
for(i = 1; i < 10; i++) {
printf("%d >> %d = %d \n", t, i, t >> i);
}
return 0;
}
PHP
$t = $i = 1;
for($i = 1; $i < 10; $i++) {
printf("%d << %d = %d \n", $t, $i, $t << $i);
}
print PHP_EOL;
$t = 512;
for($i = 1; $i < 10; $i++) {
printf("%d >> %d = %d \n", $t, $i, $t >> $i);
}
They would all output
1 << 1 = 2
1 << 2 = 4
1 << 3 = 8
1 << 4 = 16
1 << 5 = 32
1 << 6 = 64
1 << 7 = 128
1 << 8 = 256
1 << 9 = 512
512 >> 1 = 256
512 >> 2 = 128
512 >> 3 = 64
512 >> 4 = 32
512 >> 5 = 16
512 >> 6 = 8
512 >> 7 = 4
512 >> 8 = 2
512 >> 9 = 1

n << x = n * 2^x   Example: 3 << 5 = 3 * 2^5 = 96
y >> z = y / 2^z   Example: 512 >> 4 = 512 / 2^4 = 32

<< is left shift. >> is sign-extending right shift when the left operand is a signed integer, and is zero-extending right shift when the left operand is an unsigned integer.
To better understand >> think of
var u uint32 = 0x80000000;
var i int32 = -2;
u >> 1; // Is 0x40000000 similar to >>> in Java
i >> 1; // Is -1 similar to >> in Java
So when applied to an unsigned integer, the bits at the left are filled with zero, whereas when applied to a signed integer, the bits at the left are filled with the leftmost bit (which is 1 when the signed integer is negative as per 2's complement).
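The two fill behaviors can be simulated in Python, whose integers are unbounded, by masking to 32 bits for the unsigned case. The helper names are invented for this sketch.

```python
def shr_uint32(u, count):
    """Logical shift: vacated high bits are filled with zeros."""
    return (u & 0xFFFFFFFF) >> count

def shr_int32(i, count):
    """Arithmetic shift: vacated high bits copy the sign bit."""
    return i >> count   # Python ints already shift arithmetically

print(hex(shr_uint32(0x80000000, 1)))  # 0x40000000
print(shr_int32(-2, 1))                # -1
# The same 32-bit pattern as -2, but treated as unsigned:
print(hex(shr_uint32(-2, 1)))          # 0x7fffffff
```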

Go's << and >> are similar to shifts (that is: division or multiplication by a power of 2) in other languages, but because Go is a safer language than C/C++ it does some extra work when the shift count is a number.
Shift instructions in x86 CPUs consider only 5 bits (6 bits on 64-bit x86 CPUs) of the shift count. In languages like C/C++, the shift operator translates into a single CPU instruction.
The following Go code
x := 10
y := uint(1025) // A big shift count
println(x >> y)
println(x << y)
prints
0
0
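while a C/C++ program on x86 would typically print (strictly speaking, shifting by a count that is at least the width of the type is undefined behavior in C/C++)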
5
20
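The difference can be modeled in a few lines of Python. This is a simulation, not Go or C: it assumes a 64-bit int, only handles non-negative values, and the function names are invented for the illustration.

```python
WIDTH = 64  # assumed width of Go's int on a 64-bit platform

def go_shift_right(x, count):
    """Go-style: a count >= WIDTH shifts every bit out (non-negative x only)."""
    return 0 if count >= WIDTH else x >> count

def go_shift_left(x, count):
    return 0 if count >= WIDTH else (x << count) & ((1 << WIDTH) - 1)

def x86_shift_right(x, count):
    """Hardware-style: only the low bits of the count are looked at."""
    return x >> (count & (WIDTH - 1))

def x86_shift_left(x, count):
    return (x << (count & (WIDTH - 1))) & ((1 << WIDTH) - 1)

print(go_shift_right(10, 1025), go_shift_left(10, 1025))    # 0 0
print(x86_shift_right(10, 1025), x86_shift_left(10, 1025))  # 5 20
```

With the count masked, 1025 turns into 1, which is where the 5 and 20 come from.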

In decimal math, when we multiply or divide by 10, we affect the zeros on the end of the number.
In binary, 2 has the same effect. So we are adding a zero to the end, or removing the last digit.

<< is the bitwise left shift operator, which shifts the bits of the corresponding integer to the left; the rightmost bit is 0 after the shift.
For example:
In gcc an integer has 4 bytes, which means 32 bits.
The binary representation of 3 is
00000000 00000000 00000000 00000011
3<<1 would give
00000000 00000000 00000000 00000110 which is 6.
In general, 1<<x gives you 2^x.
In gcc,
1<<20 gives 2^20, that is 1048576,
but in tcc it would give you 0 as the result, because an integer has 2 bytes in tcc.
In simple terms, we can think of it like this in Go:
n << x is "n times 2, x times", and y >> z is "y divided by 2, z times".
n << x = n * 2^x   Example: 3 << 5 = 3 * 2^5 = 96
y >> z = y / 2^z   Example: 512 >> 4 = 512 / 2^4 = 32

These are the bitwise right-shift and left-shift operators.

How does one convert 16-bit RGB565 to 24-bit RGB888?

I’ve got my hands on a 16-bit rgb565 image (specifically, an Android framebuffer dump), and I would like to convert it to 24-bit rgb888 for viewing on a normal monitor.
The question is, how does one convert a 5- or 6-bit channel to 8 bits? The obvious answer is to shift it. I started out by writing this:
puts("P6 320 480 255");
uint16_t buf;
while (read(0, &buf, sizeof buf)) {
unsigned char red = (buf & 0xf800) >> 11;
unsigned char green = (buf & 0x07e0) >> 5;
unsigned char blue = buf & 0x001f;
putchar(red << 3);
putchar(green << 2);
putchar(blue << 3);
}
However, this doesn’t have one property I would like, which is for 0xffff to map to 0xffffff, instead of 0xf8fcf8. I need to expand the value in some way, but I’m not sure how that should work.
The Android SDK comes with a tool called ddms (Dalvik Debug Monitor) that takes screen captures. As far as I can tell from reading the code, it implements the same logic; yet its screenshots are coming out different, and white is mapping to white.
Here’s the raw framebuffer, the smart conversion by ddms, and the dumb conversion by the above algorithm. Note that the latter is slightly darker and greener.
(By the way, this conversion is implemented in ffmpeg, but it’s just performing the dumb conversion listed above, leaving the LSBs at all zero.)
I guess I have two questions:
What’s the most sensible way to convert rgb565 to rgb888?
How is DDMS converting its screenshots?

You want to map each of these from a 5/6 bit space to an 8 bit space.
5 bits = 32 values
6 bits = 64 values
8 bits = 256 values
The code you're using is taking the naive approach that x5 * 256/32 = x8 where 256/32 = 8 and multiplying by 8 is left shift 3 but, as you say, this doesn't necessarily fill the new number space "correctly". 5 to 8 for max value is 31 to 255 and therein lies your clue to the solution.
x8 = 255/31 * x5
x8 = 255/63 * x6
where x5, x6 and x8 are 5, 6 and 8 bit values respectively.
Now there is a question about the best way to implement this. It does involve division and with integer division you will lose any remainder result (round down basically) so the best solution is probably to do floating point arithmetic and then round half up back to an integer.
This can be sped up considerably by simply using this formula to generate a lookup table for each of the 5 and 6 bit conversions.
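A lookup-table version might look like the following Python sketch. The little-endian byte order of the input and the names build_lut and rgb565_to_rgb888 are assumptions made for the example.

```python
import struct

def build_lut(bits):
    """Table mapping every `bits`-wide value to the nearest 8-bit value."""
    top = (1 << bits) - 1
    return bytes(int(v * 255.0 / top + 0.5) for v in range(top + 1))

LUT5 = build_lut(5)   # 32 entries
LUT6 = build_lut(6)   # 64 entries

def rgb565_to_rgb888(raw):
    """Convert little-endian RGB565 bytes to packed RGB888 bytes."""
    out = bytearray()
    for (pixel,) in struct.iter_unpack("<H", raw):
        out.append(LUT5[pixel >> 11])            # red
        out.append(LUT6[(pixel >> 5) & 0x3F])    # green
        out.append(LUT5[pixel & 0x1F])           # blue
    return bytes(out)

print(rgb565_to_rgb888(b"\xff\xff").hex())  # ffffff
```

The floating-point work happens once while building the tables; the per-pixel loop only shifts, masks, and indexes.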

My few cents:
If you care about a precise mapping, yet want a fast algorithm, you can consider this:
R8 = ( R5 * 527 + 23 ) >> 6;
G8 = ( G6 * 259 + 33 ) >> 6;
B8 = ( B5 * 527 + 23 ) >> 6;
It uses only: MUL, ADD and SHR -> so it is pretty fast!
On the other hand, it is 100% compatible with the floating point mapping with proper rounding:
// R8 = (int) floor( R5 * 255.0 / 31.0 + 0.5);
// G8 = (int) floor( G6 * 255.0 / 63.0 + 0.5);
// B8 = (int) floor( B5 * 255.0 / 31.0 + 0.5);
Some extra cents:
If you are interested in 888 to 565 conversion, this works very well too:
R5 = ( R8 * 249 + 1014 ) >> 11;
G6 = ( G8 * 253 + 505 ) >> 10;
B5 = ( B8 * 249 + 1014 ) >> 11;
Constants were found using brute force search with some early rejections to speed things up a bit.
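The 565 to 888 claim is cheap to check exhaustively. Here is a short Python check; it only covers the expansion direction, and the function names are mine.

```python
import math

def r5_to_8(v):
    return (v * 527 + 23) >> 6

def g6_to_8(v):
    return (v * 259 + 33) >> 6

def exact(v, top):
    """Floating-point reference: scale to 0..255 and round half up."""
    return math.floor(v * 255.0 / top + 0.5)

# Inputs where the integer formula disagrees with the reference.
bad5 = [v for v in range(32) if r5_to_8(v) != exact(v, 31)]
bad6 = [v for v in range(64) if g6_to_8(v) != exact(v, 63)]
print(bad5, bad6)  # [] []
```

Both lists come out empty, so the integer formulas agree with the rounded floating-point mapping for every 5-bit and 6-bit input.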

You could shift and then or with the most significant bits; i.e.
Red 10101 becomes 10101000 | 101 => 10101101
12345 12345--- 123 12345123
This has the property you seek, but it's not the most linear mapping of values from one space to the other. It's fast, though. :)
Cletus' answer is more complete and probably better. :)

iOS vImage Conversion
The iOS Accelerate Framework documents the following algorithm for the vImageConvert_RGB565toARGB8888 function:
Pixel8 alpha = alpha
Pixel8 red = (5bitRedChannel * 255 + 15) / 31
Pixel8 green = (6bitGreenChannel * 255 + 31) / 63
Pixel8 blue = (5bitBlueChannel * 255 + 15) / 31
For a one-off conversion this will be fast enough, but if you want to process many frames you want to use something like the iOS vImage conversion or implement this yourself using NEON intrinsics.
From ARM's Community Forum tutorial:
First, we will look at converting RGB565 to RGB888. We assume there are eight 16-bit pixels in register q0, and we would like to separate reds, greens and blues into 8-bit elements across three registers d2 to d4.
vshr.u8   q1, q0, #3    @ shift red elements right by three bits,
                        @ discarding the green bits at the bottom of
                        @ the red 8-bit elements.
vshrn.i16 d2, q1, #5    @ shift red elements right and narrow,
                        @ discarding the blue and green bits.
vshrn.i16 d3, q0, #5    @ shift green elements right and narrow,
                        @ discarding the blue bits and some red bits
                        @ due to narrowing.
vshl.i8   d3, d3, #2    @ shift green elements left, discarding the
                        @ remaining red bits, and placing green bits
                        @ in the correct place.
vshl.i16  q0, q0, #3    @ shift blue elements left to most-significant
                        @ bits of 8-bit color channel.
vmovn.i16 d4, q0        @ remove remaining red and green bits by
                        @ narrowing to 8 bits.
The effects of each instruction are described in the comments above, but in summary, the operation performed on each channel is:
Remove color data for adjacent channels using shifts to push the bits off either end of the element.
Use a second shift to position the color data in the most-significant bits of each element, and narrow to reduce element size from 16 to eight bits.
Note the use of element sizes in this sequence to address 8 and 16 bit elements, in order to achieve some of the masking operations.
A small problem
You may notice that, if you use the code above to convert to RGB888 format, your whites aren't quite white. This is because, for each channel, the lowest two or three bits are zero, rather than one; a white represented in RGB565 as (0x1F, 0x3F, 0x1F) becomes (0xF8, 0xFC, 0xF8) in RGB888. This can be fixed using shift with insert to place some of the most-significant bits into the lower bits.
For an Android specific example I found a YUV-to-RGB conversion written in intrinsics.

Try this:
red5 = (buf & 0xF800) >> 11;
red8 = (red5 << 3) | (red5 >> 2);
This will map all zeros into all zeros, all 1's into all 1's, and everything in between into everything in between. You can make it more efficient by shifting the bits into place in one step:
redmask = (buf & 0xF800);
rgb888 = (redmask << 8) | ((redmask<<3)&0x070000) | /* green, blue */
Do likewise for green and blue (for 6 bits, shift left 2 and right 4 respectively in the top method).
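Spelled out for all three channels, the top method could look like this Python sketch. The function name is made up, and the result is packed as 0xRRGGBB.

```python
def rgb565_to_rgb888_replicate(pixel):
    """Expand one RGB565 pixel to 0xRRGGBB by copying high bits into low bits."""
    red5 = (pixel >> 11) & 0x1F
    green6 = (pixel >> 5) & 0x3F
    blue5 = pixel & 0x1F
    red8 = (red5 << 3) | (red5 >> 2)        # 5 bits: left 3, right 2
    green8 = (green6 << 2) | (green6 >> 4)  # 6 bits: left 2, right 4
    blue8 = (blue5 << 3) | (blue5 >> 2)
    return (red8 << 16) | (green8 << 8) | blue8

print(hex(rgb565_to_rgb888_replicate(0xFFFF)))  # 0xffffff
print(hex(rgb565_to_rgb888_replicate(0x0000)))  # 0x0
```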

The general solution is to treat the numbers as binary fractions - thus, the 6 bit number 63/63 is the same as the 8 bit number 255/255. You can calculate this using floating point math initially, then compute a lookup table, as other posters suggest. This also has the advantage of being more intuitive than bit-bashing solutions. :)

There is an error, jleedev!
unsigned char green = (buf & 0x07c0) >> 5;
unsigned char blue = buf & 0x003f;
The correct code:
unsigned char green = (buf & 0x07e0) >> 5;
unsigned char blue = buf & 0x001f;
Cheers,
Andy

I used the following and got good results. It turned out my Logitech cam was 16-bit RGB555, and using the following to convert to 24-bit RGB888 allowed me to save as a JPEG using the smaller animals ijg. Thanks for the hint found here on Stack Overflow.
// Convert a 16 bit inbuf array to a 24 bit outbuf array
BOOL JpegFile::ByteConvert(BYTE* inbuf, BYTE* outbuf, UINT width, UINT height)
{ UINT row_cnt, pix_cnt;
ULONG off1 = 0, off2 = 0;
BYTE tbi1, tbi2, R5, G5, B5, R8, G8, B8;
if (inbuf==NULL)
return FALSE;
for (row_cnt = 0; row_cnt < height; row_cnt++)
{ off1 = row_cnt * width * 2;
off2 = row_cnt * width * 3;
for(pix_cnt=0; pix_cnt < width; pix_cnt++)
{ tbi1 = inbuf[off1 + (pix_cnt * 2)];
tbi2 = inbuf[off1 + (pix_cnt * 2) + 1];
B5 = tbi1 & 0x1F;
G5 = (((tbi1 & 0xE0) >> 5) | ((tbi2 & 0x03) << 3)) & 0x1F;
R5 = (tbi2 >> 2) & 0x1F;
R8 = ( R5 * 527 + 23 ) >> 6;
G8 = ( G5 * 527 + 23 ) >> 6;
B8 = ( B5 * 527 + 23 ) >> 6;
outbuf[off2 + (pix_cnt * 3)] = R8;
outbuf[off2 + (pix_cnt * 3) + 1] = G8;
outbuf[off2 + (pix_cnt * 3) + 2] = B8;
}
}
return TRUE;
}

Here's the code:
namespace convert565888
{
inline uvec4_t const _c0{ { { 527u, 259u, 527u, 1u } } };
inline uvec4_t const _c1{ { { 23u, 33u, 23u, 0u } } };
} // end ns
uvec4_v const __vectorcall rgb565_to_888(uvec4_v const rgba) {
return(uvec4_v(_mm_srli_epi32(_mm_add_epi32(_mm_mullo_epi32(rgba.v,
uvec4_v(convert565888::_c0).v), uvec4_v(convert565888::_c1).v), 6)));
}
and for rgb 888 to 565 conversion:
namespace convert888565
{
inline uvec4_t const _c0{ { { 249u, 509u, 249u, 1u } } };
inline uvec4_t const _c1{ { { 1014u, 253u, 1014u, 0u } } };
} // end ns
uvec4_v const __vectorcall rgb888_to_565(uvec4_v const rgba) {
return(uvec4_v(_mm_srli_epi32(_mm_add_epi32(_mm_mullo_epi32(rgba.v,
uvec4_v(convert888565::_c0).v), uvec4_v(convert888565::_c1).v), 11)));
}
For the explanation of where all these numbers come from, specifically how I calculated the optimal multiplier and bias for green:
Desmos graph -
https://www.desmos.com/calculator/3grykboay1
The graph isn't the greatest but it shows the actual value vs. error -- play around with the interactive sliders to see how different values affect the output. This graph also applies to calculating the red and blue values. Typically green is shifted by 10 bits, red and blue by 11 bits.
For this to work with the intrinsic _mm_srli_epi32 / _mm_srl_epi32, all components must be shifted by the same amount. So everything is shifted by 11 bits (rgb888_to_565) in this version; however, the green component is scaled to compensate for this change. Fortunately, it scales perfectly!

I had this difficulty too, and the most faithful way I found was to replace the 16-bit value with the original 24-bit value. Now the ILI9341 screen color is visually compatible with Notebook screen. I thought of just using the 24-bit color table, but then the display routines would have to be converted to 565, and that would make the program even slower.
If the color palette is fixed as in my case, it might be the most viable option. I tried to make use of the 3 MSB adding with the 3 LSB, but it wasn't very good.
The colors I used on the ILI9341 display I got from this website (Note: I choose the 24-bit color 888 and get the 16-bit color 565, on this website there's no way to do otherwise):
http://www.barth-dev.de/online/rgb565-color-picker/
For example, I read the pixel color of the ILI9341 display and save it to a USB Disk, in a file, in BMP format. As the display operates with 16-bit or 18-bit, I have no way to retrieve 24-bit information directly from the GRAM memory.
#define BLACK_565 0x0000
#define BLUE_565 0x001F
#define RED_565 0xF800
#define GREEN_565 0x07E0
#define CYAN_565 0x07FF
#define MAGENTA_565 0xF81F
#define YELLOW_565 0xFFE0
#define WHITE_565 0xFFFF
#define LIGHTGREY_565 0xC618
#define ORANGE_565 0xFD20
#define GREY_565 0x8410
#define DARKGREY_565 0x2104
#define DARKBLUE_565 0x0010
#define DARKGREEN_565 0x03E0
#define DARKCYAN_565 0x03EF
#define DARKYELLOW_565 0x8C40
#define BLUESKY_565 0x047F
#define BROWN_565 0xC408
#define BLACK_888 0x000000
#define BLUE_888 0x0000FF
#define RED_888 0xFF0000
#define GREEN_888 0x04FF00
#define CYAN_888 0x00FFFB
#define MAGENTA_888 0xFF00FA
#define YELLOW_888 0xFBFF00
#define WHITE_888 0xFFFFFF
#define LIGHTGREY_888 0xC6C3C6
#define ORANGE_888 0xFFA500
#define GREY_888 0x808080
#define DARKGREY_888 0x202020
#define DARKBLUE_888 0x000080
#define DARKGREEN_888 0x007D00
#define DARKCYAN_888 0x007D7B
#define DARKYELLOW_888 0x898A00
#define BLUESKY_888 0x008CFF
#define BROWN_888 0xC08240
I did the test (using an STM32F407 uC) with an IF statement, but it can also be done with Select Case, or another form of comparison.
uint16_t buff1; // pixel color value read from GRAM
uint8_t buff2[3];
uint32_t color_buff; // to save to USB disk
if (buff1 == BLUE_565) color_buff = BLUE_888;
else if (buff1 == RED_565) color_buff = RED_888;
else if (buff1 == GREEN_565) color_buff = GREEN_888;
else if (buff1 == CYAN_565) color_buff = CYAN_888;
else if (buff1 == MAGENTA_565) color_buff = MAGENTA_888;
else if (buff1 == YELLOW_565) color_buff = YELLOW_888;
else if (buff1 == WHITE_565) color_buff = WHITE_888;
else if (buff1 == LIGHTGREY_565) color_buff = LIGHTGREY_888;
else if (buff1 == ORANGE_565) color_buff = ORANGE_888;
else if (buff1 == GREY_565) color_buff = GREY_888;
else if (buff1 == DARKGREY_565) color_buff = DARKGREY_888;
else if (buff1 == DARKBLUE_565) color_buff = DARKBLUE_888;
else if (buff1 == DARKGREEN_565) color_buff = DARKGREEN_888;
else if (buff1 == DARKCYAN_565) color_buff = DARKCYAN_888;
else if (buff1 == DARKYELLOW_565) color_buff = DARKYELLOW_888;
else if (buff1 == BLUESKY_565) color_buff = BLUESKY_888;
else if (buff1 == BROWN_565) color_buff = BROWN_888;
else color_buff = BLACK_888;
RGB separation for saving to 8-bit variables:
buff2[0] = color_buff; // Blue
buff2[1] = color_buff >> 8; // Green
buff2[2] = color_buff >> 16; // Red
