Decoding a .xwd image in Julia

I am trying to write a decoder for a .xwd image (X Window Dump), since ImageMagick is quite slow.
The only specifications I found are:
http://www.opensource.apple.com/source/X11/X11-0.40.80/xc/include/XWDFile.h?txt
https://formats.kaitai.io/xwd/index.html
From which I managed to read the header:
xwd_data = read(`xwd -id $id`)
function get_header(data)
args = [reinterpret(Int32, reverse(data[4*i-3:4*i]))[1] for i in 1:25]
xwd = XwdHeader(args...)
return xwd
end
struct XwdHeader
header_size::Int32
file_version::Int32
pixmap_format::Int32
pixmap_depth::Int32
pixmap_width::Int32
pixmap_height::Int32
xoffset::Int32
byte_order::Int32
bitmap_unit::Int32
bitmap_bit_order::Int32
bitmap_pad::Int32
bits_per_pixel::Int32
bytes_per_line::Int32
visual_class::Int32
red_mask::Int32
green_mask::Int32
blue_mask::Int32
bits_per_rgb::Int32
colormap_entries::Int32
ncolors::Int32
window_width::Int32
window_height::Int32
window_x::Int32
window_y::Int32
window_bdrwidth::Int32
end
and the colormap, which is stored in blocks of 12 bytes, most significant byte first (hence the reverse calls on a little-endian machine):
function read_colormap_entry(n, data, header)
offset = header.header_size + 1
poff = 12*n
px = Pixel(reinterpret(UInt32, reverse(data[offset+poff:offset+poff+3]))[1],
reinterpret(UInt16, reverse(data[offset+poff+4:offset+poff+5]))[1],
reinterpret(UInt16, reverse(data[offset+poff+6:offset+poff+7]))[1],
reinterpret(UInt16, reverse(data[offset+poff+8:offset+poff+9]))[1],
reinterpret(UInt8, data[offset+poff+10])[1],
reinterpret(UInt8, data[offset+poff+11])[1])
println("Pixel number ", px.entry_number >> 16)
println("R ", px.red >> 8)
println("G ", px.green >> 8)
println("B ", px.blue >> 8)
println("flags ", px.flags)
println("padding ",px.padding)
end
struct Pixel
entry_number::UInt32
red::UInt16
green::UInt16
blue::UInt16
flags::UInt8
padding::UInt8
end
julia> read_colormap_entry(0, data, header)
Pixel number 0
R 0
G 0
B 0
flags 7
padding 0
julia> read_colormap_entry(1, data, header)
Pixel number 1
R 1
G 1
B 1
flags 7
padding 0
julia> read_colormap_entry(2, data, header)
Pixel number 2
R 2
G 2
B 2
flags 7
padding 0
Now I have the actual image data stored in 4-byte blocks per pixel in the "Direct Color" visual class. Does anybody know how to extract the RGB values from this?
edit:
By playing around with the data I found out how to extract the R and G values
function read_pixel(i, j, data, header::XwdHeader)
w = header.window_width
h = header.window_height
offset = header.header_size + header.colormap_entries * 12 + 1
poff = 4*((i-1)*w + (j-1))
px = reinterpret(UInt32, reverse(data[offset+poff:offset+poff+3]))[1]
println("Px value ", px)
r = (px & header.red_mask) >> 16
g = (px & header.green_mask) >> 8
b = (px & header.blue_mask)
println("r ", r)
println("g ", g)
println("b ", b)
end
which gives the correct R and G values, but the B value should be non zero.
julia> read_pixel(31, 31, data, xwd_header)
Px value 741685248
r 53
g 56
b 0
I basically have no idea what I am doing with the color masks and the
bit-shifts. Can anyone explain this? Thanks!
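As a rough illustration of what the masks and shifts do, here is a minimal Python sketch. It is not taken from the XWD specification. It assumes the masks are 0xFF0000, 0x00FF00 and 0x0000FF (consistent with the shifts of 16 and 8 used above) and that the four raw bytes are 2C 35 38 00, which is what the printed value 741685248 (0x2C353800) implies if the bytes were reversed on a little-endian machine. The helper names are made up for this example. Each channel is the masked value shifted right by the number of trailing zero bits of its mask. The sketch also decodes the same bytes in both byte orders, since the header's byte_order field may say that the image data use a different order than the header itself; whether that applies to this file is not confirmed by the question.

```python
def mask_shift(mask):
    """Count the trailing zero bits of a channel mask."""
    shift = 0
    while mask and not (mask & 1):
        mask >>= 1
        shift += 1
    return shift


def extract_channel(pixel, mask):
    """Isolate one channel and move it down to bit 0."""
    return (pixel & mask) >> mask_shift(mask)


def decode_pixel(raw4, little_endian, masks):
    """Decode 4 raw bytes into (r, g, b) for the given byte order."""
    order = "little" if little_endian else "big"
    pixel = int.from_bytes(raw4, order)
    return tuple(extract_channel(pixel, m) for m in masks)


# Assumed masks and raw bytes (see the paragraph above).
MASKS = (0xFF0000, 0x00FF00, 0x0000FF)
RAW = bytes([0x2C, 0x35, 0x38, 0x00])

print(decode_pixel(RAW, False, MASKS))  # (53, 56, 0): same as the output above
print(decode_pixel(RAW, True, MASKS))   # (56, 53, 44): blue is nonzero
```

If byte_order in the header turns out to be 0 (least significant byte first), the second decoding would be the one to try. Note also that in a DirectColor visual the channel values may be indices into the colormap ramps rather than final intensities.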

Related

Compare 16 bpp to 32 bpp bitmap conversions

I got a 16 bpp bitmap that I converted to 32 bpp via code below:
void Rgb555ToRgb8(const UChar* bitmapData, UInt32 width, UInt32 height, UChar* buf)
{
UInt32 dst_bytes_per_row = width * 4;
UInt32 src_bytes_per_row = ((width * 16 + 31) / 32) * 4;
UInt16 red_mask = 0x7C00;
UInt16 green_mask = 0x3E0;
UInt16 blue_mask = 0x1F;
for (UInt32 row = 0; row < height; ++row)
{
UInt32 dstCol = 0, srcCol = 0;
do
{
UInt16 rgb = *(UInt16*)(bitmapData + row * src_bytes_per_row + srcCol);
UChar red_value = (rgb & red_mask) >> 10;
UChar green_value = (rgb & green_mask) >> 5;
UChar blue_value = (rgb & blue_mask);
buf[row*dst_bytes_per_row + dstCol] = blue_value << 3;
buf[row*dst_bytes_per_row + dstCol + 1] = green_value << 3;
buf[row*dst_bytes_per_row + dstCol + 2] = red_value << 3;
buf[row*dst_bytes_per_row + dstCol + 3] = rgb >> 15;
srcCol += 2;
dstCol += 4;
} while (srcCol < src_bytes_per_row);
}
}
Here is the conversion result: https://i.stack.imgur.com/1ajO7.png
I also tried to convert this image via GdiPlus:
Gdiplus::Bitmap* bmp = new Gdiplus::Bitmap(w,h,PixelFormat32bppRGB);
The resulting image is: (image not included)
Notice that the 2 results don't look exactly the same (e.g., the background in GdiPlus result is white). How can I modify my code to match GdiPlus result?
There are two issues that need to be addressed:
Unused bits
When moving from 5 bits of information to 8 bits of information you gain an additional 3 bits. As implemented, the code doesn't make use of that additional range, and is biased towards darker color components. This is an illustration of what blue_value << 3 actually does:
5 bits per channel 8 bits per channel
bbbbb -> bbbbb000
To address this, the least significant 3 bits need to grow as the channel value gets higher. A simple (yet somewhat inaccurate) approach would be to just copy the most significant 3 bits down to the least significant 3 bits, i.e.
buf[row*dst_bytes_per_row + dstCol] = (blue_value << 3) | (blue_value >> 2);
buf[row*dst_bytes_per_row + dstCol + 1] = (green_value << 3) | (green_value >> 2);
buf[row*dst_bytes_per_row + dstCol + 2] = (red_value << 3) | (red_value >> 2);
The exact mapping would be a bit more involved, something like
blue_value = static_cast<UChar>((blue_value * 255.0) / 31.0 + 0.5);
That converts from 5 bits to the respective 8 bit value that's nearest to the ideal value, including the 4 values that were 1/255th off in the bit-shifting solution above.
If you opt for the latter, you can build a lookup table that stores the mapped values. This table is only 32 entries of one byte each, so it fits into a single cache-line.
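To make the difference between the two mappings concrete, here is a small Python sketch (the function and table names are illustrative, not part of the original code). It builds the 32-entry lookup table using integer rounding, which is equivalent to the floating-point formula above for inputs 0 to 31, and lists the inputs where bit replication disagrees with it.

```python
def expand5_replicate(v):
    """5 -> 8 bits by copying the top 3 bits into the low 3 bits."""
    return (v << 3) | (v >> 2)


def expand5_exact(v):
    """5 -> 8 bits, rounded to nearest: round(v * 255 / 31)."""
    return (v * 255 + 15) // 31


# 32-entry lookup table, one byte per entry.
LUT = bytes(expand5_exact(v) for v in range(32))

# Inputs where the cheap bit trick is off by one.
differing = [v for v in range(32) if expand5_replicate(v) != LUT[v]]
print(differing)  # [3, 7, 24, 28]
```

Both mappings send 0 to 0 and 31 to 255, which is the property the plain `<< 3` shift lacks.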
Alpha channel
Assuming that the MSB of your source image is indeed interpreted as an alpha value, you're going to have to move that into the destination as well. Since the source is only 1 bit of information, the raw transformation is trivial:
buf[row*dst_bytes_per_row + dstCol + 3] = rgb & (1 << 15) ? 255 : 0;
That may or may not be all that's needed. Windows assumes premultiplied alpha, i.e. the stored values of the color channels must be premultiplied by the alpha value (see BLENDFUNCTION for reference).
If the alpha value is 255, the color channel values are already correct. If the alpha value is 0, all color channels need to be multiplied by zero (or simply set to 0). The translation doesn't produce any other alpha values.
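Put together, the per-pixel conversion could look like the following Python sketch. It assumes the layout used in the question (bit 15 alpha, then 5 bits each of red, green, blue) and a B, G, R, A byte order in the destination; the function names are hypothetical, and the bit-replication expansion is used only to keep the example short.

```python
def expand5(v):
    """Expand a 5-bit channel to 8 bits by bit replication."""
    return (v << 3) | (v >> 2)


def rgb1555_to_bgra(rgb):
    """Convert one 16-bit pixel to 4 bytes of premultiplied BGRA."""
    alpha = 255 if rgb & 0x8000 else 0
    if alpha == 0:
        # Premultiplied alpha: fully transparent pixels carry no color.
        return bytes([0, 0, 0, 0])
    red = expand5((rgb >> 10) & 0x1F)
    green = expand5((rgb >> 5) & 0x1F)
    blue = expand5(rgb & 0x1F)
    return bytes([blue, green, red, alpha])


print(list(rgb1555_to_bgra(0xFC00)))  # opaque pure red: [0, 0, 255, 255]
print(list(rgb1555_to_bgra(0x7FFF)))  # alpha bit clear:  [0, 0, 0, 0]
```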

Why image values are different in Matlab and OpenCV?

I have an original image:
I then read it, create a PSF, and blur it in Matlab:
lenawords1=imread('lenawords.bmp');
%create PSF
sigma=6;
PSFgauss=fspecial('gaussian', 8*sigma+1, sigma);
%blur it
lenablur1=imfilter(lenawords1, PSFgauss, 'conv');
lenablurgray1=mat2gray(lenablur1);
PSFgaussgray = mat2gray(PSFgauss);
and I saved the blurred image:
imwrite(lenablurgray1, 'lenablur.bmp');
When I display some values in it, I get
disp(lenablurgray1(91:93, 71:75))
0.5556 0.5778 0.6000 0.6222 0.6444
0.6000 0.6444 0.6667 0.6889 0.6889
0.6444 0.6889 0.7111 0.7333 0.7333
I then open that blurred image in OpenCV and display its values at the same indices:
Mat img = imread("lenablur.bmp");
for (int r = 91; r < 94; r++) {
for (int c = 71; c < 76; c++) {
cout << img.at<double>(r, c) << " ";
}
cout << endl;
}
cout << endl;
The result I get doesn't match the values above:
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
Why is this?
EDIT: img.at<unsigned int>(r, c) gives
1903260029 1533437542 ...
2004318088 ...
....
If I save the blurred image as a png file:
imwrite(lenablurgray1, 'lenablur.png');
Then when I read it in OpenCV:
Mat img = imread("lenablur.png");
img.convertTo(img, CV_64F);
then img.at<double>(r, c) gives
17 11 11 11 6
17 11 11 11 6
17 11 11 11 11
which still doesn't match the values from Matlab
EDIT2: I now see the values are wrong for the kernel. In Matlab, I get
imwrite(PSFgaussgray, 'PSFgauss.bmp');
disp(PSFgaussgray(7:9, 7:9)*256)
.0316 .0513 .0812
.0513 ...
...
whereas in OpenCV:
Mat kernel = imread("PSFgauss.bmp");
cvtColor(kernel, kernel, cv::COLOR_BGR2GRAY);
kernel.convertTo(kernel, CV_64F);
for (int r = 6; r < 9 ; r++) {
for (int c = 6; c < 9; c++) {
cout << kernel.at<double>(r, c) << " ";
}
cout << endl;
}
cout << endl;
The result I get doesn't match the values above:
0 0 0
0 0 0
0 0 0
To understand the discrepancy you see you need to know how MATLAB saves images to a BMP or PNG file, and how OpenCV reads it.
MATLAB assumes, if the image is of type double as in this case, that its intensity range is [0,1]. That is, pixel values below 0 and above 1 are not expected. Such images are multiplied by 255 and converted to 8-bit integers (which have a range of [0,255]) when saved to a file.
Thus, if
>> disp(lenablurgray1(91:93, 71:75))
0.5556 0.5778 0.6000 0.6222 0.6444
0.6000 0.6444 0.6667 0.6889 0.6889
0.6444 0.6889 0.7111 0.7333 0.7333
what is saved is
>> uint8( lenablurgray1(91:93, 71:75) * 255 )
142 147 153 159 164
153 164 170 176 176
164 176 181 187 187
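The same scaling can be reproduced outside MATLAB. The following Python sketch (the helper name to_uint8 is made up here) clamps to [0,1], multiplies by 255 and rounds to the nearest integer, which is assumed to match what uint8() does for these values.

```python
def to_uint8(x):
    """Scale a [0, 1] intensity to 0..255, rounding to nearest."""
    x = min(max(x, 0.0), 1.0)
    return int(x * 255 + 0.5)


values = [
    [0.5556, 0.5778, 0.6000, 0.6222, 0.6444],
    [0.6000, 0.6444, 0.6667, 0.6889, 0.6889],
    [0.6444, 0.6889, 0.7111, 0.7333, 0.7333],
]

saved = [[to_uint8(v) for v in row] for row in values]
for row in saved:
    print(*row)
```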
Next, OpenCV will read this file as RGB (or rather BGR, OpenCV's awkward color order) and as 8-bit unsigned integer (CV_8U). To display these data, either extract one of the color channels, or convert to gray value using
cvtColor(img, img, cv::COLOR_BGR2GRAY);
Then, read the 8-bit unsigned values with
img.at<uchar>(r, c)
If you read them with img.at<double>(), groups of 8 consecutive pixels will be regarded as a single pixel value (a double has 8 bytes).
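The effect of reading 8-bit data through a double accessor can be imitated in Python with the struct module. This is only an analogy to what img.at<double>() does; it assumes a little-endian machine with IEEE-754 doubles and uses eight of the byte values from the table above.

```python
import struct

# Eight consecutive 8-bit pixels as they would sit in memory.
pixels = bytes([142, 147, 153, 159, 164, 153, 164, 170])

# Correct view: one unsigned byte per pixel.
as_uchar = list(pixels)

# Wrong view: the same 8 bytes interpreted as a single double.
(as_double,) = struct.unpack("<d", pixels)

print(as_uchar)
print(as_double)  # a meaningless value with a tiny magnitude
```

The second print shows a number that has nothing to do with any individual pixel, which is why the values read through the double accessor look like garbage or zeros.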
Next, remember that MATLAB's indexing starts at 1, whereas OpenCV's starts at 0. So your loop should look like this:
for (int r = 90; r < 93; r++) { // matches MATLAB's 91:93 indexing
for (int c = 70; c < 75; c++) { // matches MATLAB's 71:75 indexing
cout << (int)img.at<uchar>(r, c) << " ";
}
cout << '\n';
}
cout << '\n';
Finally, in the case of your kernel, note that its values, when multiplied by 255, are still much smaller than unity: .0316 .0513 .0812. These values will be written as 0 to the BMP or PNG file. If you want to save these values, you need to scale the kernel so its maximum value is 1:
PSFgauss = PSFgauss / max(PSFgauss(:));
imwrite(PSFgauss, 'PSFgauss.bmp');
(Note that this kernel is already a grey-value image, you don't need to use mat2gray on it.)

How do I get HSV values of an average pixel of an image?

In this code
im = Vips::Image.new_from_file "some.jpg"
r = (im * [1,0,0]).avg
g = (im * [0,1,0]).avg
b = (im * [0,0,1]).avg
p [r,g,b] # => [57.1024, 53.818933333333334, 51.9258]
p Vips::Image.sRGB2HSV [r,g,b]
the last line throws
/ruby-vips-1.0.3/lib/vips/argument.rb:154:in `set_property': invalid argument Array (expect #<Class:0x007fbd7c923600>) (ArgumentError)`
P.S.: for now I took and refactored the ChunkyPNG implementation:
def to_hsv r, g, b
r, g, b = [r, g, b].map{ |component| component.fdiv 255 }
min, max = [r, g, b].minmax
chroma = max - min
[
60.0 * ( chroma.zero? ? 0 : case max
when r ; (g - b) / chroma
when g ; (b - r) / chroma + 2
when b ; (r - g) / chroma + 4
else 0
end % 6 ),
chroma / max,
max,
]
end
Pixel averaging should really be in a linear colorspace. XYZ is an easy one, but scRGB would work well too. Once you have a 1x1 pixel image, convert to HSV and read out the value.
#!/usr/bin/ruby
require 'vips'
im = Vips::Image.new_from_file ARGV[0]
# xyz colourspace is linear, ie. the value in each channel is proportional to
# the number of photons of that frequency
im = im.colourspace "xyz"
# 'shrink' is a fast box filter, so each output pixel is the simple average of
# the corresponding input pixels ... this will shrink the whole image to a
# single pixel
im = im.shrink im.width, im.height
# now convert the one pixel image to hsv and read out the values
im = im.colourspace "hsv"
h, s, v = im.getpoint 0, 0
puts "h = #{h}"
puts "s = #{s}"
puts "v = #{v}"
I wouldn't use HSV myself, LCh is generally much better.
https://en.wikipedia.org/wiki/Lab_color_space#Cylindrical_representation:_CIELCh_or_CIEHLC
For LCh, just change the end to:
im = im.colourspace "lch"
l, c, h = im.getpoint 0, 0
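To see why the averaging space matters, here is a small Python sketch for a single channel. It does not use vips and does not go through XYZ; instead it assumes the standard sRGB transfer function to convert to linear light and back, and the function names are made up for the example.

```python
def srgb_to_linear(c):
    """sRGB-encoded value in [0, 1] -> linear light in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l):
    """Linear light in [0, 1] -> sRGB-encoded value in [0, 1]."""
    if l <= 0.0031308:
        return 12.92 * l
    return 1.055 * l ** (1 / 2.4) - 0.055


def average_srgb(values):
    """Average 8-bit sRGB values of one channel in linear light."""
    linear = [srgb_to_linear(v / 255) for v in values]
    return 255 * linear_to_srgb(sum(linear) / len(linear))


naive = (0 + 255) / 2            # 127.5, averaged on encoded values
linear = average_srgb([0, 255])  # noticeably brighter, close to 187.5
print(naive, linear)
```

Averaging a black and a white pixel on the encoded values gives a mid-grey that is too dark; averaging in linear light gives the brighter value a viewer would actually perceive from a fine checkerboard.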
I realised that it is obviously wrong to calculate the average hue as an arithmetic average, so I solved it by adding vectors whose length equals the saturation. I didn't find how to iterate over pixels in vips, so I used chunky_png as a crutch:
require "vips"
require "chunky_png"
def get_average_hsv_by_filename filename
im = Vips::Image.new filename
im.write_to_file "temp.png"
y, x = 0, 0
ChunkyPNG::Canvas.from_file("temp.png").to_rgba_stream.unpack("N*").each do |rgba|
h, s, v = ChunkyPNG::Color.to_hsv(rgba)
a = h * Math::PI / 180
y += Math::sin(a) * s
x += Math::cos(a) * s
end
h = Math::atan2(y, x) / Math::PI * 180
_, s, v = im.colourspace("hsv").bandsplit.map(&:avg)
[h, s, v]
end
For large images I used .resize that seems to inflict only up to ~2% error when resizing down to 10000 square pixels area with default kernel.
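The vector-sum idea itself is independent of any image library. Here is a minimal Python sketch of it (the function name is hypothetical): each pixel contributes a vector with angle equal to its hue and length equal to its saturation, and the mean hue is the angle of the summed vector.

```python
import math


def mean_hue(hs_pairs):
    """Saturation-weighted circular mean of hues given in degrees."""
    x = sum(s * math.cos(math.radians(h)) for h, s in hs_pairs)
    y = sum(s * math.sin(math.radians(h)) for h, s in hs_pairs)
    # atan2 returns -180..180 degrees; fold into 0..360.
    return math.degrees(math.atan2(y, x)) % 360.0


# 350 and 10 degrees are both reddish; their arithmetic mean is 180
# (cyan), whereas the circular mean stays at red (0, equivalently 360).
wrapped = mean_hue([(350, 1.0), (10, 1.0)])
quarter = mean_hue([(0, 1.0), (90, 1.0)])
print(wrapped, quarter)
```

If the vectors cancel out exactly (for example equal amounts of opposite hues), the summed vector has zero length and the resulting hue is not meaningful.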

zero padding zoom fourier

I'm trying to implement a zero padding zoom using fourier.
I'm using octave and I can't add zeros around my matrix.
The result (after inverse fourier transformation) is very dark.
My goal:
My code:
I=double(imread('montagne.jpeg'));
I = I/255;
%%scaling factor
facteur = 4;
[m,n,r] = size(I);
H=fft2(I);
H = fftshift(H);
%%the new image
B = zeros(facteur*m,facteur*n,3);
%%try to add zeros around my matrix
%% r : rgb channels
for r=1:3
for i=1:m
for j=1:n
B(i+((facteur*m)/4),j+((facteur*n)/4),r) = H(i,j,r);
end
end
end
%% show the image
B= ifftshift(B);
final = ifft2(B);
figure;
imshow(final);
Any suggestions ?
Don't use for-loops to copy matrices. I would try something like:
I = im2double (imread ('IMG_2793.JPG'));
facteur = 4; %%scaling factor
[m, n, r] = size (I);
H = fftshift (fft2 (I));
B = zeros(facteur*m, facteur*n, 3);
ms = round (m * (facteur/2 - 0.5));
ns = round (n * (facteur/2 - 0.5));
B(ms:(m+ms-1), ns:(n+ns-1), :) = H;
final = abs (ifft2 (ifftshift (B)));
figure;
imshow(final * facteur^2);
EDIT:
Btw, there is also the function padarray which does what you want:
octave:1> padarray (magic(3), [1, 1])
ans =
0 0 0 0 0
0 8 1 6 0
0 3 5 7 0
0 4 9 2 0
0 0 0 0 0
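For reference, the same zero-padding idea in one dimension can be sketched in plain Python with a naive DFT. This is an illustration under simplifying assumptions, not a translation of the Octave code: instead of calling fftshift, it copies the non-negative frequencies to the start and the negative frequencies to the end of the longer spectrum, and it scales by the zoom factor (in 2-D this becomes the factor squared, as in imshow(final * facteur^2) above). The function names are made up.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def zero_pad_zoom(signal, factor):
    """Upsample a 1-D signal by zero padding its spectrum."""
    n = len(signal)
    m = n * factor
    spectrum = dft(signal)
    padded = [0j] * m
    pos = (n + 1) // 2  # number of non-negative frequency bins
    padded[:pos] = spectrum[:pos]            # low positive frequencies
    padded[m - (n - pos):] = spectrum[pos:]  # negative frequencies
    # Multiply by factor to undo the 1/m normalisation of the inverse.
    return [factor * v.real for v in dft(padded, inverse=True)]


signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
zoomed = zero_pad_zoom(signal, 4)
print(len(zoomed))  # 32
```

Without the final scaling the result is darker by the zoom factor, which matches the "very dark" image described in the question.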

Mata: Create a matrix that contains averages of all elements of 3 matrices

Define three row vectors,
A = (1,2,3)
B = (10,20,30,40)
C = (100,200,300,400,500)
I want to construct a new matrix D which will have 3x4x5 = 60 elements and contain the averages of these elements, as illustrated below:
D =
(1+10+100)/3, (1+10+200)/3,…, (1+10+ 500)/3 \
(1+20+100)/3, (1+20+200)/3,…, (1+20+ 500)/3 \
(1+30+100)/3, (1+30+200)/3,…, (1+30+ 500)/3 \
(1+40+100)/3, (1+40+200)/3,…, (1+40+ 500)/3 \
(2+10+100)/3, (2+10+200)/3,…, (2+10+ 500)/3 \
(2+20+100)/3, (2+20+200)/3,…, (2+20+ 500)/3 \
(2+30+100)/3, (2+30+200)/3,…, (2+30+ 500)/3 \
(2+40+100)/3, (2+40+200)/3,…, (2+40+ 500)/3 \
(3+10+100)/3, (3+10+200)/3,…, (3+10+ 500)/3 \
(3+20+100)/3, (3+20+200)/3,…, (3+20+ 500)/3 \
(3+30+100)/3, (3+30+200)/3,…, (3+30+ 500)/3 \
(3+40+100)/3, (3+40+200)/3,…, (3+40+ 500)/3 \
The way it is set up in this example it will be a 12x5 matrix, but I am fine if it is a 1X60 vector or 60X1 vector.
How to do this efficiently in Mata? I am new to Mata and I had this running in Stata using multiple forval loops (in this case, there would be 3 forval loops). But this becomes very time consuming as I have up to 8 row vectors and about 120 elements in each of them.
I figured that I can use for loops in Mata and it will be much faster, but I believe if I can do this as a matrix manipulation instead of using for loops then it will be even faster. The problem is I am having a hard time visualizing how to write such a program (or if it's even possible) and any help would be highly appreciated.
The clever solution by @AspenChen offers huge speed gains over for loops, as shown with some testing:
clear all
set more off
mata
timer_clear()
//----- change data -----
fa = 250
fb = fa + 1
fc = fa + 2
//----- Method 1 -----
timer_on(1)
A = (1..fa) // 1 x fa
B = (1..fb)*10 // 1 x fb
C = (1..fc)*100 // 1 x fc
F = J(1, cols(A) * cols(B) * cols(C), .)
col = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
col++
F[1,col] = A[1,i] + B[1,j] + C[1,k]
}
}
}
timer_off(1)
//----- Method 2 (Aspen Chen) -----
timer_on(2)
A = (1::fa) // fa x 1
B = (1::fb)*10 // fb x 1
C = (1::fc)*100 // fc x 1
// tensor sum for A and B
a = J(1,rows(B),1) // 1 x fb with values of 1
b = J(1,rows(A),1) // 1 x fa with values of 1
T = (A#a) + (b#B)' // fa x fb
T = vec(T) // fa*fb x 1
// tensor sum for T and C
c = J(1,rows(T),1) // 1 x fa*fb with values of 1
t = J(1,rows(C),1) // 1 x fc with values of 1
T = (C#c) + (t#T)' // fc x fa*fb
timer_off(2)
timer()
end
Resulting in:
timer report
1. 8.78 / 1 = 8.776
2. .803 / 1 = .803
If the original poster still wants to use for loops due to the large number of elements that will be compared, (s)he can use something along the lines of:
<snip>
larger = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
larger = larger + (A[1,i] + B[1,j] + C[1,k] > 7)
}
}
}
larger
<snip>
Edit
Further tests with for loops only:
clear all
set more off
mata
timer_clear()
//----- change data -----
fa = 500
fb = fa + 1
fc = fa + 2
//----- Method 1 -----
timer_on(1)
A = (1..fa) // 1 x fa
B = (1..fb)*10 // 1 x fb
C = (1..fc)*100 // 1 x fc
larger = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
larger = larger + (A[1,i] + B[1,j] + C[1,k] > 7)
}
}
}
larger
timer_off(1)
//----- Method 2 (ec27) -----
timer_on(2)
A = (1..fa) // 1 x fa
B = (1..fb)*10 // 1 x fb
C = (1..fc)*100 // 1 x fc
larger = 0
for (i=1; i<=cols(A); i++) {
for (j=1; j<=cols(B); j++) {
for (k=1; k<=cols(C); k++) {
placebo = A[1,i] + B[1,j] + C[1,k]
if (placebo > 7) larger = larger + 1
}
}
}
larger
timer_off(2)
timer()
end
For the smaller example, you are basically asking for the summation version of the Kronecker Product. I found this Matlab discussion thread on the subject, in which it is referred to as the Tensor Sum (did not see this phrase used very often).
Here's a quick attempt to replicate the operation in Mata. Not a very neat code, so feel free to edit or correct.
clear
mata
// input data
A=(1,2,3)' // 3x1
B=(10,20,30,40)' // 4x1
C=(100,200,300,400,500)' // 5x1
// tensor sum for A and B
a=J(1,4,1) // 1x4 with values of 1
b=J(1,3,1) // 1x3 with values of 1
T=(A#a)+(b#B)'
T // 3x4
T=vec(T) // 12x1
// tensor sum for T and C
c=J(1,12,1) // 1x12 with values of 1
t=J(1,5,1) // 1x5 with values of 1
T=(C#c)+(t#T)'
// divide by 3
T=T/3
T' // transposed just for better display
end
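As a cross-check of what the tensor sum should produce, the same 60 averages can be generated in a few lines of Python (not Mata; the function name is made up for this example). itertools.product varies the last vector fastest, which corresponds to the 12x5 layout in the question read row by row.

```python
from itertools import product


def combination_means(*vectors):
    """Mean of every combination of one element from each vector."""
    return [sum(combo) / len(combo) for combo in product(*vectors)]


A = [1, 2, 3]
B = [10, 20, 30, 40]
C = [100, 200, 300, 400, 500]

D = combination_means(A, B, C)
print(len(D))       # 60
print(D[0], D[-1])  # 37.0 181.0
```

This generalises to any number of vectors, but like the loop version it touches every combination, so with 8 vectors of about 120 elements each the result would be far too large to hold in memory.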
