Windows StretchBlt API performance

I timed a DDB drawing operation that uses multiple StretchBlt and StretchDIBits calls.
I found that the time to complete increases or decreases in proportion to the destination window size.
With a 900x600 window it takes around 5 ms, but with 1920x1080 it takes as long as 55 ms (the source image is 1280x640).
It seems the Stretch.. APIs don't use any hardware acceleration.
The source image (actually a temporary drawing canvas) is created with CreateDIBSection because I need the resulting (stretched and merged) bitmap's pixel data for every frame drawn.
Let's assume Windows GDI is hopeless. What is a promising alternative?
I considered D3D, and D2D with WIC (write to a WIC bitmap, draw it with D2D, then read the pixel data back from the WIC bitmap).
I planned to try the D2D-with-WIC approach because I will need extensive text drawing features soon.
But it seems WIC is not that promising: What is the most effective pixel format for WIC bitmap processing?

I've implemented the D2D + WIC routine today. The test results are really good.
With my previous GDI StretchDIBits version, it took 20-60 ms to draw a 1280x640 DDB into a 1920x1080 window. After switching to Direct2D + WIC, it usually takes under 5 ms, and the picture quality looks better.
I used an ID2D1HwndRenderTarget together with a WIC bitmap render target, because I need to read/write raw pixel data.
The HwndRenderTarget is only used for screen painting (WM_PAINT).
Its main advantage is that the destination window size doesn't affect drawing performance.
The WIC bitmap render target is used as a temporary drawing canvas (like a memory DC in GDI drawing). We can create it over a WIC bitmap object (like a GDI DIBSection), and we can read/write raw pixel data from/to that WIC bitmap at any time. It's also very fast. As a side note, the somewhat similar D3D GetFrontBufferData call is really slow.
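For reference, creating the WIC bitmap and wrapping it in a render target looks roughly like this. This is a sketch of my setup rather than a drop-in snippet: the *Ptr names are the same kind of COM smart-pointer typedefs used below, and m_wicFactory / m_d2dFactory are assumed to have been created elsewhere with CoCreateInstance and D2D1CreateFactory.
IWICBitmapPtr m_wicCanvas;               // temporary drawing canvas (like a DIBSection)
ID2D1RenderTargetPtr m_wicRenderTarget;

// 32bppPBGRA (premultiplied BGRA) is the WIC pixel format Direct2D expects here.
HRESULT hr = m_wicFactory->CreateBitmap(
    width, height,
    GUID_WICPixelFormat32bppPBGRA,
    WICBitmapCacheOnLoad,
    &m_wicCanvas);

if (SUCCEEDED(hr))
{
    hr = m_d2dFactory->CreateWicBitmapRenderTarget(
        m_wicCanvas,
        D2D1::RenderTargetProperties(
            D2D1_RENDER_TARGET_TYPE_DEFAULT,
            D2D1::PixelFormat(DXGI_FORMAT_B8G8R8A8_UNORM, D2D1_ALPHA_MODE_PREMULTIPLIED)),
        &m_wicRenderTarget);
}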
Actual pixel I/O is done through the IWICBitmap and IWICBitmapLock interfaces.
Writing:
IWICBitmapPtr m_wicRemote;
...
const uint8* image = ...;
...
WICRect rcLock = { 0, 0, width, height };
IWICBitmapLockPtr wicLock;
hr = m_wicRemote->Lock(&rcLock, WICBitmapLockWrite, &wicLock);
if (SUCCEEDED(hr))
{
    UINT cbBufferSize = 0;
    BYTE *pv = NULL;
    hr = wicLock->GetDataPointer(&cbBufferSize, &pv);
    if (SUCCEEDED(hr))
    {
        memcpy(pv, image, cbBufferSize);
    }
}

m_wicRenderTarget->BeginDraw();
m_wicRenderTarget->SetTransform(D2D1::Matrix3x2F::Identity());

ID2D1BitmapPtr d2dBitmap;
hr = m_wicRenderTarget->CreateBitmapFromWicBitmap(m_wicRemote, &d2dBitmap.GetInterfacePtr());
if (SUCCEEDED(hr))
{
    float cw = (renderTargetSize.width / 2);
    float ch = renderTargetSize.height;
    float x, y, w, h;
    FitFrameToCenter(cw, ch, (float)width, (float)height, x, y, w, h);
    m_wicRenderTarget->DrawBitmap(d2dBitmap, D2D1::RectF(x, y, x + w, y + h));
}
m_wicRenderTarget->EndDraw();
Reading:
IWICBitmapPtr m_wicCanvas;
IWICBitmapLockPtr m_wicLockedData;
...
UINT width, height;
HRESULT hr = m_wicCanvas->GetSize(&width, &height);
if (SUCCEEDED(hr))
{
    WICRect rcLock = { 0, 0, width, height };
    hr = m_wicCanvas->Lock(&rcLock, WICBitmapLockRead, &m_wicLockedData);
    if (SUCCEEDED(hr))
    {
        UINT cbBufferSize = 0;
        BYTE *pv = NULL;
        hr = m_wicLockedData->GetDataPointer(&cbBufferSize, &pv);
        if (SUCCEEDED(hr))
        {
            return pv; // return data pointer
                       // need to Release m_wicLockedData after reading is done
        }
    }
}
Drawing:
ID2D1HwndRenderTargetPtr m_renderTarget;
...
D2D1_SIZE_F renderTargetSize = m_renderTarget->GetSize();

m_renderTarget->BeginDraw();
m_renderTarget->SetTransform(D2D1::Matrix3x2F::Identity());
m_renderTarget->Clear(D2D1::ColorF(D2D1::ColorF::Black));

ID2D1BitmapPtr d2dBitmap;
hr = m_renderTarget->CreateBitmapFromWicBitmap(m_wicCanvas, &d2dBitmap.GetInterfacePtr());
if (SUCCEEDED(hr))
{
    UINT width, height;
    hr = m_wicCanvas->GetSize(&width, &height);
    if (SUCCEEDED(hr))
    {
        float x, y, w, h;
        FitFrameToCenter(renderTargetSize.width, renderTargetSize.height, (float)width, (float)height, x, y, w, h);
        m_renderTarget->DrawBitmap(d2dBitmap, D2D1::RectF(x, y, x + w, y + h));
    }
}
m_renderTarget->EndDraw();
In my opinion, the GDI Stretch.. APIs are practically useless on Windows 7 and later for performance-sensitive applications.
Also note that, unlike in Direct3D, basic graphics operations such as text drawing and line drawing are really simple in Direct2D.
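To illustrate that last point, here is a minimal sketch of line and text drawing against the same ID2D1HwndRenderTarget. The smart-pointer typedefs follow the same style as above, m_dwriteFactory is an IDWriteFactory assumed to have been created elsewhere with DWriteCreateFactory, and the coordinates and strings are purely illustrative.
// Sketch: one line and one string, drawn with plain Direct2D / DirectWrite calls.
ID2D1SolidColorBrushPtr brush;
IDWriteTextFormatPtr textFormat;

m_renderTarget->CreateSolidColorBrush(D2D1::ColorF(D2D1::ColorF::White), &brush);
m_dwriteFactory->CreateTextFormat(L"Segoe UI", NULL,
                                  DWRITE_FONT_WEIGHT_NORMAL, DWRITE_FONT_STYLE_NORMAL,
                                  DWRITE_FONT_STRETCH_NORMAL, 24.0f, L"en-us", &textFormat);

static const WCHAR msg[] = L"Hello, Direct2D";

m_renderTarget->BeginDraw();
m_renderTarget->DrawLine(D2D1::Point2F(0.0f, 0.0f), D2D1::Point2F(200.0f, 100.0f), brush, 2.0f);
m_renderTarget->DrawText(msg, ARRAYSIZE(msg) - 1, textFormat,
                         D2D1::RectF(10.0f, 10.0f, 400.0f, 50.0f), brush);
m_renderTarget->EndDraw();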

Related

What is the best way to draw 4bpp 2D tiles with multiple palettes?

I'm creating a generic SNES tilemap editor (similar to NES Screen Tool), which means I'm drawing a lot of 4bpp tiles. However, my graphics loop takes too long to run, even with CachedBitmaps, which can't have their palettes changed; I may need to switch between 8 palettes. I can deal with the SNES format and sizes, but I'm struggling with the Windows side.
// basically the entire graphics drawing routine
case WM_PAINT: {
    PAINTSTRUCT ps;
    HDC hdc = BeginPaint(hwnd, &ps);
    Gdiplus::Graphics graphics(hdc);
    graphics.Clear(ARGB1555toARGB8888(CGRAM[0])); // convert 1st 15-bit CGRAM color to 32-bit & clear bkgd

    // tileset2[i]->SetPalette(colorpalette); // called in tileset loading to test 1 palette
    for(uint16_t i = 0; i < 1024; i++){
        tilesetX[i] = new Gdiplus::CachedBitmap(tileset2[i], &graphics);
    }

    /* struct SNES_Tile {
        uint16_t tileIndex : 10;
        uint16_t palette   : 3;
        uint16_t priority  : 1;  // (irrelevant for this project)
        uint16_t horzFlip  : 1;
        uint16_t vertFlip  : 1;
    }; */

    // I can see each individual tile being drawn
    for(int y = 0; y < 32; y++){
        for(int x = 0; x < 32; x++){
            // assume tilemap is set to 32x32, and not 64x32 or 32x64 or 64x64
            graphics.DrawCachedBitmap(tilesetX[BG2[y * 32 + x] & 0x03FF], x * BG2CHRSize, y * BG2CHRSize);
            // BG2[y * 32 + x] & 0x03FF : get tile index from VRAM and strip attributes
            // tilesetX[...] : get CachedBitmap to draw
        }
    }
    EndPaint(hwnd, &ps);
    break;
}
I am early enough in my program that rewriting the entire graphics routine wouldn't be too much of a hassle.
Should I give up on GDI+ and switch to Direct2D or something else? Is there a faster way to draw 4bpp bitmaps without having to create a copy for each palette?
EDIT:
The reason my graphics drawing routine was so slow was that I was drawing directly to the screen. It is much faster to draw to a separate bitmap as a buffer and then draw the buffer to the screen.
Updating the tile's palette when drawing to the buffer results in perfectly reasonable speeds.
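For reference, a rough sketch of what that buffered WM_PAINT looks like. This is illustrative only: drawTile() is a hypothetical stand-in for whatever draws one tile with its current palette, and the buffer size assumes the 32x32 tilemap from the code above.
// Sketch: draw all tiles into an off-screen Gdiplus::Bitmap, then blit that one bitmap to the window.
case WM_PAINT: {
    PAINTSTRUCT ps;
    HDC hdc = BeginPaint(hwnd, &ps);

    Gdiplus::Bitmap buffer(32 * BG2CHRSize, 32 * BG2CHRSize, PixelFormat32bppARGB);
    Gdiplus::Graphics bufferGfx(&buffer);
    bufferGfx.Clear(ARGB1555toARGB8888(CGRAM[0]));

    for (int y = 0; y < 32; y++) {
        for (int x = 0; x < 32; x++) {
            // drawTile(): hypothetical helper that renders one tile with its palette into the buffer
            drawTile(bufferGfx, BG2[y * 32 + x], x * BG2CHRSize, y * BG2CHRSize);
        }
    }

    Gdiplus::Graphics screenGfx(hdc);
    screenGfx.DrawImage(&buffer, 0, 0);   // single blit to the window

    EndPaint(hwnd, &ps);
    break;
}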

How to determine horizontal and vertical extents in device units with SetViewportExtEx() and printer?

I am experimenting with using the Windows GDI API for printing and have been doing a few experiments to attempt to understand the translation and how window and viewport extents work.
Examples I have found use GetDeviceCaps() to get the HORZRES and VERTRES dimensions (despite the known fact that they can be unreliable and inaccurate) and then use these values with SetViewportExtEx(); however, they divide the values returned by GetDeviceCaps() by two.
Why are the cxpage and cypage values halved, and how can I predict the values to use and their effect on the printed output? Is this due to using MM_ISOTROPIC as the mapping mode?
Examples use something like the following code:
int cxpage = GetDeviceCaps (hDC, HORZRES);
int cypage = GetDeviceCaps (hDC, VERTRES);
SetMapMode (hDC, MM_ISOTROPIC);
SetWindowExtEx(hDC, 1500, 1500, NULL);
SetViewportExtEx(hDC, cxpage/2, cypage/2, NULL);
SetViewportOrgEx(hDC, 0, 0, NULL);
In my actual test program I have the following function to print a page when my main Windows message handler sees the message IDM_PRINT generated when the user selects Print from the File menu of the test application. The handler uses PrintDlg() to get a handle to a Device Context (hDC) then calls this function to exercise the printing.
int PrintMyPages (HDC hDC)
{
    int cxpage = GetDeviceCaps (hDC, HORZRES);
    int cypage = GetDeviceCaps (hDC, VERTRES);

    // When MM_ISOTROPIC mode is set, an application must call the
    // SetWindowExtEx function before it calls SetViewportExtEx. Note that
    // for the MM_ISOTROPIC mode certain portions of a nonsquare screen may
    // not be available for display because the logical units on both axes
    // represent equal physical distances.
    SetMapMode (hDC, MM_ISOTROPIC);

    // Since mapping mode is MM_ISOTROPIC we need to specify the extents of the
    // window and the viewport we are using to see the window in order to establish
    // the proper translation between window and viewport coordinates.
    SetWindowExtEx(hDC, 1500, 1500, NULL);
    SetViewportExtEx(hDC, cxpage/2, cypage/2, NULL);
    SetViewportOrgEx(hDC, 0, 0, NULL);

    // figure out the page size in logical units for the loop that is printing
    // out the pages of output. we must do this after setting up our window and
    // viewport extents so Windows will calculate the DPtoLP() for the specified
    // translation correctly.
    RECT pageRect = {0};
    pageRect.right = GetDeviceCaps (hDC, HORZRES);
    pageRect.bottom = GetDeviceCaps (hDC, VERTRES);
    DPtoLP(hDC, (LPPOINT)&pageRect, 2);

    // create my font for drawing the text to be printed and select it into the DC for printing.
    HFONT DisplayFont = CreateFont (166, 0, 0, 0, FW_DONTCARE, false, false, false, DEFAULT_CHARSET,
                                    OUT_DEFAULT_PRECIS, CLIP_DEFAULT_PRECIS, DEFAULT_QUALITY,
                                    DEFAULT_PITCH | FF_DONTCARE, _T("Arial Rounded MT Bold"));
    HGDIOBJ hSave = SelectObject (hDC, DisplayFont);

    POINT ptLine = {300, 200};   // our printer line cursor for where printing should start.

    static DOCINFO di = { sizeof (DOCINFO), TEXT ("INVOICE TABLE : Printing...")};
    StartDoc (hDC, &di);
    StartPage (hDC);

    for (int i = 1; i < 30; i++) {
        TCHAR xBuff[256] = {0};
        swprintf (xBuff, 255, _T("This is line %d of my text."), i);
        TextOut (hDC, ptLine.x, ptLine.y, xBuff, _tcslen(xBuff));

        // get the dimensions of the text string in logical units so we can bump cursor to next line.
        SIZE lineSize = {0};
        GetTextExtentPoint32(hDC, xBuff, _tcslen(xBuff), &lineSize);
        ptLine.y += lineSize.cy;   // bump the cursor down to the next line. X coordinate stays the same.

        if (ptLine.y + lineSize.cy > pageRect.bottom) {
            // reached the end of this page so lets start another.
            EndPage (hDC);
            StartPage (hDC);
            ptLine.y = 200;
        }
    }

    // end the final page and then end the document so that physical printing will start.
    EndPage (hDC);
    EndDoc (hDC);

    // Release the font object that we no longer need.
    SelectObject (hDC, hSave);
    DeleteObject (DisplayFont);

    return 1;
}
When I modify the call from SetViewportExtEx(hDC, cxpage/2, cypage/2, NULL); to SetViewportExtEx(hDC, cxpage, cypage, NULL);, the printed text comes out almost double in height and width.
Additional Notes on Extents and Mapping Modes
Charles Petzold, in Programming Windows, 5th Edition (Chapter 5, Basic Drawing, page 180), writes:
The formulas also include two points that specify "extents": the point
(xWinExt, yWinExt) is the window extent in logical coordinates;
(xViewExt, yViewExt) is the viewport extent in device
coordinates. In most mapping modes, the extents are implied by the
mapping mode and cannot be changed. Each extent means nothing by
itself, but the ratio of the viewport extent to the window extent is a
scaling factor for converting logical units to device units.
For example, when you set the MM_LOENGLISH mapping mode, Windows sets
xViewExt to be a certain number of pixels and xWinExt to be the length in hundredths of an inch occupied by xViewExt pixels. The
ratio gives you pixels per hundredths of an inch. The scaling factors
are expressed as ratios of integers rather than floating point values
for performance reasons.
Petzold then goes on to discuss MM_ISOTROPIC and MM_ANISOTROPIC on page 187.
The two remaining mapping modes are named MM_ISOTROPIC and
MM_ANISOTROPIC. These are the only two mapping modes for which
Windows lets you change the viewport and window extents, which means
that you can change the scaling factor that Windows uses to translate
logical and device coordinates. The word isotropic means "equal in
all directions"; anisotropic is the opposite - "not equal." Like the
metric mapping modes shown earlier, MM_ISOTROPIC uses equally scaled
axes. Logical units on the x-axis have the same physical dimensions as
logical units on the y-axis. This helps when you need to create images
that retain the correct aspect ratio regardless of the aspect ratio of
the display device.
The difference between MM_ISOTROPIC and the metric mapping modes is
that with MM_ISOTROPIC you can control the physical size of the
logical unit. If you want, you can adjust the size of the logical unit
based on the client area. This lets you draw images that are always
contained within the client area, shrinking and expanding
appropriately. The two clock programs in Chapter 8 have isotropic
images. As you size the window, the clocks are resized appropriately.
A Windows program can handle the resizing of an image entirely through
adjusting the window and viewport extents. The program can then use
the same logical units in the drawing functions regardless of the size
of the window.
... why do so many examples use SetViewportExtEx(hDC, cxpage/2, cypage/2, NULL); where cxpage and cypage are GetDeviceCaps(hDC, HORZRES) and GetDeviceCaps(hDC, VERTRES) respectively[?]
I suspect MM_ISOTROPIC is often used for plotting graphs where the origin would be in the center of the page rather than the corner. If we took your code and tweaked it to move the origin to the center of the printable region, like this:
SetWindowExtEx(hDC, 1500, 1500, NULL);
SetViewportExtEx(hDC, cxpage/2, cypage/2, NULL);
SetViewportOrgEx(hDC, cxpage/2, cypage/2, NULL);
Then you could plot using logical coordinates that range from -1500 to +1500. (You might also want to flip the sign of one of the y-extents to get positive "up".)
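As a concrete (hypothetical) sketch of that centered, y-up setup, with a couple of axis lines drawn in logical coordinates:
// Sketch: isotropic mapping with the origin at the center of the printable area
// and positive y pointing up (negative y viewport extent flips the axis).
int cxpage = GetDeviceCaps(hDC, HORZRES);
int cypage = GetDeviceCaps(hDC, VERTRES);

SetMapMode(hDC, MM_ISOTROPIC);
SetWindowExtEx(hDC, 1500, 1500, NULL);                 // logical extent
SetViewportExtEx(hDC, cxpage / 2, -cypage / 2, NULL);  // negative y => y grows upward
SetViewportOrgEx(hDC, cxpage / 2, cypage / 2, NULL);   // device origin at page center

// Logical coordinates now run roughly from -1500 to +1500 on both axes.
MoveToEx(hDC, -1500, 0, NULL);  LineTo(hDC, 1500, 0);  // x-axis
MoveToEx(hDC, 0, -1500, NULL);  LineTo(hDC, 0, 1500);  // y-axis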
For textual output, I don't see any advantage to halving the viewport extents and I would keep the origin in the upper left.

Drawing performance with Win32

I am getting very poor performance drawing with Win32. It takes too much time and needs improving. Please advise.
Here is what I do.
HDC dc = GetDC(wnd);
HDC memoryDc = CreateCompatibleDC(dc);
HBITMAP memoryMapBitmap = CreateCompatibleBitmap(dc, 400, 400);
HGDIOBJ originalBitmap = SelectObject(memoryDc, memoryMapBitmap);
Then, I draw in a for-loop as follows.
HBRUSH brush = (HBRUSH)GetStockObject(DC_BRUSH);
SetDCBrushColor(memoryDc, colorRef);
FillRect(memoryDc, &rect, brush);
And finally, I do a cleanup
SelectObject(memoryDc, originalBitmap);
DeleteDC(memoryDc);
ReleaseDC(wnd, dc);
Drawing takes a lot of time (several seconds). Is there a way to draw faster with Win32?
Thanks in advance!
It looks like I have solved it. Below is the solution with some comments.
I have a dialog defined in an RC file. There is a control in the dialog that displays a bitmap image.
CONTROL "", IDC_MEMORY_MAP, WC_STATIC, SS_BITMAP | SS_CENTERIMAGE | SS_SUNKEN, 9, 21, 271, 338, WS_EX_LEFT
In the run-time I need to create, draw and display a bitmap:
HWND map = GetDlgItem(dlg, IDC_MEMORY_MAP);
HBITMAP bitmap = createMemoryMapBitmap(map);
bitmap = (HBITMAP)SendMessage(map, STM_SETIMAGE, IMAGE_BITMAP, (LPARAM)bitmap);
DeleteObject(bitmap); // (!) this is a very important line, otherwise old bitmap leaks
Code that finds out the size of the bitmap to create:
HBITMAP createMemoryMapBitmap(HWND map) {
    RECT rect = {0, 0, 0, 0};
    GetClientRect(map, &rect);
    SIZE size = {rect.right - rect.left, rect.bottom - rect.top};

    HDC dc = GetDC(map);
    HBITMAP bitmap = doCreateMemoryMapBitmap(dc, &size);
    ReleaseDC(map, dc);

    return bitmap;
}
Finally, we actually create the bitmap and draw on it:
HBITMAP doCreateMemoryMapBitmap(HDC dc, LPSIZE bitmapSize) {
    // create 24bpp bitmap in memory in order to draw fast
    BITMAPINFO info;
    memset(&info, 0, sizeof(info));
    info.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    info.bmiHeader.biWidth = bitmapSize->cx;
    info.bmiHeader.biHeight = bitmapSize->cy;
    info.bmiHeader.biPlanes = 1;
    info.bmiHeader.biBitCount = 24;
    info.bmiHeader.biCompression = BI_RGB;

    void *pixels = NULL;
    HBITMAP memoryBitmap = CreateDIBSection(dc, &info, DIB_RGB_COLORS, &pixels, NULL, 0);

    HDC memoryDc = CreateCompatibleDC(dc); // (!) memoryDc is attached to the current thread
    HGDIOBJ originalDcBitmap = SelectObject(memoryDc, memoryBitmap);

    // drawing code here

    // perform Windows GDI cleanup
    SelectObject(memoryDc, originalDcBitmap); // restore original bitmap in memoryDc (optional step)
    DeleteDC(memoryDc);                       // this releases memoryBitmap from memoryDc

    return memoryBitmap;
}
The idea above is to create a 24bpp bitmap in memory and draw on it. This way drawing is fast, as @IInspectable pointed out.
If the display is in an indexed color mode, e.g. 16 or 256 colors, the Windows native static control seems to be smart enough to convert the color depth automatically when displaying the bitmap.
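For completeness, the drawing that goes at the "// drawing code here" marker can be the same DC_BRUSH/FillRect batch from the question, just aimed at memoryDc. A sketch, with rects, colors and rectCount standing in for the real data:
// Sketch: batch FillRect calls against the memory DC; only the finished
// bitmap ever reaches the screen (via STM_SETIMAGE above).
HBRUSH brush = (HBRUSH)GetStockObject(DC_BRUSH);
for (size_t i = 0; i < rectCount; ++i) {
    SetDCBrushColor(memoryDc, colors[i]);   // COLORREF for this rectangle
    FillRect(memoryDc, &rects[i], brush);   // drawn into the DIB section, not the window
}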

Rendering Windows screenshot capture bitmap as DirectX texture

I'm making progress on a '3D desktop' DirectX 11 app that needs to display the current contents of a desktop window (e.g. "Calculator") as a 2D texture on a rectangular surface. I'm very close but really struggling with the screenshot BMP -> Texture2D step. I have screenshot -> HBITMAP and DDS file -> rendered texture working, but can't complete screenshot -> rendered texture.
So far I have the 'capture the window as a screenshot' bit working:
RECT user_window_rectangle;
HWND user_window = FindWindow(NULL, TEXT("Calculator"));
GetClientRect(user_window, &user_window_rectangle);
HDC hdcScreen = GetDC(NULL);
HDC hdc = CreateCompatibleDC(hdcScreen);
UINT screenshot_width = user_window_rectangle.right - user_window_rectangle.left;
UINT screenshot_height = user_window_rectangle.bottom - user_window_rectangle.top;
hbmp = CreateCompatibleBitmap(hdcScreen, screenshot_width, screenshot_height);
SelectObject(hdc, hbmp);
PrintWindow(user_window, hdc, PW_CLIENTONLY);
At this point I have the window bitmap referenced by HBITMAP hbmp.
Also working is my code to render a DDS file as a texture on a directx/3d rectangle:
ID3D11Device *dev;
ID3D11DeviceContext *dev_context;
...
dev_context->PSSetShaderResources(0, 1, &shader_resource_view);
dev_context->PSSetSamplers(0, 1, &tex_sampler_state);
...
DirectX::TexMetadata tex_metadata;
DirectX::ScratchImage image;
hr = LoadFromDDSFile(L"Earth.dds", DirectX::DDS_FLAGS_NONE, &tex_metadata, image);
hr = CreateShaderResourceView(dev, image.GetImages(), image.GetImageCount(), tex_metadata, &shader_resource_view);
The pixel shader is:
Texture2D ObjTexture;
SamplerState ObjSamplerState;

float4 PShader(float4 pos : SV_POSITION, float4 color : COLOR, float2 tex : TEXCOORD) : SV_TARGET
{
    return ObjTexture.Sample(ObjSamplerState, tex);
}
The sampler state (defaulting to linear) is:
D3D11_SAMPLER_DESC sampler_desc;
ZeroMemory(&sampler_desc, sizeof(sampler_desc));
sampler_desc.AddressU = D3D11_TEXTURE_ADDRESS_WRAP;
sampler_desc.AddressV = D3D11_TEXTURE_ADDRESS_WRAP;
sampler_desc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP;
sampler_desc.MinLOD = 0;
sampler_desc.MaxLOD = D3D11_FLOAT32_MAX;
hr = dev->CreateSamplerState(&sampler_desc, &tex_sampler_state);
Question: how do I replace the LoadFromDDSFile bit with some equivalent that takes the HBITMAP from the Windows screen capture and ends up with it on the graphics card as ObjTexture?
Below is my best shot at bridging from the screenshot HBITMAP hbmp to the shader resource screenshot_texture, but it gives a memory access violation from the graphics driver (I think due to my "data.pSysMem = &bmp.bmBits", but I'm not really sure):
BITMAP bmp = {};
GetObject(hbmp, sizeof(BITMAP), &bmp);

D3D11_TEXTURE2D_DESC screenshot_desc = CD3D11_TEXTURE2D_DESC(DXGI_FORMAT_R8G8B8A8_UNORM, bmp.bmWidth, bmp.bmHeight, 1,
                                                             1,
                                                             D3D11_BIND_SHADER_RESOURCE);
int bytes_per_pixel = 4;

D3D11_SUBRESOURCE_DATA data;
ZeroMemory(&data, sizeof(D3D11_SUBRESOURCE_DATA));
data.pSysMem = &bmp.bmBits; // pixel buffer
data.SysMemPitch = bytes_per_pixel * bmp.bmWidth;                     // line size in bytes
data.SysMemSlicePitch = bytes_per_pixel * bmp.bmWidth * bmp.bmHeight; // total buffer size in bytes

hr = dev->CreateTexture2D(
    &screenshot_desc,    // texture description
    &data,               // pixel buffer used to fill the texture
    &screenshot_texture  // created texture
);
SOLUTION:
The main issue was that using &bmp.bmBits directly as a pixel buffer caused memory conflicts within the graphics driver; this was resolved by using malloc to allocate an appropriately sized block of memory for the pixel data. Thanks to Chuck Walbourn for helping with my poking around in the dark to work out how the pixel data is actually stored (it is 32 bits per pixel by default). It's still possible/likely that some of the code relies on luck to read the pixel data correctly, but it has been improved with Chuck's input.
My basic technique was:
FindWindow to get the client window on the desktop
CreateCompatibleBitmap and SelectObject and PrintWindow to get a HBITMAP to the snapshot
malloc to allocate the correct amount of space for a (byte*)pixel buffer
GetDIBits to populate the (byte*)pixel buffer from the HBITMAP
CreateTexture2D to build the texture buffer
CreateShaderResourceView to map the texture to the graphics pixel shader
So working code to screenshot a windows desktop window and pass that as a texture to a direct3d app is:
RECT user_window_rectangle;
HWND user_window = FindWindow(NULL, TEXT("Calculator")); // the window can't be minimized
if (user_window == NULL)
{
    MessageBoxA(NULL, "Can't find Calculator", "Camvas", MB_OK);
    return;
}
GetClientRect(user_window, &user_window_rectangle);

// create
HDC hdcScreen = GetDC(NULL);
HDC hdc = CreateCompatibleDC(hdcScreen);
UINT screenshot_width = user_window_rectangle.right - user_window_rectangle.left;
UINT screenshot_height = user_window_rectangle.bottom - user_window_rectangle.top;
hbmp = CreateCompatibleBitmap(hdcScreen, screenshot_width, screenshot_height);
SelectObject(hdc, hbmp);

// Print to memory hdc
PrintWindow(user_window, hdc, PW_CLIENTONLY);

BITMAPINFOHEADER bmih;
ZeroMemory(&bmih, sizeof(BITMAPINFOHEADER));
bmih.biSize = sizeof(BITMAPINFOHEADER);
bmih.biPlanes = 1;
bmih.biBitCount = 32;
bmih.biWidth = screenshot_width;
bmih.biHeight = -(LONG)screenshot_height; // negative height => top-down DIB
bmih.biCompression = BI_RGB;
bmih.biSizeImage = 0;

int bytes_per_pixel = bmih.biBitCount / 8;
BYTE *pixels = (BYTE*)malloc(bytes_per_pixel * screenshot_width * screenshot_height);

BITMAPINFO bmi = { 0 };
bmi.bmiHeader = bmih;
int row_count = GetDIBits(hdc, hbmp, 0, screenshot_height, pixels, &bmi, DIB_RGB_COLORS);

D3D11_TEXTURE2D_DESC screenshot_desc = CD3D11_TEXTURE2D_DESC(
    DXGI_FORMAT_B8G8R8A8_UNORM, // format
    screenshot_width,           // width
    screenshot_height,          // height
    1,                          // arraySize
    1,                          // mipLevels
    D3D11_BIND_SHADER_RESOURCE, // bindFlags
    D3D11_USAGE_DYNAMIC,        // usage
    D3D11_CPU_ACCESS_WRITE,     // cpuAccessFlags
    1,                          // sampleCount
    0,                          // sampleQuality
    0                           // miscFlags
);

D3D11_SUBRESOURCE_DATA data;
ZeroMemory(&data, sizeof(D3D11_SUBRESOURCE_DATA));
data.pSysMem = pixels;                                  // pixel buffer
data.SysMemPitch = bytes_per_pixel * screenshot_width;  // line size in bytes
data.SysMemSlicePitch = bytes_per_pixel * screenshot_width * screenshot_height;

hr = dev->CreateTexture2D(
    &screenshot_desc,    // texture description
    &data,               // pixel buffer used to fill the texture
    &screenshot_texture  // created texture
);

D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc;
srvDesc.Format = screenshot_desc.Format;
srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
srvDesc.Texture2D.MostDetailedMip = 0;
srvDesc.Texture2D.MipLevels = screenshot_desc.MipLevels;
dev->CreateShaderResourceView(screenshot_texture, &srvDesc, &shader_resource_view);
You are making a lot of assumptions here that the BITMAP returned is actually in 32-bit RGBA form. It is most likely not in that format, and in any case you need to validate that bmPlanes is 1 and bmBitsPixel is 32 if you are assuming 4 bytes per pixel. You should read more about the BMP format.
BMPs use BGRA order, so you can use DXGI_FORMAT_B8G8R8A8_UNORM for the case of bmBitsPixel being 32.
Secondly, you need to derive pitch from bmWidthBytes and not bmWidth.
data.pSysMem = &bmp.bmBits; //pixel buffer
data.SysMemPitch = bmp.bmWidthBytes;// line size in byte
data.SysMemSlicePitch = bmp.bmWidthBytes * bmp.bmHeight;// total buffer size in byte
If bmBitsPixel is 24, there is no DXGI format equivalent to that. You have to copy the data to a 32-bit format such as DXGI_FORMAT_B8G8R8X8_UNORM.
If bmBitsPixel is 15 or 16, you can use DXGI_FORMAT_B5G5R5A1_UNORM on a system with Direct3D 11.1, but remember that 16-bit DXGI formats are not always supported depending on the driver. Otherwise you'll have to convert this data to something else.
For bmBitsPixel values of 1, 2, 4, or 8 you have to convert them as there are no DXGI texture formats that are equivalent.
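For the 24-bpp case, a minimal conversion sketch (my illustration, not part of the answer above) that expands BGR rows from the BITMAP into a tightly packed BGRX buffer suitable for DXGI_FORMAT_B8G8R8X8_UNORM:
#include <vector>

// Sketch: expand 24-bpp BGR rows into 32-bpp BGRX.
// Note: bmp.bmBits is only non-NULL for DIB sections; for a DDB you would
// first pull the pixels out with GetDIBits, as in the solution above.
std::vector<BYTE> bgrx(static_cast<size_t>(bmp.bmWidth) * bmp.bmHeight * 4);
const BYTE *srcBits = static_cast<const BYTE*>(bmp.bmBits);
for (LONG y = 0; y < bmp.bmHeight; ++y)
{
    const BYTE *src = srcBits + static_cast<size_t>(y) * bmp.bmWidthBytes;  // source pitch from bmWidthBytes
    BYTE *dst = bgrx.data() + static_cast<size_t>(y) * bmp.bmWidth * 4;     // tightly packed destination
    for (LONG x = 0; x < bmp.bmWidth; ++x)
    {
        dst[4 * x + 0] = src[3 * x + 0]; // B
        dst[4 * x + 1] = src[3 * x + 1]; // G
        dst[4 * x + 2] = src[3 * x + 2]; // R
        dst[4 * x + 3] = 0xFF;           // X (ignored)
    }
}
// Then: data.pSysMem = bgrx.data(); data.SysMemPitch = bmp.bmWidth * 4;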

How do I create an 8-bit PNG with transparency from an NSBitmapImageRep?

I have a 32-bit NSBitmapImageRep which has an alpha channel with essentially 1-bit values (the pixels are either on or off).
I want to save this bitmap to an 8-bit PNG file with transparency. If I use the -representationUsingType:properties: method of NSBitmapImageRep and pass in NSPNGFileType, a 32-bit PNG is created, which is not what I want.
I know that 8-bit PNGs can be read, they open in Preview with no problems, but is it possible to write this type of PNG file using any built-in Mac OS X APIs? I'm happy to drop down to Core Image or even QuickTime if necessary. A cursory examination of the CGImage docs didn't reveal anything obvious.
EDIT:
I've started a bounty on this question, if someone can provide working source code that takes a 32-bit NSBitmapImageRep and writes a 256-color PNG with 1-bit transparency, it's yours.
How about libpng? It's really lightweight and easy to use.
pngnq (and the newer pngquant, which achieves higher quality) has a BSD-style license, so you can just include it in your program. There is no need to spawn it as a separate task.
A great reference for working with the lower-level APIs is Programming With Quartz.
Some of the code below is based on examples from that book.
Note: this is untested code meant to be a starting point only...
- (NSBitmapImageRep*)convertImageRep:(NSBitmapImageRep*)startingImage{
    CGImageRef anImage = [startingImage CGImage];
    CGContextRef bitmapContext;
    CGRect ctxRect;
    size_t width, height;

    width = CGImageGetWidth(anImage);
    height = CGImageGetHeight(anImage);
    ctxRect = CGRectMake(0.0, 0.0, width, height);

    bitmapContext = createRGBBitmapContext(width, height, TRUE);
    CGContextDrawImage(bitmapContext, ctxRect, anImage);

    // Now extract the image from the context
    CGImageRef bitmapImage = CGBitmapContextCreateImage(bitmapContext);
    if(!bitmapImage){
        fprintf(stderr, "Couldn't create the image!\n");
        return nil;
    }

    NSBitmapImageRep *newImage = [[NSBitmapImageRep alloc] initWithCGImage:bitmapImage];
    CGImageRelease(bitmapImage);
    return newImage;
}
Context Creation Function:
CGContextRef createRGBBitmapContext(size_t width, size_t height, Boolean needsTransparentBitmap)
{
    CGContextRef context;
    size_t bytesPerRow;
    unsigned char *rasterData;

    // minimum bytes per row is 4 bytes per sample * number of samples
    bytesPerRow = width * 4;
    // round up to nearest multiple of 16.
    bytesPerRow = COMPUTE_BEST_BYTES_PER_ROW(bytesPerRow);

    int bitsPerComponent = 8; // CGBitmapContextCreate requires 8 bits per component for a 32-bit RGBA context

    // use 'calloc' so memory is initialized to 0.
    rasterData = calloc(1, bytesPerRow * height);
    if(rasterData == NULL){
        fprintf(stderr, "Couldn't allocate the needed amount of memory!\n");
        return NULL;
    }

    // uses the generic calibrated RGB color space.
    CGColorSpaceRef colorSpace = CGColorSpaceCreateWithName(kCGColorSpaceGenericRGB);
    context = CGBitmapContextCreate(rasterData, width, height, bitsPerComponent, bytesPerRow,
                                    colorSpace,
                                    (needsTransparentBitmap ? kCGImageAlphaPremultipliedFirst :
                                                              kCGImageAlphaNoneSkipFirst));
    CGColorSpaceRelease(colorSpace);
    if(context == NULL){
        free(rasterData);
        fprintf(stderr, "Couldn't create the context!\n");
        return NULL;
    }

    // Either clear the rect or paint with opaque white.
    if(needsTransparentBitmap){
        CGContextClearRect(context, CGRectMake(0, 0, width, height));
    }else{
        CGContextSaveGState(context);
        CGContextSetFillColorWithColor(context, getRGBOpaqueWhiteColor());
        CGContextFillRect(context, CGRectMake(0, 0, width, height));
        CGContextRestoreGState(context);
    }
    return context;
}
Usage would be:
NSBitmapImageRep *startingImage; // assumed to be previously set.
NSBitmapImageRep *endingImageRep = [self convertImageRep:startingImage];
// Write out as data
NSData *outputData = [endingImageRep representationUsingType:NSPNGFileType properties:nil];
// somePath is set elsewhere
[outputData writeToFile:somePath atomically:YES];
One thing to try would be creating an NSBitmapImageRep with 8 bits and then copying the data to it.
This would actually be a lot of work, as you would have to compute the color index table yourself.
CGImageDestination is your man for low-level image writing, but I don't know if it supports that specific ability.
