Win32 BitBlt -- mirror image horizontally - winapi

So I've read that StretchBlt can mirror images horizontally and/or vertically by negating the nWidthSrc/Dest and nHeightSrc/Dest parameters. I'd like this functionality without the performance overhead of a StretchBlt. I tried the same technique with BitBlt but it didn't work.
Is there any way to mirror an image with something as simple as BitBlt, without the overkill of a StretchBlt? Or will StretchBlt not affect performance if the source and destination sizes are the same?

BitBlt will only perform mathmatic operations (or, xor, etc) on the individual pixels in question, it will not resize the image in any way. That is exactly what StretchBlt is for, and StretchBlt (compared to any other graphics resizing operation) is insanely fast as in most cases it can use the graphics card to accelerate its performance.

All Win32 functions are probably going to be extremely optimized.
What makes you think StretchBlt will be a big performance hit?
Have you profiled your application using StretchBlt?
You could reverse all of the bitmap data yourself and see if you can do better that StretchBlt.
Here's a link that might help you out:
http://www.codeguru.com/cpp/g-m/bitmap/specialeffects/article.php/c1739

To mirror an image you just need to loop through the pixels in reverse order. Such as if you want to mirror horizontally you just need to do the following:
expand image canvas to double the size
start at the bottom of the image and work you way up writing the pixels in to the mirrored area from the top down.
do step 3 from left to right.
I don't know what language you are using, but most of them allow you to manipulates the pixels or bits on an individual basis using GDI.

No way you are going to be more efficient that StretchBlt, unless you know some extra information about the image (e.g, there is a border so you don't have to flip certain pixels.

Related

SDL accelerated rendering

I am trying to understand the whole 2D accelerated rendering process using SDL 2.0.
So my question is which would be the most efficient way to draw circles in the screen and why?
Some ways would be:
First to create a software surface and then draw the necessary pixels on that surface then create a texture out of that surface and lastly copy that texture to the rendering target.
Also another implementation would be to draw a circle using multiple times SDL_RenderDrawLine.And I think this is the way it is being implemented in SDL 2.0 gfx
Or there is a more efficient way to do all of this?
Take this question more generally in means of if I would wanted to draw other shapes manually, which probably, couldn't be rendered easily with the 2D rendering API that SDL provides(using draw line or rectangle).
With the example of circles this is a fairly complicated question, it is more based on the visual quality you wish to achieve which will drive performance. Drawing lots of short lines will vary vastly based on how close to a circle you wish to get, if you are happy to use say, 60 lines, which will work on small shapes nearly seamlessly but if scaled up will begin to appear not to be a circle, the performance will likely be better (depending on the user's hardware). Note also SDL_RenderDrawLines will be much much faster for many lines as it avoids lots of context switches for rendering calls.
However if you need a very accurate circle with thousands of lines to get a good approximation it will be faster to simply use a bitmap and scale and blit it. This will also give you a 'smoother' feel to the circle.
In my personal opinion I do not think the hardware accelerated render API has much use outside of some special uses such as graph rendering and perhaps very simple GUI drawing. For anything more complex I would usually use bitmap based drawing.
With regards to the second part, it again depends on the accuracy of any arcs you need to draw. If you can easily approximate the shape into a few tens of lines it will be fast, otherwise the pixel method is better.

How to work with pixels using Direct2D

Could somebody provide an example of an efficient way to work with pixels using Direct2D?
For example, how can I swap all green pixels (RGB = 0x00FF00) with red pixels (RGB = 0xFF0000) on a render target? What is the standard approach? Is it possible to use ID2D1HwndRenderTarget for that? Here I assume using some kind of hardware acceleration. Should I create a different object for direct pixels manipulations?
Using DirectDraw I would use BltFast method on the IDirectDrawSurface7 with logical operation. Is there something similar with Direct2D?
Another task is to generate complex images dynamically where each point location and color is a result of a mathematical function. For the sake of an example let's simplify everything and draw Y = X ^ 2. How to do that with Direct2D? Ultimately I'm going to need to draw complex functions but if somebody could give me a simple example for Y = X ^ 2.
First, it helps to think of ID2D1Bitmap as a "device bitmap". It may or may not live in local, CPU-addressable memory, and it doesn't give you any convenient (or at least fast) way to read/write the pixels from the CPU side of the bus. So approaching from that angle is probably the wrong approach.
What I think you want is a regular WIC bitmap, IWICBitmap, which you can create with IWICImagingFactory::CreateBitmap(). From there you can call Lock() to get at the buffer, and then read/write using pointers and do whatever you want. Then, when you need to draw it on-screen with Direct2D, use ID2D1RenderTarget::CreateBitmap() to create a new device bitmap, or ID2D1Bitmap::CopyFromMemory() to update an existing device bitmap. You can also render into an IWICBitmap by making use of ID2D1Factory::CreateWicBitmapRenderTarget() (not hardware accelerated).
You will not get hardware acceleration for these types of operations. The updated Direct2D in Win8 (should also be available for Win7 eventually) has some spiffy stuff for this but it's rather complex looking.
Rick's answer talks about the methods you can use if you don't care about losing hardware acceleration. I'm focusing on how to accomplish this using a substantial amount of GPU acceleration.
In order to keep your rendering hardware accelerated and to get the best performance, you are going to want to switch from ID2DHwndRenderTarget to using the newer ID2DDevice and ID2DDeviceContext interfaces. It honestly doesn't add that much more logic to your code and the performance benefits are substantial. It also works on Windows 7 with the Platform Update. To summarize the process:
Create a DXGI factory when you create your D2D factory.
Create a D3D11 device and a D2D device to match.
Create a swap chain using your DXGI factory and the D3D device.
Ask the swap chain for its back buffer and wrap it in a D2D bitmap.
Render like before, between calls to BeginDraw() and EndDraw(). Remember to unbind the back buffer and destroy the D2D bitmap wrapping it!
Call Present() on the swap chain to see the results.
Repeat from 4.
Once you've done that, you have unlocked a number of possible solutions. Probably the simplest and most performant way to solve your exact problem (swapping color channels) is to use the color matrix effect as one of the other answers mentioned. It's important to recognize that you need to use the newer ID2DDeviceContext interface rather than the ID2DHwndRenderTarget to get this however. There are lots of other effects that can do more complicated operations if you so choose. Here are some of the most useful ones for simple pixel manipulation:
Color matrix effect
Arithmetic operation
Blend operation
For generally solving the problem of manipulating the pixels directly without dropping hardware acceleration or doing tons of copying, there are two options. The first is to write a pixel shader and wrap it in a completely custom D2D effect. It's more work than just getting the pixel buffer on the CPU and doing old-fashioned bit mashing, but doing it all on the GPU is substantially faster. The D2D effects framework also makes it super simple to reuse your effect for other purposes, combine it with other effects, etc.
For those times when you absolutely have to do CPU pixel manipulation but still want a substantial degree of acceleration, you can manage your own mappable D3D11 textures. For example, you can use staging textures if you want to asynchronously manipulate your texture resources from the CPU. There is another answer that goes into more detail. See ID3D11Texture2D for more information.
The specific issue of swapping all green pixels with red pixels can be addressed via ID2D1Effect as of Windows 8 and Platform Update for Windows 7.
More specifically, Color matrix effect.

Is it possible to use pointers to write directly (low level) onto a window without using Bitblt?

I have written an anaglyph filter that mixes two images into one stereographic image. It is a fast routine that works with one pixel at a time.
Right now I'm using pointers to output each calculated pixel to a memory bitmap, then Bitblt that whole image onto the window.
This seems redundant to me. I'd rather copy each pixel directly to the screen, since my anaglyph routine is quite fast. Is it possible to bypass Bitblt and simply have the pointer point directly to wherever Bitblt would copy it to?
I'm sure it's possible, but you really really really don't want to do this. It's much more efficient to draw the entire pattern at once.
You can't draw directly to the screen from windows because the graphics card memory isn't necessarily mapped in any sane order.
Bltting to the screen is amazingly fast.
Remember you don't blt after each pixel - only when you want a new result to be shown, even then there's no point doing this faster than the refresh on your screen - probably 60hz
You are looking for something like glMapBuffer in OpenGL, but acessing directly to the screen.
But writing to the GPU memory pixel per pixel is the slower operation you can do. PCI works faster if you send big streams of data. Also, there are many issues if you write and read data. And the pixel layout is also important (see nvidia docs about fast texture transfers). Bitblt will do it for you in a driver optimised way.

Is StretchBlt HALFTONE == BILINEAR for all scaling?

Can anyone clarify if the GDI StretchBlt function for the workstation Win32 API performs bilinear interpolation for scaling to both larger and smaller images for 24/32-bit color images? And if not, is there a GDI (not GDI+) function that does this?
The SetStretchBltMode fn has a setting HALFTONE which is documented as follows:
HALFTONE
Maps pixels from the source rectangle into blocks of pixels in the destination rectangle. The average color over the destination block of pixels approximates the color of the source pixels.
I've seen references (see follow-up to first answer) that this performs bilinear interpolation when scaling down an image, but no clear answer of what happens when scaling up.
I have noticed that the Windows Mobile CE SDK does support a BILINEAR flag - which is documented exactly opposite of the HALFTONE comments (only works for scaling up).
Note that for the scope of this question, I'm not interested in pursuing GDI+ (which has numerous interpolation options), OpenGL, DirectX, etc. as alternatives, so please don't bother with follow-ups regarding these other APIs or alternate image libraries.
What I'm really hoping to find is some definitive MS/MSDN or other high-quality documentation that clearly documents this behavior of the Win32 (desktop) GDI behavior.
Meanwhile, I'll try some experiments comparing GDI vs. Direct2D (which does have an explicit flag to control this) and post my findings.
Thanks!
I've been looking into this same problem for the past couple of weeks.
As far as I can tell, there does not exist any definitive documentation on this behaviour from Microsoft.
However, I've run some tests myself, to try and establish the degree to which StretchBlt can be trusted to perform consistently with respect to up- and down-scaling images in halftone mode.
My findings are:
1) StretchBlt does produce adequate quality up- and down-scaled images. It might be a touch below Photoshop quality, but probably OK for most practical purposes.
2) It seems to depend upon hardware acceleration, whenever it's available. I haven't been able to confirm this, but I have a slight fear that this may lead to different outputs on different types of hardware. However, on the 5 or 6 different systems I've tried it on, old and new, the performance has been consistent and fast.
3) If you use the call on a 16-bit color device, or lower, StretchBlt will automatically dither your image. If you run it on a 24-bit color device, it will not dither.
4) If you use it to scale small images (smaller than 150x150px), it will randomly fall back to nearest neighbour interpolation. This can be remedied in your own software, by padding the bitmap before scaling, doing StretchBlt on it, and then removing the padding afterwards. Kind of a hack, but it works.
HALFTONE mode performs a very blocky halftone dithering on the image, based on varying the conversion thresholds over a defined square. I have never seen a situation where it would be considered the best choice.
COLORONCOLOR is the best mode for color images, but as you've seen it doesn't give great results.
GDI does not support a bilinear mode (except in Windows Mobile CE as you discovered). The naive implementation of bilinear does not do very well when shrinking an image, as it simply tries to interpolate between two adjacent input pixels without trying to draw from a larger area.

HTML5 Canvas Performance: Loading Images vs Drawing

I'm planning on writing a game using javascript / canvas and I just had 1 question: What kind of performance considerations should I think about in regards to loading images vs just drawing using canvas' methods. Because my game will be using very simple geometry for the art (circles, squares, lines), either method will be easy to use. I also plan to implement a simple particle engine in the game, so I want to be able to draw lots of small objects without much of a performance hit.
Thoughts?
If you're drawing simple shapes with solid fills then drawing them procedurally is the best method for you.
If you're drawing more detailed entities with strokes, gradient fills and other performance sensitive make-up you'd be better off using image sprites. Generating graphics procedurally is not always efficient.
It is possible to get away with a mix of both. Draw graphical entities procedurally on the canvas once as your application starts up. After that you can reuse the same sprites by painting copies of them instead of generating the same drop-shadow, gradient and strokes repeatedly.
If you do choose to draw sprites you should read some of the tips and optimization techniques on this thread.
My personal suggestion is to just draw shapes. I've learned that if you're going to use images instead, then the more you use the slower things get, and the more likely you'll end up needing to do off-screen rendering.
This article discusses the subject and has several tests to benchmark the differences.
Conculsions
In brief — Canvas likes small size of canvas and DOM likes working with few elements (although DOM in Firefox is so slow that it's not always true).
And if you are planing to use particles I thought that you might want to take a look to Doodle-js.
Image loading out of the cache is faster than generating it / loading it from the original resource. But then you have to preload the images, so they get into the cache.
It really depends on the type of graphics you'll use, so I suggest you implement the easiest solution and solve the performance problems as they appear.
Generally I would expect copying a bitmap (drawing an image) to get faster compared to recreating it from primitives, as the complexity of the image gets higher.
That is drawing a couple of squares per scene should need about the same time using either method, but a complex image will be faster to copy from a bitmap.
As with most gaming considerations, you may want to look at what you need to do, and use a mixture of both.
For example, if you are using a background image, then loading the bitmap makes sense, especially if you will crop it to fit in the canvas, but if you are making something that is dynamic then you will need to using the drawing API.
If you target IE9 and FF4, for example, then on Windows you should get some good performance from drawing as they are taking advantage of the graphics card, but, for more general browsers you will want to perhaps look at using sprites, which will either be images you draw as part of the initialization and move, or load bitmapped images.
It would help to know what type of game you are looking at, how dynamic the graphics will need to be, how large the bitmapped images would be, what type of framerate you are hoping for.
The landscape is changing with each browser release. I suggest following the HTML5 Games initiative that Facebook has started, and the jsGameBench test suite. They cover a wide range of approaches from Canvas to DOM to CSS transforms, and their performance pros and cons.
http://developers.facebook.com/blog/post/454
http://developers.facebook.com/blog/archive
https://github.com/facebook/jsgamebench
If you are just drawing simple geometry objects you can also use divs. They can be circles, squares and lines in a few CSS lines, you can position them wherever you want and almost all browser support the styles (you may have some problems with mobile devices using Opera Mini or old Android Browser versions and, of course with IE7-) but there wouldn't be almost any performance hit.

Resources