Is it possible to prevent tearing artifacts when drawing using GDI on a window with DWM composition? - windows

I am drawing an animation using double-buffered GDI on a window, on a system where DWM composition is enabled, and seeing clearly visible tearing onscreen. Is there a way to prevent this?
Details
The animation takes the same image and moves it right to left across the screen. The number of pixels moved is derived from the time: the elapsed time (current time minus start time) divided by the total duration (end time minus start time) gives a fraction complete, which is applied to the whole window width. Times come from timeGetTime with a 1ms resolution. The animation draws in a loop without processing application messages; it calls the (VCL library) method Repaint, which internally invalidates and then calls UpdateWindow for the window in question, directly calling into the message procedure with WM_PAINT. The VCL implementation of the paint handler uses BeginBufferedPaint. Painting is itself double-buffered.
The aim of this is to have as high a frame-rate as possible to get a smooth animation across the screen. (The drawing uses double-buffering to remove flickering and to ensure a whole image or frame is onscreen at any one time. It invalidates and updates directly by calling into the message procedure, without doing other message processing. Painting is implemented using modern techniques (eg BeginBufferedPaint) for Aero composition.) Within this, painting is done in a couple of BitBlt calls (one for the left side of the animation, ie what's moving offscreen, and one for the right side of the animation, ie what's moving onscreen.)
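To make the timing and the two BitBlt calls described above concrete, here is a minimal sketch of the arithmetic only, in Python rather than the original VCL code. The function names `scroll_offset` and `blit_spans` are hypothetical, and the sketch assumes the image is exactly as wide as the window.

```python
def scroll_offset(now_ms: int, start_ms: int, end_ms: int, window_width: int) -> int:
    """Pixels the image has moved left, from the fraction of the animation elapsed."""
    if end_ms <= start_ms:
        raise ValueError("end_ms must be later than start_ms")
    # Clamp so frames painted before the start or after the end stay in range.
    fraction = min(1.0, max(0.0, (now_ms - start_ms) / (end_ms - start_ms)))
    return round(fraction * window_width)


def blit_spans(offset: int, window_width: int):
    """Describe the two copies per frame as (dest_x, src_x, width) tuples.

    left  = the part of the image that is moving offscreen but still visible,
    right = the part of the image that has already moved onscreen.
    Assumption: the source image is exactly window_width pixels wide.
    """
    left = (0, offset, window_width - offset)
    right = (window_width - offset, 0, offset)
    return left, right
```

Halfway through a one-second animation on an 800-pixel window, `scroll_offset(500, 0, 1000, 800)` gives 400, and `blit_spans(400, 800)` gives two 400-pixel copies.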
When watching the animation, there is clearly visible tearing. This occurs on Windows Vista, 7 and 8.1 on multiple systems with different graphics cards.
My approach to handle this has been to reduce the rate at which it is drawing, or to try to wait for VSync before painting again. This might be the wrong approach, so the answer to this question might be "Do something else completely: X". If so, great :)
(What I'd really like is a way to ask the DWM to compose / use only fully-painted frames for this specific window.)
I've tried the following approaches, none of which remove all visible tearing. Therefore the question is: is it possible to avoid tearing when using DWM composition, and if so, how?
Approaches tried:
Getting the monitor refresh rate via GetDeviceCaps(Application.MainForm.Handle, VREFRESH); sleeping for 1 / refresh rate seconds (that is, 1000 / refresh rate milliseconds). Slightly improved over painting as fast as possible, but may be wishful thinking. Perceptually slightly less smooth animation rate. (Tweaks: normal Sleep and a high-resolution spin-wait using timeGetTime.)
Using DwmSetPresentParameters to try to limit updating to the same rate at which the code draws. (Variations: lots of buffers (cBuffer = 8) (no visible effect); specifying a source rate of monitor refresh rate / 1 and sleeping using the above code (the same as just trying the sleeping approach); specifying a refresh per frame of 1, 10, etc (no visible effect); changing the source frame coverage (no visible effect).)
Using DwmGetCompositionTimingInfo in a variety of ways:
While cFramesPending > 0, spin;
Get cFrame (frame composed) and spin while this number doesn't change;
Get cFrameDisplayed and spin while this doesn't change;
Calculating a time to sleep to by adding qpcVBlank + qpcRefreshPeriod, and then while QueryPerformanceCounter returns a time less than this, spin
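The last variation (qpcVBlank + qpcRefreshPeriod) can be sketched as plain arithmetic. This is only an illustration, not the code that was actually run: `next_vblank_target` is a hypothetical name, its parameters stand in for the DWM_TIMING_INFO fields of the same name, and stepping forward by whole refresh periods when the first target has already passed is an added assumption.

```python
def next_vblank_target(now: int, qpc_vblank: int, qpc_refresh_period: int) -> int:
    """Return the first predicted vblank time strictly after `now`.

    All values are QueryPerformanceCounter ticks. `qpc_vblank` is the time of
    a known vblank and `qpc_refresh_period` the ticks between two vblanks.
    """
    if qpc_refresh_period <= 0:
        raise ValueError("refresh period must be positive")
    if now < qpc_vblank:
        return qpc_vblank
    # Assumption: vblanks keep arriving at a fixed period after qpc_vblank,
    # so skip any periods that have already elapsed.
    periods_elapsed = (now - qpc_vblank) // qpc_refresh_period
    return qpc_vblank + (periods_elapsed + 1) * qpc_refresh_period
```

The spin loop then becomes "while QueryPerformanceCounter is less than the returned target, spin". This only predicts when the vblank occurs; it does not by itself guarantee that the DWM composes a whole frame.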
All these approaches have also been varied by painting, then spinning/sleeping before painting again; or the reverse: sleeping and then painting.
Few seem to have any visible effect, and what effect there is is hard to quantify and may just be a result of a lower frame rate. None prevent tearing, ie none make the DWM compose the window with a "whole" copy of the contents of the window's DC.
Advice appreciated :)

Since you're using BitBlt, make sure your DIBs are 4 bytes / pixel. With 3 bytes / pixel, GDI is horribly slow while DWM is running, and that could be the source of your tearing. Another BitBlt issue I've run into: if your DIB is somewhat larger, then the BitBlt call may take an unexpectedly long time. If you split up one call into smaller calls that each draw only a portion of the data, it might help. Both of these items helped in my case, but only because BitBlt itself was running too slowly, thus leading to video artifacts.
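As a rough illustration of both suggestions, the sketch below computes the scanline size of a DIB at 24 and 32 bits per pixel (DIB rows are padded to a multiple of 4 bytes) and splits one tall copy into horizontal bands. The names `dib_stride` and `split_into_bands` are hypothetical and the band height is an arbitrary choice; the sketch measures nothing, so whether banding helps still has to be timed on the real system.

```python
def dib_stride(width_px: int, bits_per_pixel: int) -> int:
    """Bytes per scanline of a DIB; rows are padded to a multiple of 4 bytes."""
    return ((width_px * bits_per_pixel + 31) // 32) * 4


def split_into_bands(height_px: int, band_height: int):
    """Yield (y, rows) pairs covering [0, height_px) in bands of at most band_height rows."""
    if band_height <= 0:
        raise ValueError("band_height must be positive")
    for y in range(0, height_px, band_height):
        yield y, min(band_height, height_px - y)
```

Each `(y, rows)` pair would correspond to one smaller BitBlt call covering rows `y` to `y + rows - 1`.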

Related

How are blinking carets often implemented?

I'm trying to implement a cross-platform UI library that uses as few system resources as possible. I'm considering using either my own software renderer or OpenGL.
For stationary controls everything's fine; I can repaint only when it's needed. However, when it comes to implementing animations, especially animated blinking carets like the 'phase' caret in Sublime Text, I don't see an easy way to balance resource usage and performance.
For a blinking caret, the caret has to be redrawn very frequently (15-20 times per second at least, I guess). On one hand, the software renderer supports partial redraw but is far too slow to be practical (3-4 fps for large redraw regions, say 1000x800, which makes it impossible to implement animations). On the other hand, OpenGL doesn't support partial redraw very well as far as I know, which means the whole screen needs to be rendered at 15-20 fps constantly.
So my question is:
How are carets usually implemented in various UI systems?
Is there any way to have OpenGL render to only a portion of the screen?
I know that glViewport enables rendering to part of the screen, but due to double buffering or other stuff the rest of the screen is not kept as it was. In this way I still need to render the whole screen again.
First you need to ask yourself.
Do I really need to partially redraw the screen?
OpenGL, or rather the GPU, can draw thousands of triangles with ease. So before you start fiddling with partial redrawing of the screen, you should benchmark and see whether it's worth looking into at all.
This doesn't however imply that you have to redraw the screen endlessly. You can still just redraw it when changes happen.
Thus if you have a cursor blinking every 500 ms, then you redraw once every 500 ms. If you have an animation running, then you continuously redraw while that animation is playing (or every time the animation does a change that requires redrawing).
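A minimal sketch of that idea for a caret, assuming a plain on/off blink with a 500 ms half-period (an animated "phase" caret would need more frequent updates). The function names are hypothetical; the point is that the event loop can sleep until `next_redraw_ms` instead of rendering every frame.

```python
def caret_visible(now_ms: int, blink_start_ms: int, half_period_ms: int = 500) -> bool:
    """The caret is shown during even half-periods and hidden during odd ones."""
    return ((now_ms - blink_start_ms) // half_period_ms) % 2 == 0


def next_redraw_ms(now_ms: int, blink_start_ms: int, half_period_ms: int = 500) -> int:
    """Time of the next caret toggle, i.e. the next moment a redraw is actually needed."""
    elapsed = now_ms - blink_start_ms
    return blink_start_ms + (elapsed // half_period_ms + 1) * half_period_ms
```

With these defaults an idle text box triggers two redraws per second rather than 15-20.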
This is what Chrome, Firefox, etc. do. You can see this if you open the Developer Tools (F12) and go to the Timeline tab.
Take a look at the following screenshot. The first row of the timeline shows how often Chrome redraws the windows.
The first section shows a lot of continuous redrawing, because I was scrolling around on the page.
The last section shows a single redraw roughly every 500 ms, which was the cursor blinking in a textbox.
Note that it doesn't tell you whether Chrome is fully redrawing the window or only parts of it. It just shows the frequency of the redrawing. (If you want to see the redrawn regions, both Firefox and Chrome have "Show Paint Rectangles".)
To circumvent the problem with double buffering and partial redrawing, you could instead draw to a framebuffer object. Then you can use glScissor() as much as you want. If you have many things that are static and only a few that are dynamic, you could have multiple framebuffer objects, draw the static contents only once, and continuously update the framebuffer containing the dynamic content.
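If you do go the glScissor() route, one detail that is easy to get wrong is the coordinate system: UI toolkits usually measure y from the top, while the glScissor box is specified from the lower-left corner. The sketch below only computes the box (it makes no OpenGL calls), and `union_rect` and `to_scissor_box` are hypothetical helper names.

```python
def union_rect(rects):
    """Bounding box of (x, y, w, h) rectangles given in top-left-origin UI coordinates."""
    left = min(x for x, _, _, _ in rects)
    top = min(y for _, y, _, _ in rects)
    right = max(x + w for x, _, w, _ in rects)
    bottom = max(y + h for _, y, _, h in rects)
    return left, top, right - left, bottom - top


def to_scissor_box(rect, framebuffer_height):
    """Convert a top-left-origin rect to (x, y, width, height) with a lower-left origin."""
    x, y, w, h = rect
    return x, framebuffer_height - (y + h), w, h
```

For two caret positions `(10, 20, 2, 16)` and `(10, 40, 2, 16)` in a 600-pixel-high framebuffer, the dirty area is `(10, 20, 2, 36)` and the scissor box is `(10, 544, 2, 36)`.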
However (and I can't emphasize this enough) benchmark and check if this is even needed. Having two framebuffer objects could be more expensive than just always redrawing everything. The same goes for say having a buffer for each rectangle, in contrast to packing all rectangles in a single buffer.
Lastly to give an example let's take NanoGUI (a minimalistic GUI library for OpenGL). NanoGUI continuously redraws the screen.
The problem with not just continuously redrawing the screen is that you now need a system for issuing a redraw. Calling setText() on a label now needs to call back and tell the window to redraw. But what if the parent panel the label is added to isn't visible? Then setText() just issued a redundant redraw of the screen.
The point I'm trying to make is that a system for issuing redraws is more prone to errors. Thus, unless continuously redrawing is actually a problem, it is the better starting point.
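To show how much bookkeeping even a tiny redraw-request system needs, here is a hedged sketch. It is not NanoGUI's API; `Widget`, `Label`, `request_redraw` and the `redraw_requests` counter are all made up for illustration.

```python
class Widget:
    def __init__(self, parent=None, visible=True):
        self.parent = parent
        self.visible = visible
        self.redraw_requests = 0  # only meaningful on the root widget

    def effectively_visible(self):
        """A widget is drawn only if it and all of its ancestors are visible."""
        node = self
        while node is not None:
            if not node.visible:
                return False
            node = node.parent
        return True

    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def request_redraw(self):
        # Skip the request when nothing onscreen can have changed.
        if self.effectively_visible():
            self.root().redraw_requests += 1


class Label(Widget):
    def __init__(self, text="", **kwargs):
        super().__init__(**kwargs)
        self.text = text

    def set_text(self, text):
        if text != self.text:
            self.text = text
            self.request_redraw()
```

Even this is incomplete: changing `visible` would itself have to request a redraw, and every new property needs the same care, which is exactly the kind of error-prone plumbing described above.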

Reusing the previous back buffer on WM_PAINT

If the content of the last frame isn't changed on receiving WM_PAINT, is it possible to simply direct the operating system to redraw the window using the old back buffer instead of redrawing the whole scene again to the new back buffer and swapping it?
No. There is no such "backbuffer". And when drawing occurs, you don't know which areas may be covered by other windows. The clipping area isn't a really good indicator.
The only thing you know is that such areas need to be redrawn. Each window cares about its own client area. If you want to buffer something, you have to do it on your own.
The reason is simple: imagine you have hundreds of windows. Holding a buffer for each window is inefficient when just a few on top are visible. So the Windows designers decided not to store any window content and instead just notify the windows on top to redraw themselves.
OK. Since we got the DWM (Desktop Window Manager), things have changed a lot. But the principle is still: you are responsible for drawing. If you want to buffer something, you have to do it on your own.
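"Doing it on your own" could look roughly like the sketch below: the application keeps its own copy of the last frame and re-renders only when something has been invalidated. This is a language-neutral illustration in Python, not Win32 code; `CachedScene`, `invalidate` and `on_paint` are hypothetical names, and in a real WM_PAINT handler the returned buffer would be blitted to the paint DC.

```python
class CachedScene:
    def __init__(self, render):
        self._render = render   # callable that produces a fresh frame
        self._buffer = None     # application-owned copy of the last frame
        self._dirty = True
        self.render_count = 0

    def invalidate(self):
        """Call whenever the scene content changes."""
        self._dirty = True

    def on_paint(self):
        """Return the frame to present, re-rendering only if the content changed."""
        if self._dirty or self._buffer is None:
            self._buffer = self._render()
            self.render_count += 1
            self._dirty = False
        return self._buffer
```

Repeated paint requests with no intervening `invalidate()` reuse the cached frame, so the expensive scene render runs once per change rather than once per WM_PAINT.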

A skinning engine in Windows: draw “dirty” regions only or the whole window at once?

I want to make a skinning engine capable of drawing custom-shaped windows with alpha blending. That is, it'll use layered windows (UpdateLayeredWindow). A typical window will contain, on top of its background, a couple of dozen other bitmaps ranging from 10×10 to, say, 300×150 pixels. In the worst case most of these elements will have smooth animation at up to 30 fps. Everything will be alpha-blended and I am going to use Direct2D for this (yes, I know older Windows versions don't support it). In general, Winamp's modern skin engine is the closest example.
Given all this, and taking into account the performance of modern PCs, can I just redraw the whole window every single frame, or do I have to constrain drawing to some sort of clip rectangle?
D2D requires you to render in response to WM_PAINT messages.
Honestly, use the IAnimation interface and just let D2D and Windows worry about how often to redraw. I will let you know, though, that Winamp is done with Adobe AIR, and layered windows with D2D cause issues. (I kind of think you have to use a DXGI render target, but with the window being layered it needs a DC to be returned to an end paint call so it can update its alpha channel.)
I have some experience with this.
If you need to support Windows XP, using UpdateLayeredWindow is the only choice available for solving this problem. The documentation for this call says it copies the whole bitmap to the screen each time it is called and this bottleneck showed up in my benchmarking as the real limiting factor. If your window is 300x300 you pay that price on every update, even if you are careful to modify only a couple of pixels. It would be very easy to over-optimize the rendering side for no real benefit so implement something simple, measure, and then decide if you need to optimize.
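To put a rough number on that per-update price, here is a back-of-the-envelope sketch. It assumes a 32-bit (4 bytes per pixel) bitmap with per-pixel alpha and counts only the bytes of the bitmap itself, not any driver or composition overhead; the function names are hypothetical.

```python
def layered_update_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Bytes in the full bitmap that is copied on every update of the window."""
    return width * height * bytes_per_pixel


def layered_update_bytes_per_second(width: int, height: int, fps: int) -> int:
    """Bitmap bytes copied per second at a given update rate."""
    return layered_update_bytes(width, height) * fps
```

A 300x300 window is 360,000 bytes per update, or 10,800,000 bytes per second at 30 fps, regardless of how few pixels changed. Whether that matters on a given machine is something to measure, as suggested above.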
If you can drop support for Windows XP then you can avoid UpdateLayeredWindow completely and use DwmExtendFrameIntoClientArea to create the same effect as a layered window. You'll write less code, avoid the UpdateLayeredWindow bottleneck, and D2D will be easier to work with.

How Windows (or other OSes) update client's background area?

Or to ask it another way, how OnEraseBkgnd() works?
I'm building a custom control and I hit this problem. Children are rectangles, as usual. I had to disable OnEraseBkgnd() and I use only OnPaint(). What I need is to efficiently clear the area behind the children, without flickering. Techniques like using back buffers are not an option.
Edit: I am very interested in the algorithm that's under the hood of OnEraseBkgnd(). But any helpful answer will also be accepted.
Usually in Windows, the easiest (but not most effective) means of reducing flicker is to turn off the WM_ERASEBKGND notification handling. This is because if you erase the background in the notification handler and then paint the window in the WM_PAINT handler, there is a short delay between the two, and this delay is seen as flicker.
Instead, if you do all of the erasing and drawing in the WM_PAINT handler, you will tend to see a lot less flicker, because the delay between the two is reduced. You will still see some flicker, especially when resizing, because there is still a small delay between the two actions and you cannot always finish all of the drawing before the next occurrence of the vertical blanking interrupt for the monitor. If you cannot use double buffering, then this is probably the most effective method you will be able to use.
You can get better drawing performance by following most of the usual recommendations around client area invalidation: do not invalidate the whole window unless you really need to, and try to invalidate only the areas that have changed. Also, you should use the BeginDeferWindowPos family of functions (BeginDeferWindowPos, DeferWindowPos, EndDeferWindowPos) if you are updating the positions of a collection of child windows at the same time.
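As an illustration of "invalidate only the areas that have changed", the sketch below works out which rectangles need invalidating when a child moves. It is plain geometry with hypothetical names; rectangles are (left, top, right, bottom) tuples like a Win32 RECT, and merging overlapping rectangles into one bounding box is a simplification chosen for this sketch.

```python
def intersects(a, b):
    """True if two (left, top, right, bottom) rectangles overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def rects_to_invalidate(old_rect, new_rect):
    """Areas to repaint when a child moves from old_rect to new_rect."""
    if old_rect == new_rect:
        return []  # nothing changed, nothing to repaint
    if intersects(old_rect, new_rect):
        # Overlapping positions: one bounding box avoids painting the overlap twice.
        return [(min(old_rect[0], new_rect[0]), min(old_rect[1], new_rect[1]),
                 max(old_rect[2], new_rect[2]), max(old_rect[3], new_rect[3]))]
    return [old_rect, new_rect]
```

Each returned rectangle would be passed to an InvalidateRect call instead of invalidating the entire client area.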

Why Direct3D application performs better in full screen mode?

The performance of a Direct3D application seems to be significantly better in full screen mode compared to windowed mode. What are the technical reasons behind this?
I guess it has something to do with the fact that a full screen application can gain exclusive control for the display. But why the application cannot gain exclusive control for part of the screen (i.e. window) and have the same performance benefits?
Here are the cliff notes on how things work underneath.
The monitor screen always needs to be associated with a so-called primary surface to be able to display anything, i.e. the video card can only scan out of one surface in video memory.
When an application is fullscreen (and everything was set up correctly to enable flipping), the primary surface is just one of the application's backbuffers, and it is flipped to another backbuffer every frame. This is the most efficient way of presenting on the screen, but it requires the application to own the entire monitor area (i.e. the entire primary surface).
When there's no fullscreen application and DWM is off, the primary surface is owned by the OS, and every windowed application performs a blit from its backbuffer to the primary surface. This blit takes some GPU time to complete (as do the blits from the other applications visible on the screen), so it's not as efficient as fullscreen presentation. XP worked that way.
When DWM is composing the screen, things get even more complicated.
Here, DWM owns the primary surface and needs to draw application windows there. To make that possible, every window has an associated surface holding its contents, called a redirection surface (which allows DWM to enable window ghosting, glass effects, and all that good stuff). Every time a D3D application issues a frame, it adds a blit to its redirection surface.
That way, several blits need to happen: a blit to the redirection surface by the app, then a blit from the redirection surface to the primary surface by DWM, which is, again, some overhead compared to fullscreen.
Note all of that additional work is on the GPU, so it doesn't affect CPU performance.
Stuff to read further:
http://blogs.msdn.com/greg_schechter/archive/2006/03/19/555087.aspx
http://blogs.msdn.com/greg_schechter/archive/2006/05/02/588934.aspx
http://blogs.msdn.com/greg_schechter/archive/2006/03/05/544314.aspx
There's a bit on MSDN that says full screen mode uses buffer flipping, if set up correctly, as opposed to blitting. It makes sense.
Of course you can (and in a way, do) give exclusive control for part of the screen to an application, but what happens to the rest of the screen? You still have to blit, do occlusion checking, etc. on the rest of the windows, and I think that's what causes the performance hit.
I'll add to @aib's answer that the rest of the screen is being managed by the OS. So, if anything else needs to be drawn/worked upon simultaneously, there has to be a performance hit.
For example, if you have a video playing in Windows Media Player in one window, then start Civilization in another, when Civ starts doing its fancy graphics, it will need to share screen space with everything else (like the video).
Whereas if the DirectX app has the full-screen, everything else might be "updating" or "playing", but not being drawn.
Basically, the video hardware is completely dedicated to the exclusive mode application.
There is no contention for video resources (pipeline, texture memory, etc...)
In particular, texture upload can be a big bottleneck. The less you have to do it (because you have it all), the better.
