How many buffers does DWM have?

When DWM is turned on, applications render into back buffers, and when something in a buffer changes, DWM needs to compose the desktop screen again out of these buffers. Does that mean DWM has its own back buffer that is used for this composition job before being swapped with the front buffer?
Does that mean applications like video players are, in effect, writing their output twice: once to the window's back buffer, and again when that buffer is copied to DWM's back buffer?
Isn't that much less efficient than in the pre-DWM days?

Related

How does windowed rendering work without a compositor?

On systems with a compositor, a windowed application must render into an off-screen buffer, which is then submitted to the compositor for composition and presentation. How does displaying a windowed desktop work without a compositor?
Suppose we have a 3D application using double buffering to render into a window, fully redrawing on each frame. This is my understanding of the process of presenting a new frame:
1. The application submits a frame buffer for presentation.
2. The compositor receives the buffer.
3. Later, the compositor composites all the windows into the screen's back buffer.
4. The compositor swaps the screen's buffers.
What happens after step 1 if there is no compositor? (For example, on Windows without DWM, or on an X server.) Clearly, something is laying out the windows and making sure they render in the correct position and order. How is that different from compositing?
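For concreteness, here is a minimal sketch of the application side of step 1, assuming (purely as an illustration) a double-buffered D3D9 device; the same call submits the frame whether or not a compositor is running, and only what happens to the submitted buffer afterwards differs:

```cpp
#include <d3d9.h>

// Sketch: the application's part of presentation ends at Present(). With a
// compositor, the back buffer is handed to DWM for composition; without
// one, the runtime/driver blits it to the primary surface (windowed) or
// flips it (exclusive fullscreen).
void RenderAndPresent(IDirect3DDevice9 *device)
{
    device->BeginScene();
    // ... fully redraw the frame into the current back buffer ...
    device->EndScene();

    // Submit the back buffer for presentation.
    device->Present(NULL, NULL, NULL, NULL);
}
```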

Reusing the previous back buffer on WM_PAINT

If the content of the last frame isn't changed on receiving WM_PAINT, is it possible to simply direct the operating system to redraw the window using the old back buffer instead of redrawing the whole scene again to the new back buffer and swapping it?
No, there is no such back buffer. And when drawing occurs, you don't know which areas may be covered by other windows; the clipping region isn't a reliable indicator.
The only thing you know is that such areas need to be redrawn. Each window cares only about its own client area. If you want to buffer something, you have to do it yourself.
The reason is simple: imagine you have hundreds of windows. Holding a buffer for each window would be inefficient when only a few on top are visible. So the Windows designers decided not to store any window contents and simply to notify the windows on top to redraw themselves.
OK, since we got the DWM (Desktop Window Manager), things have changed a lot. But the principle remains: you are responsible for drawing. If you want to buffer something, you have to do it yourself.
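A rough sketch of what "buffer it on your own" can look like in plain Win32; g_backBuffer, g_sceneDirty, and RenderScene are hypothetical application state and helpers, not anything the OS provides:

```cpp
#include <windows.h>

HBITMAP g_backBuffer;        // cached off-screen bitmap (CreateCompatibleBitmap)
bool    g_sceneDirty = true; // has the content changed since the last render?

void RenderScene(HDC target); // hypothetical: draws the scene into 'target'

LRESULT OnPaint(HWND hwnd)   // called from the window procedure on WM_PAINT
{
    PAINTSTRUCT ps;
    HDC hdc = BeginPaint(hwnd, &ps);

    HDC memDC = CreateCompatibleDC(hdc);
    HBITMAP old = (HBITMAP)SelectObject(memDC, g_backBuffer);

    if (g_sceneDirty)        // expensive path only when the content changed
    {
        RenderScene(memDC);
        g_sceneDirty = false;
    }

    // Cheap path on every WM_PAINT: copy the cached pixels into the
    // invalidated rectangle.
    BitBlt(hdc, ps.rcPaint.left, ps.rcPaint.top,
           ps.rcPaint.right - ps.rcPaint.left,
           ps.rcPaint.bottom - ps.rcPaint.top,
           memDC, ps.rcPaint.left, ps.rcPaint.top, SRCCOPY);

    SelectObject(memDC, old);
    DeleteDC(memDC);
    EndPaint(hwnd, &ps);
    return 0;
}
```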

Is it possible to prevent tearing artifacts when drawing using GDI on a window with DWM composition?

I am drawing an animation using double-buffered GDI on a window, on a system where DWM composition is enabled, and seeing clearly visible tearing onscreen. Is there a way to prevent this?
Details
The animation takes the same image and moves it right to left across the screen; the number of pixels to move is computed from the elapsed time (current time minus the animation's start time, measured with timeGetTime at 1 ms resolution) as a fraction of the total duration, which is then applied to the whole window width. The animation draws in a loop without processing application messages; it calls the (VCL library) method Repaint, which internally invalidates the window and then calls UpdateWindow, calling directly into the message procedure with WM_PAINT. The VCL implementation of the paint handler uses BeginBufferedPaint. Painting is itself double-buffered.
The aim of this is to have as high a frame rate as possible for a smooth animation across the screen. (Drawing uses double buffering to remove flicker and to ensure a whole image or frame is onscreen at any one time. It invalidates and updates directly via the message procedure, without doing other message processing. Painting is implemented using modern techniques (e.g. BeginBufferedPaint) for Aero composition.) Within this, painting is done in a couple of BitBlt calls (one for the left side of the animation, i.e. what's moving offscreen, and one for the right side, i.e. what's moving onscreen).
When watching the animation, there is clearly visible tearing. This occurs on Windows Vista, 7 and 8.1 on multiple systems with different graphics cards.
My approach to handling this has been to reduce the rate at which the code draws, or to try to wait for VSync before painting again. This might be the wrong approach entirely, so the answer to this question might be "Do something else completely: X". If so, great :)
(What I'd really like is a way to ask the DWM to compose / use only fully-painted frames for this specific window.)
I've tried the following approaches, none of which removes all visible tearing. So the question is: is it possible to avoid tearing when using DWM composition, and if so, how?
Approaches tried:
Getting the monitor refresh rate via GetDeviceCaps(Application.MainForm.Handle, VREFRESH) and sleeping for one refresh period (1000 / refresh rate milliseconds). Slightly improved over painting as fast as possible, but that may be wishful thinking; perceptually the animation rate is slightly less smooth. (Tweaks: a normal Sleep, and a high-resolution spin-wait using timeGetTime.)
Using DwmSetPresentParameters to try to limit updating to the same rate at which the code draws. (Variations: many buffers (cBuffer = 8): no visible effect; specifying a source rate of monitor refresh rate / 1 and sleeping using the code above: the same as the sleeping approach alone; specifying one refresh per frame, or 10, etc.: no visible effect; changing the source frame coverage: no visible effect.)
Using DwmGetCompositionTimingInfo in a variety of ways:
While cFramesPending > 0, spin;
Get cFrame (frame composed) and spin while this number doesn't change;
Get cFrameDisplayed and spin while this doesn't change;
Calculating a target time by adding qpcVBlank + qpcRefreshPeriod, then spinning while QueryPerformanceCounter returns a time less than this (sketched in the code below).
All these approaches have also been varied by painting, then spinning/sleeping before painting again; or the reverse: sleeping and then painting.
Few seem to have any visible effect; what effect there is is hard to quantify, and may just be the result of a lower frame rate. None prevents tearing, i.e. none makes the DWM compose the window from a "whole" copy of the contents of the window's DC.
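For reference, a minimal sketch of the qpcVBlank + qpcRefreshPeriod variant from the list above, assuming DWM composition is enabled (and assuming that passing NULL to DwmGetCompositionTimingInfo queries desktop-wide timing):

```cpp
#include <windows.h>
#include <dwmapi.h>
#pragma comment(lib, "dwmapi.lib")

// Compute the QPC time of the next vblank from DWM's composition timing
// info and spin until QueryPerformanceCounter reaches it.
void WaitForNextVBlank()
{
    DWM_TIMING_INFO info = {};
    info.cbSize = sizeof(info);
    if (FAILED(DwmGetCompositionTimingInfo(NULL, &info)))
        return;                      // composition off; nothing to wait for

    const LONGLONG target = (LONGLONG)(info.qpcVBlank + info.qpcRefreshPeriod);

    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    while (now.QuadPart < target)    // spin (Sleep(0) would yield instead)
        QueryPerformanceCounter(&now);
}
```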
Advice appreciated :)
Since you're using BitBlt, make sure your DIBs are 4 bytes/pixel. With 3 bytes/pixel, GDI is horribly slow while DWM is running; that could be the source of your tearing. Another BitBlt issue I've run into: if your DIB is somewhat large, the BitBlt call may take an unexpectedly long time. Splitting it up into smaller calls that each draw only a portion of the data might help. Both of these helped in my case, but only because BitBlt itself was running too slowly, leading to the video artifacts.
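If you construct the off-screen buffer yourself, a sketch of the 4-bytes-per-pixel setup using CreateDIBSection (the function name and layout here are illustrative):

```cpp
#include <windows.h>

// Create a 32 bpp top-down DIB section to draw into, so the BitBlt to the
// window DC avoids any 24-to-32 bpp conversion under DWM.
HBITMAP Create32bppBuffer(HDC hdc, int width, int height, void **bits)
{
    BITMAPINFO bmi = {};
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;  // negative height = top-down rows
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;       // 4 bytes per pixel
    bmi.bmiHeader.biCompression = BI_RGB;
    return CreateDIBSection(hdc, &bmi, DIB_RGB_COLORS, bits, NULL, 0);
}
```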

What happens during a display mode change?

What happens during a display mode change (resolution, color depth) on an ordinary computer? (typical desktops and laptops)
It might not be trivial, since video cards differ so much, but a few things are common to all of them:
The screen goes black (understandable since the signal is turned off)
It takes many seconds for the signal to return with the new mode
and if it is under D3D or GL:
The graphics device is lost and all VRAM objects must be reloaded, making the mode change take even longer
Can someone explain the underlying nature of this, and specifically why a display mode change is not a trivial reallocation of the backbuffer(s) and takes such a "long" time?
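For context, this is roughly the call whose latency the question is about; a sketch using the GDI API (the mode values are just an example):

```cpp
#include <windows.h>

// Request an 800x600, 32 bpp mode, e.g. SetDisplayMode(800, 600, 32).
// Everything the answer below describes (reprogramming the scan-out
// hardware, the display resynchronizing) happens behind this one call.
bool SetDisplayMode(DWORD width, DWORD height, DWORD bpp)
{
    DEVMODE dm = {};
    dm.dmSize       = sizeof(dm);
    dm.dmPelsWidth  = width;
    dm.dmPelsHeight = height;
    dm.dmBitsPerPel = bpp;
    dm.dmFields     = DM_PELSWIDTH | DM_PELSHEIGHT | DM_BITSPERPEL;
    return ChangeDisplaySettings(&dm, CDS_FULLSCREEN) == DISP_CHANGE_SUCCESSFUL;
}
```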
The only thing that actually changes are the settings of the so-called RAMDAC (a digital-to-analog converter directly attached to the video RAM); today, with digital connections, it's more like a RAMTX (a DVI/HDMI/DisplayPort transmitter attached to the video RAM). Veteran DOS graphics programmers probably remember the fights between the RAMDAC, the specification, and the woes of their own code.
It actually doesn't take seconds for the signal to return. The switch itself is quick, but most display devices take their time to synchronize with the new signal parameters. With well-written drivers the change happens almost immediately, between vertical blanks. A few years ago, when displays were, err, dumber and analogue, one could see the picture go berserk for a short moment after changing the video mode settings, until the display resynchronized (maybe I should take a video of this while I still own equipment capable of it).
Since what's actually going on is just a change of RAMDAC settings, no data is necessarily lost either, as long as the basic parameters stay the same: bits per pixel, number of components per pixel, and pixel stride. In fact, OpenGL contexts usually don't lose their data across a video mode change. Of course the visible framebuffer layout changes, but that also happens when moving a window around.
DirectX Graphics is a bit of a different story, though. There is exclusive device access, and whenever you switch between Direct3D fullscreen mode and the regular desktop, all graphics objects are swapped out; that's why DirectX Graphics is so laggy when switching between a game and the Windows desktop.
If the pixel data format changes, a full reinitialization of the visible framebuffer is usually required, but today's GPUs are exceptionally good at mapping heterogeneous pixel formats onto a target framebuffer, so no delays are necessary there either.

Why does a Direct3D application perform better in full-screen mode?

The performance of a Direct3D application seems to be significantly better in full screen mode compared to windowed mode. What are the technical reasons behind this?
I guess it has something to do with the fact that a full-screen application can gain exclusive control of the display. But why can't the application gain exclusive control over part of the screen (i.e. a window) and get the same performance benefits?
Here are the cliff notes on how things work underneath.
The monitor always needs to be associated with a so-called primary surface to be able to display anything; i.e., the video card can only scan out of one surface in video memory.
When an application is fullscreen (and everything was set up correctly to enable flipping), the primary surface is just one of the application's back buffers, and it is flipped to another back buffer every frame. This is the most efficient way of presenting to the screen, but it requires the application to own the entire monitor area (i.e. the entire primary surface).
When there's no fullscreen application and DWM is off, the primary surface is owned by the OS, and every windowed application performs a blit from its back buffer to the primary surface. This blit takes some GPU time to complete (as do the blits from the other applications visible on screen), so it's not as efficient as fullscreen presentation. XP worked that way.
When DWM is composing the screen, things get even more complicated.
Here, DWM owns the primary surface and needs to draw the application windows onto it. To make that possible, every window has an associated surface holding its contents, called the redirection surface (which is what allows DWM to do window ghosting, glass effects, and all that good stuff). Every time a D3D application presents a frame, it adds a blit to its redirection surface.
That way, several blits need to happen: a blit to the redirection surface by the app, then a blit from the redirection surface to the primary by DWM; again, extra overhead compared to fullscreen.
Note that all of this additional work happens on the GPU, so it doesn't affect CPU performance.
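In D3D9 terms, the flip-versus-blit distinction above comes down to the present parameters at device creation time; a rough sketch (the mode values are placeholders and error handling is omitted):

```cpp
#include <d3d9.h>

// Windowed = FALSE is what lets the primary surface become one of the
// app's back buffers (flipping); with Windowed = TRUE, every Present is
// instead a blit to the window's redirection surface.
IDirect3DDevice9 *CreateFullscreenDevice(IDirect3D9 *d3d, HWND hwnd)
{
    D3DPRESENT_PARAMETERS pp = {};
    pp.Windowed         = FALSE;
    pp.BackBufferWidth  = 1920;                // must match a supported mode
    pp.BackBufferHeight = 1080;
    pp.BackBufferFormat = D3DFMT_X8R8G8B8;
    pp.SwapEffect       = D3DSWAPEFFECT_FLIP;  // page flip, no copy
    pp.BackBufferCount  = 2;
    pp.hDeviceWindow    = hwnd;

    IDirect3DDevice9 *device = NULL;
    d3d->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hwnd,
                      D3DCREATE_HARDWARE_VERTEXPROCESSING, &pp, &device);
    return device;
}
```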
Stuff to read further:
http://blogs.msdn.com/greg_schechter/archive/2006/03/19/555087.aspx
http://blogs.msdn.com/greg_schechter/archive/2006/05/02/588934.aspx
http://blogs.msdn.com/greg_schechter/archive/2006/03/05/544314.aspx
There's a bit on MSDN that says full screen mode uses buffer flipping, if set up correctly, as opposed to blitting. It makes sense.
Of course you can (and in a way, do) give exclusive control over part of the screen to an application, but what happens to the rest of the screen? You still have to blit, do occlusion checking, etc. for the rest of the windows, and I think that's what causes the performance hit.
I'll add to #aib's answer that the rest of the screen is being managed by the OS. So, if anything else needs to be drawn/worked upon simultaneously, there has to be a performance hit.
For example, if you have a video playing in Windows Media Player in one window and then start Civilization in another, when Civ starts doing its fancy graphics it will need to share screen space with everything else (like the video).
Whereas if the DirectX app has the full screen, everything else might still be "updating" or "playing", but not being drawn.
Basically, the video hardware is completely dedicated to the exclusive mode application.
There is no contention for video resources (pipeline, texture memory, etc...)
In particular, texture upload can be a big bottleneck. The less you have to do it (because you have it all), the better.
