Could somebody provide an example of an efficient way to work with pixels using Direct2D?
For example, how can I swap all green pixels (RGB = 0x00FF00) with red pixels (RGB = 0xFF0000) on a render target? What is the standard approach? Is it possible to use ID2D1HwndRenderTarget for that? Here I assume using some kind of hardware acceleration. Should I create a different object for direct pixels manipulations?
Using DirectDraw I would use BltFast method on the IDirectDrawSurface7 with logical operation. Is there something similar with Direct2D?
Another task is to generate complex images dynamically where each point location and color is a result of a mathematical function. For the sake of an example let's simplify everything and draw Y = X ^ 2. How to do that with Direct2D? Ultimately I'm going to need to draw complex functions but if somebody could give me a simple example for Y = X ^ 2.
First, it helps to think of ID2D1Bitmap as a "device bitmap". It may or may not live in local, CPU-addressable memory, and it doesn't give you any convenient (or at least fast) way to read/write the pixels from the CPU side of the bus. So approaching from that angle is probably the wrong approach.
What I think you want is a regular WIC bitmap, IWICBitmap, which you can create with IWICImagingFactory::CreateBitmap(). From there you can call Lock() to get at the buffer, and then read/write using pointers and do whatever you want. Then, when you need to draw it on-screen with Direct2D, use ID2D1RenderTarget::CreateBitmap() to create a new device bitmap, or ID2D1Bitmap::CopyFromMemory() to update an existing device bitmap. You can also render into an IWICBitmap by making use of ID2D1Factory::CreateWicBitmapRenderTarget() (not hardware accelerated).
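As a rough illustration of the "read/write using pointers" step, here is a minimal sketch in portable C++. It assumes the bitmap was created in a 32-bit BGRA format, and that `pixels` and `stride` are the values you would get from IWICBitmapLock::GetDataPointer() and IWICBitmapLock::GetStride(). The function name ReplaceGreenWithRed is made up for this example, and it interprets "swap" as replacing pure green with pure red.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Replace every pure-green pixel with pure red in a 32bpp BGRA buffer.
// Assumption: byte order per pixel is B, G, R, A, and rows are `stride`
// bytes apart (rows may carry padding, so never assume stride == width * 4).
// Returns the number of pixels that were changed.
std::size_t ReplaceGreenWithRed(std::uint8_t* pixels, std::size_t width,
                                std::size_t height, std::size_t stride)
{
    assert(pixels != nullptr);
    assert(stride >= width * 4);
    std::size_t changed = 0;
    for (std::size_t y = 0; y < height; ++y) {
        std::uint8_t* row = pixels + y * stride;
        for (std::size_t x = 0; x < width; ++x) {
            std::uint8_t* p = row + x * 4;
            // Match B = 0x00, G = 0xFF, R = 0x00; alpha is left untouched.
            if (p[0] == 0x00 && p[1] == 0xFF && p[2] == 0x00) {
                p[1] = 0x00;  // clear green
                p[2] = 0xFF;  // set red
                ++changed;
            }
        }
    }
    return changed;
}
```

After the lock is released, the modified buffer can be pushed to the device bitmap as described above.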
You will not get hardware acceleration for these types of operations. The updated Direct2D in Win8 (should also be available for Win7 eventually) has some spiffy stuff for this but it's rather complex looking.
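For the Y = X ^ 2 part of the question, the same CPU-side approach applies: compute the pixels into a plain buffer, then upload it. The following is only a sketch under assumptions of my own: the pixels are 32-bit values read as 0xAARRGGBB (which is BGRA byte order on little-endian machines), x runs from -xMax to +xMax across the width, and y runs from 0 at the bottom row to xMax^2 at the top row. PlotParabola is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plot y = x^2 into a width*height buffer, one pixel per column.
// The background is opaque black; `color` is written where the curve passes.
std::vector<std::uint32_t> PlotParabola(int width, int height, double xMax,
                                        std::uint32_t color)
{
    assert(width > 1 && height > 1 && xMax > 0.0);
    std::vector<std::uint32_t> pixels(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height),
        0xFF000000u);
    const double yMax = xMax * xMax;
    for (int col = 0; col < width; ++col) {
        // Map the column index to x in [-xMax, xMax].
        const double x = -xMax + 2.0 * xMax * col / (width - 1);
        const double y = x * x;
        // Map y in [0, yMax] to a row index; row 0 is the top of the image.
        const long row = std::lround((1.0 - y / yMax) * (height - 1));
        pixels[static_cast<std::size_t>(row) * width + col] = color;
    }
    return pixels;
}
```

One pixel per column leaves gaps where the curve is steep, so for a real plot you would probably connect consecutive points instead. The resulting buffer is what you would hand to ID2D1Bitmap::CopyFromMemory().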
Rick's answer talks about the methods you can use if you don't care about losing hardware acceleration. I'm focusing on how to accomplish this using a substantial amount of GPU acceleration.
In order to keep your rendering hardware accelerated and to get the best performance, you are going to want to switch from ID2D1HwndRenderTarget to the newer ID2D1Device and ID2D1DeviceContext interfaces. It honestly doesn't add that much more logic to your code and the performance benefits are substantial. It also works on Windows 7 with the Platform Update. To summarize the process:
1. Create a DXGI factory when you create your D2D factory.
2. Create a D3D11 device and a D2D device to match.
3. Create a swap chain using your DXGI factory and the D3D device.
4. Ask the swap chain for its back buffer and wrap it in a D2D bitmap.
5. Render like before, between calls to BeginDraw() and EndDraw(). Remember to unbind the back buffer and destroy the D2D bitmap wrapping it!
6. Call Present() on the swap chain to see the results.
7. Repeat from step 4.
Once you've done that, you have unlocked a number of possible solutions. Probably the simplest and most performant way to solve your exact problem (swapping color channels) is to use the color matrix effect, as one of the other answers mentioned. It's important to recognize, however, that you need the newer ID2D1DeviceContext interface rather than ID2D1HwndRenderTarget to get this. There are lots of other effects that can do more complicated operations if you so choose. Here are some of the most useful ones for simple pixel manipulation:
Color matrix effect
Arithmetic operation
Blend operation
For generally solving the problem of manipulating the pixels directly without dropping hardware acceleration or doing tons of copying, there are two options. The first is to write a pixel shader and wrap it in a completely custom D2D effect. It's more work than just getting the pixel buffer on the CPU and doing old-fashioned bit mashing, but doing it all on the GPU is substantially faster. The D2D effects framework also makes it super simple to reuse your effect for other purposes, combine it with other effects, etc.
For those times when you absolutely have to do CPU pixel manipulation but still want a substantial degree of acceleration, you can manage your own mappable D3D11 textures. For example, you can use staging textures if you want to asynchronously manipulate your texture resources from the CPU. There is another answer that goes into more detail. See ID3D11Texture2D for more information.
The specific issue of swapping all green pixels with red pixels can be addressed via ID2D1Effect as of Windows 8 and Platform Update for Windows 7.
More specifically, Color matrix effect.
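To show what the color matrix does to each pixel, here is a CPU-side sketch of the arithmetic in portable C++. It is not the Direct2D API itself. It assumes the documented convention that the effect multiplies the row vector [R G B A 1] by a 5x4 matrix (the layout of D2D1_MATRIX_5X4_F), and it ignores clamping and alpha-mode handling. The names Color, Matrix5x4, ApplyColorMatrix and SwapRedGreenMatrix are invented for illustration.

```cpp
#include <array>
#include <cassert>

using Color = std::array<float, 4>;                     // R, G, B, A in [0, 1]
using Matrix5x4 = std::array<std::array<float, 4>, 5>;  // rows: R, G, B, A, offset

// Computes [R G B A 1] * m for one pixel.
Color ApplyColorMatrix(const Color& in, const Matrix5x4& m)
{
    for (float c : in) {
        assert(c >= 0.0f && c <= 1.0f);  // inputs are expected to be normalized
    }
    Color out{};
    for (int col = 0; col < 4; ++col) {
        float v = m[4][col];  // offset row, multiplied by the implicit 1
        for (int row = 0; row < 4; ++row) {
            v += in[row] * m[row][col];
        }
        out[col] = v;
    }
    return out;
}

// A matrix that exchanges the red and green channels and leaves
// blue and alpha alone.
Matrix5x4 SwapRedGreenMatrix()
{
    Matrix5x4 m = {{
        {0.f, 1.f, 0.f, 0.f},  // input R feeds output G
        {1.f, 0.f, 0.f, 0.f},  // input G feeds output R
        {0.f, 0.f, 1.f, 0.f},  // blue passes through
        {0.f, 0.f, 0.f, 1.f},  // alpha passes through
        {0.f, 0.f, 0.f, 0.f},  // no constant offset
    }};
    return m;
}
```

With Direct2D the same twenty numbers would go into the effect's color matrix property (D2D1_COLORMATRIX_PROP_COLOR_MATRIX on an effect created from CLSID_D2D1ColorMatrix), and the GPU applies them to every pixel.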
So, Vulkan introduced subpasses, and OpenGL implements similar behaviour with ARM_framebuffer_fetch.
In the past, I have used framebuffer_fetch successfully for tonemapping post-effect shaders.
Back then the limitation was that one could only read the contents of the framebuffer at the location of the currently rendered fragment.
Now, what I wonder is whether there is by now any way in Vulkan (or even OpenGL ES) to read from multiple locations (for example, to implement a blur kernel) without the tiled hardware having to store to and load from RAM.
In theory I guess it should be possible: the first pass would just need to render slightly larger than the blur subpass, based on kernel size (so, for example, if the kernel size was 4 pixels, then the resolved tile would need to be 4 pixels smaller than the in-tile buffer size), and some pixels would have to be rendered redundantly (where tiles overlap).
Now, is there a way to do that?
I seem to recall having seen some Vulkan instruction related to subpasses that would allow defining the support size (which sounded like what I'm looking for now), but I can't recall where I saw that.
So my questions:
1. With Vulkan on a mobile tiled renderer architecture, is it possible to forward-render some geometry and then render a full-screen blur over it, all within a single in-tile pass (without the hardware having to store the result of the intermediate pass to RAM first and then load the texture from RAM when blurring)? If so, how?
2. If the answer to 1 is yes, can it also be done in OpenGL ES?
Short answer, no. Vulkan subpasses still have the 1:1 fragment-to-pixel association requirements.
Is there some way to store a "scene" in Direct2D on the GPU?
I'm looking for something like ID2D1Mesh (i.e. storing the resource in vector format, not as a bitmap) but where I can configure if the mesh/scene/resource should be rendered with anti-aliasing or not.
Rick is correct in that you can apply antialiasing at two different levels. Either through Direct2D or through Direct3D. You can do both but that’s pointless and would only waste resources and lead to poor results. Direct2D antialiasing is suitable if you want per-primitive geometry-aware antialiasing. Direct3D antialiasing is useful if you want to sacrifice a bit of quality for better overall performance in some scenarios.
The Direct2D 1.1 command list literally stores/records a list of drawing commands that can be played back against different targets. This may be what you’re after as it’s not rasterized. Conceptually it’s like storing a vector image in device memory. Command lists are somewhat limited in that you cannot modify the command list once created and resources being drawn may also not be changed, but it’s still quite handy nonetheless.
There is a way to get antialiasing with ID2D1Mesh, but it's non-trivial. You have to create the Direct3D device yourself and then use ID2D1Factory::CreateDxgiSurfaceRenderTarget(). This allows you to configure the multisampling/antialiasing settings of the D3D device directly, and then meshes play along just fine (in fact I think you'd just always tell Direct2D to use aliased rendering). I haven't done this myself, but there is a MSDN sample that shows how to do this. It's not for the faint of heart ... and in order to do software rendering you have to initialize a WARP device. It does work, however.
Also, in Direct2D 1.1 (Windows 8, or Windows 7 + Platform Update), you can use the ID2D1CommandList interface for record/playback stuff. I'm not sure if that's implemented as "compile to GPU" (à la mesh), or if it's just macros (record/playback of commands).
In Windows 8.1, Direct2D introduced geometry realizations, which let you store a tessellated version of the geometry and later render it back with or without anti-aliasing, just like you asked. These are highly recommended over the use of meshes. Command lists, while convenient, don't have the same caching abilities as creating and storing the geometry realizations yourself.
I've been using SDL to render graphics in C. I know there are several options to create graphics at the pixel level on Windows, including SDL and OpenGL. But how do these programs do it? Fine, I can use SDL. But I'd like to know what SDL is doing so I don't feel like an ignorant fool. Am I the only one slightly frustrated by the opaque layer of frosting on modern computers?
A short explanation as to how this is done on other operating systems would also be interesting, but I am most concerned with Windows.
Edit: Since this question seems to be somehow unclear, this is precisely what I want:
I would like to know how pixel level graphics manipulations (drawing on the screen pixel by pixel) works on Windows. What do libraries like SDL do with the operating system to allow this to happen. I can manipulate the screen pixel by pixel using SDL, so what magic happens in SDL to let me do this?
Windows has many graphics APIs. Some are layers built on top of others (e.g., GDI+ on top of GDI), and others are completely independent stacks (like the Direct3D family).
In an API like GDI, there are functions like SetPixel which let you change the value of a single pixel on the screen (or within a region of the screen that you have access to). But using SetPixel to set lots of pixels is generally slow.
If you were to build a photorealistic renderer, like a ray tracer, then you'd probably build up a bitmap in memory (pixel by pixel), and use an API like BitBlt that sends the entire bitmap to the screen at once. This is much faster.
But it still may not be fast enough for rendering something like video. Moving all that data from system memory to the video card memory takes time. For video, it's common to use a graphics stack that's closer to the low-level graphics drivers and hardware. If the graphics card can do the video decompression directly, then sending the compressed video stream to the card will be much more efficient than sending the decompressed data from system memory to the video card--and that's often the limiting factor.
But conceptually, it's the same thing: you're manipulating a bitmap (or texture or surface or raster or ...), but that bitmap lives in graphics memory, and you're issuing commands to the GPU to set the pixels the way you want, and then to display that bitmap at some portion of the screen (often with some sort of transformation).
Modern graphics processors actually run little programs--called shaders--that can (among other things) do calculations to determine the pixel values. The GPUs are optimized to do these types of calculations and can do many of them in parallel. But ultimately, it boils down to getting the pixel values into some sort of bitmap in video memory.
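To make the "build up a bitmap in memory, pixel by pixel" idea concrete, here is a small sketch in portable C++. It runs a per-pixel function over a buffer on the CPU, which is conceptually what a pixel shader does in parallel on the GPU. Everything here is an assumption for illustration: the 0xAARRGGBB pixel layout, and the names RenderToBitmap and RedGradient. The step that hands the finished buffer to the OS (for example a BitBlt from a DIB) is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Calls shade(x, y) once for every pixel and stores the results row by row.
// Each pixel depends only on its own coordinates, which is why a GPU can
// evaluate many of them at the same time.
template <typename Shader>
std::vector<std::uint32_t> RenderToBitmap(std::size_t width, std::size_t height,
                                          Shader shade)
{
    assert(width > 0 && height > 0);
    std::vector<std::uint32_t> pixels(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            pixels[y * width + x] = shade(x, y);
        }
    }
    return pixels;
}

// Example per-pixel function: an opaque horizontal gradient from black to red.
inline std::uint32_t RedGradient(std::size_t x, std::size_t width)
{
    const std::size_t red = (width > 1) ? x * 255 / (width - 1) : 0;
    return 0xFF000000u | (static_cast<std::uint32_t>(red) << 16);
}
```

A call such as RenderToBitmap(640, 480, [](std::size_t x, std::size_t) { return RedGradient(x, 640); }) fills the whole image in one go, after which a single blit puts it on screen.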
I want to make a skinning engine capable of drawing custom-shaped windows with alpha blending. That is, it'll use layered windows (UpdateLayeredWindow). A typical window will contain, on top of its background, a couple dozen other bitmaps ranging from 10×10 to, say, 300×150 pixels. In the worst case most of these elements will have smooth animation at up to 30 fps. Everything will be alpha-blended and I am going to use Direct2D for this (yes, I know older Windows versions don't support it). In general, Winamp's modern skin engine is the closest example.
Given all this, and taking into account the performance of modern PCs, can I just redraw the whole window every single frame, or do I have to restrict drawing to some sort of clip rectangle?
D2D requires you to render in response to WM_PAINT messages.
Honestly, use the IAnimation interface, and just let D2D and Windows worry about how often to redraw. I will let you know, though, that Winamp is done with Adobe AIR, and layered windows with D2D cause issues. (I kind of think you have to use a DXGI render target, but because the window is layered, it needs a DC to be returned to an EndPaint call so it can update its alpha channel.)
I have some experience with this.
If you need to support Windows XP, using UpdateLayeredWindow is the only choice available for solving this problem. The documentation for this call says it copies the whole bitmap to the screen each time it is called and this bottleneck showed up in my benchmarking as the real limiting factor. If your window is 300x300 you pay that price on every update, even if you are careful to modify only a couple of pixels. It would be very easy to over-optimize the rendering side for no real benefit so implement something simple, measure, and then decide if you need to optimize.
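To put a rough number on that copy cost, here is a back-of-the-envelope sketch. It assumes 4 bytes per pixel (32-bit color with alpha) and that every update copies the full surface, as described above. FullCopyBytesPerSecond is a name made up for this example, and the result is only the raw copy volume, not a measurement.

```cpp
#include <cassert>
#include <cstdint>

// Bytes copied per second when the whole 32-bit surface is pushed to the
// screen on every update, regardless of how few pixels actually changed.
std::uint64_t FullCopyBytesPerSecond(std::uint32_t width, std::uint32_t height,
                                     std::uint32_t updatesPerSecond)
{
    assert(width > 0 && height > 0);
    const std::uint64_t bytesPerPixel = 4;  // assumption: 32 bits per pixel
    return std::uint64_t{width} * height * bytesPerPixel * updatesPerSecond;
}
```

For the 300x300 window mentioned above, one update is 360,000 bytes, so 30 updates per second comes to 10,800,000 bytes per second. That volume alone is modest, so whether the copy is a real bottleneck still has to be measured, not assumed.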
If you can drop support for Windows XP then you can avoid UpdateLayeredWindow completely and use DwmExtendFrameIntoClientArea to create the same effect as a layered window. You'll write less code, avoid the UpdateLayeredWindow bottleneck, and D2D will be easier to work with.
I have written an anaglyph filter that mixes two images into one stereographic image. It is a fast routine that works with one pixel at a time.
Right now I'm using pointers to output each calculated pixel to a memory bitmap, then BitBlt that whole image onto the window.
This seems redundant to me. I'd rather copy each pixel directly to the screen, since my anaglyph routine is quite fast. Is it possible to bypass BitBlt and simply have the pointer point directly to wherever BitBlt would copy it to?
I'm sure it's possible, but you really really really don't want to do this. It's much more efficient to draw the entire pattern at once.
You can't draw directly to the screen in Windows because the graphics card memory isn't necessarily mapped in any sane order.
Bltting to the screen is amazingly fast.
Remember that you don't blt after each pixel, only when you want a new result to be shown. Even then, there's no point doing this faster than the refresh rate of your screen, which is probably 60 Hz.
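One way to act on that is a small gate in front of the blit. This is just a sketch under my own assumptions: the caller supplies the current time and a flag saying whether the image changed, and the minimum interval is whatever matches the display (roughly 16 ms for 60 Hz). PresentThrottle is a hypothetical helper, and the actual blit call is left to the caller.

```cpp
#include <cassert>
#include <chrono>

// Decides whether it is worth blitting now: only when the image has changed
// and at least one refresh interval has passed since the last blit.
class PresentThrottle {
public:
    using Clock = std::chrono::steady_clock;

    explicit PresentThrottle(Clock::duration minInterval)
        : interval_(minInterval)
    {
        assert(minInterval > Clock::duration::zero());
    }

    // Returns true if the caller should blit; records `now` when it does.
    bool ShouldPresent(Clock::time_point now, bool dirty)
    {
        if (!dirty) {
            return false;  // nothing new to show
        }
        if (hasPresented_ && now - last_ < interval_) {
            return false;  // the screen could not show it yet anyway
        }
        last_ = now;
        hasPresented_ = true;
        return true;
    }

private:
    Clock::duration interval_;
    Clock::time_point last_{};
    bool hasPresented_ = false;
};
```

In a render loop you would call ShouldPresent(PresentThrottle::Clock::now(), imageChanged) and blit only when it returns true.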
You are looking for something like glMapBuffer in OpenGL, but with direct access to the screen.
But writing to GPU memory pixel by pixel is the slowest operation you can do. PCI works faster if you send big streams of data. Also, there are many issues if you both write and read data. The pixel layout is also important (see the NVIDIA docs about fast texture transfers). BitBlt will do it for you in a driver-optimised way.