I can toggle depth testing on/off in OpenGL using glEnable( GL_DEPTH_TEST );
But this switches the test on/off for the entire draw call.
I would like to control the test on a per-fragment basis.
This is to achieve the following effect: in a chess-board pattern, half the pixels of the primitive are potentially occluded, but the other half are always drawn, even if there is an occluding object in the frame-buffer that is nearer to the camera.
To illustrate, see this example where only half the pixels of the pyramid are potentially occluded by the box in front of it.
Note that the red box is drawn first, and is already in the frame-buffer when the green pyramid is drawn.
Also note: I will draw the object with the special depth test last, so the case where the red box is drawn first would not happen.
For my purposes, all objects drawn this way are convex and will never occlude themselves.
I know how to get a chessboard pattern in my fragment shader:
float xm2 = mod( gl_FragCoord.x, 2.0 );
float ym2 = mod( gl_FragCoord.y, 2.0 );
if ( int(xm2) != int(ym2) )
discard;
But in my case, I don't want to flat out discard on a per-fragment basis, I just want the depth test toggled on a per-fragment basis.
I am targeting OpenGL-ES 3.0.
This is easily achieved by drawing the (red) mesh twice:
First draw it normally, all fragments, with depth test enabled.
Next draw it again, without depth-test, and discarding half the fragments.
The solution was prompted by the comment made by @derhass.
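As a rough CPU-side model of what the two passes add up to (not real GL code; the function names are hypothetical and a GL_LESS depth function is assumed), a fragment of the mesh ends up on screen if it either passes the depth test in the first pass or survives the checkerboard discard in the second:

```c
#include <stdbool.h>

/* Mirrors the shader test from the question: a fragment is discarded
   when int(mod(x, 2.0)) != int(mod(y, 2.0)), so it is kept when the
   parities of the pixel coordinates match. */
bool checker_keeps(int x, int y)
{
    return (x & 1) == (y & 1);
}

/* Net result of the two draws for one fragment of the mesh.
   Pass 1: glEnable(GL_DEPTH_TEST), assumed GL_LESS, no discard.
   Pass 2: glDisable(GL_DEPTH_TEST), checkerboard discard in the shader. */
bool fragment_written(int x, int y, float frag_depth, float stored_depth)
{
    bool pass1 = frag_depth < stored_depth; /* ordinary depth test */
    bool pass2 = checker_keeps(x, y);       /* drawn regardless of depth */
    return pass1 || pass2;
}
```

Because the second pass ignores depth, this only gives the intended picture when the mesh is convex and drawn last, as stated in the question.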
Suppose I need to render the following scene:
Two cubes, one yellow, another red.
The red cube needs to 'glow' with red light, the yellow one does not glow.
The cubes are rotating around the common center of gravity.
The camera is positioned in such a way that when the red, glowing cube is close to the camera, it partially obstructs the yellow cube, and when the yellow cube is close to the camera, it partially obstructs the red, glowing one.
If not for the glow, the scene would be trivial to render. With the glow, I can see at least 2 ways of rendering it:
WAY 1
1. Render the yellow cube to the screen.
2. Compute where the red cube will end up on the screen (easy, we have the vertices plus the ModelView matrix), and render it to an off-screen FBO just big enough (leave margins for the glow); make sure to save the depths to a texture.
3. Post-process the FBO and make the glow.
4. Now the hard part: merge the FBO with the screen. We need to take into account the depths (which we have stored in a texture), so it looks like we need to do the following:
   a) render a quad, textured with the FBO's color attachment.
   b) set up the ModelView matrix appropriately (we need to move the texture by some vector because, for speed reasons, we intentionally rendered the red cube to an FBO smaller than the screen in step 2).
   c) in the 'merging' fragment shader, write gl_FragDepth from the FBO's depth attachment texture (and not from gl_FragCoord.z).
WAY 2
1. Render both cubes to an off-screen FBO; set up stencil so that the unobstructed part of the red cube is marked with 1s.
2. Post-process the FBO so that the marked area gets blurred, and blend this to make the glow.
3. Blit the FBO to the screen.
WAY 1 works, but the major problem with it is speed, namely step 4c. Writing to gl_FragDepth in the fragment shader disables the early z-test.
WAY 2 also kind of works, and looks like it should be much faster, but it does not give 100% correct results.
The problem is when the red cube is partially obstructed by the yellow one, pixels of the red cube that are close to the yellow one get 'yellowish' when we blur them, i.e. the closer, yellow cube 'creeps' into the glow.
I guess I could partly remedy this by stopping the blur whenever the pixels I am reading suddenly decrease in depth (meaning we just jumped from a farther object to a closer one). But that would mean twice as many texture accesses when blurring (in addition to fetching the COLOR texture we would need to keep fetching the DEPTH texture), plus a conditional statement in the blurring fragment shader. I haven't tried it, but I am not convinced it would be any faster than WAY 1, and even that wouldn't give 100% correct results: the red pixels close to the border with the yellow cube would be influenced only by the visible part of the red cube, rather than the whole (-blurRadius,+blurRadius) area, so in this area the glow would not be exactly the same.
Would anyone have suggestions how to best implement such 'per-object post-processing' ?
EDIT:
What I am writing is a sort of OpenGL ES library for graphics effects. Clients are able to give it a series of instructions like 'take this Mesh, texture it with this, apply the following matrix transformations to its ModelView matrix, apply the following distortions to its vertices, the following set of fragment effects, render to the following Framebuffer'.
In my library, I already have what I call 'matrix effects' (modifying the Model View) 'vertex effects' (various vertex distortions) and 'fragment effects' (various changes of RGBA per-fragment).
Now I am trying to add what I call 'post-processing' effects, this 'GLOW' being the first of them. I define the effect and envision it exactly as you described above.
The effects are applied to whole Meshes; thus now I need what I call 'per-object post-processing'.
The library is aimed mostly at kind of '2.5D' usages, like GPU-accelerated UIs in Mobile Apps, 2-2.5D games (think Candy Crush), etc. I doubt people will actually ever use it for any real 3D, large game.
So FPS, while always important, is a bit less crucial than usual.
I try really hard to keep the API 'Mesh-local', i.e. the rendering pipeline only knows about the current Mesh it is rendering. My main complaint about the above is that it has to be aware of the whole set of meshes we are going to render to a given Framebuffer. That being said, if 'mesh-locality' is impossible or cannot be done efficiently with post-processing effects, then I guess I'll have to give it up (and make my Tutorials more complicated).
Yesterday I was thinking about this:
# 'Almost-Mesh-local' algorithm for rendering N different Meshes, some of them glowing
Create FBO, attach texture the size of the screen to COLOR0, another texture 1/4 the size of the screen to COLOR1.
Enable DEPTH test, clear COLOR/DEPTH
FOREACH( glowing Mesh )
{
use MRT to render it to COLOR0 and COLOR1 in one go
}
Detach COLOR1, attach STENCIL texture
Set up STENCIL so that the test always passes and writes 1s when Depth test passes
Switch off DEPTH/COLOR writes
FOREACH( glowing Mesh )
{
enlarge it by N% (amount of GLOW needs to be modifiable!)
render to STENCIL // i.e. mark the future 'glow' regions with 1s in stencil
}
Set up STENCIL so that test always passes and writes 0 when Depth test passes
Switch on DEPTH/COLOR writes
FOREACH( not glowing Mesh )
{
render to COLOR0/STENCIL/DEPTH // now COLOR0 contains everything rendered, except for the GLOW. STENCIL marks the unobstructed glowing areas with 1s
}
Blur the COLOR1 texture with BLUR radius 'N'
Merge COLOR0 and COLOR1 to the screen in the following way:
IF ( STENCIL==0 ) take pixel from COLOR0
ELSE blend COLOR0 and COLOR1
END
This is not Mesh-local (we still need to be able to process all 'glowing' Meshes first) although I call it 'almost Mesh-local' because it differentiates between meshes only on the basis of the Effects being applied to them, and not which one is where or which obstructs which.
It also can have problems when two GLOWING Meshes obstruct each other (blend does not have to be done in the right order) although with the GLOW being half-transparent, I am hoping the final look will be more or less ok.
Looks like it can even be turned into a completely 'Mesh-local' algorithm by doing one giant
FOREACH(Mesh)
{
if( glowing )
{
}
else
{
}
}
although at a cost of having to attach and detach stuff from FBO and setting STENCILS differently at each loop iteration.
A knee-jerk suggestion is to do the hybrid:
1. compute where the red cube will end up on screen, and render it to an off-screen FBO just big enough (or one the same size as the screen, since creating FBOs on the hoof may not be efficient); don't worry about depths, it's only the colours you're after;
2. render both cubes to an off-screen FBO; set up stencil so that the unobstructed part of the red cube is marked with 1s;
3. post-process to the screen by using an original pixel from (2) wherever the stencil is 0, or a blurred pixel computed by sampling (1) wherever the stencil is 1.
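The final selection step of the hybrid can be modelled per pixel on the CPU as below. This is only an illustration of the decision rule: the Rgb type and the function name are made up for the sketch, and in practice the choice would be made in the post-processing fragment shader or by the stencil test itself.

```c
/* Hypothetical colour type used only for this sketch. */
typedef struct { float r, g, b; } Rgb;

/* scene:        pixel from the FBO holding both cubes (step 2)
   blurred_glow: pixel obtained by blurring the red-cube-only FBO (step 1)
   stencil:      0 where nothing glows, 1 on the unobstructed red cube */
Rgb composite_pixel(Rgb scene, Rgb blurred_glow, unsigned char stencil)
{
    if (stencil == 0) {
        return scene;        /* untouched original pixel */
    }
    return blurred_glow;     /* glow sampled only from the red cube */
}
```

Since the blur samples only the red-cube render, the yellow cube cannot bleed into the glow, which was the defect of WAY 2.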
I am trying to draw large numbers of 2d circles for my 2d games in opengl. They are all the same size and have the same texture. Many of the sprites overlap. What would be the fastest way to do this?
An example of the kind of effect I'm making: http://img805.imageshack.us/img805/6379/circles.png
(It should be noted that the black edges are just due to the expanding explosion of circles. They were filled in a moment after this screenshot was taken.)
At the moment I am using a pair of textured triangles to make each circle. I have transparency around the edges of the texture so as to make it look like a circle. Using blending for this proved to be very slow (and z culling was not possible as they were rendered as squares to the depth buffer). Instead I am not using blending but having my fragment shader discard any fragments with an alpha of 0. This works, however it means that early z is not possible (as fragments are discarded).
The speed is limited by the large amounts of overdraw and the gpu's fillrate. The order that the circles are drawn in doesn't really matter (provided it doesn't change between frames creating flicker) so I have been trying to ensure each pixel on the screen can only be written to once.
I attempted this by using the depth buffer. At the start of each frame it is cleared to 1.0f. Then when a circle is drawn it changes that part of the depth buffer to 0.0f. When another circle would normally be drawn there it is not, as the new circle also has a z of 0.0f. This is not less than the 0.0f that is currently there in the depth buffer, so it is not drawn. This works and should reduce the number of pixels which have to be drawn. Strangely, however, it isn't any faster. I have already asked a question about this behavior (opengl depth buffer slow when points have same depth) and the suggestion was that z culling was not being accelerated when using equal z values.
Instead I have to give all of my circles separate false z-values from 0 upwards. Then when I render using glDrawArrays and the default of GL_LESS we correctly get a speed boost due to z culling (although early z is not possible as fragments are discarded to make the circles possible). However this is not ideal as I've had to add in large amounts of z related code for a 2d game which simply shouldn't require it (and not passing z values if possible would be faster). This is however the fastest way I have currently found.
Finally I have tried using the stencil buffer, here I used
glStencilFunc(GL_EQUAL, 0, 1);
glStencilOp(GL_KEEP, GL_INCR, GL_INCR);
Where the stencil buffer is reset to 0 each frame. The idea is that after a pixel is drawn to for the first time, it is changed to be non-zero in the stencil buffer. That pixel should then not be drawn to again, reducing the amount of overdraw. However, this has proved to be no faster than just drawing everything without the stencil buffer or a depth buffer.
What is the fastest way people have found to do what I am trying?
The fundamental problem is that you're fill limited, which is the GPU's inability to shade all the fragments you ask it to draw in the time you're expecting. The reason that your depth buffering trick isn't effective is that the most time-consuming part of processing is shading the fragments (either through your own fragment shader, or through the fixed-function shading engine), which occurs before the depth test. The same issue occurs when using stencil; shading the pixel occurs before stenciling.
There are a few things that may help, but they depend on your hardware:
Render your sprites from front to back with depth buffering. Modern GPUs often try to determine if a collection of fragments will be visible before sending them off to be shaded. Roughly speaking, the depth buffer (or a representation of it) is checked to see if the fragment that's about to be shaded will be visible, and if not, its processing is terminated at that point. This should help reduce the number of pixels that need to be written to the framebuffer.
Use a fragment shader that immediately checks your texel's alpha value, and discards the fragment before any additional processing, as in:
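varying vec2 texCoord;
uniform sampler2D tex;
void main()
{
    // 'varying' implies GLSL 1.x, where the sampling function is texture2D
    vec4 texel = texture2D( tex, texCoord );
    // reject transparent texels before any further work
    if ( texel.a < 0.01 ) discard;
    // rest of your color computations
    gl_FragColor = texel;
}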
(you can also use alpha test in fixed-function fragment processing, but it's impossible to say if the test will be applied before the completion of fragment shading).
Could you please share some code (any language) on how to draw a textured line (one that would be smooth or have a glow-like effect: blue line, four points) consisting of many points, like on the attached image, using OpenGL ES 1.0?
What I was trying was texturing a GL_LINE_STRIP with texture 16x16 or 1x16 pixels, but without any success.
In ES 1.0 you can use render-to-texture creatively to achieve the effect that you want, but it's likely to be costly in terms of fill rate. Gamasutra has an (old) article on how glow was achieved in the Tron 2.0 game — you'll want to pay particular attention to the DirectX 7.0 comments since that was, like ES 1.0, a fixed pipeline. In your case you probably want just to display the Gaussian image rather than mixing it with an original since the glow is all you're interested in.
My summary of the article is:
render all lines to a texture as normal, solid hairline lines. Call this texture the source texture.
apply a linear horizontal blur to that by taking the source texture you just rendered and drawing it, say, five times to another texture, which I'll call the horizontal blur texture. Draw one copy at an offset of x = 0 with opacity 1.0, draw two further copies — one at x = +1 and one at x = -1 — with opacity 0.63 and a final two copies — one at x = +2 and one at x = -2 with an opacity of 0.17. Use additive blending.
apply a linear vertical blur to that by taking the horizontal blur texture and doing essentially the same steps but with y offsets instead of x offsets.
Those opacity numbers were derived from the 2d Gaussian kernel on this page. Play around with them to affect the fall off towards the outside of your lines.
Note the extra costs involved here: you're ostensibly adding ten full-screen textured draws plus some framebuffer swapping. You can probably get away with fewer draws by using multitexturing. A shader approach would likely do the horizontal and vertical steps in a single pass.
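To make the arithmetic of the two blur steps concrete, here is a small CPU-side sketch in C. It is not ES 1.0 code: it only models what the five offset, additively blended draws compute on a single-channel image. The function name, the array layout and the clamp to 1.0 (mimicking a fixed-point framebuffer) are assumptions of the sketch; the weights are the ones quoted above.

```c
/* Tap offsets and opacities from the summary:
   1.0 at offset 0, 0.63 at +/-1, 0.17 at +/-2. */
static const int   TAP_OFFSET[5] = { -2, -1, 0, 1, 2 };
static const float TAP_WEIGHT[5] = { 0.17f, 0.63f, 1.0f, 0.63f, 0.17f };

/* One blur pass over a row-major w*h single-channel image.
   (step_x, step_y) = (1, 0) blurs horizontally, (0, 1) vertically.
   Each tap models one offset copy of src added into dst. */
void blur_pass(const float *src, float *dst, int w, int h,
               int step_x, int step_y)
{
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float sum = 0.0f;
            for (int t = 0; t < 5; ++t) {
                int sx = x + TAP_OFFSET[t] * step_x;
                int sy = y + TAP_OFFSET[t] * step_y;
                if (sx < 0 || sx >= w || sy < 0 || sy >= h) {
                    continue; /* outside the texture contributes nothing */
                }
                sum += TAP_WEIGHT[t] * src[sy * w + sx];
            }
            /* additive blending saturates at full intensity */
            dst[y * w + x] = sum > 1.0f ? 1.0f : sum;
        }
    }
}
```

Calling blur_pass once with (1, 0) from the source image into a temporary buffer and then with (0, 1) from that buffer into the output corresponds to the horizontal and vertical steps. Because the weights sum to more than 1, thick or overlapping lines will saturate, which is one reason to experiment with the numbers.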
I am trying to render 2D (flat) sprites in a 3D environment using OpenGL ES 2. The way I create each sprite is pretty standard: I create a quad consisting of two triangles, and I map the texture onto that. Everything works fine, except I noticed something strange: when depth testing is turned on (which it should be in 3D mode), the corners of my sprites are painted using the background color.
The easiest way to show this is by illustration:
When I turn off depth testing (on the left) it looks fine, but when I turn it on (on the right) you can see the green sprite's rectangle overlapping on top of the yellow sprite. They both use the same code, the same PNG file, the same shader. Everything is the same except depth testing.
I'm hoping someone might know a way to work around this.
What you can do is alpha testing. Basically your texture has to have an alpha value of 0 where it should be transparent (which it may already have). Then you configure the alpha test, e.g.
glAlphaFunc(GL_GREATER, 0.5f);
glEnable(GL_ALPHA_TEST);
This way every pixel (or better fragment) with an alpha value <= 0.5 will not be written into the framebuffer (and therefore not into the depth buffer). You can also do the alpha test yourself in the fragment shader by just discarding the fragment:
...
if(color.a < 0.5)
discard;
...
Then you don't need the fixed-function alpha test (I think that is the reason why it is deprecated in modern desktop GL, don't know about ES).
EDIT: After looking into the ES 2.0 spec, it seems there is no fixed-function alpha test any more, so you will have to do it in the fragment shader like written above. This way you can also make it dependent on a specific color or any other computable property instead of the alpha channel.
I'm a little bit lost, and this is somewhat related to another question I've asked about fragment shaders, but goes beyond it.
I have an orthographic scene (although that may not be relevant), with the scene drawn here as black, and I have one billboarded sprite that I draw using a shader, which I show in red. I have a point that I know and define myself, A, represented by the blue dot, at some x,y coordinate in the 2d coordinate space. (Lower-left of screen is origin). I need to mask the red billboard in a programmatic fashion where I specify 0% to 100%, with 0% being fully intact and 100% being fully masked. I can either pass 0-100% (0 to 1.0) in to the shader, or I could precompute an angle, either solution would be fine.
( Here you can see the scene drawn with '0%' masking )
So when I set "15%" I want the following to show up:
( Here you can see the scene drawn with '15%' masking )
And when I set "45%" I want the following to show up:
( Here you can see the scene drawn with '45%' masking )
And here's an example of "80%":
The general idea, I think, is to pass in a uniform vec2 'A', and within the fragment shader determine whether the fragment lies within the area that runs from the line from 'A' to the bottom of the screen, clockwise to a line at the correct angle offset from there. If it is within that area, discard the fragment. (Discarding makes more sense than setting alpha to 0.0, or 1.0 if keeping, right?)
But how can I actually achieve this?? I don't understand how to implement that algorithm in terms of a shader. (I'm using OpenGL ES 2.0)
One solution to this would be to calculate the difference between gl_FragCoord (I hope that exists under ES 2.0!) and the point (must be sure the point is in screen coords) and using the atan function with two parameters, giving you an angle. If the angle is not some value that you like (greater than minimum and less than maximum), kill the fragment.
Of course, killing fragments is not precisely the most performant thing to do. A (somewhat more complicated) triangle solution may still be faster.
EDIT:
To better explain "not precisely the most performant thing", consider that killing fragments still causes the fragment shader to run (it only discards the result afterwards) and interferes with early depth/stencil fragment rejection.
Constructing a triangle fan like whoplisp suggested is more work, but will not process any fragments that are not visible, will not interfere with depth/stencil rejection, and may look better in some situations, too (MSAA for example).
Why don't you just draw some black triangles on top of the red rectangle?