Not calling glClear results in weird artifacts - opengl-es

I want to NOT call glClear for the depth or color bits because I want to be able to see all the previously rendered frames. It mostly works, except the model is repeated all over the X and Y axes, and there are also some strange grey blocky lines. Is there a way to accomplish this? I'm using OpenGL ES 3 on Android. Thank you for any help.

The contents of the default framebuffer at the start of a frame are undefined, especially on tile-based renderers, which most mobile GPUs are. Your "repeats" in the X and Y axes are likely just showing how big the tiles are on your particular GPU (e.g. it is dumping out whatever is in the GPU-local tile RAM, repeated N times to completely cover the screen).
If you want to render on top of the previous frame you need to configure the window surface to use EGL_BUFFER_PRESERVED (the default is EGL_BUFFER_DESTROYED). E.g.:
eglSurfaceAttrib(m_display, m_surface, EGL_SWAP_BEHAVIOR, EGL_BUFFER_PRESERVED);
Note 1: this will incur some overhead (the surface is effectively copied back into tile-local memory), whereas starting with a surface discard, invalidate, or clear is usually free.
Note 2: this will only preserve color data; there is no means to preserve depth or stencil across frames for the default framebuffer.
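For reference, a minimal sketch of how that call fits into EGL setup (assuming the display and window surface already exist; the function name here is just a placeholder):

#include <EGL/egl.h>

/* Sketch: enable preserved swap behavior on an existing window surface.
 * Assumes `display` and `surface` came from the usual
 * eglGetDisplay / eglInitialize / eglCreateWindowSurface sequence. */
static int enable_preserved_swap(EGLDisplay display, EGLSurface surface)
{
    /* The EGLConfig used for the surface should advertise
     * EGL_SWAP_BEHAVIOR_PRESERVED_BIT in EGL_SURFACE_TYPE,
     * otherwise the driver may refuse this attribute. */
    if (!eglSurfaceAttrib(display, surface, EGL_SWAP_BEHAVIOR,
                          EGL_BUFFER_PRESERVED))
        return 0; /* not supported by this config/driver */

    /* Sanity check: read the behavior back. */
    EGLint behavior = 0;
    eglQuerySurface(display, surface, EGL_SWAP_BEHAVIOR, &behavior);
    return behavior == EGL_BUFFER_PRESERVED;
}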

Related

OpenGL ES 3.x How to (performantly) render blended triangles front-to-back with alpha-blending and early-reject occluded fragments?

I recently found out that one can render alpha-blended primitives correctly not just back-to-front but also front-to-back (http://hacksoflife.blogspot.com/2010/02/alpha-blending-back-to-front-front-to.html) by using GL_ONE_MINUS_DST_ALPHA, GL_ONE, premultiplying the fragment's alpha in the fragment shader, and clearing the destination alpha to zero before rendering.
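In GL ES C terms, the state for that front-to-back scheme looks roughly like this (a sketch under the assumptions above; the fragment shader is assumed to output premultiplied color):

#include <GLES3/gl3.h>

/* Sketch: front-to-back alpha blending with premultiplied source alpha.
 * Destination alpha accumulates coverage, so it must start at zero. */
void setup_front_to_back_blending(void)
{
    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
    glClear(GL_COLOR_BUFFER_BIT);

    glEnable(GL_BLEND);
    /* dst = src * (1 - dst.a) + dst * 1 -- fragments drawn earlier (nearer)
     * win, so geometry must be submitted front-to-back. */
    glBlendFunc(GL_ONE_MINUS_DST_ALPHA, GL_ONE);
}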
It occurred to me that it would then be great if one could combine this with EITHER early-z rejection OR some kind of early "destination-alpha testing" in order to discard fragments that won't contribute to the final pixel color.
When rendering with front-to-back alpha-blending, a fragment can be skipped if the destination-alpha at this location already contains the value 1.0.
I did prototype-implement that by using GL_EXT_shader_framebuffer_fetch to test the destination alpha at the start of the fragment shader and then manually discard the fragment if the value is above a certain threshold (a sketch of this appears after the questions below). That works, but it actually made things slower on my test hardware (Snapdragon XR2), so I wonder:
whether it's somehow possible to not even have the fragment shader execute if destination alpha is already above a certain threshold?
alternatively, if it were possible to write to the depth buffer only for fragments that are completely opaque, and to leave the current depth-buffer value unchanged for all fragments with an alpha value of less than 1 (while still depth-testing every fragment), that should allow the hardware to use early-z rejection for occluded fragments. So,
Is this possible somehow (i.e. use depth testing, but update the depth buffer value only for opaque fragments and leave it unchanged for others)?
Bottom line: this would reduce the overdraw of alpha-blended sprites to only those fragments that actually contribute to the final pixel color, and I wonder whether there is a performant way of doing this.
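For reference, the core of that destination-alpha early-out looks roughly like this (a simplified sketch, not the exact prototype; the 0.995 threshold and all names are illustrative):

/* ESSL 3.00 fragment shader, as a C string, sketching the destination-alpha
 * test via GL_EXT_shader_framebuffer_fetch. Illustrative only. */
static const char* frag_src =
    "#version 300 es\n"
    "#extension GL_EXT_shader_framebuffer_fetch : require\n"
    "precision mediump float;\n"
    "in vec2 v_uv;\n"
    "uniform sampler2D u_tex;\n"
    "inout highp vec4 o_color;\n"
    "void main() {\n"
    "    // Destination alpha is already ~1.0: this fragment cannot contribute.\n"
    "    if (o_color.a > 0.995) { discard; }\n"
    "    vec4 c = texture(u_tex, v_uv);\n"
    "    c.rgb *= c.a; // premultiply; blend func is ONE_MINUS_DST_ALPHA, ONE\n"
    "    o_color = c;\n"
    "}\n";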
For number 2, I think you could modify gl_FragDepth in the fragment shader to achieve something close, but doing so would disable early-z rejection, so it wouldn't really help.
I think one viable way to reduce overdraw would be to create a tool to generate a mesh for each sprite which aims to cover a decent proportion of the opaque part of the sprite without using too many verts. I imagine for a typical sprite, even just a well placed quad could cover 80%+.
You'd render the generated opaque geometry of your sprites with depth write enabled, and do a second pass the ordinary way with depth testing enabled to cover the transparent parts.
You would massively reduce overdraw, but significantly increase the complexity of your code and the number of verts rendered. You would double your draw calls, but if you're atlasing and using texture arrays, you might be doubling from 1 to 2 draw calls, which is fine. I've never tried it, so I can't say whether it's worth all the effort involved.
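A rough sketch of the draw order for that two-pass idea (the two draw helpers are hypothetical stand-ins for your own draw calls, and the blend func shown assumes premultiplied alpha):

#include <GLES3/gl3.h>

/* Hypothetical helpers standing in for your own sprite submission. */
void draw_opaque_sprite_meshes(void);          /* generated inner meshes  */
void draw_full_sprites_back_to_front(void);    /* ordinary textured quads */

void draw_sprites_two_pass(void)
{
    /* Pass 1: fully-opaque inner meshes, writing depth so fragments
     * behind them can be early-z rejected in pass 2. */
    glDisable(GL_BLEND);
    glEnable(GL_DEPTH_TEST);
    glDepthMask(GL_TRUE);
    draw_opaque_sprite_meshes();

    /* Pass 2: the full sprites "the ordinary way", blended back-to-front,
     * depth-tested against pass 1 but not writing depth. */
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
    glDepthMask(GL_FALSE);
    draw_full_sprites_back_to_front();
    glDepthMask(GL_TRUE);
}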

Vulkan/OpenGL subpasses that fetch more than single fragment

So, Vulkan introduced subpasses, and OpenGL implements similar behaviour with ARM_framebuffer_fetch.
In the past, I have used framebuffer_fetch successfully for tonemapping post-effect shaders.
Back then the limitation was that one could only read the contents of the framebuffer at the location of the currently rendered fragment.
Now, what I wonder is whether there is by now any way in Vulkan (or even OpenGL ES) to read from multiple locations (for example to implement a blur kernel) without the tiled hardware having to store/load to RAM.
In theory I guess it should be possible; the first pass would just need to render slightly larger than the blur subpass, based on the kernel size (so for example if the kernel size was 4 pixels then the resolved tile would need to be 4 pixels smaller than the in-tile buffer size), and some pixels would have to be rendered redundantly (on the overlaps of tiles).
Now, is there a way to do that?
I seem to recall having seen some Vulkan instruction related to subpasses that would allow defining the support size (which sounded like what I'm looking for now), but I can't recall where I saw that.
So my questions:
With Vulkan on a mobile tiled-renderer architecture, is it possible to forward-render some geometry and then render a full-screen blur over it, all within a single in-tile pass (without the hardware having to store the result of the intermediate pass to RAM first and then load the texture from RAM when blurring)? If so, how?
If the answer to 1 is yes, can it also be done in OpenGL ES?
Short answer, no. Vulkan subpasses still have the 1:1 fragment-to-pixel association requirements.

OpenGL ES: Is it more efficient to use glClear or use glDrawArrays with primitives that render over the previous frame?

For example, if I have several figures rendered over a black background, is it more efficient to call glClear(GL_COLOR_BUFFER_BIT) each frame, or to render black triangles over the artifacts from the past frame? I would think that rendering black triangles over the artifacts would be faster, since fewer pixels need to be changed than when clearing the entire screen. However, I have also read online that some drivers and graphics hardware perform optimizations when using glClear(GL_COLOR_BUFFER_BIT) that cannot occur when rendering black triangles instead.
A couple of things you neglected that will probably change how you look at this:
Painting over the top with triangles doesn't play well with depth testing. So you have to disable testing first.
Painting over the top with triangles has to be strictly ordered with respect to the previous result. glClear, by contrast, breaks that data dependency, allowing out-of-order execution to start drawing the next frame as long as there's enough memory for both framebuffers to exist independently.
Double-buffering, to avoid tearing, uses multiple framebuffers anyway. So to use the result of the prior frame as the starting point for the new one, not only do you have to wait for the prior frame to finish rendering, you also have to copy its framebuffer into the buffer for the new frame, which requires touching every pixel.
In a single-buffering scenario, drawing over the top of the previous frame requires not only finishing rendering, but also finishing streaming it to the display device, which leaves you only the blanking interval to render the entire new frame. That will almost certainly prevent you from achieving the maximum frame rate.
Generally, clearing touches every pixel just as copying does, but it doesn't require flushing the pipeline. So it's much better to start a frame with glClear.
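Concretely, the cheap way to start a GL ES frame is a plain clear; GLES 3 also lets you say "I don't need the old contents" without even the clear write, via glInvalidateFramebuffer (a sketch; which attachments to invalidate depends on what you actually reuse):

#include <GLES3/gl3.h>

/* Sketch: typical frame start on a tile-based GPU. Clearing tells the
 * driver the previous contents are dead, so it never reads the old
 * framebuffer back into tile memory. */
void begin_frame(void)
{
    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
}

/* GLES 3 alternative/end-of-frame variant: explicitly discard depth and
 * stencil of the default framebuffer once they are no longer needed. */
void discard_depth_stencil(void)
{
    const GLenum attachments[] = { GL_DEPTH, GL_STENCIL };
    glInvalidateFramebuffer(GL_FRAMEBUFFER, 2, attachments);
}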

WebGL: Framebuffers and textures with one, one-byte channel?

I'm generating blurred drop shadows in WebGL by drawing the object to be blurred onto an off-screen framebuffer/texture, then applying a few passes of a filter to it (back and forth between two off-screen framebuffers), then copying the result to the final output.
However, I'm just dropping the RGB channels, overwriting them with the desired color of the drop shadow (usually black) while maintaining the alpha channel. It seems like I could probably get better performance by just having my off-screen framebuffers be a single (alpha) channel.
Is there a way to do that, and would it actually help?
Also, is there a better way to apply multiple passes of a filter than just alternating between two frame buffers and using the previous frame buffer's bound texture as the input?
Assuming WebGL follows GLES, then per the spec (page 91):
"The name of the color buffer of an application-created framebuffer object is COLOR_ATTACHMENT0 ... Color buffers consist of R, G, B, and, optionally, A unsigned integer values."
So you can't attach only to A, or only to any single colour channel.
Options to explore:
Use colorMask to disable writing to R, G and B. Depending on what data layout your GPU uses internally you can imagine that could effectively achieve exactly what you want or possibly have no effect whatsoever.
Is there a way you could render to the depth channel instead of to the alpha channel?
Reducing memory bandwidth is often helpful but if it's not a bottleneck then you could end up prematurely optimising.
To avoid excessive per-frame ping-ponging you'd normally attempt to reformulate your shader so that it does the work of all the stages in one pass. Otherwise, consider whether there's any better-than-linear way to combine multiple passes. Instead of knowing only how to get from stage n to stage n+1, can you go from stage n to stage 2n? Or even just to n+2?
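For reference, the standard ping-pong loop looks something like this, sketched in GL ES-style C (the WebGL calls map one-to-one; fbo, tex, and draw_filter_pass are placeholders):

#include <GLES2/gl2.h>

/* Hypothetical helper: samples `src` and draws a full-screen quad with the
 * filter shader into whatever framebuffer is currently bound. */
void draw_filter_pass(GLuint src);

/* Sketch: ping-pong between two FBOs for `passes` filter iterations.
 * fbo[i] is assumed to have tex[i] attached as COLOR_ATTACHMENT0. */
void run_filter_passes(const GLuint fbo[2], const GLuint tex[2],
                       GLuint input_tex, int passes)
{
    GLuint src = input_tex;
    for (int i = 0; i < passes; ++i) {
        int dst = i & 1;                  /* alternate 0, 1, 0, 1, ... */
        glBindFramebuffer(GL_FRAMEBUFFER, fbo[dst]);
        draw_filter_pass(src);            /* read src, write tex[dst]   */
        src = tex[dst];                   /* next pass reads this result */
    }
    glBindFramebuffer(GL_FRAMEBUFFER, 0); /* back to the default framebuffer */
}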

Issue with NPOT Atlas (C++/iOS) using glTexCoordPointer

My app uses an atlas and accesses parts of it to display items using glTexCoordPointer.
It works well with power-of-two textures, but I wanted to use NPOT to reduce the amount of memory used.
Actually, the picture itself loads fine with the linear filter and clamp-to-edge wrapping (the content displayed does come from the picture, even with alpha), but the display is deformed.
The coordinates are not the correct ones, and the "shape" is more of a trapezoid than a rectangle.
I guessed I had to play with glEnable(), passing GL_TEXTURE_2D in the case of a POT texture, and GL_APPLE_texture_2D_limited_npot in the other case, but I cannot find a way to do so.
Also, I do not have GL_TEXTURE_RECTANGLE_ARB; I don't know if that is an issue...
Has anyone had the same kind of problem?
Since OpenGL 2 (i.e. for about 10 years) there have been no constraints on the size of a regular texture. You can use whatever image size you want; it will just work.
