I'm currently working on a GPGPU project that uses OpenGL ES 2.0. I have a rendering pipeline that uses framebuffer objects (FBOs) as targets, i.e. the result of each rendering pass is saved in a texture which is attached to an FBO. So far, this works when using fragment shaders. For example, I have the following rendering pipeline:
Preprocessing (downscaling, grayscale conversion)
-> Adaptive Thresholding Pass 1 -> Adapt. Thresh. Pass 2
-> Copy back to CPU
However, I wanted to extend this pipeline by adding a grayscale histogram calculation after the preprocessing step. With OpenGL ES 2.0 this only works with texture reads in the vertex shader, as far as I know [1]. I can confirm that my shaders work in a different program where the input is a "real" image, not a rendered texture that is attached to an FBO. Hence I think it is not possible to read texture data in a vertex shader if it comes from an FBO. Can anyone confirm this assumption, or am I missing something? I'm using a Nexus 10 for my experiments.
[1]: It basically works by reading each pixel value from the texture in the vertex shader, then calculating the histogram bin from it and "adding" it in the fragment shader by using alpha blending.
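For illustration, the vertex shader I have in mind looks roughly like this (a minimal sketch with illustrative names, one GL_POINTS vertex per input pixel):

/* Sketch only; attribute/uniform names are illustrative. */
static const char *kHistogramVS =
    "attribute vec2 a_texCoord;\n"
    "uniform sampler2D u_image;\n"
    "void main() {\n"
    "  // texture read in the vertex shader\n"
    "  vec4 c = texture2DLod(u_image, a_texCoord, 0.0);\n"
    "  // grayscale value and its histogram bin (0..255)\n"
    "  float lum = dot(c.rgb, vec3(0.299, 0.587, 0.114));\n"
    "  float bin = floor(lum * 255.0);\n"
    "  // map the bin to the x position of a 256x1 render target\n"
    "  gl_Position = vec4((bin + 0.5) / 128.0 - 1.0, 0.0, 0.0, 1.0);\n"
    "  gl_PointSize = 1.0;\n"
    "}\n";
/* The fragment shader just outputs vec4(1.0 / 255.0); drawing the points with
 * glBlendFunc(GL_ONE, GL_ONE) then increments the hit bin by one step. */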
Texture reads within a vertex shader are not a required element in OpenGL ES 2.0, so you'll find some manufacturers supporting them and some not. In fact, there was a weird situation where iOS supported it on some devices for one version of iOS, but not the next (and it's now officially supported in iOS 7). That might be the source of the inconsistency you see here.
To work around this, I implemented a histogram calculation by instead feeding the colors from the FBO (or its attached texture) in as vertices and using a scattering operation similar to what you describe. This doesn't require a texture read of any kind in the vertex shader, but it does involve a round-trip from and to the GPU and potentially a lot of vertices. It works on all OpenGL ES 2.0 hardware, but it can be costly.
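A rough sketch of that workaround, with illustrative names (the real implementation differs in detail): the rendered pixels are read back once, re-fed to the GPU as a per-vertex color attribute, and each point is scattered into the histogram bin for its luminance, with additive blending doing the counting. Note that an 8-bit render target clamps each bin at 255 hits, so real implementations scale or split the counts.

#include <stdlib.h>
#include <GLES2/gl2.h>

static const char *kScatterVS =
    "attribute vec4 a_color;   // one vertex per pixel, color fed from the CPU\n"
    "void main() {\n"
    "  float lum = dot(a_color.rgb, vec3(0.299, 0.587, 0.114));\n"
    "  float bin = floor(lum * 255.0);\n"
    "  gl_Position = vec4((bin + 0.5) / 128.0 - 1.0, 0.0, 0.0, 1.0);\n"
    "  gl_PointSize = 1.0;\n"
    "}\n";

static const char *kScatterFS =
    "precision mediump float;\n"
    "void main() { gl_FragColor = vec4(1.0 / 255.0); }\n";

/* Assumes a current context, a program built from the shaders above, and a 256x1
 * histogram FBO already bound as the render target; aColorLoc comes from
 * glGetAttribLocation(). Error handling omitted. */
static void scatter_histogram(GLint aColorLoc, GLsizei width, GLsizei height)
{
    GLubyte *pixels = (GLubyte *)malloc((size_t)width * height * 4);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels); /* GPU -> CPU round-trip */

    glEnableVertexAttribArray(aColorLoc);
    glVertexAttribPointer(aColorLoc, 4, GL_UNSIGNED_BYTE, GL_TRUE, 4, pixels); /* back to the GPU as vertices */
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);                /* additive blending accumulates the bin counts */
    glDrawArrays(GL_POINTS, 0, width * height); /* one point per input pixel */

    free(pixels);
}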
Related
I have been trying to implement a render to texture approach in our application that uses GLES 3 and I have got it working but I am a little disappointed with the frame rate drop.
So far we have been rendering directly to the main FBO, which has been a multisampled FBO created using EGL_SAMPLES=8.
What I basically want is to be able to get hold of the pixels that have already been drawn, while I'm still drawing. So I thought a render-to-texture approach should do it. Then I'd just read a section of the off-screen FBO's texture whenever I want, and when I'm done rendering to it I'll blit the whole thing to the main FBO.
Digging into this I found I had to implement a system with a multisampled FBO as well as a non-multisampled textured FBO into which I had to resolve the multisampled one, and then just blit the resolved one to the main FBO.
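Roughly, the system I ended up with looks like this each frame (names are illustrative, error checking omitted):

/* GLES 3: render into a multisampled FBO, resolve it into a single-sampled
 * textured FBO, then blit that to the main FBO. drawScene() is a placeholder. */
glBindFramebuffer(GL_FRAMEBUFFER, msaaFbo);   /* color attachment: multisampled renderbuffer */
drawScene();                                  /* all the normal rendering */

/* Resolve the multisampled FBO into the single-sampled, textured FBO. */
glBindFramebuffer(GL_READ_FRAMEBUFFER, msaaFbo);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFbo);
glBlitFramebuffer(0, 0, w, h, 0, 0, w, h, GL_COLOR_BUFFER_BIT, GL_NEAREST);

/* resolveFbo's texture can now be sampled or partially read back. */

/* Finally blit the resolved result to the main FBO. */
glBindFramebuffer(GL_READ_FRAMEBUFFER, resolveFbo);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
glBlitFramebuffer(0, 0, w, h, 0, 0, w, h, GL_COLOR_BUFFER_BIT, GL_NEAREST);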
This all works but the problem is that by using the above system and a non-multisampled main FBO (EGL_SAMPLES=0) I get quite a big frame rate drop compared to the frame rate I get when I use just the main FBO with EGL_SAMPLES=8.
Digging a bit more into this I found people reporting online, as well as a post here https://community.arm.com/thread/6925, saying that the fastest approach to multisampling is to use EGL_SAMPLES. And indeed that's what it looks like on the Jetson TK1 too, which is our target board.
Which finally leads me to the question, and apologies for the long introduction:
Is there any way that I can design this to use a non-multisampled off-screen FBO for all the rendering that is eventually blitted to a main multisampled FBO that uses EGL_SAMPLES?
The only point of MSAA is to anti-alias geometry edges. It only provides benefit if multiple triangle edges appear in the same pixel. For rendering pipelines which are doing multiple off-screen passes you want to enable multiple samples for the off-screen passes which contain your geometry (normally one of the early passes in the pipeline, before any post-processing effects).
Applying MSAA at the end of the pipeline on the final blit will provide zero benefit, and probably isn't free (it will be close to free on tile-based renderers like IMG Series 6 and Mali (the blog you linked), less free on immediate-mode renderers like the Nvidia GPU in your Jetson board).
Note that for off-screen anti-aliasing the "standard" approach is rendering to an MSAA framebuffer and then resolving as a second pass (e.g. using glBlitFramebuffer to blit into a single-sampled buffer). This bounce is inefficient on many architectures, so this extension exists to help:
https://www.khronos.org/registry/gles/extensions/EXT/EXT_multisampled_render_to_texture.txt
Effectively this provides the same implicit resolve as the EGL window surface functionality.
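Where the extension is available, the setup looks roughly like this (a sketch with illustrative names; the sample count and formats are placeholders):

/* EXT_multisampled_render_to_texture: the texture stays single-sampled, the driver
 * allocates the multisample storage implicitly and resolves it when the pass ends.
 * Check the extension string (and fetch the entry point via eglGetProcAddress if
 * needed) before using it. */
GLuint tex, fbo;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, w, h, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2DMultisampleEXT(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                                     GL_TEXTURE_2D, tex, 0, 4 /* samples */);
/* Render the geometry pass here; no explicit resolve blit is needed afterwards. */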
Answers to your questions in the comments.
Is the resulting texture a multisampled texture in that case?
From the application point of view, no. The multisampled data is inside an implicitly allocated buffer, allocated by the driver. See this bit of the spec:
"The implementation allocates an implicit multisample buffer with TEXTURE_SAMPLES_EXT samples and the same internalformat, width, and height as the specified texture level."
This may require a real MSAA buffer allocation in main memory on some GPU architectures (and so be no faster than the manual glBlitFramebuffer approach without the extension), but is known to be effectively free on others (i.e. tile-based GPUs where the implicit "buffer" is a small RAM inside the GPU, and not in main memory at all).
The goal is to blur the background behind widgets
MSAA is not in any way a general-purpose blur - it only anti-aliases the pixels which are coincident with edges of triangles. If you want to blur triangle faces you'd be better off using a separable Gaussian blur implemented as a pair of fragment shaders, applied as a 2D post-processing pass.
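For reference, the horizontal half of such a blur can look something like this (a minimal 5-tap sketch with illustrative names; the vertical pass is identical but offsets along y):

/* One of the two fragment shader passes over a full-screen quad: blur horizontally
 * into an intermediate FBO, then run a vertical pass over that result.
 * Weights are the binomial approximation of a Gaussian, (1 4 6 4 1) / 16. */
static const char *kBlurHorizontalFS =
    "precision mediump float;\n"
    "varying vec2 v_texCoord;\n"
    "uniform sampler2D u_tex;\n"
    "uniform float u_texelWidth;   // 1.0 / texture width\n"
    "void main() {\n"
    "  vec4 sum = texture2D(u_tex, v_texCoord) * 0.375;\n"
    "  sum += texture2D(u_tex, v_texCoord + vec2(u_texelWidth, 0.0)) * 0.25;\n"
    "  sum += texture2D(u_tex, v_texCoord - vec2(u_texelWidth, 0.0)) * 0.25;\n"
    "  sum += texture2D(u_tex, v_texCoord + vec2(2.0 * u_texelWidth, 0.0)) * 0.0625;\n"
    "  sum += texture2D(u_tex, v_texCoord - vec2(2.0 * u_texelWidth, 0.0)) * 0.0625;\n"
    "  gl_FragColor = sum;\n"
    "}\n";
/* The vertical pass uses vec2(0.0, u_texelHeight) offsets instead. */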
Is there any way that I can design this to use a non-multisampled off-screen FBO for all the rendering that is eventually blitted to a main multisampled FBO that uses EGL_SAMPLES?
Not in any way which is genuinely useful.
Framebuffer blitting does allow blits from single-sampled buffers to multisample buffers. But all that does is give every sample within a pixel the same value as the one from the source.
Blitting cannot generate new information. So you won't get any actual antialiasing. All you will get is the same data stored in a much less efficient way.
The question is quite simple. On Windows you have BitBlt() to draw to the screen. On OS X you could normally use OpenGL, but how do I draw the bitmap to the screen using Apple's new Metal framework? I can't find anything valuable in Apple's Metal references.
I'm right now using Core Graphics for the drawing part but since my bitmap is updating all the time, I feel like I should move to Metal to reduce the overhead.
The very short answer is this: First, you need to set up a standard rendering pipeline using Metal, which is a bit of work if you don't know anything about the rendering pipeline (note: this can be simplified by avoiding 3D and just rendering a quad made of two triangles, giving only xy coordinates for the vertices). There is a lot of sample code from Apple that shows how to set up a standard rendering pipeline; the simplest is maybe MetalImageProcessing, so you could strip that down (even though it uses a compute shader, which is overcomplicated for drawing; you'd want to substitute standard vertex and fragment shaders).
Second, you need to learn how vertex and fragment shaders work and how to draw stuff with them, see shadertoy for this.
Just noticed the user's last comment: if you have an array of bytes that represents an image, then you can just make an MTLTexture out of that and render it to the screen. See Apple's example above, but change the compute shader to standard vertex and fragment shaders for faster performance.
First - do subroutines require GLSL 4.0+? So they are unavailable in the GLSL version of OpenGL ES 2.0?
I don't quite understand what multi-pass shaders are.
Here is my picture of it:
Draw a group of somethings (e.g. sprites) to an FBO using some shader.
Think of the FBO as a big texture for a big screen-sized quad and use another shader which, for example, turns the texture colors to grayscale.
Draw the FBO-textured quad to the screen with the grayscaled colors.
Or is this called something else?
So does multi-pass mean using one shader's output as another shader's input? So we render one object twice or more? How does a shader's output get to another shader's input?
For example
glUseProgram(shader_prog_1);//Just plain sprite draw
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, /*some texture_id*/);
//Setting input for shader_prog_1
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
//Disabling arrays, buffers
glUseProgram(shader_prog_2);//Uses same vertex, but different fragment shader program
//Setting input for shader_prog_2
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
Can anyone provide a simple example of this in a basic way?
In general, the term "multi-pass rendering" refers to rendering the same object multiple times with different shaders, and accumulating the results in the framebuffer. The accumulation is generally done via blending, not with shaders. That is, the second shader doesn't take the output of the first. They each perform part of the computation, and the blend stage combines them into the final value.
Nowadays, this is primarily done for lighting in forward-rendering scenarios. You render each object once for each light, passing different lighting parameters and possibly using different shaders each time you render a light. The blend mode used to accumulate the results is additive, since light reflectance is an additive property.
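Schematically, with illustrative names, the accumulation looks like this: the first light writes the framebuffer normally, every further light is added on top, so the blend stage (not a shader) does the summing.

/* Forward-rendered multi-pass lighting, schematically. lighting_prog, lights[],
 * setLightUniforms() and drawObject() are placeholders. */
glUseProgram(lighting_prog);
glDisable(GL_BLEND);                        /* first light writes directly */
setLightUniforms(lighting_prog, &lights[0]);
drawObject(object);

glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);                /* additive: each pass adds its contribution */
glDepthFunc(GL_EQUAL);                      /* reuse the depth laid down by the first pass */
for (int i = 1; i < numLights; ++i) {
    setLightUniforms(lighting_prog, &lights[i]);
    drawObject(object);                     /* same geometry, possibly a different shader */
}
glDepthFunc(GL_LESS);
glDisable(GL_BLEND);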
Do subroutines require GLSL 4.0+? So they are unavailable in the GLSL version of OpenGL ES 2.0?
This is a completely different question from the entire rest of your post, but the answer is yes and no.
No, in the sense that ARB_shader_subroutine is an OpenGL extension, and it therefore could be implemented by any OpenGL implementation. Yes, in the practical sense that any hardware that actually could implement shader_subroutine could also implement the rest of GL 4.x and therefore would already be advertising 4.x functionality.
In practice, you won't find shader_subroutine supported by non-4.x OpenGL implementations.
It is unavailable in GLSL ES 2.0 because it's GLSL ES. Do not confuse desktop OpenGL with OpenGL ES. They are two different things, with different GLSL versions and different featuresets. They don't even share extensions (except for a very few recent ones).
Is it possible to get any values out of an OpenGL ES 2.0 shader? I'd like to use the GPU to do some processing (not 3D). The only thing I could think of is to render to the canvas and then use readPixels to get the colors (preferably in a large 2D array).
Yes, that's called GPGPU. The only way is to draw to a framebuffer or a texture; here is a tutorial that explains it (just stick to the GLSL version).
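In outline it works like this (ES 2.0, illustrative names, error checking omitted): attach a texture to an FBO, run the computation as a fragment shader over a full-screen quad, then read the result back.

/* Inside your setup/draw code, assuming a current ES 2.0 context. */
GLuint tex, fbo;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);

/* "Compute": draw a full-screen quad with the fragment shader that does the work. */
glViewport(0, 0, width, height);
glUseProgram(gpgpu_prog);                   /* gpgpu_prog is a placeholder program */
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

/* Read the result back to the CPU (RGBA8 in ES 2.0, which limits precision). */
GLubyte *out = (GLubyte *)malloc((size_t)width * height * 4);
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, out);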
Is this easy to do? I don't want to use texture images. I want to create a rectangle, probably of two polygons, and then set a color on it. A friend who claims to know OpenGL a little bit said that I must always use triangles for everything and that I must use textures for everything when I want it colored. I can't imagine that is true.
You can set per-vertex colors (which can all be the same) and draw a quad (in ES, as two triangles). The tricky part about OpenGL ES is that it doesn't support immediate mode, so you have a much steeper initial learning curve compared to OpenGL.
This question covers the differences between OpenGL and ES:
OpenGL vs OpenGL ES 2.0 - Can an OpenGL Application Be Easily Ported?
With OpenGL ES 2.0, you do have to use a shader, which (among other things) normally sets the color. As long as you want one solid color for the whole thing, you can do it in the vertex shader.
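A minimal sketch of that (ES 2.0, illustrative names): a four-vertex triangle strip, with the color supplied as a uniform and passed through the vertex shader.

/* A solid-colored rectangle in ES 2.0: two triangles via a triangle strip and a
 * trivial shader pair. prog, a_position_loc and u_color_loc are placeholders that
 * come from the usual compile/link/glGet*Location steps. */
static const char *kVS =
    "attribute vec2 a_position;\n"
    "uniform vec4 u_color;\n"
    "varying vec4 v_color;\n"
    "void main() {\n"
    "  v_color = u_color;        // one solid color for the whole quad\n"
    "  gl_Position = vec4(a_position, 0.0, 1.0);\n"
    "}\n";

static const char *kFS =
    "precision mediump float;\n"
    "varying vec4 v_color;\n"
    "void main() { gl_FragColor = v_color; }\n";

static const GLfloat quad[] = { -0.5f, -0.5f,   0.5f, -0.5f,
                                -0.5f,  0.5f,   0.5f,  0.5f };

glUseProgram(prog);
glUniform4f(u_color_loc, 1.0f, 0.0f, 0.0f, 1.0f);                /* red */
glEnableVertexAttribArray(a_position_loc);
glVertexAttribPointer(a_position_loc, 2, GL_FLOAT, GL_FALSE, 0, quad);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);                           /* the two triangles */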