How do I properly group polygons into arrays? - opengl-es

I want to render a scene in OpenGL ES, but I have a problem.
Because there is no immediate mode in ES, and simulating immediate mode with single-polygon buffers is slow, I can't just switch textures and skip invisible polygons, so I have to group my polygons.
Here are characteristics of different polygons:
Diffuse texture (mipmapped, lots of them).
Lightmap texture (packed, up to 64 textures).
Visibility.
At first I thought to group the polygons only by visibility area, but I couldn't find a way to use texture index arrays.
So, how do I properly make buffers of polygons to render?

I will make visibility groups with texture subgroups. Quake uses 64 128x128 lightmaps, I'm going to replace them with a single 1024x1024 lightmap, since modern hardware supports textures of such size.

Related

rendering millions of voxels using 3D textures with three.js

I am using three.js to render a voxel representation as a set of triangles. I have got it render 5 million triangles comfortably but that seems to be the limit. you can view it online here.
select the Dublin model at resolution 3 to see a lot of triangles being drawn.
I have used every trick to get it this far (buffer geometry, voxel culling, multiple buffers) but I think it has hit the maximum amount that openGL triangles can accomplish.
Large amounts of voxels are normally rendered as a set of images in a 3D texture and while there are several posts on how to hack 2d textures into 3D textures but they seem to have a maximum limit on the texture size.
I have searched for tutorials or examples using this approach but haven't found any. Has anyone used this approach before with three.js
Your scene is render twice, because SSAO need depth texture. You could use WEBGL_depth_texture extension - which have pretty good support - so you just need a single render pass. You can stil fallback to low-perf-double-pass if extension is unavailable.
Your voxel's material is double sided. It's may be on purpose, but it may create a huge overdraw.
In your demo, you use a MeshPhongMaterial and directional lights. It's a needlessly complex material. Your geometries don't have any normals so you can't have any lighting. Try to use a simpler unlit material.
Your goal is to render a huge amount of vertices, so assuming the framerate is bound by vertex shader :
try stuff like https://github.com/GPUOpen-Tools/amd-tootle to preprocess your geometries. Focusing on prefetch vertex cache and vertex cache.
reduce the bandwidth used by your vertex buffers. Since your vertices are aligned on a "grid", you could store vertices position as 3 Shorts instead of 3 floats, reducing your VBO size by 2. You could use a same tricks if you had normals since all normals should be Axis aligned (cubes)
generally reduce the amount of varyings needed by fragment shader
if you need more attributes than just vec3 position, use one single interleaved VBO instead of one per attrib.

Is drawing outside the viewport in OpenGL expensive?

I have several thousand quads to draw, some of which might fall entirely outside the viewport. I could write code which will detect which quads fall wholly outside viewport and ask OpenGL to draw only those which will be at least partially visible. Alternatively, I could simply have OpenGL draw all of the quads, regardless of whether they intersect with the viewport.
I don't have enough experience with OpenGL to know if one of these is obviously better (or if OpenGL offers some quick viewport intersection test I can use). Are draws outside the viewport close to being no-ops, or are they expensive enough that I should I try to avoid them?
It depends on your circumstances.
Drawing is best done in batches, preferably batches that are static in structure (ie: each batch is drawn in its entirety). So you shouldn't be culling down at the quad level. But doing some culling of large groups of quads is not unwelcome.
The primary performance that you'll lose is vertex transform (aka: your vertex shader). A vertex shader has to be run on every vertex you provide, regardless of anything else. However, hardware will discard triangles that are trivially outside of the viewport, so you won't soak up any fillrate or other performance.
However, that doesn't mean that it's OK if your vertex T&L is cheap. Rendering large blocks of triangles that aren't visible may very well stall the rasterizer, because all of the triangles are being culled. That is, if you draw a lot of stuff that gets culled by being off screen, the fillrate that you might have used on actually visible triangles may be lost.
So it's not a good idea to just hurl geometry at the GPU willy-nilly.
In any case, if you're doing 2D rendering, coarse culling of discrete groups of quads is really all you need. You could divide your tilemap into screen-sized portions, and you draw up to 4 of these based on the position of the camera.

Non predefined multiple light sources in OpenGL ES 2.0

There is a great article about multiple light sources in GLSL
http://en.wikibooks.org/wiki/GLSL_Programming/GLUT/Multiple_Lights
But light0 and light1 parameters described in shader code, what if must draw flare gun shots, e.g every flare has it own position, color and must illuminate surroundings. How we manage other objects shader to deal with unknown (well there is a limit to max flares on the screen) position, colors of flares? For example there will be 8 max flares on screen, what i must to pass 8*2 uniforms, even if they not exist at this time?
Or imagine you making level editor, user can place lamps, how other objects will "know" about new light source and render then new lamp has been added?
I think there must be clever solution, but i can't find one.
Lighting equations usually rely on additive colour. So the output is the colour of light one plus the colour of light two plus the colour of light three, etc.
One of the in-framebuffer blending modes offered by OpenGL is additive blending. So the colour output of anything new that you draw will be added to whatever is already in the buffer.
The most naive solution is therefore to write your shader to do exactly one light. If you have multiple lights, draw the scene that many times, each time with a different nominated line. It's an example of multipass rendering.
Better solutions involve writing shaders to do two, four, eight or whatever lights at once, doing, say, 15 lights as an 8-light draw then a 4-light draw then a 2-light draw then a 1-light draw, and including only geometry within reach of each light when you do that pass. Which tends to mean finding intelligent ways to group lights by locality.
EDIT: with a little more thought, I should add that there's another option in deferred shading, though it's not completely useful on most GL ES devices at the moment due to the limited options for output buffers.
Suppose theoretically you could render your geometry exactly once and store whatever you wanted per pixel. So you wouldn't just output a colour, you'd output, say, a position in 3d space, a normal, a diffuse colour, a specular colour and a specular exponent. Those would then all be in a per-pixel buffer.
You could then render each light by (i) working out the maximum possible space it can occupy when projected onto the screen (so, a 2d rectangle that relates directly to pixels); and (ii) rendering the light as a single quad of that size, for each pixel reading the relevant values from the buffer you just set up and outputting an appropriately lit colour.
Then you'd do all the actual geometry in your scene only exactly once, and each additional light would cost at most a single, full-screen quad.
In practice you can't really do that because the output buffers you tend to be able to use in ES provide too little storage. But what you can usually do is render to a 32bit colour buffer with an attached depth buffer. So you can just store depth in the depth buffer and work out world (x, y, z) from that plus the [uniform] position of the camera in the light shader. You could store 8-bit versions of normal x and y in the colour buffer so as to spend 16 bits and work out z in the colour buffer because you know that the normal is always of unit length. Then, to pick a concrete example at random, maybe you could store a 16-bit version of the diffuse colour in the remaining space, possibly in YCrCb with extra storage for Y.
The main disadvantage is that hardware antialiasing then doesn't due to much the same sort of concerns as transparency and depth buffers. But if you get to the point where you save dramatically on lighting it might still make sense to do manual antialiasing by rendering a large version of the scene and then scaling it down in a final pass.

WebGL and rectangular (power of two) textures

WebGL is known to have poor support for NPOT (non-power-of-two) textures. But what about rectangular textures where both width and height are powers of two? Specifically, I'm trying to draw to a rectangular framebuffer as part of a render-to-texture scheme to generate some UI elements. The framebuffer would need to be 512x64 or thereabouts.
How much less efficient would this be in terms of drawing? If framerate is a concern, would I do better to allocate a 512x512 power-of-two-sized buffer and only render to the top 64 pixels, sacrificing memory for speed?
There has never been the constraint for that width must equal height.
More specifically: 2D textures are not at all required to be square; a 512x64 texture is not only allowed but should also be efficiently implemented by the driver; on the other hand cube maps need to be square.
For 2D textures, you can use NPOT textures if both wrap modes are CLAMP_TO_EDGE and your minification filter does not require a mipmap. Efficiency of NPOT texture may vary depending on your driver.

OpenGL: Large texture 'maps' versus multiple singular textures performance

I'm deciding if I want to create a map of multiple textures of my entire scene into several 1024x1024 texture atlases. Is it much of a gain at all to have fewer, larger textures so that you can make less GL render calls, albeit with the same number of overall triangles? Is this something that's common?
In my case, I'm targeting an iPad with about a 5,000 triangle scene with 16 textures. Given the distribution I'm looking at around 5x more calls to glDrawArrays if I don't combine.

Resources