How to convert to an HDR renderer? - opengl-es

I am in the process of converting my webgl deferred renderer to one that uses high dynamic range. I've read a lot about the subject from various sources online and I have a few questions that I hope could be clarified. Most of the reading I have done covers HDR image rendering, but my questions pertain to how a renderer might have to change to support HDR.
As I understand it, HDR is essentially trying to capture higher light ranges so that we can see detail in both extremely bright and extremely dark scenes. Typically in games we use an intensity of 1 to represent white light and 0 for black. But in HDR / the real world the ranges are far more varied; for example, the sun in the engine might have an intensity of 10000 while a light bulb has an intensity of 10.
To cope with these larger ranges you have to convert your renderer to use floating point render targets (or ideally half floats as they use less memory) for its light passes.
My first question is on the lighting. Besides the floating point render targets, does this simply mean that if previously I had a light representing the sun, which was of intensity 1, it could/should now be represented as 10000? I.e.
float spec = calcSpec();
vec4 diff = texture2D( sampler, uv );
vec4 color = diff * max(0.0, dot( N, L )) * lightIntensity + spec; //Where lightIntensity is now 10000?
return color;
Are there any other fundamental changes to the lighting system (other than float textures and higher ranges)?
Following on from this, we now have a float render target that has additively accumulated all the light values (in the higher ranges as described). At this point I might do some post processing on the render target with things like bloom. Once complete it now needs to be tone-mapped before it can be sent to the screen. This is because the light ranges must be converted back to the range of our monitors.
So for the tone-mapping phase, I would presumably use a post process and then, using a tone-mapping formula, convert the HDR lighting to a low dynamic range. The technique I chose was John Hable's from Uncharted 2:
const float A = 0.15;
const float B = 0.50;
const float C = 0.10;
const float D = 0.20;
const float E = 0.02;
const float F = 0.30;
const float W = 11.2;
vec3 Uncharted2Tonemap(vec3 x)
{
return ((x*(A*x+C*B)+D*E)/(x*(A*x+B)+D*F))-E/F;
}
... // in main pixel shader
vec4 texColor = texture2D( lightSample, texCoord );
texColor *= 16.0; // hard-coded exposure adjustment
float ExposureBias = 2.0;
vec3 curr = Uncharted2Tonemap( ExposureBias * texColor.xyz );
vec3 whiteScale = 1.0 / Uncharted2Tonemap( vec3(W) );
vec3 color = curr * whiteScale;
// Gamma correction
color = pow( color, vec3(1.0 / 2.2) );
return vec4( color, 1.0 );
Tone mapping article
My second question is related to this tone-mapping phase. Is there much more to it than simply this technique? Is using higher light intensities and tweaking the exposure all that's required to be considered HDR, or is there more to it? I understand that some games have auto-exposure functionality to figure out the average luminance, but at the most basic level is this needed? Presumably you can just manually tweak the exposure?
Something else that's discussed in a lot of the documents is gamma correction. The gamma correction seems to be done in two places: first when textures are read, and then again when they are sent to the screen. When textures are read they must simply be changed to something like this:
vec4 diff = pow( texture2D( sampler, uv ), vec4(2.2) );
Then in the above tone mapping technique the output correction is done by:
pow( color, vec3(1.0 / 2.2) );
In his presentation John Hable says that not all textures must be corrected like this. Diffuse textures must be, but things like normal maps don't necessarily have to be.
My third question is on this gamma correction. Is it necessary in order for the above to work? Does it mean I have to change my engine everywhere diffuse maps are read?
That is my current understanding of what's involved in this conversion. Is it correct, and is there anything I have misunderstood or gotten wrong?

Light Calculation / Accumulation
Yes, you are generally able to keep your lighting calculation the same, and increasing, say, the intensity of directional lights above 1.0 is certainly fine. Another way the value can exceed one is simply by adding the contributions of several lights together.
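For illustration, a single light's contribution in a deferred light pass might look like the sketch below. The G-buffer samplers, the light uniforms, and the 10000.0 sun intensity are assumptions; the result is written to a float (or half-float) target with additive blending, so several lights can sum well past 1.0.
precision highp float;
// assumed G-buffer inputs and light uniforms (names are placeholders)
uniform sampler2D albedoSampler;
uniform sampler2D normalSampler;
uniform vec3 lightDir;   // normalized direction the light travels, world space
uniform vec3 lightColor; // e.g. vec3(10000.0) for the sun, vec3(10.0) for a bulb
varying vec2 vUv;
void main()
{
    vec3 albedo = texture2D( albedoSampler, vUv ).rgb;
    vec3 N = normalize( texture2D( normalSampler, vUv ).xyz * 2.0 - 1.0 );
    vec3 hdr = albedo * max( 0.0, dot( N, -lightDir ) ) * lightColor;
    // accumulated additively into a floating-point render target,
    // so the summed value is free to exceed 1.0
    gl_FragColor = vec4( hdr, 1.0 );
}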
Tone Mapping
You certainly understood the concept. There are quite a few different ways to do the actual mapping, from the simple / naive color = clamp(hdrColor * exposure, 0.0, 1.0) to the more sophisticated (and better) one you posted.
Adaptive tone mapping can quickly become more complicated. Again, the naive way is to simply normalize colors by dividing by the brightest pixel, which will certainly make it hard or impossible to perceive details in the darker parts of the image. You can also average the brightness and clamp. Or you can keep whole histograms of the last several frames and use those in your mapping.
Another method is to normalize each pixel only with the values of the neighbouring pixels, i.e. "local tone mapping". This one is not usually done in real-time rendering.
While it may sound complicated, the formula you posted will generate very good results, so it is fine to go with it. Once you have a working implementation, feel free to experiment here. There are also great papers available :)
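As a point of comparison with the operator you posted, the naive map and a simple global Reinhard-style map could be sketched like this (sampler and varying names follow your snippet; exposure is assumed to be a uniform, set manually or derived from an average-luminance pass):
uniform sampler2D lightSample; // the HDR accumulation target, as in your snippet
uniform float exposure;        // assumed uniform: manual or from average luminance
varying vec2 texCoord;
... // in main
vec3 hdr = texture2D( lightSample, texCoord ).rgb;
vec3 ldrClamped = clamp( hdr * exposure, 0.0, 1.0 );              // naive: scale and clip
vec3 ldrReinhard = ( hdr * exposure ) / ( 1.0 + hdr * exposure ); // compresses highlights instead of clipping them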
Gamma
Now, gamma correction is important even if you do not use HDR rendering. But don't worry, it is not hard.
The most important thing is to always be aware of what color space you are working in. Just like a number without a unit, a color without a color space seldom makes sense. Now, we like to work in linear (RGB) color space in our shaders, meaning a color with twice the RGB values should be twice as bright. However, this is not how monitors work.
Cameras and photo-editing software often hide all this from us and simply save pictures in the format the monitor likes (called sRGB).
There is an additional advantage to sRGB, and that is compression. We usually save images with 8/16/32 bits per channel. If you save pictures in linear space and you have small but very bright spots in the image, your 8/16/32 bits may not be precise enough to preserve brightness differences in the darker parts of the image, and if you display them again (gamma-correctly, of course) details may be lost in the darks.
You can change the color space your images are saved in in many cameras and programs, even if the option is sometimes a bit hidden. So if you tell your artists to save all images in linear (RGB) color space, you do not need to gamma-correct the images at all. But since most programs like sRGB, and sRGB offers better compression, it is generally a good idea to save images that describe color in sRGB; those therefore need to be gamma-corrected. Images that describe values/data, like normal maps or bump maps, are usually saved in linear color space (if your normal [1.0, 0.5, 0.0] suddenly does not correspond to a 45-degree angle, everybody will be confused; the compression advantage is also nil for non-colors).
If you want to use an sRGB texture, just tell OpenGL and it will convert it to linear color space for you, without a performance hit.
void glTexImage2D( GLenum target,
GLint level,
GLint internalFormat, // use GL_SRGB here
GLsizei width,
GLsizei height,
GLint border,
GLenum format,
GLenum type,
const GLvoid * data);
Oh, and of course you have to gamma-correct everything you send to your display (so convert from linear to sRGB, or gamma 2.2). You can do this in your tone mapping or in another post-process step. Or let OpenGL do it for you; see glEnable(GL_FRAMEBUFFER_SRGB).
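If sRGB texture or framebuffer support is not available (in WebGL 1 it requires an extension), the conversions can be approximated manually in the shader, much like the pow(…, 2.2) in the question. A small sketch of such helpers:
// approximate sRGB decode on texture read (sRGB -> linear)
vec3 toLinear( vec3 srgb ) { return pow( srgb, vec3(2.2) ); }
// approximate sRGB encode before display (linear -> sRGB)
vec3 toSRGB( vec3 lin ) { return pow( lin, vec3(1.0 / 2.2) ); }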

Related

WebGL – Stretching non power of two textures or adding padding

I'm using images of varying sizes and aspect ratios, uploaded through a CMS in Three.js / A-Frame. Of course, these aren't power of two textures. It seems like I have two options for processing them.
The first is to stretch the image, as is done in Three.JS – with the transformation undone when applied to the plane.
The second is to pad the image with extra pixels, which aren't displayed thanks to custom UVs.
Would one approach be better than the other? Based on image quality, I'd imagine not doing any stretching would be preferred.
EDIT:
For those interested, I couldn't spot a difference between the two approaches. Here's the code for altering the UVs to cut off the unused texture padding:
var uvX = 1;
var uvY = 0;
if(this.orientation === 'portrait') {
uvX = (1.0 / (this.data.textureWidth / this.data.imageWidth));
} else {
uvY = 1.0 - (this.data.imageHeight / this.data.textureHeight);
}
var uvs = new Float32Array( [
0, uvY,
uvX, uvY,
uvX, 1,
0, 1
]);
EDIT 2:
I hadn't set the texture up properly.
Side by side, the non-stretched (padded) image does look better up close – but not a huge difference:
Left: Stretched to fit the power of two texture. Right: Non-stretched with padding
Custom UVs can be a bit of a pain (especially when users can modify the texturing), and padding can break tiling when repeating the texture (unless very special care is taken).
Just stretch the images (or let Three.js do it for you). That's what most engines (like Unity) do anyway. There -might- be a tiny bit of visual degradation if the stretch algorithm and texel sampling do not 100% match, but it will be fine.
The general idea is that if your users -really- cared about sampling quality at that level, they'd carefully handcraft POT textures anyway. Usually, they just want to throw texture images at their models and have them look about right... and they will.

How can I best implement a weight-normalizing blend operation in opengl?

Suppose I have a source color in RGBA format (sr, sg, sb, sa), and similarly a destination color (dr, dg, db, da), all components assumed to be in [0.0, 1.0].
Let p = (sa)/(sa+da), and q = da/(sa+da). Note that p+q = 1.0. Do anything you want if sa and da are both 0.0.
I would like to implement blending in opengl so that the blend result =
(p*sr + q*dr, p*sg + q*dg, p*sb + q*db, sa+da).
(Or to be a smidge more rigorous, following https://www.opengl.org/sdk/docs/man/html/glBlendFunc.xhtml, I'd like f_R, f_G, and f_B to be either p for src or q for dst; and f_A = 1.)
For instance, in the special case where (sa+da) == 1.0, I could use glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA); but I'm specifically attempting to deal with alpha values that do not sum to 1.0. (That's why I call it 'weight-normalizing' - I want to treat the src and dst alphas as weights that need to be normalized into linear combination coefficients).
You can assume that I have full control over the data being passed to opengl, the code rendering, and the vertex and fragment shaders. I'm targeting WebGL, but I'm also just curious in general.
The best I could think of was to blend with ONE, ONE, premultiply all src rgb values by alpha, and do a second pass in the end that divides by alpha. But I'm afraid I sacrifice a lot of color depth this way, especially if the various alpha values are small.
I don't believe the standard blend equation can do this; at least I can't think of a way.
However, this is fairly easy to do with OpenGL. Blending might just be the wrong tool for the job. I would make what you currently describe as "source" and "destination" both input textures to the fragment shader. Then you can mix and combine them any way your heart desires.
Say you have two textures you want to combine in the way you describe. Right now you might have something like this:
Bind texture 1.
Render to default framebuffer, sampling the currently bound texture.
Set up fancy blending.
Bind texture 2.
Render to default framebuffer, sampling the currently bound texture.
What you can do instead:
Bind texture 1 to texture unit 0.
Bind texture 2 to texture unit 1.
Render to default framebuffer, sampling both bound textures.
Now you have the values from both textures available in your shader code, and can apply any kind of logic and math to calculate the combined color.
The same thing works if your original data does not come from a texture, but is the result of rendering. Let's say that you have two parts in your rendering process, which you want to combine in the way you describe:
Attach texture 1 as render target to FBO.
Render first part of content.
Attach texture 2 as render target to FBO.
Render second part of content.
Bind texture 1 to texture unit 0.
Bind texture 2 to texture unit 1.
Render to default framebuffer, sampling both bound textures.
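In either setup, the per-pixel combination in the fragment shader can then be a direct translation of your formula. A minimal sketch (sampler and varying names are assumptions):
precision mediump float;
uniform sampler2D srcTex; // bound to texture unit 0
uniform sampler2D dstTex; // bound to texture unit 1
varying vec2 vTexCoord;
void main()
{
    vec4 s = texture2D( srcTex, vTexCoord );
    vec4 d = texture2D( dstTex, vTexCoord );
    float sum = s.a + d.a;
    // p = sa/(sa+da), q = da/(sa+da); guard against sa+da == 0
    float p = sum > 0.0 ? s.a / sum : 0.0;
    float q = 1.0 - p;
    gl_FragColor = vec4( p * s.rgb + q * d.rgb, s.a + d.a );
}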

performance - drawing many 2d circles in opengl

I am trying to draw large numbers of 2d circles for my 2d games in opengl. They are all the same size and have the same texture. Many of the sprites overlap. What would be the fastest way to do this?
an example of the kind of effect I'm making http://img805.imageshack.us/img805/6379/circles.png
(It should be noted that the black edges are just due to the expanding explosion of circles; the area was filled in a moment after this screenshot was taken.)
At the moment I am using a pair of textured triangles to make each circle. I have transparency around the edges of the texture so as to make it look like a circle. Using blending for this proved to be very slow (and z culling was not possible as they were rendered as squares to the depth buffer). Instead I am not using blending but having my fragment shader discard any fragments with an alpha of 0. This works, however it means that early z is not possible (as fragments are discarded).
The speed is limited by the large amounts of overdraw and the gpu's fillrate. The order that the circles are drawn in doesn't really matter (provided it doesn't change between frames creating flicker) so I have been trying to ensure each pixel on the screen can only be written to once.
I attempted this by using the depth buffer. At the start of each frame it is cleared to 1.0f. Then when a circle is drawn it changes that part of the depth buffer to 0.0f. When another circle would normally be drawn there it is not as the new circle also has a z of 0.0f. This is not less than the 0.0f that is currently there in the depth buffer so it is not drawn. This works and should reduce the number of pixels which have to be drawn. However; strangely it isn't any faster. I have already asked a question about this behavior (opengl depth buffer slow when points have same depth) and the suggestion was that z culling was not being accelerated when using equal z values.
Instead I have to give all of my circles separate false z-values from 0 upwards. Then when I render using glDrawArrays and the default of GL_LESS we correctly get a speed boost due to z culling (although early z is not possible as fragments are discarded to make the circles possible). However this is not ideal as I've had to add in large amounts of z related code for a 2d game which simply shouldn't require it (and not passing z values if possible would be faster). This is however the fastest way I have currently found.
Finally, I have tried using the stencil buffer; here I used:
glStencilFunc(GL_EQUAL, 0, 1);
glStencilOp(GL_KEEP, GL_INCR, GL_INCR);
Where the stencil buffer is reset to 0 each frame. The idea is that after a pixel is drawn to for the first time, it is changed to a non-zero value in the stencil buffer. That pixel should then not be drawn to again, thereby reducing the amount of overdraw. However, this has proved to be no faster than just drawing everything without the stencil buffer or a depth buffer.
What is the fastest way people have found to do what I am trying to do?
The fundamental problem is that you're fill limited, which is the GPU's inability to shade all the fragments you ask it to draw in the time you're expecting. The reason your depth-buffering trick isn't effective is that the most time-consuming part of processing is shading the fragments (either through your own fragment shader or through the fixed-function shading engine), which occurs before the depth test. The same issue occurs with stencil; shading the pixel occurs before stenciling.
There are a few things that may help, but they depend on your hardware:
render your sprites from front to back with depth buffering. Modern GPUs often try to determine if a collection of fragments will be visible before sending them off to be shaded. Roughly speaking, the depth buffer (or a representation of it) is checked to see if the fragment that's about to be shaded will be visible, and if not, its processing is terminated at that point. This should help reduce the number of pixels that need to be written to the framebuffer.
Use a fragment shader that immediately checks your texel's alpha value, and discards the fragment before any additional processing, as in:
varying vec2 texCoord;
uniform sampler2D tex;
void main()
{
vec4 texel = texture2D( tex, texCoord );
if ( texel.a < 0.01 ) discard;
// rest of your color computations
}
(you can also use alpha test in fixed-function fragment processing, but it's impossible to say if the test will be applied before the completion of fragment shading).

Render depth buffer to texture

Quite new at shaders, so please bear with me if I am doing something silly here. :)
I am trying to render the depth buffer of a scene to a texture using opengl ES 2.0 on iOS, but I do not seem to get entirely accurate results unless the models have a relatively high density of polygons showing on the display.
So, for example if I render a large plane consisting of only four vertices, I get very inaccurate results, but if I subdivide this plane the results get more accurate for each subdivision, and ultimately I get a correctly rendered depth buffer.
This reminds me a lot of affine versus perspective-projected texture mapping issues, and I guess I need to play around with the ".w" component somehow to fix this. But I thought the "varying" variables should take this into account already, so I am a bit at a loss here.
This is my vertex and fragment shader:
[vert]
uniform mat4 uMVPMatrix;
attribute vec4 aPosition;
varying float objectDepth;
void main()
{
gl_Position=uMVPMatrix * aPosition;
objectDepth=gl_Position.z;
}
[frag]
precision mediump float;
varying float objectDepth;
void main()
{
//Divide by scene clip range, set to a constant 200 here
float grayscale=objectDepth/200.0;
gl_FragColor = vec4(grayscale,grayscale,grayscale,1.0);
}
Please note that this shader is simplified a lot just to highlight the method I am using. Although to the naked eye it seems to work well in most cases, I am in fact rendering to 32-bit textures (by packing a float into ARGB), and I need very high accuracy for later processing or I get noticeable artifacts.
I can achieve pretty high precision by cranking up the polygon count, but that drives my framerate down a lot, so, is there a better way?
You need to divide z by the w component.
This is very simple: the depth is not linear, so you cannot use linear interpolation for z. You will solve it very easily if you interpolate 1/z instead of z. You can also perform some w math, exactly as suggested by rasmus.
You can read more about coordinate interpolation at http://www.luki.webzdarma.cz/eng_05_en.htm (page about implementing a simple software renderer)
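A sketch of the w-based approach, building on the shaders in the question (note this stores the non-linear NDC depth rather than the question's eye-space z divided by 200.0; the 0.5 remap is an assumption about how the value should be stored):
// vertex shader: also pass the clip-space w
uniform mat4 uMVPMatrix;
attribute vec4 aPosition;
varying float objectDepth;
varying float objectW;
void main()
{
    gl_Position = uMVPMatrix * aPosition;
    objectDepth = gl_Position.z;
    objectW = gl_Position.w;
}
// fragment shader: do the divide per fragment
precision mediump float;
varying float objectDepth;
varying float objectW;
void main()
{
    float ndcDepth = objectDepth / objectW; // roughly -1..1 inside the frustum
    float grayscale = ndcDepth * 0.5 + 0.5; // remap to 0..1 for storage
    gl_FragColor = vec4( vec3(grayscale), 1.0 );
}
With the default depth range, gl_FragCoord.z in the fragment shader already holds this same window-space depth, which makes for a quick sanity check.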

Sticker/Decal in OpenGL: Draw a texture without the stretch from CLAMP_TO_EDGE?

Basically I'm trying to make a sticker on a plane. The texture image shows up on the plane (and can move around), but I don't want to stretch the image to the edge or repeat it. How can I just have the image on the wall like a sticker? I'm using shaders; I don't know if that's where I should tackle the problem. Imagine a checkerboard is my image; this is what happens:
http://i.stack.imgur.com/bBDM0.gif
You could incorporate a range check into your shader. Setting clamp_to_edge affects what pixels you get back from sampling only, so the texture coordinates coming into your shader will still be outside of the range from zero to one when you're in the region where you don't want to output any pixels. On the hardware commonly used for GL ES today, it's faster to pass on a colour with an alpha of 0 than to discard, so do that if your blend mode allows it.
E.g. supposing you currently have just the most trivial textured fragment shader:
void main()
{
gl_FragColor = texture2D(tex2D, texCoordVarying);
}
You could instead go with:
void main()
{
if(clamp(texCoordVarying, vec2(0.0, 0.0), vec2(1.0, 1.0)) == texCoordVarying)
gl_FragColor = texture2D(tex2D, texCoordVarying);
else
gl_FragColor = vec4(0.0); // or discard, if you need to
}
However, an explicit range check is going to add processing cost. Obviously worry about that only if it becomes an issue, but the solution in that case is just to add a border inside your texture. So if your texture is 256x256 then you'd use the central 254x254 pixels and put a completely transparent border around the outside.
EDIT: thinking more clearly, if alpha rather than discard is acceptable then you could use the step function to eliminate the conditional. Do it four times over - once for each limit on each axis, invert the results where you want a greater than test rather than a less than, then multiply the output alpha by all four results. If any one is zero, the output alpha will be zero.
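For example, a branch-free version of the shader above might look like this (a sketch following the same variable names):
void main()
{
    vec4 texel = texture2D( tex2D, texCoordVarying );
    // step(edge, x) returns 1.0 when x >= edge and 0.0 otherwise, so the
    // product is 1.0 only when both coordinates lie inside [0, 1]
    float inside = step( 0.0, texCoordVarying.x ) * step( texCoordVarying.x, 1.0 )
                 * step( 0.0, texCoordVarying.y ) * step( texCoordVarying.y, 1.0 );
    gl_FragColor = vec4( texel.rgb, texel.a * inside );
}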
Use CLAMP_TO_BORDER instead and don't forget to set the border color.
