Transition shader - image

I have created a transition shader.
This is what it does:
On each update, the color value that should become transparent (the 'alpha' threshold) changes.
Then a check is performed for each pixel:
If the color of the pixel is greater than the 'alpha' value,
set this pixel to transparent.
Else, if the color of the pixel is greater than the 'alpha' value - 50,
set this pixel to partly transparent.
Else,
set the color to black.
EDIT (DELETED OLD PARTS):
I tried converting my GLSL into AGAL (using http://cmodule.org/glsl2agal):
Fragment shader:
const float alpha = 0.8;
varying vec2 TexCoord; //not used but required for converting
uniform sampler2D transition;//not used but required for converting
void main()
{
vec4 color = texture2D(transition, TexCoord.st);//not used but required for converting
color.a = float(color.r < alpha);
if(color.r >= (alpha - 0.1)){
color.a = 0.2 * (color.r - alpha - 0.1);
}
gl_FragColor = vec4(0, 0, 0, color.a);
}
And I've customized the output and added that to a (custom) Starling filter:
var fragmentShader:String =
"tex ft0, v0, fs0 <2d, clamp, linear, mipnone> \n" + // copy color to ft0
"slt ft0.w, ft0.x, fc0.x \n" + // alpha = red < inputAlpha
"mov ft0.xyz, fc1.xyzz \n" + // set color to black
"mov oc, ft0";
mShaderProgram = target.registerProgramFromSource(PROGRAM_NAME, vertexShader, fragmentShader);
It works, and when I set the filter's alpha it updates correctly. The only thing left is the partial transparency, but I have no idea how to do that.
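One possible direction for that last piece, as an untested AGAL sketch (fc0.y here is an extra constant, not in the code above, holding the width of the fade band, e.g. 0.1):
tex ft0, v0, fs0 <2d, clamp, linear, mipnone>   // sample the transition texture
sub ft1.x, fc0.x, ft0.x                         // threshold - red
div ft1.x, ft1.x, fc0.y                         // divide by the fade band width (fc0.y)
sat ft0.w, ft1.x                                // alpha: 0 above the threshold, ramping up to 1 below threshold - band
mov ft0.xyz, fc1.xyzz                           // set color to black (fc1 = 0,0,0)
mov oc, ft0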

Swap the loops on the Y and X coordinates. By using X in the inner loop you make better use of the L1 cache and the CPU's prefetcher.
Some minor hints:
Remove the zeros for cleaner code:
const c:uint = a << 24
Verify that 255/50 is collapsed into a single constant by the compiler.

Don't bother doing it with BitmapData once you're using Starling.
I didn't get whether you're grayscaling it yourself or not. If not, just create a Starling filter for grayscale (the pixel shader below will do the trick):
tex ft0, v0, fs0 <2d,linear,clamp>   // sample the texture
add ft1.x, ft0.x, ft0.y              // r + g
add ft1.x, ft1.x, ft0.z              // + b
div ft1.x, ft1.x, fc0.x              // divide by fc0.x (presumably 3.0) to get the average
mov ft0.xyz, ft1.xxx                 // write the average to r, g and b
mov oc, ft0
And for the alpha transition, just extend the Image class, implement IAnimatable, and add it to the Juggler. In advanceTime just do a this.alpha -= VALUE;
Simple as that :)
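A minimal sketch of that idea (untested; the class name and fade constant are illustrative):
import starling.animation.IAnimatable;
import starling.core.Starling;
import starling.display.Image;
import starling.textures.Texture;

public class FadingImage extends Image implements IAnimatable
{
    private static const FADE_PER_SECOND:Number = 0.5; // illustrative fade speed

    public function FadingImage(texture:Texture)
    {
        super(texture);
    }

    // Called by the Juggler each frame with the elapsed time in seconds.
    public function advanceTime(time:Number):void
    {
        alpha -= FADE_PER_SECOND * time;
        if (alpha < 0) alpha = 0;
    }
}

Usage: Starling.juggler.add(new FadingImage(myTexture));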

Just going to elaborate a bit on Paxel's answer. I discussed the L1 caching with another developer, Jackson Dunstan: where the speed improvement comes from, and what other improvements can be made to code like this to gain performance.
Jackson then posted a blog entry, which can be read here: Take Advantage of CPU caching.
I'll post some of the relevant items. First, the bitmap data is stored in memory by rows. The rows' memory addresses might look something like this:
row 1: 0 1 2 3 4 5
row 2: 6 7 8 9 10 11
row 3: 12 13 14 15 16 17
Now running your inner loop along the rows lets you take advantage of the L1 cache, since you read memory in order. So with X in the inner loop you'll read the first row as:
0 1 2 3 4 5
But if you were to put Y in the inner loop, you'd read it as:
0 6 12 1 7 13
As you can see, you are bouncing around memory addresses, which makes the whole process slower.
As for other optimizations, the suggestion is to cache your width and height getters by storing those properties in local variables. Also, Math.round() is pretty slow; replacing it would give a speed increase.
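A sketch of that loop shape (illustrative only; it assumes the BitmapData instance is in a local called bitmapData, and the per-pixel logic from the original question is not shown):
var w:int = bitmapData.width;   // cache the getters in locals
var h:int = bitmapData.height;
for (var y:int = 0; y < h; y++)        // rows in the outer loop
{
    for (var x:int = 0; x < w; x++)    // columns in the inner loop: sequential memory access
    {
        var color:uint = bitmapData.getPixel32(x, y);
        // ... per-pixel work goes here ...
        bitmapData.setPixel32(x, y, color);
    }
}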

Related

Rendering to custom FrameBuffer using same texture both as input and output

Some fragment shaders on ShaderToy (e.g. the fluid dynamics one, https://www.shadertoy.com/view/4tGfDW ) use the same buffer as both input and output. But when I try to do this in my C/C++ code it does not work (it renders strange checkerboard artifacts that look like inconsistent memory). To work around the issue I have to use two different framebuffers A and B and flip the textures (first render A to B, then render B back to A).
I understand that OpenGL does not allow using the same texture as both input and output (?) due to memory consistency issues.
But isn't there a more elegant solution than using two framebuffers? E.g. some lock, or a temporary cache (I don't know, some synchronization flag that takes care of this)?
EDIT - Details to answer the comment/question:
OpenGL (depending on the GL version) has some very specific rules about what can and can't be done when the same texture is used as render target and sampler input. Whether your use case can be implemented within this set of requirements is not clear, as you have not explained what exactly you need or want to do here.
Basically, I want to implement a fluid-dynamics solver (e.g. the one from the ShaderToy linked above) as well as other partial differential equation solvers. That means each output pixel depends on some convolution mask (derivative, Laplacian, average) of neighboring pixels. There may also be some movement (advection), which means reading values from distant pixels.
I have noticed that the artifacts appear mostly when I read/write pixels at different places - i.e. when the access is non-local (e.g. pixel [100,100] depends on pixel [10,10]).
Example of a simple fluid solver from ShaderToy:
vec4 solveFluid(sampler2D smp, vec2 uv, vec2 w, float time, vec3 mouse, vec3 lastMouse)
{
const float K = 0.2;
const float v = 0.55;
vec4 data = textureLod(smp, uv, 0.0);
vec4 tr = textureLod(smp, uv + vec2(w.x , 0), 0.0);
vec4 tl = textureLod(smp, uv - vec2(w.x , 0), 0.0);
vec4 tu = textureLod(smp, uv + vec2(0 , w.y), 0.0);
vec4 td = textureLod(smp, uv - vec2(0 , w.y), 0.0);
vec3 dx = (tr.xyz - tl.xyz)*0.5;
vec3 dy = (tu.xyz - td.xyz)*0.5;
vec2 densDif = vec2(dx.z ,dy.z);
data.z -= dt*dot(vec3(densDif, dx.x + dy.y) ,data.xyz); //density
vec2 laplacian = tu.xy + td.xy + tr.xy + tl.xy - 4.0*data.xy;
vec2 viscForce = vec2(v)*laplacian;
data.xyw = textureLod(smp, uv - dt*data.xy*w, 0.).xyw; //advection
vec2 newForce = vec2(0);
data.xy += dt*(viscForce.xy - K/dt*densDif + newForce); //update velocity
data.xy = max(vec2(0), abs(data.xy)-1e-4)*sign(data.xy); //linear velocity decay
#ifdef USE_VORTICITY_CONFINEMENT
data.w = (tr.y - tl.y - tu.x + td.x);
vec2 vort = vec2(abs(tu.w) - abs(td.w), abs(tl.w) - abs(tr.w));
vort *= VORTICITY_AMOUNT/length(vort + 1e-9)*data.w;
data.xy += vort;
#endif
data.y *= smoothstep(.5,.48,abs(uv.y-0.5)); //Boundaries
data = clamp(data, vec4(vec2(-10), 0.5 , -10.), vec4(vec2(10), 3.0 , 10.));
return data;
}
I have noticed that the artifacts appear mostly when I read/write pixels at different places - i.e. when the access is non-local (e.g. pixel [100,100] depends on pixel [10,10]).
Yes, this is never going to work on GPUs, as there are no guarantees whatsoever on the order of individual fragment shader invocations. So whether the invocation writing to pixel [100,100] sees the results of the invocation writing to [10,10], or the original data, is totally random. As per the spec, you get undefined values when reading in such a concurrent read/write scenario, so theoretically you could even get neither one nor the other, but see partial writes or totally different values (although that's not likely to occur on real-world hardware).
And ordering guarantees at such a scale simply do not make sense within the render pipeline, so there is also no practical means of synchronization you could manually add to solve this issue.
To work around the issue I have to use two different framebuffers A and B and flip the textures (first render A to B, then render B back to A).
Yes, the ping-pong approach is what you should use for this case. And honestly, it should not incur any significant performance penalty in this scenario anyway: you write each output pixel exactly once, so you don't need an additional copy of "untouched" pixels. All it costs is the additional memory.
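A minimal sketch of that setup in C (assuming two framebuffers fbo[0]/fbo[1] with color textures tex[0]/tex[1] already attached, plus numSteps, width, height and a drawFullScreenQuad() helper; these names are illustrative, not from the question):
/* Ping-pong: read from tex[src], render into fbo[dst], then swap roles. */
int src = 0, dst = 1;
for (int step = 0; step < numSteps; ++step) {
    glBindFramebuffer(GL_FRAMEBUFFER, fbo[dst]);   /* render target: the "other" buffer */
    glViewport(0, 0, width, height);

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, tex[src]);        /* sampler input: the previous result */

    drawFullScreenQuad();                          /* runs the fluid-solver fragment shader */

    int tmp = src; src = dst; dst = tmp;           /* swap for the next iteration */
}
glBindFramebuffer(GL_FRAMEBUFFER, 0);              /* back to the default framebuffer */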

Metal fragment shader bound resource acting erratically on macOS

This problem has been stumping me, and I can't figure it out. I've got a few different shaders which use the same resource, a structure with a bunch of lighting values. The first shader to use it works fine -- the second one does not. Seems like it might be getting zeroes. The third shader to use it also works fine.
If I don't have any objects which use the first shader, then the second one works. And the third one does NOT work. The resource never changes; it's an MTLBuffer I set once. And if I do a GPU frame capture, the values reported in the draw calls are all correct, all the time. Yet nothing shows up for the shaders that don't work.
The reason I think it is this one particular struct that is fluctuating is that if I hard-code the values into the shader instead of reading the struct values, then that also works. It's driving me crazy.
I am not sure what kind of code examples will serve here. Here is the structure and how I am using it. I'm not binding any other resources to this particular index, anywhere else in the program.
struct Light {
packed_float3 color; // 0 - 2
float ambientIntensity; // 3
packed_float3 direction; // 4 - 6
float diffuseIntensity; // 7
float shininess; // 8
float specularIntensity; // 9
float dummy1,dummy2;
/*
 ___________________________
| 0 1 2 3 | 4 5 6 7 | 8 9   |
|---------|---------|-------|
| chunk0  | chunk1  | chunk2|
 ---------------------------
*/
}__attribute__ ((aligned (16)));
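For reference, the layout described in that comment is 12 floats (48 bytes); a host-side mirror in plain C might look like this (illustrative, not taken from my actual code):
// Host-side mirror of the shader's Light struct (illustrative).
// packed_float3 is three tightly packed floats, so each 16-byte "chunk" holds 4 floats.
typedef struct {
    float color[3];            // 0 - 2
    float ambientIntensity;    // 3
    float direction[3];        // 4 - 6
    float diffuseIntensity;    // 7
    float shininess;           // 8
    float specularIntensity;   // 9
    float dummy1, dummy2;      // padding so the total size is a multiple of 16 bytes (48 bytes)
} Light;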
... shader code ....
float4 ambientColor = float4(1,1,1,1) * color * 0.25;
//Diffuse
float diffuseFactor = max(0.0,dot(interpolated.normal, float3(0.0,0.5,-1.0))); // 1
float4 diffuseColor = float4(light.color,1) * color * 0.5 * diffuseFactor ; // 2
//Specular
float3 eye = normalize(interpolated.fragmentPosition); //1
float3 reflection = reflect(float3(0.0,0.5,-1.0),interpolated.normal); // 2
float specularFactor = pow(max(0.0, dot(reflection, eye)), 10.0); //3
float4 specularColor = float4(float3(1,1,1)* 2 * specularFactor ,color.w);//4
color = (ambientColor + diffuseColor + specularColor);
color = clamp(color, 0.0, 1.0);
Anyone have any ideas about what could be the trouble? Also, this is only happening on the Mac. On iOS it works fine.
EDIT: It seems like when the shader is failing, it is simply unable to access the resource at all. For example, I tried just setting my fragment return color to equal the light color, with no modifications.
And it produces nothing -- not even black! If I replace that statement with one of these two, I get white and black:
color = float4(1,1,1, 1); // produces white
color = float4(0,0,0, 1); //produces black
color = float4(light.color.xyz, 1); // nothing at all
It is like the fragment shader is just aborting when it is trying to access the light structure -- otherwise I would get SOMETHING in my color, with a forced alpha value of 1. I don't get it.

glTexSubImage2D shifting NSImage by a pixel

I’m working on an app that creates its own texture atlas. The elements on the atlas can vary in size but are placed in a grid pattern.
It’s all working fine except for the fact that when I write over the section of the atlas with a new element (the data from an NSImage), the image is shifted a pixel to the right.
The code I’m using to write the pixels onto the atlas is:
-(void)writeToPlateWithImage:(NSImage*)anImage atCoord:(MyGridPoint)gridPos;
{
static NSSize insetSize; //ultimately this is the size of the image in the box
static NSSize boundingBox; //this is the size of the box that holds the image in the grid
static CGFloat multiplier;
multiplier = 1.0;
NSSize plateSize = NSMakeSize(atlas.width, atlas.height);//Size of entire atlas
MyGridPoint _gridPos;
//make sure the column and row position is legal
_gridPos.column= gridPos.column >= m_numOfColumns ? m_numOfColumns - 1 : gridPos.column;
_gridPos.row = gridPos.row >= m_numOfRows ? m_numOfRows - 1 : gridPos.row;
_gridPos.column = gridPos.column < 0 ? 0 : gridPos.column;
_gridPos.row = gridPos.row < 0 ? 0 : gridPos.row;
insetSize = NSMakeSize(plateSize.width / m_numOfColumns, plateSize.height / m_numOfRows);
boundingBox = insetSize;
//…code here to calculate the size to make anImage so that it fits into the space allowed
//on the atlas.
//multiplier var will hold a value that sizes up or down the image…
insetSize.width = anImage.size.width * multiplier;
insetSize.height = anImage.size.height * multiplier;
//provide a padding around the image so that when mipmaps are created the image doesn’t ‘bleed’
//if it’s the same size as the grid’s boxes.
insetSize.width -= ((insetSize.width * (insetPadding / 100)) * 2);
insetSize.height -= ((insetSize.height * (insetPadding / 100)) * 2);
//roundUp() is a handy function I found somewhere (I can’t remember where now)
//that makes the first param a multiple of the second.
//Here we make sure the image lines are aligned: as it’s RGBA we make
//the width a multiple of 4.
insetSize.width = (CGFloat)roundUp((int)insetSize.width, 4);
insetSize.height = (CGFloat)roundUp((int)insetSize.height, 4);
NSImage *insetImage = [self resizeImage:[anImage copy] toSize:insetSize];
NSData *insetData = [insetImage TIFFRepresentation];
GLubyte *data = malloc(insetData.length);
memcpy(data, [insetData bytes], insetData.length);
insetImage = NULL;
insetData = NULL;
glEnable(GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D, atlas.textureIndex);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1); //have also tried 2,4, and 8
GLint Xplace = (GLint)(boundingBox.width * _gridPos.column) + (GLint)((boundingBox.width - insetSize.width) / 2);
GLint Yplace = (GLint)(boundingBox.height * _gridPos.row) + (GLint)((boundingBox.height - insetSize.height) / 2);
glTexSubImage2D(GL_TEXTURE_2D, 0, Xplace, Yplace, (GLsizei)insetSize.width, (GLsizei)insetSize.height, GL_RGBA, GL_UNSIGNED_BYTE, data);
glGenerateMipmap(GL_TEXTURE_2D);
free(data);
glBindTexture(GL_TEXTURE_2D, 0);
glGetError();
}
The images are RGBA, 8bit (as reported by PhotoShop), here's a test image I've been using:
and here's a screen grab of the result in my app:
Am I unpacking the image incorrectly...? I know the resizeImage: function works, as I've saved its result to disk as well as bypassed it, so the problem is somewhere in the GL code...
EDIT: just to clarify, the section of the atlas being rendered is larger than the box diagram. So the shift is occurring within the area that's written to with glTexSubImage2D.
EDIT 2: Sorted, finally, by offsetting the copied data that goes into the section of the atlas.
I don't fully understand why that is, perhaps it's a hack instead of a proper solution but here it is.
//resize the image to fit into the section of the atlas
NSImage *insetImage = [self resizeImage:[anImage copy] toSize:NSMakeSize(insetSize.width, insetSize.height)];
//pointer to the raw data
const void* insetDataPtr = [[insetImage TIFFRepresentation] bytes];
//for debugging, I placed the offset value next
int offset = 8;//it needed a 2 pixel (2 * 4 byte for RGBA) offset
//copy the data with the offset into a temporary data buffer
memcpy(data, insetDataPtr + offset, insetData.length - offset);
/*
.
. Calculate its position within the texture
.
*/
//And finally overwrite the texture
glTexSubImage2D(GL_TEXTURE_2D, 0, Xplace, Yplace, (GLsizei)insetSize.width, (GLsizei)insetSize.height, GL_RGBA, GL_UNSIGNED_BYTE, data);
You may be running into the issue I answered already here: stackoverflow.com/a/5879551/524368
It's not really about pixel coordinates, but about pixel-perfect addressing of texels. This is especially important for texture atlases. A common misconception is that texture coordinates 0 and 1 lie exactly on pixel centers. In OpenGL this is not the case: texture coordinates 0 and 1 are exactly on the border between the pixels of a texture wrap. If you build your texture atlas under the assumption that 0 and 1 are on pixel centers, then using the very same addressing scheme in OpenGL will lead to either a blurry picture or pixel shifts. You need to account for this.
I still don't understand how that makes a difference to a sub-section of the texture that's being rendered.
It helps a lot to understand that, to OpenGL, textures are not so much images as support samples for an interpolator (hence the "sampler" uniforms in shaders). So to get really crisp-looking images you have to choose the texture coordinates you sample from so that the interpolator evaluates exactly at the positions of the support samples. The positions of those samples, however, are neither integer coordinates nor simply fractions (i/N).
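In other words (sketch; i is a texel index and N the texture size in texels along that axis):
// Center of texel i in an N-texel-wide texture; coordinates 0.0 and 1.0 fall on texel borders, not centers.
float u = (float(i) + 0.5) / float(N);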
Note that newer versions of GLSL provide the texture sampling function texelFetch, which completely bypasses the interpolator and addresses texture pixels directly. If you need pixel-perfect texturing you might find this easier to use (if available).
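For example (GLSL 1.30 or later; the sampler name and coordinates are placeholders):
// Fetch the texel at integer coordinates (px, py) from mip level 0 - no filtering is involved.
vec4 color = texelFetch(atlasTexture, ivec2(px, py), 0);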

How to blur the outcome of a fragment shader?

I'm working on a shader that generates little clouds based on some mask images. Right now it works well, but I feel the result is missing something, and I thought a blur would be nice. I remember a basic blur algorithm where you apply a convolution between the image and a matrix of norm 1 (the bigger the matrix, the stronger the blur). The thing is, I don't know how to treat the current output of the shader as an image. So basically I want to keep the shader as is, but make its result blurry. Any ideas? How can I integrate the convolution algorithm into the shader? Or does anyone know of another algorithm?
Cg code:
float Luminance( float4 Color ){
return 0.6 * Color.r + 0.3 * Color.g + 0.1 * Color.b;
}
struct v2f {
float4 pos : SV_POSITION;
float2 uv_MainTex : TEXCOORD0;
};
float4 _MainTex_ST;
v2f vert(appdata_base v) {
v2f o;
o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
o.uv_MainTex = TRANSFORM_TEX(v.texcoord, _MainTex);
return o;
}
sampler2D _MainTex;
sampler2D _Gradient;
sampler2D _NoiseO;
sampler2D _NoiseT;
float4 frag(v2f IN) : COLOR {
half4 nO = tex2D (_NoiseO, IN.uv_MainTex);
half4 nT = tex2D (_NoiseT, IN.uv_MainTex);
float4 turbulence = nO + nT;
float lum = Luminance(turbulence);
half4 c = tex2D (_MainTex, IN.uv_MainTex);
if (lum >= 1.0f){
float pos = lum - 1.0f;
if( pos > 0.98f ) pos = 0.98f;
if( pos < 0.02f ) pos = 0.02f;
float2 texCord = (pos, pos);
half4 turb = tex2D (_Gradient, texCord);
//turb.a = 0.0f;
return turb;
}
else return c;
}
It appears to me that this shader is emulating alpha testing between a backbuffer-like texture (passed via the sampler2D _MainTex) and a generated cloud luminance (represented by float lum) mapped onto a gradient. This makes things trickier because you can't just fake a blur and let alpha blending take care of the rest. You'll also need to change your alpha testing routine to emulate an alpha blend instead or restructure your rendering pipeline accordingly. We'll deal with blurring the clouds first.
The first question you need to ask yourself is whether you need a screen-space blur. Seeing the mechanics of this fragment shader, I would think not -- you want to blur the clouds on the actual model. Given this, it should be sufficient to blur the underlying textures to get a blurred result -- except you're emulating alpha clipping, so you'll get rough edges. The question is what to do about those rough edges. That's where alpha blending comes in.
You can emulate alpha blending by using a lerp (linear interpolation) between the turb color and the c color with the lerp() function (or its equivalent, depending on which shader language you're using). You'll probably want something that looks like return lerp(c, turb, 1 - pos); instead of return turb; ... I'd expect you'll want to tweak this continually until you understand it and start getting the results you want. (For example, you may prefer lerp(c, turb, 1 - pow(pos,4)).)
In fact, you can try this last step (just adding the lerp) before modifying your textures to get an idea of what the alpha blending will do for you.
Edit: I hadn't considered the case where the _NoiseO and _NoiseT samplers are changing continually, so simply telling you to blur them was minimally useful advice. You can emulate blurring by using a multi-tap filter. The simplest way is to take uniformly spaced samples, weight them, and sum them together to produce your final color. (Typically you'll want the weights themselves to sum to 1.)
This being said, you may or may not want to do this on the _NoiseO and _NoiseT textures themselves -- you may want to create a screen-space blur instead, which may look more interesting to a viewer. In that case the same concept applies, but you need to calculate the offset coordinates for each tap and then perform a weighted summation.
For example, if we were going with the first case and wanted to sample from the _NoiseO sampler and blur it slightly, we could use this box filter (where all the weights are the same and sum to 1, thus performing an average):
// Untested code. g_offset is assumed to be a uniform holding the sampling offset (e.g. one texel).
half4 nO = 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2( 0, 0))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2( 0, g_offset.y))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x, 0))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x, g_offset.y));
Alternatively, if we wanted the entire cloud output to appear blurry we'd wrap the cloud generation portion in a function and call it instead of tex2D() for the taps.
// More untested code.
half4 genCloud(float2 tc) {
half4 nO = tex2D (_NoiseO, tc); // use the passed-in coordinate, not IN.uv_MainTex, so each tap differs
half4 nT = tex2D (_NoiseT, tc);
float4 turbulence = nO + nT;
float lum = Luminance(turbulence);
float pos = lum - 1.0;
if( pos > 0.98f ) pos = 0.98f;
if( pos < 0.02f ) pos = 0.02f;
float2 texCord = (pos, pos);
half4 turb = tex2D (_Gradient, texCord);
// Figure out how you'd generate your alpha blending constant here for your lerp
turb.a = ACTUAL_ALPHA;
return turb;
}
And the multi-tap filtering would look like:
// And even more untested code.
half4 cloudcolor = 0.25 * genCloud(IN.uv_MainTex + float2( 0, 0))
+ 0.25 * genCloud(IN.uv_MainTex + float2( 0, g_offset.y))
+ 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x, 0))
+ 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x, g_offset.y));
return lerp(c, cloudcolor, cloudcolor.a);
However, doing this is going to be relatively slow if you make the cloud function too complex. If you're bound by raster operations and texture reads (transferring texture/buffer data to and from memory), chances are this won't matter much unless you use a much more advanced blurring technique (such as successive downsampling through ping-ponged buffers, useful for blurs/filters that are expensive because they have lots of taps). But performance is an entirely separate consideration from getting the look you want.

Numeric Stability with Summed Area Tables in Shadow Mapping

I'm having an issue with loss of precision in my SAVSM (summed-area variance shadow map) setup.
When the light moves around, the effect is very striking: there is a lot of noise, with fragments flickering between black and white all the time. This can be somewhat lessened by using the min-variance (thus ignoring anything below a certain threshold), but then we get even worse artifacts with incorrect falloff (see my other post).
I'm using GLSL 1.2 because I'm on a Mac, so I don't have access to the modf function in order to split the precision across two channels as described in GPU Gems 3, Chapter 8.
I'm using GL_RGBA32F_ARB textures with a framebuffer object, ping-ponging two textures to generate a summed area table which I use with the VSM algorithm.
Moments / Depth Shader to create the basis for the tables
varying vec4 v_position;
varying float tDepth;
float g_DistributeFactor = 1024.0;
void main()
{
// Is this linear depth? I would say yes but one can't be utterly sure.
// Could try a divide by the far plane?
float depth = v_position.z / v_position.w ;
depth = depth * 0.5 + 0.5; //Don't forget to move away from unit cube ([-1,1]) to [0,1] coordinate system
vec2 moments = vec2(depth, depth * depth);
// Adjusting moments (this is sort of bias per pixel) using derivative
float dx = dFdx(depth);
float dy = dFdy(depth);
moments.y += 0.25 * (dx*dx+dy*dy);
// Subtract 0.5 off now so we can get this into our summed area table calc
//moments -= 0.5;
// Split the moments into rg and ba for EVEN MORE PRECISION
// float FactorInv = 1.0 / g_DistributeFactor;
// gl_FragColor = vec4(floor(moments.x) * FactorInv, fract(moments.x ) * g_DistributeFactor,
// floor(moments.y) * FactorInv, fract(moments.y) * g_DistributeFactor);
gl_FragColor = vec4(moments,0.0,0.0);
}
The shadowmap shader
varying vec4 v_position;
varying float tDepth;
float g_DistributeFactor = 1024.0;
void main()
{
// Is this linear depth? I would say yes but one can't be utterly sure.
// Could try a divide by the far plane?
float depth = v_position.z / v_position.w ;
depth = depth * 0.5 + 0.5; //Don't forget to move away from unit cube ([-1,1]) to [0,1] coordinate system
vec2 moments = vec2(depth, depth * depth);
// Adjusting moments (this is sort of bias per pixel) using derivative
float dx = dFdx(depth);
float dy = dFdy(depth);
moments.y += 0.25 * (dx*dx+dy*dy);
// Subtract 0.5 off now so we can get this into our summed area table calc
//moments -= 0.5;
// Split the moments into rg and ba for EVEN MORE PRECISION
// float FactorInv = 1.0 / g_DistributeFactor;
// gl_FragColor = vec4(floor(moments.x) * FactorInv, fract(moments.x ) * g_DistributeFactor,
// floor(moments.y) * FactorInv, fract(moments.y) * g_DistributeFactor);
gl_FragColor = vec4(moments,0.0,0.0);
}
The summed tables do seem to be working. I know this because I have a function that converts back from the summed table to the original depth map, and the two images look pretty much the same. I'm also using the -0.5 / +0.5 trick to get some more precision, but it doesn't seem to be helping.
My question is this: given that I'm on a Mac which has GLSL 1.2 only, how can I split the precision over two channels? If I could use those extra channels for space in the summed table, then maybe that would work. I've seen some stuff that uses modf, but that isn't available to me.
Also, people have suggested 32-bit integer buffers, but I don't think I have support for those on my MacBook Pro.
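For what it's worth, the split that modf performs can also be written with floor() and fract(), both of which GLSL 1.2 does have (the commented-out lines above already go in that direction). A sketch of one possible pack/unpack, reusing g_DistributeFactor from the code above (the function names are illustrative, and whether the packed values survive the summed-area accumulation is a separate question):
// Pack one moment into two channels using floor/fract instead of modf (GLSL 1.2).
vec2 packMoment(float m)
{
    float scaled = m * g_DistributeFactor;
    return vec2(floor(scaled) / g_DistributeFactor, fract(scaled));
}

// Reconstruct: p.x + p.y / g_DistributeFactor == m (up to float precision).
float unpackMoment(vec2 p)
{
    return p.x + p.y / g_DistributeFactor;
}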
