Is it possible to change the value of an attribute in the shader and have this reflected in the buffer for the next frame render?
So, for example, could I change the position of a vertex in the vertex shader and send this new value back to the JavaScript buffer object?
Code sample below:
attribute vec3 newPosition;
attribute vec3 divideVal;
uniform float size;          // assumed to be supplied by the material (not shown originally)
uniform float sizeMultipler; // assumed to be supplied by the material (not shown originally)

void main() {
    vec3 difference = newPosition - position;
    vec3 velocity = difference / divideVal;
    // note: attributes like `position` are read-only in GLSL, so the updated
    // value has to go into a local; it never reaches the buffer
    vec3 pos = position + velocity;
    vec4 mvPosition = modelViewMatrix * vec4(pos, 1.0);
    gl_PointSize = size * (sizeMultipler / -mvPosition.z);
    gl_Position = projectionMatrix * mvPosition;
}
Edit:
Right now I do this in the JS itself, but I understand it will be faster if I move as much of the calculation as I can into the shaders. This is my current JS:
const positions = this.geometry.attributes.position.array;
const newPositions = this.geometry.attributes.newPosition.array;
for (let i = 0, i3 = 0; i < this.numParticles; i++, i3 += 3) {
    const velocity = [
        newPositions[i3] - positions[i3],
        newPositions[i3 + 1] - positions[i3 + 1],
        newPositions[i3 + 2] - positions[i3 + 2]
    ];
    if (velocity[0] || velocity[1] || velocity[2]) {
        velocity[0] /= 60;
        velocity[1] /= 60;
        velocity[2] /= 60;
        positions[i3] += velocity[0];
        positions[i3 + 1] += velocity[1];
        positions[i3 + 2] += velocity[2];
    }
}
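For clarity, the per-frame CPU update above boils down to moving each component a fixed fraction of the remaining way toward its target. A minimal sketch (the function name is mine, not part of the original code; the 1/60 step matches the divisions in the loop):

```javascript
// Hypothetical helper mirroring the loop above: each frame, move every
// component 1/60th of the remaining way toward its target position.
function stepPositions(positions, newPositions) {
  for (let i = 0; i < positions.length; i++) {
    const velocity = (newPositions[i] - positions[i]) / 60;
    positions[i] += velocity;
  }
  return positions;
}
```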
I found out how to do this.
Using a technique called FBO simulation, I created a simulation shader that does the calculations in GLSL and, rather than drawing the results to the screen, writes them to a texture. I then read from that texture in the "real" shader and drew the results to the screen. This also allowed me to compare different output textures to work out velocity and size differences of particles between frames.
You can read more about it being discussed here: https://github.com/mrdoob/three.js/issues/1183
Example:
http://barradeau.com/blog/?p=621
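The core of the FBO-simulation idea is a ping-pong between two buffers. As a minimal sketch in plain JavaScript, with arrays standing in for the two render targets and a plain function standing in for the simulation shader (none of these names are three.js API):

```javascript
// Ping-pong between two state buffers: read from one, write the updated
// state into the other, then swap. This mirrors swapping two render
// targets in an FBO simulation; `update` stands in for the sim shader.
function makeSim(initial) {
  let read = initial.slice();
  let write = new Array(initial.length).fill(0);
  return {
    step(update) {
      for (let i = 0; i < read.length; i++) write[i] = update(read[i]);
      [read, write] = [write, read]; // swap "textures"
    },
    state() { return read.slice(); },
  };
}
```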
My application is coded in Javascript + Three.js / WebGL + GLSL. I have 200 curves, each one made of 85 points. To animate the curves I add a new point and remove the last.
So I made a positions shader that stores the new positions onto a texture (1) and the lines shader that writes the positions for all curves on another texture (2).
The goal is to use textures as arrays: I know the first and last index of a line, so I need to convert those indices to uv coordinates.
I use FBOHelper to debug FBOs.
1) This 1D texture contains the new points for each curve (200 in total): positionTexture
2) And these are the 200 curves, with all their points, one after the other: linesTexture
The black parts are the BUG here. Those texels shouldn't be black.
How does it work: at each frame the shader looks up the new point for each line in the positionTexture and updates the linesTextures accordingly, with a for loop like this:
#define LINES_COUNT 200.0
#define LINE_POINTS 85.0 // with 100 it works!!!
// Then in main()
vec2 uv = gl_FragCoord.xy / resolution.xy;
for (float i = 0.0; i < LINES_COUNT; i += 1.0) {
    float startIdx = i * LINE_POINTS; // line start index
    float endIdx = startIdx + LINE_POINTS - 1.0; // line end index
    vec2 lastCell = getUVfromIndex(endIdx); // last uv coordinate reserved for current line
    if (match(lastCell, uv)) {
        pos = texture2D( positionTexture, vec2((i / LINES_COUNT) + minFloat, 0.0)).xyz;
    } else if (index >= startIdx && index < endIdx) { // index = this fragment's linear index
        pos = texture2D( lineTexture, getNextUV(uv) ).xyz;
    }
}
This works, but it's slightly buggy when I have many lines (150+): likely a precision problem. I'm not sure if the functions I wrote to look up the textures are right. I wrote functions like getNextUV(uv) to get the value from the next index (converted to uv coordinates) and copy to the previous. Or match(xy, uv) to know if the current fragment is the texel I want.
I thought I could simply use the classic formula:
index = uv.y * width + uv.x
But it's more complicated than that. For example match():
// Whether a point XY falls within the texel covering a UV coordinate
float size = 132.0; // width and height of texture
float unit = 1.0 / size;
float minFloat = unit / size;

bool match(vec2 point, vec2 uv) {
    float x = floor(point.x / unit) * unit;
    float y = floor(point.y / unit) * unit;
    return x <= uv.x && x + unit > uv.x && y <= uv.y && y + unit > uv.y;
}
Or getUVfromIndex():
vec2 getUVfromIndex(float index) {
    float row = floor(index / size); // Example (with size 10): 83.56 / 10 = 8
    float col = index - (row * size); // Example: 83.56 - (8 * 10) = 3.56
    col = col / size + minFloat; // u = 0.357
    row = row / size + minFloat; // v = 0.81
    return vec2(col, row);
}
Can someone explain the most efficient way to look up values in a texture, by computing a uv coordinate from an index value?
Texture coordinates go from the edges of pixels, not their centers, so your formula to compute a UV coordinate needs to be:
u = (xPixelCoord + .5) / widthOfTextureInPixels;
v = (yPixelCoord + .5) / heightOfTextureInPixels;
So I'm guessing you want getUVfromIndex to be:

uniform vec2 sizeOfTexture; // allow texture to be any size

vec2 getUVfromIndex(float index) {
    float widthOfTexture = sizeOfTexture.x;
    float col = mod(index, widthOfTexture);
    float row = floor(index / widthOfTexture);
    return (vec2(col, row) + .5) / sizeOfTexture;
}
Or, based on some other experience with math precision issues in shaders, you might need to fudge the index:

uniform vec2 sizeOfTexture; // allow texture to be any size

vec2 getUVfromIndex(float index) {
    float fudgedIndex = index + 0.1;
    float widthOfTexture = sizeOfTexture.x;
    float col = mod(fudgedIndex, widthOfTexture);
    float row = floor(fudgedIndex / widthOfTexture);
    return (vec2(col, row) + .5) / sizeOfTexture;
}
If you're in WebGL2 you can use texelFetch, which takes integer pixel coordinates to get a value from a texture.
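The pixel-center math from this answer can be mirrored in JavaScript (helper names are mine) to sanity-check that index-to-UV and UV-to-index round-trip cleanly:

```javascript
// Convert a linear index into the UV of that texel's center.
function getUVfromIndex(index, width, height) {
  const col = index % width;
  const row = Math.floor(index / width);
  return [(col + 0.5) / width, (row + 0.5) / height];
}

// And back: which linear index does a UV fall in?
function getIndexFromUV(u, v, width, height) {
  return Math.floor(v * height) * width + Math.floor(u * width);
}
```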
I have an object I'm trying to blur.
Render it to a transparent (glClear with 1, 1, 1, 0) FBO.
Render it to a second transparent FBO with a vertical blur shader.
Render it to the screen with a horizontal blur shader.
Here is what an example looks like not blurred, and then blurred with this technique:
Obviously the issue is that white glow around the blurred object.
I think I grasp the basic concept of why this is happening. While the pixels around the object in the FBO are transparent, they still hold the color (1,1,1) and as a result, that color is mixed into the blur.
I just don't know what I would do to remedy this?
Here is my horizontal blur shader, vertical is much of the same:
hBlur.vert
uniform mat4 u_projTrans;
uniform float u_blurPixels;
uniform float u_texelWidth;
attribute vec4 a_position;
attribute vec2 a_texCoord0;
attribute vec4 a_color;
varying vec2 v_texCoord;
varying vec2 v_blurTexCoords[14];
void main()
{
    v_texCoord = a_texCoord0;
    gl_Position = u_projTrans * a_position;
    float blurDistance6 = u_blurPixels * u_texelWidth;
    float blurDistance5 = blurDistance6 * 0.84;
    float blurDistance4 = blurDistance6 * 0.70;
    float blurDistance3 = blurDistance6 * 0.56;
    float blurDistance2 = blurDistance6 * 0.42;
    float blurDistance1 = blurDistance6 * 0.28;
    float blurDistance0 = blurDistance6 * 0.14;
    v_blurTexCoords[ 0] = v_texCoord + vec2(-blurDistance6, 0.0);
    v_blurTexCoords[ 1] = v_texCoord + vec2(-blurDistance5, 0.0);
    v_blurTexCoords[ 2] = v_texCoord + vec2(-blurDistance4, 0.0);
    v_blurTexCoords[ 3] = v_texCoord + vec2(-blurDistance3, 0.0);
    v_blurTexCoords[ 4] = v_texCoord + vec2(-blurDistance2, 0.0);
    v_blurTexCoords[ 5] = v_texCoord + vec2(-blurDistance1, 0.0);
    v_blurTexCoords[ 6] = v_texCoord + vec2(-blurDistance0, 0.0);
    v_blurTexCoords[ 7] = v_texCoord + vec2( blurDistance0, 0.0);
    v_blurTexCoords[ 8] = v_texCoord + vec2( blurDistance1, 0.0);
    v_blurTexCoords[ 9] = v_texCoord + vec2( blurDistance2, 0.0);
    v_blurTexCoords[10] = v_texCoord + vec2( blurDistance3, 0.0);
    v_blurTexCoords[11] = v_texCoord + vec2( blurDistance4, 0.0);
    v_blurTexCoords[12] = v_texCoord + vec2( blurDistance5, 0.0);
    v_blurTexCoords[13] = v_texCoord + vec2( blurDistance6, 0.0);
}
blur.frag
uniform sampler2D u_texture;
varying vec2 v_texCoord;
varying vec2 v_blurTexCoords[14];
void main()
{
    gl_FragColor = vec4(0.0);
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 0]) * 0.0044299121055113265;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 1]) * 0.00895781211794;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 2]) * 0.0215963866053;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 3]) * 0.0443683338718;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 4]) * 0.0776744219933;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 5]) * 0.115876621105;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 6]) * 0.147308056121;
    gl_FragColor += texture2D(u_texture, v_texCoord          ) * 0.159576912161;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 7]) * 0.147308056121;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 8]) * 0.115876621105;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[ 9]) * 0.0776744219933;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[10]) * 0.0443683338718;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[11]) * 0.0215963866053;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[12]) * 0.00895781211794;
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[13]) * 0.0044299121055113265;
}
I'd be lying if I said I was completely certain what this code is doing. But in summary, it samples pixels within a radius of u_blurPixels and sums the resulting colors into gl_FragColor using pre-determined Gaussian weights.
How would I modify this to prevent the white glow due to a transparent background?
This blur procedure really isn't meant for transparent images, so some adjustment is needed.
What your code is doing is a weighted average: each surrounding pixel contributes depending on its distance, and the result is normalized. Your factors 0.0044299121055113265, 0.00895781211794, etc. are pre-normalized so that their sum is always 1. More naturally these values might be, for instance (using only 3 pixels), scales = [1, 5, 1], where the result is then (pix[0]*scales[0] + pix[1]*scales[1] + pix[2]*scales[2]) / (scales[0] + scales[1] + scales[2]).
So if we take a step back, your code can be transformed into:
const int offset = 7; // maximum range taking effect
// note: array initializers like this require GLSL ES 3.00 (WebGL2)
float offsetScales[offset + 1] = float[]( // +1 is for the zero offset
    0.159576912161,
    0.147308056121,
    ...
);

float sumOfScales = offsetScales[0];
for (int i = 0; i < offset; i++) sumOfScales += offsetScales[i + 1] * 2.0;

gl_FragColor = texture2D(u_texture, v_texCoord) * offsetScales[0];
for (int i = 0; i < offset; i++) {
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[6 - i]) * offsetScales[i + 1];
    gl_FragColor += texture2D(u_texture, v_blurTexCoords[7 + i]) * offsetScales[i + 1];
}
gl_FragColor /= sumOfScales; // sumOfScales in the current case is always 1.0
Unless I made some mistakes, this code should do exactly the same as yours. It is a bit more flexible, though: if you added another pixel to the range (offset = 8) you could simply add its scale, say 0.0022, and the color would never overflow, whereas with your approach you would need to adjust all of the scales so their sum stays 1.0. Never mind that, your way is closer to optimal, so keep using it; I am explaining this only to take a step back and find the solution to your problem.
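The explicit-normalization idea can be illustrated with a tiny JavaScript sketch (helper name is mine; plain numbers stand in for pixel values):

```javascript
// Weighted average with explicit normalization: the scales need not
// sum to 1, because we divide by their sum at the end.
function weightedAverage(values, scales) {
  let sum = 0, total = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i] * scales[i];
    total += scales[i];
  }
  return sum / total;
}
```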
Ok, now that the code is a bit more maintainable, let's see what happens when the alpha channel needs to take effect. When a pixel is transparent or semi-transparent, it should have little or no influence on the computed color, but it should still influence the final alpha. That means that next to those scales we also need an alpha scale, and in doing so we need to adjust the sum the colors are divided by:
const int offset = 7; // maximum range taking effect
float offsetScales[offset + 1] = float[]( // +1 is for the zero offset
    0.159576912161,
    0.147308056121,
    ...
);

highp vec4 summedColor = vec4(0.0); // we best take a high-precision value now
highp float overallAlpha = 0.0;     // the actual end alpha (previously sumOfScales)
highp float overallScale = 0.0;     // keeps alpha from overflowing; if the sum of the original scales is 1.0, this factor is 1.0 and not needed at all

vec4 fetchedColor = texture2D(u_texture, v_texCoord);
float scaleWithAlpha = fetchedColor.a * offsetScales[0];
overallScale += offsetScales[0];
summedColor += fetchedColor * scaleWithAlpha;
overallAlpha += scaleWithAlpha;

for (int i = 0; i < offset; i++) {
    fetchedColor = texture2D(u_texture, v_blurTexCoords[6 - i]);
    scaleWithAlpha = fetchedColor.a * offsetScales[i + 1];
    overallScale += offsetScales[i + 1];
    summedColor += fetchedColor * scaleWithAlpha;
    overallAlpha += scaleWithAlpha;

    fetchedColor = texture2D(u_texture, v_blurTexCoords[7 + i]);
    scaleWithAlpha = fetchedColor.a * offsetScales[i + 1];
    overallScale += offsetScales[i + 1]; // was offsetScales[7 + i], an out-of-range typo
    summedColor += fetchedColor * scaleWithAlpha;
    overallAlpha += scaleWithAlpha;
}

overallAlpha /= overallScale;
summedColor /= overallAlpha; // TODO: if overallAlpha is 0.0 then discard or use the clear color
gl_FragColor = vec4(summedColor.xyz, overallAlpha);
overallAlpha /= overallScale;
summedColor /= overallAlpha; // TODO: if overallAlpha is 0.0 then discard or use clear color
gl_FragColor = vec4(summedColor.xyz, overallAlpha);
Some adjustment may still be needed to this code, but I hope it gets you on the right track. Once you make it work, I suggest you again lose the loops and unroll them as you did originally (with the new logic). It would also be nice if you posted the code you ended up with.
Feel free to ask any questions...
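As a minimal illustration of the alpha-weighted averaging described above, in plain JavaScript with a hypothetical helper operating on [r, g, b, a] arrays: a fully transparent white neighbor contributes nothing to the color, only lowering the alpha, so no white glow leaks in.

```javascript
// Average RGBA pixels, weighting each color by its alpha (and scale),
// so fully transparent texels contribute no color at all.
function alphaWeightedAverage(pixels, scales) {
  let color = [0, 0, 0], alpha = 0, scaleSum = 0;
  pixels.forEach((p, i) => {
    const w = p[3] * scales[i]; // scaleWithAlpha
    color[0] += p[0] * w;
    color[1] += p[1] * w;
    color[2] += p[2] * w;
    alpha += w;
    scaleSum += scales[i];
  });
  if (alpha === 0) return [0, 0, 0, 0];
  return [color[0] / alpha, color[1] / alpha, color[2] / alpha, alpha / scaleSum];
}
```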
The problem is that OpenGL ES uses post-multiplied alpha (it's cheaper in hardware), whereas doing it "properly" needs premultiplied alpha.
You can do the pre-multiplication maths in the shader for each sample you blur:
premult.rgb = source.rgb * source.a;
... but then you incur a run-time cost for every texture sample you blend. It's generally better to premultiply your input art assets offline during texture creation/compression.
If you need post-multiplied data for lighting computation, etc., you can make the error less visible by extruding the object color into the neighboring "transparent" pixels.
Note that if your shaders emit premultiplied-alpha fragment colors, you'll need to fix your OpenGL blend function (use GL_ONE as the source factor, not GL_SRC_ALPHA).
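A small JavaScript sketch (helper names are mine, operating on plain [r, g, b, a] arrays) of why premultiplied sources pair with a GL_ONE source factor: compositing a premultiplied source with factors (ONE, ONE_MINUS_SRC_ALPHA) gives the same result as the straight-alpha "over" operator.

```javascript
// Straight-alpha "over": blendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA)-style.
function overStraight(src, dst) { // src/dst = [r, g, b, a]
  const a = src[3];
  return src.map((c, i) => (i < 3 ? c * a + dst[i] * (1 - a) : a + dst[3] * (1 - a)));
}

function premultiply(c) { return [c[0] * c[3], c[1] * c[3], c[2] * c[3], c[3]]; }

// Premultiplied "over": blendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA)-style.
function overPremult(src, dst) {
  const a = src[3];
  return src.map((c, i) => c + dst[i] * (1 - a));
}
```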
For a little background this is for doing particle collisions with lookup textures on the GPU. I read the position texture with javascript and create a grid texture that contains the particles that are in the corresponding grid cell. The working example that is mentioned in the post can be viewed here: https://pacific-hamlet-84784.herokuapp.com/
The reason I want the buckets system is that it will allow me to do much fewer checks and the number of checks wouldn't increase with the number of particles.
For the actual problem description:
I am attempting to read from a lookup texture centered around a pixel. Let's say I have a texture that is 10x10 and I want to read the pixels around (4,2); I would read:
(3,1), (3,2), (3,3)
(4,1), (4,2), (4,3)
(5,1), (5,2), (5,3)
The loop is a little more complicated, but that is the general idea. If I make the loop look like the following:
float xcenter = 5.0;
float ycenter = 5.0;
for (float i = -5.0; i < 5.0; i++) {
    for (float j = -5.0; j < 5.0; j++) {
        // ...
    }
}
It works (though it goes over all of the particles, which defeats the purpose). However, if I calculate the value dynamically (which is what I need), I get really bizarre behavior. Is this a problem with GLSL or with my code? I output the values to an image and read the pixel values, and they all appear to be within the right range. The problem comes from using the for-loop variables (i, j) to offset a bucket index that is calculated outside of the loop, and using that to index into a texture.
The entire shader code can be seen here:
(If I remove the hard-coded 70 and remove the comments, it breaks, even though all of those values are between 0 and 144. This is where I am confused; I feel like this code should still work fine.)
uniform sampler2D pos;
uniform sampler2D buckets;
uniform vec2 res;
uniform vec2 screenSize;
uniform float size;
uniform float bounce;
const float width = &WIDTH;
const float height = &HEIGHT;
const float cellSize = &CELLSIZE;
const float particlesPerCell = &PPC;
const float bucketsWidth = &BW;
const float bucketsHeight = &BH;
$rand
void main(){
    vec2 uv = gl_FragCoord.xy / res;
    vec4 posi = texture2D( pos , uv );
    float x = posi.x;
    float y = posi.y;
    float z = posi.z;
    float target = 1.0 * size;
    float x_bkt = floor( (x + (screenSize.x/2.0) )/cellSize);
    float y_bkt = floor( (y + (screenSize.y/2.0) )/cellSize);
    float x_bkt_ind_start = 70.0; //x_bkt * particlesPerCell;
    float y_bkt_ind_start = 70.0; //y_bkt * particlesPerCell;
    //this is the code that is acting weirdly
    for(float j = -144.0 ; j < 144.0; j++){
        for(float i = -144.0 ; i < 144.0; i++){
            float x_bkt_ind = (x_bkt_ind_start + i)/bucketsWidth;
            float y_bkt_ind = (y_bkt_ind_start + j)/bucketsHeight;
            vec4 ind2 = texture2D( buckets , vec2(x_bkt_ind,y_bkt_ind) );
            if( abs(ind2.z - 1.0) > 0.00001 || x_bkt_ind < 0.0 || x_bkt_ind > 1.0 || y_bkt_ind < 0.0 || y_bkt_ind > 1.0 ){
                continue;
            }
            vec4 pos2 = texture2D( pos , vec2(ind2.xy)/res );
            vec2 diff = posi.xy - pos2.xy;
            float dist = length(diff);
            vec2 uvDiff = ind2.xy - gl_FragCoord.xy ;
            float uvDist = abs(length(uvDiff));
            if(dist <= target && uvDist >= 0.5){
                float factor = (dist-target)/dist;
                x = x - diff.x * factor * 0.5;
                y = y - diff.y * factor * 0.5;
            }
        }
    }
    gl_FragColor = vec4( x, y, x_bkt_ind_start , y_bkt_ind_start);
}
EDIT:
To make my problem clear, what is happening is that when I do the first texture lookup, I get the position of the particle:
vec2 uv = gl_FragCoord.xy / res;
vec4 posi = texture2D( pos , uv );
After, I calculate the bucket that the particle is in:
float x_bkt = floor( (x + (screenSize.x/2.0) )/cellSize);
float y_bkt = floor( (y + (screenSize.y/2.0) )/cellSize);
float x_bkt_ind_start = x_bkt * particlesPerCell;
float y_bkt_ind_start = y_bkt * particlesPerCell;
All of this is correct: I am getting the right values, and if I set them as the shader's output and read the pixels, they are the correct values. I also changed my implementation a little and this code works fine.
In order to test the for loop, I replaced the pixel lookup coordinates in the grid bucket with the pixel positions themselves. I adapted the code and it works fine; however, I have to recalculate the buckets multiple times per frame, so it is not very efficient. If instead of storing the pixel positions I store the uv coordinates of the pixels and then do a lookup using those uv positions:
//get the texture coordinate that is offset by the for loop
float x_bkt_ind = (x_bkt_ind_start + i)/bucketsWidth;
float y_bkt_ind = (y_bkt_ind_start + j)/bucketsHeight;
//use the texture coordinates to get the stored texture coordinate in the actual position table from the bucket table
vec4 ind2 = texture2D( buckets , vec2(x_bkt_ind,y_bkt_ind) );
and then I actually get the position
vec4 pos2 = texture2D( pos , vec2(ind2.xy)/res );
this pos2 value will be wrong. I am pretty sure the ind2 value is correct, because if I store position values directly in the bucket table and remove the second texture lookup, the code runs fine. Using the second lookup is what breaks it.
In the original post, if I set the bucket to any fixed value, say the middle of the texture, and iterate over every possible bucket coordinate around the pixel, it works fine. But if I calculate the bucket position and iterate, it does not. I wonder whether it has to do with the way GLSL compiles the shaders, and whether some optimization is causing the double texture lookup inside the for loop to break, or whether it is just a mistake in my code. I was able to get a single texture lookup in a for loop working when I stored position values directly in the bucket texture.
I wrote the following shader to render a pattern with a bunch of concentric circles. Eventually I want to have each rotating sphere be a light emitter to create something along these lines.
Of course right now I'm just doing the most basic part to render the different objects.
Unfortunately the shader is incredibly slow (16 fps full screen on a high-end MacBook). I'm pretty sure this is due to the numerous for loops and the branching in the shader. I'm wondering how I can pull off the geometry I'm trying to achieve in a more performance-optimized way.
EDIT: you can run the shader here: https://www.shadertoy.com/view/lssyRH
One obvious optimization I am missing is that currently every fragment is checked against all 24 surrounding circles. It would be quick and easy to skip those checks entirely by first testing whether the fragment intersects the outer bounds of the diagram. I guess I'm just trying to get a handle on the best practice for doing something like this.
#define N 10
#define M 5
#define K 24
#define M_PI 3.1415926535897932384626433832795

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    float aspectRatio = iResolution.x / iResolution.y;
    float h = 1.0;
    float w = aspectRatio;
    vec2 uv = vec2(fragCoord.x / iResolution.x * aspectRatio, fragCoord.y / iResolution.y);
    float radius = 0.01;
    float orbitR = 0.02;
    float orbiterRadius = 0.005;
    float centerRadius = 0.002;
    float encloseR = 2.0 * orbitR;
    float encloserRadius = 0.002;
    float spacingX = w / (float(N) + 1.0);
    float spacingY = h / (float(M) + 1.0);
    float x = 0.0;
    float y = 0.0;
    vec4 totalLight = vec4(0.0, 0.0, 0.0, 1.0);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) {
            // compute the center of the diagram
            vec2 center = vec2(spacingX * (float(i) + 1.0), spacingY * (float(j) + 1.0));
            x = center.x + orbitR * cos(iGlobalTime);
            y = center.y + orbitR * sin(iGlobalTime);
            vec2 bulb = vec2(x, y);
            if (length(uv - center) < centerRadius) {
                // frag intersects white center marker
                fragColor = vec4(1.0);
                return;
            } else if (length(uv - bulb) < radius) {
                // intersects rotating "light"
                fragColor = vec4(uv, 0.5 + 0.5 * sin(iGlobalTime), 1.0);
                return;
            } else {
                // intersects one of the enclosing 24 cylinders
                for (int k = 0; k < K; k++) {
                    float theta = M_PI * 2.0 * float(k) / float(K);
                    x = center.x + cos(theta) * encloseR;
                    y = center.y + sin(theta) * encloseR;
                    vec2 encloser = vec2(x, y);
                    if (length(uv - encloser) < encloserRadius) {
                        fragColor = vec4(uv, 0.5 + 0.5 * sin(iGlobalTime), 1.0);
                        return;
                    }
                }
            }
        }
    }
}
Keeping in mind that you want to optimize the fragment shader, and only the fragment shader:
Move the sin(iGlobalTime) and cos(iGlobalTime) out of the loops; they remain constant over the whole draw call, so there is no need to recalculate them every iteration.
GPUs employ vectorized instruction sets (SIMD) where possible; take advantage of that. You're wasting lots of cycles by doing multiple scalar ops where a single vector instruction would do (see the annotated code).
[Three years wiser me here: I'm not really sure this statement is still true with regard to how modern GPUs process instructions, but it certainly helps readability and may even give the compiler a hint or two.]
Do your radius checks squared; save the sqrt hidden inside length() for when you really need it.
Replace float casts of constants (your loop limits) with float constants (intelligent shader compilers will already do this, but it's not something to count on).
Don't have undefined behavior in your shader (here: not writing fragColor on every path).
Here is an optimized and annotated version of your shader (still containing that undefined behavior, just like the one you provided). Annotations are in the form:
// annotation
// old code, if any
new code
#define N 10
// define float constant N
#define fN 10.
#define M 5
// define float constant M
#define fM 5.
#define K 24
// define float constant K
#define fK 24.
#define M_PI 3.1415926535897932384626433832795
// predefine 2 times PI
#define M_PI2 6.28318531

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    float aspectRatio = iResolution.x / iResolution.y;
    // we don't need these separately
    // float h = 1.0;
    // float w = aspectRatio;
    // use vector ops (2 divs 1 mul => 1 div 1 mul)
    // vec2 uv = vec2(fragCoord.x / iResolution.x * aspectRatio, fragCoord.y / iResolution.y);
    vec2 uv = fragCoord.xy / iResolution.xy;
    uv.x *= aspectRatio;
    // most of the following declarations should be predefined or marked as "const"...
    float radius = 0.01;
    // precalc squared radius
    float radius2 = radius * radius;
    float orbitR = 0.02;
    float orbiterRadius = 0.005;
    float centerRadius = 0.002;
    // precalc squared center radius
    float centerRadius2 = centerRadius * centerRadius;
    float encloseR = 2.0 * orbitR;
    float encloserRadius = 0.002;
    // precalc squared encloser radius
    float encloserRadius2 = encloserRadius * encloserRadius;
    // use float constants and vector ops here (2 casts 2 adds 2 divs => 1 add 1 div)
    // float spacingX = w / (float(N) + 1.0);
    // float spacingY = h / (float(M) + 1.0);
    vec2 spacing = vec2(aspectRatio, 1.0) / (vec2(fN, fM) + 1.);
    // calc sin and cos of global time once
    // saves N*M (sin, cos, 2 muls)
    vec2 stct = vec2(sin(iGlobalTime), cos(iGlobalTime));
    vec2 orbit = orbitR * stct.yx; // .yx gives (cos, sin) to match the original x/y
    // not needed anymore
    // float x = 0.0;
    // float y = 0.0;
    // was never used
    // vec4 totalLight = vec4(0.0, 0.0, 0.0, 1.0);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) {
            // compute the center of the diagram
            // use vector ops
            // vec2 center = vec2(spacingX * (float(i) + 1.0), spacingY * (float(j) + 1.0));
            vec2 center = spacing * (vec2(i, j) + 1.0);
            // again use vector ops, and the precalced time trig
            // x = center.x + orbitR * cos(iGlobalTime);
            // y = center.y + orbitR * sin(iGlobalTime);
            // vec2 bulb = vec2(x, y);
            vec2 bulb = center + orbit;
            // calculate offsets
            vec2 centerOffset = uv - center;
            vec2 bulbOffset = uv - bulb;
            // use squared length check
            // if (length(uv - center) < centerRadius) {
            if (dot(centerOffset, centerOffset) < centerRadius2) {
                // frag intersects white center marker
                fragColor = vec4(1.0);
                return;
            // use squared length check
            // } else if (length(uv - bulb) < radius) {
            } else if (dot(bulbOffset, bulbOffset) < radius2) {
                // use precalced sin of global time in stct.x
                // intersects rotating "light"
                fragColor = vec4(uv, 0.5 + 0.5 * stct.x, 1.0);
                return;
            } else {
                // intersects one of the enclosing 24 cylinders
                for (int k = 0; k < K; k++) {
                    // use predefined 2*PI and float K
                    float theta = M_PI2 * float(k) / fK;
                    // use vector ops (2 muls 2 adds => 1 mul 1 add)
                    // x = center.x + cos(theta) * encloseR;
                    // y = center.y + sin(theta) * encloseR;
                    // vec2 encloser = vec2(x, y);
                    vec2 encloseOffset = uv - (center + vec2(cos(theta), sin(theta)) * encloseR);
                    if (dot(encloseOffset, encloseOffset) < encloserRadius2) {
                        fragColor = vec4(uv, 0.5 + 0.5 * stct.x, 1.0);
                        return;
                    }
                }
            }
        }
    }
}
I did a little more thinking and realized the best way to optimize it is to change the logic so that, before doing intersection tests on the small circles, it checks the bounds of the whole group of circles. This got it to run at 60 fps:
Example here:
https://www.shadertoy.com/view/lssyRH
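The group-bounds early-out can be sketched in JavaScript (helper name is mine; circles given as [x, y, radius] triples, squared distances throughout so no sqrt is needed):

```javascript
// Reject a fragment against the group's bounding circle first; only
// then test the individual circles. All distance checks are squared.
function hitsAnyCircle(uv, center, circles, groupRadius) {
  const dx = uv[0] - center[0], dy = uv[1] - center[1];
  if (dx * dx + dy * dy > groupRadius * groupRadius) return false; // outside the diagram
  return circles.some(c => {
    const ex = uv[0] - c[0], ey = uv[1] - c[1];
    return ex * ex + ey * ey < c[2] * c[2];
  });
}
```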
I'm storing floating-point gpgpu values in a webgl RGBA render texture, using only the r channel to store my data (I know I should be using a more efficient texture format but that's a separate concern).
Is there an efficient way / trick / hack to find the global min and max floating-point values without resorting to gl.readPixels? Note that just exporting the floating-point data is a hassle in WebGL, since readPixels doesn't yet support reading gl.FLOAT values.
This is the gist of how I'm currently doing things:
if (!gl) {
    gl = renderer.getContext();
    fb = gl.createFramebuffer();
    pixels = new Uint8Array(SIZE * SIZE * 4);
}
if (gl) {
    // TODO: there has to be a more efficient way of doing this than via readPixels...
    gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, data.rtTemp2.__webglTexture, 0);
    if (gl.checkFramebufferStatus(gl.FRAMEBUFFER) == gl.FRAMEBUFFER_COMPLETE) {
        // HACK: we're pickling a single float value in every 4 bytes
        // because webgl currently doesn't support reading gl.FLOAT textures.
        gl.readPixels(0, 0, SIZE, SIZE, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
        var max = -100, min = 100;
        for (var i = 0; i < SIZE; ++i) {
            for (var j = 0; j < SIZE; ++j) {
                var o = 4 * (i * SIZE + j);
                var x = pixels[o + 0];
                var y = pixels[o + 1] / 255.0;
                var z = pixels[o + 2] / 255.0;
                var v = (x <= 1 ? -1.0 : 1.0) * y;
                if (z > 0.0) { v /= z; }
                max = Math.max(max, v);
                min = Math.min(min, v);
            }
        }
        // ...
    }
}
(using a fragment shader that outputs floating-point data in the following format, suitable for UNSIGNED_BYTE parsing...)
<script id="fragmentShaderCompX" type="x-shader/x-fragment">
uniform sampler2D source1;
uniform sampler2D source2;
uniform vec2 resolution;
void main() {
    vec2 uv = gl_FragCoord.xy / resolution.xy;
    float v = texture2D(source1, uv).r + texture2D(source2, uv).r;
    vec4 oo = vec4(1.0, abs(v), 1.0, 1.0);
    if (v < 0.0) {
        oo.x = 0.0;
    }
    v = abs(v);
    if (v > 1.0) {
        oo.y = 1.0;
        oo.z = 1.0 / v;
    }
    gl_FragColor = oo;
}
</script>
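The shader's packing scheme can be mirrored in JavaScript (helper names are mine) to check the encode/decode round trip through bytes: sign in the red byte, min(|v|, 1) in green, and 1/|v| in blue when |v| exceeds 1, decoded exactly as in the readPixels loop above.

```javascript
// Encode a float into 4 bytes the way the shader does, and decode it
// back the way the readPixels loop does.
const toByte = f => Math.round(f * 255);

function encodeFloat(v) {
  const a = Math.abs(v);
  return [v < 0 ? 0 : 255, toByte(Math.min(a, 1)), toByte(a > 1 ? 1 / a : 1), 255];
}

function decodeFloat(px) {
  const sign = px[0] <= 1 ? -1 : 1;
  let v = sign * (px[1] / 255);
  const z = px[2] / 255;
  if (z > 0) v /= z;
  return v;
}
```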
Without compute shaders, the only thing that comes to mind is using a fragment shader for a reduction. For a 100x100 texture you could render to a 20x20 grid texture and have the fragment shader do 5x5 lookups (with GL_NEAREST) to determine min and max, then download the 20x20 texture and do the rest on the CPU. Or do another pass to reduce it further. I don't know which grid sizes are most efficient, though; you'll have to experiment. Maybe this helps, or try googling "reduction gpu".
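A CPU sketch of one such reduction pass (helper name is mine; assumes a square texture whose size the block evenly divides): each output cell holds the min and max of a block×block window of the input, which is exactly what the reducing fragment shader would compute per texel.

```javascript
// One reduction pass: out[cell] = [min, max] of a block x block window
// of the row-major input `data` (a size x size grid of scalars).
function reducePass(data, size, block) {
  const out = [];
  const outSize = size / block;
  for (let y = 0; y < outSize; y++) {
    for (let x = 0; x < outSize; x++) {
      let mn = Infinity, mx = -Infinity;
      for (let j = 0; j < block; j++) {
        for (let i = 0; i < block; i++) {
          const v = data[(y * block + j) * size + (x * block + i)];
          mn = Math.min(mn, v);
          mx = Math.max(mx, v);
        }
      }
      out.push([mn, mx]);
    }
  }
  return out;
}
```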
Render a single vertex into a 1x1 framebuffer and, within the shader, sample the whole previously rendered texture. That way you scan the texture on the GPU, which should be fast enough for real time (or not?); in any case it is definitely faster than doing it on the CPU, and the output is the min/max value.
I also ran across the suggestion of mipmapping the texture and walking through the different levels.
These links might be helpful:
http://www.gamedev.net/topic/559942-glsl--find-global-min-and-max-in-texture/
http://www.opengl.org/discussion_boards/showthread.php/175692-most-efficient-way-to-get-maximum-value-in-texture
Hope this helps.