How to create a 3D random gradient out of 3 passed values in a fragment shader?

I need to create an animated smoke-like texture. I can achieve this with 3D Perlin noise using gradients passed from the CPU side, which I did.
But on the current project I cannot pass an array from normal C++ code; I'm limited to only writing HLSL shaders (although all the following code is written in GLSL, as it's easier to set up). So I thought I need to generate some sort of random values for my gradients inside the fragment shader. While investigating how to tackle this problem, I figured out that I can actually use hash functions as my source of pseudo-random values. I'm following these articles (the first and the second), so I chose the PCG hash for my purposes. I managed to generate decent-looking value noise with the following code.
#version 420 core
#define MAX_TABLE_SIZE 256
#define MASK (MAX_TABLE_SIZE - 1)
in vec2 TexCoord;
out vec4 FragColor;
uniform float Time;
uint pcg_hash(uint v) // note: "input" is a reserved word in GLSL, so the parameter is named v
{
uint state = v * 747796405u + 2891336453u;
uint word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
return (word >> 22u) ^ word;
}
// This is taken from here
// https://stackoverflow.com/a/17479300/9778826
float ConvertToFloat(uint n)
{
uint ieeeMantissa = 0x007FFFFFu; // binary32 mantissa bitmask
uint ieeeOne = 0x3F800000u; // 1.0 in IEEE binary32
n &= ieeeMantissa;
n |= ieeeOne;
float f = uintBitsToFloat(n);
return f - 1.0;
}
float Random1(uint x, uint y)
{
uint hash = pcg_hash(y ^ pcg_hash(x));
return ConvertToFloat(hash);
}
float ValueNoise(vec2 p)
{
int xi = int(p.x);
uint rx0 = uint(xi & MASK);
uint rx1 = uint((xi + 1) & MASK);
int yi = int(p.y);
uint ry0 = uint(yi & MASK);
uint ry1 = uint((yi + 1) & MASK);
float tx = p.x - float(xi);
float ty = p.y - float(yi);
float r00 = Random1(rx0, ry0);
float r10 = Random1(rx1, ry0);
float r01 = Random1(rx0, ry1);
float r11 = Random1(rx1, ry1);
float sx = smoothstep(0, 1, tx);
float sy = smoothstep(0, 1, ty);
float lerp0 = mix(r00, r10, sx);
float lerp1 = mix(r01, r11, sx);
return mix(lerp0, lerp1, sy);
}
float FractalNoise(vec2 point)
{
float sum = 0.0;
float frequency = 0.01;
float amplitude = 1;
int nLayers = 5;
for (int i = 0; i < nLayers; i++)
{
float noise = ValueNoise(point * frequency) * amplitude * 0.5;
sum += noise;
amplitude *= 0.5;
frequency *= 2.0;
}
return sum;
}
void main()
{
// Coordinates go from 0.0 to 1.0 both horizontally and vertically
vec2 Point = TexCoord * 2000;
float noise = FractalNoise(Point);
FragColor = vec4(noise, noise, noise, 1.0);
}
What I want, however, is to generate a 3D random gradient (which is actually just a 3D random vector) out of the three arguments that I pass, to then feed into the Perlin noise function. But I don't know how to do it properly. To clarify these three arguments: I need animated Perlin noise, which means I need a three-component gradient at every joint of the 3D lattice, and the arguments are x, y and the time variable, in that strict order. Say, a point (1, 4, 5) produces a gradient (0.1, 0.03, 0.78), but a point (4, 1, 5) should produce a completely different gradient, say, (0.22, 0.95, 0.43). So again, the order matters.
What I came up with (and what I could understand from the articles in question) is that I can hash the arguments sequentially and then use the resulting value as a seed for the same hash function, which now works as a random number generator. So I wrote this function:
vec3 RandomGradient3(int x, int y, int z)
{
uint seed = pcg_hash(z ^ pcg_hash(y ^ pcg_hash(x)));
uint s1 = seed ^ pcg_hash(seed);
uint s2 = s1 ^ pcg_hash(s1);
uint s3 = s2 ^ pcg_hash(s2);
float g1 = ConvertToFloat(s1);
float g2 = ConvertToFloat(s2);
float g3 = ConvertToFloat(s3);
return vec3(g1, g2, g3);
}
And the gradient I then feed to the 3D perlin noise function:
float CalculatePerlin3D(vec2 p)
{
float z = Time; // a uniform variable passed from the CPU side
int xi0 = int(floor(p.x)) & MASK;
int yi0 = int(floor(p.y)) & MASK;
int zi0 = int(floor(z)) & MASK;
int xi1 = (xi0 + 1) & MASK;
int yi1 = (yi0 + 1) & MASK;
int zi1 = (zi0 + 1) & MASK;
float tx = p.x - int(floor(p.x));
float ty = p.y - int(floor(p.y));
float tz = z - int(floor(z));
float u = smoothstep(0, 1, tx);
float v = smoothstep(0, 1, ty);
float w = smoothstep(0, 1, tz);
vec3 c000 = RandomGradient3(xi0, yi0, zi0);
vec3 c100 = RandomGradient3(xi1, yi0, zi0);
vec3 c010 = RandomGradient3(xi0, yi1, zi0);
vec3 c110 = RandomGradient3(xi1, yi1, zi0);
vec3 c001 = RandomGradient3(xi0, yi0, zi1);
vec3 c101 = RandomGradient3(xi1, yi0, zi1);
vec3 c011 = RandomGradient3(xi0, yi1, zi1);
vec3 c111 = RandomGradient3(xi1, yi1, zi1);
float x0 = tx, x1 = tx - 1;
float y0 = ty, y1 = ty - 1;
float z0 = tz, z1 = tz - 1;
vec3 p000 = vec3(x0, y0, z0);
vec3 p100 = vec3(x1, y0, z0);
vec3 p010 = vec3(x0, y1, z0);
vec3 p110 = vec3(x1, y1, z0);
vec3 p001 = vec3(x0, y0, z1);
vec3 p101 = vec3(x1, y0, z1);
vec3 p011 = vec3(x0, y1, z1);
vec3 p111 = vec3(x1, y1, z1);
float a = mix(dot(c000, p000), dot(c100, p100), u);
float b = mix(dot(c010, p010), dot(c110, p110), u);
float c = mix(dot(c001, p001), dot(c101, p101), u);
float d = mix(dot(c011, p011), dot(c111, p111), u);
float e = mix(a, b, v);
float f = mix(c, d, v);
float noise = mix(e, f, w);
float unsignedNoise = (noise + 1.0) / 2.0;
return unsignedNoise;
}
With this RandomGradient3 function, the following noise texture is produced:
So the gradients seem to be correlated, hence the noise is not really random. The question is, how can I properly randomize these s1, s2 and s3 from RandomGradient3? I'm a real beginner in all this random-number-generation stuff and am certainly not a math guy.
My 3D Perlin noise function seems to be fine, because if I feed it predefined gradients from the CPU it produces the expected result.

Oh, well. After I posted the question, I realized that I hadn't scaled the generated gradients properly! The function produced gradients in the range [0.0, 1.0], but we actually need [-1.0, 1.0] to make it work. So I rewrote this piece of code:
vec3 RandomGradient3(int x, int y, int z)
{
uint seed = pcg_hash(z ^ pcg_hash(y ^ pcg_hash(x)));
uint s1 = seed ^ pcg_hash(seed);
uint s2 = s1 ^ pcg_hash(s1);
uint s3 = s2 ^ pcg_hash(s2);
float g1 = ConvertToFloat(s1);
float g2 = ConvertToFloat(s2);
float g3 = ConvertToFloat(s3);
return vec3(g1, g2, g3);
}
To this:
vec3 RandomGradient3(int x, int y, int z)
{
uint seed = pcg_hash(z ^ pcg_hash(y ^ pcg_hash(x)));
uint s1 = seed ^ pcg_hash(seed);
uint s2 = s1 ^ pcg_hash(s1);
uint s3 = s2 ^ pcg_hash(s2);
float g1 = (ConvertToFloat(s1) - 0.5) * 2.0;
float g2 = (ConvertToFloat(s2) - 0.5) * 2.0;
float g3 = (ConvertToFloat(s3) - 0.5) * 2.0;
return vec3(g1, g2, g3);
}
The animation now looks as expected:
I've got another question, though. Do these computations produce genuinely good pseudo-random numbers that we can rely on to generate random textures, or is there a better way to do this? They obviously produce a good enough result, as the GIF above suggests, but still. Sure, I could dive into the statistics, but maybe somebody has a quick answer. The computations in question are:
uint seed = pcg_hash(z ^ pcg_hash(y ^ pcg_hash(x)));
uint s1 = seed ^ pcg_hash(seed);
uint s2 = s1 ^ pcg_hash(s1);
uint s3 = s2 ^ pcg_hash(s2);
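For what it's worth, one commonly used way to reduce correlation between the three components is to derive each one from an independently salted hash of the same seed instead of chaining the outputs. Below is a minimal sketch reusing pcg_hash and ConvertToFloat from above; the function name and the salt constants are mine and chosen arbitrarily, not taken from the articles.

vec3 RandomGradient3Salted(int x, int y, int z)
{
    // hash the lattice coordinates in strict order, as before
    uint seed = pcg_hash(uint(z) ^ pcg_hash(uint(y) ^ pcg_hash(uint(x))));
    // salt the seed differently per component so the three outputs are not
    // simple XOR combinations of each other; constants are arbitrary odd values
    float g1 = ConvertToFloat(pcg_hash(seed ^ 0x9E3779B9u)) * 2.0 - 1.0;
    float g2 = ConvertToFloat(pcg_hash(seed ^ 0x85EBCA6Bu)) * 2.0 - 1.0;
    float g3 = ConvertToFloat(pcg_hash(seed ^ 0xC2B2AE35u)) * 2.0 - 1.0;
    return vec3(g1, g2, g3); // each component in [-1.0, 1.0)
}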

Related

What is this called and how to achieve it? Visuals in Processing

Hey, does anyone know how to achieve this effect using Processing, or what it is called?
I have been trying to use the wave gradient example in the Processing library and implement Perlin noise, but I cannot get close to the quality of the GIF.
I know the artist used Processing but cannot figure out how!
Link to gif:
https://giphy.com/gifs/processing-jodeus-QInYLzY33wMwM
The effect is reminiscent of Op Art (optical illusion art): I recommend reading/learning more about this fascinating genre and artists like:
Bridget Riley
(Bridget Riley, Intake, 1964)
(Bridget Riley, Hesitate, 1964,
Copyright: (c) Bridget Riley 2018. All rights reserved. / Photo (c) Tate)
Victor Vasarely
(Victor Vasarely, Zebra Couple)
(Victor Vasarely, VegaII)
Frank Stella
(Frank Stella, Untitled 1965, Image courtesy of Art Gallery NSW)
and more
You'll notice these waves are reminiscent of, and heavily inspired by, Bridget Riley's work.
I also recommend checking out San Charoenchai's album visualiser for Beach House - 7
As mentioned in my comment: you should post your attempt.
Waves and perlin noise could work for sure.
There are many ways to achieve a similar look.
Here's a tweaked version of Daniel Shiffman's Noise Wave example:
int numWaves = 24;
float[] yoff = new float[numWaves]; // 2nd dimension of perlin noise
float[] yoffIncrements = new float[numWaves];
void setup() {
size(640, 360);
noStroke();
for(int i = 0 ; i < numWaves; i++){
yoffIncrements[i] = map(i, 0, numWaves - 1, 0.01, 0.03);
}
}
void draw() {
background(0);
float waveHeight = height / numWaves;
for(int i = 0 ; i < numWaves; i++){
float waveY = i * waveHeight;
fill(i % 2 == 0 ? color(255) : color(0));
// We are going to draw a polygon out of the wave points
beginShape();
float xoff = 0; // Option #1: 2D Noise
// float xoff = yoff; // Option #2: 1D Noise
// Iterate over horizontal pixels
for (float x = 0; x <= width + 30; x += 20) {
// Calculate a y value according to noise, map to
float y = map(noise(xoff, yoff[i]), 0, 1, waveY , waveY + (waveHeight * 3)); // Option #1: 2D Noise
// float y = map(noise(xoff), 0, 1, 200,300); // Option #2: 1D Noise
// Set the vertex
vertex(x, y);
// Increment x dimension for noise
xoff += 0.05;
}
// increment y dimension for noise
yoff[i] += yoffIncrements[i];
vertex(width, height);
vertex(0, height);
endShape(CLOSE);
}
}
Notice the quality of the noise wave in comparison to the image you're trying to emulate: there is a constant rhythm to it. To me that is a hint that it's using cycling sine waves changing phase and amplitude (potentially even adding waves together).
I've written an extensive answer on animating sine waves here
(Reuben Margolin's kinectic sculpture system demo)
From your question it sounds like you would be comfortable implementing a sine wave animation. If it helps, here's an example of adding two waves together:
void setup(){
size(600,600);
noStroke();
}
void draw(){
background(0);
// how many waves per sketch height
int heightDivisions = 30;
// split the sketch height into equal height sections
float heightDivisionSize = (float)height / heightDivisions;
// for each height division
for(int j = 0 ; j < heightDivisions; j++){
// use % 2 to alternate between black and white
// see https://processing.org/reference/modulo.html and
// https://processing.org/reference/conditional.html for more
fill(j % 2 == 0 ? color(255) : color(0));
// offset drawing on Y axis
translate(0,(j * heightDivisionSize));
// start a wave shape
beginShape();
// first vertex is at the top left corner
vertex(0,height);
// how many horizontal (per wave) divisions ?
int widthDivisions = 12;
// equally space the points on the wave horizontally
float widthDivsionSize = (float)width / widthDivisions;
// for each point on the wave
for(int i = 0; i <= widthDivisions; i++){
// calculate different phases
// play with arithmetic operators to make interesting wave additions
float phase1 = (frameCount * 0.01) + ((i * j) * 0.025);
float phase2 = (frameCount * 0.05) + ((i + j) * 0.25);
// calculate vertex x position
float x = widthDivsionSize * i;
// multiple sine waves
// (can use cos() and use other ratios too
// 150 in this case is the wave amplitude (e.g. from -150 to + 150)
float y = ((sin(phase1) * sin(phase2) * 150));
// draw calculated vertex
vertex(x,y);
}
// last vertex is at bottom right corner
vertex(width,height);
// finish the shape
endShape();
}
}
The result:
Minor note on performance: this could be implemented more efficiently using PShape; however, I recommend playing with the maths/geometry to find the form you're after, then, as a last step, think about optimizing it.
My intention is not to show you how to create an exact replica, but to show there's more to Op Art than an effect and hopefully inspire you to explore other methods of achieving something similar in the hope that you will discover your own methods and outcomes: something new and of your own through fun happy accidents.
In terms of other techniques/avenues to explore:
displacement maps:
Using an alternating black/white straight bars texture on wavy 3D geometry
using shaders:
Shaders are a huge topic on their own, but it's worth noting:
There's a very good Processing Shader Tutorial
You might be able to explore fragment shaders on Shadertoy, tweak the code in the browser, then make slight changes so you can run them in Processing.
Here are a few quick examples:
https://www.shadertoy.com/view/Wts3DB
tweaked for black/white waves in Processing as shader-Wts3DB.frag
// https://www.shadertoy.com/view/Wts3DB
uniform vec2 iResolution;
uniform float iTime;
#define COUNT 6.
#define COL_BLACK vec3(23,32,38) / 255.0
#define SF 1./min(iResolution.x,iResolution.y)
#define SS(l,s) smoothstep(SF,-SF,l-s)
#define hue(h) clamp( abs( fract(h + vec4(3,2,1,0)/3.) * 6. - 3.) -1. , 0., 1.)
// Original noise code from https://www.shadertoy.com/view/4sc3z2
#define MOD3 vec3(.1031,.11369,.13787)
vec3 hash33(vec3 p3)
{
p3 = fract(p3 * MOD3);
p3 += dot(p3, p3.yxz+19.19);
return -1.0 + 2.0 * fract(vec3((p3.x + p3.y)*p3.z, (p3.x+p3.z)*p3.y, (p3.y+p3.z)*p3.x));
}
float simplex_noise(vec3 p)
{
const float K1 = 0.333333333;
const float K2 = 0.166666667;
vec3 i = floor(p + (p.x + p.y + p.z) * K1);
vec3 d0 = p - (i - (i.x + i.y + i.z) * K2);
vec3 e = step(vec3(0.0), d0 - d0.yzx);
vec3 i1 = e * (1.0 - e.zxy);
vec3 i2 = 1.0 - e.zxy * (1.0 - e);
vec3 d1 = d0 - (i1 - 1.0 * K2);
vec3 d2 = d0 - (i2 - 2.0 * K2);
vec3 d3 = d0 - (1.0 - 3.0 * K2);
vec4 h = max(0.6 - vec4(dot(d0, d0), dot(d1, d1), dot(d2, d2), dot(d3, d3)), 0.0);
vec4 n = h * h * h * h * vec4(dot(d0, hash33(i)), dot(d1, hash33(i + i1)), dot(d2, hash33(i + i2)), dot(d3, hash33(i + 1.0)));
return dot(vec4(31.316), n);
}
void mainImage( vec4 fragColor, vec2 fragCoord )
{
}
void main(void) {
//vec2 uv = vec2(gl_FragColor.x / iResolution.y, gl_FragColor.y / iResolution.y);
vec2 uv = gl_FragCoord.xy / iResolution.y;
float m = 0.;
float t = iTime *.5;
vec3 col;
for(float i=COUNT; i>=0.; i-=1.){
float edge = simplex_noise(vec3(uv * vec2(2., 0.) + vec2(0, t + i*.15), 3.))*.2 + (.95/COUNT)*i;
float mi = SS(edge, uv.y) - SS(edge + .095, uv.y);
m += mi;
if(mi > 0.){
col = vec3(1.0);
}
}
col = mix(COL_BLACK, col, m);
gl_FragColor = vec4(col,1.0);
// mainImage(gl_FragColor,gl_FragCoord);
}
loaded in Processing as:
PShader shader;
void setup(){
size(300,300,P2D);
noStroke();
shader = loadShader("shader-Wts3DB.frag");
shader.set("iResolution",(float)width, float(height));
}
void draw(){
background(0);
shader.set("iTime",frameCount * 0.05);
shader(shader);
rect(0,0,width,height);
}
https://www.shadertoy.com/view/MtsXzl
tweaked as shader-MtsXzl.frag
//https://www.shadertoy.com/view/MtsXzl
#define SHOW_GRID 1
const float c_scale = 0.5;
const float c_rate = 2.0;
#define FLT_MAX 3.402823466e+38
uniform vec3 iMouse;
uniform vec2 iResolution;
uniform float iTime;
//=======================================================================================
float CubicHermite (float A, float B, float C, float D, float t)
{
float t2 = t*t;
float t3 = t*t*t;
float a = -A/2.0 + (3.0*B)/2.0 - (3.0*C)/2.0 + D/2.0;
float b = A - (5.0*B)/2.0 + 2.0*C - D / 2.0;
float c = -A/2.0 + C/2.0;
float d = B;
return a*t3 + b*t2 + c*t + d;
}
//=======================================================================================
float hash(float n) {
return fract(sin(n) * 43758.5453123);
}
//=======================================================================================
float GetHeightAtTile(vec2 T)
{
float rate = hash(hash(T.x) * hash(T.y))*0.5+0.5;
return (sin(iTime*rate*c_rate) * 0.5 + 0.5) * c_scale;
}
//=======================================================================================
float HeightAtPos(vec2 P)
{
vec2 tile = floor(P);
P = fract(P);
float CP0X = CubicHermite(
GetHeightAtTile(tile + vec2(-1.0,-1.0)),
GetHeightAtTile(tile + vec2(-1.0, 0.0)),
GetHeightAtTile(tile + vec2(-1.0, 1.0)),
GetHeightAtTile(tile + vec2(-1.0, 2.0)),
P.y
);
float CP1X = CubicHermite(
GetHeightAtTile(tile + vec2( 0.0,-1.0)),
GetHeightAtTile(tile + vec2( 0.0, 0.0)),
GetHeightAtTile(tile + vec2( 0.0, 1.0)),
GetHeightAtTile(tile + vec2( 0.0, 2.0)),
P.y
);
float CP2X = CubicHermite(
GetHeightAtTile(tile + vec2( 1.0,-1.0)),
GetHeightAtTile(tile + vec2( 1.0, 0.0)),
GetHeightAtTile(tile + vec2( 1.0, 1.0)),
GetHeightAtTile(tile + vec2( 1.0, 2.0)),
P.y
);
float CP3X = CubicHermite(
GetHeightAtTile(tile + vec2( 2.0,-1.0)),
GetHeightAtTile(tile + vec2( 2.0, 0.0)),
GetHeightAtTile(tile + vec2( 2.0, 1.0)),
GetHeightAtTile(tile + vec2( 2.0, 2.0)),
P.y
);
return CubicHermite(CP0X, CP1X, CP2X, CP3X, P.x);
}
//=======================================================================================
vec3 NormalAtPos( vec2 p )
{
float eps = 0.01;
vec3 n = vec3( HeightAtPos(vec2(p.x-eps,p.y)) - HeightAtPos(vec2(p.x+eps,p.y)),
2.0*eps,
HeightAtPos(vec2(p.x,p.y-eps)) - HeightAtPos(vec2(p.x,p.y+eps)));
return normalize( n );
}
//=======================================================================================
float RayIntersectSphere (vec4 sphere, in vec3 rayPos, in vec3 rayDir)
{
//get the vector from the center of this circle to where the ray begins.
vec3 m = rayPos - sphere.xyz;
//get the dot product of the above vector and the ray's vector
float b = dot(m, rayDir);
float c = dot(m, m) - sphere.w * sphere.w;
//exit if r's origin outside s (c > 0) and r pointing away from s (b > 0)
if(c > 0.0 && b > 0.0)
return -1.0;
//calculate discriminant
float discr = b * b - c;
//a negative discriminant corresponds to ray missing sphere
if(discr < 0.0)
return -1.0;
//ray now found to intersect sphere, compute smallest t value of intersection
float collisionTime = -b - sqrt(discr);
//if t is negative, ray started inside sphere so clamp t to zero and remember that we hit from the inside
if(collisionTime < 0.0)
collisionTime = -b + sqrt(discr);
return collisionTime;
}
//=======================================================================================
vec3 DiffuseColor (in vec3 pos)
{
#if SHOW_GRID
pos = mod(floor(pos),2.0);
return vec3(mod(pos.x, 2.0) < 1.0 ? 1.0 : 0.0);
#else
return vec3(0.1, 0.8, 0.9);
#endif
}
//=======================================================================================
vec3 ShadePoint (in vec3 pos, in vec3 rayDir, float time, bool fromUnderneath)
{
vec3 diffuseColor = DiffuseColor(pos);
vec3 reverseLightDir = normalize(vec3(1.0,1.0,-1.0));
vec3 lightColor = vec3(1.0);
vec3 ambientColor = vec3(0.05);
vec3 normal = NormalAtPos(pos.xz);
normal *= fromUnderneath ? -1.0 : 1.0;
// diffuse
vec3 color = diffuseColor;
float dp = dot(normal, reverseLightDir);
if(dp > 0.0)
color += (diffuseColor * lightColor);
return color;
}
//=======================================================================================
vec3 HandleRay (in vec3 rayPos, in vec3 rayDir, in vec3 pixelColor, out float hitTime)
{
float time = 0.0;
float lastHeight = 0.0;
float lastY = 0.0;
float height;
bool hitFound = false;
hitTime = FLT_MAX;
bool fromUnderneath = false;
vec2 timeMinMax = vec2(0.0, 20.0);
time = timeMinMax.x;
const int c_numIters = 100;
float deltaT = (timeMinMax.y - timeMinMax.x) / float(c_numIters);
vec3 pos = rayPos + rayDir * time;
float firstSign = sign(pos.y - HeightAtPos(pos.xz));
for (int index = 0; index < c_numIters; ++index)
{
pos = rayPos + rayDir * time;
height = HeightAtPos(pos.xz);
if (sign(pos.y - height) * firstSign < 0.0)
{
fromUnderneath = firstSign < 0.0;
hitFound = true;
break;
}
time += deltaT;
lastHeight = height;
lastY = pos.y;
}
if (hitFound) {
time = time - deltaT + deltaT*(lastHeight-lastY)/(pos.y-lastY-height+lastHeight);
pos = rayPos + rayDir * time;
pixelColor = ShadePoint(pos, rayDir, time, fromUnderneath);
hitTime = time;
}
return pixelColor;
}
//=======================================================================================
void main()
{
// scrolling camera
vec3 cameraOffset = vec3(iTime, 0.5, iTime);
//----- camera
vec2 mouse = iMouse.xy / iResolution.xy;
vec3 cameraAt = vec3(0.5,0.5,0.5) + cameraOffset;
float angleX = iMouse.z > 0.0 ? 6.28 * mouse.x : 3.14 + iTime * 0.25;
float angleY = iMouse.z > 0.0 ? (mouse.y * 6.28) - 0.4 : 0.5;
vec3 cameraPos = (vec3(sin(angleX)*cos(angleY), sin(angleY), cos(angleX)*cos(angleY))) * 5.0;
// float angleX = 0.8;
// float angleY = 0.8;
// vec3 cameraPos = vec3(0.0,0.0,0.0);
cameraPos += vec3(0.5,0.5,0.5) + cameraOffset;
vec3 cameraFwd = normalize(cameraAt - cameraPos);
vec3 cameraLeft = normalize(cross(normalize(cameraAt - cameraPos), vec3(0.0,sign(cos(angleY)),0.0)));
vec3 cameraUp = normalize(cross(cameraLeft, cameraFwd));
float cameraViewWidth = 6.0;
float cameraViewHeight = cameraViewWidth * iResolution.y / iResolution.x;
float cameraDistance = 6.0; // intuitively backwards!
// Objects
vec2 rawPercent = (gl_FragCoord.xy / iResolution.xy);
vec2 percent = rawPercent - vec2(0.5,0.5);
vec3 rayTarget = (cameraFwd * vec3(cameraDistance,cameraDistance,cameraDistance))
- (cameraLeft * percent.x * cameraViewWidth)
+ (cameraUp * percent.y * cameraViewHeight);
vec3 rayDir = normalize(rayTarget);
float hitTime = FLT_MAX;
vec3 pixelColor = vec3(1.0, 1.0, 1.0);
pixelColor = HandleRay(cameraPos, rayDir, pixelColor, hitTime);
gl_FragColor = vec4(clamp(pixelColor,0.0,1.0), 1.0);
}
and the mouse interactive Processing sketch:
PShader shader;
void setup(){
size(300,300,P2D);
noStroke();
shader = loadShader("shader-MtsXzl.frag");
shader.set("iResolution",(float)width, float(height));
}
void draw(){
background(0);
shader.set("iTime",frameCount * 0.05);
shader.set("iMouse",(float)mouseX , (float)mouseY, mousePressed ? 1.0 : 0.0);
shader(shader);
rect(0,0,width,height);
}
Shadertoy is a great way to play/learn: have fun!
Update
Here's a quick test tweaking Daniel Shiffman's 3D Terrain Generation example to add a striped texture and basic sine waves instead of Perlin noise:
// Daniel Shiffman
// http://codingtra.in
// http://patreon.com/codingtrain
// Code for: https://youtu.be/IKB1hWWedMk
int cols, rows;
int scl = 20;
int w = 2000;
int h = 1600;
float flying = 0;
float[][] terrain;
PImage texture;
void setup() {
size(600, 600, P3D);
textureMode(NORMAL);
noStroke();
cols = w / scl;
rows = h/ scl;
terrain = new float[cols][rows];
texture = getBarsTexture(512,512,96);
}
void draw() {
flying -= 0.1;
float yoff = flying;
for (int y = 0; y < rows; y++) {
float xoff = 0;
for (int x = 0; x < cols; x++) {
//terrain[x][y] = map(noise(xoff, yoff), 0, 1, -100, 100);
terrain[x][y] = map(sin(xoff) * sin(yoff), 0, 1, -60, 60);
xoff += 0.2;
}
yoff += 0.2;
}
background(0);
translate(width/2, height/2+50);
rotateX(PI/9);
translate(-w/2, -h/2);
for (int y = 0; y < rows-1; y++) {
beginShape(TRIANGLE_STRIP);
texture(texture);
for (int x = 0; x < cols; x++) {
float u0 = map(x,0,cols-1,0.0,1.0);
float u1 = map(x+1,0,cols-1,0.0,1.0);
float v0 = map(y,0,rows-1,0.0,1.0);
float v1 = map(y+1,0,rows-1,0.0,1.0);
vertex(x*scl, y*scl, terrain[x][y], u0, v0);
vertex(x*scl, (y+1)*scl, terrain[x][y+1], u1, v1);
}
endShape();
}
}
PGraphics getBarsTexture(int textureWidth, int textureHeight, int numBars){
PGraphics texture = createGraphics(textureWidth, textureHeight);
int moduleSide = textureWidth / numBars;
texture.beginDraw();
texture.background(0);
texture.noStroke();
for(int i = 0; i < numBars; i+= 2){
texture.rect(0, i * moduleSide, textureWidth, moduleSide);
}
texture.endDraw();
return texture;
}

Why is wrapping coordinates not making my simplex noise tile seamlessly?

I've been trying to create a fake 3D texture that repeats in Shadertoy (see here; use WASD to move, arrow keys to rotate). But as you can see, it doesn't tile.
I generate the noise myself, and I've isolated the noise generation in this minimal example; however, it does not generate seamlessly tileable noise, seemingly no matter what I do.
Here is the code:
//Common, you probably won't have to look here.
vec2 modv(vec2 value, float modvalue){
return vec2(mod(value.x, modvalue),
mod(value.y, modvalue));
}
vec3 modv(vec3 value, float modvalue){
return vec3(mod(value.x, modvalue),
mod(value.y, modvalue),
mod(value.z, modvalue));
}
vec4 modv(vec4 value, float modvalue){
return vec4(mod(value.x, modvalue),
mod(value.y, modvalue),
mod(value.z, modvalue),
mod(value.w, modvalue));
}
//MATH CONSTANTS
const float pi = 3.1415926535897932384626433832795;
const float tau = 6.2831853071795864769252867665590;
const float eta = 1.5707963267948966192313216916397;
const float SQRT3 = 1.7320508075688772935274463415059;
const float SQRT2 = 1.4142135623730950488016887242096;
const float LTE1 = 0.9999999999999999999999999999999;
const float inf = uintBitsToFloat(0x7F800000u);
#define saturate(x) clamp(x,0.0,1.0)
#define norm01(x) ((x + 1.0) / 2.0)
vec2 pos3DTo2D(in vec3 pos,
const in int size_dim,
const in ivec2 z_size){
float size_dimf = float(size_dim);
pos = vec3(mod(pos.x, size_dimf), mod(pos.y, size_dimf), mod(pos.z, size_dimf));
int z_dim_x = int(pos.z) % z_size.x;
int z_dim_y = int(pos.z) / z_size.x;
float x = pos.x + float(z_dim_x * size_dim);
float y = pos.y + float(z_dim_y * size_dim);
return vec2(x,y);
}
vec4 textureAs3D(const in sampler2D iChannel,
in vec3 pos,
const in int size_dim,
const in ivec2 z_size,
const in vec3 iResolution){
//only need whole, will do another texture read to make sure interpolated?
vec2 tex_pos = pos3DTo2D(pos, size_dim, z_size)/iResolution.xy;
vec4 base_vec4 = texture(iChannel, tex_pos);
vec2 tex_pos_z1 = pos3DTo2D(pos+vec3(0.0,0.0,1.0), size_dim, z_size.xy)/iResolution.xy;
vec4 base_vec4_z1 = texture(iChannel, tex_pos_z1);
//return base_vec4;
return mix(base_vec4, base_vec4_z1, fract(pos.z));
}
vec4 textureZ3D(const in sampler2D iChannel,
in int y,
in int z,
in int offsetX,
const in int size_dim,
const in ivec2 z_size,
const in vec3 iResolution){
int tx = (z%z_size.x);
int ty = z/z_size.x;
int sx = offsetX + size_dim * tx;
int sy = y + (ty *size_dim);
if(ty < z_size.y){
return texelFetch(iChannel, ivec2(sx, sy),0);
}else{
return vec4(0.0);
}
//return texelFetch(iChannel, ivec2(x, y - (ty *32)),0);
}
//Buffer B this is what you are going to have to look at.
//noise
//NOISE CONSTANTS
// captured from https://en.wikipedia.org/wiki/SHA-2#Pseudocode
const uint CONST_A = 0xcc9e2d51u;
const uint CONST_B = 0x1b873593u;
const uint CONST_C = 0x85ebca6bu;
const uint CONST_D = 0xc2b2ae35u;
const uint CONST_E = 0xe6546b64u;
const uint CONST_F = 0x510e527fu;
const uint CONST_G = 0x923f82a4u;
const uint CONST_H = 0x14292967u;
const uint CONST_0 = 4294967291u;
const uint CONST_1 = 604807628u;
const uint CONST_2 = 2146583651u;
const uint CONST_3 = 1072842857u;
const uint CONST_4 = 1396182291u;
const uint CONST_5 = 2227730452u;
const uint CONST_6 = 3329325298u;
const uint CONST_7 = 3624381080u;
uvec3 singleHash(uvec3 uval){
uval ^= uval >> 16;
uval.x *= CONST_A;
uval.y *= CONST_B;
uval.z *= CONST_C;
return uval;
}
uint combineHash(uint seed, uvec3 uval){
// can move this out to compile time if need be.
// with out multiplying by one of the randomizing constants
// will result in not very different results from seed to seed.
uint un = seed * CONST_5;
un ^= (uval.x^uval.y)* CONST_0;
un ^= (un >> 16);
un = (un^uval.z)*CONST_1;
un ^= (un >> 16);
return un;
}
/*
//what the above hashes are based upon: separate
//out this murmurhash-based coherent noise hash
uint fullHash(uint seed, uvec3 uval){
uval ^= uval >> 16;
uval.x *= CONST_A;
uval.y *= CONST_B;
uval.z *= CONST_D;
uint un = seed * CONST_6;
un ^= (uval.x ^ uval.y) * CONST_0;
un ^= un >> 16;
un = (un^uval.z) * CONST_2;
un ^= un >> 16;
return un;
}
*/
const vec3 gradArray3d[8] = vec3[8](
vec3(1, 1, 1), vec3(1,-1, 1), vec3(-1, 1, 1), vec3(-1,-1, 1),
vec3(1, 1,-1), vec3(1,-1,-1), vec3(-1, 1,-1), vec3(-1,-1,-1)
);
vec3 getGradient3Old(uint uval){
vec3 grad = gradArray3d[uval & 7u];
return grad;
}
//source of some constants
//https://github.com/Auburns/FastNoise/blob/master/FastNoise.cpp
const float SKEW3D = 1.0 / 3.0;
const float UNSKEW3D = 1.0 / 6.0;
const float FAR_CORNER_UNSKEW3D = -1.0 + 3.0*UNSKEW3D;
const float NORMALIZE_SCALE3D = 30.0;// * SQRT3;
const float DISTCONST_3D = 0.6;
float simplexNoiseV(uint seed, in vec3 pos, in uint wrap){
pos = modv(pos, float(wrap));
float skew_factor = (pos.x + pos.y + pos.z)*SKEW3D;
vec3 fsimplex_corner0 = floor(pos + skew_factor);
ivec3 simplex_corner0 = ivec3(fsimplex_corner0);
float unskew_factor = (fsimplex_corner0.x + fsimplex_corner0.y + fsimplex_corner0.z) * UNSKEW3D;
vec3 pos0 = fsimplex_corner0 - unskew_factor;
//subpos's are positions within the grid cell.
vec3 subpos0 = pos - pos0;
//precomputed values used in determining hash, reduces redundant hash computation
//shows 10% -> 20% speed boost.
uvec3 wrapped_corner0 = uvec3(simplex_corner0);
uvec3 wrapped_corner1 = uvec3(simplex_corner0+1);
wrapped_corner0 = wrapped_corner0 % wrap;
wrapped_corner1 = wrapped_corner1 % wrap;
//uvec3 hashes_offset0 = singleHash(uvec3(simplex_corner0));
//uvec3 hashes_offset1 = singleHash(uvec3(simplex_corner0+1));
uvec3 hashes_offset0 = singleHash(wrapped_corner0);
uvec3 hashes_offset1 = singleHash(wrapped_corner1);
//near corner hash value
uint hashval0 = combineHash(seed, hashes_offset0);
//mid corner hash value
uint hashval1;
uint hashval2;
//far corner hash value
uint hashval3 = combineHash(seed, hashes_offset1);
ivec3 simplex_corner1;
ivec3 simplex_corner2;
if (subpos0.x >= subpos0.y)
{
if (subpos0.y >= subpos0.z)
{
hashval1 = combineHash(seed, uvec3(hashes_offset1.x, hashes_offset0.yz));
hashval2 = combineHash(seed, uvec3(hashes_offset1.xy, hashes_offset0.z));
simplex_corner1 = ivec3(1,0,0);
simplex_corner2 = ivec3(1,1,0);
}
else if (subpos0.x >= subpos0.z)
{
hashval1 = combineHash(seed, uvec3(hashes_offset1.x, hashes_offset0.yz));
hashval2 = combineHash(seed, uvec3(hashes_offset1.x, hashes_offset0.y, hashes_offset1.z));
simplex_corner1 = ivec3(1,0,0);
simplex_corner2 = ivec3(1,0,1);
}
else // subpos0.x < subpos0.z
{
hashval1 = combineHash(seed, uvec3(hashes_offset0.xy, hashes_offset1.z));
hashval2 = combineHash(seed, uvec3(hashes_offset1.x, hashes_offset0.y, hashes_offset1.z));
simplex_corner1 = ivec3(0,0,1);
simplex_corner2 = ivec3(1,0,1);
}
}
else // subpos0.x < subpos0.y
{
if (subpos0.y < subpos0.z)
{
hashval1 = combineHash(seed, uvec3(hashes_offset0.xy, hashes_offset1.z));
hashval2 = combineHash(seed, uvec3(hashes_offset0.x, hashes_offset1.yz));
simplex_corner1 = ivec3(0,0,1);
simplex_corner2 = ivec3(0,1,1);
}
else if (subpos0.x < subpos0.z)
{
hashval1 = combineHash(seed, uvec3(hashes_offset0.x, hashes_offset1.y, hashes_offset0.z));
hashval2 = combineHash(seed, uvec3(hashes_offset0.x, hashes_offset1.yz));
simplex_corner1 = ivec3(0,1,0);
simplex_corner2 = ivec3(0,1,1);
}
else // subpos0.x >= subpos0.z
{
hashval1 = combineHash(seed, uvec3(hashes_offset0.x, hashes_offset1.y, hashes_offset0.z));
hashval2 = combineHash(seed, uvec3(hashes_offset1.xy, hashes_offset0.z));
simplex_corner1 = ivec3(0,1,0);
simplex_corner2 = ivec3(1,1,0);
}
}
//we would do this if we didn't want to separate the hash values.
//hashval0 = fullHash(seed, uvec3(simplex_corner0));
//hashval1 = fullHash(seed, uvec3(simplex_corner0+simplex_corner1));
//hashval2 = fullHash(seed, uvec3(simplex_corner0+simplex_corner2));
//hashval3 = fullHash(seed, uvec3(simplex_corner0+1));
vec3 subpos1 = subpos0 - vec3(simplex_corner1) + UNSKEW3D;
vec3 subpos2 = subpos0 - vec3(simplex_corner2) + 2.0*UNSKEW3D;
vec3 subpos3 = subpos0 + FAR_CORNER_UNSKEW3D;
float n0, n1, n2, n3;
//http://catlikecoding.com/unity/tutorials/simplex-noise/
//circle distance factor to make sure second derivative is continuous
// t variables represent (1 - x^2 + y^2 + ...)^3, a distance function with
// continuous first and second derivatives that are zero when x is one.
float t0 = DISTCONST_3D - subpos0.x*subpos0.x - subpos0.y*subpos0.y - subpos0.z*subpos0.z;
//if t < 0, we get odd dips in continuity at the ends, so we just force it to zero
// to prevent it
if(t0 < 0.0){
n0 = 0.0;
}else{
float t0_pow2 = t0 * t0;
float t0_pow4 = t0_pow2 * t0_pow2;
vec3 grad = getGradient3Old(hashval0);
float product = dot(subpos0, grad);
n0 = t0_pow4 * product;
}
float t1 = DISTCONST_3D - subpos1.x*subpos1.x - subpos1.y*subpos1.y - subpos1.z*subpos1.z;
if(t1 < 0.0){
n1 = 0.0;
}else{
float t1_pow2 = t1 * t1;
float t1_pow4 = t1_pow2 * t1_pow2;
vec3 grad = getGradient3Old(hashval1);
float product = dot(subpos1, grad);
n1 = t1_pow4 * product;
}
float t2 = DISTCONST_3D - subpos2.x*subpos2.x - subpos2.y*subpos2.y - subpos2.z*subpos2.z;
if(t2 < 0.0){
n2 = 0.0;
}else{
float t2_pow2 = t2 * t2;
float t2_pow4 = t2_pow2*t2_pow2;
vec3 grad = getGradient3Old(hashval2);
float product = dot(subpos2, grad);
n2 = t2_pow4 * product;
}
float t3 = DISTCONST_3D - subpos3.x*subpos3.x - subpos3.y*subpos3.y - subpos3.z*subpos3.z;
if(t3 < 0.0){
n3 = 0.0;
}else{
float t3_pow2 = t3 * t3;
float t3_pow4 = t3_pow2*t3_pow2;
vec3 grad = getGradient3Old(hashval3);
float product = dot(subpos3, grad);
n3 = t3_pow4 * product;
}
return (n0 + n1 + n2 + n3);
}
//settings for fractal brownian motion noise
struct BrownianFractalSettings{
uint seed;
int octave_count;
float frequency;
float lacunarity;
float persistence;
float amplitude;
};
float accumulateSimplexNoiseV(in BrownianFractalSettings settings, vec3 pos, float wrap){
float accumulated_noise = 0.0;
wrap *= settings.frequency;
vec3 octave_pos = pos * settings.frequency;
for (int octave = 0; octave < settings.octave_count; octave++) {
octave_pos = modv(octave_pos, wrap);
float noise = simplexNoiseV(settings.seed, octave_pos, uint(wrap));
noise *= pow(settings.persistence, float(octave));
accumulated_noise += noise;
octave_pos *= settings.lacunarity;
wrap *= settings.lacunarity;
}
float scale = 2.0 - pow(settings.persistence, float(settings.octave_count - 1));
return (accumulated_noise/scale) * NORMALIZE_SCALE3D * settings.amplitude;
}
const float FREQUENCY = 1.0/8.0;
const float WRAP = 32.0;
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
//set to zero in order to stop scrolling; scrolling shows the lack of tileability between
//wrappings.
const float use_sin_debug = 1.0;
vec3 origin = vec3(norm01(sin(iTime))*64.0*use_sin_debug,0.0,0.0);
vec3 color = vec3(0.0,0.0,0.0);
BrownianFractalSettings brn_settings =
BrownianFractalSettings(203u, 1, FREQUENCY, 2.0, 0.4, 1.0);
const int size_dim = 32;
ivec2 z_size = ivec2(8, 4);
ivec2 iFragCoord = ivec2(fragCoord.x, fragCoord.y);
int z_dim_x = iFragCoord.x / size_dim;
int z_dim_y = iFragCoord.y / size_dim;
if(z_dim_x < z_size.x && z_dim_y < z_size.y){
int ix = iFragCoord.x % size_dim;
int iy = iFragCoord.y % size_dim;
int iz = (z_dim_x) + ((z_dim_y)*z_size.x);
vec3 pos = vec3(ix,iy,iz) + origin;
float value = accumulateSimplexNoiseV(brn_settings, pos, WRAP);
color = vec3(norm01(value));
}else{
color = vec3(1.0,0.0,0.0);
}
fragColor = vec4(color,1.0);
}
//Image, used to finally display
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
const float fcm = 4.0;
//grabs a single 32x32 tile in order to test tileability, currently generates
//a whole array of images however.
vec2 fragCoordMod = vec2(mod(fragCoord.x, 32.0 * fcm), mod(fragCoord.y, 32.0 * fcm));
vec3 color = texture(iChannel2, fragCoordMod/(fcm*iResolution.xy)).xyz;
fragColor = vec4(color, 1.0);
}
What I've tried: taking position % wrap, modifying the wrap value by the lacunarity, and applying % wrap again after the warp; these are what is currently in use (look in simplexNoiseV for the core algorithm and accumulateSimplexNoiseV for the octave summation).
According to these answers it should be that simple (mod the position used for hashing); however, this clearly just doesn't work. I'm not sure whether it's partly because my hashing function is not Ken Perlin's, but it doesn't seem like that should make a difference. It does seem like the skewing of coordinates should make this method not work at all, yet apparently others have had success with it.
Here's an example of it not tiling:
UPDATE:
I've still not fixed the issue, but it appears that tiling works appropriately along the simplices, and not along the grid, as seen here:
Do I have to modify my modulus to account for the skewing?

Rotating textures individually using webgl

I want to rotate each texture individually in my render call. I read this tutorial, and it works as I want it to, except that the rotation is applied to all objects (due to the rotation value being a uniform).
So I rewrote it to use a buffer, but I can't get it to work properly.
Here's my shader:
attribute vec2 a_position;
attribute vec2 a_texture_coord;
attribute vec2 a_rotation;
attribute vec4 a_color;
uniform vec2 u_resolution;
varying highp vec2 v_texture_coord;
varying vec4 v_color;
void main() {
v_color = a_color;
vec2 rotatedPosition = vec2(
a_position.x * a_rotation.y + a_position.y * a_rotation.x,
a_position.y * a_rotation.y - a_position.x * a_rotation.x);
vec2 zeroToOne = rotatedPosition / u_resolution;
vec2 zeroToTwo = zeroToOne * 2.0;
vec2 clipSpace = zeroToTwo - 1.0;
gl_Position = vec4(clipSpace * vec2(1, -1), 0, 1);
v_texture_coord = a_texture_coord;
}
and the TypeScript code:
this.gl.enableVertexAttribArray(this.rotationAttributeLocation);
this.gl.bindBuffer(this.gl.ARRAY_BUFFER, this.rotationBuffer);
this.gl.bufferData(this.gl.ARRAY_BUFFER, new Float32Array(renderCall.rotation), this.gl.STATIC_DRAW);
this.gl.bindBuffer(this.gl.ARRAY_BUFFER, this.rotationBuffer);
this.gl.vertexAttribPointer(this.rotationAttributeLocation, 2, this.gl.FLOAT, false, 0, 0);
I get no errors from WebGL or from the browser, but end up with a blank canvas. Any ideas?
After much digging around in matrix math and how to use it in WebGL, I came up with a solution that worked well for my specific problem.
Creating a render call for each object (squares, 6 vertices) turned out to affect performance quite drastically.
As I only needed to rotate a few objects each rendering cycle, I rotated the vertices directly in JavaScript.
Something like this:
let x1 = x + width;
let x2 = x;
let y1 = y;
let y2 = y + height;
let rotatePointX = x2;
let rotatePointY = y1;
let moveToRotationPointMatrix = Matrix3.createTranslationMatrix(-rotatePointX, -rotatePointY);
let rotationMatrix = Matrix3.createRotationMatrix(angle);
let moveBackMatrix = Matrix3.createTranslationMatrix(rotatePointX, rotatePointY);
let matrix = Matrix3.multiply(moveToRotationPointMatrix, rotationMatrix);
matrix = Matrix3.multiply(matrix, moveBackMatrix);
let x1y1 = Matrix3.positionConvertion(x1, y1, matrix);
let x2y2 = Matrix3.positionConvertion(x2, y2, matrix);
let x2y1 = Matrix3.positionConvertion(x2, y1, matrix);
let x1y2 = Matrix3.positionConvertion(x1, y2, matrix);
let newVertecies = [
x1y1[0], x1y1[1],
x2y2[0], x2y2[1],
x2y1[0], x2y1[1],
x1y1[0], x1y1[1],
x2y2[0], x2y2[1],
x1y2[0], x1y2[1]
];
Where Matrix3 is more or less a copy of the WebGL Fundamentals helper class for 3x3 matrix math, from here:
public static positionConvertion(x: number, y: number, matrix: number[]) {
// compute both components from the original x and y
// (reusing the already-updated x when computing y would be a bug)
const newX = x * matrix[0] + y * matrix[3] + 1 * matrix[6];
const newY = x * matrix[1] + y * matrix[4] + 1 * matrix[7];
return [newX, newY];
}
Also check out this answer for a simple example of how to do the rotation in the shader.
Other helpful sources
webglfundamentals.org/webgl/lessons/webgl-2d-matrices.html
webglfundamentals.org/webgl/lessons/webgl-2d-matrix-stack.html
webglfundamentals.org/webgl/lessons/webgl-2d-rotation.html

Optimize WebGL shader?

I wrote the following shader to render a pattern with a bunch of concentric circles. Eventually I want to have each rotating sphere be a light emitter to create something along these lines.
Of course right now I'm just doing the most basic part to render the different objects.
Unfortunately the shader is incredibly slow (16fps full screen on a high-end MacBook). I'm pretty sure this is due to the numerous for loops and branching that I have in the shader. I'm wondering how I can pull off the geometry I'm trying to achieve in a more performance-optimized way:
EDIT: you can run the shader here: https://www.shadertoy.com/view/lssyRH
One obvious optimization I am missing is that currently all the fragments are checked against the entire 24 surrounding circles. It would be pretty quick and easy to just discard these checks entirely by checking whether the fragment intersects the outer bounds of the diagram. I guess I'm just trying to get a handle on what the best practice is for doing something like this.
#define N 10
#define M 5
#define K 24
#define M_PI 3.1415926535897932384626433832795
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
float aspectRatio = iResolution.x / iResolution.y;
float h = 1.0;
float w = aspectRatio;
vec2 uv = vec2(fragCoord.x / iResolution.x * aspectRatio, fragCoord.y / iResolution.y);
float radius = 0.01;
float orbitR = 0.02;
float orbiterRadius = 0.005;
float centerRadius = 0.002;
float encloseR = 2.0 * orbitR;
float encloserRadius = 0.002;
float spacingX = (w / (float(N) + 1.0));
float spacingY = h / (float(M) + 1.0);
float x = 0.0;
float y = 0.0;
vec4 totalLight = vec4(0.0, 0.0, 0.0, 1.0);
for (int i = 0; i < N; i++) {
for (int j = 0; j < M; j++) {
// compute the center of the diagram
vec2 center = vec2(spacingX * (float(i) + 1.0), spacingY * (float(j) + 1.0));
x = center.x + orbitR * cos(iGlobalTime);
y = center.y + orbitR * sin(iGlobalTime);
vec2 bulb = vec2(x,y);
if (length(uv - center) < centerRadius) {
// frag intersects white center marker
fragColor = vec4(1.0);
return;
} else if (length(uv - bulb) < radius) {
// intersects rotating "light"
fragColor = vec4(uv,0.5+0.5*sin(iGlobalTime),1.0);
return;
} else {
// intersects one of the enclosing 24 cylinders
for(int k = 0; k < K; k++) {
float theta = M_PI * 2.0 * float(k)/ float(K);
x = center.x + cos(theta) * encloseR;
y = center.y + sin(theta) * encloseR;
vec2 encloser = vec2(x,y);
if (length(uv - encloser) < encloserRadius) {
fragColor = vec4(uv,0.5+0.5*sin(iGlobalTime),1.0);
return;
}
}
}
}
}
}
Keeping in mind that you want to optimize the fragment shader, and only the fragment shader:
Move the sin(iGlobalTime) and cos(iGlobalTime) out of the loops; they remain constant over the whole draw call, so there is no need to recalculate them every loop iteration.
GPUs employ vectorized instruction sets (SIMD) where possible; take advantage of that. You're wasting lots of cycles by doing multiple scalar ops where you could use a single vector instruction (see annotated code).
[Three years wiser me here: I'm not really sure if this statement is true with regard to how modern GPUs process the instructions, however it certainly does help readability and may even give a hint or two to the compiler]
Do your radius checks squared; save that sqrt (inside length) for when you really need it.
Replace float casts of constants (your loop limits) with float constants (intelligent shader compilers will already do this, but it's not something to count on).
Don't have undefined behavior in your shader (not writing to gl_FragColor).
Here is an optimized and annotated version of your shader (still containing that undefined behavior, just like the one you provided). Annotations are in the form:
// annotation
// old code, if any
new code
#define N 10
// define float constant N
#define fN 10.
#define M 5
// define float constant M
#define fM 5.
#define K 24
// define float constant K
#define fK 24.
#define M_PI 3.1415926535897932384626433832795
// predefine 2 times PI
#define M_PI2 6.28318531
void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
float aspectRatio = iResolution.x / iResolution.y;
// we dont need these separate
// float h = 1.0;
// float w = aspectRatio;
// use vector ops(2 divs 1 mul => 1 div 1 mul)
// vec2 uv = vec2(fragCoord.x / iResolution.x * aspectRatio, fragCoord.y / iResolution.y);
vec2 uv = fragCoord.xy / iResolution.xy;
uv.x *= aspectRatio;
// most of the following declarations should be predefined or marked as "const"...
float radius = 0.01;
// precalc squared radius
float radius2 = radius*radius;
float orbitR = 0.02;
float orbiterRadius = 0.005;
float centerRadius = 0.002;
// precalc squared center radius
float centerRadius2 = centerRadius * centerRadius;
float encloseR = 2.0 * orbitR;
float encloserRadius = 0.002;
// precalc squared encloser radius
float encloserRadius2 = encloserRadius * encloserRadius;
// Use float constants and vector ops here(2 casts 2 adds 2 divs => 1 add 1 div)
// float spacingX = w / (float(N) + 1.0);
// float spacingY = h / (float(M) + 1.0);
vec2 spacing = vec2(aspectRatio, 1.0) / (vec2(fN, fM)+1.);
// calc sin and cos of global time
// saves N*M(sin,cos,2 muls)
vec2 stct = vec2(sin(iGlobalTime), cos(iGlobalTime));
vec2 orbit = orbitR * stct;
// not needed anymore
// float x = 0.0;
// float y = 0.0;
// was never used
// vec4 totalLight = vec4(0.0, 0.0, 0.0, 1.0);
for (int i = 0; i < N; i++) {
for (int j = 0; j < M; j++) {
// compute the center of the diagram
// Use vector ops
// vec2 center = vec2(spacingX * (float(i) + 1.0), spacingY * (float(j) + 1.0));
vec2 center = spacing * (vec2(i,j)+1.0);
// Again use vector opts, use precalced time trig(orbit = orbitR * stct)
// x = center.x + orbitR * cos(iGlobalTime);
// y = center.y + orbitR * sin(iGlobalTime);
// vec2 bulb = vec2(x,y);
vec2 bulb = center + orbit;
// calculate offsets
vec2 centerOffset = uv - center;
vec2 bulbOffset = uv - bulb;
// use squared length check
// if (length(uv - center) < centerRadius) {
if (dot(centerOffset, centerOffset) < centerRadius2) {
// frag intersects white center marker
fragColor = vec4(1.0);
return;
// use squared length check
// } else if (length(uv - bulb) < radius) {
} else if (dot(bulbOffset, bulbOffset) < radius2) {
// Use precalced sin global time in stct.x
// intersects rotating "light"
fragColor = vec4(uv,0.5+0.5*stct.x,1.0);
return;
} else {
// intersects one of the enclosing 24 cylinders
for(int k = 0; k < K; k++) {
// use predefined 2*PI and float K
float theta = M_PI2 * float(k) / fK;
// Use vector ops(2 muls 2 adds => 1 mul 1 add)
// x = center.x + cos(theta) * encloseR;
// y = center.y + sin(theta) * encloseR;
// vec2 encloser = vec2(x,y);
vec2 encloseOffset = uv - (center + vec2(cos(theta),sin(theta)) * encloseR);
if (dot(encloseOffset,encloseOffset) < encloserRadius2) {
fragColor = vec4(uv,0.5+0.5*stct.x,1.0);
return;
}
}
}
}
}
}
I did a little more thinking... I realized the best way to optimize it is to change the logic so that, before doing intersection tests on the small circles, it checks the bounds of the whole group of circles. This got it to run at 60fps:
Example here:
https://www.shadertoy.com/view/lssyRH
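The gist of that change, sketched here rather than copied from the actual shader (variable names follow the annotated version above), is an early rejection placed right after center is computed in the inner j-loop: if the fragment lies outside the bounding circle of the whole diagram, none of the per-circle tests can hit, so the diagram is skipped entirely.

// sketch only: goes at the start of the inner j-loop body, after center is computed
vec2 groupOffset = uv - center;
// outermost feature is the enclosing ring at radius encloseR, plus its small circle radius
float groupRadius = encloseR + encloserRadius;
if (dot(groupOffset, groupOffset) > groupRadius * groupRadius) {
    continue; // fragment cannot intersect anything in this diagram, skip all checks
}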

How do I convert a vec4 rgba value to a float?

I packed some float data into a texture as unsigned_byte data, my only option in WebGL. Now I would like to unpack it in the vertex shader. When I sample a pixel I get a vec4, which is really one of my floats. How do I convert the vec4 to a float?
The following code is specifically for the iPhone 4 GPU using OpenGL ES 2.0. I have no experience with WebGL, so I can't claim to know how the code will work in that context. Furthermore, the main problem here is that highp float is not 32 bits but instead 24 bits.
My solution is for fragment shaders - I didn't try it in the vertex shader, but it shouldn't be any different. In order to use it you will need to get the RGBA texel from a sampler2D uniform and make sure that the values of the R, G, B and A channels are between 0.0 and 255.0. This is easy to achieve as follows:
highp vec4 rgba = texture2D(textureSamplerUniform, texcoordVarying)*255.0;
You should be aware, though, that the endianness of your machine will dictate the correct order of your bytes. The above code assumes that floats are stored in big-endian order. If you see that your results are wrong then just swap the order of the data by writing
rgba.rgba=rgba.abgr;
immediately after the line where you set it. Alternatively, swap the indices on rgba. I think the above line is more intuitive though, and less prone to careless errors.
I am not sure if it works for all inputs. I tested a large range of numbers and found that decode32 and encode32 are NOT exact inverses. I've also left out the code I used to test it.
#pragma STDGL invariant(all)
highp vec4 encode32(highp float f) {
highp float e =5.0;
highp float F = abs(f);
highp float Sign = step(0.0,-f);
highp float Exponent = floor(log2(F));
highp float Mantissa = (exp2(- Exponent) * F);
Exponent = floor(log2(F) + 127.0) + floor(log2(Mantissa));
highp vec4 rgba;
rgba[0] = 128.0 * Sign + floor(Exponent*exp2(-1.0));
rgba[1] = 128.0 * mod(Exponent,2.0) + mod(floor(Mantissa*128.0),128.0);
rgba[2] = floor(mod(floor(Mantissa*exp2(23.0 -8.0)),exp2(8.0)));
rgba[3] = floor(exp2(23.0)*mod(Mantissa,exp2(-15.0)));
return rgba;
}
highp float decode32(highp vec4 rgba) {
highp float Sign = 1.0 - step(128.0,rgba[0])*2.0;
highp float Exponent = 2.0 * mod(rgba[0],128.0) + step(128.0,rgba[1]) - 127.0;
highp float Mantissa = mod(rgba[1],128.0)*65536.0 + rgba[2]*256.0 +rgba[3] + float(0x800000);
highp float Result = Sign * exp2(Exponent) * (Mantissa * exp2(-23.0 ));
return Result;
}
void main()
{
highp float result;
highp vec4 rgba=encode32(-10.01);
result = decode32(rgba);
}
Here are some links on IEEE precision I found useful. Link1. Link2. Link3.
Twerdster posted some excellent code in his answer, so all credit goes to him. I'm posting this new answer since comments don't allow for nicely syntax-colored code blocks, and I wanted to share some code. But if you like the code, please upvote Twerdster's original answer.
In his previous post, Twerdster mentioned that the decode and encode might not work for all values.
To test this further and validate the result, I made a Java program. While porting the code I tried to stay as close as possible to the shader code (therefore I implemented some helper functions).
Note: I also use a store/load function to simulate what happens when you write to/read from a texture.
I found out that:
You need a special case for zero
You might also need a special case for infinity, but I did not implement that, to keep the shader simple (i.e. faster)
Because of rounding errors the result was sometimes wrong, therefore:
subtract 1 from the exponent when, because of rounding, the mantissa is not properly normalised (e.g. mantissa < 1)
change float Mantissa = (exp2(- Exponent) * F); to float Mantissa = F/exp2(Exponent); to reduce precision errors
use float Exponent = floor(log2(F)); to calculate the exponent (simplified by the new mantissa check)
Using these small modifications I got identical output on almost all inputs, and only small errors between the original and the encoded/decoded value when things do go wrong, while in Twerdster's original implementation rounding errors often resulted in the wrong exponent (and thus a result off by a factor of two).
Please note that this is a Java test application which I wrote to test the algorithm. I hope it will also work when ported to the GPU. If anybody tries to run it on a GPU, please leave a comment with your experience.
And here is the code, with a simple test that tries different numbers until it fails.
import java.io.PrintStream;
import java.util.Random;
public class BitPacking {
public static float decode32(float[] v)
{
float[] rgba = mult(255, v);
float sign = 1.0f - step(128.0f,rgba[0])*2.0f;
float exponent = 2.0f * mod(rgba[0],128.0f) + step(128.0f,rgba[1]) - 127.0f;
if(exponent==-127)
return 0;
float mantissa = mod(rgba[1],128.0f)*65536.0f + rgba[2]*256.0f +rgba[3] + ((float)0x800000);
return sign * exp2(exponent-23.0f) * mantissa ;
}
public static float[] encode32(float f) {
float F = abs(f);
if(F==0){
return new float[]{0,0,0,0};
}
float Sign = step(0.0f,-f);
float Exponent = floor(log2(F));
float Mantissa = F/exp2(Exponent);
if(Mantissa < 1)
Exponent -= 1;
Exponent += 127;
float[] rgba = new float[4];
rgba[0] = 128.0f * Sign + floor(Exponent*exp2(-1.0f));
rgba[1] = 128.0f * mod(Exponent,2.0f) + mod(floor(Mantissa*128.0f),128.0f);
rgba[2] = floor(mod(floor(Mantissa*exp2(23.0f -8.0f)),exp2(8.0f)));
rgba[3] = floor(exp2(23.0f)*mod(Mantissa,exp2(-15.0f)));
return mult(1/255.0f, rgba);
}
//shader build-in's
public static float exp2(float x){
return (float) Math.pow(2, x);
}
public static float[] step(float edge, float[] x){
float[] result = new float[x.length];
for(int i=0; i<x.length; i++)
result[i] = x[i] < edge ? 0.0f : 1.0f;
return result;
}
public static float step(float edge, float x){
return x < edge ? 0.0f : 1.0f;
}
public static float mod(float x, float y){
return x-y * floor(x/y);
}
public static float floor(float x){
return (float) Math.floor(x);
}
public static float pow(float x, float y){
return (float)Math.pow(x, y);
}
public static float log2(float x)
{
return (float) (Math.log(x)/Math.log(2));
}
public static float log10(float x)
{
return (float) (Math.log(x)/Math.log(10));
}
public static float abs(float x)
{
return (float)Math.abs(x);
}
public static float log(float x)
{
return (float)Math.log(x);
}
public static float exponent(float x)
{
return floor((float)(Math.log(x)/Math.log(10)));
}
public static float mantissa(float x)
{
return floor((float)(Math.log(x)/Math.log(10)));
}
//shorter matrix multiplication
private static float[] mult(float scalar, float[] w){
float[] result = new float[4];
for(int i=0; i<4; i++)
result[i] = scalar * w[i];
return result;
}
//simulate storage and retrieval in 4-channel/8-bit texture
private static float[] load(int[] v)
{
return new float[]{v[0]/255f, v[1]/255f, v[2]/255f, v[3]/255f};
}
private static int[] store(float[] v)
{
return new int[]{((int) (v[0]*255))& 0xff, ((int) (v[1]*255))& 0xff, ((int) (v[2]*255))& 0xff, ((int) (v[3]*255))& 0xff};
}
//testing until failure, and some specific hard-cases separately
public static void main(String[] args) {
//for(float v : new float[]{-2097151.0f}){ //small error here
for(float v : new float[]{3.4028233e+37f, 8191.9844f, 1.0f, 0.0f, 0.5f, 1.0f/3, 0.1234567890f, 2.1234567890f, -0.1234567890f, 1234.567f}){
float output = decode32(load(store(encode32(v))));
PrintStream stream = (v==output) ? System.out : System.err;
stream.println(v + " ?= " + output);
}
//System.exit(0);
Random r = new Random();
float max = 3200000f;
float min = -max;
boolean error = false;
int trials = 0;
while(!error){
float fin = min + r.nextFloat() * ((max - min) + 1);
float fout = decode32(load(store(encode32(fin))));
if(trials % 10000 == 0)
System.out.print('.');
if(trials % 1000000 == 0)
System.out.println();
if(fin != fout){
System.out.println();
System.out.println("correct trials = " + trials);
System.out.println(fin + " vs " + fout);
error = true;
}
trials++;
}
}
}
I tried Arjan's solution, but it returned invalid values for 0, 1, 2 and 4. There was a bug with the packing of the exponent, which I changed so that the exponent takes a whole 8-bit channel and the sign is packed with the mantissa:
//unpack a 32bit float from 4 8bit, [0;1] clamped floats
float unpackFloat4( vec4 _packed)
{
vec4 rgba = 255.0 * _packed;
float sign = step(-128.0, -rgba[1]) * 2.0 - 1.0;
float exponent = rgba[0] - 127.0;
if (abs(exponent + 127.0) < 0.001)
return 0.0;
float mantissa = mod(rgba[1], 128.0) * 65536.0 + rgba[2] * 256.0 + rgba[3] + float(0x800000);
return sign * exp2(exponent-23.0) * mantissa ;
}
//pack a 32bit float into 4 8bit, [0;1] clamped floats
vec4 packFloat(float f)
{
float F = abs(f);
if(F == 0.0)
{
return vec4(0,0,0,0);
}
float Sign = step(0.0, -f);
float Exponent = floor( log2(F));
float Mantissa = F/ exp2(Exponent);
//std::cout << " sign: " << Sign << ", exponent: " << Exponent << ", mantissa: " << Mantissa << std::endl;
//denormalized values if all exponent bits are zero
if(Mantissa < 1.0)
Exponent -= 1;
Exponent += 127;
vec4 rgba;
rgba[0] = Exponent;
rgba[1] = 128.0 * Sign + mod(floor(Mantissa * float(128.0)),128.0);
rgba[2] = floor( mod(floor(Mantissa* exp2(float(23.0 - 8.0))), exp2(8.0)));
rgba[3] = floor( exp2(23.0)* mod(Mantissa, exp2(-15.0)));
return (1 / 255.0) * rgba;
}
Since you didn't deign to give us the exact code you used to create and upload your texture, I can only guess at what you're doing.
You seem to be creating a JavaScript array of floating-point numbers. You then create a Uint8Array, passing that array to the constructor.
According to the WebGL spec (or rather, the spec that the WebGL spec refers to when ostensibly specifying this behavior), the conversion from floats to unsigned bytes happens in one of two ways, based on the destination. If the destination is considered "clamped", then it clamps the number to the destination range, namely [0, 255] for your case. If the destination is not considered "clamped", then it is taken modulo 2^8. The WebGL "specification" is sufficiently poor that it is not entirely clear whether the construction of Uint8Array is considered clamped or not. Whether clamped or taken modulo 2^8, the decimal point is chopped off and the integer value stored.
However, when you give this data to WebGL, you tell WebGL to interpret the bytes as normalized unsigned integer values. This means that input values on the range [0, 255] will be accessed by users of the texture as [0, 1] floating-point values.
So if your input array had the value 183.45, the value in the Uint8Array would be 183. The value in the texture would be 183/255, or 0.718. If your input value was 0.45, the Uint8Array would hold 0, and the texture result would be 0.0.
Now, because you passed the data as GL_RGBA, that means that every 4 unsigned bytes will be taken as a single texel. So every call to texture will fetch those particular four values (at the given texture coordinate, using the given filtering parameters), thus returning a vec4.
It is not clear what you intend to do with this floating-point data, so it is hard to make suggestions as to how best to pass float data to a shader. However, a general solution would be to use the OES_texture_float extension and actually create a texture that stores floating-point data. Of course, if it isn't available, you'll still have to find a way to do what you want.
BTW, Khronos really should be ashamed of themselves for even calling WebGL a specification. It barely specifies anything; it's just a bunch of references to other specifications, which makes finding the effects of anything exceedingly difficult.
You won't be able to just interpret the 4 unsigned bytes as the bits of a float value (which I assume is what you want) in a shader (at least not in GLES or WebGL, I think). What you can do is store not the float's bit representation in the 4 ubytes, but the bits of the mantissa (or a fixed-point representation). For this you need to know the approximate range of the floats (I'll assume [0,1] here for simplicity; otherwise you have to scale differently, of course):
r = clamp(int(2^8 * f), 0, 255);
g = clamp(int(2^16 * f), 0, 255);
b = clamp(int(2^24 * f), 0, 255); //only have 24 bits of precision anyway
Of course you can also work directly with the mantissa bits. And then in the shader you can just reconstruct it that way, using the fact that the components of the vec4 are all in [0,1]:
f = (v.r) + (v.g / 2^8) + (v.b / 2^16);
Although I'm not sure if this will result in the exact same value, the powers of two should help a bit there.
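A commonly seen GLSL form of this fixed-point idea, sketched here under the same [0,1] assumption (the function names are mine, not from the answer), packs successive 8-bit "digits" of the value into the channels and reconstructs the float with a dot product:

// pack a float in [0,1) into four 8-bit-normalized channels
vec4 packUnitFloat(float v)
{
    vec4 enc = fract(v * vec4(1.0, 255.0, 65025.0, 16581375.0));
    // remove the part already carried by the previous (more significant) channel
    enc -= enc.yzww * vec4(1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0, 0.0);
    return enc;
}

// reconstruct the float from the four channels of the sampled texel
float unpackUnitFloat(vec4 rgba)
{
    return dot(rgba, vec4(1.0, 1.0 / 255.0, 1.0 / 65025.0, 1.0 / 16581375.0));
}

As with the snippet above, only a fixed-point approximation of the value is stored, so the round trip is close but not bit-exact.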
