I've been learning how Metal works, using Swift and targeting macOS. Things have been going okay, but now, close to getting everything done, I've hit a problem that I can't understand at all ... I hope you can help me :)
I'm loading and displaying an OBJ teapot, which I'm lighting with ambient + diffuse + specular light. The lighting itself works well, but the problem is that the normal vector is not interpolated on its way to the fragment shader, which results in flat lighting on a supposedly curved surface ... Not good ...
I really don't understand why the normal is not interpolated while the other values (position + eye) are ... Here is my shader and an image to show the result:
Thanks in advance :)
struct Vertex
{
    float4 position;
    float4 normal;
};

struct ProjectedVertex
{
    float4 position [[position]];
    float3 eye;
    float3 normal;
};

vertex ProjectedVertex vertex_project(device Vertex *vertices [[buffer(0)]],
                                      constant Uniforms &uniforms [[buffer(1)]],
                                      uint vid [[vertex_id]])
{
    ProjectedVertex outVert;
    outVert.position = uniforms.modelViewProjectionMatrix * vertices[vid].position;
    outVert.eye = -(uniforms.modelViewProjectionMatrix * vertices[vid].position).xyz;
    outVert.normal = (uniforms.modelViewProjectionMatrix * float4(vertices[vid].normal)).xyz;
    return outVert;
}
fragment float4 fragment_light(ProjectedVertex vert [[stage_in]],
                               constant Uniforms &uniforms [[buffer(0)]])
{
    // light and material are constants defined elsewhere in the shader file (not shown here)
    float3 ambientTerm = light.ambientColor * material.ambientColor;

    float3 normal = normalize(vert.normal);
    float diffuseIntensity = saturate(dot(normal, light.direction));
    float3 diffuseTerm = light.diffuseColor * material.diffuseColor * diffuseIntensity;

    float3 specularTerm(0);
    if (diffuseIntensity > 0)
    {
        float3 eyeDirection = normalize(vert.eye);
        float3 halfway = normalize(light.direction + eyeDirection);
        float specularFactor = pow(saturate(dot(normal, halfway)), material.specularPower);
        specularTerm = light.specularColor * material.specularColor * specularFactor;
    }

    return float4(ambientTerm + diffuseTerm + specularTerm, 1);
}
[screenshot: teapot rendered with flat, faceted lighting]
So the problem was that in the Objective-C version, when I indexed the vertices from the OBJ file, I generated only one vertex for vertices shared between faces, so I kept only one normal per shared vertex.
When translating it to Swift, the hash value I used to check whether a vertex was at the same position as one I already had was wrong and couldn't detect shared vertices. That resulted in keeping all of the normals, so each face ended up flat-shaded.
I don't know if I'm being clear, but that's what happened. For future reference, this question was about making a Swift version of the "Metal by Example" book, which is Objective-C only.
I'm trying to implement voxel cone tracing in Metal. One of the steps in the algorithm is to voxelize the geometry using a geometry shader. Metal does not have geometry shaders, so I was looking into emulating them with a compute shader. I pass my vertex buffer into the compute shader, do what a geometry shader would normally do, and write the result to an output buffer. I also add a draw command to an indirect buffer. I then use the output buffer as the vertex buffer for my vertex shader. This works fine, but I need twice as much memory for my vertices: one copy in the vertex buffer and one in the output buffer. Is there any way to pass the output of the compute shader directly to the vertex shader without storing it in an intermediate buffer? I don't need to keep the contents of the compute shader's output buffer; I just need to give the results to the vertex shader.
Is this possible? Thanks
EDIT
Essentially, I'm trying to emulate the following GLSL geometry shader:
#version 450

layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

layout(location = 0) in vec3 in_position[];
layout(location = 1) in vec3 in_normal[];
layout(location = 2) in vec2 in_uv[];

layout(location = 0) out vec3 out_position;
layout(location = 1) out vec3 out_normal;
layout(location = 2) out vec2 out_uv;

void main()
{
    vec3 p = abs(cross(in_position[1] - in_position[0], in_position[2] - in_position[0]));

    for (uint i = 0; i < 3; ++i)
    {
        out_position = in_position[i];
        out_normal = in_normal[i];
        out_uv = in_uv[i];

        if (p.z > p.x && p.z > p.y)
        {
            gl_Position = vec4(out_position.x, out_position.y, 0, 1);
        }
        else if (p.x > p.y && p.x > p.z)
        {
            gl_Position = vec4(out_position.y, out_position.z, 0, 1);
        }
        else
        {
            gl_Position = vec4(out_position.x, out_position.z, 0, 1);
        }

        EmitVertex();
    }

    EndPrimitive();
}
For each triangle, I need to output a triangle with vertices at these new positions instead. The triangle vertices come from a vertex buffer and are drawn using an index buffer. I also plan on adding code that will do conservative rasterization (just increase the size of the triangle by a little bit), but it's not shown here. Currently, in the Metal compute shader, I use the index buffer to get the vertices, run the same code as the geometry shader above, and output the new vertices to another buffer, which I then use to draw.
Here's a very speculative possibility depending on exactly what your geometry shader needs to do.
I'm thinking you can do it sort of "backwards" with just a vertex shader and no separate compute shader, at the cost of redundant work on the GPU. You would do a draw as if you had a buffer of all of the output vertices of the output primitives of the geometry shader. You would not actually have that on hand, though. You would construct a vertex shader that would calculate them in flight.
So, in the app code, calculate the number of output primitives and therefore the number of output vertices that would be produced for a given count of input primitives. Do a draw of the output primitive type with that many vertices.
You would not provide a buffer with the output vertex data as input to this draw.
You would provide the original index buffer and original vertex buffer as inputs to the vertex shader for that draw. The shader would calculate from the vertex ID which output primitive it's for, and which vertex of that primitive (e.g. for a triangle, vid / 3 and vid % 3, respectively). From the output primitive ID, it would calculate which input primitive would have generated it in the original geometry shader.
The shader would look up the indices for that input primitive from the index buffer and then the vertex data from the vertex buffer. (This would be sensitive to the distinction between a triangle list vs. triangle strip, for example.) It would apply any pre-geometry-shader vertex shading to that data. Then it would do the part of the geometry computation that contributes to the identified vertex of the identified output primitive. Once it has calculated the output vertex data, you can apply any post-geometry-shader vertex shading(?) that you want. The result is what it would return.
If the geometry shader can produce a variable number of output primitives per input primitive, well, at least you have a maximum number. So, you can draw the maximum potential count of vertices for the maximum potential count of output primitives. The vertex shader can do the computations necessary to figure out if the geometry shader would have, in fact, produced that primitive. If not, the vertex shader can arrange for the whole primitive to be clipped away, either by positioning it outside of the frustum or using a [[clip_distance]] property of the output vertex data.
This avoids ever storing the generated primitives in a buffer. However, it causes the GPU to do some of the pre-geometry-shader vertex shader and geometry shader calculations repeatedly. It will be parallelized, of course, but may still be slower than what you're doing now. Also, it may defeat some optimizations around fetching indices and vertex data that may be possible with more normal vertex shaders.
Here's an example conversion of your geometry shader:
#include <metal_stdlib>
using namespace metal;

struct VertexIn {
    // maybe need packed types here depending on your vertex buffer layout
    // can't use [[attribute(n)]] for these because Metal isn't doing the vertex lookup for us
    float3 position;
    float3 normal;
    float2 uv;
};

struct VertexOut {
    float3 position;
    float3 normal;
    float2 uv;
    float4 new_position [[position]];
};

vertex VertexOut foo(uint vid [[vertex_id]],
                     device const uint *indexes [[buffer(0)]],
                     device const VertexIn *vertexes [[buffer(1)]])
{
    VertexOut out;

    const uint triangle_id = vid / 3;
    const uint vertex_of_triangle = vid % 3;

    // indexes is for a triangle strip even though this shader is invoked for a triangle list.
    const uint index[3] = { indexes[triangle_id], indexes[triangle_id + 1], indexes[triangle_id + 2] };
    const VertexIn v[3] = { vertexes[index[0]], vertexes[index[1]], vertexes[index[2]] };

    float3 p = abs(cross(v[1].position - v[0].position, v[2].position - v[0].position));

    out.position = v[vertex_of_triangle].position;
    out.normal = v[vertex_of_triangle].normal;
    out.uv = v[vertex_of_triangle].uv;

    if (p.z > p.x && p.z > p.y)
    {
        out.new_position = float4(out.position.x, out.position.y, 0, 1);
    }
    else if (p.x > p.y && p.x > p.z)
    {
        out.new_position = float4(out.position.y, out.position.z, 0, 1);
    }
    else
    {
        out.new_position = float4(out.position.x, out.position.z, 0, 1);
    }

    return out;
}
Unfortunately there is no way to do this (and other things) in Metal without going into unneeded complications.
The API lacks critical features that are common in Vulkan, OpenGL, and DirectX...
I'm looking for info on how to recreate the Shadertoy parameters iGlobalTime, iChannel, etc. within three.js. I know that iGlobalTime is the time elapsed since the shader started, and I think the iChannel stuff is for pulling RGB out of textures, but I would appreciate info on how to set these.
Edit: I have been going through all the shaders that come with the three.js examples and think the answers are all in there somewhere; I just have to find the equivalent of, e.g., iChannel1 = a texture input, etc.
I am not sure whether you have already answered your own question, but it might be good for others to know the steps for integrating Shadertoys into three.js.
First, you need to know that a Shadertoy is a fragment shader. That being said, you have to set up a "general purpose" vertex shader that should work with all Shadertoy fragment shaders.
Step 1
Create a "general purpose" vertex shader
varying vec2 vUv;

void main()
{
    vUv = uv;
    vec4 mvPosition = modelViewMatrix * vec4(position, 1.0);
    gl_Position = projectionMatrix * mvPosition;
}
This vertex shader is pretty basic. Notice that we define a varying variable vUv to pass the texture coordinates to the fragment shader. This is important because we are not going to use the screen resolution (iResolution) for our base rendering; we will use the texture coordinates instead. We do that so we can integrate multiple Shadertoys onto different objects in the same three.js scene.
Step 2
Pick the Shadertoy we want and create the fragment shader. (I have chosen a simple toy that performs well: Simple tunnel 2D by niklashuss.)
Here is the given code for this toy:
void main(void)
{
    vec2 p = gl_FragCoord.xy / iResolution.xy;
    vec2 q = p - vec2(0.5, 0.5);

    q.x += sin(iGlobalTime * 0.6) * 0.2;
    q.y += cos(iGlobalTime * 0.4) * 0.3;

    float len = length(q);

    float a = atan(q.y, q.x) + iGlobalTime * 0.3;
    float b = atan(q.y, q.x) + iGlobalTime * 0.3;
    float r1 = 0.3 / len + iGlobalTime * 0.5;
    float r2 = 0.2 / len + iGlobalTime * 0.5;

    float m = (1.0 + sin(iGlobalTime * 0.5)) / 2.0;
    vec4 tex1 = texture2D(iChannel0, vec2(a + 0.1 / len, r1));
    vec4 tex2 = texture2D(iChannel1, vec2(b + 0.1 / len, r2));
    vec3 col = vec3(mix(tex1, tex2, m));

    gl_FragColor = vec4(col * len * 1.5, 1.0);
}
Step 3
Customize the raw Shadertoy code into a complete GLSL fragment shader.
The first things missing from the code are the uniform and varying declarations. Add them at the top of your fragment shader file (just copy and paste the following):
uniform float iGlobalTime;
uniform sampler2D iChannel0;
uniform sampler2D iChannel1;
varying vec2 vUv;
Note that only the Shadertoy variables used in this sample are declared, plus the varying vUv previously declared in our vertex shader.
The last thing we have to tweak is the UV mapping, now that we have decided not to use the screen resolution. To do so, just replace the line that uses the iResolution uniform, i.e.:
vec2 p = gl_FragCoord.xy / iResolution.xy;
with:
vec2 p = -1.0 + 2.0 *vUv;
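Putting Steps 2 and 3 together, the complete converted fragment shader for the tunnel toy should look like this (nothing new here, just the snippets above combined with the p line replaced):
uniform float iGlobalTime;
uniform sampler2D iChannel0;
uniform sampler2D iChannel1;
varying vec2 vUv;

void main(void)
{
    vec2 p = -1.0 + 2.0 * vUv;
    vec2 q = p - vec2(0.5, 0.5);

    q.x += sin(iGlobalTime * 0.6) * 0.2;
    q.y += cos(iGlobalTime * 0.4) * 0.3;

    float len = length(q);

    float a = atan(q.y, q.x) + iGlobalTime * 0.3;
    float b = atan(q.y, q.x) + iGlobalTime * 0.3;
    float r1 = 0.3 / len + iGlobalTime * 0.5;
    float r2 = 0.2 / len + iGlobalTime * 0.5;

    float m = (1.0 + sin(iGlobalTime * 0.5)) / 2.0;
    vec4 tex1 = texture2D(iChannel0, vec2(a + 0.1 / len, r1));
    vec4 tex2 = texture2D(iChannel1, vec2(b + 0.1 / len, r2));
    vec3 col = vec3(mix(tex1, tex2, m));

    gl_FragColor = vec4(col * len * 1.5, 1.0);
}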
That's it, your shaders are now ready to use in your three.js scene.
Step 4
Your three.js code:
Set up the uniforms:
var tuniform = {
    iGlobalTime: { type: 'f', value: 0.1 },
    iChannel0: { type: 't', value: THREE.ImageUtils.loadTexture('textures/tex07.jpg') },
    iChannel1: { type: 't', value: THREE.ImageUtils.loadTexture('textures/infi.jpg') },
};
Make sure the textures are wrapping:
tuniform.iChannel0.value.wrapS = tuniform.iChannel0.value.wrapT = THREE.RepeatWrapping;
tuniform.iChannel1.value.wrapS = tuniform.iChannel1.value.wrapT = THREE.RepeatWrapping;
Create the material with your shaders and apply it to a PlaneGeometry. The PlaneGeometry will simulate Shadertoy's 700x394 screen resolution; in other words, it will best preserve the work the artist intended to share.
var mat = new THREE.ShaderMaterial({
    uniforms: tuniform,
    vertexShader: vshader,
    fragmentShader: fshader,
    side: THREE.DoubleSide
});

var tobject = new THREE.Mesh(new THREE.PlaneGeometry(700, 394, 1, 1), mat);
Finally, in your update function, add the delta of a THREE.Clock to the iGlobalTime value, not the total elapsed time:
tuniform.iGlobalTime.value += clock.getDelta();
That is it, you are now able to run most Shadertoys with this setup...
2022 edit: The version of ShaderFrog described below is no longer being actively developed. There are bugs in the compiler it uses, so it cannot parse all shaders correctly for import, and it doesn't support many of Shadertoy's features, like multiple image buffers. I'm working on a new tool if you want to follow along; otherwise you can try the following method, but it likely won't work most of the time.
Original answer follows:
This is an old thread, but there's now an automated way to do this. Simply go to http://shaderfrog.com/app/editor/new and on the top right click "Import > ShaderToy" and paste in the URL. If it's not public you can paste in the raw source code. Then you can save the shader (requires sign up, no email confirm), and click "Export > Three.js".
You might need to tweak the parameters a little after import, but I hope to have this improved over time. For example, ShaderFrog doesn't support audio or video inputs yet, but you can preview them with images instead.
Proof of concept:
ShaderToy https://www.shadertoy.com/view/MslGWN
ShaderFrog http://shaderfrog.com/app/view/247
Full disclosure: I am the author of this tool which I launched last week. I think this is a useful feature.
This is based on various sources, including the answer by INF1 above.
Basically you insert the missing uniform variables from Shadertoy (iGlobalTime etc., see this list: https://www.shadertoy.com/howto) into the fragment shader, then you rename mainImage(out vec4 z, in vec2 w) to main(), and then you change z in the source code to gl_FragColor. In most Shadertoys 'z' is 'fragColor'.
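As a rough illustration, here is what that looks like for a made-up one-line Shadertoy (this toy example is mine, not one of the shaders linked below):
// Hypothetical Shadertoy source:
//   void mainImage(out vec4 fragColor, in vec2 fragCoord)
//   {
//       fragColor = vec4(fragCoord / iResolution.xy, 0.0, 1.0);
//   }

// After conversion for three.js:
uniform vec3 iResolution;

void main()
{
    // mainImage becomes main, fragCoord becomes gl_FragCoord.xy,
    // and fragColor (the 'z' parameter) becomes gl_FragColor
    gl_FragColor = vec4(gl_FragCoord.xy / iResolution.xy, 0.0, 1.0);
}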
I did this for two cool shaders from this guy (https://www.shadertoy.com/user/guil) but unfortunately I didn't get the marble example to work (https://www.shadertoy.com/view/MtX3Ws).
A working jsFiddle is here: https://jsfiddle.net/dirkk0/zt9dhvqx/
Change the shader from frag1 to frag2 in line 56 to see both examples.
And don't 'Tidy' in jsFiddle - it breaks the shaders.
EDIT:
https://medium.com/#dirkk/converting-shaders-from-shadertoy-to-threejs-fe17480ed5c6
I have been trying to improve the quality of my volume ray casting algorithm. I set a smaller raycast step (the quality is better), but it causes a problem. It is shown in the pictures below (black areas where they shouldn't be).
I am using an RGB cube to get the direction of the ray in the volume.
I think I have the same algorithm as in this question: volume rendering (using glsl) with ray casting algorithm
Does anybody have an idea where the problem could be? I need to resolve this because the deadline for my diploma thesis is too close :( I really don't know why it doesn't work :(
EDIT:
I can't show all of my code here (it could be a problem if I publish it before handing it in at school). But here is the key code for stepping through the volume:
// All variables needed for the rays
vec3 rayDirection = texture2D(backFaceCube, texCoo).xyz - varcolor.xyz;
float lenRay = length(rayDirection);
vec3 normDir = normalize(rayDirection);
float d = qualitySteps; //quality steps is size of steps defined by user -> example: 0.01, 0.001, 0.0001 etc.
vec3 step = normDir * d;
float lenStep = length(step);
float accumulatedLength = 0.0;
and then in the loop:
posInCube.xyz += step;
accumulatedLength += lenStep;
...
...
...
if(accumulatedLength >= lenRay || accumulatedColor.a > 1.0) {
    break;
}
EDIT 2: (sorry, but as a comment it was too long)
Yes, the texture is noisy... I have tried to remove the alpha condition, if(accumulatedColor.a > 1.0), but the result is the same.
I think there is some direct correlation between the length of the ray and the size of the step. I tried many combinations and found the following.
If the step is big, I am able to go through the whole volume, but if it is small, then I am really not able to go through the volume (maybe). If the step is extremely big, then I can see a mirrored object (which can be caused by texture repeating if I go outside the texture on the GPU). If the step is too small, then I am able to map only a small part of the texture -> it seems that the ray is too short, but in reality it isn't. The questions are why the mapping of 3D coordinates to the 2D texture is wrong and why it depends on the size of the step.
Can you please supply the code for your fragment shader?
Are you traversing the whole vector from the front to the end position? Here's an example shader (the code might contain some errors since I just wrote it off the top of my head; unfortunately I can't test it on my computer at the moment):
in vec2 texCoord;
out vec4 outColor;

uniform float stepSize;
uniform int numSteps;

uniform sampler2D frontTexture;
uniform sampler2D backTexture;
uniform sampler3D volumeTexture;
uniform sampler1D transferTexture; // Density to RGBA

void main()
{
    vec4 color = vec4(0.0);

    vec3 startPosition = texture(frontTexture, texCoord).xyz;
    vec3 endPosition = texture(backTexture, texCoord).xyz;

    // March from the front face towards the back face
    vec3 delta = normalize(endPosition - startPosition) * stepSize;
    vec3 position = startPosition;

    for (int i = 0; i < numSteps; ++i)
    {
        float density = texture(volumeTexture, position).r;
        vec4 voxelColor = texture(transferTexture, density);

        // Sampling distance correction (adjust the sample's opacity to the step size)
        voxelColor.a = 1.0 - pow((1.0 - voxelColor.a), stepSize * 500.0);

        // Front to back blending (no shading done)
        color.rgb = color.rgb + (1.0 - color.a) * voxelColor.a * voxelColor.rgb;
        color.a = color.a + (1.0 - color.a) * voxelColor.a;

        if (color.a >= 1.0)
        {
            break;
        }

        // Advance
        position += delta;
        if (position.x < 0.0 || position.y < 0.0 || position.z < 0.0 ||
            position.x > 1.0 || position.y > 1.0 || position.z > 1.0)
        {
            break;
        }
    }

    outColor = color;
}
I came up with some code that simulates a 3D texture lookup using a big 2D texture that contains the tiles. The 3D texture is 128x128x64 and the big 2D texture is 1024x1024, divided into 64 tiles of 128x128.
The lookup code in the fragment shader looks like this:
#extension GL_EXT_gpu_shader4 : enable
varying float LightIntensity;
varying vec3 pos;
uniform sampler2D noisef;
vec4 flat_texture3D()
{
    vec3 p = pos;
    vec2 inimg = p.xy;
    int d = int(p.z * 128.0);

    float ix = (d % 8);
    float iy = (d / 8);

    vec2 oc = inimg + vec2(ix, iy);
    oc *= 0.125;

    return texture2D(noisef, oc);
}

void main (void)
{
    vec4 noisevec = flat_texture3D();
    gl_FragColor = noisevec;
}
The tiling logic seems to work ok and there is only one problem with this code. It looks like this:
There are strange 1 to 2 pixel wide streaks between the layers of voxels.
The streaks appear just at the border when d changes.
I've been working on this for two days now and still have no idea what's going on here.
This looks like a texture filtering issue. Think about it: when you get close to the border, the bilinear filter will consider the neighboring texel, which in your case comes from another "depth layer".
To avoid this, you can clamp the texture coordinates so that they never leave the rect defined by the outermost texel centers of the tile (similar to GL_CLAMP_TO_EDGE, but on a per-tile basis), as sketched below. But be aware that the problem will get worse when using mipmapping. Also be aware that you currently cannot filter in the z direction, as a real 3D texture would; you could simulate this manually in the shader, of course.
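As a sketch of that idea, using the 128x128 tiles in a 1024x1024 atlas from your code (the function name, the sampler parameter, and the half-texel margin are mine, so treat this as an illustration rather than drop-in code):
// Relies on GL_EXT_gpu_shader4 for the integer %, as in your original code.
vec4 flat_texture3D_clamped(sampler2D atlas, vec3 p)
{
    int d = int(p.z * 128.0);
    vec2 tileIndex = vec2(float(d % 8), float(d / 8));

    // Keep the in-tile coordinate at least half a (128x128) tile texel away
    // from the tile border so bilinear filtering never reaches the next tile.
    float halfTexel = 0.5 / 128.0;
    vec2 inTile = clamp(p.xy, halfTexel, 1.0 - halfTexel);

    vec2 oc = (inTile + tileIndex) * 0.125; // 0.125 = 128 / 1024
    return texture2D(atlas, oc);
}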
But really: why not just use 3D textures? The hardware can do all of this for you, with much less overhead...
I am trying some experiments in fractal rendering with DirectX11 Compute Shaders.
The provided example runs on a FeatureLevel_10 device.
My RWStructuredBuffer output buffer has a data format of R32G32B32A32_FLOAT.
The problem is that when writing to the buffer, it seems that only the alpha (w) value gets written, nothing else...
Here is the shader code:
struct BufType
{
    float4 value;
};

cbuffer ScreenConstants : register(b0)
{
    float2 ScreenDimensions;
    float2 Padding;
};

RWStructuredBuffer<BufType> BufferOut : register(u0);

[numthreads(1, 1, 1)]
void Main( uint3 DTid : SV_DispatchThreadID )
{
    uint index = DTid.y * ScreenDimensions.x + DTid.x;

    float minRe = -2.0f;
    float maxRe = 1.0f;
    float minIm = -1.2;
    float maxIm = minIm + (maxRe - minRe) * ScreenDimensions.y / ScreenDimensions.x;

    float reFactor = (maxRe - minRe) / (ScreenDimensions.x - 1.0f);
    float imFactor = (maxIm - minIm) / (ScreenDimensions.y - 1.0f);

    float cim = maxIm - DTid.y * imFactor;
    uint maxIterations = 30;
    float cre = minRe + DTid.x * reFactor;

    float zre = cre;
    float zim = cim;

    bool isInside = true;
    uint iterationsRun = 0;

    for( uint n = 0; n < maxIterations; ++n )
    {
        float zre2 = zre * zre;
        float zim2 = zim * zim;

        if ( zre2 + zim2 > 4.0f )
        {
            isInside = false;
            iterationsRun = n;
        }

        zim = 2 * zre * zim + cim;
        zre = zre2 - zim2 + cre;
    }

    if ( isInside )
    {
        BufferOut[index].value = float4(1.0f, 0.0f, 0.0f, 1.0f);
    }
}
The code actually produces, in a sense, the correct result (a 2D Mandelbrot set), but it seems somehow only the alpha value is touched and nothing else is written, although the pixels inside the set should be colored red... (the image is black and white).
Does anybody have a clue what's going on here?
After some fiddling around I found the problem.
I have not found any documentation from MS mentioning this, so it could also be an Nvidia-specific driver issue.
Apparently you are only allowed to write ONCE per compute shader invocation to the same element in a RWStructuredBuffer. And you also HAVE to write ONCE.
I changed the code to accumulate the correct color in a local variable and now write it only once at the end of the shader.
Everything works perfectly that way.
I'm not sure, but shouldn't the BufferOut declaration be:
RWStructuredBuffer<BufType> BufferOut : register(u0);
instead of:
RWStructuredBuffer BufferOut : register(u0);
If you only need a float4 write target, why not just use:
RWBuffer<float4> BufferOut : register(u0);
Maybe this could help.
After playing around again today, I ran into the same problem once again.
The following code produced all-white output:
[numthreads(1, 1, 1)]
void Main( uint3 dispatchId : SV_DispatchThreadID )
{
    float4 color = float4(1.0f, 0.0f, 0.0f, 1.0f);
    WriteResult(dispatchId, color);
}
The WriteResult method is a utility method from my HLSL standard library.
Long story short: after I upgraded from driver version 192 to 195 (beta), the problem went away.
It seems the drivers still have some definite problems with compute shader support, so beware.
From what I've seen, compute shaders are only useful if you need a more general computational model than the traditional pixel shader, or if you can load data and then share it between threads in fast shared memory. I'm fairly sure you would get better performance with a pixel shader for the Mandelbrot shader.
On my setup (Win7, Feb '10 DX SDK, GTX 480), my compute shaders have a punishing setup time of over 0.2-0.3 ms (binding an SRV and a UAV and then calling Dispatch()).
If you do a PS implementation, please post your experiences.
I have no direct experience with DX compute shaders but...
Why are you setting alpha = 1.0?
IIRC, that makes the pixel 100% transparent, so your inside pixels are transparent red, and show up as whatever color was drawn behind them.
When alpha = 1.0, the RGB components are never used.