DX9 style intrinsics are disabled when not in dx9 compatibility mode? - directx-11

I am currently writing an HLSL shader for a basic Gaussian blur. The shader code is straightforward, but I keep getting an error:
DX9 style intrinsics are disabled when not in dx9 compatibility mode. (LN#: 19)
This tells me that line 19 in my code is the issue, and I believe it is caused by either tex2D or Sampler on that particular line.
#include "Common.hlsl"
Texture2D Texture0 : register(t0);
SamplerState Sampler : register(s0);
float4 PSMain(PixelShaderInput pixel) : SV_Target {
float2 uv = pixel.TextureUV; // This is TEXCOORD0.
float4 result = 0.0f;
float offsets[21] = { ... };
float weights[21] = { ... };
// Blur horizontally.
for (int x = 0; x < 21; x++)
result += tex2D(Sampler, float2(uv.x + offsets[x], uv.y)) * weights[x];
return result;
}
See below for notes about the code, and my questions.
Notes
I have to hand-type my code into Stack Overflow because it is on a computer without a connection. Therefore:
Any spelling or case errors present here do not exist in the code.
The absence of values inside of offsets and weights is intentional.
This is because there are 21 values in each and I didn't feel like typing them all.
offsets is every integer from -10 to 10.
weights ranges from 0.01 to 0.25 and back to 0.01.
The line count here is smaller due to the omissions mentioned above.
The line number of the error here is 15.
The column range is 13 - 59, which covers tex2D and Sampler.
My Questions
Am I using the wrong data type for Sampler in the tex2D call?
Is tex2D deprecated in DirectX 11?
What should I be using instead?
What am I doing wrong here, as that is my only error?

After some extensive searching, I've found out that tex2D is no longer supported in ps_4_0 and up. Well, ps_4_0 will work in legacy mode, but it doesn't work in ps_5_0, which is what I am using. The documentation lists its support as follows:
Shader Model : Supported
Shader Model 4 : yes (pixel shader only), but you must use the legacy compile option when compiling.
Shader Model 3 (DirectX HLSL) : yes (pixel shader only)
Shader Model 2 (DirectX HLSL) : yes (pixel shader only)
Shader Model 1 (DirectX HLSL) : yes (pixel shader only)
tex2D has been replaced by the Texture2D object and its Sample method; documentation is available for it. Below is an example from that documentation:
// Object Declarations
Texture2D g_MeshTexture;
SamplerState MeshTextureSampler
{
Filter = MIN_MAG_MIP_LINEAR;
AddressU = Wrap;
AddressV = Wrap;
};
struct VS_OUTPUT
{
float4 Position : SV_POSITION;
float4 Diffuse : COLOR0;
float2 TextureUV : TEXCOORD0;
};
VS_OUTPUT In;
// Shader body calling the intrinsic function
Output.RGBColor = g_MeshTexture.Sample(MeshTextureSampler, In.TextureUV) * In.Diffuse;
To replace the tex2D call in my code:
result += Texture0.Sample(Sampler, float2(uv.x + offsets[x], uv.y)) * weights[x];
Also, note that the code in this post is for the horizontal pass of a Gaussian blur.
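For completeness, here is roughly what the whole pixel shader looks like after swapping tex2D for Texture2D.Sample (a sketch only; the offsets and weights values are still omitted, exactly as in the original listing):
#include "Common.hlsl"
Texture2D Texture0 : register(t0);
SamplerState Sampler : register(s0);
float4 PSMain(PixelShaderInput pixel) : SV_Target {
    float2 uv = pixel.TextureUV; // This is TEXCOORD0.
    float4 result = 0.0f;
    float offsets[21] = { ... };
    float weights[21] = { ... };
    // Blur horizontally using the Shader Model 4/5 style Sample call.
    for (int x = 0; x < 21; x++)
        result += Texture0.Sample(Sampler, float2(uv.x + offsets[x], uv.y)) * weights[x];
    return result;
}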

Related

Metal fragment shader bound resource acting erratically on macOS

This problem has been stumping me, and I can't figure it out. I've got a few different shaders which use the same resource, a structure with a bunch of lighting values. The first shader to use it works fine -- the second one does not. It seems like it might be getting zeroes. The third shader to use it also works fine.
If I don't have any objects which use the first shader, then the second one works, and the third one does NOT work. The resource never changes; it's an MTLBuffer I set once. And if I do a GPU frame capture, the values reported in the draw calls are all correct, all the time. Yet nothing shows up for the shaders that don't work.
The reason I think it is this one particular struct that is fluctuating is that if I hard-code the values into the shader instead of reading the struct values, then that also works. It's driving me crazy.
I am not sure what kind of code examples will serve here. Here is the structure and how I am using it. I'm not binding any other resources to this particular index, anywhere else in the program.
struct Light {
packed_float3 color; // 0 - 2
float ambientIntensity; // 3
packed_float3 direction; // 4 - 6
float diffuseIntensity; // 7
float shininess; // 8
float specularIntensity; // 9
float dummy1,dummy2;
/*
_______________________
|0 1 2 3|4 5 6 7|8 9 |
-----------------------
| | | |
| chunk0| chunk1| chunk2|
*/
}__attribute__ ((aligned (16)));
... shader code ....
float4 ambientColor = float4(1,1,1,1) * color * 0.25;
//Diffuse
float diffuseFactor = max(0.0,dot(interpolated.normal, float3(0.0,0.5,-1.0))); // 1
float4 diffuseColor = float4(light.color,1) * color * 0.5 * diffuseFactor ; // 2
//Specular
float3 eye = normalize(interpolated.fragmentPosition); //1
float3 reflection = reflect(float3(0.0,0.5,-1.0),interpolated.normal); // 2
float specularFactor = pow(max(0.0, dot(reflection, eye)), 10.0); //3
float4 specularColor = float4(float3(1,1,1)* 2 * specularFactor ,color.w);//4
color = (ambientColor + diffuseColor + specularColor);
color = clamp(color, 0.0, 1.0);
Anyone have any ideas about what could be the trouble? Also, this is only happening on the Mac. On iOS it works fine.
EDIT: It seems like when the shader is failing, it is simply unable to access the resource at all. For example, I tried just setting my fragment return color to equal the light color, with no modifications.
And it produces nothing -- not even black! If I replace that statement with one of these two, I get white and black:
color = float4(1,1,1, 1); // produces white
color = float4(0,0,0, 1); //produces black
color = float4(light.color.xyz, 1); // nothing at all
It is like the fragment shader is just aborting when it is trying to access the light structure -- otherwise I would get SOMETHING in my color, with a forced alpha value of 1. I don't get it.
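For reference, the Swift-side setup is the usual create-once, bind-per-encoder pattern; this is only a rough sketch with hypothetical names (light, lightBuffer, the buffer index, and renderEncoder are placeholders, not the actual project code):
// Created once at startup and never mutated afterwards (hypothetical names).
var light = Light()   // filled with the lighting values elsewhere
let lightBuffer = device.makeBuffer(bytes: &light,
                                    length: MemoryLayout<Light>.stride,
                                    options: [])!
// Bound on every render command encoder whose fragment shader reads the struct,
// always at the same fragment buffer index.
renderEncoder.setFragmentBuffer(lightBuffer, offset: 0, index: 4)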

Metal normal doesn't interpolate

I've been learning how Metal works using Swift and targeting macOS. Things have been going okay, but now, close to getting it all done, I've hit a problem that I cannot understand... I hope you guys will help me :)
I'm loading and displaying an OBJ teapot, which I'm lighting using ambient + diffuse + specular light. Lighting in itself works well, but the problem is: the normal vector is not interpolated when going to the fragment shader, which results in flat lighting on a supposedly curved surface... Not good...
I really don't understand why the normal is not interpolated while other values (position + eye) are... Here is my shader and an image to show the result:
Thanks in advance :)
struct Vertex
{
float4 position;
float4 normal;
};
struct ProjectedVertex
{
float4 position [[position]];
float3 eye;
float3 normal;
};
vertex ProjectedVertex vertex_project(device Vertex *vertices [[buffer(0)]],
constant Uniforms &uniforms [[buffer(1)]],
uint vid [[vertex_id]])
{
ProjectedVertex outVert;
outVert.position = uniforms.modelViewProjectionMatrix * vertices[vid].position;
outVert.eye = -(uniforms.modelViewProjectionMatrix * vertices[vid].position).xyz;
outVert.normal = (uniforms.modelViewProjectionMatrix * float4(vertices[vid].normal)).xyz;
return outVert;
}
fragment float4 fragment_light(ProjectedVertex vert [[stage_in]],
constant Uniforms &uniforms [[buffer(0)]])
{
float3 ambientTerm = light.ambientColor * material.ambientColor;
float3 normal = normalize(vert.normal);
float diffuseIntensity = saturate(dot(normal, light.direction));
float3 diffuseTerm = light.diffuseColor * material.diffuseColor * diffuseIntensity;
float3 specularTerm(0);
if (diffuseIntensity > 0)
{
float3 eyeDirection = normalize(vert.eye);
float3 halfway = normalize(light.direction + eyeDirection);
float specularFactor = pow(saturate(dot(normal, halfway)), material.specularPower);
specularTerm = light.specularColor * material.specularColor * specularFactor;
}
return float4(ambientTerm + diffuseTerm + specularTerm, 1);
}
screenshot
So the problem was that in Obj-C, when I indexed the vertices from the OBJ file, I generated only one vertex for vertices shared between surfaces, so I kept only one normal.
When translating it to Swift, the hash value I used to check whether a vertex is at the same place as one I already have was wrong and couldn't detect shared vertices, which resulted in keeping all of the normals, so each surface is flat.
I don't know if I'm clear enough, but that's what happened. For future reference, this question was about making a Swift version of the "metalbyexample" book, which is Obj-C only.
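For anyone hitting the same thing, the dedup step looks roughly like this (a sketch with hypothetical names; the point is only that the key must be built so shared corners actually collide, which is what my broken hash failed to do):
import simd

struct Vertex {                       // Swift-side mirror of the shader's Vertex (illustrative)
    var position: SIMD4<Float>
    var normal: SIMD4<Float>
}

struct VertexKey: Hashable {
    // Quantized position, so corners shared between faces hash to the same key.
    let x: Int32, y: Int32, z: Int32
    init(_ p: SIMD3<Float>, scale: Float = 1e4) {
        x = Int32(p.x * scale); y = Int32(p.y * scale); z = Int32(p.z * scale)
    }
}

var indexForKey: [VertexKey: UInt32] = [:]
var vertices: [Vertex] = []
var indices: [UInt32] = []

func addCorner(position p: SIMD3<Float>, normal n: SIMD3<Float>) {
    let key = VertexKey(p)
    if let shared = indexForKey[key] {
        // Shared vertex found: reuse it and accumulate the normal so it gets smoothed.
        vertices[Int(shared)].normal += SIMD4<Float>(n.x, n.y, n.z, 0)
        indices.append(shared)
    } else {
        indexForKey[key] = UInt32(vertices.count)
        indices.append(UInt32(vertices.count))
        vertices.append(Vertex(position: SIMD4<Float>(p.x, p.y, p.z, 1),
                               normal: SIMD4<Float>(n.x, n.y, n.z, 0)))
    }
}
// Normalize each accumulated normal once all faces have been added.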

How can I programmatically switch to OpenGL in Sprite Kit to adapt to older devices like the iPad 2

I am developing a game using Sprite Kit. Since iOS 9, SpriteKit uses Metal as its shader backend. SK and the shaders work great, but if I test it on an iPad 2, it no longer works. I've read about this problem and know that the iPad 2 does not support Metal. Now I would like to fall back to OpenGL and provide GLES shaders in this case.
I can programmatically test if "Metal" is available like this (Swift 2):
/**
Returns true if the executing device supports metal.
*/
var metalAvailable:Bool {
get {
struct Static {
static var metalAvailable : Bool = false
static var metalNeedsToBeTested : Bool = true
}
if Static.metalNeedsToBeTested {
let device = MTLCreateSystemDefaultDevice()
Static.metalAvailable = (device != nil)
}
return Static.metalAvailable
}
}
I know that it's possible to set a compatibility mode in the plist of the application:
Edit your app's Info.plist
Add the PrefersOpenGL key with a bool value of YES
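In source form, that Info.plist entry is just this key (a minimal sketch):
<key>PrefersOpenGL</key>
<true/>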
In this case SpriteKit always uses OpenGL. This is not what I want: I want my application to use Metal whenever possible and only fall back to OpenGL if no Metal device is detected.
Is there any option in SpriteKit or UIKit, or somewhere in the APIs, where I can switch to the "PrefersOpenGL" option programmatically?
Thanks in advance,
Jack
Summary
I found the solution. In the end it was my mistake. SpriteKit definitely falls back to OpenGL automatically. The GLES shader language is less forgiving than Metal, and this is where the problem came from: in OpenGL shaders you MUST write the decimal point in each number. Unfortunately the shader compiler didn't tell me that after compiling. Another issue was that sometimes old shader builds stick around in the bundle, so perform a "clean" before testing a shader.
So this is how to deal with both kinds of shaders and with detecting Metal / OpenGL:
Detect if Metal is available
This small helper can be placed anywhere in your code. It helps you to detect Metal at the first usage and gives you the opportunity to execute custom code depending on the configuration once.
Imports:
import SpriteKit
import Metal
Code:
/**
Detect Metal. Returns true if the device supports metal.
*/
var metalAvailable:Bool {
get {
struct Static {
static var metalAvailable : Bool = false
static var metalNeedsToBeTested : Bool = true
}
if Static.metalNeedsToBeTested {
Static.metalNeedsToBeTested = false
let device = MTLCreateSystemDefaultDevice()
Static.metalAvailable = (device != nil)
if Static.metalAvailable {
// Do sth. to init Metal code, if needed
} else {
// Do sth. to init openGL code, if needed
}
}
return Static.metalAvailable
}
}
Create shader in Sprite Kit
Create the shader as usual using Sprite Kit.
let shaderContainer = SKSpriteNode()
shaderContainer.position = CGPoint(x:self.frame.size.width/2, y:self.frame.size.height/2)
shaderContainer.size = CGSize(width:self.frame.size.width, height:self.frame.size.height)
self.backgroundNode.addChild(shaderContainer)
let bgShader:SKShader
// Test if metal is available
if self.metalAvailable {
bgShader = SKShader(fileNamed:"plasma.fsh")
} else {
NSLog("Falling back to openGL")
bgShader = SKShader(fileNamed:"plasmaGL.fsh")
}
// Add your uniforms. OpenGL needs the size of the frame to normalize
// The coordinates. This is why we always use the size uniform
bgShader.uniforms = [
SKUniform(name: "size", floatVector2:GLKVector2Make(1920.0, 1024.0))
]
shaderContainer.shader = bgShader
As you can see, depending on the detected configuration a different shader file is loaded. The OpenGL shader needs an additional uniform for the size, because the symbol v_tex_coord is not available in OpenGL. If you don't use the size uniform in the Metal shader, you can move the uniforms statement into the if block or just ignore it; Metal does not complain if you don't use it.
Metal shader: plasma.fsh
#define M_PI 3.1415926535897932384626433832795
#define frequency 1 // Metal is less sensitive to number types.
#define colorDepth 2 // Numbers without a decimal point cause problems with openGL
void main(void) {
vec2 uv = v_tex_coord; // Normalized coordinates in Metal shaders
float red = ((sin((uv.x + u_time * 0.01) * M_PI * frequency) * cos((uv.y + u_time * 0.03) * M_PI * frequency) + 1) / colorDepth) + (colorDepth / 2.75) - (2 / 2.75);
gl_FragColor = vec4(red, uv.x, u_time, 1.0);
}
In Metal shaders you can simply read normalized coordinates. You can use the size to reconstruct the image coordinates if you like. However Metal is more forgiving with decimal points. As you can see, some numbers don't have decimal points here.
Open GL shader: plasmaGL.fsh
// OpenGL shaders NEED the decimal point in numbers, so never use 1 but 1. or 1.0
#define M_PI 3.1415926535897932384626433832795
#define frequency 1.0 // This number must have a decimal point
#define colorDepth 2.0 // Same here.
void main(void) {
vec2 uv = gl_FragCoord.xy / size.xy; // Frame coordinates in openGL
// This formula is always using numbers with decimal points.
// Compare it to the metal shader. Two numbers of the metal
// have no decimal point. If you cut copy paste the metal shader
// formula to the GL shader it will not work!
float red = ((sin((uv.x + u_time * 0.01) * M_PI * frequency) * cos((uv.y + u_time * 0.03) * M_PI * frequency) + 1.0) / colorDepth) + (colorDepth / 2.75) - (2.0 / 2.75);
gl_FragColor = vec4(red, uv.x, u_time, 1.0);
}
Outlook
It is more work to test for both systems and to create two shaders, but as long as we are transitioning from GL to Metal this is a good way to decide which kind of shader should be used. The iOS simulator doesn't support Metal either, which means you can test the OpenGL behavior with the iOS and tvOS simulators.
If you develop for Apple TV, this approach is really handy, because the OpenGL shaders always work with Metal; you just need to replace gl_FragCoord.xy / size.xy with v_tex_coord. If you run the code on the simulator, you will see the OpenGL code; if you run it on the Apple TV target, you'll see the smooth Metal shaders.
And another hint to all Swift developers: never ever forget the semicolon at the end of each line in shaders ;-)
Another trap is casting.
Metal:
int intVal = (int) uv.x;
float a = (float) intVal;
Open GL:
int intVal = int(uv.x);
float a = float(intVal);
I hope this helps someone.
Cheers,
Jack

How to blur the outcome of a fragment shader?

I'm working on a shader that generates little clouds based on some mask images. Right now it works well, but I feel the result is missing something, and I thought a blur would be nice. I remember a basic blur algorithm where you apply a convolution between an image and a kernel of norm 1 (the bigger the kernel, the stronger the blur). The thing is, I don't know how to treat the current output of the shader as an image. So basically I want to keep the shader as is, but make its result blurry. Any ideas? How can I integrate the convolution algorithm into the shader? Or does anyone know of another algorithm?
Cg code:
float Luminance( float4 Color ){
return 0.6 * Color.r + 0.3 * Color.g + 0.1 * Color.b;
}
struct v2f {
float4 pos : SV_POSITION;
float2 uv_MainTex : TEXCOORD0;
};
float4 _MainTex_ST;
v2f vert(appdata_base v) {
v2f o;
o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
o.uv_MainTex = TRANSFORM_TEX(v.texcoord, _MainTex);
return o;
}
sampler2D _MainTex;
sampler2D _Gradient;
sampler2D _NoiseO;
sampler2D _NoiseT;
float4 frag(v2f IN) : COLOR {
half4 nO = tex2D (_NoiseO, IN.uv_MainTex);
half4 nT = tex2D (_NoiseT, IN.uv_MainTex);
float4 turbulence = nO + nT;
float lum = Luminance(turbulence);
half4 c = tex2D (_MainTex, IN.uv_MainTex);
if (lum >= 1.0f){
float pos = lum - 1.0f;
if( pos > 0.98f ) pos = 0.98f;
if( pos < 0.02f ) pos = 0.02f;
float2 texCord = (pos, pos);
half4 turb = tex2D (_Gradient, texCord);
//turb.a = 0.0f;
return turb;
}
else return c;
}
It appears to me that this shader is emulating alpha testing between a backbuffer-like texture (passed via the sampler2D _MainTex) and a generated cloud luminance (represented by float lum) mapped onto a gradient. This makes things trickier because you can't just fake a blur and let alpha blending take care of the rest. You'll also need to change your alpha testing routine to emulate an alpha blend instead or restructure your rendering pipeline accordingly. We'll deal with blurring the clouds first.
The first question you need to ask yourself is if you need a screen-space blur. Seeing the mechanics of this fragment shader, I would think not -- you want to blur the clouds on the actual model. Given this, it should be sufficient to blur the underlying textures to get a blurred result -- except you're emulating alpha clipping, so you'll get rough edges. The question is what to do about those rough edges. That's where alpha blending comes in.
You can emulate alpha blending by using a lerp (linear interpolation) between the turb color and c color with lerp() function (depending on which shader language you're using). You'll probably want something that looks like return lerp(c, turb, 1 - pos); instead of return turb; ... I'd expect you'll want to tweak this continually until you understand and start getting the results you want. (For example, you may prefer lerp(c, turb, 1 - pow(pos,4)))
In fact, you can try this last step (just adding the lerp) before modifying your textures to get an idea of what the alpha blending will do for you.
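For instance, the tail end of the original frag() might then look like this (untested; the 1 - pos factor is just the starting guess mentioned above):
half4 turb = tex2D (_Gradient, texCord);
// Blend toward the underlying color instead of hard-returning the gradient.
return lerp(c, turb, 1 - pos);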
Edit: I hadn't considered the case where the _NoiseO and _NoiseT samplers were changing continually, so simply telling you to blur them was minimally useful advice. You can emulate blurring by using a multi-tap filter. The simplest way is to take uniformly spaced samples, weight them, and sum them together to get your final color. (Typically you'll want the weights themselves to sum to 1.)
This being said, you may or may not want to do this on the _NoiseO and _NoiseT textures themselves -- you may want to create a screen-space blur instead, which may look more interesting to a viewer. In this case the same concept applies, but you need to calculate the offset coordinates for each tap and then perform a weighted summation.
For example, if we were going with the first case and we wanted to sample from the _NoiseO sampler and blur it slightly, we could use this box filter (where all the weights are the same and sum to 1, thus performing an average):
// Untested code.
half4 nO = 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2( 0, 0))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2( 0, g_offset.y))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x, 0))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x, g_offset.y));
Alternatively, if we wanted the entire cloud output to appear blurry we'd wrap the cloud generation portion in a function and call it instead of tex2D() for the taps.
// More untested code.
half4 genCloud(float2 tc) {
half4 nO = tex2D (_NoiseO, tc);
half4 nT = tex2D (_NoiseT, tc);
float4 turbulence = nO + nT;
float lum = Luminance(turbulence);
float pos = lum - 1.0;
if( pos > 0.98f ) pos = 0.98f;
if( pos < 0.02f ) pos = 0.02f;
float2 texCord = (pos, pos);
half4 turb = tex2D (_Gradient, texCord);
// Figure out how you'd generate your alpha blending constant here for your lerp
turb.a = ACTUAL_ALPHA;
return turb;
}
And the multi-tap filtering would look like:
// And even more untested code.
half4 cloudcolor = 0.25 * genCloud(IN.uv_MainTex + float2( 0, 0))
+ 0.25 * genCloud(IN.uv_MainTex + float2( 0, g_offset.y))
+ 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x, 0))
+ 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x, g_offset.y));
return lerp(c, cloudcolor, cloudcolor.a);
However, doing this is going to be relatively slow if you make the cloud function too complex. If you're bound by raster operations and texture reads (transferring texture/buffer data to and from memory), chances are this won't matter much unless you use a much more advanced blurring technique (such as successive downsampling through ping-ponged buffers, useful for blurs/filters that are expensive because they have lots of taps). But performance is an entirely separate consideration from just getting the look you want.
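If the box filter looks too crude, the same multi-tap pattern works with Gaussian-style weights; an untested sketch, still assuming the same g_offset texel-offset uniform as above:
// 3x3 weights that sum to 1 (0.25 / 0.5 / 0.25 per axis).
float weights[3] = { 0.25, 0.5, 0.25 };
half4 cloudcolor = 0;
for (int i = -1; i <= 1; i++)
{
    for (int j = -1; j <= 1; j++)
    {
        float w = weights[i + 1] * weights[j + 1];
        cloudcolor += w * genCloud(IN.uv_MainTex + float2(i * g_offset.x, j * g_offset.y));
    }
}
return lerp(c, cloudcolor, cloudcolor.a);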

DirectX 11 Compute Shader - not writing all values

I am trying some experiments in fractal rendering with DirectX11 Compute Shaders.
The provided example runs on a FeatureLevel_10 device.
My RWStructured output buffer has a data format of R32G32B32A32_FLOAT.
The problem is that when writing to the buffer, it seems that only the ALPHA (w) value gets written, nothing else...
Here is the shader code:
struct BufType
{
float4 value;
};
cbuffer ScreenConstants : register(b0)
{
float2 ScreenDimensions;
float2 Padding;
};
RWStructuredBuffer<BufType> BufferOut : register(u0);
[numthreads(1, 1, 1)]
void Main( uint3 DTid : SV_DispatchThreadID )
{
uint index = DTid.y * ScreenDimensions.x + DTid.x;
float minRe = -2.0f;
float maxRe = 1.0f;
float minIm = -1.2;
float maxIm = minIm + ( maxRe - minRe ) * ScreenDimensions.y / ScreenDimensions.x;
float reFactor = (maxRe - minRe ) / (ScreenDimensions.x - 1.0f);
float imFactor = (maxIm - minIm ) / (ScreenDimensions.y - 1.0f);
float cim = maxIm - DTid.y * imFactor;
uint maxIterations = 30;
float cre = minRe + DTid.x * reFactor;
float zre = cre;
float zim = cim;
bool isInside = true;
uint iterationsRun = 0;
for( uint n = 0; n < maxIterations; ++n )
{
float zre2 = zre * zre;
float zim2 = zim * zim;
if ( zre2 + zim2 > 4.0f )
{
isInside = false;
iterationsRun = n;
}
zim = 2 * zre * zim + cim;
zre = zre2 - zim2 + cre;
}
if ( isInside )
{
BufferOut[index].value = float4(1.0f,0.0f,0.0f,1.0f);
}
}
The code actually produces, in a sense, the correct result (a 2D Mandelbrot set), but it seems that somehow only the alpha value is touched and nothing else is written, although the pixels inside the set should be colored red... (the image is black & white).
Does anybody have a clue what's going on here?
After some fiddling around I found the problem.
I have not found any documentation from MS mentioning this, so it could also be an Nvidia-specific driver issue.
Apparently you are only allowed to write ONCE per compute shader invocation to the same element in a RWStructuredBuffer. And you also HAVE to write ONCE.
I changed the code to accumulate the correct color in a local variable, and now write it only once at the end of the shader.
Everything works perfectly now that way.
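A sketch of what that restructuring looks like (untested; the Mandelbrot iteration itself is unchanged from the listing above and is elided here, since only the write pattern matters):
[numthreads(1, 1, 1)]
void Main( uint3 DTid : SV_DispatchThreadID )
{
    uint index = DTid.y * ScreenDimensions.x + DTid.x;
    bool isInside = true;
    // ... same Mandelbrot iteration as above, updating isInside ...
    // Accumulate the final color in a local variable...
    float4 color = float4(0.0f, 0.0f, 0.0f, 1.0f);
    if (isInside)
        color = float4(1.0f, 0.0f, 0.0f, 1.0f);
    // ...and write to the RWStructuredBuffer exactly once per invocation.
    BufferOut[index].value = color;
}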
I'm not sure, but shouldn't the BufferOut declaration be:
RWStructuredBuffer<BufType> BufferOut : register(u0);
instead of :
RWStructuredBuffer BufferOut : register(u0);
If you are only using a float4 write target, why not use just:
RWBuffer<float4> BufferOut : register (u0);
Maybe this could help.
After playing around again today, I ran into the same problem once more.
The following code produced all white output:
[numthreads(1, 1, 1)]
void Main( uint3 dispatchId : SV_DispatchThreadID )
{
float4 color = float4(1.0f,0.0f,0.0f,1.0f);
WriteResult(dispatchId,color);
}
The WriteResult method is a utility method from my HLSL standard library.
Long story short: after I upgraded from driver version 192 to 195 (beta), the problem went away.
It seems like the drivers still have some definite problems with compute shader support, so beware.
From what I've seen, compute shaders are only useful if you need a more general computational model than the traditional pixel shader, or if you can load data and then share it between threads in fast shared memory. I'm fairly sure you would get better performance with a pixel shader for the Mandelbrot shader.
On my setup (Win7, Feb '10 DX SDK, GTX 480) my compute shaders have a punishing setup time of over 0.2-0.3 ms (binding an SRV and a UAV and then calling dispatch()).
If you do a PS implementation, please post your experiences.
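For reference, a pixel-shader port would look roughly like this (a sketch using the same constants as the compute version; it assumes the shader is drawn over a full-screen quad/triangle, which is not part of the original code):
cbuffer ScreenConstants : register(b0)
{
    float2 ScreenDimensions;
    float2 Padding;
};

float4 MandelbrotPS(float4 pos : SV_Position) : SV_Target
{
    float minRe = -2.0f;
    float maxRe = 1.0f;
    float minIm = -1.2f;
    float maxIm = minIm + (maxRe - minRe) * ScreenDimensions.y / ScreenDimensions.x;
    float reFactor = (maxRe - minRe) / (ScreenDimensions.x - 1.0f);
    float imFactor = (maxIm - minIm) / (ScreenDimensions.y - 1.0f);

    // SV_Position carries pixel coordinates, so pos.xy plays the role of DTid.xy.
    float cre = minRe + pos.x * reFactor;
    float cim = maxIm - pos.y * imFactor;

    float zre = cre;
    float zim = cim;
    bool isInside = true;
    for (uint n = 0; n < 30; ++n)
    {
        float zre2 = zre * zre;
        float zim2 = zim * zim;
        if (zre2 + zim2 > 4.0f)
            isInside = false;
        zim = 2 * zre * zim + cim;
        zre = zre2 - zim2 + cre;
    }
    return isInside ? float4(1.0f, 0.0f, 0.0f, 1.0f) : float4(0.0f, 0.0f, 0.0f, 1.0f);
}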
I have no direct experience with DX compute shaders but...
Why are you setting alpha = 1.0?
IIRC, that makes the pixel 100% transparent, so your inside pixels are transparent red, and show up as whatever color was drawn behind them.
When alpha = 1.0, the RGB components are never used.
