Severe artifact when interpolating between dual quaternions - matrix

I'm having trouble with my implementation of dual quaternion skinning. I'm still learning about the subject, so for the moment I convert from the bone matrix to a dual quaternion on the CPU side, and back to a matrix in the shader.
The conversion apparently works correctly for single bones, but if I try to linearly blend between dual quaternions, I get this artifact:
http://imagizer.imageshack.us/a/img838/8671/nun.gif
I don't know what's causing this. Maybe it's in how I normalize the dual quaternion, maybe it's in how I convert from the dual quaternion back to a matrix. I've tried searching for actual dual quaternion code, but all I find is hard-to-read mathematical definitions.
I'm including the relevant pieces of the shader code, as I'm fairly sure that's where the problem is. Hopefully somebody proficient in quaternion math can look through it!
Blending the dual quaternions. boneWeight2 = (1.0 - boneWeight1), so the two weights always sum to 1:
vec4 blendReal = boneReal[bone1] * boneWeight1 + boneReal[bone2] * boneWeight2; // real (rotation) part
vec4 blendDual = boneDual[bone1] * boneWeight1 + boneDual[bone2] * boneWeight2; // dual (translation) part
float blend_norm_real = length(blendReal);
blendReal /= blend_norm_real; // normalize both parts by the norm of the real part
blendDual /= blend_norm_real;
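For reference, a common refinement in dual quaternion linear blending is to force the two real parts into the same hemisphere before blending, so the blend takes the shortest path; a minimal sketch of that variant of the blend above (this may or may not be related to the artifact):
// Flip the second dual quaternion when the real parts point into opposite hemispheres.
float flip = (dot(boneReal[bone1], boneReal[bone2]) < 0.0) ? -1.0 : 1.0;
vec4 blendReal = boneReal[bone1] * boneWeight1 + boneReal[bone2] * (boneWeight2 * flip);
vec4 blendDual = boneDual[bone1] * boneWeight1 + boneDual[bone2] * (boneWeight2 * flip);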
Create matrix from dual quaternion:
mat4 MatFromDualQuat(vec4 rq, vec4 dq)
{
    //Source: Section 3.4 of http://www.seas.upenn.edu/~ladislav/papers/sdq-i3d07/sdq-i3d07.pdf
    //rq = real part (the rotation quaternion)
    //dq = dual part (encodes the translation)
    //Note: GLSL matrices are indexed M[column][row].
    mat4 M;
    M[0][0] = 1.0 - 2.0 * (rq.y * rq.y + rq.z * rq.z);
    M[1][0] = 2.0 * (rq.x * rq.y + rq.w * rq.z);
    M[2][0] = 2.0 * (rq.w * rq.y - rq.x * rq.z);
    M[3][0] = 0.0;
    M[0][1] = 2.0 * (rq.x * rq.y - rq.w * rq.z);
    M[1][1] = 1.0 - 2.0 * (rq.x * rq.x + rq.z * rq.z);
    M[2][1] = 2.0 * (rq.y * rq.z + rq.w * rq.x);
    M[3][1] = 0.0;
    M[0][2] = - 2.0 * (rq.x * rq.z + rq.w * rq.y);
    M[1][2] = 2.0 * (rq.y * rq.z - rq.w * rq.x);
    M[2][2] = 1.0 - 2.0 * (rq.x * rq.x + rq.y * rq.y);
    M[3][2] = 0.0;
    M[0][3] = 2.0 * (-dq.w * rq.x + dq.x * rq.w + dq.z * rq.y - dq.y * rq.z);
    M[1][3] = 2.0 * (-dq.w * rq.y + dq.y * rq.w + dq.x * rq.z - dq.z * rq.x);
    M[2][3] = 2.0 * (-dq.w * rq.z + dq.z * rq.w + dq.y * rq.x - dq.x * rq.y);
    M[3][3] = 1.0;
    return M;
}
And then I multiply that with the bind pose vertex position.
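A minimal sketch of that last step, assuming the legacy built-ins gl_Vertex and gl_ModelViewProjectionMatrix; note that with the translation stored via M[i][3] (the bottom row under GLSL's column-major indexing), the vertex has to multiply from the left:
// Skin the bind-pose vertex with the blended matrix, then project as usual.
mat4 skin = MatFromDualQuat(blendReal, blendDual);
gl_Position = gl_ModelViewProjectionMatrix * (gl_Vertex * skin);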

Related

Why is matrix multiplication row x row 4-5 times slower than row x column on Mali's GPU?

Recently I ran into a problem while using a compute shader to implement matrix multiplication, the common product C = AB. To make the memory accesses contiguous, I transposed the B matrix, thinking this would speed things up. However, when I measured the speed, the row x row form turned out to be several times slower than the row x column form. I've puzzled over this for a long time and can't understand it, so I'm writing the problem down to ask for help!
My environment: Mali-G77 (MediaTek Dimensity 1200)
A matrix dimensions: 4 x 2048 x 2048
B matrix dimensions: 4 x 2048 x 2048
Time comparison:
Row x row: about 9 s
Row x column: about 1.6 s
Column x column: about 3.3 s
Demo: https://github.com/yikox/ProfilerDemo
Shader code:
//compute shader
#version 310 es
#define XLOCAL 8
#define YLOCAL 8
#define ZLOCAL 1
layout(binding = 0) writeonly buffer soutput{
    vec4 data[];
} uOutput;
layout(binding = 1) readonly buffer sinput0{
    vec4 data[];
} uInput0;
layout(binding = 2) readonly buffer sinput1{
    vec4 data[];
} uInput1;
layout(location=3) uniform ivec4 uInputSize0;
layout(location=4) uniform ivec4 uInputSize1;
layout(location=5) uniform ivec4 uOutputSize;
layout (local_size_x = XLOCAL, local_size_y = YLOCAL, local_size_z = ZLOCAL) in;
//The i-th element of one column of the product of matrix A and matrix B
vec4 PixelMul(int i, ivec3 pos)
{
    // row x row
    // vec4 data0 = uInput0.data[i + pos.y * uInputSize0.x + pos.z * uInputSize0.x * uInputSize0.y];
    // vec4 data1 = uInput1.data[i + pos.x * uInputSize1.y + pos.z * uInputSize1.x * uInputSize1.y];
    // row x column
    // vec4 data0 = uInput0.data[i + pos.y * uInputSize0.x + pos.z * uInputSize0.x * uInputSize0.y];
    // vec4 data1 = uInput1.data[pos.x + i * uInputSize1.y + pos.z * uInputSize1.x * uInputSize1.y];
    // column x column
    vec4 data0 = uInput0.data[pos.y + i * uInputSize0.x + pos.z * uInputSize0.x * uInputSize0.y];
    vec4 data1 = uInput1.data[pos.x + i * uInputSize1.y + pos.z * uInputSize1.x * uInputSize1.y];
    return data0 * data1;
}
void main()
{
    // Each invocation computes a 2x2 block of vec4 outputs.
    ivec3 pos = ivec3(gl_GlobalInvocationID) * ivec3(2, 2, 1);
    if(all(lessThan(pos, uOutputSize.xyz)))
    {
        vec4 outData00 = vec4(0);
        vec4 outData01 = vec4(0);
        vec4 outData10 = vec4(0);
        vec4 outData11 = vec4(0);
        for(int i = 0; i < uInputSize0.x; i++)
        {
            outData00 += PixelMul(i, pos + ivec3(0, 0, 0));
            outData01 += PixelMul(i, pos + ivec3(1, 0, 0));
            outData10 += PixelMul(i, pos + ivec3(0, 1, 0));
            outData11 += PixelMul(i, pos + ivec3(1, 1, 0));
        }
        uOutput.data[pos.x + 0 + (pos.y + 0) * uOutputSize.x + pos.z * uOutputSize.x * uOutputSize.y] = outData00;
        uOutput.data[pos.x + 1 + (pos.y + 0) * uOutputSize.x + pos.z * uOutputSize.x * uOutputSize.y] = outData01;
        uOutput.data[pos.x + 0 + (pos.y + 1) * uOutputSize.x + pos.z * uOutputSize.x * uOutputSize.y] = outData10;
        uOutput.data[pos.x + 1 + (pos.y + 1) * uOutputSize.x + pos.z * uOutputSize.x * uOutputSize.y] = outData11;
    }
}
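To make the variants concrete: the only difference between them is the stride with which the inner loop walks each buffer. A sketch of the addressing (the batch term pos.z * width * height is omitted for brevity):
// row x row (B transposed): consecutive i read consecutive vec4s in both buffers
//   data0 = uInput0.data[i + pos.y * uInputSize0.x + ...]   // stride 1 per step of i
//   data1 = uInput1.data[i + pos.x * uInputSize1.y + ...]   // stride 1 per step of i
// row x column (B not transposed): B is walked down a column
//   data0 = uInput0.data[i + pos.y * uInputSize0.x + ...]   // stride 1 per step of i
//   data1 = uInput1.data[pos.x + i * uInputSize1.y + ...]   // stride uInputSize1.y per step of i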

3D Texture Rendering Using 2D Texture

I want to render a 3D texture in an OpenGL ES 2.0 environment, so I pack the 3D texture data into a 2D texture.
3D texture (256 * 256 * 100) -> 2D texture (2560 * 2560)
I think the two offsets should be equal:
offset = z3 * 256 * 256 + y3 * 256 + x3
offset = y2 * 2560 + x2
But the result is not good.
vec3 size3 = vec3(256.0, 256.0, 100.0); // 3D texture size
vec2 size2 = vec2(2560.0, 2560.0);      // 2D texture size
vec2 calc3dTo2d(vec3 coords) {
    vec3 offset3 = coords * size3;      // texel position in the 3D texture
    float offset = offset3.z * size3.x * size3.y + offset3.y * size3.x + offset3.x; // flat texel offset
    float y = floor(offset / size2.x) / size2.y; // normalized row in the 2D texture
    float x = fract(offset / size2.x);           // normalized column in the 2D texture
    return vec2(x, y);
}
What am I missing?
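For comparison, a minimal sketch of the same layout that first snaps to whole texel indices and samples at texel centers; this assumes nearest filtering and the sizes above, and the function name is hypothetical:
vec2 calc3dTo2dTexelCentered(vec3 coords) {
    vec3 texel3 = floor(coords * size3); // integer texel index in the 3D texture
    float offset = texel3.z * size3.x * size3.y + texel3.y * size3.x + texel3.x; // flat texel offset
    vec2 texel2 = vec2(mod(offset, size2.x), floor(offset / size2.x)); // 2D texel index
    return (texel2 + 0.5) / size2; // normalized coordinates at the texel center
}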

How does this 2d noise generation function work? Does it have a name?

I came across this 2D noise function in the Book of Shaders:
float noise(vec2 st) {
    vec2 integerPart = floor(st);
    vec2 fractionalPart = fract(st);
    // random() is the Book of Shaders' 2D hash, returning a float in [0, 1)
    float s00 = random(integerPart);
    float s01 = random(integerPart + vec2(0.0, 1.0));
    float s10 = random(integerPart + vec2(1.0, 0.0));
    float s11 = random(integerPart + vec2(1.0, 1.0));
    float dx1 = s10 - s00;
    float dx2 = s11 - s01; // unused below
    float dy1 = s01 - s00;
    float dy2 = s11 - s10;
    float alpha = smoothstep(0.0, 1.0, fractionalPart.x);
    float beta = smoothstep(0.0, 1.0, fractionalPart.y);
    return s00 + alpha * dx1 + (1.0 - alpha) * beta * dy1 + alpha * beta * dy2;
}
It is clear what this function does: it generates four random numbers at the vertices of a square, then interpolates them. What I am finding difficult is understanding why the interpolation (the s00 + alpha * dx1 + (1 - alpha) * beta * dy1 + alpha * beta * dy2 expression) works. How is it interpolating the four values when it does not seem to be symmetric in the x and y values?
If you expand the last line, it's:
return s00 * (1.0 - alpha) * (1.0 - beta) +
       s10 * alpha * (1.0 - beta) +
       s01 * (1.0 - alpha) * beta +
       s11 * alpha * beta;
which is symmetric in x and y. If you add up the weights:
alpha * beta + (1 - alpha) * beta + alpha * (1 - beta) + (1 - alpha) * (1 - beta)
= (alpha + 1 - alpha) * beta + (alpha + 1 - alpha) * (1 - beta)
= beta + (1 - beta)
= 1
so it's an affine combination of the values at the corners: bilinear interpolation with smoothstep-eased weights (the function as a whole is ordinary 2D value noise).
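The same interpolation can also be written with nested mix() calls, which makes the bilinear structure explicit; expanding this one-liner reproduces the four weights above:
float value = mix(mix(s00, s10, alpha), mix(s01, s11, alpha), beta); // lerp along x on both edges, then along y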

Three.js - Bend plane with spherical coordinates

I'm trying to bend a plane mesh using spherical coordinates. I took the formulas from Wikipedia, and it almost works!
But some vertices get positioned to the same place.
The plane mesh is placed at (0, 0, 0); you can see the result here:
Before: http://hpics.li/78c0871
After: http://hpics.li/19ada1a
And here is my code:
@radius = 4
@oPhi = 0
@oTheta = 0
projection: (vertice) ->
  # inverse orthographic projection: plane point (x, y) -> sphere of radius @radius
  p = Math.sqrt(vertice.x ** 2 + vertice.y ** 2)
  c = Math.asin(p / @radius)
  phi = Math.asin(Math.cos(c) * Math.sin(@oPhi) + (vertice.y * Math.sin(c) * Math.cos(@oPhi) / p))
  theta = @oTheta + Math.atan((vertice.x * Math.sin(c)) / (p * Math.cos(@oPhi) * Math.cos(c) - vertice.y * Math.sin(@oPhi) * Math.sin(c)))
  vertice.x = @radius * Math.sin(phi) * Math.cos(theta)
  vertice.z = @radius * Math.sin(phi) * Math.sin(theta)
  vertice.y = @radius * Math.cos(phi)
Thanks for the help!
Sorry about that, but the formulas are for the orthographic projection, as you can see here:
http://fr.wikipedia.org/wiki/Projection_orthographique
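For reference, the projection function above transcribes the inverse orthographic projection from that article, with $(\varphi_0, \lambda_0)$ = (@oPhi, @oTheta) as the projection center and $R$ = @radius:
$$\rho = \sqrt{x^2 + y^2}, \qquad c = \arcsin\left(\frac{\rho}{R}\right)$$
$$\varphi = \arcsin\left(\cos c \,\sin\varphi_0 + \frac{y \,\sin c \,\cos\varphi_0}{\rho}\right)$$
$$\lambda = \lambda_0 + \arctan\left(\frac{x \,\sin c}{\rho \,\cos\varphi_0 \,\cos c - y \,\sin\varphi_0 \,\sin c}\right)$$
Note that $\rho = 0$ (the center vertex) makes the $\varphi$ and $\lambda$ expressions divide by zero.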

edge detection on depth buffer [cel shading]

I am currently writing a cel shading shader, but I'm having issues with edge detection. I currently use the following code, which applies Laplacian edge detection to non-linear depth buffer values:
uniform sampler2D depth_tex;
void main(){
    vec4 color_out; // assumed to be filled with the scene color in code omitted here
    float znear = 1.0;
    float zfar = 50000.0;
    float depthm = texture2D(depth_tex, gl_TexCoord[0].xy).r;
    // make the lines thicker at close range
    float lineAmp = mix( 0.001, 0.0, clamp( (500.0 / (zfar + znear - ( 2.0 * depthm - 1.0 ) * (zfar - znear) )/2.0), 0.0, 1.0 ) );
    float depthn = texture2D(depth_tex, gl_TexCoord[0].xy + vec2( (0.002 + lineAmp)*0.625 , 0.0) ).r;
    depthn = depthn / depthm;
    float depths = texture2D(depth_tex, gl_TexCoord[0].xy - vec2( (0.002 + lineAmp)*0.625 , 0.0) ).r;
    depths = depths / depthm;
    float depthw = texture2D(depth_tex, gl_TexCoord[0].xy + vec2(0.0 , 0.002 + lineAmp) ).r;
    depthw = depthw / depthm;
    float depthe = texture2D(depth_tex, gl_TexCoord[0].xy - vec2(0.0 , 0.002 + lineAmp) ).r;
    depthe = depthe / depthm;
    float Contour = -4.0 + depthn + depths + depthw + depthe; // 5-point Laplacian stencil
    float lineAmp2 = 100.0 * clamp( depthm - 0.99, 0.0, 1.0);
    lineAmp2 = lineAmp2 * lineAmp2;
    Contour = (512.0 + lineAmp2 * 204800.0 ) * Contour;
    if(Contour > 0.15){
        Contour = (0.15 - Contour) / 1.5 + 0.5;
    } else {
        Contour = 1.0;
    }
    color_out.rgb = color_out.rgb * Contour;
    color_out.a = 1.0;
    gl_FragColor = color_out;
}
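For reference, since each neighbor sample is divided by depthm, the Contour term computed above is the 5-point Laplacian stencil of the depth, normalized by the center value:
$$\text{Contour} = \frac{d_n + d_s + d_w + d_e}{d_m} - 4 = \frac{d_n + d_s + d_w + d_e - 4\,d_m}{d_m} \approx \frac{\nabla^2 d}{d_m}$$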
But it is hackish [note the lineAmp2 term], and the details at large distances are lost. So I came up with another algorithm:
[Note that Laplacian edge detection is still in use.]
1. Get 5 samples from the depth buffer: depthm, depthn, depths, depthw, depthe, where depthm is exactly at the processed fragment, depthn is slightly above it, depths slightly below, etc.
2. Calculate their real coordinates in camera space (converting to linear depth along the way).
3. Compare each side sample to the middle sample by subtracting, normalize each difference by dividing by the distance between the two camera-space points, and add up all four results. In theory this should help in the situation where, at large distances from the camera, two fragments are very close on the screen but very far apart in camera space, which is fatal for linear depth testing.
where:
2.a Convert the non-linear depth to linear depth using the method from http://stackoverflow.com/questions/6652253/getting-the-true-z-value-from-the-depth-buffer
Exact code:
uniform sampler2D depthBuffTex;
uniform float zNear;
uniform float zFar;
varying vec2 vTexCoord;
void main(void)
{
    float z_b = texture2D(depthBuffTex, vTexCoord).x; // raw depth-buffer value in [0, 1]
    float z_n = 2.0 * z_b - 1.0;                      // NDC depth in [-1, 1]
    float z_e = 2.0 * zNear * zFar / (zFar + zNear - z_n * (zFar - zNear)); // linear eye-space depth
}
2.b Convert the screen coordinates to [tan a, tan b], where a is the horizontal angle and b the vertical one. There is probably better terminology involving spherical coordinates, but I don't know it yet.
2.c Create a 3D vector (converted screen coordinates, 1.0) and scale it by the linear depth. I assume this gives the estimated camera-space coordinates of the fragment. It looks like it does. (A compact sketch of steps 2.a-2.c follows below.)
3.a Each difference is as follows: (depthm - sidedepth) / length(positionm - sideposition)
And I may have messed up something at any point. The code looks fine to me, but the algorithm may not be, as I made it up myself.
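For what it's worth, steps 2.a through 2.c can be written compactly as a single helper; a minimal sketch, assuming a symmetric frustum (tanHalfFov, aspect, and the function name are hypothetical, not from the original code):
// Reconstruct a camera-space position from a depth-buffer sample.
vec3 cameraSpacePos(sampler2D depthTex, vec2 uv, float zNear, float zFar, float tanHalfFov, float aspect)
{
    float z_b = texture2D(depthTex, uv).x;                                  // raw depth in [0, 1]
    float z_n = 2.0 * z_b - 1.0;                                            // NDC depth in [-1, 1]
    float z_e = 2.0 * zNear * zFar / (zFar + zNear - z_n * (zFar - zNear)); // step 2.a: linear depth
    vec2 ndc = uv * 2.0 - 1.0;                                              // screen position in [-1, 1]
    vec2 ray = ndc * vec2(tanHalfFov * aspect, tanHalfFov);                 // step 2.b: [tan a, tan b]
    return vec3(ray, 1.0) * z_e;                                            // step 2.c: scale view ray by depth
}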
My code:
uniform sampler2D depth_tex;
uniform vec2 distort; // assumed input; `distort` is not declared in this excerpt
void main(){
    vec4 color_out;
    float znear = 1.0;
    float zfar = 10000000000.0;
    float depthm = texture2D(depth_tex, gl_TexCoord[0].xy + distort ).r;
    depthm = 2.0 * zfar * znear / (zfar + znear - ( 2.0 * depthm - 1.0 ) * (zfar - znear) ); //convert to linear
    vec2 scorm = (gl_TexCoord[0].xy + distort) - 0.5; //conversion to the desired coordinate space; this returns values in (-0.5, 0.5)
    scorm = scorm * 2.0 * 0.5; //normalize to (-1, 1) and multiply by tan(FOV/2); the default FOV is IIRC 60 degrees
    scorm.x = scorm.x * 1.6; //1.6 is the aspect ratio 16/10
    vec3 posm = vec3( scorm, 1.0 );
    posm = posm * depthm; //scale by linearized depth

    float depthn = texture2D(depth_tex, gl_TexCoord[0].xy + distort + vec2( 0.002*0.625 , 0.0) ).r; //0.625 is the aspect ratio 10/16
    depthn = 2.0 * zfar * znear / (zfar + znear - ( 2.0 * depthn - 1.0 ) * (zfar - znear) );
    vec2 scorn = (gl_TexCoord[0].xy + distort + vec2( 0.002*0.625, 0.0) ) - 0.5;
    scorn = scorn * 2.0 * 0.5;
    scorn.x = scorn.x * 1.6;
    vec3 posn = vec3( scorn, 1.0 );
    posn = posn * depthn;

    float depths = texture2D(depth_tex, gl_TexCoord[0].xy + distort - vec2( 0.002*0.625 , 0.0) ).r;
    depths = 2.0 * zfar * znear / (zfar + znear - ( 2.0 * depths - 1.0 ) * (zfar - znear) );
    vec2 scors = (gl_TexCoord[0].xy + distort - vec2( 0.002*0.625, 0.0) ) - 0.5;
    scors = scors * 2.0 * 0.5;
    scors.x = scors.x * 1.6;
    vec3 poss = vec3( scors, 1.0 );
    poss = poss * depths;

    float depthw = texture2D(depth_tex, gl_TexCoord[0].xy + distort + vec2(0.0 , 0.002) ).r;
    depthw = 2.0 * zfar * znear / (zfar + znear - ( 2.0 * depthw - 1.0 ) * (zfar - znear) );
    vec2 scorw = ( gl_TexCoord[0].xy + distort + vec2( 0.0 , 0.002) ) - 0.5;
    scorw = scorw * 2.0 * 0.5;
    scorw.x = scorw.x * 1.6;
    vec3 posw = vec3( scorw, 1.0 );
    posw = posw * depthw;

    float depthe = texture2D(depth_tex, gl_TexCoord[0].xy + distort - vec2(0.0 , 0.002) ).r;
    depthe = 2.0 * zfar * znear / (zfar + znear - ( 2.0 * depthe - 1.0 ) * (zfar - znear) );
    vec2 score = ( gl_TexCoord[0].xy + distort - vec2( 0.0 , 0.002) ) - 0.5;
    score = score * 2.0 * 0.5;
    score.x = score.x * 1.6;
    vec3 pose = vec3( score, 1.0 );
    pose = pose * depthe;

    float Contour = ( depthn - depthm )/length(posm - posn) + ( depths - depthm )/length(posm - poss) + ( depthw - depthm )/length(posm - posw) + ( depthe - depthm )/length(posm - pose);
    Contour = 0.25 * Contour;

    color_out.rgb = vec3( Contour, Contour, Contour );
    color_out.a = 1.0;
    gl_FragColor = color_out;
}
The exact issue with the second version is that it exhibits some awful artifacts at larger distances.
My goal is to make either of them work properly. Are there any tricks I could use to improve precision/quality with both the linearized and the non-linearized depth buffer? Is anything wrong with my algorithm for the linearized depth buffer?
