Is OpenCL known to generate corrupt code? - macOS

I have a small OpenCL kernel that writes to a shared GL texture. I have separated the different stages of the computation into several functions. Every function gets a pointer to the final color and passes it along where needed. In the code fragment below there is a line marked UNREACHABLE. For some reason it does get executed: whatever color I put there appears in the final image. How is that possible?
If I duplicate the same code block right below it, the duplicate does not misbehave; only the first one does. :(
To make things even funnier, if I change the code above it (e.g. add another multiplication), the UNREACHABLE line gets executed at random.
Hence my questions: Is this a compiler bug? Am I exhausting some memory or register budget that I should be aware of? Are OpenCL compilers buggy in general?
void sample(float4 *color) {
    ...
    float4 r_color = get_color(...);
    float factor = r_color.w + (*color).w - 1.0f;
    r_color = r_color * ((r_color.w - factor) / r_color.w);
    *color += r_color;
    if(color->w >= 1.0f) {
        if(color->w <= 0.0f) {
            (*color) = (float4)(0.0f, 0.0f, 0.0f, 1.0f); //UNREACHABLE?
            return;
        }
    }
    ...
}
...
__kernel void render(
    __write_only image2d_t output_buffer,
    int width,
    int height
) {
    uint screen_x = get_global_id(0);
    uint screen_y = get_global_id(1);
    float4 color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
    sample(&color);
    write_imagef(output_buffer, (int2)(screen_x, screen_y), color);
}
My Platform:
Apple
Intel(R) Core(TM) i5-2415M CPU @ 2.30GHz
selected device has extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats cl_APPLE_command_queue_priority
[Edit]
After observing the values I get during the calculation, I now think r_color.w being exactly 0.0f after get_color may cause the problem: the division then yields infinity, and 0 * infinity is NaN, so color->w becomes NaN. I am still looking for a definitive statement saying that comparing against NaN is undefined, or always true, or something similar.
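To check that hypothesis on the host side, here is a minimal plain C++ sketch of my own (not the kernel itself; it assumes IEC 559 floats) that mirrors the arithmetic when r_color.w == 0.0f:

#include <cstdio>

int main() {
    // Mirror the kernel's arithmetic for r_color.w == 0.0f, color->w == 0.0f:
    float rw = 0.0f, cw = 0.0f;
    float factor = rw + cw - 1.0f;   // -1.0f
    rw = rw * ((rw - factor) / rw);  // 0 * (1/0) = 0 * inf = NaN under IEC 559
    cw += rw;                        // cw is now NaN

    // IEEE 754: ordered comparisons involving NaN are always false,
    // so under strict semantics neither branch in sample() is taken.
    std::printf("cw >= 1.0f : %d\n", cw >= 1.0f); // prints 0
    std::printf("cw <= 0.0f : %d\n", cw <= 0.0f); // prints 0
    return 0;
}

A compiler that assumes NaN never occurs (e.g. under relaxed/fast math, which OpenCL drivers may use) could fold or reorder those tests, which would explain the UNREACHABLE line firing.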
Also "Is opencl known to generate corrupt code?" has the invisible postfix "or what am I missing".
I used to work with embedded systems where the vendor would provide their own proprietary compilers, which in turn were known to break your code. So I want to get this off the table if possible. I suspect that clang would not do that. But you never know.

Related

Using an SSBO in Qt3D

I can't get an SSBO working using Qt3D, and I'm unable to find a single example showing how it is supposed to be done.
Here are the main parts of my code:
Buffer init:
QByteArray ssboData;
ssboData.resize(1000);
ssboData.fill(0);
mySSBOBuffer = new Qt3DRender::QBuffer(this);
mySSBOBuffer->setUsage(Qt3DRender::QBuffer::DynamicRead);
mySSBOBuffer->setAccessType(Qt3DRender::QBuffer::AccessType::ReadWrite);
mySSBOBuffer->setData(ssboData);
QByteArray atomicCounterData;
atomicCounterData.resize(4);
atomicCounterData.fill(0);
myAtomicCounterBuffer = new Qt3DRender::QBuffer(this);
myAtomicCounterBuffer->setUsage(Qt3DRender::QBuffer::DynamicRead);
myAtomicCounterBuffer->setAccessType(Qt3DRender::QBuffer::AccessType::ReadWrite);
myAtomicCounterBuffer->setData(atomicCounterData);
Passing the buffers to the shader as QParameters:
myMaterial->addParameter(new Qt3DRender::QParameter("acCountFrags", QVariant::fromValue(myAtomicCounterBuffer->id()), myMaterial));
myMaterial->addParameter(new Qt3DRender::QParameter("ssboBuffer", QVariant::fromValue(mySSBOBuffer->id()), myMaterial));
I also tried
myMaterial->addParameter(new Qt3DRender::QParameter("acCountFrags", QVariant::fromValue(myAtomicCounterBuffer), myMaterial));
myMaterial->addParameter(new Qt3DRender::QParameter("ssboBuffer", QVariant::fromValue(mySSBOBuffer), myMaterial));
Fragment Shader (color has no use, just to check shader is working):
#version 430 core

layout(binding = 0, offset = 0) uniform atomic_uint acCountFrags;

layout(std430) buffer ssboBuffer
{
    uint fragIds[];
};

out vec4 color;

void main()
{
    uint index = atomicCounterIncrement(acCountFrags);
    fragIds[index] = 5;
    color = vec4(0.2, 0.2, 0.2, 1.0);
}
In all of my tries, nothing is written to the buffers after rendering; they are still full of zeros, as after init.
Does anybody know if I'm doing something wrong, or where I could find a working example?
Thank you.
The answer was a missing BufferCapture component in my FrameGraph. I found it thanks to the example given by HappyFeet in the comments.
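For future readers, a minimal sketch of that fix, assuming the leaf node of your existing FrameGraph is reachable as frameGraphLeaf (a placeholder name; adapt to your graph):

#include <Qt3DRender/QBufferCapture>
#include <QDebug>

// Parenting a QBufferCapture node into the FrameGraph asks Qt3D to read
// GPU-side buffer contents back into the QBuffer objects after rendering.
auto *capture = new Qt3DRender::QBufferCapture(frameGraphLeaf);

// Observe the SSBO data arriving on the CPU side.
QObject::connect(mySSBOBuffer, &Qt3DRender::QBuffer::dataChanged,
                 [](const QByteArray &bytes) {
                     qDebug() << "SSBO bytes:" << bytes.left(16).toHex();
                 });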

How to sample an SRV when MSAA x4 is enabled? - DirectX 11

I'm learning DX11 from Introduction_to_3D_Game_Programming_with_Directx_11.
Everything is OK without MSAA; when I enable it, my .fx and C++ code no longer work well. Has anyone experienced this too, and how do you deal with this situation?
Code before:
Texture2D gTexture1;

float4 BLEND_PS(VertexOut_SV pin) : SV_TARGET
{
    float4 texColor = float4(0.0f, 0.0f, 0.0f, 0.0f);
    texColor = gTexture1.Sample(SamAnisotropic, pin.Tex);
    return texColor;
}
Because I can't bind a texture created with MSAA to a Texture2D, I keep MSAA on at all times.
Code after:
Texture2DMS<float4> gTexture1;

float4 BLEND_PS(VertexOut_SV pin) : SV_TARGET
{
    float4 texColor = float4(0.0f, 0.0f, 0.0f, 0.0f);
    texColor = gTexture1.Load(int2(pin.Tex.x * 1400, pin.Tex.y * 900), 0);
    return texColor;
}
But texColor is not the right pixel I want. How do I sample an SRV with MSAA?
How do I convert a UAV without MSAA into an SRV with MSAA?
And how do I enable and disable MSAA in the C++ game code together with the corresponding HLSL code?
Do I have to keep a separate HLSL shader for each case?
For 'standard' MSAA use, you do the following:
When creating your swap chain and render target view, set DXGI_SWAP_CHAIN_DESC.SampleDesc.Count or DXGI_SWAP_CHAIN_DESC1.SampleDesc.Count to 2, 4, 8, etc.
When creating your depth buffer/stencil, you need to use the same sample count for D3D11_TEXTURE2D_DESC.SampleDesc.Count.
When creating your render target view, you need to use D3D11_RTV_DIMENSION_TEXTURE2DMS (or pass nullptr for the view description so it matches the resource exactly)
When creating your depth buffer/stencil view, you need to use D3D11_DSV_DIMENSION_TEXTURE2DMS (or pass nullptr for the view description so it matches the resource exactly)
When rendering, you need to use a rasterizer state with D3D11_RASTERIZER_DESC.MultisampleEnable set to TRUE.
See also the Simple rendering tutorial for DirectX Tool Kit
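Putting those steps together, a rough sketch (the variable names hwnd, width, height, backBuffer, depthTexture, rtv, and dsv are assumptions; error handling omitted):

// 4x MSAA swap chain (bit-blt style; see the flip-model note below).
DXGI_SWAP_CHAIN_DESC scd = {};
scd.BufferDesc.Width = width;
scd.BufferDesc.Height = height;
scd.BufferDesc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;
scd.SampleDesc.Count = 4;        // MSAA x4
scd.SampleDesc.Quality = 0;
scd.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
scd.BufferCount = 2;
scd.OutputWindow = hwnd;
scd.Windowed = TRUE;
scd.SwapEffect = DXGI_SWAP_EFFECT_DISCARD;

// Depth/stencil must use the same sample count.
D3D11_TEXTURE2D_DESC dsDesc = {};
dsDesc.Width = width;
dsDesc.Height = height;
dsDesc.MipLevels = 1;
dsDesc.ArraySize = 1;
dsDesc.Format = DXGI_FORMAT_D24_UNORM_S8_UINT;
dsDesc.SampleDesc.Count = 4;
dsDesc.Usage = D3D11_USAGE_DEFAULT;
dsDesc.BindFlags = D3D11_BIND_DEPTH_STENCIL;

// Passing nullptr view descriptions makes the views match the
// (multisampled) resources exactly.
device->CreateRenderTargetView(backBuffer, nullptr, &rtv);
device->CreateDepthStencilView(depthTexture, nullptr, &dsv);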
Sample count
Depending on the Direct3D feature level, some MSAA sample counts are required for particular render target formats. You can use CheckFormatSupport to verify that a render target format supports MSAA:
UINT formatSupport = 0;
if (FAILED(device->CheckFormatSupport(m_backBufferFormat, &formatSupport)))
{
    throw std::exception("CheckFormatSupport");
}

UINT flags = D3D11_FORMAT_SUPPORT_MULTISAMPLE_RESOLVE
    | D3D11_FORMAT_SUPPORT_MULTISAMPLE_RENDERTARGET;
if ((formatSupport & flags) != flags)
{
    // error
}
You then use CheckMultisampleQualityLevels to verify that the sample count is supported. This code finds the highest supported MSAA sample count for a particular format:
for (m_sampleCount = D3D11_MAX_MULTISAMPLE_SAMPLE_COUNT;
     m_sampleCount > 1; m_sampleCount--)
{
    UINT levels = 0;
    if (FAILED(device->CheckMultisampleQualityLevels(m_backBufferFormat,
        m_sampleCount, &levels)))
        continue;

    if (levels > 0)
        break;
}

if (m_sampleCount < 2)
{
    // error
}
You can also validate that the depth/stencil format you want to use supports D3D11_FORMAT_SUPPORT_DEPTH_STENCIL | D3D11_FORMAT_SUPPORT_MULTISAMPLE_RENDERTARGET.
Flip Style modes
The technique above only works for the older "bit-blt" style flip modes DXGI_SWAP_EFFECT_DISCARD or DXGI_SWAP_EFFECT_SEQUENTIAL. For UWP and DirectX 12 you are required to use DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL or DXGI_SWAP_EFFECT_FLIP_DISCARD, which will fail if you attempt to create a back buffer with a SampleCount > 1.
In this case, you create the back buffer with a SampleCount of 1 and create your own MSAA render target 2D texture. You point your render target view at the MSAA render target, and before you Present you call ResolveSubresource from the MSAA render target to the back buffer. This is exactly what DXGI did for you "behind the scenes" with the older flip models.
For gamma-correct rendering (i.e. when you use a back buffer format ending in _SRGB), the newer flip styles require that you use the non-SRGB equivalent for the back buffer format, or the swap chain creation will fail. You set the SRGB format on the render target view instead.
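A sketch of that flip-model path (again, msaaTexture, msaaRTV, backBuffer, width, and height are placeholder names):

// Our own 4x MSAA color target; the swap chain back buffer stays single-sampled.
D3D11_TEXTURE2D_DESC msaaDesc = {};
msaaDesc.Width = width;
msaaDesc.Height = height;
msaaDesc.MipLevels = 1;
msaaDesc.ArraySize = 1;
msaaDesc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;  // swap chain keeps the non-SRGB format
msaaDesc.SampleDesc.Count = 4;
msaaDesc.Usage = D3D11_USAGE_DEFAULT;
msaaDesc.BindFlags = D3D11_BIND_RENDER_TARGET;
device->CreateTexture2D(&msaaDesc, nullptr, &msaaTexture);

// Gamma-correct rendering: the _SRGB format goes on the view, not the buffer.
D3D11_RENDER_TARGET_VIEW_DESC rtvDesc = {};
rtvDesc.Format = DXGI_FORMAT_B8G8R8A8_UNORM_SRGB;
rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2DMS;
device->CreateRenderTargetView(msaaTexture, &rtvDesc, &msaaRTV);

// ... render the frame into msaaRTV ...

// Resolve into the single-sampled back buffer, then present.
context->ResolveSubresource(backBuffer, 0, msaaTexture, 0, DXGI_FORMAT_B8G8R8A8_UNORM);
swapChain->Present(1, 0);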

glGetQueryObjectuiv, "Bound query buffer is not large enough to store result."

I am trying to solve an error I get when I run this sample.
It concerns occlusion queries: essentially, it renders a square four times, changing the viewport each time, but only the middle two passes actually render anything, since the first and last viewports lie outside the monitor area on purpose.
viewports[0] = new Vec4(windowSize.x * -0.5f, windowSize.y * -0.5f, windowSize.x * 0.5f, windowSize.y * 0.5f);
viewports[1] = new Vec4(0, 0, windowSize.x * 0.5f, windowSize.y * 0.5f);
viewports[2] = new Vec4(windowSize.x * 0.5f, windowSize.y * 0.5f, windowSize.x * 0.5f, windowSize.y * 0.5f);
viewports[3] = new Vec4(windowSize.x * 1.0f, windowSize.y * 1.0f, windowSize.x * 0.5f, windowSize.y * 0.5f);
Each time, it calls glBeginQuery with a different query object, renders, and then ends the GL_ANY_SAMPLES_PASSED query:
// Samples count query
for (int i = 0; i < viewports.length; ++i) {
    gl4.glViewportArrayv(0, 1, viewports[i].toFA_(), 0);
    gl4.glBeginQuery(GL_ANY_SAMPLES_PASSED, queryName.get(i));
    {
        gl4.glDrawArraysInstanced(GL_TRIANGLES, 0, vertexCount, 1);
    }
    gl4.glEndQuery(GL_ANY_SAMPLES_PASSED);
}
Then I try to read the results:
gl4.glBindBuffer(GL_QUERY_BUFFER, bufferName.get(Buffer.QUERY));
IntBuffer params = GLBuffers.newDirectIntBuffer(1);
for (int i = 0; i < viewports.length; ++i) {
    params.put(0, i);
    gl4.glGetQueryObjectuiv(queryName.get(i), GL_QUERY_RESULT, params);
}
But I get:
GlDebugOutput.messageSent(): GLDebugEvent[ id 0x502
type Error
severity High: dangerous undefined behavior
source GL API
msg GL_INVALID_OPERATION error generated. Bound query buffer is not large enough to store result.
when 1455696348371
source 4.5 (Core profile, arb, debug, compat[ES2, ES3, ES31, ES32], FBO, hardware) - 4.5.0 NVIDIA 356.39 - hash 0x238337ea]
If I look at the API docs, they say:
params
If a buffer is bound to the GL_QUERY_RESULT_BUFFER target, then params is treated as an offset to a location within that buffer's data store to receive the result of the query. If no buffer is bound to GL_QUERY_RESULT_BUFFER, then params is treated as an address in client memory of a variable to receive the resulting data.
I guess there is an error in that sentence; I think they meant GL_QUERY_BUFFER instead of GL_QUERY_RESULT_BUFFER. Indeed, they use GL_QUERY_BUFFER here as well, for example.
Anyway, if anything is bound there, then params is interpreted as an offset, OK.
But my buffer is big enough:
gl4.glBindBuffer(GL_QUERY_BUFFER, bufferName.get(Buffer.QUERY));
gl4.glBufferData(GL_QUERY_BUFFER, Integer.BYTES * queryName.capacity(), null, GL_DYNAMIC_COPY);
gl4.glBindBuffer(GL_QUERY_BUFFER, 0);
So what's the problem?
I tried passing a big number, such as 500, for the buffer size, but no success...
I guess the error lies somewhere else; can you see it?
If I have to answer, I'd say I expect that if I bind a buffer to the GL_QUERY_BUFFER target, then OpenGL should read the value inside params and interpret it as the offset (in bytes) at which it should save the result of the query.
No, that's not how it works.
In C/C++, the value taken by glGetQueryObject is a pointer, which normally is a pointer to a client memory buffer. For this particular function, this would often be a stack variable:
GLuint val;
glGetQueryObjectuiv(obj, GL_QUERY_RESULT, &val);
val is declared by client code (i.e. the code calling into OpenGL). This code passes a pointer to that variable, and glGetQueryObjectuiv will write data through this pointer.
This is emulated in C# bindings by using *Buffer types. These represent contiguous arrays of values from which C# can extract a pointer that is compatible with C and C++ pointers-to-arrays.
However, when a buffer is bound to GL_QUERY_BUFFER, the meaning of the parameter changes. As you noted, it goes from being a client pointer to memory to being an offset. But please note what that says: it does not say a "client pointer to an offset".
That is, the pointer value itself ceases to be a pointer to actual memory. Instead, the numerical value of the pointer is treated as an offset.
In C++ terms, that's this:
glBindBuffer(GL_QUERY_BUFFER, buff);
glGetQueryObjectuiv(obj, GL_QUERY_RESULT, reinterpret_cast<void*>(16));
Note how it takes the offset of 16 bytes and pretends that this value is actually a void* whose numerical value is 16. That's what the reinterpret cast does.
How do you do that in C#? I have no idea; it would depend on the binding you're using, and you never specified what that was. Tao's long-since dead, and OpenTK looks to be heading that way too. But I did find out how to do this in OpenTK.
What you need to do is this:
gl4.glBindBuffer(GL_QUERY_BUFFER, bufferName.get(Buffer.QUERY));
for (int i = 0; i < viewports.length; ++i)
{
    gl4.glGetQueryObjectuiv(queryName.get(i), GL_QUERY_RESULT,
        (IntPtr)(i * Integer.BYTES));
}
You multiply by Integer.BYTES because the value is a byte offset into the buffer, not an integer index into an array of ints.
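If you then want the packed results back on the CPU, one option (a sketch; queryBuffer stands for the buffer object bound above) is a single glGetBufferSubData call:

// The four results now live in the buffer at byte offsets 0, 4, 8, 12.
GLuint results[4] = {};
glBindBuffer(GL_QUERY_BUFFER, queryBuffer);
glGetBufferSubData(GL_QUERY_BUFFER, 0, sizeof(results), results);
glBindBuffer(GL_QUERY_BUFFER, 0);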

Color Changing Sprites Cocos2d

I need my sprite to transition from one color to another, and on and on... like a blue tint, then green, then purple. But I cannot find any good actions for that, so I am wondering: should I use animations, or is there a built-in action for this?
You can use the CCTintTo action to change the color of the sprite:
[sprite runAction:[CCTintTo actionWithDuration:2 red:255 green:0 blue:0]];
Since I saw several questions about replacing pixel colours in sprites and didn't see any good solution (all the solutions only tint the colour, and none of them can change an array of colours without forcing you to create multiple image layers that compose the final image you want, i.e. one layer for pants, another for shoes, another for the shirt, another for the hair colour... and so on; note that those do have their advantages, like the ability to use accurate gradients), here is mine.
My solution allows you to change an array of colours, meaning you can have a single image with known colours (you don't want any gradients in this layer, only colours whose values you KNOW; this only applies to the colours you intend to change, other pixels can have any colour you want).
If you need gradients over the colours you change, create an additional image with only the shading and place it as a child of the sprite.
Also be aware that I am super-new to cocos2d/x (3 days), and that this code is written for cocos2d-x but can be ported to cocos2d easily.
Also note that I tested it only on iOS, not on Android; I am not sure how capable Android's official GCC is and how it will deal with the way I allocate _srcC and _dstC (variable-length arrays), but again, this is easily portable.
So here it goes:
cocos2d::CCSprite * spriteWithReplacedColors( const char * imgfilename, cocos2d::ccColor3B * srcColors, cocos2d::ccColor3B * dstColors, int numColors )
{
    CCSprite *theSprite = NULL;

    CCImage *theImage = new CCImage;
    if( theImage->initWithImageFile( imgfilename ) )
    {
        // make a color array which is easier to work with
        unsigned long _srcC[ numColors ];
        unsigned long _dstC[ numColors ];
        for( int c = 0; c < numColors; c++ )
        {
            // note: index with c everywhere (indexing .b with 0 was a bug)
            _srcC[c] = (srcColors[c].r << 0) | (srcColors[c].g << 8) | (srcColors[c].b << 16);
            _dstC[c] = (dstColors[c].r << 0) | (dstColors[c].g << 8) | (dstColors[c].b << 16);
        }

        unsigned char * rawData = theImage->getData();
        int width = theImage->getWidth();
        int height = theImage->getHeight();

        // replace the colors that need replacing
        unsigned int * b = (unsigned int *) rawData;
        for( int pixel = 0; pixel < width * height; pixel++ )
        {
            register unsigned int p = *b;
            for( int c = 0; c < numColors; c++ )
            {
                if( (p & 0x00FFFFFF) == _srcC[c] )
                {
                    *b = (p & 0xFF000000) | _dstC[c];
                    break;
                }
            }
            b++;
        }

        CCTexture2D *theTexture = new CCTexture2D();
        if( theTexture->initWithData(rawData, kCCTexture2DPixelFormat_RGBA8888, width, height, CCSizeMake(width, height)) )
        {
            theSprite = CCSprite::spriteWithTexture(theTexture);
        }
        theTexture->release();
    }
    theImage->release();

    return theSprite;
}
To use it, just do the following:
ccColor3B src[] = { ccc3( 255, 255, 255 ), ccc3( 0, 0, 255 ) };
ccColor3B dst[] = { ccc3(  77, 255,  77 ), ccc3( 255, 0, 0 ) };

// will change all whites to greens, and all blues to reds
CCSprite * pSprite = spriteWithReplacedColors( "character_template.png", src, dst, sizeof(src)/sizeof(src[0]) );
Of course, if you need speed, you would create a sprite extension that builds a pixel shader to do this hardware-accelerated at render time ;)
BTW: this solution might cause some artefacts on the edges in some cases, so you can create a large image and scale it down, letting GL minimise the artefacts.
You can also create "fix" layers with black outlines to hide the artefacts and place them on top, etc.
Also make sure you don't use these "key" colours on the parts of the image where you don't want the pixels changed.
Also keep in mind that the alpha channel is not changed, and that if you use base images with pure red/green/blue colours only, you can optimise this function to eliminate all edge artefacts automatically (avoiding, in many cases, the need for an additional shading layer) and do other cool stuff (multiplexing several images into a single bitmap; remember palette animation?).
Enjoy ;)
In case someone wants this for cocos2d-x, here is the code:
somesprite->runAction(TintTo::create(duration, color)); // TintTo::create(float duration, const Color3B &color3b)
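And since the original question asked for continuous transitions, a small cocos2d-x sketch (v3-style API; the durations and colors are just examples) that chains tints and repeats forever:

// Cycle blue -> green -> purple, forever.
auto cycle = cocos2d::Sequence::create(
    cocos2d::TintTo::create(1.0f, cocos2d::Color3B(0, 0, 255)),
    cocos2d::TintTo::create(1.0f, cocos2d::Color3B(0, 255, 0)),
    cocos2d::TintTo::create(1.0f, cocos2d::Color3B(128, 0, 128)),
    nullptr);
somesprite->runAction(cocos2d::RepeatForever::create(cycle));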

partially covered glut window doesn't redraw correctly after it is uncovered

Using freeglut on Windows 7 Home Ultimate with an ATI Mobility Radeon 5650 video card.
Code snippet:
void ResizeFunction(int width, int height)
{
    glViewport(0, 0, width, height);
}

void RenderFunction()
{
    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    // ...drawing code: based on some flag, I draw a triangle or a rectangle;
    // the flag is toggled on pressing the 't' or 'T' key

    glutSwapBuffers(); // double buffering is enabled
    glutPostRedisplay();
}

void KeyboardFunction(unsigned char key, int x, int y)
{
    switch(key)
    {
        case 't':
        case 'T':
        {
            flag = !flag;
            glutPostRedisplay();
            break;
        }
        default:
            break;
    }
}
Problem: the triangle or the rectangle is drawn covering the entire window the first time. But if I partially cover the glut window with another window (say, a Notepad window) and then uncover it, then when I toggle, the object is drawn only in the previously covered portion of the glut window. If I resize the glut window, drawing works correctly again.
Any help will be appreciated.
regards,
fs
GLUT only redraws the screen when you tell it to or when it decides to. That is, if you don't do anything in the window, the scene is not redrawn. Advantage: less CPU/GPU usage. Disadvantage: only good for non-animated applications.
If you want to update the screen constantly (as is done in applications with lots of animation, games for example), you can use glutIdleFunc:
http://www.opengl.org/resources/libraries/glut/spec3/node63.html
That is, at the beginning of the program, where you set all the glut callbacks, you also write:
glutIdleFunc(RenderFunction);
This way, when glut is idle, it keeps calling your render function.
If you want to render slower than possible (for example with a fixed frame rate), you could use a timer:
void RenderFunction()
{
    glutTimerFunc(YOUR_DELAY_IN_MS, RenderFunction, 0);
    /* rest of code */
}
and instead of glutIdleFunc(RenderFunction); you write
glutTimerFunc(YOUR_DELAY_IN_MS, RenderFunction, 0);
once, to call the render function a first time (you could also just call RenderFunction() directly); the function then keeps setting the timer for its next run.
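One wrinkle worth noting: glutTimerFunc callbacks take an int parameter, while display callbacks take none, so the cleanest variant is a tiny timer callback that just requests a redisplay and re-arms itself (a sketch; 16 ms is roughly 60 Hz):

// Timer callback: ask GLUT to redraw, then schedule the next tick.
static void TimerTick(int value)
{
    glutPostRedisplay();              // GLUT will call RenderFunction
    glutTimerFunc(16, TimerTick, 0);  // re-arm for ~60 Hz
}

// In main(), after glutDisplayFunc(RenderFunction):
glutTimerFunc(16, TimerTick, 0);      // start the cycle once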
As a side note, I suggest using SDL instead of glut.
