Basic programming sample of OpenCL from Apple fails to run on GPU - xcode

I started learning some basics about OpenCL a while ago and decided to give the "Basic programming sample" from Apple a go. It runs OK on the CPU, but when I select the GPU as the target device I get err = -45 from
err = gclExecKernelAPPLE(k, ndrange, &kargs);
This error code translates to CL_INVALID_PROGRAM_EXECUTABLE. Any idea how I can correct the sample code?
The automatically generated code looks like this (plus includes at the top):
static void initBlocks(void);
// Initialize static data structures
static block_kernel_pair pair_map[1] = {
static block_kernel_map bmap = { 0, 1, initBlocks, pair_map };
// Block function
void (^square_kernel)(const cl_ndrange *ndrange, cl_float* input, cl_float* output) =
^(const cl_ndrange *ndrange, cl_float* input, cl_float* output) {
int err = 0;
cl_kernel k =[0].kernel;
if (!k) {
k =[0].kernel;
if (!k)
gcl_log_fatal("kernel square does not exist for device");
kargs_struct kargs;
gclCreateArgsAPPLE(k, &kargs);
err |= gclSetKernelArgMemAPPLE(k, 0, input, &kargs);
err |= gclSetKernelArgMemAPPLE(k, 1, output, &kargs);
gcl_log_cl_fatal(err, "setting argument for square failed");
err = gclExecKernelAPPLE(k, ndrange, &kargs);
gcl_log_cl_fatal(err, "Executing square failed");
gclDeleteArgsAPPLE(k, &kargs);
// Initialization functions
static void initBlocks(void) {
const char* build_opts = " -cl-std=CL1.1";
static dispatch_once_t once;
^{ int err = gclBuildProgramBinaryAPPLE("OpenCL/", "", &bmap, build_opts);
if (!err) {
assert([0].block_ptr == square_kernel && "mismatch block");[0].kernel = clCreateKernel(bmap.program, "square", &err);
static void RegisterMap(void) {
gclRegisterBlockKernelMap(&bmap);[0].block_ptr = square_kernel;

I saw this same problem when running under 10.7.3, while a machine on 10.7.5 worked fine. I noticed the CVMCompiler process was crashing after each invocation of my app.
Inspecting the stack trace, I noticed it was crashing when trying to parse the bitcode for compilation into native code. Since the parsing of the bitcode failed, there was no resulting compiled program for gclExecKernelAPPLE() to execute, hence the error.
Try upgrading to 10.7.5, or indeed 10.8 and the problem should go away. (I just tested this and it does indeed fix the problem.)
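When chasing a failure like this one, it can help to log the symbolic name of the status code rather than the raw number. Here is a minimal sketch, assuming a hypothetical helper called clErrorName (it is not part of OpenCL or of Apple's gcl API). The numeric values are the ones defined in the standard cl.h header, and only a few codes related to building programs and running kernels are listed:

```cpp
#include <string>

// Hypothetical helper (not part of OpenCL): maps a few OpenCL status codes
// to the symbolic names they have in cl.h. Anything else is reported as unknown.
inline std::string clErrorName(int err)
{
    switch (err) {
        case 0:   return "CL_SUCCESS";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -42: return "CL_INVALID_BINARY";
        case -43: return "CL_INVALID_BUILD_OPTIONS";
        case -44: return "CL_INVALID_PROGRAM";
        case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
        case -46: return "CL_INVALID_KERNEL_NAME";
        case -48: return "CL_INVALID_KERNEL";
        default:  return "unknown OpenCL error " + std::to_string(err);
    }
}
```

Passing the err value from gclExecKernelAPPLE() to such a helper would turn the -45 seen in the question into CL_INVALID_PROGRAM_EXECUTABLE in the log.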


Crashes after parsing the equation

The application needs to parse an equation given as a string, evaluate it, and return the result to the user. The library used for this purpose is exprtk.
For easier analysis I have shared the minimum working code.
The problem shows up when the application parses strings back to back (multithreaded, but guarded by a lock). With the original assignment in exprtk's reset() it was crashing in the ~deque() destructor, reporting a memory reallocation problem, so I changed the assignment to a pop loop. That resolved this particular crash for now:
void reset()
{
   // Why? because msvc doesn't support swap properly.
   //stack_ = std::stack<std::pair<char,std::size_t> >();
   while(stack_.size()) stack_.pop();
   state_ = true;
}
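For reference, the two ways of emptying the stack mentioned here can be compared in isolation. This is a minimal sketch, assuming a standalone std::stack with the same element type as stack_; the names BracketStack, clearByAssign and clearByPop are hypothetical and do not come from exprtk:

```cpp
#include <cstddef>
#include <stack>
#include <utility>

// Same element type as the stack_ member shown in reset().
typedef std::stack<std::pair<char, std::size_t> > BracketStack;

// Replaces the stack with a freshly constructed empty one
// (the approach in the original reset()).
inline void clearByAssign(BracketStack& s)
{
    s = BracketStack();
}

// Pops elements one at a time until the stack is empty
// (the workaround described in the question).
inline void clearByPop(BracketStack& s)
{
    while (!s.empty()) s.pop();
}
```

On a valid stack object both functions leave the container empty, so they should be interchangeable as far as the observable result goes.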
Now the code always crashes in:
static inline void destroy(control_block*& cntrl_blck)
{
   if (cntrl_blck)
   {
      /**now crashes on this condition check*/
      if ( (0 != cntrl_blck->ref_count) && (0 == --cntrl_blck->ref_count) )
      {
         delete cntrl_blck;
      }
      cntrl_blck = 0;
   }
}
Update: the pastebin code has been updated; a main function and the minimum working code have been added.
All the shared_ptr instances have been removed; they are now plain objects.
The exprtk reset function has been changed back to the original one:
void reset()
{
   // Why? because msvc doesn't support swap properly.
   stack_ = std::stack<std::pair<char,std::size_t> >();
   state_ = true;
}
A gdb backtrace has been added as well.

Print() giving assertion when printing an object from a custom function

OK, so I have this function in the engine:
static bool
myTestFunction(JSContext* cx, unsigned argc, Value* vp)
{
    CallArgs args = CallArgsFromVp(argc, vp);
    int length = args.length();
    if (length == 2) {
        if (args.get(1).isObject()) {
            RootedObject obj4(cx, &args.get(1).toObject());
        }
    }
    return true;
}
and these statements in the JS script:
var obj = {ss:"qq"};
var handler = {tt:"vv"};
var prox1 = myTestFunction(obj,handler);
So the problem is in the last line. Basically, I am just trying to return the second argument, but when I print the variable it gives me this assertion failure:
Assertion failure: mStatementDone != reinterpret_cast<bool*>(uintptr_t(-1)), at ../../../dist/include/mozilla/GuardObjects.h:95
Segmentation fault (core dumped)
I am really new to the SpiderMonkey engine and have checked everything, but haven't been able to figure out what's wrong here. Any help would be really appreciated.

alBufferData() sets AL_INVALID_OPERATION when using buffer ID obtained from alSourceUnqueueBuffers()

I am trying to stream audio data from disk using OpenAL's buffer queueing mechanism. I load and enqueue 4 buffers, start the source playing, and check in at regular intervals to refresh the queue. Everything looks like it's going splendidly, up until the first time I try to load data into a recycled buffer I got from alSourceUnqueueBuffers(). In this situation, alBufferData() always sets AL_INVALID_OPERATION, which, according to the official v1.1 spec, it doesn't seem like it should be able to do.
I have searched extensively on Google and StackOverflow, and can't seem to find any reason why this would happen. The closest thing I found was someone with a possibly-related issue in an archived forum post, but details are few and responses are null. There was also this SO question with slightly different circumstances, but the only answer's suggestion does not help.
Possibly helpful: I know my context and device are configured correctly, because loading small wav files completely into a single buffer and playing them works fine. Through experimentation, I've also found that queueing 2 buffers, starting the source playing, and immediately loading and enqueueing the other two buffers throws no errors; it's only when I've unqueued a processed buffer that I run into trouble.
The relevant code:
static constexpr int MAX_BUFFER_COUNT = 4;
#define alCall(funcCall) {funcCall; SoundyOutport::CheckError(__FILE__, __LINE__, #funcCall) ? abort() : ((void)0); }
bool SoundyOutport::CheckError(const string &pFile, int pLine, const string &pfunc)
ALenum tErrCode = alGetError();
if(tErrCode != 0)
auto tMsg = alGetString(tErrCode);
Log::e(ro::TAG) << tMsg << " at " << pFile << "(" << pLine << "):\n"
<< "\tAL call " << pfunc << " failed." << end;
return true;
return false;
void SoundyOutport::EnqueueBuffer(const float* pData, int pFrames)
static int called = 0;
ALint tState;
alCall(alGetSourcei(mSourceId, AL_SOURCE_TYPE, &tState));
if(tState == AL_STATIC)
// alCall(alSourcei(mSourceId, AL_BUFFER, NULL));
ALuint tBufId = AL_NONE;
int tQueuedBuffers = QueuedUpBuffers();
int tReady = ProcessedBuffers();
if(tQueuedBuffers < MAX_BUFFER_COUNT)
tBufId = mBufferIds[tQueuedBuffers];
else if(tReady > 0)
// the fifth time through, this code gets hit
alCall(alSourceUnqueueBuffers(mSourceId, 1, &tBufId));
// debug code: make sure these values go down by one
tQueuedBuffers = QueuedUpBuffers();
tReady = ProcessedBuffers();
return; // no update needed yet.
void* tConverted = convert(pData, pFrames);
// the fifth time through, we get AL_INVALID_OPERATION, and call abort()
alCall(alBufferData(tBufId, mFormat, tConverted, pFrames * mBitdepth/8, mSampleRate));
alCall(alSourceQueueBuffers(mSourceId, 1, &mBufferId));
if(mBitdepth == BITDEPTH_8)
delete (uint8_t*)tConverted;
else // if(mBitdepth == BITDEPTH_16)
delete (uint16_t*)tConverted;
void SoundyOutport::PlayBufferedStream()
if(!StreamingMode() || !QueuedUpBuffers())
Log::w(ro::TAG) << "Attempted to play an unbuffered stream" << end;
alCall(alSourcei(mSourceId, AL_LOOPING, AL_FALSE)); // never loop streams
int SoundyOutport::QueuedUpBuffers()
int tCount = 0;
alCall(alGetSourcei(mSourceId, AL_BUFFERS_QUEUED, &tCount));
return tCount;
int SoundyOutport::ProcessedBuffers()
int tCount = 0;
alCall(alGetSourcei(mSourceId, AL_BUFFERS_PROCESSED, &tCount));
return tCount;
void SoundyOutport::Stop()
int tBuffers;
alCall(alGetSourcei(mSourceId, AL_BUFFERS_QUEUED, &tBuffers));
ALuint tDummy[tBuffers];
alCall(alSourceUnqueueBuffers(mSourceId, tBuffers, tDummy));
alCall(alSourcei(mSourceId, AL_BUFFER, AL_NONE));
bool SoundyOutport::Playing()
ALint tPlaying;
alCall(alGetSourcei(mSourceId, AL_SOURCE_STATE, &tPlaying));
return tPlaying == AL_PLAYING;
bool SoundyOutport::StreamingMode()
ALint tState;
alCall(alGetSourcei(mSourceId, AL_SOURCE_TYPE, &tState));
return tState == AL_STREAMING;
bool SoundyOutport::StaticMode()
ALint tState;
alCall(alGetSourcei(mSourceId, AL_SOURCE_TYPE, &tState));
return tState == AL_STATIC;
And here's an annotated screen cap of what I see in my debugger when I hit the error:
I've tried a bunch of little tweaks and variations, and the result is always the same. I've wasted too many days trying to fix this. Please help :)
This error occurs when you try to fill a buffer with data while the buffer is still queued on the source.
Also, this code is wrong:
if(tQueuedBuffers < MAX_BUFFER_COUNT)
{
    tBufId = mBufferIds[tQueuedBuffers];
}
else if(tReady > 0)
{
    // the fifth time through, this code gets hit
    alCall(alSourceUnqueueBuffers(mSourceId, 1, &tBufId));
    // debug code: make sure these values go down by one
    tQueuedBuffers = QueuedUpBuffers();
    tReady = ProcessedBuffers();
}
else
{
    return; // no update needed yet.
}
You can fill a buffer with data only if it is unqueued from the source. But your first if block gets a tBufId that is queued on the source. Rewrite the code like so:
if(tReady > 0)
{
    // the fifth time through, this code gets hit
    alCall(alSourceUnqueueBuffers(mSourceId, 1, &tBufId));
    // debug code: make sure these values go down by one
    tQueuedBuffers = QueuedUpBuffers();
    tReady = ProcessedBuffers();
}
else
{
    return; // no update needed yet.
}
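The rule stated in this answer (only refill a buffer that is not currently queued on the source) can be modelled without any OpenAL calls. The sketch below is an illustration under assumptions, not the asker's code: BufferRecycler and its methods are hypothetical names, buffer IDs are plain unsigned integers with 0 standing for "none", and a comment marks the spot where the real alSourceUnqueueBuffers call would go.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical bookkeeping model for a streaming source (no OpenAL calls).
class BufferRecycler
{
public:
    explicit BufferRecycler(const std::vector<unsigned>& ids)
        : mFree(ids.begin(), ids.end()) {}

    // Returns a buffer ID that is safe to refill, or 0 if none is available.
    // 'processed' is how many queued buffers the source reports as finished.
    unsigned acquire(int processed)
    {
        if (!mFree.empty()) {                     // buffers never queued so far
            unsigned id = mFree.front();
            mFree.pop_front();
            return id;
        }
        if (processed > 0 && !mQueued.empty()) {  // recycle the oldest processed buffer
            unsigned id = mQueued.front();        // real code: alSourceUnqueueBuffers here
            mQueued.pop_front();
            return id;
        }
        return 0;                                 // nothing can be refilled yet
    }

    // Records that 'id' has been filled and queued on the source.
    void enqueue(unsigned id) { mQueued.push_back(id); }

    std::size_t queuedCount() const { return mQueued.size(); }

private:
    std::deque<unsigned> mFree;    // buffers not attached to the source
    std::deque<unsigned> mQueued;  // buffers currently queued, oldest first
};
```

The point of the model is that an ID handed out by acquire() is never in the queued list at the moment it is refilled, and that the same ID that was refilled is the one passed to enqueue().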

Free VRam on OS X

Does anyone know how to get the free(!) VRAM on OS X?
I know that you can query for a registry entry:
typeCode = IORegistryEntrySearchCFProperty(dspPort,kIOServicePlane,CFSTR(kIOFBMemorySizeKey),
kIORegistryIterateRecursively | kIORegistryIterateParents);
but this returns ALL VRAM, not the free VRAM. Under Windows you can query for free VRAM using DirectDraw:
mDDrawResult = DirectDrawCreate(NULL, &mDDraw, NULL);
mDDrawResult = mDDraw->QueryInterface(IID_IDirectDraw2, (LPVOID *)&mDDraw2);
DDSCAPS ddscaps;
DWORD totalmem, freemem;
mDDrawResult = mDDraw2->GetAvailableVidMem(&ddscaps, &totalmem, &freemem);
Ugly, but it works. Does anyone know the OS X way?
Answering myself so others may use this:
#include <IOKit/graphics/IOGraphicsLib.h>
size_t currentFreeVRAM()
{
    kern_return_t krc;
    mach_port_t masterPort;
    krc = IOMasterPort(bootstrap_port, &masterPort);
    if (krc == KERN_SUCCESS)
    {
        CFMutableDictionaryRef pattern = IOServiceMatching(kIOAcceleratorClassName);
        io_iterator_t deviceIterator;
        krc = IOServiceGetMatchingServices(masterPort, pattern, &deviceIterator);
        if (krc == KERN_SUCCESS)
        {
            io_object_t object;
            while ((object = IOIteratorNext(deviceIterator)))
            {
                CFMutableDictionaryRef properties = NULL;
                krc = IORegistryEntryCreateCFProperties(object, &properties, kCFAllocatorDefault, (IOOptionBits)0);
                if (krc == KERN_SUCCESS)
                {
                    CFMutableDictionaryRef perf_properties = (CFMutableDictionaryRef) CFDictionaryGetValue( properties, CFSTR("PerformanceStatistics") );
                    // look for a number of keys (this is mostly reverse engineering and best-guess effort)
                    const void* free_vram_number = CFDictionaryGetValue(perf_properties, CFSTR("vramFreeBytes"));
                    if (free_vram_number)
                    {
                        ssize_t vramFreeBytes;
                        CFNumberGetValue( (CFNumberRef) free_vram_number, kCFNumberSInt64Type, &vramFreeBytes);
                        return vramFreeBytes;
                    }
                }
                if (properties) CFRelease(properties);
            }
        }
    }
    return 0; // when we come here, this is a fail
}
I am somewhat surprised that this query takes almost 3 ms.
Be aware that there may be more than one accelerator on your system (e.g. on a MacBook), so be sure you select the proper one for the query.

Debugging panics in Symbian OS using Carbide.c++

Is there a way to drop into the debugger when any panic occurs like if there were a breakpoint?
I'm using Carbide.c++ 2.3.0. I know about the Debug Configurations > x86 Exceptions, but it covers only a small fraction of what can actually happen in a real application. For instance, it does not trap user panics, or ALLOC panics when application exits with memory leaks.
If you are using the emulator, you can debug panics by enabling 'just-in-time' debugging. This is done by adding the following line to epoc32\data\epoc.ini:
JustInTime debug
For more details, see the epoc.ini reference in the SDK documentation.
To the best of my knowledge it can't be done.
What I've done is use simple function tracing logic, so that when a panic happens I have a stack trace at the point of the panic in my panic handling code (which I log out). This works well, except that you have to remember to add the macro at the beginning of every function.
#ifndef NDEBUG
class __FTrace
__FTrace(const char* function)
#define FTRACE() __FTrace(__PRETTY_FUNCTION__)
#else
#define FTRACE()
#endif
void Func()
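The tracing idea described above can be sketched in portable C++. This is an illustration under assumptions and not the author's implementation: the names FunctionTrace, FTRACE_SKETCH, Inner and Outer are hypothetical, the standard __func__ is used in place of the GCC-specific __PRETTY_FUNCTION__, and a single program-wide stack is used, so the sketch is only suitable for one thread.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical RAII tracer: records the function name on construction
// and removes it again on destruction.
class FunctionTrace
{
public:
    explicit FunctionTrace(const char* function) { stack().push_back(function); }
    ~FunctionTrace() { stack().pop_back(); }

    // Returns the current call chain, outermost function first.
    static std::string dump()
    {
        std::string out;
        for (std::size_t i = 0; i < stack().size(); ++i) {
            if (i) out += " > ";
            out += stack()[i];
        }
        return out;
    }

private:
    static std::vector<std::string>& stack()
    {
        static std::vector<std::string> s;  // one stack for the whole program
        return s;
    }
};

// The named local keeps the tracer alive until the enclosing scope ends.
#define FTRACE_SKETCH() FunctionTrace ftraceGuard(__func__)

inline std::string Inner() { FTRACE_SKETCH(); return FunctionTrace::dump(); }
inline std::string Outer() { FTRACE_SKETCH(); return Inner(); }
```

A panic handler would call FunctionTrace::dump() and write the result to the log. The macro declares a named local on purpose, since an unnamed temporary would be destroyed at the end of the statement and the entry would disappear immediately.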
For ALLOC panics, I've had a lot of success with the Hook Logger under the emulator. It's a real pain to set up and use, but it makes it really easy to track down ALLOC memory leaks.
UPDATE: As requested, here is what my panic handling code looks like. Note that my application has to run in the background all the time, so it's set up to restart the app when something bad happens. Also, this code works for 3rd Edition SDKs; I haven't tried it on later versions of the SDK.
The point is to run the main application in another thread and then wait for it to exit. Then check why the thread exited; if the thread has exited for unknown reasons, log things like my own stack trace and restart the application.
TInt StartMainThread(TAny*)
__LOGSTR_TOFILE("Main Thread Start");
TInt result(KErrNone);
TRAPD(err, result = EikStart::RunApplication(NewApplication));
if(KErrNone != err || KErrNone != result )
__LOGSTR_TOFILE("EikStart::RunApplication error: trap(%d), %d", err, result);
__LOGSTR_TOFILE("Main Thread End");
return result;
const TInt KMainThreadToLiveInSeconds = 10;
} // namespace *unnamed*
LOCAL_C CApaApplication* NewApplication()
return new CMainApplication;
GLDEF_C TInt E32Main()
#ifdef NDEBUG
__LOGSTR_TOFILE("Application Start (release)");
__LOGSTR_TOFILE("Application Start (debug)");
#ifndef NO_TRACING
#endif // !NO_TRACING
RHeap& heap(User::Heap());
TInt heapsize=heap.MaxLength();
TInt exitReason(KErrNone);
TTime timeToLive;
timeToLive += TTimeIntervalSeconds(KMainThreadToLiveInSeconds);
LManagedHandle<RThread> mainThread;
TInt err = mainThread->Create(_L("Main Thread"), StartMainThread, KDefaultStackSize, KMinHeapSize, heapsize, NULL);
if (KErrNone != err)
__LOGSTR_TOFILE("MainThread failed : %d", err);
return err;
TRequestStatus status;
exitReason = mainThread->ExitReason();
TExitCategoryName category(mainThread->ExitCategory());
case EExitKill:
__LOGSTR_TOFILE("ExitKill : (%S) : %d", &category, exitReason);
case EExitTerminate:
__LOGSTR_TOFILE("ExitTerminate : (%S) : %d", &category, exitReason);
case EExitPanic:
__LOGSTR_TOFILE("ExitPanic : (%S) : %d", &category, exitReason);
__LOGSTR_TOFILE("ExitUnknown : (%S) : %d", &category, exitReason);
#ifndef NO_TRACING
#endif // NO_TRACING
if( KErrNone != status.Int() )
TTime now;
if (timeToLive > now)
TTimeIntervalMicroSeconds diff = timeToLive.MicroSecondsFrom(now);
__LOGSTR_TOFILE("Exiting due to TTL : (%Lu)", diff.Int64());
RProcess current;
RProcess restart;
err = restart.Create(current.FileName(), _L(""));
if( KErrNone == err )
return KErrNone;
__LOGSTR_TOFILE("Failed to start app: %d", err);
__LOGSTR_TOFILE("Application End");
return exitReason;
