Similar to this question, I'm trying to implement the HelloWorld example from this video by Wesley Shillingford, except this time with OpenCL 2.1. I can get it to run if I use the default context, but not if I create my own (as the video does).
When I use my own context, it produces a cl::Error (-34 = CL_INVALID_CONTEXT):
From here:
CL_INVALID_CONTEXT: the given context is an invalid OpenCL context, or the contexts associated with certain parameters are not the same.
I'm not sure how I could tell that the context is invalid. I've tried comparing defaultContext to myContext, and they match on everything except CL_CONTEXT_REFERENCE_COUNT, which doesn't seem like it should matter (but maybe it does).
I could be mixing contexts. However, I assign the context I want to use to chosenContext and use that everywhere I need a context.
It seems that something is somehow using the default context instead of my supplied context, but I haven't been able to spot where. Any insights would be appreciated.
The code:
#define CL_HPP_ENABLE_EXCEPTIONS
#define CL_HPP_TARGET_OPENCL_VERSION 200
#include <CL/cl2.hpp>
#include <fstream>
#include <iostream>
int main()
{
    // Get Platform and Device
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);
    auto platform = platforms.front();

    std::vector<cl::Device> devices;
    platform.getDevices(CL_DEVICE_TYPE_GPU, &devices);
    auto device = devices.front();

    // This context doesn't work. Causes CL_INVALID_CONTEXT (-34)
    cl_context_properties properties[] = {CL_CONTEXT_PLATFORM, (cl_context_properties)platform(), 0};
    cl::Context myContext(device, properties);

    // If I stick with the default context, things work.
    cl::Context defaultContext = cl::Context::getDefault();

    // The choice of context here determines whether it works or not:
    //   myContext      -> fails with CL_INVALID_CONTEXT (-34)
    //   defaultContext -> works
    auto chosenContext = myContext;

    std::ifstream helloWorldFile("hello_world.cl");
    std::string src(std::istreambuf_iterator<char>(helloWorldFile), (std::istreambuf_iterator<char>()));
    cl::Program program(chosenContext, src);
    program.build("-cl-std=CL2.1");

    // Debugging code: check to make sure that the contexts are similar
    auto myContextDevices = myContext.getInfo<CL_CONTEXT_DEVICES>();
    auto defaultContextDevices = defaultContext.getInfo<CL_CONTEXT_DEVICES>();
    auto devicesMatch = myContextDevices == defaultContextDevices; // true

    auto myContextProperties = myContext.getInfo<CL_CONTEXT_PROPERTIES>();
    auto defaultContextProperties = defaultContext.getInfo<CL_CONTEXT_PROPERTIES>();
    auto propertiesMatch = myContextProperties == defaultContextProperties; // true

    auto myContextNumDevices = myContext.getInfo<CL_CONTEXT_NUM_DEVICES>();
    auto defaultContextNumDevices = defaultContext.getInfo<CL_CONTEXT_NUM_DEVICES>();
    auto numDevicesMatch = myContextNumDevices == defaultContextNumDevices; // true

    auto myContextRefCount = myContext.getInfo<CL_CONTEXT_REFERENCE_COUNT>(); // 1 if defaultContext, 3 if myContext
    auto defaultContextRefCount = defaultContext.getInfo<CL_CONTEXT_REFERENCE_COUNT>(); // 4 if defaultContext, 2 if myContext
    auto refCountsMatch = myContextRefCount == defaultContextRefCount; // false

    auto contextsMatch = myContext == defaultContext; // false
    // End of debugging code

    // Continuing with computation
    char buf[16];
    cl::Buffer outputBuffer = cl::Buffer(CL_MEM_WRITE_ONLY | CL_MEM_HOST_READ_ONLY, sizeof(buf));

    cl::Kernel kernel(program, "HelloWorld");
    kernel.setArg(0, outputBuffer);

    cl::CommandQueue commandQueue(chosenContext, device);
    auto result = commandQueue.enqueueNDRangeKernel(kernel, 0, 1, 1); // CL_SUCCESS
    commandQueue.enqueueReadBuffer(outputBuffer, CL_TRUE, 0, sizeof(buf), buf); // Execution fails here, raises cl::Error (-34)

    std::cout << buf;
    return EXIT_SUCCESS;
}
Build Command:
g++ -g hello_world_21.cpp -IOpenCL-Headers/opencl21 -std=c++11 -lOpenCL
hello_world.cl:
__kernel void HelloWorld(__global char* output) {
    output[0] = 'H';
    output[1] = 'e';
    output[2] = 'l';
    output[3] = 'l';
    output[4] = 'o';
    output[5] = ' ';
    output[6] = 'W';
    output[7] = 'o';
    output[8] = 'r';
    output[9] = 'l';
    output[10] = 'd';
    output[11] = '!';
    output[12] = '\n';
}
You are still using the default context for the global memory buffer instead of using your own context:
cl::Buffer outputBuffer = cl::Buffer(CL_MEM_WRITE_ONLY | CL_MEM_HOST_READ_ONLY, sizeof(buf));
Just change this line to the following and it should work:
cl::Buffer outputBuffer = cl::Buffer(myContext, CL_MEM_WRITE_ONLY | CL_MEM_HOST_READ_ONLY, sizeof(buf));
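Without an explicit context argument, this cl::Buffer constructor in cl2.hpp quietly falls back to cl::Context::getDefault(), which is why only the default-context run succeeds. Since the rest of your code switches contexts through chosenContext, passing that variable instead keeps the toggle consistent (a minimal sketch of the same fix):
cl::Buffer outputBuffer(chosenContext, CL_MEM_WRITE_ONLY | CL_MEM_HOST_READ_ONLY, sizeof(buf));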
Related
I am trying to calculate MFCC features in C++. I found Aubio (https://github.com/aubio/aubio), but I cannot produce the same results as Python's Librosa (this is important).
Librosa code:
X, sample_rate = sf.read(file_name, dtype='float32')
mfccs = librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40)
Aubio code:
#include "utils.h"
#include "parse_args.h"
#include <stdlib.h>
aubio_pvoc_t *pv; // a phase vocoder
cvec_t *fftgrain; // outputs a spectrum
aubio_mfcc_t * mfcc; // which the mfcc will process
fvec_t * mfcc_out; // to get the output coefficients
uint_t n_filters = 128;
uint_t n_coefs = 40;
void process_block (fvec_t *ibuf, fvec_t *obuf)
{
    fvec_zeros(obuf);
    // compute mag spectrum
    aubio_pvoc_do (pv, ibuf, fftgrain);
    // compute mfccs
    aubio_mfcc_do(mfcc, fftgrain, mfcc_out);
}

void process_print (void)
{
    /* output times in selected format */
    print_time (blocks * hop_size);
    outmsg ("\t");
    /* output extracted mfcc */
    fvec_print (mfcc_out);
}

int main(int argc, char **argv) {
    int ret = 0;

    // change some default params
    buffer_size = 2048;
    hop_size = 512;

    examples_common_init(argc, argv);

    verbmsg ("using source: %s at %dHz\n", source_uri, samplerate);
    verbmsg ("buffer_size: %d, ", buffer_size);
    verbmsg ("hop_size: %d\n", hop_size);

    pv = new_aubio_pvoc (buffer_size, hop_size);
    fftgrain = new_cvec (buffer_size);
    mfcc = new_aubio_mfcc(buffer_size, n_filters, n_coefs, samplerate);
    mfcc_out = new_fvec(n_coefs);
    if (pv == NULL || fftgrain == NULL || mfcc == NULL || mfcc_out == NULL) {
        ret = 1;
        goto beach;
    }

    examples_common_process(process_block, process_print);
    printf("\nlen=%u\n", mfcc_out->length);

    del_aubio_pvoc (pv);
    del_cvec (fftgrain);
    del_aubio_mfcc(mfcc);
    del_fvec(mfcc_out);

beach:
    examples_common_del();
    return ret;
}
Please help me obtain the same results as Librosa, or suggest another C++ library that does this well.
Thanks
This might be what you are looking for: C Speech Features
The library is a complete port of python_speech_features to C, and according to the documentation you should be able to use it in C++ projects. The results will not be the same as Librosa's (here is why), but you should be able to work with them.
I'm trying to load an obj file full of vertices and render it as a point cloud.
When I try to run my code, it crashes and gives me the following error:
Unhandled exception at 0x66463E50 (nvwgf2um.dll) in Tutorial06.exe: 0xC0000005: Access violation reading location 0x00E9D000.
I followed the tutorial code Microsoft provides with DirectX and changed it to suit my layout and everything, but I must have done something wrong and I'm not sure what it is.
This is how I try to initialize my buffer:
CloudLoader::getInstance().loadCloudData("cloud.obj");
std::vector<CloudVertex>* data = CloudLoader::getInstance().getCloudData();
D3D11_BUFFER_DESC bd;
ZeroMemory( &bd, sizeof(bd) );
bd.Usage = D3D11_USAGE_DEFAULT;
bd.ByteWidth = sizeof( CloudVertex ) * data->size();
bd.BindFlags = D3D11_BIND_VERTEX_BUFFER;
bd.CPUAccessFlags = 0;
D3D11_SUBRESOURCE_DATA InitData;
ZeroMemory( &InitData, sizeof(InitData) );
InitData.pSysMem = data;
hr = g_pd3dDevice->CreateBuffer( &bd, &InitData, &g_pVertexBuffer );
And this my cloud loading code:
void CloudLoader::loadCloudData(std::string fileName)
{
    if (m_loaded == false) {
        m_loadedData = new std::vector<CloudVertex>();
        std::wifstream fileIn(fileName.c_str()); // Open file
        wchar_t checkChar;
        if (fileIn) {
            while (fileIn) {
                checkChar = fileIn.get(); // Get next char
                switch (checkChar) {
                case 'v':
                    checkChar = fileIn.get();
                    if (checkChar == ' ') // v - vert position
                    {
                        float vz, vy, vx;
                        fileIn >> vx >> vy >> vz; // Store the next three values
                        m_loadedData->push_back(CloudVertex(vx, vy, vz));
                    }
                    break;
                }
            }
        }
    }
    m_loaded = true;
}
I guess it's a C++ thing and not DirectX, and it's probably really simple, but I've been stuck on this for a while now. I would really appreciate the help.
pSysMem cannot point at "data", as this is a pointer to the std::vector object itself and not the data contained within the vector.
Try:
InitData.pSysMem = data->data();
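If it helps to see why, here is a tiny standalone sketch (CloudVertex reduced to three floats for illustration): the vector object and the element storage it owns live at different addresses, and D3D11 needs the latter.
#include <cstdio>
#include <vector>

struct CloudVertex { float x, y, z; };

int main()
{
    auto* data = new std::vector<CloudVertex>{ {1, 2, 3}, {4, 5, 6} };

    // Two different addresses: the vector object itself vs. its contiguous elements.
    std::printf("vector object at   %p\n", static_cast<void*>(data));
    std::printf("element storage at %p\n", static_cast<void*>(data->data()));

    // D3D11 needs the element storage: pSysMem = data->data(), backed by
    // sizeof(CloudVertex) * data->size() bytes, which matches bd.ByteWidth.
    delete data;
    return 0;
}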
I'm trying to use the Color Converter DMO (http://msdn.microsoft.com/en-us/library/windows/desktop/ff819079(v=vs.85).aspx) to convert RGB24 to YV12/NV12 via Media Foundation. I've created an instance of the Color Converter DSP via CLSID_CColorConvertDMO and then tried to set the needed input/output types, but the calls always return E_INVALIDARG, even when using media types returned by GetOutputAvailableType and GetInputAvailableType. If I set the media type to NULL, I get an error saying the media type is invalid, which makes sense. I've seen examples from MSDN where people do the same (enumerate the available types and then set them as input types) and claim it works, but I'm stuck on the E_INVALIDARG. I understand that this is hard to answer without a code example; if no one has had a similar experience, I'll try to post a snippet, but maybe someone has run into the same issue?
This DMO/DSP is dual-interfaced: it is both a DMO with IMediaObject and an MFT with IMFTransform. The two interfaces have a lot in common, and here is a code snippet to test initialization of an RGB24 into YV12 conversion:
#include "stdafx.h"
#include <dshow.h>
#include <dmo.h>
#include <wmcodecdsp.h>
#pragma comment(lib, "strmiids.lib")
#pragma comment(lib, "wmcodecdspuuid.lib")
int _tmain(int argc, _TCHAR* argv[])
{
    ATLVERIFY(SUCCEEDED(CoInitialize(NULL)));

    CComPtr<IMediaObject> pMediaObject;
    ATLVERIFY(SUCCEEDED(pMediaObject.CoCreateInstance(CLSID_CColorConvertDMO)));

    VIDEOINFOHEADER InputVideoInfoHeader;
    ZeroMemory(&InputVideoInfoHeader, sizeof InputVideoInfoHeader);
    InputVideoInfoHeader.bmiHeader.biSize = sizeof InputVideoInfoHeader.bmiHeader;
    InputVideoInfoHeader.bmiHeader.biWidth = 1920;
    InputVideoInfoHeader.bmiHeader.biHeight = 1080;
    InputVideoInfoHeader.bmiHeader.biPlanes = 1;
    InputVideoInfoHeader.bmiHeader.biBitCount = 24;
    InputVideoInfoHeader.bmiHeader.biCompression = BI_RGB;
    InputVideoInfoHeader.bmiHeader.biSizeImage = 1080 * (1920 * 3);

    DMO_MEDIA_TYPE InputMediaType;
    ZeroMemory(&InputMediaType, sizeof InputMediaType);
    InputMediaType.majortype = MEDIATYPE_Video;
    InputMediaType.subtype = MEDIASUBTYPE_RGB24;
    InputMediaType.bFixedSizeSamples = TRUE;
    InputMediaType.bTemporalCompression = FALSE;
    InputMediaType.lSampleSize = InputVideoInfoHeader.bmiHeader.biSizeImage;
    InputMediaType.formattype = FORMAT_VideoInfo;
    InputMediaType.cbFormat = sizeof InputVideoInfoHeader;
    InputMediaType.pbFormat = (BYTE*) &InputVideoInfoHeader;
    const HRESULT nSetInputTypeResult = pMediaObject->SetInputType(0, &InputMediaType, 0);
    _tprintf(_T("nSetInputTypeResult 0x%08x\n"), nSetInputTypeResult);

    VIDEOINFOHEADER OutputVideoInfoHeader = InputVideoInfoHeader;
    OutputVideoInfoHeader.bmiHeader.biBitCount = 12;
    OutputVideoInfoHeader.bmiHeader.biCompression = MAKEFOURCC('Y', 'V', '1', '2');
    OutputVideoInfoHeader.bmiHeader.biSizeImage = 1080 * 1920 * 12 / 8;

    DMO_MEDIA_TYPE OutputMediaType = InputMediaType;
    OutputMediaType.subtype = MEDIASUBTYPE_YV12;
    OutputMediaType.lSampleSize = OutputVideoInfoHeader.bmiHeader.biSizeImage;
    OutputMediaType.cbFormat = sizeof OutputVideoInfoHeader;
    OutputMediaType.pbFormat = (BYTE*) &OutputVideoInfoHeader;
    const HRESULT nSetOutputTypeResult = pMediaObject->SetOutputType(0, &OutputMediaType, 0);
    _tprintf(_T("nSetOutputTypeResult 0x%08x\n"), nSetOutputTypeResult);

    // TODO: ProcessInput, ProcessOutput

    pMediaObject.Release();
    CoUninitialize();
    return 0;
}
This should work fine and print two S_OKs out...
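Since the transform is dual-interfaced, the same negotiation can also be done through IMFTransform with IMFMediaType objects. A rough, untested sketch (it assumes mfapi.h and mftransform.h are included and mfplat.lib/mfuuid.lib are linked on top of the snippet above, and it reuses that snippet's pMediaObject):
CComPtr<IMFTransform> pTransform;
ATLVERIFY(SUCCEEDED(pMediaObject.QueryInterface(&pTransform)));

CComPtr<IMFMediaType> pInputMediaType;
ATLVERIFY(SUCCEEDED(MFCreateMediaType(&pInputMediaType)));
ATLVERIFY(SUCCEEDED(pInputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video)));
ATLVERIFY(SUCCEEDED(pInputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB24)));
ATLVERIFY(SUCCEEDED(MFSetAttributeSize(pInputMediaType, MF_MT_FRAME_SIZE, 1920, 1080)));
const HRESULT nMftSetInputTypeResult = pTransform->SetInputType(0, pInputMediaType, 0);
_tprintf(_T("nMftSetInputTypeResult 0x%08x\n"), nMftSetInputTypeResult);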
Here's the code using GetOpenFileNameW:
import core.sys.windows.windows;
import std.stdio, std.string, std.utf;
pragma(lib, "comdlg32");
// Fill in some missing holes in core.sys.windows.windows.
extern (Windows) DWORD CommDlgExtendedError();
enum OFN_FILEMUSTEXIST = 0x001000;
void main()
{
    auto buf = new wchar[1024];

    OPENFILENAMEW ofn;
    ofn.lStructSize = ofn.sizeof;
    ofn.lpstrFile = buf.ptr;
    ofn.nMaxFile = buf.length;
    ofn.lpstrInitialDir = null;
    ofn.Flags = OFN_FILEMUSTEXIST;

    BOOL retval = GetOpenFileNameW(&ofn);
    if (retval == 0) {
        // Get 0x3002 for W and 0x0002 for A. ( http://msdn.microsoft.com/en-us/library/windows/desktop/ms646916(v=vs.85).aspx )
        throw new Exception(format("GetOpenFileName failure: 0x%04X.", CommDlgExtendedError()));
    }
    writeln(buf);
}
This results in FNERR_INVALIDFILENAME, but I don't see any non-optional strings that I haven't filled in. And here's the code (only differences shown) for GetOpenFileNameA:
auto buf = new char[1024];
OPENFILENAMEA ofn;
// ...
BOOL retval = GetOpenFileNameA(&ofn);
This results in CDERR_INITIALIZATION, and the only elaboration MSDN gives me is
The common dialog box function failed during initialization.
This error often occurs when sufficient memory is not available.
This is on Windows 7 64 bit, DMD v2.059.
buf has to be zeroed completely. The problem here is that wchar.init == wchar.max (for error-detection reasons), so your array is essentially 1024 instances of wchar.max rather than the empty, null-terminated string that GetOpenFileName expects lpstrFile to point at. A simple buf[] = 0; should fix that.
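A minimal sketch of the fix in the context of the question's code:
auto buf = new wchar[1024];
buf[] = 0; // overwrite the wchar.max fill so lpstrFile sees an empty string

OPENFILENAMEW ofn;
ofn.lStructSize = ofn.sizeof;
ofn.lpstrFile = buf.ptr;
ofn.nMaxFile = buf.length;
ofn.Flags = OFN_FILEMUSTEXIST;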
So I am running some simple Hello World OpenCL code in Xcode 4.1 on Lion and it continually breaks at clEnqueueTask. The same thing happens when I run the source from the MacResearch.org OpenCL tutorials, which breaks at clEnqueueNDRangeKernel. lldb gives code 1, address 0x30.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <OpenCL/opencl.h>
#define MEM_SIZE (128)
#define MAX_SOURCE_SIZE (0x100000)
int main ()
{
    char *program_source = "\n"\
        "__kernel void hello(__global char* string) \n"\
        "{ \n"\
        " string[0] = 'H'; \n"\
        " string[1] = 'e'; \n"\
        " string[2] = 'l'; \n"\
        " string[3] = 'l'; \n"\
        " string[4] = 'o'; \n"\
        " string[5] = ','; \n"\
        " string[6] = ' '; \n"\
        " string[7] = 'w'; \n"\
        " string[8] = 'o'; \n"\
        " string[9] = 'r'; \n"\
        " string[10] = 'l'; \n"\
        " string[11] = 'd'; \n"\
        " string[12] = '!'; \n"\
        " string[13] = '\0'; \n"\
        "} \n"\
        "\n";
    size_t source_size = sizeof(program_source);

    cl_device_id device_id = NULL;
    cl_context context = NULL;
    cl_command_queue command_queue = NULL;
    cl_mem memobj = NULL;
    cl_program program = NULL;
    cl_kernel kernel = NULL;
    cl_platform_id platform_id = NULL;
    cl_uint ret_num_devices;
    cl_uint ret_num_platforms;
    cl_int ret;
    char string[MEM_SIZE];

    // get platform and device information
    ret = clGetPlatformIDs(1, &platform_id, &ret_num_platforms);
    ret = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 1, &device_id, &ret_num_devices);

    cl_int err = 0;
    size_t returned_size = 0;
    size_t buffer_size;

    // Get some information about the returned device
    cl_char vendor_name[1024] = {0};
    cl_char device_name[1024] = {0};
    err = clGetDeviceInfo(device_id, CL_DEVICE_VENDOR, sizeof(vendor_name), vendor_name, &returned_size);
    err |= clGetDeviceInfo(device_id, CL_DEVICE_NAME, sizeof(device_name), device_name, &returned_size);
    // assert(err == CL_SUCCESS);
    printf("Connecting to %s %s...\n", vendor_name, device_name);

    // create OpenCL context
    context = clCreateContext(NULL, 1, &device_id, NULL, NULL, &ret);

    // create command queue
    command_queue = clCreateCommandQueue(context, device_id, 0, &ret);

    // create memory buffer
    memobj = clCreateBuffer(context, CL_MEM_READ_WRITE, MEM_SIZE*sizeof(char), NULL, &ret);

    // create kernel program from source code
    program = clCreateProgramWithSource(context, 1, (const char **)&program_source, (const size_t *)&source_size, &ret);

    // build kernel program
    ret = clBuildProgram(program, 1, &device_id, NULL, NULL, NULL);

    // create OpenCL kernel
    kernel = clCreateKernel(program, "hello", &ret);

    // set OpenCL kernel parameters
    ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memobj);

    // Execute OpenCL kernel
    ret = clEnqueueTask(command_queue, kernel, 0, NULL, NULL);

    // copy results from the memory buffer
    ret = clEnqueueReadBuffer(command_queue, memobj, CL_TRUE, 0, MEM_SIZE*sizeof(char), string, 0, NULL, NULL);

    // display results
    puts(string);

    // finish up
    ret = clFlush(command_queue);
    ret = clFinish(command_queue);
    ret = clReleaseKernel(kernel);
    ret = clReleaseProgram(program);
    ret = clReleaseMemObject(memobj);
    ret = clReleaseCommandQueue(command_queue);
    ret = clReleaseContext(context);
    return 0;
}
Tried using Guard Malloc, got:
GuardMalloc[OCL_HW-1453]: recording malloc stacks to disk using standard recorder
GuardMalloc[OCL_HW-1453]: Allocations will be placed on 16 byte boundaries.
GuardMalloc[OCL_HW-1453]: - Some buffer overruns may not be noticed.
GuardMalloc[OCL_HW-1453]: - Applications using vector instructions (e.g., SSE) should work.
GuardMalloc[OCL_HW-1453]: version 24.1
OCL_HW(1453) malloc: process 1423 no longer exists, stack logs deleted from /tmp/stack-logs.1423.OCL_HW.yL5f5u.index
OCL_HW(1453) malloc: stack logs being written into /tmp/stack-logs.1453.OCL_HW.pCjTNR.index
Connecting to NVIDIA GeForce GT 330M...
I had no problems with this code under Snow Leopard and Xcode 3. I made sure not to compile any .cl files by removing them from the target, and OpenCL.framework is linked and everything.
I actually even wiped my computer and clean-installed Lion and Xcode, and it's still a problem. I'm pretty sure at this point it's something stupid.
-Thanks a bunch
You're right -- it's something silly. You are passing an incorrect value to the fourth parameter of clCreateProgramWithSource. You should be passing the length of your source string, but you are passing the size of the pointer. You can fix it like this:
size_t source_size = strlen(program_source);
Note that I found this by checking the return value from clBuildProgram. It was -11, CL_BUILD_PROGRAM_FAILURE, which means your kernel compilation failed. Since your kernel looked fine, I did this on the command line:
CL_LOG_ERRORS=stdout ./test
Which caused the Apple OpenCL implementation to dump the compiler build log to standard output. I saw this:
[CL_BUILD_ERROR] : OpenCL Build Error : Compiler build log:
<program source>:2:1: error: unknown type name '__kerne'
__kerne
<program source>:2:8: error: expected identifier or '('
__kerne
Which made me immediately think something was up with your source code length parameter.
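CL_LOG_ERRORS is specific to Apple's implementation; a portable way to see the same log (a sketch against the variables already in your code, not something the original post does) is to ask for CL_PROGRAM_BUILD_LOG after the build fails:
if (ret == CL_BUILD_PROGRAM_FAILURE) {
    // Query the log size first, then fetch the text itself.
    size_t log_size = 0;
    clGetProgramBuildInfo(program, device_id, CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);
    char *log = (char *)malloc(log_size);
    clGetProgramBuildInfo(program, device_id, CL_PROGRAM_BUILD_LOG, log_size, log, NULL);
    printf("Build log:\n%s\n", log);
    free(log);
}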
Also note that you need to change this in your kernel (the host C compiler interprets the '\0' escape inside the big source string literal, embedding a real null byte that truncates what strlen sees):
string[13] = '\0';
to
string[13] = 0;
After making these changes, I see this on my Macbook Pro:
Connecting to AMD ATI Radeon HD 6490M...
Hello, world!