I'm using NVRTC to compile a kernel. The relevant API call is:
nvrtcResult nvrtcCreateProgram (
nvrtcProgram* prog,
const char* src,
const char* name,
int numHeaders,
const char** headers,
const char** includeNames )
As you can see, the source is a raw string, and not associated with a file. That means that when you --generate-line-info, you get line numbers, but no related filename. And that means that if you then use, say, NSight Compute - you won't be able to see your kernel source code.
Obviously, neither NSight Compute itself, nor NVRTC itself, can figure out that the raw source is mirrored in some file. But there has to be some way to get around this:
Perhaps I'm missing something in the NVRTC API which can make the source <-> file association?
Perhaps we can manipulate the resulting compiled program (reasonably, not manually, or write-my-own-new-API) to make the association?
Perhaps we can shove the source code into the compiled program somehow?
Here's my initial workaround:
Place your source in a file, say my_kernel.cuh.
Create the string:
#include "my_kernel.cuh"
Compile just this string using NVRTC
Now, NVRTC is able to associate included files' sources with the files, so it's only a stub that will be missing in terms of source<->file association.
Caveat: You will need to be careful about paths - NVRTC's include paths, the working directory from which you invoke your program vs the directory of the source file etc.
It seems NVRTC does provides a default filename, such that if you place your source in the file with that name - NSight Compute may be able to find it.
The name is the one you passed to nvrtcCreateProgram() as the name argument.
So, if your kernel function (i.e. your __global__ function) is in my_kernel.cuh, and you place this file in the working directory of the profile program (which you tell NSight Compute about), or in one of the include directories you built your program with, you'll be able to read your source. If the original file's own directory is also one of the include directories, then you're in luck and you don't even have to make a copy.
Related
Got a new issue I've not come across before that's appeared when using the Espressif ESP32 ESP-IDF standard setup under VSCode. It uses the GNU compiler.
I'm getting "multiple definition of" errors on variables that share the same name, but which should be local.
So I use a .c and .h pair of files approach.
In my .c files I do this at the top
#define IO_EXPANDER_C //<<<This is a unique define for this file pair
#include "io-pca9539.h"
In my .h files I do this:
#ifdef IO_EXPANDER_C
//----- INTERNAL ONLY MEMORY DEFINITIONS -----
uint8_t *NextReadDataPointer;
//----- INTERNAL & EXTERNAL MEMORY DEFINITIONS -----
//(Also defined below as extern)
int SomeVariableIWantAvailableGlobally;
#else
//----- EXTERNAL MEMORY DEFINITIONS -----
extern int SomeVariableIWantAvailableGlobally;
#endif
It's a great simple system, any other .c file that includes the .h file (without the #define above its include statemnt), gets all of its extern variables, none of its local variables.
But, compiling in VSCode with my ESP-IDF based project, I'm getting "multiple definition of" errors relating to "NextReadDataPointer"
I use the same variable name NextReadDataPointer in another file pair in just the same way, but it's never declared anywhere as extern and each file pair uses a separate #define (IO_EXPANDER_C and LED_C). I do this all the time normally and I can't see any obvious mistakes.
I've never seen a C compiler do this before, it's as if it's mixing up the local definitions somehow. A #define should only have scope in the file it is declared in and in any includes within that file.
Even odder, the error is not generated if the project is built but a function is called from just one of the file pairs that share the same local variable name. It's only generated when functions are called from both file pairs from my main application.
Can anyone shed light on whether the GNU C compiler does something funky for a standard ESP-IDF project as it's got me baffled?
uint8_t *NextReadDataPointer; creates a variable which is visible across all translation units, i.e. it's the opposite of "private". If you include this header in multiple c files and the linker tries to link those together; it'll see a conflict. The keyword you're looking for is static, for example static uint8_t *NextReadDataPointer; creates a variable that is not visible across translation units. The reason you don't see the problem if calling a function from only one of those two files is because in this case the linker doesn't bother looking into the other one.
Personally I'd avoid such clever preprocessor hacks because it's quite difficult to see how files include one another and debug the resulting problems. I'd suggest sticking to the standard way of declaring shared things in header files and keeping the private stuff inside the c file (prepended by static).
I have a rather weird question but I don't really know how to put it or where to start looking from.
My question is not about "embedding" a text file (we already have at compile time) - that is too obvious.
My question is if (and how) I could let's say "package" an existing (created by C) binary with a text file and generate a new... working binary with access to that file.
I'm a Mac user. I know that could work with an .app package and all that. But that's still not what I want. I want to be able to "tweak" an existing binary, add some (accessible - how?) additional text data to it, and the binary remaining absolutely functional.
Is that even possible?
P.S. The only serious tool I've looked into is bsdiff and bspatch but I'm not really sure it's what I'm looking for.
You can definitely do this, but the exact procedure is going to be different for every platform, with a few commonalities. Your tool of choice here is probably going to be llvm_objcopy.
At a high level, you will create a special segment or section in the binary (or both as in the case of MachO) containing the data you want, and then you'll have to parse your own executable to retrieve it. Since you said you're on a Mac, we can start there as an example.
Create the dumbest possible test program as a starting point:
test.c
#include <stdio.h>
int main(int argc, char **argv)
{
printf("I'm a binary!\n");
return 0;
}
Compile and run it:
prompt$ clang -o test test.c
prompt$ ./test
I'm a binary
Now create a text file hello.txt and put some text in it:
Hello world!
You can embed this into the MachO file with llvm-objcopy
llvm-objcopy --add-section __MAGIC,__magic_section=hello.txt test test
Then check that it still runs:
prompt$ ./test
I'm a binary!
You can verify that the section has been added using otool -l, you can also run strings on the binary, and you should see your text:
prompt$ strings ./test
I'm a binary!
Hello world!
Now you have to be able to retrieve this data. If you were compiling everything in a priori, it would be easy since the linker would make symbols for you marking the start and end of the __magic_section section that you added.
Since you specifically said this has to be an a posteriori step, you're going to have to actually self-parse the MachO file to look up the __magic_section section in the __MAGIC segment that you added. You can find a few different references for parsing MachO files, but you probably also want to make use of built in dyld functionality. See Apple's Reference on dyld utility calls that can for example give you the Mach header for the running process. Linux has similar functionality by way of dl_iterate_phdr.
Once you know where the section is in your original binary, you can retrieve the text.
To repeat all of this on Linux, you will do pretty much the same thing, but you'll be working with the ELF file format instead of MachO. The same principles would apply though.
As a side note: this is exactly how code signing works on MacOS. A signature is generated and placed into a dedicated "signature" section in the binary to be read by the system on launch.
I am building a build system and have .c file which contains version string. const char version[] = "V01.00.00";
I need to get that line by regex into make variable to use for further binary making.
Can you suggest the best way to do that? Would appreciate native make solution.
It's a bad idea to try to parse .c files by yourself to extract information from them. Only the C compiler, for example gcc should be used to parse .c files. You should not be doing it, leave it to the compiler.
The reason is, that if someone changes the .c file you extracted information from, your extraction code is likely to break. Basically, your extraction code would have to be compatible with any .c syntax, which means, you would end up to be a "compiler".
Otherwise, you would have to publish some documentation for other programmers, to tell them to keep the .c code the way you want it, for your code not to break. So you would end up restricting the C language for them.
If you want to get such information:
define a function in the .c file to return that information
compile that file to a library
make an executable that links with the library and calls that function
call that executable from the Makefile
I am working on a project using the cmake build system. By default CMake has a nice framework for generating a single executable from a set of C/C++ code. The cmake function is called create_test_sourcelist. What it does is generate a C/C++ dispatcher with a single main entry point which will call other C/C++ code.
Therefore I have a bunch of C/C++ files with a function signature such as: int TestFunctionality1(int argc, char *argv[]), which I'd like to keep as-is, unless of course it means even more work.
How can I keep this system in place and start using BOOST_CHECK ? I could not figure out how to specify the actual main entry point is not called int main(int argc, char *argv[]).
My goal is have a framework for integration with Jenkins, since the project already uses Boost, I believe this should be doable without re-writing the existing CMake test suite and changing all tests into independent main function.
Unfortunately, it seems there is no straightforward and clean way to do that.
From one side the only useful function of create_test_sourcelist is to generate a test driver: a (stupid pretty simple, naive and with lack of abilities to hack/extend) C/C++ translation unit based on ${cmake-prefix}/share/cmake/Templates/TestDriver.cxx.in (and there is no way to select some other template).
From other side, Boost UTF offers its own test runner (which is equal to test driver in CMake's terminology), but in any variant (static, dynamic, single-header) it contains definition for main() some way or another (even in case of external test runner).
…so you end up with two main() functions and no ability to choose a single one.
Digging a little into sources of create_test_sourcelist I've really wonder why are they implemented it as a command and not as ordinal (extern) cmake module — it doesn't do any special (which can't be implemented using CMake language). This command is really stupid — it doesn't check that required functions are really exists (you'll get compile errors if smth wrong). There are no ways for flexible customization of output source file at all. All that is does is stripping paths and extensions from a list of source files and substitute it to mentioned template using ordinal configure_file()…
So, personally I see no reason to use it at all. It is why I've done the same (but better ;) job in the module mentioned in the comment above.
If you are still want to use that command, generated test driver completely useless if you want to use Boost UTF. You need to provide your own initialization function anyway (and it is not main()), where you can manually register your test cases into a master test suite (or organize your tests into some more complex tree). In that case there is absolutely no reason to use create_test_sourcelist! All what you can get from it is a list of sources that needs to be given to add_executable()… but it is much more easy to do w/ set()… This command even can't help you w/ list of test functions (list of filenames w/o extension actually) to call (it used internally and not exported). Do you still want to use that command??
Following the post ,
if I have header file,which has some functions implementations in it and should be included in several kernels(I mean these function are auxilary in all kernels and I don`t want to duplicate the code)
How I make this inclusion - can I keep the functions in header?Will the kernels and the header functions be compiled?
Can you specify (maybe by example) how I use the "-I" option in these case?
I am using VS2010(if its matter at all)
Note:Each kernel runs in different program
Yes, you can use headers in OpenCL for exactly what you are suggesting. Each kernel file will include the header and compile it.
The "-I" option is only used to specify the path for includes. If your includes are in your working directory it is not necessary. Here is an example:
/////////////////////////////////////////////////////////////////
// Load CL file, build CL program object, create CL kernel object
/////////////////////////////////////////////////////////////////
std::string sourceStr = FileToString(params.kernelFile);
cl::Program::Sources sources(1, std::make_pair(sourceStr.c_str(), sourceStr.length()));
cl::Program program = cl::Program(oclHandles.context, sources);
program.build(oclHandles.devices,"-I c:/Includes/");