OpenCL, include files - compilation

Following the post,
if I have a header file that contains some function implementations and should be included in several kernels (these functions are auxiliary in all the kernels and I don't want to duplicate the code):
How do I do this inclusion? Can I keep the functions in the header? Will the kernels and the header functions be compiled?
Can you show (maybe with an example) how I use the "-I" option in this case?
I am using VS2010 (if it matters at all).
Note: each kernel runs in a different program.

Yes, you can use headers in OpenCL for exactly what you are suggesting. Each kernel file will include the header and compile it.
The "-I" option is only used to specify the path for includes. If your includes are in your working directory it is not necessary. Here is an example:
/////////////////////////////////////////////////////////////////
// Load CL file, build CL program object, create CL kernel object
/////////////////////////////////////////////////////////////////
std::string sourceStr = FileToString(params.kernelFile);
cl::Program::Sources sources(1, std::make_pair(sourceStr.c_str(), sourceStr.length()));
cl::Program program = cl::Program(oclHandles.context, sources);
// "-I <dir>" adds <dir> to the search path used for #include in the kernel source
program.build(oclHandles.devices, "-I c:/Includes/");
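To make the header arrangement more concrete, here is a minimal sketch. The file names `helpers.h` and `kernel_a.cl`, the function `clamp01`, and the helper `buildIncludeOptions` are all hypothetical and not part of the original answer; only the `-I dir` build option itself is standard OpenCL. The helper merely assembles the options string that would be passed to `program.build(...)`, and it assumes the directory paths contain no spaces.

```cpp
#include <string>
#include <vector>

// Hypothetical layout (OpenCL C, shown here only as comments):
//
//   c:/Includes/helpers.h
//       #ifndef HELPERS_H
//       #define HELPERS_H
//       float clamp01(float x) { return clamp(x, 0.0f, 1.0f); }
//       #endif
//
//   kernel_a.cl
//       #include "helpers.h"
//       __kernel void scale(__global float* data) {
//           size_t i = get_global_id(0);
//           data[i] = clamp01(data[i]);
//       }

// Build the options string for program.build(): one "-I <dir>" per directory.
// Assumption: directory paths contain no spaces.
std::string buildIncludeOptions(const std::vector<std::string>& dirs) {
    std::string options;
    for (const std::string& dir : dirs) {
        if (!options.empty()) options += ' ';
        options += "-I " + dir;
    }
    return options;
}
```

With this helper, the build call above could be written as `program.build(oclHandles.devices, buildIncludeOptions({"c:/Includes/"}).c_str());`. Since each kernel lives in its own program, every program that includes `helpers.h` compiles its own copy of the helper functions.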

Related

How can I associate my NVRTC program source with a file?

I'm using NVRTC to compile a kernel. The relevant API call is:
nvrtcResult nvrtcCreateProgram (
    nvrtcProgram* prog,
    const char* src,
    const char* name,
    int numHeaders,
    const char** headers,
    const char** includeNames )
As you can see, the source is a raw string, and not associated with a file. That means that when you compile with --generate-line-info, you get line numbers, but no related filename. And that means that if you then use, say, NSight Compute, you won't be able to see your kernel source code.
Obviously, neither NSight Compute itself, nor NVRTC itself, can figure out that the raw source is mirrored in some file. But there has to be some way to get around this:
Perhaps I'm missing something in the NVRTC API which can make the source <-> file association?
Perhaps we can manipulate the resulting compiled program (reasonably, not manually, or write-my-own-new-API) to make the association?
Perhaps we can shove the source code into the compiled program somehow?
Here's my initial workaround:
Place your source in a file, say my_kernel.cuh.
Create the string:
#include "my_kernel.cuh"
Compile just this string using NVRTC
Now, NVRTC is able to associate included files' sources with the files, so it's only a stub that will be missing in terms of source<->file association.
Caveat: You will need to be careful about paths - NVRTC's include paths, the working directory from which you invoke your program vs the directory of the source file etc.
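The workaround above can be sketched roughly as follows. The helper `makeIncludeStub`, the program name `stub.cu`, and the include directory are hypothetical; the NVRTC calls appear only in comments because they need the CUDA toolkit, and they are meant as an illustration rather than a tested recipe.

```cpp
#include <string>

// Build the one-line stub that NVRTC compiles in place of the real source.
// headerName is resolved through NVRTC's include paths, so the file must be
// reachable from one of them.
std::string makeIncludeStub(const std::string& headerName) {
    return "#include \"" + headerName + "\"\n";
}

// Hypothetical use with NVRTC (not compiled here; needs <nvrtc.h> and the
// NVRTC library):
//
//   std::string stub = makeIncludeStub("my_kernel.cuh");
//   nvrtcProgram prog;
//   nvrtcCreateProgram(&prog, stub.c_str(), "stub.cu", 0, nullptr, nullptr);
//   const char* opts[] = { "--include-path=/path/to/kernels",  // hypothetical directory
//                          "--generate-line-info" };
//   nvrtcCompileProgram(prog, 2, opts);
```

Only the stub itself then lacks a backing file; the kernel code is attributed to `my_kernel.cuh`.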
It seems NVRTC does provide a default filename, such that if you place your source in a file with that name, NSight Compute may be able to find it.
The name is the one you passed to nvrtcCreateProgram() as the name argument.
So, if your kernel function (i.e. your __global__ function) is in my_kernel.cuh, and you place this file in the working directory of the profiled program (which you tell NSight Compute about), or in one of the include directories you built your program with, you'll be able to read your source. If the original file's own directory is also one of the include directories, then you're in luck and you don't even have to make a copy.

Reordering functions in gcc assembly

I am writing a program which encrypts/decrypts itself in memory and then writes the .text memory region to a copy of the executable so I can change the encryption key each time.
This is mainly for a challenge as I am not great with C, and I'm incorporating parts in assembly as well.
My system is x86_64 Linux, but I'm compiling with -m32.
I am also using -nostartfiles (with gcc) so that I can write my own _start function. This function is written in assembly and this decrypts/encrypts the rest of the .text section. My problem is that the external functions are being compiled in the wrong order, such that when I try to dump the memory after it has been encrypted it calls an encrypted function which therefore doesn't work.
This is the current order of the functions:
some from -static
my functions which are in the correct order (assembly functions and then the ones from the main C file)
some more from -static
This doesn't work because the assembly encrypts from the main C file 'downwards', also encrypting some -static functions which are needed by the assembly functions.
This is the order I would like the functions to be in:
all -static functions & anything from an #include <>
functions from the .S assembly file (the whole .S in order)
functions from the .c main file (the whole .c in order)
any non-standard includes for the .c main file (ie not stdio.h etc, things from #include "")
Is there any way, short of manually mangling the ELF file, for me to reorder these functions so that the functions I need are not encrypted while the ones I want encrypted can easily be encrypted?
Edit: when compiling with musl (an alternative libc), I can get all of my functions at the start, with the rest of the static functions following. However, this is still the wrong way around.
The "wrong" order of functions inside the binary comes from optimization efforts of the compiler. Functions that are used often (or often together) are placed near each other, so that calling them is less likely to cause a page fault.
You can turn off part of these optimizations with the flag -fno-toplevel-reorder. You can also use the attribute section to order only a subset of functions together (eg to encrypt them) or you can write your own linker scripts.
See also this question.
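As a minimal sketch of the section-attribute approach, assuming GCC or Clang on an ELF target linked with GNU ld or lld: the section name `enc_text` and all function names below are hypothetical. Because the section name is a valid C identifier, the linker defines `__start_enc_text` and `__stop_enc_text` around it, which gives the range to encrypt without hard-coding addresses.

```cpp
#include <cstdint>

// Hypothetical section name. A plain C identifier is used so that the linker
// auto-defines __start_enc_text / __stop_enc_text around the section.
#define ENC __attribute__((section("enc_text"), noinline, used))

extern "C" {
extern const char __start_enc_text[];
extern const char __stop_enc_text[];
}

// Functions that should be encrypted are grouped into enc_text.
ENC int secret_add(int a, int b) { return a + b; }
ENC int secret_mul(int a, int b) { return a * b; }

// Ordinary function: stays in the default .text section.
int plain_helper(int a) { return a + 1; }

// True if fn's entry address lies inside the enc_text section.
template <typename Fn>
bool in_encrypted_range(Fn* fn) {
    auto p  = reinterpret_cast<std::uintptr_t>(fn);
    auto lo = reinterpret_cast<std::uintptr_t>(__start_enc_text);
    auto hi = reinterpret_cast<std::uintptr_t>(__stop_enc_text);
    return p >= lo && p < hi;
}
```

This only groups the marked functions together; where the group ends up relative to libc code is still decided by the linker, which is where a custom linker script would come in.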

Is there a way to strip all functions from an object file that I am not using?

I am trying to save space in my executable and I noticed that several functions are being added into my object files, even though I never call them (the code is from a library).
Is there a way to tell gcc to remove these functions automatically or do I need to remove them manually?
If you are compiling into object files (not executables), then a compiler will never remove any non-static functions, since it's always possible you will link the object file against another object file that will call that function. So your first step should be declaring as many functions as possible static.
Secondly, the only way for a compiler to remove any unused functions would be to statically link your executable. In that case, there is at least the possibility that a program might come along and figure out what functions are used and which ones are not used.
The catch is, I don't believe that gcc actually does this type of cross-module optimization. Your best bet is the -Os flag to optimize for code size, but even then, if you have an object file abc.o which has some unused non-static functions and you link statically against some executable def.exe, I don't believe that gcc will go and strip out the code for the unused functions.
If you truly desperately need this to be done, I think you might have to actually #include the files together so that after the preprocessor pass, it results in a single .c file being compiled. With gcc compiling a single monstrous jumbo source file, you stand the best chance of unused functions being eliminated.
Have you looked into calling gcc with -Os (optimize for size)? I'm not sure if it strips unreached code, but it would be simple enough to test. You could also, after getting your executable back, 'strip' it. I'm sure there's a gcc command-line arg to do the same thing. Is it --dead_strip?
In addition to -Os to optimize for size, this link may be of help.
Since I asked this question, GCC 4.5 was released which includes an option to combine all files so it looks like it is just 1 gigantic source file. Using that option, it is possible to easily strip out the unused functions.
More details here
IIRC the linker by default does what you want in some specific cases. The short of it is that library files contain a bunch of object files and only referenced files are linked in. If you can figure out how to get GCC to emit each function into its own object file and then build this into a library, you should get what you are looking for.
I only know of one compiler that can actually do this: here (look at the -lib flag)

Is there any utility to find the size of a compiled function in an executable?

I want a report showing me the size of the different (compiled) symbols in the executable. Something like .map files in Delphi, but generic if possible. nm from binutils shows the start address(?); maybe I could use that information?
(I'm using object pascal + freepascal compiler)
FPC/LD can generate map files too.
There are various ways to analyze .o files (nm, objdump, and parsing the address increments between sections).
Maybe the information is stored in the .ppu; have a look at the ppu unit (in the compiler directory), which contains the .ppu loaders.
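One reading of the "address increments" suggestion is to sort the start addresses that nm reports and treat the gap to the next address as the size. The sketch below assumes the nm output has already been parsed into (address, name) pairs; the `Symbol` type and `estimateSizes` are hypothetical names, not part of any tool.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct Symbol {
    std::uint64_t address;  // start address as printed by nm
    std::string name;
};

// Estimate each symbol's size as the gap to the next-higher start address.
// The last symbol is measured against sectionEnd (end of the containing section).
std::map<std::string, std::uint64_t>
estimateSizes(std::vector<Symbol> symbols, std::uint64_t sectionEnd) {
    std::sort(symbols.begin(), symbols.end(),
              [](const Symbol& a, const Symbol& b) { return a.address < b.address; });
    std::map<std::string, std::uint64_t> sizes;
    for (std::size_t i = 0; i < symbols.size(); ++i) {
        std::uint64_t next =
            (i + 1 < symbols.size()) ? symbols[i + 1].address : sectionEnd;
        sizes[symbols[i].name] = next - symbols[i].address;
    }
    return sizes;
}
```

The estimate includes any alignment padding after a function, and symbols that share an address come out with size 0, so it is an upper bound rather than an exact figure.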

Size of a library and the executable

I have a static library (*.lib) created using MSVC on Windows. The size of the library is, say, 70KB. Then I have an application which links this library. But the size of the final executable (*.exe) is 29KB, less than the library. What I want to know is:
Since the library is statically linked, I was thinking it should add directly to the executable size and the final exe size should be more than that? Does the Windows exe format also do some compression of the binary data?
How is it on Linux systems, that is, how do the sizes of libraries on Linux (*.a/*.la files) relate to the size of a Linux executable (*.out)?
-AD
A static library on both Windows and Unix is a collection of .obj/.o files. The linker looks at each of these object files and determines if it is needed for the program to link. If it isn't needed, then the object file won't get included in the final executable. This can lead to executables that are smaller than the library.
EDIT: As MSalters points out, on Windows the VC++ compiler now supports generating object files that enable function-level linking, e.g., see here. In fact, edit-and-continue requires this, since the edit-and-continue needs to be able to replace the smallest possible part of the executable.
There is additional bookkeeping information in the .lib file that is not needed for the final executable. This information helps the linker find the code to actually link. Also, debug information may be stored in the .lib file but not in the .exe file (I don't recall where debug info is stored for objs in a lib file, it might be somewhere else).
The static library probably contains several functions which are never used. When the linker links the library with the main executable and sees that certain functions are never used (and that their addresses are never taken and stored in function pointers), it just throws away the code. It can also do this recursively: if function A() is never called, and A() calls B(), but B() is never otherwise called, it can remove the code for both A() and B(). On Linux, the same thing happens.
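The recursive removal described above amounts to a reachability walk over the call graph. The following is a simplified model, not how any particular linker is implemented; `CallGraph` and `reachableFunctions` are hypothetical names, and function pointers are ignored.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each function to the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Return every function reachable from the given roots by following calls.
// Anything not in the result could be dropped from the executable.
std::set<std::string> reachableFunctions(const CallGraph& calls,
                                         const std::vector<std::string>& roots) {
    std::set<std::string> kept;
    std::vector<std::string> pending(roots.begin(), roots.end());
    while (!pending.empty()) {
        std::string fn = pending.back();
        pending.pop_back();
        if (!kept.insert(fn).second) continue;  // already visited
        auto it = calls.find(fn);
        if (it == calls.end()) continue;        // not in the graph: calls nothing
        for (const std::string& callee : it->second) pending.push_back(callee);
    }
    return kept;
}
```

With a graph where `main` calls `C` and `A` calls `B`, starting from `main` keeps only `main` and `C`; both `A` and `B` are discarded, matching the A()/B() example.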
A static library has to contain every symbol defined in its source code, because it might get linked into an executable which needs just that specific symbol. But once it is linked into an executable, we know exactly which symbols end up being used, and which ones don't. So the linker can trivially remove unused code, trimming the file size by a lot. Similarly, any duplicate symbols (anything that's defined in both the static library and the executable it's linked into) get merged into a single instance.
Disclaimer: It's been a long time since I dealt with static linking, so take my answer with a grain of salt.
You wrote: I was thinking it should add directly to the executable size and final exe size should be more than that?
Naive linkers work exactly this way - back when I was doing hobby development for CP/M systems (a LONG time ago), this was a real problem.
Modern linkers are smarter, however - they only link in the functions referenced by the original code, or as required.
In addition to the current answers, the linker is allowed to remove function definitions if they have identical object code; this is intended to help reduce the bloating effects of templated code.
@All: Thanks for the pointers.
@Greg Hewgill: Your answer was a good pointer. Thanks.
The answer I found was as follows:
1.) When building the library, if the option "Keep Program debug database" in MSVC (or something like it) is ON, then the library will contain this debug info, bloating its size.
But when I statically link that library and create an executable, the linker strips all that debug info from the library before generating the exe, and hence the exe is smaller than the library.
2.) When I disabled the option "Keep Program debug database", I got a library whose size was smaller than the final executable, which is what I thought is normal in most situations.
-AD
