GNU C++ import name mangling [duplicate] - gcc

I came across an interesting error when I was trying to link to an MSVC-compiled library using MinGW while working in Qt Creator. The linker complained of a missing symbol that went like _imp_FunctionName. When I realized That it was due to a missing extern "C", and fixed it, I also ran the MSVC compiler with /FAcs to see what the symbols are. Turns out, it was __imp_FunctionName (which is also the way I've read on MSDN and quite a few guru bloggers' sites).
I'm thoroughly confused about how the MinGW linker complains about a symbol beginning with _imp, but is able to find it nicely although it begins with __imp. Can a deep compiler magician shed some light on this? I used Visual Studio 2010.

This is fairly straight-forward identifier decoration at work. The imp_ prefix is auto-generated by the compiler, it exports a function pointer that allows optimizing binding to DLL exports. By language rules, the imp_ is prefixed by a leading underscore, required since it lives in the global namespace and is generated by the implementation and doesn't otherwise appear in the source code. So you get _imp_.
Next thing that happens is that the compiler decorates identifiers to allow the linker to catch declaration mis-matches. Pretty important because the compiler cannot diagnose declaration mismatches across modules and diagnosing them yourself at runtime is very painful.
First there's C++ decoration, a very involved scheme that supports function overloads. It generates pretty bizarre looking names, usually including lots of ? and # characters with extra characters for the argument and return types so that overloads are unambiguous. Then there's decoration for C identifiers, they are based on the calling convention. A cdecl function has a single leading underscore, an stdcall function has a leading underscore and a trailing #n that permits diagnosing argument declaration mismatches before they imbalance the stack. The C decoration is absent in 64-bit code, there is (blessfully) only one calling convention.
So you got the linker error because you forgot to specify C linkage, the linker was asked to match the heavily decorated C++ name with the mildly decorated C name. You then fixed it with extern "C", now you got the single added underscore for cdecl, turning _imp_ into __imp_.

Related

MinGW's gcc reports error where Cygwin's accepts

MingGW's gcc (4.8.1) reports the following error (and more to come) when I try to compile Expstack.c:
parser.h:168:20: error: field '__p__environ' declared as a function
struct term *environ;
where 'environ' is declared inside 'struct term{ ... }'. In unistd.h you find
char **environ
but nowhere a '__p__environ'.
Some other fields are declared likewise, but are accepted. Subsequent errors related to environ are reported as follows
Expstack.c:1170:38: error: expected identifier before '(' token
case Term_src: return e->item.src->environ;
^
Cygwin's gcc (4.8.3) accepts these constructs and has done so over various versions since
2006 at least, and gcc with various versions of Linux before that.
The source text uses CRLF despite my attempts to convert to DOS, and this is my only guess for an explanation.
I'll appreciate clues or ideas to fix the problem, but renaming the field is not especially attractive and ought to be totally irrelevant.
This is very unlikely to have to do with CR/LF.
The name ought to be irrelevant but it isn't: this one relates to the Windows integration that MinGW does and Cygwin does not, as alluded to in http://sourceforge.net/p/mingw/mailman/message/14901207/ (that person is trying to use an extern environ that it expects to be defined by the system. Clearly, the fashion in which MinGW developers have made this variable available breaks the use of the name as a struct member).
You should report this as a MinGW bug. Unpleasant as it may be, in the interim, renaming the member is the simplest workaround. It is possible that a well-placed #undef environ might help, but no guarantees.

why does gcc(default version on openSUSE 11.3) give an error on the statement int *p=malloc(sizeof(int));?

malloc returns a void pointer.so why is it not working for me without typecasting the return value?
The error pretty clear said that gcc is not allowing conversion from void* to int*.
In C, you don't have to cast. In fact it's a bad idea to cast there since it can cause certain subtle errors.
However, casting is required in C++ so that would be my first guess, that you're somehow invoking the C++ compiler. Perhaps your source files are *.cpp or *.C both of which may be auto-magigically treated as C++ rather than C.
See here for more detail:
C++ source files conventionally use one of the suffixes ‘.C’, ‘.cc’, ‘.cpp’, ‘.CPP’, ‘.c++’, ‘.cp’, or ‘.cxx’; C++ header files often use ‘.hh’, ‘.hpp’, ‘.H’, or (for shared template code) ‘.tcc’; and preprocessed C++ files use the suffix ‘.ii’. GCC recognizes files with these names and compiles them as C++ programs even if you call the compiler the same way as for compiling C programs (usually with the name gcc).
The fact that it knows you're trying to convert void* to int* means that you have a valid malloc prototype in place so I can't see it being anything other than the imposition of C++ rules.
Without code I can't help you properly, but you can try this:
p = (int*)malloc(sizeof(int));
Give more info about what you want to do and what you are allocating.

Using DLLs with NASM

I have been doing some x86 programming in Windows with NASM and I have run into some confusion. I am confused as to why I must do this:
extern _ExitProcess#4
Specifically I am confused about the '_' and the '#4'. I know that the '#4' is the size of the stack but why is it needed? When I looked in the kernel32.dll with a hex editor I only saw 'ExitProcess' not '_ExitProcess#4'.
I am also confused as to why C Functions do not need the underscore and the stack size such as this:
extern printf
Why don't C Functions need decorations?
My third question is "Is this the way I should be using these functions?" Right now I am linking with the actual dll files themselves.
I know that the '#4' is the size of the stack but why is it needed?
To enable the linker to report a fatal error if your compiler assumed the wrong calling convention for the function (this can happen if you forget to include header files in C and ignore all the compiler warnings or if a declaration doesn't exactly match the function in the shared library).
Why don't C Functions need decorations?
Functions that use the cdecl calling convention are decorated with a single leading (so it would actually be _printf).
The reason why no parameter size is encoded into the decorated name is that the caller is responsible for both setting up and tearing down the stack, so an argument count mismatch will not be fatal for the stack setup (though the calling function might still crash if it isn't given the right arguments, of course). It might even be possible that the argument count is variable, like in the case of printf.
When I looked in the kernel32.dll with a hex editor I only saw ExitProcess not _ExitProcess#4.
The mangled names are usually mapped to the actual exported names of the DLL using definition files (*.def), which then get compiled to *.lib import library files that can be used in your linker invocation. An example of such a definition file for kernel32.dll is this one. The following line defines the mapping for ExitProcess:
_ExitProcess#4 = ExitProcess
Is this the way I should be using these functions?
I don't know NASM very well, but the code I've seen so far usually specifies the decorated name, like in your example.
You can find more information on this excellent page about Win32 calling conventions.

Debugging vtable Linker Errors in GCC

Now and then when using GCC I get cryptic errors like this:
undefined reference to 'vtable for classname'
When it's not caused by a missing library, this not-very-descriptive error message always causes me to dig through code files line by line to find the missing implementation for a virtual function. Is there a way to make the linker tell me which virtual function it is missing, perhaps a flag or something? Or is it maybe telling me but I don't understand what it's saying?
From gcc faq:
When building C++, the linker says my
constructors, destructors or virtual
tables are undefined, but I defined
them
The ISO C++ Standard specifies that
all virtual methods of a class that
are not pure-virtual must be defined,
but does not require any diagnostic
for violations of this rule
[class.virtual]/8. Based on this
assumption, GCC will only emit the
implicitly defined constructors, the
assignment operator, the destructor
and the virtual table of a class in
the translation unit that defines its
first such non-inline method.
Therefore, if you fail to define this
particular method, the linker may
complain about the lack of definitions
for apparently unrelated symbols.
Unfortunately, in order to improve
this error message, it might be
necessary to change the linker, and
this can't always be done.
The solution is to ensure that all
virtual methods that are not pure are
defined. Note that a destructor must
be defined even if it is declared
pure-virtual [class.dtor]/7.
The solution that I adopt is search the classname and seek virtual methods declaration and check if there is any definition. I didn't found any other solution for this.

GCC 4.7 fails to inline with message "function body not available"

I am trying to compile some legacy code with more modern toolchains. I have tracked down one of my issues to the switch from gcc 4.6 to gcc 4.7:
Some of the functions are annotated with the inline keyword. Gcc fails on these with the error message:
error: inlining failed in call to always_inline 'get_value_global': function body not available
What is the correct way of dealing with this issue? How could the function body not be available? Should the compiler not make sure that it is available in all situations that require it?
Edit
As requested (in a deleted comment), an example of a signature of a function resulting in the error:
inline struct a_value_fmt const *find_a_value_format(struct base_fmt *base)
{
/* the code */
}
That error is typical to inline functions declared in source files, rather than in header files, in which case the compiler is not able to inline them (as the code of the function to be inlined must be visible to the compiler in the same source file being compiled). So, first thing I would check is that all functions declared inline are indeed defined in header files.
It may be that a change in GCC diagnostics in 4.7 caused the error to surface, and that it went silent in GCC 4.6 (but that's just a speculation).
The quoted error indicates that the function is declared with __attribute__((always_inline)). Note that GCC may fail to inline and report a different (and quite obscure) error if function is declared always_inline, but not with the inline keyword - so make sure that any function declared as always_inline is also declared as inline.
Few more tips:
General advice, which may not be applicable: since this is a legacy codebase, you may want to re-evaluate which functions should be inlined, being on the critical path, and which aren't, based on updated profiling results. Sometimes, inline is used generously, even when it is not required, or redundant. In some cases, the fix may be to remove the inline keyword in places where it is not needed.
When functions are declared in header files, the compiler considers them for inlining automatically (given they are small enough, and the compiler thinks that inlining them will improve performance, based on its heuristics) - even when the inline keyword is not used. inline is sort of a "recommendation" to the compiler, and it doesn't have to obey (unless it is given along with the always_inline attribute).
Modern compilers make relatively smart inlining decisions, so it's usually best to let the compiler do it's thing, and declare functions as inline (and moving their implementations to header files) in the appliation hot spots, based on profiling results.

Resources