Strip specific symbols from DLL - visual-studio-2010

I have created a Win32 DLL using MSVC 2010 that contains unwanted exported C++ symbols. I'm using a module definition file (.def) and the __stdcall convention for the specific functions that I want to export. However, because I am also using Boost Serialization, there are a ton of exported C++ symbols from Boost. Boost exports these symbols for the following reason (found here):
I am using boost::serialization from 1.44.0. One thing that I noticed
is that linking statically to the serialization libs will add several
hundred exports in the final exe file that I get. Using dumpbin
/exports my_program.exe
These functions are not explicitly called from the library. But they
ARE called as part of the serialization process. It's just that MSVC
doesn't see them. So when you compile for release, The MSVC Linker
strips them out and the program won't work anymore. In order to work
around this, these functions are explicitly exported. This prevents
MSVC from stripping them out. For more information see
force_include.hpp
Exported symbols (excerpt):
class boost::archive::detail::extra_detail::map<class boost::archive::binary_oarchive> & boost::serialization::singleton<class boost::archive::detail::extra_detail::map<class boost::archive::binary_oarchive> >::get_instance(void)'::`2'::`local static guard'{2}'
You can recreate the situation by creating a DLL project and include Boost (link against libboost_serialization-vc100-mt-gd-1_55.lib):
#include <boost/archive/binary_oarchive.hpp>
#include <fstream>
extern "C" int __stdcall test();
int __stdcall test() {
std::fstream stream;
boost::archive::binary_oarchive o(stream, boost::archive::no_header);
return 1;
}
I tested the GNU utility strip from binutils. However, it seems it always removes all symbols. E.g. using this command
strip --strip-symbol=test DllBoostTest.dll -o test.dll
This simple test does not work. It should remove only the test symbol. Unfortunately, it also removes all symbols. Also using wildcards and -N does not work as it removes all exports, too.
So is there a way to remove all unwanted boost C++ symbols? Say, remove all symbols with "boost" text in it?
If you need more information, I'm happy to provide it.
Note: This is not about debugging or PDB files!

This is quite hard to fix cleanly. A true fix would be to take the sting out of the Boost hack that forces these symbols to be included. That would mean removing the __declspec(dllexport) attribute and then either using the /OPT:NOREF linker option to suppress the optimization or using /INCLUDE (or #pragma comment) to ensure that the symbols are kept. This, however, requires rebuilding the Boost library and is yucky maintenance because the mangled names are unpredictable. So you probably won't like that option; the Boost team clearly didn't.
I don't think trying to hack strip is going to get you anywhere: the linker must still see the symbols so that it doesn't optimize them away. You could only do this after building the DLL, which requires rewriting the export table in the file. This is technically possible but not easy to do.
One possibility is to prevent these names from being visible. The DEF file gives you that option: you can use the NONAME attribute to export a symbol by ordinal only, so its name does not appear in the DLL's export table, and the PRIVATE attribute to keep the name out of the import library. Make it look like this:
EXPORTS
??_B?1??get_instance@?$singleton@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@serialization@boost@@CAAAV?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@3@XZ@51 @1 NONAME PRIVATE
??_B?1??get_instance@?$singleton@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@serialization@boost@@CAAAV?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@XZ@51 @2 NONAME PRIVATE
?get_const_instance@?$singleton@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@serialization@boost@@SAABV?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@3@XZ @3 NONAME PRIVATE
?get_const_instance@?$singleton@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@serialization@boost@@SAABV?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@XZ @4 NONAME PRIVATE
?get_instance@?$singleton@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@serialization@boost@@CAAAV?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@3@XZ @5 NONAME PRIVATE
?get_instance@?$singleton@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@serialization@boost@@CAAAV?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@XZ @6 NONAME PRIVATE
?get_mutable_instance@?$singleton@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@serialization@boost@@SAAAV?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@3@XZ @7 NONAME PRIVATE
?get_mutable_instance@?$singleton@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@serialization@boost@@SAAAV?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@XZ @8 NONAME PRIVATE
?instance@?$singleton@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@serialization@boost@@0AAV?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@3@A @9 NONAME PRIVATE
?instance@?$singleton@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@serialization@boost@@0AAV?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@A @10 NONAME PRIVATE
?is_destroyed@?$singleton@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@serialization@boost@@SA_NXZ @11 NONAME PRIVATE
?is_destroyed@?$singleton@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@serialization@boost@@SA_NXZ @12 NONAME PRIVATE
?t@?1??get_instance@?$singleton@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@serialization@boost@@CAAAV?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@4@XZ@4V?$singleton_wrapper@V?$map@Vbinary_oarchive@archive@boost@@@extra_detail@detail@archive@boost@@@734@A @13 NONAME PRIVATE
?t@?1??get_instance@?$singleton@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@serialization@boost@@CAAAV?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@XZ@4V?$singleton_wrapper@V?$multiset@PBVextended_type_info@serialization@boost@@Ukey_compare@detail@23@V?$allocator@PBVextended_type_info@serialization@boost@@@std@@@std@@@detail@34@A @14 NONAME PRIVATE
_test@0 = _test@0
You will get LNK4197 warnings because the linker sees two export requests, one from the __declspec(dllexport) and another from the DEF file. These warnings are benign and you can ignore them. Note that you may have to tweak these names; I tested this with VS2012 and Boost version 1.53.
After deleting the PDB file (don't forget), the exports look like this:
ordinal hint RVA      name
     15    0 0001582F _test@0
      1      0004DE94 [NONAME]
      2      0004DEB8 [NONAME]
      3      0001524E [NONAME]
      4      000153A2 [NONAME]
      5      00015AA5 [NONAME]
      6      00015460 [NONAME]
      7      000154E7 [NONAME]
      8      00016199 [NONAME]
      9      0004DE7C [NONAME]
     10      0004DEA0 [NONAME]
     11      00015AFF [NONAME]
     12      00015B9F [NONAME]
     13      0004DE84 [NONAME]
     14      0004DEA8 [NONAME]

With MSVC 2010 you may not have any options that will work for you; in VS2012/2013, however, you have the option of using pdbcopy.exe.
From the help, you may find what you need:
PDBCopy v11.00.50307
usage: PDBCopy <source_pdb> <destination_pdb> [-p] [-s] [-f] [-F] [-a] [-A] [-?]
[-p] remove private debug information
[-s] create new signature
[-f:{@file|symbol}] filter specific public symbols out of stripped pdb
[-F:{@file|symbol}] leave only specific public symbols in stripped pdb
[-a] leave all annotation symbols in stripped pdb
[-a:{@file|symbol}] filter specific annotation symbols out of stripped pdb
[-A:{@file|symbol}] leave only specific annotation symbols in stripped pdb
[-?] display this message

Related

Does "-Wl,-soname" work on MinGW or is there an equivalent?

I'm experimenting a bit with building DLLs on windows using MINGW.
A very good summary (in my opinion) can be found at:
https://www.transmissionzero.co.uk/computing/building-dlls-with-mingw/
There is even a basic project which can be used for the purpose of this discussion:
https://github.com/TransmissionZero/MinGW-DLL-Example/releases/tag/rel%2Fv1.1
Note there is a cosmetic mistake in this project which will make it fail out of the box: the Makefile does not create an "obj" directory - Either adjust the Makefile or create it manually.
So here is the real question.
How to change the Windows DLL name so it differs from the actual DLL file name ??
Essentially I'm trying to achieve on Windows, the effect which is very well described here on Linux:
https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
Initially I tried changing "InternalName" and "OriginalFilename" in the resource file used to create the DLL, but that does not work.
In a second step, I tried adding "-Wl,-soname,SoName.dll" on the command that performs the final link, to change the Windows DLL name.
However, that does not seem to have the expected effect (I'm using MingW 7.3.0, x86_64-posix-seh-rev0).
Two things makes me say that:
1/ The test executable still works (I would expect it to fail, because it tries to locate SoName.dll but can't find it).
2/ "pexports.exe AddLib.dll" produces the output below, where the library name hasn't changed:
LIBRARY "AddLib.dll"
EXPORTS
Add
bar DATA
foo DATA
Am I doing anything wrong ? Are my expectations wrong perhaps ?
Thanks for your help !
David
First of all, I would like to say it's important to use either a .def file for specifying the exported symbols or use __declspec(dllexport) / __declspec(dllimport), but never mix these two methods. There is also another method using the -Wl,--export-all-symbols linker flag, but I think that's ugly and should only be used when quick and dirty is what you want.
It is possible to tell MinGW to use a DLL filename that does not match the library name. In the link step use -o to specify the DLL and use -Wl,--out-implib, to specify the library file.
Let me illustrate by showing how to build chebyshev as both a static and a shared library. Its sources consist of only 2 files: chebyshev.h and chebyshev.c.
1. Compile
gcc -c -o chebyshev.o chebyshev.c -I. -O3
2. Create static library
ar cr libchebyshev.a chebyshev.o
3. Create a .def file (as it wasn't supplied and __declspec(dllexport) / __declspec(dllimport) wasn't used either). Note that this file doesn't contain a line with LIBRARY, which allows the linker to specify the DLL filename later.
There are several ways to do this if the .def file wasn't supplied by the project:
3.1. Get the symbols from the .h file(s). This may be hard, as sometimes you need to distinguish, for example, between type definitions (like typedef, enum, struct) and actual functions and variables that need to be exported:
echo "EXPORTS" > chebyshev.def
sed -n -e "s/^.* \**\(chebyshev_.*\) *(.*$/\1/p" chebyshev.h >> chebyshev.def
3.2. Use nm to list symbols in the library file and filter out the type of symbols you need.
echo "EXPORTS" > chebyshev.def
nm -f posix --defined-only -p libchebyshev.a | sed -n -e "s/^_*\([^ ]*\) T .*$/\1/p" >> chebyshev.def
4. Link the static library into the shared library.
gcc -shared -s -mwindows -def chebyshev.def -o chebyshev-0.dll -Wl,--out-implib,libchebyshev.dll.a libchebyshev.a
If you have a project that uses __declspec(dllexport) / __declspec(dllimport) things are a lot easier. And you can even have the link step generate a .def file using the -Wl,--output-def, linker flag like this:
gcc -shared -s -mwindows -o myproject.dll -Wl,--out-implib,myproject.dll.a -Wl,--output-def,myproject.def myproject.o
This answer is based on my experiences with C. For C++ you really should use __declspec(dllexport) / __declspec(dllimport).
I believe I have found one mechanism to achieve on Windows, the effect described for Linux in https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
This involves dlltool.
In the example Makefile there was originally this line:
gcc -o AddLib.dll obj/add.o obj/resource.o -shared -s -Wl,--subsystem,windows,--out-implib,libaddlib.a
I simply replaced it with the 2 lines below instead:
dlltool -e obj/exports.o --dllname soname.dll -l libAddLib.a obj/resource.o obj/add.o
gcc -o AddLib.dll obj/resource.o obj/add.o obj/exports.o -shared -s -Wl,--subsystem,windows
Really, the key seems to be the creation with dlltool of an exports file in conjunction with dllname. This exports file is linked with the object files that make up the body of the DLL and it handles the interface between the DLL and the outside world. Note that dlltool also creates the "import library" at the same time
Now I get the expected effect, and I can see that the "Internal DLL name" (not sure what the correct terminology is) has changed:
First evidence:
>> dlltool.exe -I libAddLib.a
soname.dll
Second evidence:
>> pexports.exe AddLib.dll
LIBRARY "soname.dll"
EXPORTS
Add
bar DATA
foo DATA
Third evidence:
>> AddTest.exe
Error: the code execution cannot proceed because soname.dll was not found.
Although the desired effect is achieved, this still seems to be some sort of workaround. My understanding (but I could well be wrong) is that the gcc option "-Wl,-soname" should achieve exactly the same thing. At least it does on Linux, but is this broken on Windows perhaps ??

Getting "cannot find symbol .... while executing load ..." error when trying to run Hello World as a C extension (dll) example

I have used the C code from the following verbatim: https://wiki.tcl-lang.org/page/Hello+World+as+a+C+extension
/*
* hello.c -- A minimal Tcl C extension.
*/
#include <tcl.h>
static int
Hello_Cmd(ClientData cdata, Tcl_Interp *interp, int objc, Tcl_Obj *const objv[])
{
Tcl_SetObjResult(interp, Tcl_NewStringObj("Hello, World!", -1));
return TCL_OK;
}
/*
* Hello_Init -- Called when Tcl loads your extension.
*/
int DLLEXPORT
Hello_Init(Tcl_Interp *interp)
{
if (Tcl_InitStubs(interp, TCL_VERSION, 0) == NULL) {
return TCL_ERROR;
}
/* changed this to check for an error - GPS */
if (Tcl_PkgProvide(interp, "Hello", "1.0") == TCL_ERROR) {
return TCL_ERROR;
}
Tcl_CreateObjCommand(interp, "hello", Hello_Cmd, NULL, NULL);
return TCL_OK;
}
My command for compiling is nearly verbatim except for the last character, indicating Tcl version 8.6 rather than 8.4, and it compiles without error:
gcc -shared -o hello.dll -DUSE_TCL_STUBS -I$TCLINC -L$TCLLIB -ltclstub86
Then I created the following Tcl program:
load hello.dll Hello
puts "got here"
But when running it with tclsh get the following error:
cannot find symbol "Hello_Init"
while executing
"load ./hello.dll Hello"
(file "hello.tcl" line 1)
So I am essentially following a couple of suggestions from Donal Fellows' answer here: cannot find symbol "Embeddedrcall_Init". The OP there, however, commented that, like me, the suggestion(s) hadn't resolved their issue. One thing that I didn't try from that answer was "You should have an exported (extern "C") function symbol in your library" -- could that be the difference maker? Shouldn't it have been in the example all along then?
At the suggestion of somebody on comp.lang.tcl I found "DLL Export Viewer" but when I run it against the DLL it reports 0 functions found :( What am I doing wrong?
Could it be an issue with MinGW/gcc on Windows, and I need to bite the bullet and do this with Visual Studio? That's overkill I'd like to avoid if possible.
The core of the problem is that your function Hello_Init is not ending up in the global symbol table exported by the resulting DLL. (Some linkers would put such things in as _Hello_Init instead of Hello_Init; Tcl adapts to them transparently.) The symbol must be there for Tcl's load command to work: without it, there's simply no consistent way to tell your extension code what the Tcl_Interp context handle is (which allows it to make commands, variables, etc.)
(If you'd been working with C++, one possible problem would be a missing extern "C", whose actual meaning is to turn off name mangling. That's probably not the problem here.)
Since you are on Windows — going by the symbols in your DLL, such as EnterCriticalSection and GetLastError — the problem is probably linked to exactly how you are linking. I'm guessing that Tcl is defining your function to have __declspec(dllexport) (assuming you've not defined STATIC_BUILD, which absolutely should not be used when building a DLL) and yet that's not getting respected. Assuming you're using a modern-enough version of GCC… which you probably are.
I'm also going through the process of learning how to build Tcl extensions in C and had exactly the same problem when working through this same example using Tcl 8.6.
That is, I was compiling using MinGW GCC (64-bit), and used the following:
gcc -shared -o hello.dll -DUSE_TCL_STUBS "-IC:\\ActiveTcl\\include" "-LC:\\ActiveTcl\\lib" -ltclstub86
And like the OP I got no compile error, but when loading the dll at a tclsh prompt tcl complained :
'cannot find symbol "Hello_Init"'
I can't say that I understand, but I was able to find a solution that works thanks to some trial and error, and some information on the tcl wiki here
https://wiki.tcl-lang.org/page/Building+Tcl+DLL%27s+for+Windows
In my case I had to adjust the compiler statement to the following
gcc -shared -o hello.dll hello.c "-IC:\\ActiveTcl\\include" "-LC:\\ActiveTcl\\bin" -ltcl86t
Obviously those file paths are specific to my system, but basically
I had to add an explicit reference to the .c file
I had to include the tcl86t dll library from the tcl bin directory
I had to remove the -DUSE_TCL_STUBS flag ( meaning that the references -LC:\\ActiveTcl\\lib and -ltclstub86 could also be removed)
(attempting to use the -DUSE_TCL_STUBS flag caused the compiler to complain with C:\ActiveTcl\lib/tclstub86.lib: error adding symbols: File format not recognized )
This successfully compiled a dll that I could load, and then call the hello function to print my 'Hello World' message.
Something else I stumbled over, and which wasn't immediately obvious:
Reading https://www.tcl.tk/man/tcl8.6/TclCmd/load.htm, Tcl expects to find an 'init' function based on a certain naming convention.
If the load command is not given a package name, then the name of that init function will be derived from the DLL filename.
This caused a few problems for me (when compiling via the Eclipse IDE), as the DLL name was being automatically determined from the Eclipse project name.
For example, if I recompile the same example, but call the .dll something else, eg.
gcc -shared -o helloWorldExtension.dll hello.c "-IC:\\ActiveTcl\\include" "-LC:\\ActiveTcl\\bin" -ltcl86t
Then at tclsh prompt:
% load helloWorldExtension
cannot find symbol "Helloworldextension_Init"
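The naming rule behind that error message can be sketched in C. This is only an approximation of the default guess as the load manual page describes it (take the last path element, drop a leading "lib", keep the leading letters and underscores, upper-case the first letter and lower-case the rest, then append _Init). guess_init_name is a made-up helper for illustration, not part of the Tcl API, and real Tcl may apply additional platform-specific rules.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a Tcl API): approximate the init-function name
 * that "load" falls back to when no package name is given.
 * Returns 0 on success, -1 if no usable name was found or `out` is too small.
 */
int guess_init_name(const char *file_name, char *out, size_t out_size)
{
    static const char suffix[] = "_Init";
    const char *base;
    const char *p;
    size_t n = 0;

    assert(file_name != NULL && out != NULL);

    /* Keep only the last path element; accept both separator styles. */
    base = file_name;
    for (p = file_name; *p != '\0'; p++) {
        if (*p == '/' || *p == '\\') {
            base = p + 1;
        }
    }

    /* Drop a leading "lib", as in libxyz.so. */
    if (strncmp(base, "lib", 3) == 0) {
        base += 3;
    }

    /* Copy the leading run of letters and underscores: the first
     * character upper-cased, all following ones lower-cased. */
    for (p = base; isalpha((unsigned char)*p) || *p == '_'; p++) {
        if (n + sizeof suffix >= out_size) {
            return -1; /* no room left for this character plus "_Init" */
        }
        out[n] = (char)(n == 0 ? toupper((unsigned char)*p)
                               : tolower((unsigned char)*p));
        n++;
    }
    if (n == 0) {
        return -1;
    }
    memcpy(out + n, suffix, sizeof suffix); /* also copies the trailing NUL */
    return 0;
}
```

Called with "helloWorldExtension.dll", this sketch produces Helloworldextension_Init, which matches the symbol named in the error above; passing an explicit name to load (as in "load hello.dll Hello") sidesteps the guess entirely.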

How to force gcc to link like g++?

In this episode of "let's be stupid", we have the following problem: a C++ library has been wrapped with a layer of code that exports its functionality in a way that allows it to be called from C. This results in a separate library that must be linked (along with the original C++ library and some object files specific to the program) into a C program to produce the desired result.
The tricky part is that this is being done in the context of a rigid build system that was built in-house and consists of literally dozens of include makefiles. This system has a separate step for the linking of libraries and object files into the final executable but it insists on using gcc for this step instead of g++ because the program source files all have a .c extension, so the result is a profusion of undefined symbols. If the command line is manually pasted at a prompt and g++ is substituted for gcc, then everything works fine.
There is a well-known (to this build system) make variable that allows flags to be passed to the linking step, and it would be nice if there were some incantation that could be added to this variable that would force gcc to act like g++ (since both are just driver programs).
I have spent quality time with the gcc documentation searching for something that would do this but haven't found anything that looks right, does anybody have suggestions?
Given such a terrible build system, write a wrapper around gcc that execs gcc or g++ depending on the arguments. Replace /usr/bin/gcc with this script, or modify your PATH to use this script in preference to the real binary.
#!/bin/sh
# Placeholder condition: replace it with whatever identifies a plain C job.
if [ "$1" = "wibble wobble" ]
then
    # "$@" passes every argument through unchanged, even ones with spaces.
    exec /usr/bin/gcc-4.5 "$@"
else
    exec /usr/bin/g++-4.5 "$@"
fi
The problem is that C linkage produces object files with C name mangling, and that C++ linkage produces object files with C++ name mangling.
Your best bet is to use
extern "C"
before declarations in your C++ builds, and no prefix on your C builds.
You can detect C++ using
#ifdef __cplusplus
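A minimal sketch of that guard, assuming a header shared between the C++ wrapper and the C program. The function wrapper_sum is invented for illustration; in a real project the declaration would sit in the shared header and the definition in the wrapper's C++ source file.

```c
#include <assert.h>
#include <stddef.h>

/* --- would normally live in a shared header (hypothetical wrapper.h) --- */
#ifdef __cplusplus
extern "C" {            /* C++ compilers: emit unmangled C symbol names */
#endif

/* Sum `count` integers; name and signature are made up for this sketch. */
int wrapper_sum(const int *values, size_t count);

#ifdef __cplusplus
}                       /* end of extern "C" */
#endif

/* --- would normally live in the wrapper's source file --- */
int wrapper_sum(const int *values, size_t count)
{
    int total = 0;
    size_t i;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

A C compiler never sees the extern "C" lines, while a C++ compiler gives wrapper_sum C linkage, so both sides agree on the symbol name. This only addresses name mangling; it does not by itself pull in the C++ runtime library at link time.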
Many thanks to bmargulies for his comment on the original question. By comparing the output of running the link line with both gcc and g++ using the -v option and doing a bit of experimenting, I was able to determine that "-lstdc++" was the magic ingredient to add to my linking flags (in the appropriate order relative to other libraries) in order to avoid the problem of undefined symbols.
For those of you who wish to play "let's be stupid" at home, I should note that I have avoided any use of static initialization in the C++ code (as is generally wise), so I wasn't forced to compile the translation unit containing the main() function with g++ as indicated in item 32.1 of FAQ-Lite (http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html).

Undefining linker symbols in gcc

We have a program that runs on an embedded OS. We normally embed a version string in the output binary that can identify all the versions contained when generating the binary. Usually the compilers we use can make sure that the version string is in the binary by creating an "undefined" symbol, which is then resolved by our version string.
However, we have now moved to a Linux based system and gcc.
gcc is removing the version string from the final exe. The final exe is created through linking in a bunch of libraries. Each library has a version string embedded.
gcc is removing the version string because nothing is referencing the string and we have turned on -Os optimisations.
Is there a way of making sure that gcc does not strip a collection of strings (there are about 5-10 version strings we need to embed)?
Thanks.
Try working with --retain-symbols-file (option to the linker)
From the ld man page:
--retain-symbols-file filename
Retain only the symbols listed in the file filename, discarding all others. filename is simply a flat file, with one symbol name per line. This option is especially useful in environments (such as VxWorks) where a large global symbol table is accumulated gradually, to conserve run-time memory.
--retain-symbols-file does not discard undefined symbols, or symbols needed for relocations.
You may only specify --retain-symbols-file once in the command line. It overrides -s and -S.
EDIT I just noticed the last line of the docs quoted above. It will override the 'strip all' option, so I'm not sure this will help you...
Ok, to solve this we did this in a c file:
const char _string_[] = "some string";
Then include the object file in the final link:
gcc <snip> -Wl,--start-group string.o <snip> -Wl,--end-group -Wl,--strip-all -o final.exe
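A minimal sketch of such a version object in C. The symbol name mylib_version_string and its text are placeholders. The link line in the comment is an assumption that was not part of the fix above: GNU ld's --undefined (-u) option enters a symbol as undefined, which is one way to make the linker pull the defining object out of a static library, much like the "undefined symbol" trick described in the question.

```c
#include <assert.h>   /* static_assert (C11) */

/*
 * Hypothetical version object, e.g. version.c. External linkage gives the
 * linker a symbol name it can be asked for; name and text are placeholders.
 */
const char mylib_version_string[] = "mylib 1.2.3";

/* Catch an accidentally empty version string at compile time. */
static_assert(sizeof mylib_version_string > 1,
              "version string must not be empty");

/*
 * Possible link line (an assumption, not taken from the answer above):
 *   gcc main.o -L. -lmylib -Wl,--undefined=mylib_version_string -o final.exe
 */
```

Whether the string survives can be checked on the final binary, for example by searching the output of strings for the version text.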

How do I strip local symbols from linux kernel module without breaking it?

If I do --strip-debug or --strip-unneeded, I have the .ko that lists all function names with nm, if I do just strip foo.ko I have a kernel module that refuses to load.
Does anyone know a quick shortcut how to remove all symbols that are not needed for module loading so that people cannot reverse engineer the API:s as easily?
PS: For all you open source missionaries: this is something that the general public will never be using in any case, so there is no need to turn the question into a GPL flame war.
With no answer to my previous questions, here are some guesses that could also be some clues, and a step to an answer:
From what I recall, a .ko is nothing but an .o file resulting from the merge of all the .o files generated by your source module, and the addition of a .modinfo section.
At the end of any .ko-building Makefile there is an ld call. From what I recall, ld is called with the -r option, and this is what creates the .o file that the Makefile calls a .ko. This resulting file is not to be confused with an archive or object library (.a file), which is just a format for packaging multiple .o files as one. A merged object is the result of a link that produces yet another .o module, but in that module all sections that could be merged have been merged, and all public/external pairs that could be resolved inside those sections have been resolved.
So I assume that you end up with your .ko file containing all your "local" extern definitions:
Those that are extern because they
are used to call across the .o
modules in your .ko (but are not
needed anymore since they are not
supposed to be called from outside
the .ko), and
those that the .ko module DO need to
properly communicate with the loader
and kernel.
The former have most likely already been resolved by ld during the merge, but ld has no way to know whether you intend to have them also callable from outside the .ko.
So the extraneous symbols you see are those that are extern for each of your .o files, but are not needed as extern for the resulting .ko.
And what you are looking for is a way to strip only those.
Does this last paragraph properly describe the symbols you want to get rid of?
I think this is exactly what we are
talking about here.
OK, then it looks like one solution is to "manually" remove the extraneous symbols. The "strip" utility seems to allow individually stripping (or keeping) of symbols, so you would have to use one --strip-all and a small bunch of --keep-symbol= . Note that --wildcard might help a bit, too. You can do the opposite, of course, keep all and individually strip, depending on what's the most convenient.
A good start could be to remove all the symbols that you explicitly defined in your module for cross-module linking and don't want to appear - just leaving the obvious useful ones, things like init and exit. And to not touch those that have been generated by / belong to the kernel dev software infrastructure. Then trial and error until you find the right recipe... In fact, I would think that about all your own symbols might be removable, apart from those you explicitly defined yourself as EXPORT_SYMBOL (and init / exit, of course).
Good luck! :)
PS:
In fact, it seems that the required source information exists in all .ko projects to perform the required stripping automatically: Unless I'm missing something, it seems that anything that's not EXPORT_SYMBOL or explicitly inserted by the build software could theoretically be stripped by default at the end of "ld -r" time that ends a .ko build. It's just that I don't think the toolchain (compiler / linker) have provision / directives / options to individually designate "strip or keep" syms for the relocatable link / merge. Otherwise, some modifications in the EXPORT_SYMBOL macro and in a few other places could probably achieve the result you're after, and shave some bytes from most .ko files in any Linux system.
I just built a kernel without realizing the kernel config had debug symbols enabled, so the resulting modules were quite large. This worked for me:
# du -sh /lib/modules/3.1.0/
1.9G /lib/modules/3.1.0/
# find /lib/modules/3.1.0/ -iname "*.ko" -exec strip --strip-debug {} \;
# du -sh /lib/modules/3.1.0/
134M /lib/modules/3.1.0/
Find all files in /lib/modules/3.1.0 named *.ko and execute strip --strip-debug on each of them.
I'm not sure I understand what the problem really is:
When developing a .ko, if I don't explicitly add something like
ccflags-y += -ggdb -O0 -Wall
into my Makefile, I don't get any symbol but for those that I publish or external ref myself. I'm sure I don't get any other symbols for several good reasons:
the resulting .ko file is considerably smaller,
dumping the file and analyzing the ELF shows the tables are not there,
I can't see nor access the symbols in kgdb.
So I'm a little puzzled at your question, actually?... What are those symbols you do see in your .ko (and don't want to)?
How are they declared in your source file?
In which ELF sections do they end up?
And (sorry, dumb question ahead): Did you define static all things that didn't need to be seen outside of their own module?
In addition to filofel's post:
The reason stripping userspace shared libraries keeps them functioning is because their exported symbols are in the .dynsym section which is never stripped. .ko files however do not use dynsym.
people have reported success with
strip --strip-unneeded
strip -g XXX.
This command solved a similar problem for me on an embedded device running Linux kernel 3.0.8.

Resources