Avoid creating debug info in LKM with kbuild - gcc

I'm building a Linux kernel module (LKM) from large C files (>50 000 LOC). It's generated RAID calculation code. When I build it through kbuild, gcc eats all of the memory and crashes, while invoking gcc manually works fine.
After inspecting the object files from manual gcc and from kbuild, I found that the kbuild object files are 20-30 times larger than the manual gcc objects (30M vs 900k). The reason is that the kbuild object files contain a giant .debug_info section with tons of data.
Here is a fragment of the objdump -x output:
RELOCATION RECORDS FOR [.debug_info]:
OFFSET TYPE VALUE
0000000000000006 R_X86_64_32 .debug_abbrev
000000000000000c R_X86_64_32 .debug_str+0x0000000000000c41
0000000000000011 R_X86_64_32 .debug_str+0x0000000000000e26
0000000000000015 R_X86_64_32 .debug_str+0x0000000000000544
0000000000000019 R_X86_64_64 .text
0000000000000021 R_X86_64_64 .text+0x0000000000060957
0000000000000029 R_X86_64_32 .debug_line
0000000000000030 R_X86_64_32 .debug_str+0x0000000000000b78
0000000000000037 R_X86_64_32 .debug_str+0x000000000000011e
0000000000000040 R_X86_64_32 .debug_str+0x000000000000066b
0000000000000047 R_X86_64_32 .debug_str+0x0000000000000d38
000000000000004e R_X86_64_32 .debug_str+0x0000000000000bef
... another 60000 records ...
00000000000a0c8d R_X86_64_32 .debug_str+0x0000000000000add
00000000000a0ca0 R_X86_64_32 .debug_str+0x0000000000000526
00000000000a0cae R_X86_64_64 Calculation_1s_Func_Buf
I've already tried EXTRA_CFLAG += -S with no luck.
So is there any way to avoid creating debug info in my object files while building with kbuild?

Have you tried turning off CONFIG_DEBUG_INFO?
Look for it in 'Kernel Hacking' -> 'Compile-time checks and compiler options' -> 'Compile the kernel with debug info' in menuconfig.
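To see which setting the kernel tree you build against actually uses, you can look at its .config. Below is a minimal Python sketch, assuming the usual .config line format (`CONFIG_X=y` or `# CONFIG_X is not set`). The helper name `config_value` and the sample text are made up for illustration.

```python
def config_value(config_text, option):
    """Return the value of a Kconfig option, or None if it is unset or absent."""
    for line in config_text.splitlines():
        line = line.strip()
        # An enabled option looks like "CONFIG_FOO=y" (or =m, or a value).
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        # A disabled option is recorded as a comment line.
        if line == "# " + option + " is not set":
            return None
    return None


# Hypothetical excerpt of a kernel .config, for illustration only.
SAMPLE_CONFIG = """\
CONFIG_MODULES=y
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_REDUCED is not set
"""

if __name__ == "__main__":
    for opt in ("CONFIG_DEBUG_INFO", "CONFIG_DEBUG_INFO_REDUCED"):
        print(opt, "->", config_value(SAMPLE_CONFIG, opt))
```

On a real system the text would come from the .config in the kernel build directory your module is built against. Where that directory lives depends on your distribution.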

Related

Undefined symbol error: the target library has been set in rpath but it still cannot be found

calving@norfolk:~/sandbox/stage/third_party/houdini16.5/lib$ ldd libgusd.so | grep boost
libboost_python.so.1.55.0 => /home/calveng/sandbox/stage/third_party/houdini16.5/lib/./../../../lib/libboost_python.so.1.55.0 (0x00007f735cb9c000)
libboost_regex.so.1.55.0 => /home/calveng/sandbox/stage/third_party/houdini16.5/lib/./../../../lib/./libboost_regex.so.1.55.0 (0x00007f735af6a000)
calving@norfolk:~/sandbox/stage/third_party/houdini16.5/lib$ readelf -s --wide libgusd.so | grep _ZN5boost6system16generic_categoryEv
1064: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZN5boost6system16generic_categoryEv
6632: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZN5boost6system16generic_categoryEv
When I try to link against libgusd.so, it fails with the error: "undefined symbol: _ZN5boost6system16generic_categoryEv"
But the directory containing the Boost libraries is already set through an embedded rpath, and other Boost libraries are resolved properly.
Has anybody run into a similar situation before? Any hint would be really helpful.
Note: LD_LIBRARY_PATH is not set in my environment, so nothing overrides the rpath.
Well, it was solved by:
patchelf --add-needed libboost_system.so.1.55.0 libgusd.so
I guess I'm still a little confused about the difference between linked libraries and their search paths... I thought all dynamic libraries would be linked automatically once the program needs them.
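The distinction can be made concrete. The rpath only tells the dynamic loader where to look, while the DT_NEEDED entries decide which libraries get loaded at all. A library sitting in an rpath directory is not loaded unless some object names it as NEEDED, which is what the patchelf command above adds. Here is a minimal Python sketch that pulls both kinds of entries out of `readelf -d` output. The sample text is a hypothetical dump, not the real one from libgusd.so, and the helper name is made up.

```python
import re


def dynamic_entries(readelf_d_text):
    """Split `readelf -d` output into NEEDED libraries and rpath/runpath dirs."""
    needed, search_paths = [], []
    for line in readelf_d_text.splitlines():
        m = re.search(r"\((NEEDED|RPATH|RUNPATH)\)\s+.*\[(.*)\]", line)
        if not m:
            continue
        tag, value = m.groups()
        if tag == "NEEDED":
            needed.append(value)
        else:
            # rpath/runpath is a colon-separated list of directories
            search_paths.extend(value.split(":"))
    return needed, search_paths


# Hypothetical `readelf -d` excerpt, for illustration only.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libboost_python.so.1.55.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_regex.so.1.55.0]
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../../../lib]
"""

if __name__ == "__main__":
    needed, paths = dynamic_entries(SAMPLE)
    print("NEEDED:", needed)
    print("search paths:", paths)
    print("boost_system requested:", "libboost_system.so.1.55.0" in needed)
```

In this made-up dump the rpath covers the Boost directory, but libboost_system is never requested, so its symbols stay undefined.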

Is there a way to get debugging symbols for CGO code linked into Go?

I have some Cgo code that I'm linking into my Go binary. I've got Cgo running and building my code and wrapper. After some recent changes, I started getting a double-free in the C++ code that I'm linking in. I've tried running my binary under lldb and it does trap the malloc panic, but the symbols are not especially useful.
In vanilla C or C++ I've used -g3 to get rich debugging symbols that include variable names and source. This makes using lldb much more productive. However, I'm having trouble getting these symbols to show up in my Go binary. I've noticed that in the backtrace my function appears as main'foo, where foo is the name of my function. No other debug info is present, though: all I get is a trace of assembly and memory pointers/registers.
I've tried invoking go build with CGO_CFLAGS="-g3" CGO_CXXFLAGS="-g3" but the binary still doesn't have the symbols. I've also tried adding -g3 to the CFLAGS/CXXFLAGS in my .go file where I set other flags (before import "C") but this doesn't seem to work either. I can't think of any other way to get this debugging info added into my binary - is there some Go-specific flag or build sequence that enables this?
I don't know how Go and C++ linking works, but a brief description of how debug information is handled on OS X might help you figure out where the debug information is being lost.
On OS X, as on most systems, the debug information for each compiled source file goes into the .o file made from it. You can verify that your .o file got debug information in a variety of ways; here's one:
lldb -o "image dump sections" Target.o --batch | grep DWARF
0x00000300 container [0x0000000000061538-0x0000000000782c77) rwx 0x00061cb8 0x0072173f 0x00000000 Target.o.__DWARF
0x00000009 dwarf-str [0x0000000000061538-0x00000000004eeabc) rwx 0x00061cb8 0x0048d584 0x02000000 Target.o.__DWARF.__debug_str
0x0000000a dwarf-loc [0x00000000004eeabc-0x00000000004ef493) rwx 0x004ef23c 0x000009d7 0x02000000 Target.o.__DWARF.__debug_loc
0x0000000b dwarf-abbrev [0x00000000004ef493-0x00000000004f09a7) rwx 0x004efc13 0x00001514 0x02000000 Target.o.__DWARF.__debug_abbrev
0x0000000c dwarf-info [0x00000000004f09a7-0x00000000006ea7f1) rwx 0x004f1127 0x001f9e4a 0x02000000 Target.o.__DWARF.__debug_info
0x0000000d dwarf-ranges [0x00000000006ea7f1-0x00000000006ec481) rwx 0x006eaf71 0x00001c90 0x02000000 Target.o.__DWARF.__debug_ranges
0x0000000e dwarf-macinfo [0x00000000006ec481-0x00000000006ec482) rwx 0x006ecc01 0x00000001 0x02000000 Target.o.__DWARF.__debug_macinfo
0x0000000f apple-names [0x00000000006ec482-0x000000000071134e) rwx 0x006ecc02 0x00024ecc 0x02000000 Target.o.__DWARF.__apple_names
0x00000010 apple-objc [0x000000000071134e-0x0000000000711372) rwx 0x00711ace 0x00000024 0x02000000 Target.o.__DWARF.__apple_objc
0x00000011 apple-namespaces [0x0000000000711372-0x00000000007116d6) rwx 0x00711af2 0x00000364 0x02000000 Target.o.__DWARF.__apple_namespac
0x00000012 apple-types [0x00000000007116d6-0x0000000000748797) rwx 0x00711e56 0x000370c1 0x02000000 Target.o.__DWARF.__apple_types
0x00000015 dwarf-line [0x000000000075d348-0x0000000000782c77) rwx 0x0075dac8 0x0002592f 0x02000000 Target.o.__DWARF.__debug_line
If you don't see anything here, then the compiler isn't emitting debug information...
The next step is specific to Darwin: instead of putting the debug information into the output of the link stage, the debug info is left in the .o files, and a "debug map" is inserted into the output image. That's how the debugger finds its way back to the .o files. You can see that by doing:
$ nm -ap <YourBinary> | grep OSO
You should see a list of all your .o files here. If you don't, then at some point in the build process your binary is getting stripped (with at least strip -S). You have to find out when that is happening and stop it. Also check that the .o files are still where the entries from the command above say they are. Some part of the build process may be moving them around, so the debugger can't find them anymore.

Understanding why gcc/ld are linking a symbol from a particular library using readelf

I'm linking several libraries with gcc (-llapacke -llapack -lcblas) and I'm receiving "undefined reference" errors unless I explicitly link to the static version of one of them (lapacke). I'm trying to understand why by searching the variants of the offending library with nm and readelf. Let's take the "undefined" function zsysv_rook_:
% readelf -Wa /usr/lib/liblapacke.so | grep zsysv_rook_
00000000003c3978 000008d600000007 R_X86_64_JUMP_SLOT 000000000016e340 LAPACKE_zsysv_rook_work + 0
00000000003c5f20 000003c300000007 R_X86_64_JUMP_SLOT 0000000000000000 zsysv_rook_ + 0
963: 0000000000000000 0 FUNC GLOBAL DEFAULT UND zsysv_rook_
2262: 000000000016e340 884 FUNC GLOBAL DEFAULT 11 LAPACKE_zsysv_rook_work
That's the dynamic variant. This is the static variant:
% readelf -Wa /usr/lib/liblapacke.a | grep zsysv_rook_
00000000000000b2 0000000d00000004 R_X86_64_PLT32 0000000000000000 LAPACKE_zsysv_rook_work - 4
0000000000000146 0000000d00000004 R_X86_64_PLT32 0000000000000000 LAPACKE_zsysv_rook_work - 4
13: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND LAPACKE_zsysv_rook_work
File: /usr/lib/liblapacke.a(lapacke_zsysv_rook_work.o)
0000000000000186 0000000f00000004 R_X86_64_PLT32 0000000000000000 zsysv_rook_ - 4
0000000000000264 0000000f00000004 R_X86_64_PLT32 0000000000000000 zsysv_rook_ - 4
9: 0000000000000000 884 FUNC GLOBAL DEFAULT 1 LAPACKE_zsysv_rook_work
15: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND zsysv_rook_
Why does it compile only with the static version?
The symbol zsysv_rook_ is not defined by either library; it shows up as UND in both, which means something in the library references it.
A static library is basically an archive of object files and the linker looks into the archive and links to each object that resolves an undefined reference in your program. If there are objects that define symbols that your program doesn't need, those objects will not be linked to. I assume what is happening is that the object that references zsysv_rook_ doesn't define any symbols you need, so that object isn't linked to, and your program doesn't need to resolve the zsysv_rook_ symbol.
When you link to a dynamic library (by default) you need to resolve all the undefined references needed by anything in the library, so because some part of the library refers to zsysv_rook_ you need to link to whatever provides it.
So if you want to use the dynamic library, you need to figure out which library defines zsysv_rook_ and link to it. It's probably one of the other LAPACK libs, maybe one you're already linking to, in which case you might be putting the -l option in the wrong place in the link command: the library that provides zsysv_rook_ needs to come after -llapacke in order to resolve the reference to it.
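The difference between the two cases can be shown with a toy model in Python. It is a simplification, not how ld is actually implemented. An archive member is pulled in only when it defines a symbol that is still undefined, whereas a shared library counts as a whole. Only the lapacke_zsysv_rook_work.o entry is taken from the readelf output above; the other members and the function names are assumptions made for illustration.

```python
def link_static(undefined, archive):
    """Toy model of pulling members out of a static archive.

    archive maps member name -> (symbols defined, symbols referenced).
    A member is pulled in only if it defines a symbol that is still undefined.
    """
    undefined = set(undefined)
    defined, pulled = set(), []
    progress = True
    while progress:
        progress = False
        for name, (defs, refs) in archive.items():
            if name not in pulled and undefined & set(defs):
                pulled.append(name)
                defined |= set(defs)
                undefined |= set(refs)   # the member's own references are now needed
                undefined -= defined
                progress = True
    return pulled, undefined


def shared_library_undefined(members):
    """For a shared library every member is in the image, so all references count."""
    defs = set().union(*(d for d, _ in members.values()))
    refs = set().union(*(r for _, r in members.values()))
    return refs - defs


ARCHIVE = {
    # From the readelf output in the question:
    "lapacke_zsysv_rook_work.o": (["LAPACKE_zsysv_rook_work"], ["zsysv_rook_"]),
    # Hypothetical members, for illustration only:
    "lapacke_dgesv.o": (["LAPACKE_dgesv"], ["LAPACKE_dgesv_work"]),
    "lapacke_dgesv_work.o": (["LAPACKE_dgesv_work"], ["dgesv_"]),
}

if __name__ == "__main__":
    # A program that only calls LAPACKE_dgesv never pulls the zsysv member.
    print(link_static({"LAPACKE_dgesv"}, ARCHIVE))
    # The shared library still carries the zsysv_rook_ reference.
    print(sorted(shared_library_undefined(ARCHIVE)))
```

In the static case zsysv_rook_ never becomes an outstanding reference, because nothing asks for LAPACKE_zsysv_rook_work. In the shared case it always is one.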

cmake tries to link resource file

How can I prevent cmake from linking MACOSX bundle resource files?
When I add a resource file to a MACOSX application bundle, it is picked up by the linker because of its .obj suffix. The linker tries to link it, without success, and emits a warning:
Linking CXX executable mwe.app/Contents/MacOS/mwe
ld: warning: ignoring file ../star.obj, file was built for unsupported file format
( 0x23 0x20 0x42 0x6C 0x65 0x6E 0x64 0x65 0x72 0x20 0x76 0x32 0x2E 0x37 0x30 0x20 )
which is not the architecture being linked (x86_64): ../star.obj
Of course it cannot be linked, because star.obj is a 3D model text file in the OBJ format:
# Blender v2.70 (sub 0) OBJ File: 'star.blend'
o Plane
v 0.510396 -0.000389 0.998397
v -0.926169 -0.000017 -0.001603
v 0.510396 0.000355 -1.001603
[..and many more vertices]
The resulting problem is that the file is not put into the MACOSX bundle folder, because the linker ignored it.
VisorZ@Mac ~/MWE> ll build/mwe.app/Contents/Resources/
total 8
-rw-r--r-- 1 Stephan staff 62B 6 Aug 22:36 star.off
(Note that star.obj is missing here.)
I would like to exclude that obj file from the link file lists via cmake.
But neither the source file properties
MACOSX_PACKAGE_LOCATION,
EXTERNAL_OBJECT FALSE,
LANGUAGE "myMODEL"
nor the target property
RESOURCE
can do that. Here is a minimum working example CMakeLists.txt:
# Minimum Working Example to create mwe.app on MACOSX with linker
# trying to process *.obj files which are marked as resource files
CMAKE_MINIMUM_REQUIRED(VERSION 2.8)
PROJECT(mwe)
SET_SOURCE_FILES_PROPERTIES(
    star.obj # 3D model as OBJ txt file
    star.off # 3D model as OFF txt file
    PROPERTIES
    MACOSX_PACKAGE_LOCATION Resources
)
ADD_EXECUTABLE(
    ${PROJECT_NAME}
    MACOSX_BUNDLE # needs to be second argument, enables bundling
    helloworld.cpp # will be compiled
    star.obj # will not be bundled because it will be taken by linker
    star.off # will be bundled
)
Use HEADER_FILE_ONLY and try not to think too much about the name of it.
set_source_files_properties(
    star.obj # 3D model as OBJ txt file
    star.off # 3D model as OFF txt file
    PROPERTIES
    HEADER_FILE_ONLY ON
)
If you want to stop the linker from trying to link the files you can add them to a custom target as such:
ADD_CUSTOM_TARGET (testTarget SOURCES myFile.obj myFile.off)
This is especially handy for having files show up in your project tree but not having them compiled/linked.

WinDbg not showing useful information

First let me say I am a total WinDbg noob, so this might be an easy question...
I have an application ("MyApp" - name changed to protect the innocent!) that I am trying to debug because it is throwing an exception. This only happens on user machines - I have not been able to reproduce it on my development machine. So I set up DebugDiag on the user's machine and captured a Full Dump. Then I loaded the dump in WinDbg and ran !analyze -v and kp to try to figure out what was going on... but neither of these seems to give me the information that I'm looking for: the function (and hopefully the line number) of the line that is causing the problem. I think I have the symbol file loaded, since I specified the path to 'MyApp.pdb' in the Symbol File Path:
srv*c:\symcache*http://msdl.microsoft.com/download/symbols;srv*c:\symcache*C:\dev\Customer\MyAppSln\MyApp\Debug
First, here's the output from kp:
0:004> kp
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
0502f474 7c347966 MyApp!DllMain+0x3e8a6
0502f4bc 7c3a2448 msvcr71!_nh_malloc(unsigned int size = <Memory access error>, int nhFlag = <Memory access error>)+0x24 [f:\vs70builds\3052\vc\crtbld\crt\src\malloc.c # 117]
0502f57c 7c3416b3 msvcp71!std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::_Tidy(bool _Built = <Memory access error>, unsigned int _Newsize = <Memory access error>)+0x45 [f:\vs70builds\3077\vc\crtbld\crt\src\xstring # 1520]
0502f610 7c3a32de msvcr71!_heap_alloc(unsigned int size = <Memory access error>)+0xe0 [f:\vs70builds\3052\vc\crtbld\crt\src\malloc.c # 212]
0502f620 7c3b3f63 msvcp71!wmemcpy(wchar_t * _S1 = 0x04e463b9 "าธ???", wchar_t * _S2 = 0xffffffff "--- memory read error at address 0xffffffff ---", unsigned int _N = 0x4e25212)+0x14 [f:\vs70builds\3077\vc\crtbld\crt\src\wchar.h # 843]
0502f640 04e463b9 msvcp71!std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::assign(class std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> > * _Right = 0xffffffff, unsigned int _Roff = 0x4e25212, unsigned int _Count = 2)+0x7c [f:\vs70builds\3077\vc\crtbld\crt\src\xstring # 601]
0502f770 04df1077 MyApp!DllMain+0x65329
0502f824 04e01b35 MyApp!DllMain+0xffe7
0502ff08 04dfe034 MyApp!DllMain+0x20aa5
0502ff48 04dfde4f MyApp!DllMain+0x1cfa4
0502ff88 7648d0e9 MyApp!DllMain+0x1cdbf
0502ffc4 773499f9 kernel32!BaseThreadInitThunk+0xe
0502ffd4 7738198e ntdll!RtlQueryInformationAcl+0x8b
0502ffec 00000000 ntdll!_RtlUserThreadStart+0x1b
The line I'm specifically trying to decode is 'MyApp!DllMain+0x65329', as this is the last line of MyApp that seems to be executing, and the error is occurring within the malloc call, which is apparently where the exception is being thrown from. What am I doing wrong that makes it display only the module and offset instead of the source file and line number?
I'm also not sure why the line above the malloc call is back in MyApp again - maybe someone can explain that too.
Just in case, here's the output from '!analyze -v':
0:004> !analyze -v
*******************************************************************************
* *
* Exception Analysis *
* *
*******************************************************************************
*** WARNING: Unable to verify checksum for MyApp.exe
*** ERROR: Module load completed but symbols could not be loaded for MyApp.exe
*** WARNING: Unable to verify checksum for ThirdPartyDll.dll
*** ERROR: Symbol file could not be found. Defaulted to export symbols for ThirdPartyDll.dll -
*** WARNING: Unable to verify checksum for mdnsNSP.dll
*** ERROR: Symbol file could not be found. Defaulted to export symbols for mdnsNSP.dll -
*** ERROR: Symbol file could not be found. Defaulted to export symbols for SLC.dll -
FAULTING_IP:
MyApp!DllMain+3e8a6
04e1f936 8b16 mov edx,dword ptr [esi]
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 04e1f936 (MyApp!DllMain+0x0003e8a6)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000000
Parameter[1]: 00000000
Attempt to read from address 00000000
PROCESS_NAME: MyApp.exe
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
EXCEPTION_PARAMETER1: 00000000
EXCEPTION_PARAMETER2: 00000000
READ_ADDRESS: 00000000
FOLLOWUP_IP:
msvcr71!_heap_alloc+e0 [f:\vs70builds\3052\vc\crtbld\crt\src\malloc.c # 212]
7c3416b3 e88e0c0000 call msvcr71!__SEH_epilog (7c342346)
NTGLOBALFLAG: 0
APPLICATION_VERIFIER_FLAGS: 0
LAST_CONTROL_TRANSFER: from 00000000 to 773bbb33
FAULTING_THREAD: ffffffff
BUGCHECK_STR: APPLICATION_FAULT_ACTIONABLE_HEAP_CORRUPTION_heap_failure_freelists_corruption_NULL_POINTER_READ_SHUTDOWN
PRIMARY_PROBLEM_CLASS: ACTIONABLE_HEAP_CORRUPTION_heap_failure_freelists_corruption_SHUTDOWN
DEFAULT_BUCKET_ID: ACTIONABLE_HEAP_CORRUPTION_heap_failure_freelists_corruption_SHUTDOWN
STACK_TEXT:
773bbb33 ntdll!RtlpAllocateHeap+0x7ad
773a6e0c ntdll!RtlAllocateHeap+0x1e3
7c3416b3 msvcr71!_heap_alloc+0xe0
FAULTING_SOURCE_CODE:
No source found for 'f:\vs70builds\3052\vc\crtbld\crt\src\malloc.c'
SYMBOL_STACK_INDEX: 2
SYMBOL_NAME: msvcr71!_heap_alloc+e0
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: msvcr71
IMAGE_NAME: msvcr71.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 3e561eac
STACK_COMMAND: dds 7740c078 ; kb
FAILURE_BUCKET_ID: ACTIONABLE_HEAP_CORRUPTION_heap_failure_freelists_corruption_SHUTDOWN_c0000005_msvcr71.dll!_heap_alloc
BUCKET_ID: APPLICATION_FAULT_ACTIONABLE_HEAP_CORRUPTION_heap_failure_freelists_corruption_NULL_POINTER_READ_SHUTDOWN_msvcr71!_heap_alloc+e0
If you believe the PDB should be in your symbol path, you should run something like this:
!sym noisy
.reload MyApp.dll
kp
!sym noisy causes the debugger to print more detailed information on why it couldn't load symbols: no MyApp.pdb found, found but does not match, etc. This will help you find out why it is not loading symbols. !sym quiet turns the verbose symbol output off again.
When you set the path for symbols, did you reload them?
.reload
I'm not sure your adding
srv*c:\symcache*C:\dev\Customer\MyAppSln\MyApp\Debug
to the symbol path has the desired effect.
I usually list all local paths in the .sympath first, and as the last step, I do .symfix+ to configure the public symbols using the Microsoft symbol server:
.sympath C:\dev\Customer\MyAppSln\MyApp\Debug
.symfix+ c:\symcache
The rationale behind listing local paths first is that the debugger then does not have to check the remote server for PDBs (which are not there anyway) and can simply retrieve them locally.
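To make the difference between the two symbol paths visible, here is a small Python sketch that splits a symbol path into its elements and labels each one. It only looks at the syntax (`srv*<cache>*<store>` versus a plain directory). As far as I understand, a `srv*` element is searched as a symbol store rather than as a flat folder of PDBs, but treat that as an assumption. The function name `describe_sympath` is made up.

```python
def describe_sympath(sympath):
    """Classify each ';'-separated element of a WinDbg symbol path."""
    result = []
    for element in sympath.split(";"):
        parts = element.split("*")
        if parts[0].lower() == "srv":
            # srv*<local cache>*<symbol store>
            result.append(("symbol server", parts[1:]))
        else:
            result.append(("plain directory", [element]))
    return result


# The path from the question.
ORIGINAL = (r"srv*c:\symcache*http://msdl.microsoft.com/download/symbols;"
            r"srv*c:\symcache*C:\dev\Customer\MyAppSln\MyApp\Debug")

# Local directory first, Microsoft symbol server last, as suggested above.
SUGGESTED = (r"C:\dev\Customer\MyAppSln\MyApp\Debug;"
             r"srv*c:\symcache*http://msdl.microsoft.com/download/symbols")

if __name__ == "__main__":
    for label, path in (("original", ORIGINAL), ("suggested", SUGGESTED)):
        print(label)
        for kind, parts in describe_sympath(path):
            print("  ", kind, parts)
```

In the original path both elements are symbol-server entries, so the Debug folder is never listed as a plain directory. In the suggested form it is.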
Anyway, your problem is that the symbols for MyApp are not loaded, so stack walking does not quite work.
The debugger walks the stack backwards, starting from the top frame; that's why you're seeing MyApp at the top: this is where the access violation occurred.
Now, since the debugger does not have the symbols at this point, it can only guess what invocation chain led to the function on top.
And it guesses wrong by following a misleading path.

Resources