gcc address sanitizer core dump on error

I'm trying to debug an issue on a server that I suspect is related to a buffer overflow, so I compiled my code with -fsanitize=address to enable AddressSanitizer.
It compiles, and the resulting software runs. However, I need a core dump when the address sanitizer detects an error, since due to the setup that is pretty much the only way I can get information out of the system.
I am calling the software with ASAN_OPTIONS=abort_on_error=1 prepended on the command line (using a shell script to do that), and I have checked that ulimit -c returns unlimited, but it just won't produce a core dump.
What am I missing?
This is on an Ubuntu 14.04 server with GCC 4.8.4.
EDIT: sysctl kernel.core_pattern returns kernel.core_pattern = |/usr/share/apport/apport %p %s %c %P. This probably means that apport is enabled (at least in some form). However, I have been able to get proper core files on this system from asserts and SIGFPEs in the software (which is where the suspicion of array overruns comes from).

Let me guess: is this an x64 target? Core dumps are disabled there to avoid dumping the 16 TB of shadow memory (see the docs for disable_coredump for details).
Newer versions of GCC/Clang strip the shadow memory from the core file by default, so that one can do something like
export ASAN_OPTIONS=abort_on_error=1:disable_coredump=0
but I'm afraid 4.8 is too old for this.
As an alternative suggestion: why aren't backtraces enough for you? You could use log_path or log_to_syslog to preserve them if you do not have access to the program's stderr.
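For instance, a wrapper script along these lines (a sketch; ./server stands in for your binary, and an older ASan runtime such as GCC 4.8's may simply ignore options it does not know):
#!/bin/sh
# Abort on the first ASan error (so the kernel can dump core if permitted),
# keep the huge shadow mapping out of the dump, and also write reports to
# files named /var/tmp/asan.log.<pid> in case the core never materialises.
export ASAN_OPTIONS=abort_on_error=1:disable_coredump=0:log_path=/var/tmp/asan.log
exec ./server "$@"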
NB: I posted a suggestion to enable core dumps on all platforms.

Related

QEMU ARM is stuck with a black screen when running a vanilla kernel

I have tried to run qemu-system-arm with a compiled Linux kernel (version 4.9)
and with an initramfs that I have created with a sample program.
This was based on an excellent post from here.
This is the command that I executed:
qemu-system-arm -M vexpress-a9 -kernel linux-4.9/arch/arm/boot/zImage -initrd initramfs -append "console=tty1"
Then QEMU shows me these errors and its graphical window gets stuck:
pulseaudio: set_sink_input_volume() failed
pulseaudio: Reason: Invalid argument
pulseaudio: set_sink_input_mute() failed
pulseaudio: Reason: Invalid argument
Even when I run it without the -initrd parameter, to load just the kernel, nothing happens.
When I tried running it with a vmlinuz-3.2.0-4-vexpress image from this example, it worked for me.
Does someone have a clue what the problem may be? Something to do with the fact that it is a zImage? Is there a way to debug it?
Thanks!
"QEMU sits there and prints nothing" is quite a common symptom, and it almost always means "the guest kernel crashed before being able to print anything, because it wasn't configured correctly". This is pretty much the same effect you get if you try to boot a wrongly configured kernel on real hardware, and the process for debugging it is about the same:
check the obvious kernel config options are set correctly: in particular, that you have built it to support the ARM board and CPU that you're trying to run it on, and that you've enabled support for whatever devices you're trying to use for console output
give yourself the maximum chance of being able to see something, by configuring QEMU to output serial port information, and configuring the guest to send its console output to serial, and enabling any earlycon/earlyprintk options you can (serial output happens much earlier than graphics output, and the Linux kernel earlycon/earlyprintk options mean the kernel will start printing output earlier than it defaults to)
if you have a kernel that works, and one that doesn't, look at the differences between the kernel configs to see if one is missing something
if all else fails, you have to break out the debugger to find out what's going on
Nothing about this is particularly QEMU specific -- it's the same sort of pain you have to go through if you're trying to do kernel bringup on hardware.
PS: my first guess is that the kernel is crashing because it doesn't have enough memory -- you haven't passed QEMU a '-m' option, so it is defaulting to 128MB; the vexpress-a9 board can handle up to 1GB. earlycon would probably be sufficient debug output to identify this issue. You also aren't passing a device tree blob via -dtb, which may be an issue for newer kernels (older kernels would happily boot without one).
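Putting those points together, a sketch of an invocation with more RAM, a device tree, and the console on serial (the .dtb path is where a 4.9 tree puts the vexpress-a9 device tree; console=ttyAMA0 is that board's PL011 UART):
qemu-system-arm -M vexpress-a9 -m 1024 \
    -kernel linux-4.9/arch/arm/boot/zImage \
    -dtb linux-4.9/arch/arm/boot/dts/vexpress-v2p-ca9.dtb \
    -initrd initramfs \
    -append "console=ttyAMA0,115200" \
    -nographic
With -nographic the serial port is wired to your terminal, so any kernel output (including earlycon/earlyprintk messages, if configured in) appears there instead of in the graphical window.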

fail to attach eBPF blob

I've just compiled the BPF examples from the kernel's tools/testing/selftests/bpf and tried to load one as explained in http://cilium.readthedocs.io/en/v0.10/bpf/:
% tc filter add dev enp0s1 ingress bpf \
object-file ./net-next.git/tools/testing/selftests/bpf/sockmap_parse_prog.o \
section sk_skb1 verbose
Program section 'sk_skb1' not found in ELF file!
Error fetching program/map!
This happens on Ubuntu 16.04.3 LTS with kernel 4.4.0-98, LLVM and Clang 3.8 installed from packages, and the latest iproute2 from GitHub.
I suspect I'm running into some toolchain/kernel version/features mismatch.
What am I doing wrong?
I do not know why tc complains. On my setup, with a similar command, the program loads. Still, here are some hints:
I think the problem might come, as you suggest, from some incompatibility between the kernel headers version and iproute2, and that some relocation fails to occur, although on a quick investigation I did not find exactly why it refuses to load the section. On my side I am using clang-3.8 and the latest iproute2, but also the latest kernel (a commit close to 4.14).
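As a first sanity check, you can list the section names actually present in the object file; tc can only find the sections that the SEC("...") annotations in the program source created (readelf ships with binutils; llvm-objdump -h works as well):
readelf -SW ./net-next.git/tools/testing/selftests/bpf/sockmap_parse_prog.o
If no sk_skb1 section shows up in that output, the object was built from different sources than the command assumes.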
Even if you manage to load the section somehow, I believe you would still encounter problems when trying to attach the program in the kernel. The feature called "direct packet access" is only present on kernels 4.7 and higher; it is what lets you use skb->data and skb->data_end in your programs.
Then, as a side note, this program sockmap_parse_prog.c is not meant to be used with tc. It is supposed to be attached directly to a socket (search for SOCKMAP_PARSE_PROG in file test_maps.c in the same directory to see how it is loaded there). Technically nothing prevents you from attaching the program as a tc filter, but it will probably not work as expected. In particular, the value returned from the program will probably not have a meaning that the tc classifier hook understands.
So I would advise trying with a recent kernel to see if you have more success. Alternatively, try compiling and running the examples that you can find in your own kernel sources. Good luck!

Where is the kprintf (kernel printf) log on Sierra?

There are lots of pages that explain where the kernel log lives, but I can't find it. Many of the articles I find only apply to El Capitan and older systems.
I cannot use fwkpfv right now as I don't have the right dongles; my client is getting me a used MacBook that supports FireWire.
My kernel extension panics my box. Quite oddly, if my coworker builds my extension, it works just fine. I remain flummoxed.
You can get "live" local kernel logs using the command
log stream --process 0
For looking at past logs, use log show instead, e.g.:
log show --predicate 'processID == 0' --last 1h | less
None of that will help you much with kernel panics, however, as the logging happens asynchronously in user space, so you won't get the very last messages before the panic.
A few more options for debugging kernel panics without FireWire, which you're probably already aware of, but I'll mention them for completeness' sake:
Ethernet-based kernel debugging (as opposed to firewire). Only the test device needs wired/thunderbolt ethernet, the Mac running the debugger can be on wifi.
You can often extract quite a lot of info from the panic log itself: in addition to symbolicating the stack (use the keepsyms=1 boot-arg so you don't have to do it retroactively), looking at the register contents and the disassembly can often tell you the values of variables.
If you're missing parts of Apple's code in the stack trace, run a debug or development kernel instead of the release one. Those are built with fewer optimisations enabled, so functions are less likely to be inlined, etc.
There are a bunch of memory debugging and other diagnostic options you can turn on in the kernel via boot-args, e.g. -zp, -zc and so on (see the sketch after this list).
If you can repro the crash in a VM (VMWare Fusion, Parallels, VirtualBox, KVM/Qemu, whatever), you can use the VM's simulated serial port to log kprintf output. The virtual ethernet ports also tend to support kernel debugging if you set them up right.
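For the boot-args-based suggestions above, a sketch of how you might set them (keepsyms=1 and -zp are from the points above; debug=0x144 is a value commonly used to enable remote kernel debugging and keep the panic log on screen, so adjust it to what you actually need):
sudo nvram boot-args="keepsyms=1 debug=0x144 -zp"
A reboot is needed before the new boot-args take effect.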

cc1plus: error: include: Value too large for defined data type when compiling with g++

I am making a project that should compile on Windows and Linux. I made the project in Visual Studio and then wrote a makefile for Linux. I created all the files on Windows with VS.
It compiles and runs perfectly in VS, but when I run the makefile and it invokes g++, I get:
$ g++ -c -I include -o obj/Linux_x86/Server.obj src/Server.cpp
cc1plus: error: include: Value too large for defined data type
cc1plus: error: src/Server.cpp: Value too large for defined data type
The code is nothing more than a Hello World at the moment. I just wanted to make sure that everything was working before I started development. I have tried searching, but to no avail.
Any help would be appreciated.
I have found a solution, on Ubuntu at least. Like you, I noticed that the error only occurs on mounted Samba shares; it seems to come from g++ stat()ing the file, where the inode number returned is a very large value.
When mounting the share, add ,nounix,noserverino to the options, i.e.:
mount -t cifs -o user=me,pass=secret,nounix,noserverino //server/share /mount
I found the info at http://bbs.archlinux.org/viewtopic.php?id=85999
I had a similar problem: I compiled a project on a CIFS-mounted Samba share. With one Linux kernel the compilation succeeded, but with another Linux kernel (2.6.32.5) I got the same error message: "Value too large for defined data type". When I used the proposed "nounix,noserverino" CIFS mount options, the problem was fixed. So in that case the problem lies with the CIFS mount, and the error message is misleading, as there are no big files involved at all.
From the GNU Core Utils FAQ, item 27, "Value too large for defined data type":
It means that your version of the utilities was not compiled with
large file support enabled. The GNU utilities do support large files
if they are compiled to do so. You may want to compile them again and
make sure that large file support is enabled. This support is
automatically configured by autoconf on most systems. But it is
possible that on your particular system it could not determine how to
do that and therefore autoconf concluded that your system did not
support large files.
The message "Value too large for defined data type" is a system error
message reported when an operation on a large file is attempted using
a non-large file data type. Large files are defined as anything larger
than a signed 32-bit integer, or stated differently, larger than 2GB.
Many system calls that deal with files return values in a "long int"
data type. On 32-bit hardware a long int is 32-bits and therefore this
imposes a 2GB limit on the size of files. When this was invented that
was HUGE and it was hard to conceive of needing anything that large.
Time has passed and files can be much larger today. On native 64-bit
systems the file size limit is usually 2^63 bytes, which we will again
think is huge.
On a 32-bit system with a 32-bit "long int" you find that you can't
make it any bigger and also maintain compatibility with previous
programs. Changing that would break many things! But many systems make
it possible to switch into a new program mode which rewrites all of
the file operations into a 64-bit program model. Instead of "long"
they use a new data type called "off_t" which is constructed to be
64-bits in size. Program source code must be written to use the off_t
data type instead of the long data type. This is typically done by
defining -D_FILE_OFFSET_BITS=64 or some such. It is system dependent.
Once done and once switched into this new mode most programs will
support large files just fine.
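Applied to the g++ invocation from the question, that would look something like this (a sketch; -D_FILE_OFFSET_BITS=64 is the standard glibc switch and usually needs no source changes):
g++ -D_FILE_OFFSET_BITS=64 -c -I include -o obj/Linux_x86/Server.obj src/Server.cpp
That said, this addresses the "built without large file support" case; for the CIFS-mounted share described above, the mount options are the actual fix.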
Twelve years after this question was posted, I got the same error when building my C++ project in Docker on Ubuntu 20.04 under Windows 11 WSL2, using an old 32-bit compiler (gcc-linaro-arm-linux-gnueabi-2012.01-20120125_linux/bin/arm-linux-gnueabi-g++).
The issue is related to the mounting of the Windows filesystem into WSL, where we get 64-bit inode numbers that the old 32-bit toolchain cannot handle; see also The 64 bit inode problem.
As a quick workaround, the build works if you move your project into your Ubuntu home folder.
An alternative solution is to override stat() via LD_PRELOAD, as described in Build on Windows 10 with WSL, section "Fix stat in 32-bit binaries". This must be done in the Docker image. To compile inode64.c I additionally had to install 32-bit header files with
sudo apt-get install gcc-multilib
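A sketch of the two steps, assuming the inode64.c from the linked article is in the current directory (the -m32 matters, since the preloaded shim must match the 32-bit binaries it is injected into; hello.cpp stands in for your source file):
gcc -m32 -shared -fPIC -o inode64.so inode64.c
LD_PRELOAD=$PWD/inode64.so arm-linux-gnueabi-g++ -c -o hello.o hello.cpp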
If you are on a mergerfs filesystem, removing the use_ino option will solve the issue: https://github.com/trapexit/mergerfs/issues/485
I think your g++ parameters are a bit off:
-c compile only
-I directory of your includes (just plain include might be ambiguous; try the full path)
-o output file (this combines fine with -c; it names the object file)

Locate bad memory access on Solaris

On Linux, FreeBSD and other systems I have valgrind for checking for memory errors like invalid reads and similar. I really love valgrind. Now I have to test code on Solaris/OpenSolaris and can't find a way to get information on invalid reads/writes in as nice a way (or better ;-)) as valgrind there.
When searching for this on the net I find references to libumem, but I only get reports about memory leaks there, not invalid accesses. What am I missing?
The dbx included with the Sun Studio compilers includes memory access checking support in its "Run Time Checking" feature (the check subcommand). See:
Solaris Studio 12.4 dbx manual: Chapter 9: Using Runtime Checking
Debugging Applications with Sun Studio dbx, dbxtool, and the Thread Analyzer
Leonard Li's Weblog: Runtime Memory Checking
The related "Sun Memory Error Discovery Tool" is also available from
http://cooltools.sunsource.net/discover/
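A minimal sketch of a runtime checking session in dbx (check -all turns on both access and memory-use checking; ./myprog stands in for your binary):
dbx ./myprog
(dbx) check -all
(dbx) run
When an invalid read or write occurs, dbx stops the process and reports the error type and a stack trace, which is reasonably close to what valgrind gives you.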
Since version 3.11.0, Valgrind does run on Solaris.
See Release Notes and Supported Platforms.
More precisely, x86/Solaris and amd64/Solaris are now supported.
Support for sparc/Solaris is still in the works.
watchmalloc is quite a useful library that can be loaded dynamically into your program (usually no recompiling needed); it then sets watchpoints at all the usually problematic memory locations, like freed areas or the space just past an allocated memory block.
If your program accesses one of these invalid areas it gets a signal, and you can inspect it in the debugger.
Depending on the configuration, problematic areas can be watched for writes only, or also for reads.
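Usage is a matter of preloading the library and picking a mode through MALLOC_DEBUG (see watchmalloc(3MALLOC); WATCH traps writes to the watched areas, RW also traps reads at a further speed cost; ./myprog stands in for your binary):
LD_PRELOAD=watchmalloc.so.1 MALLOC_DEBUG=WATCH ./myprog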
