Capture vDSO in strace - linux-kernel

I was wondering if there is a way to capture (in other words observe) vDSO calls like gettimeofday in strace.
Also, is there a way to execute a binary without loading linux-vdso.so.1 (a flag or env variable)?
And lastly, what if I write a program that deletes the linux-vdso.so.1 address from the auxiliary vector and then calls execve on my program? Has anyone ever tried that?

You can capture calls to system calls that are implemented via the vDSO by using ltrace instead of strace. A vDSO-implemented call runs entirely in user space and never enters the kernel, so the method strace uses to trace system calls never sees it. To learn more about how strace works, check out this blog post I wrote about strace. And, to learn more about how ltrace works, check out this other blog post I wrote about ltrace.
No, it is not possible to execute a binary without loading linux-vdso.so.1. At least, not on my version of libc on Ubuntu precise. It is certainly possible that newer versions of libc/eglibc/etc have added this as a feature but it seems very unlikely. See the next answer for why.
If you delete the address from the auxiliary vector, your binary will probably crash. libc has a piece of code which will first attempt to walk the vDSO ELF object, and if this fails, will fall back to a hardcoded vsyscall address. The only way it will avoid this is if you've compiled glibc with the vDSO disabled.
There is another workaround, though, if you really, really don't want to use the vDSO. You can try using glibc's syscall function and pass in the syscall number for gettimeofday. This will force glibc to call gettimeofday via the kernel instead of the vDSO.
I've included a sample program below illustrating this. You can read more about how system calls work by reading my syscall blog post.
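#define _GNU_SOURCE   /* must come before any #include to expose syscall() */
#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

int
main(int argc, char *argv[]) {
    struct timeval tv;

    /* Enter the kernel directly, bypassing the vDSO; no timezone is requested. */
    syscall(SYS_gettimeofday, &tv, NULL);
    return 0;
}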
Compile with gcc -o test test.c and strace with strace -ttTf ./test 2>&1 | grep gettimeofday:
09:57:32.651876 gettimeofday({1467305852, 651888}, {420, 140735905220705}) = 0 <0.000006>
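As a minimal sketch of the two points above (where the vDSO address lives, and the difference between the libc wrapper and a raw system call), the following helpers may be useful. It assumes Linux with glibc 2.16 or later, which provides getauxval(); the function names vdso_base, seconds_via_libc and seconds_via_syscall are made up for illustration.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/auxv.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/* Address of the vDSO ELF header as recorded in the auxiliary vector,
 * or 0 if the kernel did not provide one. */
unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Seconds since the epoch via the libc wrapper. On most systems this is
 * served by the vDSO, so strace normally shows nothing for it. */
long seconds_via_libc(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    return (long)tv.tv_sec;
}

/* Seconds since the epoch via a real kernel entry, which strace can see. */
long seconds_via_syscall(void)
{
    struct timeval tv;

    if (syscall(SYS_gettimeofday, &tv, NULL) != 0)
        return -1;
    return (long)tv.tv_sec;
}
```

Calling both time helpers from a small main() and running it under strace should, on a system where gettimeofday is vDSO-backed, show a gettimeofday line only for the second one.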

Related

How to find hanging LLVM optimization pass?

I've written an LLVM pass that replaces a few store instructions with calls to a function that performs some bookkeeping and then performs the store in a special way. It works fine when I compile with -O0, but I can only guarantee the functionality of my pass when using -O3. When I compile with -O3 (or -O1/-O2), my pass completes successfully, and then the compiler hangs in some later optimization stage. Is there a way to discover which optimization pass is hanging, and why?
Just so I don't have to provide it later, here is my code and my compile line.
clang++-5.0 -std=c++11 -Xclang -load -Xclang ../../plugin/build/mylib.so single_param.cc -c -I ../../libs/ -S -emit-llvm -O3
The problem is not in code generation, because I'm only generating bitcode. I noticed that stores at -O3 (without my pass) include alias information, and I thought that since I'm deleting these instructions, some later optimization using this alias information might run into trouble, so I turned off most of the alias analysis using -fno-strict-aliasing.
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
void __attribute__((noinline)) f(int *n){
    *n = *n + 1;
}
int main(){
    int a = 4;
    f(&a);
    return a;
}
I was able to find the stalling pass by turning remarks on with:
-Rpass=.* -Rpass-missed=.* -Rpass-analysis=.*
I found that the only optimization pass giving remarks was tail call optimization, so I turned it off. I later found the problem with my code, but this is how I found the problem I was causing.

How to get backtrace without core file?

Suppose a core dump is not generated (for whatever reason) and I want to know the backtrace (the execution sequence of instructions). How can I do that?
/proc/pid/maps stores the memory map of a process.
Is there any file in Linux that stores the user-space or kernel-space state of a process? (Maybe I'm using the wrong words to express this.)
I mean the address-by-address sequence of executed instructions.
To see what the kernel stack for a process looks like at the present time:
sudo cat /proc/PID/stack
If you want to see the user stack for a process, and can get to it while it is still running (even if it is stuck waiting for a system call to return), run gdb and attach to it using its PID. Then use the backtrace command. This will be much more informative if the program was compiled with debug symbols.
If you want a backtrace to be printed in the Linux kernel, use dump_stack().
If you want a backtrace to be printed in user-level C code, implement something like this:
#include <stdlib.h>
#include <stdio.h>
#include <execinfo.h>

#define BACKTRACE_SIZ 64

void show_backtrace (void)
{
    void *array[BACKTRACE_SIZ];
    int size, i;    /* backtrace() returns an int frame count */
    char **strings;

    size = backtrace(array, BACKTRACE_SIZ);
    strings = backtrace_symbols(array, size);
    if (strings == NULL)    /* allocation failed, nothing to print */
        return;
    for (i = 0; i < size; i++) {
        printf("%p : %s\n", array[i], strings[i]);
    }
    free(strings); // malloced by backtrace_symbols
}
Then compile the code with the -funwind-tables flag and link with -rdynamic.
Source: http://www.stlinux.com/devel/debug/backtrace

Implementing a syscall on real-time Debian Wheezy

For educational purposes, I want to implement a system call in Debian Wheezy. I wish to implement it on the kernel that comes in the linux-image-3.2.0--rt-amd64 package. Here is an overview of what I have tried:
To get the kernel source:
apt-get source linux-image-3.2.0-4-rt-amd64
From that, I get the following files/directories in the directory where I ran the command:
linux_3.2.41.orig.tar.xz
linux_3.2.41-2+deb7u2.dsc
linux_3.2.41-2+deb7u2.debian.tar.xz
as well as:
linux_3.2.41
which contains the source code for the kernel.
Then, to make the necessary changes in order to add the system call, I basically followed this page:
How to write system calls on debian/ubuntu
The following is a condensed version of the instructions given there modified to reflect the changes I made.
File 1: linux-x.x.x/vpart_syscalls/vpart_syscalls.c
#include <linux/linkage.h>
#include <linux/kernel.h>
asmlinkage long insert_partition(char*dest, const char* src)
{
printk("<--- the syscall has been called!");
return 0;
}
File 2: linux-x.x.x/vpart_syscalls/Makefile. Create a Makefile within the same test directory you created above and put this line in it:
obj-y := vpart_syscalls.o
File 3: linux-x.x.x/arch/x86/kernel/syscall_table_32.S. Now, you have to add your system call to the system call table. Append to the file the following line:
.long insert_partition
File 4: linux-x.x.x/arch/x86/include/asm/unistd_32.h
In this file, the names of all the system calls will be associated with a unique number. After the last system call-number pair, add a line
#define __NR_insert_partition 349
Then replace the NR_syscalls value, which states the total number of system calls, with the existing number incremented by 1 (in the tutorial's example NR_syscalls was 338 and the new value is 339; here the new value is 350).
#define NR_syscalls 350
File 5: linux-x.x.x/include/linux/syscalls.h
Append to the file the prototype of our function.
asmlinkage long insert_partition(int lenTicks, int vpid);
just before the #endif line in the file.
File 6: Makefile at the root of source directory.
Open Makefile and find the line where core-y is defined and add the directory test to the end of that line.
core-y += kernel/ mm/ fs/ test/ vpart_syscalls/
I then proceeded to build the kernel in a different fashion than is described there:
make localmodconfig
make menuconfig (making no changes)
make-kpkg clean
fakeroot make-kpkg --initrd --append-to-version=+tm kernel_image kernel_headers
cd ..
dpkg -i linux-image-3.8.*
dpkg -i linux-headers-3.8.*
The kernel that is installed boots fine. I made the following c program to test the syscall:
#include <stdio.h>
#include <unistd.h>
#include <linux/unistd.h>
#include <sys/syscall.h>
int main(){
    printf("Calling the new syscall!\n");
    int ret = 100;
    ret = syscall(349, 1, 2);
    printf("call return value: %i\n", ret);
    return 0;
}
When I compile and run this program, I get a return value of -1. I checked the messages using dmesg and there is no evidence of my printk being called.
If anyone knows where my problem is I would be really, really happy! I should say I am not too experienced at changing and building the kernel, but I have learned a lot about it. I read Robert Love's book, Linux Kernel Development, and several guides on the web.
I think steps 3 and 4 may be incorrect for 64-bit kernels:
File 3: linux-x.x.x/arch/x86/kernel/syscall_table_32.S.
File 4: linux-x.x.x/arch/x86/include/asm/unistd_32.h
There are two files here: http://lxr.linux.no/linux+v3.2.41/arch/x86/kernel/
syscall_64.c 668 2008-12-24 14:26:58 -0800
syscall_table_32.S 8659 2012-01-04 14:55:50 -0800
The first one defines the syscall table contents for 64-bit mode, using a C file and a macro trick with unistd_64.h:
#define __SYSCALL(nr, sym) [nr] = sym,
const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
....
#include <asm/unistd_64.h>
};
Where asm/unistd_64.h is
#define __NR_read 0
__SYSCALL(__NR_read, sys_read)
and so on.
The second one, which you changed, is for 32-bit mode and is written as an asm file with labels (.long sys_call_name).
So you defined the syscall for 32-bit mode, but you are using linux-image-3.2.0-4-rt-amd64, which is for 64-bit PCs.
I think you compiled your test program as gcc test.c, which defaults to 64-bit mode. You can try gcc's -m32 option (gcc -m32 test.c) to get a 32-bit application (this will only work if you have a working environment for 32-bit builds), or compile the test on a 32-bit Linux.
Or the other choice is to make step "4a": edit arch/x86/include/asm/unistd_64.h to add two lines:
#define __NR_insert_partition YOUR_NUMBER
__SYSCALL(__NR_insert_partition, insert_partition)
I'm not sure where and how NR_syscalls for 64bit is defined. It may be generated during build.
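A return value of -1 from syscall() only says that the call failed; errno says why. As a rough sketch, a helper like the one below (probe_syscall is a made-up name) could be used in the test program to tell the cases apart: ENOSYS would be consistent with the number not being wired into the table used by the ABI the program was built for.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke syscall number nr with two dummy integer arguments, as the test
 * program in the question does. Returns 0 if the kernel accepted the call,
 * otherwise the errno value it reported (for example ENOSYS).
 * Only meant for probing a newly added number, not arbitrary syscalls. */
int probe_syscall(long nr)
{
    long ret;

    errno = 0;
    ret = syscall(nr, 1, 2);
    if (ret == -1 && errno != 0)
        return errno;
    return 0;
}
```

For example, printing strerror(probe_syscall(349)) in the test program would show the kernel's reason for the failure instead of a bare -1.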

sys_call_table in linux kernel 2.6.18

I am trying to set the sys exit call to a variable by
extern void *sys_call_table[];
real_sys_exit = sys_call_table[__NR_exit]
however, when I try to make, the console gives me the error
error: ‘__NR_exit’ undeclared (first use in this function)
Any tips would be appreciated :) Thank you
Since you are on kernel 2.6.x, sys_call_table isn't exported any more.
If you want to avoid the compilation error, try this include:
#include <linux/unistd.h>
However, the code still will not work, because the symbol is not exported. The workaround to "play" with sys_call_table is to find its address in the System.map file (located in /boot) with this command:
grep sys_call System.map-2.6.X -i
This will give the address; then this code should allow you to modify the table:
unsigned long *sys_call_table;
sys_call_table = (unsigned long *) simple_strtoul("0xc0318500",NULL,16);
original_mkdir = sys_call_table[__NR_mkdir];
sys_call_table[__NR_mkdir] = mkdir_modificado;
I hope it works for you. I have only tested it under kernel 2.6.24, but it should also work for 2.6.18.
Also check this page, which is very good:
http://commons.oreilly.com/wiki/index.php/Network_Security_Tools/Modifying_and_Hacking_Security_Tools/Fun_with_Linux_Kernel_Modules
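The grep step above can also be sketched in user-space C, so that the address does not have to be copied by hand. This assumes the usual System.map line layout of "address type name"; the function names parse_symbol_line and find_symbol are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Parse one System.map line of the form "<hex address> <type> <name>".
 * Returns 1 and stores the address if the name equals symbol, else 0. */
int parse_symbol_line(const char *line, const char *symbol, unsigned long *addr)
{
    unsigned long value;
    char type;
    char name[256];

    if (sscanf(line, "%lx %c %255s", &value, &type, name) != 3)
        return 0;
    if (strcmp(name, symbol) != 0)
        return 0;
    *addr = value;
    return 1;
}

/* Scan a System.map-style file for symbol.
 * Returns 1 if found, 0 if not found, -1 if the file cannot be opened. */
int find_symbol(const char *path, const char *symbol, unsigned long *addr)
{
    char line[512];
    int found = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (!found && fgets(line, sizeof line, fp) != NULL)
        found = parse_symbol_line(line, symbol, addr);
    fclose(fp);
    return found;
}
```

A call such as find_symbol("/boot/System.map-2.6.X", "sys_call_table", &addr) would then yield the value that is hard-coded in the snippet above; the path is the same placeholder used in the grep command.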
If you haven't included the file syscall.h, you should do that ahead of the reference to __NR_exit. For example,
#include <syscall.h>
#include <stdio.h>
int main()
{
printf("%d\n", __NR_exit);
return 0;
}
which returns:
$ cc t.c
$ ./a.out
60
Some other observations:
If you've already included the file, the usual reasons __NR_exit wouldn't be defined are that the definition was being ignored due to conditional compilation (#ifdef or #ifndef at work somewhere) or because it's being removed elsewhere through a #undef.
If you're writing the code for kernel space, you have a completely different set of headers to use. LXR (http://lxr.linux.no/linux), a searchable, browsable archive of the kernel source, is a helpful resource.

Any way to disable `tempnam' is dangerous, better use `mkstemp' gcc warning?

I'm using tempnam() only to get the directory name, so this security warning does not apply to my case. How can I disable it? I couldn't find any switches to do it.
If you really only want the directory name, use the string constant macro P_tmpdir, defined in <stdio.h>.
"The tempnam() function returns a pointer to a string that is a valid filename, and such that a file with this name did not exist when tempnam() checked."
The warning arises because of the race condition between the check and the later creation of the file.
You want to only get the directory name? What should that be good for?
Like stranger already said, you may disable this (and similar warnings) using -Wno-deprecated-declarations.
The answer is no, because on many systems the GNU C library (glibc), which implements this function, is compiled so as to trigger a linker warning when the function is used.
See:
GCC bug page regarding this issue - I filed this a short while ago.
GNU ld bug page regarding this issue - filed in 2010!
GNU ld bug page suggesting an approach for resolving the issue - I filed this a short while ago.
Note that the problem is not specific to GCC - any linker is supposed to emit this warning, its trigger is "hard-coded" in the compiled library.
If you want to create a temporary directory that's unique for the process, you can use mkdtemp.
This can, e.g., be useful to create FIFOs in there, or when a program needs to create lots of temporary files or trees of directories and files: Then they can be put into that directory.
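A minimal sketch of that suggestion, combining P_tmpdir from the earlier answer with mkdtemp(); the helper name make_private_tmpdir and the "example-" prefix are assumptions for illustration.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a uniquely named directory under P_tmpdir.
 * On success buf holds the new path and 0 is returned;
 * on failure (buffer too small or mkdtemp() error) -1 is returned. */
int make_private_tmpdir(char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%s/example-XXXXXX", P_tmpdir);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    /* mkdtemp() replaces the XXXXXX suffix in place and creates the
     * directory with mode 0700, so there is no check-then-create race. */
    return mkdtemp(buf) != NULL ? 0 : -1;
}
```

The caller owns the directory afterwards and is expected to remove it (for example with rmdir() once it is empty).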
Because this is a linker warning, it can be hidden using the ASM workaround/hack from this answer:
https://stackoverflow.com/a/29205123/2550395
Something like this (quick and dirty):
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>

char my_file[L_tmpnam];

#define __hide_section_warning(section_string) \
    __asm__ (".section " section_string "\n.string \"\rquidquid agis prudenter agas et respice finem \"\n\t.previous");

/* If you want to hide the linker's output */
#define hide_warning(symbol) \
    __hide_section_warning (".gnu.warning." #symbol)

hide_warning(tmpnam)

int main(void)
{
    int lock_fd;

    tmpnam( my_file );
    lock_fd = open( my_file, O_CREAT | O_WRONLY, (S_IRUSR | S_IWUSR | S_IRGRP) );
    return lock_fd < 0;
}
However, it still will leave a trace in the Make.p file and therefore isn't perfectly clean, besides already being a hack.
PS: It works on my machine ¯\_(ツ)_/¯
You can use GCC's -Wno-deprecated-declarations option to disable all warnings like this. I suggest you handle the warning properly, though, and take the suggestion of the compiler.
