Base address of thread stack in Linux - linux-kernel

I have a question. In Linux C programming, if we create a thread using pthreads, we can get its thread ID with pthread_self().
Is there a simple way to obtain the base address of this thread's stack (using some pthread API)?
Thank you!

The correct way to do this is to use pthread_getattr_np (a non-portable extension; it seems to work with glibc and musl) together with pthread_attr_getstack:
#define _GNU_SOURCE //for pthread_getattr_np
#include <pthread.h>
void* thread_main(void* args) {
//your code here
return NULL;
}
int main() {
pthread_t thread;
pthread_create(&thread, NULL, &thread_main, NULL);
pthread_attr_t attrs;
pthread_getattr_np(thread, &attrs);
void* stack_ptr;
size_t stack_size;
pthread_attr_getstack(&attrs, &stack_ptr, &stack_size);
//`stack_ptr` now points to the lowest address of the spawned thread's stack, which spans `stack_size` bytes
...
}
If you try to avoid the call to pthread_getattr_np, you may get unexpected results. If you leave the stack address at its default value, pthread_create() ignores it and picks a suitable stack address itself (on Linux, with GNU libc and musl libc).
#define _GNU_SOURCE //for pthread_getattr_np
#include <pthread.h>
#include <stdio.h>
void* thread_main(void* args) {
//your code here
return NULL;
}
int main() {
pthread_t thread;
pthread_attr_t attrs;
pthread_attr_init(&attrs);
void* stack_ptr;
size_t stack_size;
pthread_attr_getstack(&attrs, &stack_ptr, &stack_size);
printf("default stack address:\t%p\n", stack_ptr); //%p is the correct conversion for a pointer
pthread_create(&thread, &attrs, &thread_main, NULL);
pthread_getattr_np(thread, &attrs);
pthread_attr_getstack(&attrs, &stack_ptr, &stack_size);
//`stack_ptr` now points to the base address of the spawned thread's stack
printf("real stack address:\t%p\n", stack_ptr); //%p is the correct conversion for a pointer
return 0;
}
output:
default stack address: 0
real stack address: d385a000
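A thread can also query its own stack by passing pthread_self() to pthread_getattr_np. The following is a minimal sketch, assuming glibc or musl on Linux; the struct stack_bounds type and the helper names current_stack_bounds, on_stack and check_thread are hypothetical, not part of pthreads. Note that pthread_attr_getstack reports the lowest address of the stack area, so where the stack grows downward the highest address is low + size.

```c
#define _GNU_SOURCE /* for pthread_getattr_np */
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: describes one thread's stack area. */
struct stack_bounds {
    void  *low;   /* lowest address, as reported by pthread_attr_getstack */
    size_t size;  /* size of the stack area in bytes */
};

/* Fills *out for the calling thread.
 * Returns 0 on success, or the pthread error number on failure. */
static int current_stack_bounds(struct stack_bounds *out)
{
    pthread_attr_t attrs;
    int err = pthread_getattr_np(pthread_self(), &attrs);
    if (err != 0)
        return err;
    err = pthread_attr_getstack(&attrs, &out->low, &out->size);
    pthread_attr_destroy(&attrs); /* release what pthread_getattr_np set up */
    return err;
}

/* Returns 1 if p lies inside [low, low + size), else 0. */
static int on_stack(const struct stack_bounds *b, const void *p)
{
    uintptr_t lo = (uintptr_t)b->low;
    uintptr_t a  = (uintptr_t)p;
    return a >= lo && a - lo < b->size;
}

/* Thread entry: stores 1 through arg if a local variable of this
 * thread lies within the reported bounds, 0 otherwise. */
static void *check_thread(void *arg)
{
    int local = 0;
    struct stack_bounds b;
    int *result = arg;
    *result = (current_stack_bounds(&b) == 0) && on_stack(&b, &local);
    return NULL;
}
```

Passing check_thread to pthread_create with a pointer to an int, and joining the thread, leaves 1 in that int when the address of a local variable falls inside the reported range.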

It is 0x00000000. Surprisingly..

Related

linux hw_breakpoint does not work while accessing memory from userspace

I am debugging an ARMv7 board and I want to know whether a kernel symbol is accessed, so I have to use hw_breakpoint in the kernel.
For simplicity, I tested with the kernel sample code data_breakpoint, located in samples/hw_breakpoint/data_breakpoint.c.
Then I ran the following:
insmod data_breakpoint.ko ksym=max
cat /proc/kallsyms | grep max
./read_kmem c06fa128
But this did not trigger the callback function.
If I print the value at that address from any kernel module, the callback function is triggered.
I read the CPU manual, and it says that the breakpoint registers in my CPU support virtual address matching. But I don't know why it doesn't work when the memory is accessed from userspace. I think the program does read the right value of the kernel symbol.
read_kmem.c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/mman.h>
#define DEVKMEM "/dev/kmem"
#define PAGE_SIZE 0x1000
#define PAGE_MASK (~(PAGE_SIZE-1))
int main(int argc, char* argv[])
{
int fd;
char *mbase;
char read_buf[10];
unsigned int varAddr;
varAddr = strtoul(argv[1], 0, 16);
unsigned int ptr = varAddr & ~(PAGE_MASK);
fd = open(DEVKMEM, O_RDONLY);
if (fd == -1) {
perror("open");
exit(-1);
}
mbase = mmap(0,PAGE_SIZE,PROT_READ,MAP_SHARED,fd, (varAddr & PAGE_MASK));
if (mbase == MAP_FAILED) {
printf("map failed %s\n",strerror(errno));
close(fd);
exit(-1); /* do not dereference a failed mapping below */
}
printf("varAddr = 0x%X \n", varAddr);
printf("mapbase = 0x%X \n", (unsigned int)mbase);
printf("value = 0x%X \n",*(unsigned int*)(mbase+ptr));
close(fd);
munmap(mbase,PAGE_SIZE);
return 0;
}
Your userspace program does not access address c06fa128; it accesses a different address, the one that mmap() returned (plus offset). Thus the breakpoint is not hit.
The fact that the virtual address being accessed resolves to the same physical address as some other virtual address that has a breakpoint does not matter. The CPU executing your userspace code has no idea that a different mapping exists.
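The point about distinct virtual addresses can be reproduced entirely in user space. The sketch below is an illustration under assumptions, not the kernel mechanism itself: it maps the same page of a temporary file twice with MAP_SHARED, and the helper name alias_demo is hypothetical. Both mappings are backed by the same page, yet they have different virtual addresses, so an address-matching watchpoint set on one would not fire for accesses made through the other.

```c
#define _POSIX_C_SOURCE 200809L /* for fileno and ftruncate */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: maps one page of a temporary file twice.
 * Returns 1 if the two mappings have different addresses and a write
 * through the first is visible through the second, 0 if not,
 * and -1 if setting up the mappings failed. */
static int alias_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;

    int fd = fileno(tmp);
    int result = -1;
    if (ftruncate(fd, page) == 0) {
        int *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
        int *b = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
        if (a != MAP_FAILED && b != MAP_FAILED) {
            a[0] = 0x1234; /* the CPU sees an access to address a, not b */
            result = (a != b) && (b[0] == 0x1234);
        }
        if (a != MAP_FAILED)
            munmap(a, (size_t)page);
        if (b != MAP_FAILED)
            munmap(b, (size_t)page);
    }
    fclose(tmp); /* also closes fd and removes the temporary file */
    return result;
}
```

The same reasoning applies to the question: the /dev/kmem mapping plays the role of the second mapping, and the breakpoint was armed on the first.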

Catch system calls on Mac OS X

I'm trying to catch all system calls made by a given PID with a self-made program (I can't use strace, dtruss, gdb, ...). So I used the function
kern_return_t task_set_emulation(task_t target_port, vm_address_t routine_entry_pt, int routine_number), declared in /usr/include/mach/task.h.
I've written a little program to catch the write syscall:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <mach/mach.h>
#include <mach/mach_vm.h>
void do_exit(char *msg)
{
printf("Error::%s\n", msg);
exit(42);
}
int main(void)
{
mach_port_t the_task;
mach_vm_address_t address;
mach_vm_size_t size;
mach_port_t the_thread;
kern_return_t kerr;
//Initialisation
address = 0;
size = 1ul * 1024;
the_task = mach_task_self(); //Get the current program task
kerr = mach_vm_allocate(the_task, &address, size, VM_MEMORY_MALLOC); //Allocate a new address for the test
if (kerr != KERN_SUCCESS)
{ do_exit("vm_allocate"); }
printf("address::%llx, size::%llu\n", address, size); //debug
//Process
kerr = task_set_emulation(the_task, address, SYS_write); //About to catch write syscalls
the_thread = mach_thread_self(); //Verify if a thread is opened (even if it's obvious)
printf("kerr::%d, thread::%d\n", kerr, the_thread); //debug
if (kerr != KERN_SUCCESS)
{ do_exit("set_emulation"); }
//Use some writes for the example
write(1, "Bonjour\n", 8);
write(1, "Bonjour\n", 8);
}
The Output is :
address::0x106abe000, size::1024
kerr::46, thread::1295
Error::set_emulation
Kernel error 46 corresponds to the macro KERN_NOT_SUPPORTED, described as an "Empty thread activation (No thread linked to it)" in /usr/include/mach/kern_return.h, and it happened even before I called write.
My question is: what did I do wrong in this process? Does KERN_NOT_SUPPORTED mean that the call is simply not implemented, rather than indicating a thread problem?
The source code in XNU for the task_set_emulation is:
kern_return_t
task_set_emulation(
__unused task_t task,
__unused vm_offset_t routine_entry_pt,
__unused int routine_number)
{
return KERN_NOT_SUPPORTED;
}
In other words, the function is a stub that always returns KERN_NOT_SUPPORTED: task_set_emulation is not implemented.

How to make lldb ignore EXC_BAD_ACCESS exception?

I am writing a program on Mac OS X that depends on the sigaction/sa_handler mechanism: it runs a code snippet from the user and is ready to catch signals/exceptions at any time. The program works fine, but the problem is that I can't debug it with lldb. lldb does not seem to be able to ignore any exceptions, even when I set
proc hand -p true -s false SIGSEGV
proc hand -p true -s false SIGBUS
The control flow stops at the instruction that triggers the exception and does not jump to the sa_handler I installed earlier, even when I use the c command. The output was:
Process 764 stopped
* thread #2: tid = 0xf140, 0x00000001000b8000, stop reason = EXC_BAD_ACCESS (code=2, address=0x1000b8000)
How do I make lldb ignore the exception/signal and let the sa_handler of the program do its work?
EDIT: sample code
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <pthread.h>
#include <unistd.h>
static void handler(int signo, siginfo_t *sigaction, void *context)
{
printf("in handler.\n");
signal(signo, SIG_DFL);
}
static void gen_exception()
{
printf("gen_exception in.\n");
*(int *)0 = 0;
printf("gen_exception out.\n");
}
void *gen_exception_thread(void *parg)
{
gen_exception();
return 0;
}
int main()
{
struct sigaction sa;
sa.sa_sigaction = handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_SIGINFO;
if(sigaction(/*SIGBUS*/SIGSEGV, &sa, NULL) == -1) {
printf("sigaction fails.\n");
return 0;
}
pthread_t id;
pthread_create(&id, NULL, gen_exception_thread, NULL);
pthread_join(id, NULL);
return 0;
}
I needed this in a recent project, so I just built my own LLDB. I patched a line in tools/debugserver/source/MacOSX/MachTask.mm from
err = ::task_set_exception_ports (task, m_exc_port_info.mask, m_exception_port, EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES, THREAD_STATE_NONE);
to
err = ::task_set_exception_ports (task, m_exc_port_info.mask & ~EXC_MASK_BAD_ACCESS, m_exception_port, EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES, THREAD_STATE_NONE);
which causes the debugserver to be unable to catch EXC_BAD_ACCESS exceptions. Now, my custom LLDB works just fine: it still catches SIGSEGV and SIGBUS but no longer enters a silly infinite loop when faced with EXC_BAD_ACCESS. Setting process handle options on the previously-fatal signals works fine too, and I can now debug SEGV handlers with impunity.
Apple really ought to make this an option in LLDB...seems like a really easy fix for them.
This is a long-standing bug in the debugger interface in Mac OS X (gdb had the same problem...). If you have a developer account, please file a bug at http://bugreport.apple.com. So few people actually use SIGSEGV handlers that the problem never gets any attention from the kernel folks, so more bug reports are good...
We can do it easily. Just add this code.
#include <mach/task.h>
#include <mach/mach_init.h>
#include <mach/mach_port.h>
int ret = task_set_exception_ports(
mach_task_self(),
EXC_MASK_BAD_ACCESS,
MACH_PORT_NULL,//m_exception_port,
EXCEPTION_DEFAULT,
0);
Don't forget to do this
proc hand -p true -s false SIGSEGV
proc hand -p true -s false SIGBUS
Full code:
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <pthread.h>
#include <unistd.h>
#include <mach/task.h>
#include <mach/mach_init.h>
#include <mach/mach_port.h>
static void handler(int signo, siginfo_t *sigaction, void *context)
{
printf("in handler.\n");
signal(signo, SIG_DFL);
}
static void gen_exception()
{
printf("gen_exception in.\n");
*(int *)0 = 0;
printf("gen_exception out.\n");
}
void *gen_exception_thread(void *parg)
{
gen_exception();
return 0;
}
int main()
{
task_set_exception_ports(
mach_task_self(),
EXC_MASK_BAD_ACCESS,
MACH_PORT_NULL,//m_exception_port,
EXCEPTION_DEFAULT,
0);
struct sigaction sa;
sa.sa_sigaction = handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_SIGINFO;
if(sigaction(/*SIGBUS*/SIGSEGV, &sa, NULL) == -1) {
printf("sigaction fails.\n");
return 0;
}
pthread_t id;
pthread_create(&id, NULL, gen_exception_thread, NULL);
pthread_join(id, NULL);
return 0;
}
Refer to (Chinese article): https://zhuanlan.zhihu.com/p/33542591
A little bit of example code can make a question like this a lot easier to answer ... I've never used the sigaction API before but I threw this together -
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
void segv_handler (int in)
{
puts ("in segv_handler()");
}
void sigbus_handler (int in)
{
puts ("in sigbus_handler()");
}
int main ()
{
struct sigaction action;
sigemptyset (&action.sa_mask); /* portable; plain assignment only works where sigset_t is an integer type */
action.sa_flags = 0;
action.sa_handler = segv_handler;
sigaction (SIGSEGV, &action, NULL);
action.sa_handler = sigbus_handler;
sigaction (SIGBUS, &action, NULL);
puts ("about to send SIGSEGV signal from main()");
kill (getpid(), SIGSEGV);
puts ("about to send SIGBUS signal from main()");
kill (getpid(), SIGBUS);
puts ("exiting main()");
}
% lldb a.out
(lldb) br s -n main
(lldb) r
(lldb) pr h -p true -s false SIGSEGV SIGBUS
(lldb) c
Process 54743 resuming
about to send SIGSEGV signal from main()
Process 54743 stopped and restarted: thread 1 received signal: SIGSEGV
in segv_handler()
about to send SIGBUS signal from main()
Process 54743 stopped and restarted: thread 1 received signal: SIGBUS
in sigbus_handler()
exiting main()
Process 54743 exited with status = 0 (0x00000000)
(lldb)
Everything looks like it's working correctly here. If I'd added -n false to the process handle arguments, lldb wouldn't have printed the lines about Process .. stopped and restarted.
Note that these signal settings do not persist across process executions. So if you're starting your debug session over (r once you've already started the process once), you'll need to re-set these. You may want to create a command alias shortcut and put it in your ~/.lldbinit file so you can set the process handling the way you prefer with a short cmd.
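For experiments like this, a handler that only records the signal in a volatile sig_atomic_t flag avoids calling stdio from signal context (puts is not on the POSIX list of async-signal-safe functions). The following is a minimal sketch under that assumption; the names last_signal, record_signal, install_handler and raise_and_report are hypothetical, and it uses raise() rather than kill(getpid(), ...) so that the handler is guaranteed to have run before the call returns.

```c
#define _POSIX_C_SOURCE 200809L /* for sigaction */
#include <signal.h>
#include <string.h>

/* Written only by the handler; read by normal code afterwards. */
static volatile sig_atomic_t last_signal = 0;

/* Handler: does nothing except remember which signal arrived. */
static void record_signal(int signo)
{
    last_signal = signo;
}

/* Installs record_signal for signo.
 * Returns 0 on success, -1 on failure (as sigaction does). */
static int install_handler(int signo)
{
    struct sigaction action;
    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask); /* block no extra signals in the handler */
    action.sa_flags = 0;
    action.sa_handler = record_signal;
    return sigaction(signo, &action, NULL);
}

/* Raises signo in the calling thread and returns the signal number
 * the handler recorded, or -1 if raise() itself failed. */
static int raise_and_report(int signo)
{
    last_signal = 0;
    if (raise(signo) != 0)
        return -1;
    return (int)last_signal;
}
```

After install_handler(SIGSEGV) succeeds, raise_and_report(SIGSEGV) returns SIGSEGV, and the same holds for SIGBUS. Under lldb, the process handle settings shown above still decide whether the debugger stops on delivery.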

How to get thread stack information on Windows?

I enumerate all threads in a process through the CreateToolhelp32Snapshot function. I would like to get some basic stack information for each thread. More specifically I would like to get stack bottom address and if possible I would like to get current stack top address. Basically this is the information displayed with the ~*k command in WinDbg. So how can I obtain the stack information from the thread's ID or HANDLE?
(Definitions can be found here.)
To get stack boundaries:
THREAD_BASIC_INFORMATION basicInfo;
NT_TIB tib;
// Get TEB address
NtQueryInformationThread(YOUR_THREAD_HANDLE, ThreadBasicInformation, &basicInfo, sizeof(THREAD_BASIC_INFORMATION), NULL);
// Read TIB
NtReadVirtualMemory(YOUR_PROCESS_HANDLE, basicInfo.TebBaseAddress, &tib, sizeof(NT_TIB), NULL);
// Check tib.StackBase and tib.StackLimit
To get the value of esp, simply use GetThreadContext.
An easier way without having to involve the Windows Driver Kit is as so:
NT_TIB* tib = (NT_TIB*)__readfsdword(0x18);
size_t* stackBottom = (size_t*)tib->StackLimit;
size_t* stackTop = (size_t*)tib->StackBase;
__readfsdword() works only for the current thread. So, the variant with NtQueryInformationThread() is more flexible.
Added some declarations which are missed in ntdll.h:
typedef enum _THREADINFOCLASS {
ThreadBasicInformation = 0,
} THREADINFOCLASS;
typedef LONG KPRIORITY;
typedef struct _CLIENT_ID {
HANDLE UniqueProcess;
HANDLE UniqueThread;
} CLIENT_ID;
typedef CLIENT_ID *PCLIENT_ID;
typedef struct _THREAD_BASIC_INFORMATION
{
NTSTATUS ExitStatus;
PVOID TebBaseAddress;
CLIENT_ID ClientId;
KAFFINITY AffinityMask;
KPRIORITY Priority;
KPRIORITY BasePriority;
} THREAD_BASIC_INFORMATION, *PTHREAD_BASIC_INFORMATION;
As far as I know, Toolhelp works by making a copy of basic information on heaps, modules, processes and threads. This does not include the TEB block that contains the stack bottom address. I think you need to use another API, the debugger engine API, which offers functions to examine the stacks.
Here's an easy way for the current thread (portable Win32 x86/x64 version):
#include <intrin.h>
NT_TIB* getTIB() {
#if defined(_M_IX86)
return (NT_TIB*)__readfsdword(0x18);
#elif defined(_M_AMD64)
return (NT_TIB*)__readgsqword(0x30);
#else
#error unsupported architecture
#endif
}
NT_TIB* tib = getTIB();
void* stackBase = tib->StackBase;
void* stackLimit = tib->StackLimit;
Note: stackLimit < stackBase (as stack grows downwards).
For more details refer to Win32 TIB.

OpenSSL and multi-threads

I've been reading about the requirement that if OpenSSL is used in a multi-threaded application, you have to register a thread identification function (and also a mutex creation function) with OpenSSL.
On Linux, according to the example provided by OpenSSL, a thread is normally identified by registering a function like this:
static unsigned long id_function(void){
return (unsigned long)pthread_self();
}
pthread_self() returns a pthread_t, and this works on Linux since pthread_t is just a typedef of unsigned long.
On Windows pthreads, FreeBSD, and other operating systems, pthread_t is a struct, with the following structure:
struct {
void * p; /* Pointer to actual object */
unsigned int x; /* Extra information - reuse count etc */
}
This can't be simply cast to an unsigned long, and when I try to do so, it throws a compile error. I tried taking the void *p and casting that to an unsigned long, on the theory that the memory pointer should be consistent and unique across threads, but this just causes my program to crash a lot.
What can I register with OpenSSL as the thread identification function when using Windows pthreads or FreeBSD or any of the other operating systems like this?
Also, as an additional question:
Does anyone know if this also needs to be done if OpenSSL is compiled into and used with QT, and if so how to register QThreads with OpenSSL? Surprisingly, I can't seem to find the answer in QT's documentation.
I will just put this code here. It is not a panacea, as it doesn't deal with FreeBSD, but it is helpful in most cases when all you need is to support Windows and, say, Debian. Of course, the clean solution is to use the CRYPTO_THREADID_* family introduced recently (to give an idea, it has a CRYPTO_THREADID_cmp callback, which can be mapped to pthread_equal).
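#include <stdlib.h> /* malloc, free */
#include <pthread.h>
#include <openssl/crypto.h> /* CRYPTO_LOCK, CRYPTO_num_locks, CRYPTO_set_*_callback */
#include <openssl/err.h>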
#if defined(WIN32)
#define MUTEX_TYPE HANDLE
#define MUTEX_SETUP(x) (x) = CreateMutex(NULL, FALSE, NULL)
#define MUTEX_CLEANUP(x) CloseHandle(x)
#define MUTEX_LOCK(x) WaitForSingleObject((x), INFINITE)
#define MUTEX_UNLOCK(x) ReleaseMutex(x)
#define THREAD_ID GetCurrentThreadId()
#else
#define MUTEX_TYPE pthread_mutex_t
#define MUTEX_SETUP(x) pthread_mutex_init(&(x), NULL)
#define MUTEX_CLEANUP(x) pthread_mutex_destroy(&(x))
#define MUTEX_LOCK(x) pthread_mutex_lock(&(x))
#define MUTEX_UNLOCK(x) pthread_mutex_unlock(&(x))
#define THREAD_ID pthread_self()
#endif
/* This array will store all of the mutexes available to OpenSSL. */
static MUTEX_TYPE *mutex_buf=NULL;
static void locking_function(int mode, int n, const char * file, int line)
{
if (mode & CRYPTO_LOCK)
MUTEX_LOCK(mutex_buf[n]);
else
MUTEX_UNLOCK(mutex_buf[n]);
}
static unsigned long id_function(void)
{
return ((unsigned long)THREAD_ID);
}
int thread_setup(void)
{
int i;
mutex_buf = malloc(CRYPTO_num_locks() * sizeof(MUTEX_TYPE));
if (!mutex_buf)
return 0;
for (i = 0; i < CRYPTO_num_locks( ); i++)
MUTEX_SETUP(mutex_buf[i]);
CRYPTO_set_id_callback(id_function);
CRYPTO_set_locking_callback(locking_function);
return 1;
}
int thread_cleanup(void)
{
int i;
if (!mutex_buf)
return 0;
CRYPTO_set_id_callback(NULL);
CRYPTO_set_locking_callback(NULL);
for (i = 0; i < CRYPTO_num_locks( ); i++)
MUTEX_CLEANUP(mutex_buf[i]);
free(mutex_buf);
mutex_buf = NULL;
return 1;
}
I only can answer the Qt part. Use QThread::currentThreadId(), or even QThread::currentThread() as the pointer value should be unique.
From the OpenSSL doc you linked:
threadid_func(CRYPTO_THREADID *id) is needed to record the currently-executing thread's identifier into id. The implementation of this callback should not fill in id directly, but should use CRYPTO_THREADID_set_numeric() if thread IDs are numeric, or CRYPTO_THREADID_set_pointer() if they are pointer-based. If the application does not register such a callback using CRYPTO_THREADID_set_callback(), then a default implementation is used - on Windows and BeOS this uses the system's default thread identifying APIs, and on all other platforms it uses the address of errno. The latter is satisfactory for thread-safety if and only if the platform has a thread-local error number facility.
As shown, providing your own ID is really only useful if you can provide a better ID than OpenSSL's default implementation.
The only fail-safe way to provide IDs, when you don't know whether pthread_t is a pointer or an integer, is to maintain your own per-thread IDs stored as a thread-local value.
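A per-thread ID of that kind can be sketched with C11 thread-local storage and an atomic counter. This is a minimal sketch, assuming a compiler with _Thread_local and <stdatomic.h>; the names next_thread_id, my_thread_id, portable_thread_id and store_id are hypothetical. The function has the same unsigned long (void) shape as the id_function callback in the question, and it never looks inside pthread_t.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Process-wide counter; 0 is reserved to mean "no ID assigned yet". */
static atomic_ulong next_thread_id = 1;

/* Each thread gets its own copy, initially 0. */
static _Thread_local unsigned long my_thread_id = 0;

/* Returns an integer that is unique to the calling thread and stable
 * across calls. IDs are handed out on first use and never reused
 * (counter wraparound is ignored in this sketch). */
static unsigned long portable_thread_id(void)
{
    if (my_thread_id == 0)
        my_thread_id = atomic_fetch_add(&next_thread_id, 1);
    return my_thread_id;
}

/* Thread entry used for demonstration: stores the thread's ID via arg. */
static void *store_id(void *arg)
{
    *(unsigned long *)arg = portable_thread_id();
    return NULL;
}
```

Two threads started with store_id end up with different non-zero IDs, and repeated calls from one thread return the same value. Unlike a pthread_t, such an ID is not handed to a later thread after the original one exits.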
