LLDB and the memory addresses of library functions - macos

I have a "Hello World" program to which I've attached lldb. I'm trying to answer a few questions for myself about the results I get when I try to get the address of library functions:
(lldb) image lookup -n printf
1 match found in /usr/lib/system/libsystem_c.dylib:
Address: libsystem_c.dylib[0x000000000003f550] (libsystem_c.dylib.__TEXT.__text + 253892)
Summary: libsystem_c.dylib`printf
(lldb) image lookup -n scanf
1 match found in /usr/lib/system/libsystem_c.dylib:
Address: libsystem_c.dylib[0x000000000003fc69] (libsystem_c.dylib.__TEXT.__text + 255709)
Summary: libsystem_c.dylib`scanf
(lldb) expr &printf
(int (*)(const char *__restrict, ...)) $2 = 0x00007fff6f8c5550 (libsystem_c.dylib`printf)
(lldb) expr &scanf
error: unsupported expression with unknown type
I have three questions here:
1. What kind of address is 0x00007fff6f8c5550? I assume it is the function pointer to printf. Is this a virtual address that exists only in the mapped space of the current process? If yes, why does another program return the same address for printf?
2. Assuming it's some global shared address that is the same for every process, would modifying the contents of the data at this address (which I haven't been able to do yet) create a copy of the modified memory page, and will the address change? (I'm on macOS and I assume one process cannot change shared memory for another process.)
3. Why does expr &scanf not work, but expr &printf does?

What kind of address is 0x00007fff6f8c5550? I assume it is the function pointer to printf.
Yes, that's correct.
Is this a virtual address that exists only in the mapped space of the current process?
Well, yes and no. It is a virtual address specific to your process and you should not assume it's valid in another process. But:
If yes, why does another program return the same address for printf?
As an optimization, macOS maps many of the system libraries from a single shared region (the dyld shared cache) instead of loading them separately into each process. The region is set up once at boot and shared by all processes, so for a given boot the address is the same in every such process. The address is randomized on each boot for security, however.
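One way to check this outside the debugger is to have a program report where printf lives in its own address space. The following is a minimal sketch, assuming macOS (on other systems the address of a library function may resolve to a stub instead); the helper names printf_address and report_printf_address are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns the address that printf resolves to in this process.
 * Casting a function pointer to uintptr_t is implementation-defined in
 * ISO C, but it works on macOS and other POSIX systems. */
uintptr_t printf_address(void)
{
    return (uintptr_t)&printf;
}

/* Prints the address so that different programs can be compared. */
void report_printf_address(void)
{
    printf("printf is at 0x%016llx\n", (unsigned long long)printf_address());
}
```

Calling report_printf_address() from main in two unrelated programs during the same boot should print the same value if the explanation above holds, and a different value after a reboot.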
Assuming it's some global shared address that is the same for every process, would modifying the contents of the data at this address (which I haven't been able to do yet) create a copy of the modified memory page and will the address change?
Well, it is mapped copy-on-write. So, modifying it would create a copy. However, that wouldn't change its address. The OS would simply modify the mapping so that the memory around that address is private to your process.
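The same copy-on-write behaviour can be observed with an ordinary private file mapping. This is only an analogy, not the shared library mapping itself: the sketch below maps a scratch file with MAP_PRIVATE, writes through the mapping, and checks that the pointer still shows the new contents while the file keeps the old ones. The function name and the scratch file path are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes through a MAP_PRIVATE mapping of a temporary file and checks that
 * (a) the write is visible at the same address and (b) the file itself is
 * untouched. Returns 1 if both hold, 0 if not, -1 on setup failure. */
int private_mapping_is_copy_on_write(void)
{
    char path[] = "/tmp/cow_demo_XXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file lives until fd is closed */

    const char original[] = "hello";
    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, sizeof original, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 'J';   /* first write: this process gets its own copy of the page */

    char on_disk[sizeof original] = {0};
    ssize_t n = pread(fd, on_disk, sizeof on_disk, 0);

    int ok = n == (ssize_t)sizeof on_disk
          && strcmp(p, "Jello") == 0          /* same pointer, new contents */
          && strcmp(on_disk, "hello") == 0;   /* backing file unchanged */

    munmap(p, sizeof original);
    close(fd);
    return ok;
}
```

The pointer p never changes; only the page behind it becomes private to the process, which is the point made above about the address staying the same.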
(I'm on macOS and I assume one process cannot change shared memory for another process.)
Well, processes can cooperate to have writable shared memory. But, in general, you're correct that security precautions prevent unwanted modifications to a process's memory.
Why does expr &scanf not work, but expr &printf does?
Your program (presumably) doesn't use scanf, so there's no debugging information regarding it. The main thing lldb is missing is the type of scanf. If you use a cast expression, it can work:
(lldb) p scanf
error: 'scanf' has unknown type; cast it to its declared type to use it
(lldb) p &scanf
error: unsupported expression with unknown type
(lldb) p (int(*)(const char * __restrict, ...))scanf
(int (*)(const char *__restrict, ...)) $3 = 0x00007fffd7e958d4 (libsystem_c.dylib`scanf)
Conversely, it works for printf because your program does use it.

Related

How to get physical address of kernel symbol?

I'm trying to get the physical address of a kernel symbol (because I need it to be read by a system that only accepts phys addr). So I'm using things like __pa(vaddr) and virt_to_phys(vaddr).
The problem is that the conversion is not correct every time. Sometimes it is, sometimes it's not. How do I know that? Because I'm also reading the value in the kernel module and comparing it to the value read using fmem and using my other system.
Ex:
my kernel symbol's address is 0xffffffffb3400000, so when I do
printk(KERN_INFO "Addr: 0x%lx , Value: 0x%lx\n", _stext, *_stext) I get => Addr: 0xffffffffb3400000 , Value: 0x4801e03f51258d48.
When it reads right I use fmem and the other system to read the physical address given by __pa and the value matches 0x4801e03f51258d48 in both cases. But sometimes it doesn't work, so the value is different. And how do I know the value itself hasn't changed? Because dereferencing the virtual address still gives me the result 0x4801e03f51258d48.
I know the location of kernel code and data changes on every boot (including kernel symbol addresses). That's not a problem, since I'm getting the kernel symbol's address every time. But I expected the virtual-to-physical mapping to be the same every time. Shouldn't it be? I mean, what can I do to get the physical address of my kernel symbol correctly?
So right now I have
u64 _stext_pa = __pa(_stext_va);
printk(KERN_INFO "Value of _stext (va: 0x%lx | pa: 0x%llx) = 0x%lx\n", _stext_va, _stext_pa, *_stext_va);
Addresses (virtual and physical) change every boot but the value is always the same. That's what I need.
I need a physical address paddr so that I can perform int value = *paddr; on my other system and get the same result I'm getting with the virtual address.
Edit: the "other system" I'm referring to is an SMM driver. That's not relevant, but I mention it in case you are wondering what the "other system" is.

Get memory address ranges for Windows program

I'm trying to read the memory of a Windows program based on a pointer I find by using ModuleInfo to get the starting address and size of the module. But that pointer points to memory outside that module's address space. Is there a way to find out that the program uses that section of memory without having to find a pointer to it first?
See if the program in question has an interface ( https://en.m.wikipedia.org/wiki/Interface_(computing) ) that can be used to interact with it. If there is no documented interface, attempting to tamper with that program's memory is a bad idea and will most likely result in undefined behaviour. If this does not answer your question, I suggest you edit it to specify exactly which program this is about.

Working of mmap()

I am trying to get an idea of how memory mapping takes place using the mmap system call.
So far I know that mmap takes arguments from the user and returns a logical address at which the file is mapped. When the user accesses that address, it is looked up in the map table, converted to a physical address, and the requested operation is carried out.
However, I found two articles, one with a code example and one with a theoretical explanation.
They describe memory mapping as being carried out in two steps:
A. Using system call mmap ()
B. file operations using (struct file *filp, struct vm_area_struct *vma)
What I am trying to figure out is:
How are the arguments passed in the mmap system call used in the struct vm_area_struct *vma? More generally, how are these two related?
For instance, struct vm_area_struct has fields such as the starting address, ending address, permissions, etc. How are the values sent by the user used to fill in these fields?
I am trying to write a driver, so: does the kernel fill in the fields of the structure for us, so that I simply use it to call remap_pfn_range and pass the values along?
And a more fundamental question: why is a separate file operation needed? The fact that mmap returns a virtual address means that it has already achieved a mapping, doesn't it?
Finally, I am not that clear about how the entire process works across user and kernel space. Any documentation explaining the process in detail would be helpful.
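As a rough illustration of how the two sides relate: as far as I understand it, the kernel builds the vm_area_struct from the system call arguments before it calls the driver's mmap file operation. The requested length is rounded up to whole pages to give vm_end - vm_start, and the byte offset becomes vm_pgoff, counted in pages. The userspace sketch below only reproduces that arithmetic. The struct and function names are hypothetical, and the page size is passed in explicitly as an assumption.

```c
#include <stddef.h>

/* Hypothetical summary of two values a driver's mmap handler would see. */
struct vma_summary {
    size_t span;          /* corresponds to vma->vm_end - vma->vm_start */
    unsigned long pgoff;  /* corresponds to vma->vm_pgoff */
};

/* Derives the summary from the length and offset passed to mmap().
 * page_size must be a power of two, and offset must be a multiple of it,
 * as mmap() itself requires. */
struct vma_summary vma_summary_from_mmap_args(size_t length,
                                              unsigned long offset,
                                              size_t page_size)
{
    struct vma_summary s;
    s.span = (length + page_size - 1) & ~(page_size - 1);  /* round up to whole pages */
    s.pgoff = offset / page_size;                          /* offset in pages, not bytes */
    return s;
}
```

With 4096-byte pages, a request for 5000 bytes at offset 8192 gives a span of 8192 bytes and a page offset of 2, which are the values a handler would then typically pass on to remap_pfn_range.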

Getting address of symbol from kernel's symbol table

arif#khost:~/src/linux$ global -x ip_rcv_finish
ip_rcv_finish 319 net/ipv4/ip_input.c static int ip_rcv_finish(struct sk_buff *skb)
Now, if I want to use this function, I need to initialize a pointer to it.
To be able to do that, I need the address of the function.
I've seen that from user space I can read /proc/kallsyms to get the address of a symbol. Is there a similar mechanism for reading the symbol table to extract a symbol's address from kernel space?
Depending on your kernel version, you can use kallsyms_lookup_name and/or kallsyms_on_each_symbol to obtain the addresses of the symbols from code running in the kernel space.
This only works if CONFIG_KALLSYMS is set in the kernel configuration.
Note that I would not recommend looking up the addresses of the functions to be called though unless there is no better way (kernel API) to do what you would like to. Still, if nothing else helps, kallsyms_*() API may be the way to go.

process descriptor pointer doesn't match current macro in Linux Kernel

I am using the esp value of kernel stack to calculate the process descriptor pointer value.
According to ULK book, I just need to mask 13 least significant bits of esp to obtain the base address of the thread_info structure.
My test is:
1. Write a kernel module, because I need to get the value of the kernel stack pointer.
2. In the kernel init function, get the value of the kernel stack pointer.
3. Use the following formula to get the process descriptor pointer of the process running on the CPU: *(unsigned int *)(esp & 0xffffe000)
4. Use the current macro and print out its value.
I think the value from step 3 should be the same as the value from step 4.
But my experiment results show that sometimes they are the same and sometimes they are different. Could anyone explain why? Or am I missing anything?
This is because at the base of the kernel stack you will find a struct thread_info instance (platform dependent) and not a struct task_struct. The current macro provides a pointer to the current task_struct.
Try the following:
struct thread_info *info = (struct thread_info *)(esp & 0xffffe000);
struct task_struct *my_current = info->task;
Now you can compare my_current with current.
Finally, I solved this problem. Everything is correct except for the size of the kernel stack. My kernel uses a 4KB stack instead of an 8KB stack, so I just need to mask the low 12 bits of ESP.
Thanks for all the suggestions and answer!
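The effect of the stack size on the mask can be checked with plain arithmetic in user space. This is a small sketch, not kernel code: the function name thread_info_base and the stack pointer value used in the note below are made up for illustration.

```c
#include <stdint.h>

/* Rounds a 32-bit kernel stack pointer down to the start of its stack,
 * which is where thread_info sits on the kernels discussed above.
 * stack_size must be a power of two (8192 or 4096 here). */
uint32_t thread_info_base(uint32_t esp, uint32_t stack_size)
{
    return esp & ~(stack_size - 1u);
}
```

For a made-up stack pointer of 0xc12f5e9c, an 8KB stack (mask 0xffffe000) gives 0xc12f4000, while a 4KB stack (mask 0xfffff000) gives 0xc12f5000. Using the 8KB mask on a 4KB-stack kernel therefore lands on the wrong page whenever bit 12 of ESP is set, which matches the "sometimes the same, sometimes different" result.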
