Relation Between Program/Instruction Pointer(RIP) and Base/Frame Pointer(RBP) on 64 bit Linux - ptrace

i need some help on retrieving Instruction pointers(RIP) of a call stack on Linux 64 bit machine. i can traverse the Stack using ptrace and retrieve all Frame/Base pointer(RBP) values. but as i want IP values, what is the arithmetic and conceptual relationship between RIP and RBP. i assume that RIP value is stored at (RBP + 8) location and a can read it using ptrace PEEKDATA. is my assumption correct?

Any return address pushed on the stack will only get you the %rip starting after the currently running function returns, not the %rip of the currently executing function. You should be able to get your hands on the current %rip the same way GDB does:
Ideally, your platform supports the PTRACE_GETREGS or PTRACE_GETREGSET argument. Your manpage and the header file should get you the rest of the way from here.
Failing that, you should be able to use the PTRACE_PEEKUSER argument with the appropriate offset to grab the register from the user area.
You can look at the gorey details in gdb/amd64-linux-nat.c in the GDB source tree.

Related

Get memory address ranges for Windows program

I'm trying to read the memory of a Windows program based on a pointer I find by using ModuleInfo to get the address starting point and size of the module. But that pointer points to memory outside that modules address space, is there a way to find out the program uses that section of memory without having to find a pointer to it first?
See if the program in question has an interface ( https://en.m.wikipedia.org/wiki/Interface_(computing) ) that can be used to interface with said program. If there is no documented interface, attempting to tamper with that programs memory is a bad idea; and will most likely result in undefined behaviour. If this does not answer your question I suggest you edit it to specify exactly which program this is about.

How did debuggers for 16-bit real mode programs produce stack traces?

I'm messing around with running old DOS programs in an emulator, and I've gotten to the point where I'd like to trace the program's stack. However, I'm running into a problem, specifically how to detect near calls and far calls. Some pretext:
A near call pushes only the IP onto the stack, and is expected to be paired with a ret which pops only the IP to return to.
A far call pushes both the CS and IP onto the stack, and is expected to be paired with a retf which pops both the CS and IP to return to.
There is no way to know whether a call is a near call or a far call, except by knowing which kind of instruction called it, or which return it uses.
Luckily, for the period this program was developed in, BP-based stack frames were very common, so walking the stack doesn't seem to be a problem: I just follow the BP-chain. Unfortunately, getting the CS and/or IP is difficult, because there doesn't seem to be any way for me to determine whether a call is a near call or a far call by looking at the stack alone.
I have metadata about functions available, so I can tell whether a function is a near or far call if I already know the actual CS and IP, but I can't figure out the IP and CS unless I already know if it's a far call or near call.
I'm having a little success by just guessing and seeing if my guess results in a valid function lookup, but I think this method will produce a lot of false positives.
So my question is this: How did debuggers of the DOS era deal with this problem and produce stack traces? Is there some algorithm for this I'm missing, or did they just encode debug information in the stack? (If this is the case, then I'll have to come up with something else.)
Just a guess, I've never actually used 16-bit x86 development tools (modern or back in the day):
You know the CS:IP value of the current function (or one that triggered a fault or whatever from an exception frame).
You might have metadata that tells you whether this is a "far" function that's called with a far call or not. Or you could attempt decoding until you get to a retn or retf, and use that to decide whether the return address is a near IP or a far CS:IP.
(Assuming this is a normal function that returns with some kind of ret. Or if it ends with a jmp tailcall to another function, then the return address probably matches that, but that's another level of assumptions. And figuring out that a near jmp is the end of a function instead of just a jump within a large function is am ambiguous problem without any symbol metadata.)
But anyway, apply the same thing to the parent function: after one level of successful backtracing, you now have the CS:IP of the instruction after the call in your parent function, and the SS:BP value of the BP linked list.
And BTW, yes there's a very good reason for legacy BP stack frames being widely used: [SP] isn't a valid 16-bit addressing mode, and only [BP] as a base implies SS as a segment, so yes, using BP for access to the stack was the only good option for random access (not just push/pop for temporaries). No reason not to save/restore it first (before any other registers or reserving stack space) to make a conventional stack-frame.

How do symbols solve walking the stack with FPO in x86 debugging?

In this answer: https://stackoverflow.com/a/8646611/192359 , it is explained that when debugging x86 code, symbols allow the debugger to display the callstack even when FPO (Frame Pointer Omission) is used.
The given explanation is:
On the x86 PDBs contain FPO information, which allows the debugger to reliably unwind a call stack.
My question is what's this information? As far as I understand, just knowing whether a function has FPO or not does not help you finding the original value of the stack pointer, since that depends on runtime information.
What am I missing here?
Fundamentally, it is always possible to walk the stack with enough information1, except in cases where the stack or execution context has been irrecoverably corrupted.
For example, even if rbp isn't used as the frame pointer, the return address is still on the stack somewhere, and you just need to know where. For a function that doesn't modify rsp (indirectly or directly) in the body of the function it would be at a simple fixed offset from rsp. For functions that modify rsp in the body of the function (i.e., that have a variable stack size), the offset from rsp might depend on the exact location in the function.
The PDB file simply contains this "side band" information which allows someone to determine the return address for any instruction in the function. Hans linked a relevant in-memory structure above - you can see that since it knows the size of the local variables and so on it can calculate the offset between rsp and the base of the frame, and hence get at the return address. It also knows how many instruction bytes are part of the "prolog" which is important because if the IP is still in that region, different rules apply (i.e., the stack hasn't been adjusted to reflect the locals in this function yet).
In 64-bit Windows, the exact function call ABI has been made a bit more concrete, and all functions generally have to provide unwind information: not in a .pdb but directly in a section included in the binary. So even without .pdb files you should be able to unwind a properly structured 64-bit Windows program. It allows any register to be used as the frame pointer, and still allows frame-pointer omission (with some restrictions). For details, start here.
1 If this weren't true, ask yourself how the currently running function could ever return? Now, technically you could design a program which clobbers or forgets the stack in a way that it cannot return, and either never exits or uses a method like exit() or abort() to terminate. This is highly unusual and not possibly outside of assembly.

How to get address of system call entry point?

When a system call is invoked from 64-bit userspace to 64-bit kernel, syscall table is accessed from arch/x86/kernel/entry_64.S, from the system_call assembly entry point. How can I get the virtual/physical address of this "system_call()" routine?
In other words, I want to know the address of entry point used by all system calls. I tried looking at kallsyms file but couldn't find it there. Perhaps, it has another name in kallsyms?
Reference: https://lwn.net/Articles/604287/
What do you need this for? Are you sure you were inspecting kallsyms of the same kernel which was used in the article?
Figuring out what the func got renamed to is left as an exercise for the reader.

Buffer overflow to jump to a part of code

My teacher gave me some code and I have to run it and make it jump to the admin section using a buffer overflow. I cannot modify the source code. Could someone explain how I could jump to the admin method using a buffer overflow? I'm running it on ubuntu 8.10 and it was compiled with an older version of gcc so the overflow will work.
Without being able to see the code, on a general level you need to design inputs to the function that will overwrite the return address (or another address to which control will be transferred by the function) on the stack.
At a guess, the code has a fixed length character buffer and copies values from a function parameter into that buffer without validating that the length does not exceed the length of the buffer.
You need to make a note of what the stack layout looks like for your application (running it under a debugger may well be the quickest way to do this) to work out where the address you need to override is, then put together a string to overwrite this with the address of the admin function you need to call.
You can always get the asm output of it (I forgot how right now... brainfart) and see where the buffer you want to overflow is being used/read and check it's length. Next you want to calculate how far you need to overflow it so that you either replace the next instruction with a JMP (address of admin code) or change a JMP address to that of the admin section. 0xE8 is the jump opcode for x86 if you need it since you want to overwrite the binary data of the instruction with your own.

Resources