How to change a call with Reverse engineering

How to change a call with Reverse engineering - windows

I have an example program test1.exe that uses an example library test2.dll.
test.dll contains the functions A() and B() of the same type.
test1.exe calls A and then exits.
Here I've found the call to A():
(http://i.stack.imgur.com/5W9Jd.jpg)
Now, if i'm not mistaken, I need to replace 88FDFFFF with the correct offset of B(), but how can I calculate it so that B() will be invoked instead of A()?

If this in an x86 call-relative instruction, the offset value is computed by subtracting the address of the instruction following the call (= call instruction location + 5 bytes), from the address of the target. So, you need to patch the offset to be address(B)-address(callinstruction+5).

if b is imported in test1.exe it is easy otherwise you have to use LoadLibrary and GetProcAddress.
press ctrl+N to see if b is imported or not.

I would recommend to learn asm basics first and play with HIEW hexeditor/disassembler to change simple codes.

Related

__verify_pcpu_ptr function in Linux Kernel - What does it do?

#define __verify_pcpu_ptr(ptr)
do {
const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;
(void)__vpp_verify;
} while (0)
#define VERIFY_PERCPU_PTR(__p)
({
__verify_pcpu_ptr(__p);
(typeof(*(__p)) __kernel __force *)(__p);
})
What do these two functions do? What are they used for? How do they work?
Thanks.

This is part of the scheme used by per_cpu_ptr to support a pointer that gets a different value for each CPU. There are two motives here:
Ensure that accesses to the per-cpu data structure are only made via the per_cpu_ptr macro.
Ensure that the argument given to the macro is of the correct type.
Restating, this ensures that (a) you don't accidentally access a per-cpu pointer without the macro (which would only reference the first of N members), and (b) that you don't inadvertently use the macro to cast a pointer that is not of the correct declared type to one that is.
By using these macros, you get the support of the compiler in type-checking without any runtime overhead. The compiler is smart enough to eventually recognize that all of these complex machinations result in no observable state change, yet the type-checking will have been performed. So you get the benefit of the type-checking, but no actual executable code will have been emitted by the compiler.

lldb: how to call a function from a specific library/framework

Problem: In project we have localization functions which are specific to a framework/dynamic library. That is they have identical name but fetch resources from different bundles/folders
I'd want to call a function from a specific library, something similar to:
lldb> p my_audio_engine.framework::GetL10nString( stringId );
lldb> expr --shlib my_audio_engine.framework -- GetL10nString();
lldb> p my_audio_engine`L10N_Utils::GetString(40000)
but all these variants don't work.
Adding gdb in tags hoping the same semantic if exists will work on lldb as well.

lldb's expression parser doesn't currently have the equivalent of gdb's foo.c::function meta-symbol to encode a function from a specific source file.
Please feel free to file a bug requesting this at bugreporter.apple.com. It will get duped to the one I filed a while ago, but dups are votes for features, and we haven't gotten around to this one yet 'cause nobody but me asked for it...
For the nonce, you will have to do this by hand. Here's a silly example for calling printf, which I happen to know is in libsystem_c.dylib on OS X. First, I find the address in the shared library I am interested in:
(lldb) image lookup -vn printf libsystem_c.dylib
1 match found in /usr/lib/system/libsystem_c.dylib:
Address: libsystem_c.dylib[0x0000000000042948] (libsystem_c.dylib.__TEXT.__text + 266856)
Summary: libsystem_c.dylib`printf
Module: file = "/usr/lib/system/libsystem_c.dylib", arch = "x86_64"
Symbol: id = {0x00000653}, range = [0x00007fff91307948-0x00007fff91307a2c), name="printf"
The first address (the one under Address) is the address of the function in the dylib, not where it got loaded in the running program. That's not immediately useful. I could calculate the library's load offset if I wanted to and apply it to the file address, but fortunately the first address in the Symbol's address range is the address in the running program so I don't have to. 0x00007fff91307948 is the address I want.
Now I want to call that address. I do this in two steps because it makes the casting easier, like:
(lldb) expr typedef int (*$printf_type)(const char *, ...)
(lldb) expr $printf_type $printf_function = ($printf_type) 0x00007fff91307948
Now I have a function I can call over and over:
(lldb) expr $printf_function("Hello world %d times.\n", 400)
Hello world 400 times.
(int) $2 = 23
If you are going to do this over and over, you can write a Python function that finds the symbol out of the library of interest, and constructs the expression that calls the right function. The Python API's include calls to get symbols from a particular module (lldb-speak for loadable binary images), get their addresses, evaluate expressions, etc.

Match the left side variable of an assignment to the return value of the right side function call

For the following statement inside function func(), I'm trying to figure out the variable name (which is 'dictionary' in the example) that points to the malloc'ed memory region.
Void func() {
uint64_t * dictionary = (uint64_t *) malloc ( sizeof(uint64_t) * 128 );
}
The instrumented malloc() can record the start address and size of the allocation. However, no knowledge of variable 'dictionary' that will be assigned to, any features from the compilers side can help to solve this problem, without modifying the compiler to instrument such assignment statements?
One way I've been thinking is to use the feature that variable 'dictionary' and function 'malloc' is on one source code line or next to each other, the dwarf provides line information.

One thing you can do with Clang and LLVM is emit the code with debug information and then look for malloc calls. These will be assigned to LLVM values, which can be traced (when not compiled with optimizations, that is) to the original C/C++ source code via the debug information metadata.

Windows function names followed by # number symbol?

I'm programming for Windows in assembly in NASM, and i found this in the code:
extern _ExitProcess#4
;Rest of code...
; ...
call _ExitProcess#4
What does the #4 mean in the declaration and call of a winapi library function?

The winapi uses the __stdcall calling convention. The caller pushes all the arguments on the stack from right to left, the callee pops them again to cleanup the stack, typically with a RET n instruction.
It is the antipode of the __cdecl calling convention, the common default in C and C++ code where the caller cleans up the stack, typically with an ADD ESP,n instruction after the CALL. The advantage of __stdcall is that it is generates more compact code, just one cleanup instruction in the called function instead of many for each call to the function. But one big disadvantage: it is dangerous.
The danger lurks in the code that calls the function having been compiled with an out-dated declaration of the function. Typical when the function was changed by adding an argument for example. This ends very poorly, beyond the function trying to use an argument that is not available, the new function pops too many arguments off the stack. This imbalances the stack, causing not just the callee to fail but the caller as well. Extremely hard to diagnose.
So they did something about that, they decorated the name of the function. First with a leading _underscore, as is done for __cdecl functions. And appended #n, the value of n is the operand of the RET instruction at the end of the function. Or in other words, the number of bytes taken by the arguments on the stack.
This provides a linker diagnostic when there's a mismatch, a change in a foo(int) function to foo(int, int) for example generates the name _foo#8. The calling code not yet recompiled will look for a _foo#4 function. The linker fails, it cannot find that symbol. Disaster avoided.

The name decoration scheme for C is documented at Format of a C Decorated Name. A decorated name containing a # character is used for the __stdcall calling convention:
__stdcall: Leading underscore (_) and a trailing at sign (#) followed by a number representing the number of bytes in the parameter list
Tools like Dependency Walker are capable of displaying both decorated and undecorated names.
Unofficial documentation can be found here: Name Decoration

It's a name decoration specifying the total size of the function's arguments:
The name is followed by the at sign (#) followed by the number of bytes (in decimal) in the argument list.
(source)

What does Arduino's "F()" actually do?

I have asked a similar question before, but I realize that I can't make heads or tails of the macrology and templateness. I'm a C (rather than C++) programmer.
What does F() actually do? When does it stuff characters into pgmem (flash)? When does it pull characters out of pgmem? Does it cache them? How does it handle low-memory situations?

There are no templates involved, only function overloading. The F() macro does two things:
uses PSTR to ensure that the literal string is stored in flash memory (the code space rather than the data space). However, PSTR("some string") cannot be printed because it would receive a simple char * which represents a base address of the string stored in flash. Dereferencing that pointer would access some random characters from the same address in data. Which is why F() also...
casts the result of PSTR() to __FlashStringHelper*. Functions such as print and println are overloaded so that, on receiving a __FlashStringHelper* argument, they correctly dereference the characters in the flash memory.

BTW. For the ESP32 library, both of these functions are defined in the following files:
# PSTR : ../Arduino/hardware/espressif/esp32/cores/esp32/pgmspace.h
# F : ../Arduino/hardware/espressif/esp32/cores/esp32/WString.h
And the F(x):
// An abstract class used as a means to provide a unique pointer type
// but really has no body
class __FlashStringHelper;
#define F(string_literal) (reinterpret_cast<const __FlashStringHelper *>(PSTR(string_literal)))
...
Also for ESP32, PSTR(x) is not needed and is just x: #define PSTR(s) (s).

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

How to change a call with Reverse engineering - windows

If this in an x86 call-relative instruction, the offset value is computed by subtracting the address of the instruction following the call (= call instruction location + 5 bytes), from the address of the target. So, you need to patch the offset to be address(B)-address(callinstruction+5).

if b is imported in test1.exe it is easy otherwise you have to use LoadLibrary and GetProcAddress. press ctrl+N to see if b is imported or not.

I would recommend to learn asm basics first and play with HIEW hexeditor/disassembler to change simple codes.

Related

__verify_pcpu_ptr function in Linux Kernel - What does it do?

lldb: how to call a function from a specific library/framework

Match the left side variable of an assignment to the return value of the right side function call

Windows function names followed by # number symbol?

What does Arduino's "F()" actually do?

Categories

Resources