I am trying to get some info from windows via the System plugin and the netapi32 library.
I try to call NetWkstaGetInfo() after allocating a struct suitable as a WKSTA_INFO_100.
MSDN gives the NetWkstaGetInfo() prototype as:
NET_API_STATUS NetWkstaGetInfo(
_In_ LPWSTR servername,
_In_ DWORD level,
_Out_ LPBYTE *bufptr
);
While the WKSTA_INFO_100 is
typedef struct _WKSTA_INFO_100 {
DWORD wki100_platform_id;
LMSTR wki100_computername;
LMSTR wki100_langroup;
DWORD wki100_ver_major;
DWORD wki100_ver_minor;
} WKSTA_INFO_100, *PWKSTA_INFO_100, *LPWKSTA_INFO_100;
For a preliminary test, I try to display the struct members in a message box. I first initialize the struct with dummy data to check whether the API call replaces the content of my allocated block.
But so far I get almost nothing after the first struct member. I suppose that the struct is not correctly defined, or that I have a struct alignment issue. Unfortunately the documentation of the System plugin does not help me much.
Here is my test script:
outfile "hello.exe"
section
System::Call /NOUNLOAD "*(*i11,t 'some',t 'thing',i22,i44)i .r0"
Dumpstate::debug
System::Call /NOUNLOAD "netapi32::NetWkstaGetInfo(i0, i100, i r0) i.r6"
Dumpstate::debug
System::Call /NOUNLOAD "*$0(*i.r1, t.r2, t.r3, i.r4, i.r5)"
Dumpstate::debug
messagebox MB_OK "Hello, to $2 $3 domain (win $1 - $4.$5) !"
System::Free $0
sectionEnd
The first retrieved value (500) is correct, but the other members keep their initial values. What am I missing?
(Edit) Corollary questions:
Going by the documentation and MSDN, the first member of the struct should be i and not *i, but I did not manage to get a correct returned value without the * (the Dumpstate plugin tends to show that it is returned as an address).
Is the /NOUNLOAD parameter for the plugin mandatory? I have found several examples with it but no precise reason for it. I feared that the allocated struct could be freed prematurely without the parameter. Could you confirm or refute this?
bufptr is out only so passing r0 as input to NetWkstaGetInfo is pointless, the function does not require input data.
You should not use *i with NetApiBufferFree, i alone is enough ($0 already has the address, you don't want the system plugin to play with the pointer, just pass it straight to the API)
!include LogicLib.nsh
System::Call "netapi32::NetWkstaGetInfo(i0, i100, *i 0 r0) i.r1"
${If} 0 = $1
System::Call "*$0(i.r1, w.r2, w.r3, i.r4, i.r5)"
DetailPrint "Hello, to $2 $3 domain (win $1 - $4.$5) !"
${EndIf}
System::Call "netapi32::NetApiBufferFree(ir0)"
In the preceding example I used *i 0 r0 for the bufptr parameter so that $0 is NULL before the function starts (If you don't want to use this trick you can just do StrCpy $0 0 before the system call). If you don't do this then it is unclear what happens if the function fails. The documentation does not specify what happens to bufptr when the function fails, hopefully it is set to NULL but you cannot know for sure. If the function fails we end up passing NULL to NetApiBufferFree and that is usually a safe thing to pass to a free function but the documentation does not call this out as OK. To be on the super safe side you should only free a non-NULL pointer:
System::Call "netapi32::NetWkstaGetInfo(i0, i100, *i 0 r0) i.r1"
${If} 0 = $1
System::Call "*$0(i.r1, w.r2, w.r3, i.r4, i.r5)"
DetailPrint "Hello, to $2 $3 domain (win $1 - $4.$5) !"
${EndIf}
${IfThen} $0 <> 0 ${|} System::Call "netapi32::NetApiBufferFree(ir0)" ${|}
SectionEnd
/NOUNLOAD is no longer required when using the system plugin (Since v2.42). /NOUNLOAD prevents NSIS from unloading the plugin. That is important if the plugin has internal state but in your case the only state is a block of memory allocated by Windows.
Going by the documentation and MSDN, the first member of the struct should be i and not *i, but I did not manage to get a correct returned value without the * (the Dumpstate plugin tends to show that it is returned as an address).
That was the clue to the solution: I misread the MSDN page and did not notice at first that it is not a struct that is passed to NetWkstaGetInfo but the address of a WKSTA_INFO_100* that is modified by the API call (and must be freed afterwards with NetApiBufferFree). Thus the correct script is:
outfile "hello.exe"
section
System::Call "netapi32::NetWkstaGetInfo(i0, i100, *i r0 r0) i.r6"
System::Call "*$0(i.r1, w.r2, w.r3, i.r4, i.r5)"
messagebox MB_OK "Hello, to $2 $3 domain (win $1 - $4.$5) !"
System::Call "netapi32::NetApiBufferFree(*i r0) i.r6"
sectionEnd
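The struct description given to the System plugin (i, w, w, i, i) has to match the in-memory layout of WKSTA_INFO_100. As a rough illustration, the sketch below mirrors that layout in portable C. The type and function names are hypothetical stand-ins, with uint32_t playing the role of DWORD and wchar_t * the role of LMSTR; it is not the Windows header.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical portable stand-in for WKSTA_INFO_100 (not the Windows header). */
typedef struct wksta_info_100_sketch {
    uint32_t platform_id;   /* DWORD, System plugin type i */
    wchar_t *computername;  /* LMSTR, a pointer to a wide string (w) */
    wchar_t *langroup;      /* LMSTR, a pointer to a wide string (w) */
    uint32_t ver_major;     /* DWORD (i) */
    uint32_t ver_minor;     /* DWORD (i) */
} wksta_info_100_sketch;

/* Byte offset of the first string pointer. The 32-bit platform_id is
   padded up to pointer alignment, so this equals the pointer size. */
size_t computername_offset(void) {
    return offsetof(wksta_info_100_sketch, computername);
}

/* Byte offset of ver_major: one padded DWORD slot plus two pointers. */
size_t ver_major_offset(void) {
    return offsetof(wksta_info_100_sketch, ver_major);
}

/* Byte offset of ver_minor: directly after the 4-byte ver_major. */
size_t ver_minor_offset(void) {
    return offsetof(wksta_info_100_sketch, ver_minor);
}
```

With 4-byte pointers the offsets come out as 0, 4, 8, 12 and 16. With 8-byte pointers the two string members each take 8 bytes and platform_id is followed by 4 bytes of padding.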
I have a simple MessageBox program like this:
NULL EQU 0 ; Constants
MB_DEFBUTTON1 EQU 0
MB_DEFBUTTON2 EQU 100h
IDNO EQU 7
MB_YESNO EQU 4
extern _MessageBoxA@16 ; Import external symbols
extern _ExitProcess@4 ; Windows API functions, decorated
global Start ; Export symbols. The entry point
section .data ; Initialized data segment
MessageBoxText db "Do you want to exit?", 0
MessageBoxCaption db "MessageBox 32", 0
section .text ; Code segment
Start:
push MB_YESNO | MB_DEFBUTTON2 ; 4th parameter. 2 constants ORed together
push MessageBoxCaption ; 3rd parameter
push MessageBoxText ; 2nd parameter
push NULL ; 1st parameter
call _MessageBoxA@16
cmp EAX, IDNO ; Check the return value for "No"
je Start
push NULL
call _ExitProcess@4
My question is:
Shouldn't we add an appropriate value to the esp register after calling MessageBoxA to restore the stack to its previous state? If so, how much has to be added to esp for a push such as push MessageBoxCaption (4?).
I'm trying to output the same string twice in extended inline ASM in GCC, on 64-bit Linux.
int main()
{
const char* test = "test\n";
asm(
"movq %[test], %%rdi\n" // Debugger shows rdi = *address of string*
"movq $0, %%rax\n"
"push %%rbp\n"
"push %%rbx\n"
"call printf\n"
"pop %%rbx\n"
"pop %%rbp\n"
"movq %[test], %%rdi\n" // Debugger shows rdi = 0
"movq $0, %%rax\n"
"push %%rbp\n"
"push %%rbx\n"
"call printf\n"
"pop %%rbx\n"
"pop %%rbp\n"
:
: [test] "g" (test)
: "rax", "rbx","rcx", "rdx", "rdi", "rsi", "rsp"
);
return 0;
}
Now, the string is outputted only once. I have tried many things, but I guess I am missing some caveats about the calling convention. I'm not even sure if the clobber list is correct or if I need to save and restore RBP and RBX at all.
Why is the string not outputted twice?
Looking with a debugger shows me that somehow when the string is loaded into rdi for the second time it has the value 0 instead of the actual address of the string.
I cannot explain why, it seems like after the first call the stack is corrupted? Do I have to restore it in some way?
Specific problem to your code: RDI is not maintained across a function call (see below). It is correct before the first call to printf but is clobbered by printf. You'll need to temporarily store it elsewhere first. A register that isn't clobbered will be convenient. You can then save a copy before printf, and copy it back to RDI after.
I do not recommend doing what you are suggesting (making function calls in inline assembler). It will be very difficult for the compiler to optimize things. It is very easy to get things wrong. David Wohlferd wrote a very good article on reasons not to use inline assembly unless absolutely necessary.
Among other things, the 64-bit System V ABI mandates a 128-byte red zone. That means you can't push anything onto the stack without potential corruption. Remember: doing a CALL pushes a return address on the stack. A quick and dirty way to resolve this problem is to subtract 128 from RSP when your inline assembler starts and then add 128 back when finished.
The 128-byte area beyond the location pointed to by %rsp is considered to
be reserved and shall not be modified by signal or interrupt handlers. Therefore,
functions may use this area for temporary data that is not needed across function
calls. In particular, leaf functions may use this area for their entire stack frame,
rather than adjusting the stack pointer in the prologue and epilogue. This area is
known as the red zone.
Another issue to be concerned about is the requirement for the stack to be 16-byte aligned (or possibly 32-byte aligned depending on the parameters) prior to any function call. This is required by the 64-bit ABI as well:
The end of the input argument area shall be aligned on a 16 (32, if __m256 is
passed on stack) byte boundary. In other words, the value (%rsp + 8) is always
a multiple of 16 (32) when control is transferred to the function entry point.
Note: 16-byte alignment upon a CALL to a function is also required on 32-bit Linux for GCC >= 4.5:
In context of the C programming language, function arguments are pushed on the stack in the reverse order. In Linux, GCC sets the de facto standard for calling conventions. Since GCC version 4.5, the stack must be aligned to a 16-byte boundary when calling a function (previous versions only required a 4-byte alignment.)
Since we call printf in inline assembler we should ensure that we align the stack to a 16-byte boundary before making the call.
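The two stack adjustments described above (skipping the red zone, then aligning) are plain integer arithmetic on RSP. The following sketch models them in C on an ordinary integer; the function name is made up for illustration and the code does not touch the real stack pointer.

```c
#include <stdint.h>

/* Models "add $-128, %rsp" followed by "and $-16, %rsp" on a plain integer. */
uintptr_t skip_red_zone_and_align(uintptr_t rsp) {
    rsp -= 128;              /* step below the 128-byte red zone */
    rsp &= ~(uintptr_t)15;   /* clear the low 4 bits: round down to a multiple of 16 */
    return rsp;
}
```

For example, a starting value of 0x1008 becomes 0xF88 after the subtraction and 0xF80 after the mask. Because the mask only ever rounds down, the adjusted pointer never moves back into the red zone.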
You also have to be aware that when calling a function some registers are preserved across a function call and some are not. Specifically those that may be clobbered by a function call are listed in Figure 3.4 of the 64-bit ABI (see previous link). Those registers are RAX, RCX, RDX, RSI, RDI, R8-R11, XMM0-XMM15, MMX0-MMX7, ST0-ST7. These are all potentially destroyed so should be put in the clobber list if they don't appear in the input and output constraints.
The following code should satisfy most of the conditions to ensure that inline assembler that calls another function will not inadvertently clobber registers, preserves the redzone, and maintains 16-byte alignment before a call:
int main()
{
const char* test = "test\n";
long dummyreg; /* dummyreg used to allow GCC to pick available register */
__asm__ __volatile__ (
"add $-128, %%rsp\n\t" /* Skip the current redzone */
"mov %%rsp, %[temp]\n\t" /* Copy RSP to available register */
"and $-16, %%rsp\n\t" /* Align stack to 16-byte boundary */
"mov %[test], %%rdi\n\t" /* RDI is address of string */
"xor %%eax, %%eax\n\t" /* Variadic function set AL. This case 0 */
"call printf\n\t"
"mov %[test], %%rdi\n\t" /* RDI is address of string again */
"xor %%eax, %%eax\n\t" /* Variadic function set AL. This case 0 */
"call printf\n\t"
"mov %[temp], %%rsp\n\t" /* Restore RSP */
"sub $-128, %%rsp\n\t" /* Add 128 to RSP to restore to orig */
: [temp]"=&r"(dummyreg) /* Allow GCC to pick available output register. Modified
before all inputs consumed so use & for early clobber*/
: [test]"r"(test), /* Choose available register as input operand */
"m"(test) /* Dummy constraint to make sure test array
is fully realized in memory before inline
assembly is executed */
: "rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11",
"xmm0","xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
"xmm8","xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
"mm0","mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7",
"st", "st(1)", "st(2)", "st(3)", "st(4)", "st(5)", "st(6)", "st(7)"
);
return 0;
}
I used an input constraint to allow the template to choose an available register to be used to pass the str address through. This ensures that we have a register to store the str address between the calls to printf. I also get the assembler template to choose an available location for storing RSP temporarily by using a dummy register. The registers chosen will not include any one already chosen/listed as an input/output/clobber operand.
This looks very messy, but failure to do it correctly could lead to problems later as your program becomes more complex. This is why calling functions that conform to the System V 64-bit ABI within inline assembler is generally not the best way to do things.
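For comparison, here is a minimal sketch of the same output in plain C, where the compiler handles register preservation, the red zone and stack alignment itself. The helper name print_twice is hypothetical.

```c
#include <stdio.h>

/* Prints s twice and returns the total number of characters written,
   or -1 if either printf call reports an error. */
int print_twice(const char *s) {
    int first = printf("%s", s);
    if (first < 0) {
        return -1;
    }
    int second = printf("%s", s);
    if (second < 0) {
        return -1;
    }
    return first + second;
}
```

Calling print_twice("test\n") writes the string twice. Nothing has to be saved across the first call by hand, because the compiler reloads the argument for the second call.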
#include <stdio.h>
int main(void){
int sum = 0;
sum += 0xabcd;
printf("%x", sum);
return 0;
}
This is my code. When I use gdb, break main and break *main stop at different addresses.
When I just type disassemble main it shows like this:
Dump of assembler code for function main:
0x080483c4 <+0>: push %ebp
0x080483c5 <+1>: mov %esp,%ebp
0x080483c7 <+3>: and $0xfffffff0,%esp
0x080483ca <+6>: sub $0x20,%esp
0x080483cd <+9>: movl $0x0,0x1c(%esp)
0x080483d5 <+17>: addl $0xabcd,0x1c(%esp)
0x080483dd <+25>: mov $0x80484c0,%eax
0x080483e2 <+30>: mov 0x1c(%esp),%edx
0x080483e6 <+34>: mov %edx,0x4(%esp)
0x080483ea <+38>: mov %eax,(%esp)
0x080483ed <+41>: call 0x80482f4 <printf@plt>
0x080483f2 <+46>: mov $0x0,%eax
0x080483f7 <+51>: leave
0x080483f8 <+52>: ret
End of assembler dump.
So when I type [break *main] it stops at 0x080483c4, but when I type [break main] it stops at 0x080483cd.
Why is the start address different?
Why is the address different.
Because break function and break *address are not the same thing (*address specifies the address of the function's first instruction, before the stack frame and arguments have been set up).
With break function, GDB skips the function prologue (the instructions that set up the current frame) and stops at the first instruction after it.
Total guess - and prepared to be totally wrong.
*main is the address of the function
Breaking inside main is the first available address to stop inside the function when it is being executed.
Note that 0x080483cd is the first place a debugger can stop as it is modifying a variable (ie assigning zero to sum)
When you are breaking at 0x080483c4 this is before the setup assembler that C knows nothing about
Sorry if this question is really simple, but I tried everything I know and couldn't figure it out.
I'm trying to write a simple procedure which takes a string and a Count from the console and prints the string the number of times specified by the Count.
Everything is fine, but when I mov the Count to eax for a loop, the value gets messed up and I end up with an infinite loop of prints.
I tried to change the Count to DWORD with atodw, but didn't work.
here's the code :
Printer PROTO :DWORD, :DWORD
.data
String db 100 DUP(0)
Count db 10 DUP(0)
.code
start:
;1- get user input
invoke StdIn, addr String, 99
invoke StdIn, addr Count, 10
;2- Remove the CRLF from count
invoke StripLF, addr Count
;3- Convert the count to DWORD
invoke atodw, addr Count
mov Counter, eax
;4- Call the Printer function
invoke Printer, addr String, addr Count
Printer PROC StringToPrint:DWORD, count:DWORD
mov eax,count ;;;;;; This is the problem I think
Looppp:
push eax
invoke StdOut, StringToPrint
pop eax
dec eax
jnz Looppp
ret
Printer endp
You’re passing addr Count – the address of the string – as the second argument to Printer. But it expects an integer, so you want to pass Counter instead.
Since you’re using a language without type checking, adopting a naming convention such as Hungarian notation for your identifiers could help you see and avoid this kind of problem. With the variables here named strCount and dwCount, for example, it would be more obvious that you were using the wrong one.
As an aside, eax must eventually reach zero so your printing loop won’t be infinite – just rather longer than you intended…
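The dec eax / jnz loop can be modelled in C to see why it terminates, only much later than intended. This is a sketch of the loop's arithmetic, not of the MASM program: loop_iterations is a made-up name and the counter increment stands in for the StdOut call.

```c
#include <stdint.h>

/* Models: mov eax,count / Looppp: ... dec eax / jnz Looppp */
uint64_t loop_iterations(uint32_t count) {
    uint64_t iterations = 0;
    uint32_t eax = count;
    do {
        iterations++;        /* stands in for invoke StdOut, StringToPrint */
        eax--;               /* dec eax (0 would wrap to 0xFFFFFFFF) */
    } while (eax != 0);      /* jnz Looppp */
    return iterations;
}
```

Passing a small integer such as 3 gives 3 iterations. Passing an address instead, for example a value like 0x00403000, gives roughly 4.2 million iterations, which looks infinite on a console but does end.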
I'm trying to put the equivalent of asm{int 3} (or similar) into my iPhone program. My goal is to have Xcode stop exactly on the offending line, without having to fiddle with the call stack (so _Debugger doesn't sound like it would do, not that I could find which framework it's in anyway...), and leave me able to resume execution (which is why I'm not happy with assert).
(I'm used to both these behaviours on other systems, and I'd like to reproduce them on iOS.)
My best attempt so far has been this:
asm volatile("bkpt 1");
This stops Xcode on the line in question, but when I try to continue with Cmd+Alt+P, Xcode appears to run the BKPT again. And if I use Shift+Cmd+O, I just get this:
Watchdog has expired. Remote device was disconnected? Debugging session terminated.
(Needless to say, the remote device IS still connected.)
I don't have a huge amount of experience with iOS, Mac, ARM, gdb, or gcc's asm stuff. So I'm stumped already. Is there some way of getting iOS and Xcode to do what I want?
(I don't know if it makes a difference but judging by the instruction size my program is ARM code.)
Try:
__builtin_trap();
works on Mac as well as iOS, and you can drag the little green cursor to the next line to continue running.
raise(SIGTRAP) is a relatively portable way to have an "in code" breakpoint.
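A minimal sketch of the raise(SIGTRAP) approach in C follows. Without a debugger attached, the default action for SIGTRAP terminates the process, so this sketch installs a temporary handler; that handler and the debug_break name are assumptions added for illustration, not part of the suggestion above.

```c
#include <signal.h>

/* Set by the handler so the caller can tell that the signal was delivered. */
static volatile sig_atomic_t trap_seen = 0;

static void on_trap(int signo) {
    (void)signo;
    trap_seen = 1;
}

/* Hypothetical helper: raises SIGTRAP with a temporary handler installed.
   Returns 1 if the handler ran, 0 otherwise. */
int debug_break(void) {
    void (*previous)(int) = signal(SIGTRAP, on_trap);
    if (previous == SIG_ERR) {
        return 0;
    }
    trap_seen = 0;
    raise(SIGTRAP);              /* a debugger, if attached, sees this signal first */
    signal(SIGTRAP, previous);   /* restore the earlier disposition */
    return trap_seen;
}
```

With a debugger attached, execution should stop at the raise call and can be continued from there. Without one, the handler simply records the signal and the program carries on.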
I've tried all of these solutions and although @RichardGroves' answer preserved the stack, the best solution is to:
create your own assert method, such as Debug::assert(...)
set a breakpoint within XCode on that implementation
use the Step Out command to get back to the caller
This is because it's the only reliable way to both
view the stack trace
step / continue
volatile int resume = 0; /* volatile so the loop rereads the value the debugger writes */
for (int i = 0; i < 20 && !resume; ++i)
    sleep(1);
Above is a poor man's trap in that you have to manually attach to the program in question. Increase the delay as appropriate. Put the code where you want to break, and insert a breakpoint on the sleep statement, build and run your program, and attach to it from Xcode. Once Xcode breaks, you can right-click on the resume variable and edit it to 1, to resume execution.
I tried to find an implementation that behaves the same as the __debugbreak() that comes with the Microsoft compiler: one that breaks inside my code rather than somewhere inside system libraries, and allows me to continue execution. This implementation of __debugbreak() works exactly as I wanted:
#if defined(__APPLE__) && defined(__aarch64__)
#define __debugbreak() __asm__ __volatile__( \
" mov x0, %x0; \n" /* pid */ \
" mov x1, #0x11; \n" /* SIGSTOP */ \
" mov x16, #0x25; \n" /* syscall 37 = kill */ \
" svc #0x80 \n" /* software interrupt */ \
" mov x0, x0 \n" /* nop */ \
:: "r"(getpid()) \
: "x0", "x1", "x16", "memory")
#elif defined(__APPLE__) && defined(__arm__)
#define __debugbreak() __asm__ __volatile__( \
" mov r0, %0; \n" /* pid */ \
" mov r1, #0x11; \n" /* SIGSTOP */ \
" mov r12, #0x25; \n" /* syscall 37 = kill */ \
" svc #0x80 \n" /* software interrupt */ \
" mov r0, r0 \n" /* nop */ \
:: "r"(getpid()) \
: "r0", "r1", "r12", "memory")
#elif defined(__APPLE__) && defined(__i386__)
#define __debugbreak() __asm__ __volatile__("int $3; mov %eax, %eax")
#endif
#define ASSERT(expr) do { if (!(expr)){ __debugbreak(); } } while(0)
int pthread_kill(pthread_t thread, int sig); allows for continuation, and pauses on the current thread, via pthread_self().
It is similar to other signal functions (e.g., kill(), raise(), etc.); however, pthread_kill() is used to request that a signal be delivered to a particular thread.
Pthread_kill Manual
std::runtime_error::runtime_error("breakpoint")
together with an XCode exception breakpoint of type
Exception:C++ "named:std::runtime"
worked for me (using XCode 8.0).
It yields the same result as if I had set a breakpoint manually at the line where the
std::runtime_error::runtime_error
function is called, i.e. correct thread, correct call stack, and the possibility to resume.
To force Xcode to break, use
kill(getpid(), SIGSTOP)
You can then step out/up and use lldb per usual. When you're done, you can hit continue and it works just like a breakpoint set from the Xcode GUI.
Tested with Swift 5 and Xcode 11.3
The direct equivalent of x86 int3 / int 3 on arm / arm64 is
#if TARGET_CPU_ARM | TARGET_CPU_ARM64 | TARGET_CPU_ARM64E
asm volatile("trap");
#endif