GCC Jump Table initialization code generating movsxd and add?


When I compile a switch statement with optimization in GCC, it sets up a jump table like this,
(fcn) sym.foo 148
sym.foo (unsigned int arg1);
; arg unsigned int arg1 # rdi
0x000006e0 83ff06 cmp edi, 6 ; arg1
0x000006e3 0f87a7000000 ja case.default.0x790
0x000006e9 488d156c0100. lea rdx, [0x0000085c]
0x000006f0 89ff mov edi, edi
0x000006f2 4883ec08 sub rsp, 8
0x000006f6 486304ba movsxd rax, dword [rdx + rdi*4]
0x000006fa 4801d0 add rax, rdx ; '('
;-- switch.0x000006fd:
0x000006fd ffe0 jmp rax ; switch table (7 cases) at 0x85c
Is the MOVSXD and ADD the best way to do that,
movsxd rax, dword [rdx + rdi*4]
add rax, rdx
Isn't that the same as using LEA with displacement
lea rax, [rdx + rdi*4 + rdx]
It occurs to me that I probably don't understand what's going on here. RDX seems to be the start of the jump table. RDI is the incoming argument to the switch statement. Why are we adding RDX twice though?
This is the switch statement I was compiling with -O3,
int foo (int x) {
switch(x) {
//case 0: puts("\nzero"); break;
case 1: puts("\none"); break;
case 2: puts("\ntwo"); break;
case 3: puts("\nthree"); break;
case 4: puts("\nfour"); break;
case 5: puts("\nfive"); break;
case 6: puts("\nsix"); break;
}
return 0;
}

GCC is using relative displacements in its jump table (relative to the base of the table), instead of absolute addresses. So the jump table itself is position-independent, and doesn't need fixups when it's relocated, e.g. as part of loading a PIE executable or a PIC shared library.
If you compile with -fno-pie -no-pie, gcc might choose to use a table of jump targets with jmp [table + rdi*8]
Targets like x86-64 Linux do support runtime data fixups, so a simple jump table would be possible. But some targets don't support fixups at all, which is why gcc -fPIC / -fpie avoids it entirely. This potential optimization is gcc bug 84011. See discussion there for more.
It's unfortunate gcc is using a jump table instead of realizing that the only difference between each case is the data, not code. So really it just needs a table lookup of string pointers. (Which could be done with relative displacements if it wanted to.)
That's a separate missed optimization, which I reported as bug 85585. (That reminds me, I have a followup to that half-written which I should finish and post.)
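The table-lookup form described above might look like the following in C. This is a hand-written sketch of what such an optimization would amount to, not GCC output; the names case_text, foo_text and foo_table are made up for illustration.

```c
#include <stdio.h>

/* Every case only differs in the string passed to puts(), so the switch can
   be replaced by a table of string pointers indexed by x. */
static const char *const case_text[] = {
    NULL,        /* case 0 is commented out in the original switch */
    "\none", "\ntwo", "\nthree", "\nfour", "\nfive", "\nsix",
};

/* Returns the string for x, or NULL when the original switch does nothing. */
const char *foo_text(int x) {
    /* One unsigned compare rejects both negative x and x > 6,
       like the cmp/ja pair in the compiler output. */
    if ((unsigned)x >= sizeof case_text / sizeof case_text[0])
        return NULL;
    return case_text[x];
}

int foo_table(int x) {
    const char *s = foo_text(x);
    if (s)
        puts(s);
    return 0;
}
```

Calling foo_table(3) prints the same text as the original foo(3), and values outside 1..6 fall through without printing.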

Is the MOVSXD and ADD the best way to do that,
It could be done with just an add with a qword memory operand. Of course the downside is that it makes the table twice as big.
Isn't that the same as using LEA with displacement
No, lea does not access memory.
Why are we adding RDX twice though?
The first time it is used as the base of the table to index into it. The table holds addresses relative to itself, so adding RDX to the value from the table creates an absolute address.
By the way this could easily be improved:
mov edi, edi ; truncate rdi to 32bit
A self-mov cannot be mov-eliminated on current architectures, so it would be better to mov to some other register.
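The relative-table lookup can be modelled in portable C to make the two uses of the base address explicit. This is only a sketch: struct image, REL and resolve are hypothetical names, and short strings stand in for the code at each case label.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NCASES = 3, TARGET_LEN = 8 };

/* Toy "image": the table comes first, the jump targets follow it. */
struct image {
    int32_t table[NCASES];               /* 4 bytes per entry, not 8 */
    char    target[NCASES][TARGET_LEN];  /* stand-ins for case-label code */
};

/* Distance from the start of the table to target k, known at build time,
   so the table needs no load-time fixups. */
#define REL(k) ((int32_t)(offsetof(struct image, target) + (k) * TARGET_LEN))

static const struct image img = {
    { REL(0), REL(1), REL(2) },
    { "case0", "case1", "case2" },
};

const char *resolve(const struct image *im, uint32_t idx) {
    assert(idx < NCASES);                /* cmp edi, N / ja default         */
    const char *base = (const char *)im; /* lea rdx, [table] (table is first) */
    uint64_t i = idx;                    /* mov edi, edi: zero-extend index */
    int64_t rel = im->table[i];          /* movsxd rax, dword [rdx + rdi*4] */
    return base + rel;                   /* add rax, rdx                    */
}
```

In this toy layout the offsets are positive. In the disassembly from the question the case targets sit below the table at 0x85c, so the stored offsets are negative, which is why the load has to sign-extend.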

Related

Cannot modify data segment register. When tried General Protection Error is thrown

I have been trying to create an ISR handler following this
tutorial by James Molloy but I got stuck. Whenever I throw a software interrupt, the general purpose registers and the data segment register are pushed onto the stack along with the values automatically pushed by the CPU. Then the data segment is changed to 0x10 (Kernel Data Segment Descriptor) so the privilege levels are changed. Then after the handler returns those values are popped. But whenever the value in ds is changed a GPE is thrown with the error code 0x2544 and after a few seconds the VM restarts. (linker and compiler i386-elf-gcc, assembler nasm)
I tried placing hlt instructions in between instructions to locate which instruction was throwing the GPE. That way I was able to find out that the culprit is the mov ds,ax instruction. I tried various things, from removing the stack which was initialized by the bootstrap code to deleting the privilege changing parts of the code. The only way I can return from the common stub is to remove the parts of my code which change the privilege levels, but as I want to move towards user mode I still want them to stay.
Here is my common stub:
isr_common_stub:
pusha ; Pushes edi,esi,ebp,esp,ebx,edx,ecx,eax
xor eax,eax
mov ax, ds ; Lower 16-bits of eax = ds.
push eax ; save the data segment descriptor
mov ax, 0x10 ; load the kernel data segment descriptor
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
call isr_handler
xor eax,eax
pop eax
mov ds, ax ; This is the instruction everything fails;
mov es, ax
mov fs, ax
mov gs, ax
popa
iret
My ISR handler macros:
extern isr_handler
%macro ISR_NOERRCODE 1
global isr%1 ; %1 accesses the first parameter.
isr%1:
cli
push byte 0
push %1
jmp isr_common_stub
%endmacro
%macro ISR_ERRCODE 1
global isr%1
isr%1:
cli
push byte %1
jmp isr_common_stub
%endmacro
ISR_NOERRCODE 0
ISR_NOERRCODE 1
ISR_NOERRCODE 2
ISR_NOERRCODE 3
...
My C handler which results in "Received interrupt: 0xD err. code 0x2544"
#include <stdio.h>
#include <isr.h>
#include <tty.h>
void isr_handler(registers_t regs) {
printf("ds: %x \n" ,regs.ds);
printf("Received interrupt: %x with err. code: %x \n", regs.int_no, regs.err_code);
}
And my main function:
void kmain(struct multiboot *mboot_ptr) {
descinit(); // Sets up IDT and GDT
ttyinit(TTY0); // Sets up the VGA Framebuffer
asm volatile ("int $0x1"); // Triggers a software interrupt
printf("Wow"); // After that it's supposed to print this
}
As you can see the code was supposed to output,
ds: 0x10
Received interrupt: 0x1 with err. code: 0
but results in,
...
ds: 0x10
Received interrupt: 0xD with err. code: 0x2544
ds: 0x10
Received interrupt: 0xD with err. code: 0x2544
...
Which goes on until the VM restarts itself.
What am I doing wrong?

The code isn't complete but I'm going to guess what you are seeing is a result of a well known bug in James Molloy's OSDev tutorial. The OSDev community has compiled a list of known bugs in an errata list. I recommend reviewing and fixing all the bugs mentioned there. Specifically in this case I believe the bug that is causing problems is this one:
Problem: Interrupt handlers corrupt interrupted state
This article previously told you to know the ABI. If you do you will
see a huge problem in the interrupt.s suggested by the tutorial: It
breaks the ABI for structure passing! It creates an instance of the
struct registers on the stack and then passes it by value to the
isr_handler function and then assumes the structure is intact
afterwards. However, the function parameters on the stack belongs to
the function and it is allowed to trash these values as it sees fit
(if you need to know whether the compiler actually does this, you are
thinking the wrong way, but it actually does). There are two ways
around this. The most practical method is to pass the structure as a
pointer instead, which allows you to explicitly edit the register
state when needed - very useful for system calls, without having the
compiler randomly doing it for you. The compiler can still edit the
pointer on the stack when it's not specifically needed. The second
option is to make another copy the structure and pass that
The problem is that the 32-bit System V ABI doesn't guarantee that data passed by value will be unmodified on the stack! The compiler is free to reuse that memory for whatever purposes it chooses. The compiler probably generated code that trashed the area on the stack where DS is stored. When DS was set with the bogus value it crashed. What you should be doing is passing by reference rather than value. I'd recommend these code changes in the assembly code:
irq_common_stub:
pusha
mov ax, ds
push eax
mov ax, 0x10 ;0x10
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
push esp ; At this point ESP is a pointer to where GS (and the rest
; of the interrupt handler state resides)
; Push ESP as 1st parameter as it's a
; pointer to a registers_t
call irq_handler
pop ebx ; Remove the saved ESP on the stack. Efficient to just pop it
; into any register. You could have done: add esp, 4 as well
pop ebx
mov ds, bx
mov es, bx
mov fs, bx
mov gs, bx
popa
add esp, 8
sti
iret
And then modify irq_handler to use registers_t *regs instead of registers_t regs :
void irq_handler(registers_t *regs) {
if (regs->int_no >= 40) port_byte_out(0xA0, 0x20);
port_byte_out(0x20, 0x20);
if (interrupt_handlers[regs->int_no] != 0) {
interrupt_handlers[regs->int_no](*regs);
}
else
{
klog("ISR: Unhandled IRQ%u!\n", regs->int_no);
}
}
I'd actually recommend each interrupt handler take a pointer to registers_t to avoid unnecessary copying. If your interrupt handlers and the interrupt_handlers array used functions that took registers_t * as the parameter (instead of registers_t) then you'd modify the code:
interrupt_handlers[regs->int_no](*regs);
to be:
interrupt_handlers[regs->int_no](regs);
Important: You have to make these same type of changes for your ISR handlers as well. Both the IRQ and ISR handlers and associated code have this same problem.
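For reference, a registers_t layout that matches the push order of the stubs above could look like the sketch below. The fields ds, int_no and err_code are the ones used in the handlers shown here; the other field names and example_handler are assumptions for illustration, so check them against your own isr.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Lowest stack address first, i.e. in reverse order of pushing. */
typedef struct registers {
    uint32_t ds;                                      /* pushed last by the common stub */
    uint32_t edi, esi, ebp, esp, ebx, edx, ecx, eax;  /* saved by pusha */
    uint32_t int_no, err_code;                        /* pushed by the isrN macro (or CPU) */
    uint32_t eip, cs, eflags;                         /* pushed by the CPU */
    uint32_t useresp, ss;                             /* only present on a privilege change */
} registers_t;

/* Because the handler receives a pointer, it works on the very frame the stub
   will pop afterwards: nothing is copied, and the compiler has no by-value
   argument area that it is allowed to scribble over. */
void example_handler(registers_t *regs) {
    if (regs->int_no == 0x80)
        regs->eax = 0;   /* deliberate edit: popa will load this into EAX */
}
```

The saved ds is left alone unless the handler writes to it explicitly, which is the property the by-value version could not guarantee.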

In Clang/LLVM x86-64 inline assembly, how do I say I clobbered the x87/media state?

I'm writing some x86-64 inline assembly that might affect the floating point and media (SSE, MMX, etc.) state, but I don't feel like saving and restoring the state myself. Does Clang/LLVM have a clobber constraint for that?
(I'm not too familiar with the x86-64 architecture or inline assembly, so it was hard to know what to search for. More details in case this is an XY problem: I'm working on a simple coroutine library in Rust. When we switch tasks, we need to store the old CPU state and load the new state, and I'd like to write as little assembly as possible. My guess is that letting the compiler take care of saving and restoring state is the simplest way to do that.)

If your coroutine looks like an opaque (non-inline) function call, the compiler will already assume the FP state is clobbered (except for control regs like MXCSR and the x87 control word (rounding mode)), because all the FP regs are call-clobbered in the normal function calling convention.
Except for Windows, where xmm6..15 are call-preserved.
Also beware that if you're putting a call inside inline asm, there's no way to tell the compiler that your asm clobbers the red zone (128 bytes below RSP in the x86-64 System V ABI). You could compile that file with -mno-redzone or use add rsp, -128 before call to skip over the red-zone that belongs to the compiler-generated code.
To declare clobbers on the FP state, you have to name all the registers separately.
"xmm0", "xmm1", ..., "xmm15" (clobbering xmm0 counts as clobbering ymm0/zmm0).
For good measure you should also name "mm0", ..., "mm7" (MMX), in case your code inlines into some legacy code using MMX intrinsics.
To clobber the x87 stack as well, "st" is how you refer to st(0) in the clobber list. The rest of the registers have their normal names for GAS syntax, "st(1)", ..., "st(7)".
https://stackoverflow.com/questions/39728398/how-to-specify-clobbered-bottom-of-the-x87-fpu-stack-with-extended-gcc-assembly
You never know, it is possible to compile with clang -mfpmath=387, or to use 387 via long double.
(Hopefully no code uses -mfpmath=387 in 64-bit mode and MMX intrinsics at the same time; the following test-case looks slightly broken with gcc in that case.)
#include <immintrin.h>
float gvar;
int testclobber(float f, char *p)
{
int arg1 = 1, arg2 = 2;
f += gvar; // with -mno-sse, this will be in an x87 register
__m64 mmx_var = *(const __m64*)p; // MMX
mmx_var = _mm_unpacklo_pi8(mmx_var, mmx_var);
// x86-64 System V calling convention
unsigned long long retval;
asm volatile ("add $-128, %%rsp \n\t" // skip red zone. -128 fits in an imm8
"call whatever \n\t"
"sub $-128, %%rsp \n\t"
// FIXME should probably align the stack in here somewhere
: "=a"(retval) // returns in RAX
: "D" (arg1), "S" (arg2) // input args in registers
: "rcx", "rdx", "r8", "r9", "r10", "r11" // call-clobbered integer regs
// call clobbered FP regs, *NOT* including MXCSR
, "mm0", "mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7" // MMX
, "st", "st(1)", "st(2)", "st(3)", "st(4)", "st(5)", "st(6)", "st(7)" // x87
// SSE/AVX: clobbering any results in a redundant vzeroupper with gcc?
, "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"
, "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15"
#ifdef __AVX512F__
, "zmm16", "zmm17", "zmm18", "zmm19", "zmm20", "zmm21", "zmm22", "zmm23"
, "zmm24", "zmm25", "zmm26", "zmm27", "zmm28", "zmm29", "zmm30", "zmm31"
, "k0", "k1", "k2", "k3", "k4", "k5", "k6", "k7"
#endif
#ifdef __MPX__
, "bnd0", "bnd1", "bnd2", "bnd3"
#endif
, "memory" // reads/writes of globals and pointed-to data can't reorder across the asm (at compile time; runtime StoreLoad reordering is still a thing)
);
// Use the MMX var after the asm: compiler has to spill/reload the reg it was in
*(__m64*)p = mmx_var;
_mm_empty(); // emms
gvar = f; // memory clobber prevents hoisting this ahead of the asm.
return retval;
}
source + asm on the Godbolt compiler explorer
By commenting out one of the lines of clobbers, we can see that the spill-reload goes away in the asm. e.g. commenting out the x87 st .. st(7) clobbers makes code that leaves f + gvar in st0, for just a fst dword [gvar] after the call.
Similarly, commenting out the mm0 line lets gcc and clang keep mmx_var in mm0 across the call. The ABI requires that the FPU is in x87 mode, not MMX, on call / ret, so this isn't really sufficient. The compiler will spill/reload around the asm, but it won't insert an emms for us. But by the same token, it would be an error for a function using MMX to call your co-routine without doing _mm_empty() first, so maybe this isn't a real problem.
I haven't experimented with __m256 variables to see if it inserts a vzeroupper before the asm, to avoid possible SSE/AVX slowdowns.
If we comment the xmm8..15 line, we see the version that isn't using x87 for float keeps it in xmm8, because now it thinks it has some non-clobbered xmm regs. If we comment both sets of lines, it assumes xmm0 lives across the asm, so this works as a test of the clobbers.
asm output with all clobbers in place
It saves/restores RBX (to hold the pointer arg across the asm statement), which happens to re-align the stack by 16. That's another problem with using call from inline asm: I don't think alignment of RSP is guaranteed.
# from clang7.0 -march=skylake-avx512 -mmpx
testclobber: # @testclobber
push rbx
vaddss xmm0, xmm0, dword ptr [rip + gvar]
vmovss dword ptr [rsp - 12], xmm0 # 4-byte Spill (because of xmm0..15 clobber)
mov rbx, rdi # save pointer for after asm
movq mm0, qword ptr [rdi]
punpcklbw mm0, mm0 # mm0 = mm0[0,0,1,1,2,2,3,3]
movq qword ptr [rsp - 8], mm0 # 8-byte Spill (because of mm0..7 clobber)
mov edi, 1
mov esi, 2
add rsp, -128
call whatever
sub rsp, -128
movq mm0, qword ptr [rsp - 8] # 8-byte Reload
movq qword ptr [rbx], mm0
emms # note this didn't happen before call
vmovss xmm0, dword ptr [rsp - 12] # 4-byte Reload
vmovss dword ptr [rip + gvar], xmm0
pop rbx
ret
Notice that because of the "memory" clobber in the asm statement, *p and gvar are read before the asm, but written after. Without that, the optimizer could sink the load or hoist the store so no local variable was live across the asm statement. But now the optimizer needs to assume that the asm statement itself might read the old value of gvar and/or modify it. (And assume that p points to memory that's also globally accessible somehow, because we didn't use __restrict.)
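The effect of the "memory" clobber can be seen in isolation with an empty asm statement. This is a minimal sketch using the GCC/Clang extended-asm extension; shared_counter and bump_around_barrier are hypothetical names, with shared_counter standing in for gvar.

```c
int shared_counter;   /* hypothetical global, plays the role of gvar */

int bump_around_barrier(void) {
    int before = shared_counter;         /* load has to happen before the asm */
    __asm__ volatile ("" ::: "memory");  /* emits no instructions, but the compiler
                                            must assume memory was read and written */
    shared_counter = before + 1;         /* store can't be hoisted above the asm */
    return before;
}
```

This only constrains compile-time reordering. It emits no fence instruction, so it says nothing about how the CPU orders the accesses at run time.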

Segmentation Fault 11 linking os x 32-bit assembler

UPDATE: Sure enough, it was a bug in the latest version of nasm. I "downgraded" and after fixing my code as shown in the answer I accepted, everything is working properly. Thanks, everyone!
I'm having problems with what should be a very simple program in 32-bit assembler on OS X.
First, the code:
section .data
hello db "Hello, world", 0x0a, 0x00
section .text
default rel
global _main
extern _printf, _exit
_main:
sub esp, 12 ; 16-byte align stack
push hello
call _printf
push 0
call _exit
It assembles and links, but when I run the executable it crashes with a segmentation fault: 11.
The command lines to assemble and link are:
nasm -f macho32 hello32x.asm -o hello32x.o
I know the -o there is not 100 percent necessary
Linking:
ld -lc -arch i386 hello32x.o -o hello32x
When I run it in lldb to debug it, everything is fine until it enters the call to _printf, where it crashes as shown below:
(lldb) s
Process 1029 stopped
* thread #1: tid = 0x97a4, 0x00001fac hello32x`main + 8, queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x00001fac hello32x`main + 8
hello32x`main:
-> 0x1fac <+8>: calll 0xffffffff991e381e
0x1fb1 <+13>: pushl $0x0
0x1fb3 <+15>: calll 0xffffffff991fec84
0x1fb8: addl %eax, (%eax)
(lldb) s
Process 1029 stopped
* thread #1: tid = 0x97a4, 0x991e381e libsystem_c.dylib`vfprintf + 49, queue = 'com.apple.main-thread', stop reason = instruction step into
frame #0: 0x991e381e libsystem_c.dylib`vfprintf + 49
libsystem_c.dylib`vfprintf:
-> 0x991e381e <+49>: xchgb %ah, -0x76f58008
0x991e3824 <+55>: popl %esp
0x991e3825 <+56>: andb $0x14, %al
0x991e3827 <+58>: movl 0xc(%ebp), %ecx
(lldb) s
Process 1029 stopped
* thread #1: tid = 0x97a4, 0x991e381e libsystem_c.dylib`vfprintf + 49, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x890a7ff8)
frame #0: 0x991e381e libsystem_c.dylib`vfprintf + 49
libsystem_c.dylib`vfprintf:
-> 0x991e381e <+49>: xchgb %ah, -0x76f58008
0x991e3824 <+55>: popl %esp
0x991e3825 <+56>: andb $0x14, %al
0x991e3827 <+58>: movl 0xc(%ebp), %ecx
As you can see toward the bottom, it stops due to a bad access error.

16-byte Stack Alignment
One serious issue with your code is stack alignment. 32-bit OS/X code requires 16-byte stack alignment at the point you make a CALL. The Apple IA-32 Calling Convention says this:
The function calling conventions used in the IA-32 environment are the same as those used in the System V IA-32 ABI, with the following exceptions:
Different rules for returning structures
The stack is 16-byte aligned at the point of function calls
Large data types (larger than 4 bytes) are kept at their natural alignment
Most floating-point operations are carried out using the SSE unit instead of the x87 FPU, except when operating on long double values. (The IA-32 environment defaults to 64-bit internal precision for the x87 FPU.)
You subtract 12 from ESP to align the stack to a 16 byte boundary (4 bytes for return address + 12 = 16). The problem is that when you make a CALL to a function the stack MUST be 16 bytes aligned just prior to the CALL itself. Unfortunately you push 4 bytes before the call to printf and exit. This misaligns the stack by 4, when it should be aligned to 16 bytes. You'll have to rework the code with proper alignment. As well you must clean up the stack after you make a call. If you use PUSH to put parameters on the stack you need to adjust ESP after your CALL to restore the stack to its previous state.
One naive way (not my recommendation) to fix the code would be to do this:
section .data
hello db "Hello, world", 0x0a, 0x00
section .text
default rel
global _main
extern _printf, _exit
_main:
sub esp, 8
push hello ; 4(return address)+ 8 + 4 = 16 bytes stack aligned
call _printf
add esp, 4 ; Remove arguments
push 0 ; 4 + 8 + 4 = 16 byte alignment again
call _exit ; This will not return so no need to remove parameters after
The code above works because we can take advantage of the fact that both functions (exit and printf) require exactly one DWORD being placed on the stack for parameters. 4 bytes for main's return address, 8 for the stack adjustment we made, 4 for the DWORD parameter = 16 byte alignment.
A better way to do this is to compute the amount of stack space you will need for all your stack based local variables (in this case 0) in your main function, plus the maximum number of bytes you will need for any parameters to function calls made by main, and then pad that amount so that it plus the 4-byte return address is a multiple of 16 (that is, subtract 12, 28, 44, ... bytes). In our case the maximum number of bytes needed for any one function call is 4 bytes. We add 8 bytes of padding (4+8=12) so that 12 plus the 4-byte return address equals 16. We then subtract 12 from ESP at the start of our function.
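The padding calculation can be written down as a small helper. This is a sketch in C rather than assembly, and stack_reserve is a made-up name; it assumes the stack was 16-byte aligned just before the CALL that entered the function.

```c
#include <stdint.h>

/* Bytes to subtract from ESP so that ESP is 16-byte aligned at every CALL.
   on_entry: bytes pushed since the last aligned point
             (4 for the return address, 8 if EBP has been pushed as well).
   needed:   bytes wanted for locals plus the largest outgoing argument list. */
uint32_t stack_reserve(uint32_t on_entry, uint32_t needed) {
    uint32_t total = on_entry + needed;
    uint32_t rounded = (total + 15u) & ~15u;  /* round up to a multiple of 16 */
    return rounded - on_entry;                /* what "sub esp, N" must use */
}
```

For this program stack_reserve(4, 4) gives 12. With 16 bytes of arguments it gives 28, not 24, because only the sum with the return address has to be a multiple of 16.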
Instead of using PUSH to put parameters on the stack you can now move the parameters directly onto the stack into the space we have reserved. Because we don't PUSH the stack doesn't get misaligned. Since we didn't use PUSH we don't need to fix ESP after our function calls. The code could then look something like:
section .data
hello db "Hello, world", 0x0a, 0x00
section .text
default rel
global _main
extern _printf, _exit
_main:
sub esp, 12 ; 16-byte align stack + room for parameters passed
; to functions we call
mov [esp],dword hello ; First parameter at esp+0
call _printf
mov [esp], dword 0 ; First parameter at esp+0
call _exit
If you wanted to pass multiple parameters you place them manually on the stack as we did with a single parameter. If we wanted to print an integer 42 as part of our call to printf we could do it this way:
section .data
hello db "Hello, world %d", 0x0a, 0x00
section .text
default rel
global _main
extern _printf, _exit
_main:
sub esp, 12 ; 16-byte align stack + room for parameters passed
; to functions we call
mov [esp+4], dword 42 ; Second parameter at esp+4
mov [esp],dword hello ; First parameter at esp+0
call _printf
mov [esp], dword 0 ; First parameter at esp+0
call _exit
When run we should get:
Hello, world 42
16-byte Stack Alignment and a Stack Frame
If you are looking to create a function with a typical stack frame then the code in the previous section has to be adjusted. Upon entry to a function in a 32-bit application the stack is misaligned by 4 bytes because the return address was placed on the stack. A typical stack frame prologue looks like:
push ebp
mov ebp, esp
Pushing EBP onto the stack after entry to your function still results in a misaligned stack, but it is misaligned now by 8 bytes (4 + 4).
Because of that the code must subtract 8 from ESP rather than 12. As well, when determining the space needed to hold parameters, local stack variables, and pad bytes for alignment, the stack allocation size plus the 8 bytes already on the stack must be a multiple of 16 (that is, subtract 8, 24, 40, ... bytes). Code with a stack frame could look like:
section .data
hello db "Hello, world %d", 0x0a, 0x00
section .text
default rel
global _main
extern _printf, _exit
_main:
push ebp
mov ebp, esp ; Set up stack frame
sub esp, 8 ; 16-byte align stack + room for parameters passed
; to functions we call
mov [esp+4], dword 42 ; Second parameter at esp+4
mov [esp],dword hello ; First parameter at esp+0
call _printf
xor eax, eax ; Return value = 0
mov esp, ebp
pop ebp ; Remove stack frame
ret ; We linked with C library that calls _main
; after initialization. We can do a RET to
; return back to the C runtime code that will
; exit the program and return the value in EAX
; We can do this instead of calling _exit
Because you link with the C library on OS/X it will provide an entry point and do initialization before calling _main. You can call _exit but you can also do a RET instruction with the program's return value in EAX.
Yet Another Potential NASM Bug?
I discovered that NASM v2.12 installed via MacPorts on El Capitan seems to generate incorrect relocation entries for _printf and _exit, and when linked to a final executable the code doesn't work as expected. I observed almost the identical errors you did with your original code.
The first part of my answer about stack alignment still applies, however it appears you will need to work around the NASM issue as well. One way to do this is to install the NASM that comes with the latest XCode command line tools. This version is much older, only supports Macho-32, and doesn't support the default directive. Using my previous stack aligned code this should work:
section .data
hello db "Hello, world %d", 0x0a, 0x00
section .text
;default rel ; This directive isn't supported in older versions of NASM
global _main
extern _printf, _exit
_main:
sub esp, 12 ; 16-byte align stack
mov [esp+4], dword 42 ; Second parameter at esp+4
mov [esp],dword hello ; First parameter at esp+0
call _printf
mov [esp], dword 0 ; First parameter at esp+0
call _exit
To assemble with NASM and link with LD you could use:
/usr/bin/nasm -f macho hello32x.asm -o hello32x.o
ld -macosx_version_min 10.8 -no_pie -arch i386 -o hello32x hello32x.o -lc
Alternatively you could link with GCC:
/usr/bin/nasm -f macho hello32x.asm -o hello32x.o
gcc -m32 -Wl,-no_pie -o hello32x hello32x.o
/usr/bin/nasm is the location of the XCode command line tools version of NASM that Apple distributes. The version I have on El Capitan with latest XCode command line tools is:
NASM version 0.98.40 (Apple Computer, Inc. build 11) compiled on Jan 14 2016
I don't recommend NASM version 2.11.08 because it has a serious bug related to macho64 format. I recommend 2.11.09rc2. I have tested that version here and it does seem to work properly with the code above.

OSX gettimeofday syscall on x86_64 seems to not work

I'm making a call to gettimeofday via the syscall instruction using 64bit code.
I can't get any results back and am getting told via Dtrace that the call worked with no errors, but the registers I get back from the call are garbage.
I do the following:
lea rdi, [rel timeval] ;buffer for 16bytes
mov rsi, 0 ;no need of timezone
mov rax, 0x2000074 ;gettimeofday
syscall
On return rax is neither 0 nor -1, and the buffer never gets any data.
Please can somebody check this and see if they can get a working call. I've no idea what is going on.
Best Regards
Chris
P.S. This is the example code I just tried; it doesn't return anything but 0.
SECTION .text
global _main
_main:
lea rdi, [rel buffer]
mov rsi, 0
mov rax, 0x2000074
syscall
mov rdi, [rel buffer]
mov rax, 0x2000001
syscall
SECTION .data
buffer:
times 16 db 0

According to the APIs, you need to point:
1) RDI to the timeval structure:
_STRUCT_TIMEVAL
{
__darwin_time_t tv_sec; /* seconds */
__darwin_suseconds_t tv_usec; /* and microseconds */
};
DarwinTime is a QWORD and DarwinSuSeconds is a DWORD.
2) RSI to the timezone structure.
Just point that to a 2xQWORD scratch buffer if you don't like its output.
3) RDX to the mach_absolute_time structure
This is a 1xQWORD buffer
Setting any of RDI, RSI or RDX to 0x0 rather than a valid pointer might get the call rejected, as it can trigger an internal exception. (Check if the return value is -EFAULT.)
Also note that POSIX deprecates gettimeofday in favour of clock_gettime (which has a seconds / nanosecond time struct), so you might want to use that instead.
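A minimal sketch of the clock_gettime route, in C and going through the libc wrapper rather than a raw syscall instruction. wall_clock_now is a hypothetical helper name; clock_gettime and CLOCK_REALTIME are the POSIX names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Fills *out with seconds and nanoseconds since the Unix epoch.
   Returns 0 on success, -1 on failure (errno is set by clock_gettime). */
int wall_clock_now(struct timespec *out) {
    return clock_gettime(CLOCK_REALTIME, out);
}

/* Prints the current time as "seconds.nanoseconds". */
void print_wall_clock(void) {
    struct timespec ts;
    if (wall_clock_now(&ts) == 0)
        printf("%lld.%09ld\n", (long long)ts.tv_sec, ts.tv_nsec);
}
```

Going through libc also sidesteps the question of which registers the raw system call expects, since the wrapper takes care of the kernel interface.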

Assembly - Moving data from Register to Memory in MASM

I am trying to move stuff from a register to a variable in .CODE, but trying to do so makes my program start over in an infinite loop (no crash and no error message, but obviously broken). I don't understand what I'm doing wrong. Here is the beginning of my code where I am trying to move data; the program never even gets past this part when I include it:
.CODE
screenX DWORD 0
screenY DWORD 0
...
ProcName PROC
mov ebx, edx ;; Copy srcBitmap into ebx
mov eax, edi ;; Take given y-location (edi)
mov edx, (EECS205BITMAP PTR [ebx]).dwHeight
shr edx, 1 ;; Subtract dwHeight/2 to center
sub eax, edx
mov screenY, eax ;; Program jumps back to beginning with no error message
Seems like I'm missing something obvious, anyone have a clue?

Your application's code segment (which is actually its .text section under Windows) isn't writable. If you want to modify these variables you need to put them in the data segment.
