Why does Clang generate ud2 opcode on OSX? - macos

This is possibly similar to the question "What's the purpose of the UD2 opcode in the Linux kernel?". However, I'm getting this on OSX, not on Linux, and I wouldn't know where to look to see whether it is the same thing as the BUG() macro mentioned there.
I've been getting a number of release-build-only crashes in my OSX build that involve the ud2 opcode, and I was wondering what would cause clang to generate it. Here is an example:
COMMON_UI::BackProject3DPosition(UTILITYLIB::TVECTOR<float, 3u> const&, UTILITYLIB::TVECTOR<float, 3u> const&) const:
0x1e0705c: pushl %ebp
0x1e0705d: movl %esp, %ebp
0x1e0705f: ud2
0x1e07061: nop
This only happens at -O2, and not at -O1, so it looks like the optimisations are going slightly awry.
Any help would be greatly appreciated.

I'm not 100% sure about clang, but gcc sometimes inserts ud2 to mark code areas which exhibit undefined behavior and thus are not supposed to be executed. It does give a warning in such cases, however.
So I suspect there are some warnings from the compiler which you are ignoring or suppressing. Try adding -Wall -Werror to the command line.
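To make the "undefined behavior" point concrete, here is a minimal C++ sketch of one common way a whole function body can collapse into a trap: a non-void function that never returns a value. The names Vec3, dot and check_dot are hypothetical and not taken from the question, and whether this is what happened in BackProject3DPosition is only a guess. As far as I know, current LLVM lowers code it has proven unreachable to a trap instruction on Mach-O targets, which would fit seeing ud2 on OSX in particular.

```cpp
#include <cassert>

// Hypothetical reduction; none of these names come from the question.
struct Vec3 { float x, y, z; };

// BUGGY shape (kept as a comment so this file still builds with -Werror):
//
//   float dot_buggy(const Vec3& a, const Vec3& b) {
//       float d = a.x * b.x + a.y * b.y + a.z * b.z;
//       // missing `return d;`
//   }
//
// Flowing off the end of a non-void function is undefined behavior in C++.
// The compiler reports it under -Wreturn-type (enabled by -Wall), and an
// optimizer may treat the end of the function as unreachable, which can
// be emitted as a trap such as ud2.

// FIXED shape: every path returns a value.
float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Small self-check: 1*4 + 2*5 + 3*6 == 32, exactly representable in float.
void check_dot() {
    assert(dot(Vec3{1.0f, 2.0f, 3.0f}, Vec3{4.0f, 5.0f, 6.0f}) == 32.0f);
}
```

Compiling the buggy shape with -Wall should produce a -Wreturn-type warning, and -Werror=return-type turns that particular warning into a hard error without promoting every other warning.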

Related

How can I debug executables for Windows using GDB in WSL?

I'm frankly not even sure if this is a thing GDB can do, but no amount of searching I've done so far has given me a 'yes' or 'no'.
When I attempt to debug an application using a GDB installation built for Linux and opened in WSL, it is unable to insert a breakpoint anywhere in the program, claiming it cannot access the memory at that address. If I do this from Windows with a GDB built for Windows, this error does not happen. (Before you ask why I don't just use the Windows build: I'm having other miscellaneous issues with that one. I may open a question for that as well.)
I've got an internal error from GDB as well, but unfortunately, I can't seem to recreate it right now.
I've tried rebuilding GDB, as well as switching to another version of GDB (the same as my Windows build).
I'm using a WSL installation of Ubuntu 20.04 and GDB 10.2, configured as follows:
(gdb) show configuration
This GDB was configured as follows:
configure --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-auto-load-dir=$debugdir:$datadir/auto-load
--with-auto-load-safe-path=$debugdir:$datadir/auto-load
--without-expat
--with-gdb-datadir=/usr/local/share/gdb (relocatable)
--with-jit-reader-dir=/usr/local/lib/gdb (relocatable)
--without-libunwind-ia64
--without-lzma
--without-babeltrace
--without-intel-pt
--without-mpfr
--without-xxhash
--without-python
--without-python-libdir
--without-debuginfod
--without-guile
--disable-source-highlight
--with-separate-debug-dir=/usr/local/lib/debug (relocatable)
To see if this was an issue with the particular program I was debugging, I made a very minimal program in NASM (my original project was also in NASM) and compiled it as follows:
nasm -f win32 -gcv8 Test.asm
gcc -m32 -g Test.obj -o Test.exe
The source assembly is very simple. It just calls printf with a string and integer.
; Test.asm
global _main
extern _printf
section .data
fmt: db "%s, %d", 0x0
string: db "Testing...", 0x0
section .bss
num: resd 1
section .text
_main:
mov dword [num], 28
push dword [num]
push string
push fmt
call _printf
add esp, 12
ret
When attempting to debug this with GDB in WSL, this is the output I get:
(gdb) file Test.exe
Reading symbols from Test.exe...
(gdb) set architecture i386:x86-64
The target architecture is set to "i386:x86-64".
(gdb) start
Temporary breakpoint 1 at 0x401520
Starting program: /mnt/c/NASM/Test.exe
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x401520
EDIT: After poking at it some more, I discovered something that seems important. GDB is only unable to access the memory and place breakpoints when the program is running. Before I've started the program, I can place breakpoints and disassemble freely.
(gdb) disas main
Dump of assembler code for function main:
0x00401520 <+0>: mov DWORD PTR ds:0x405028,0x1c
0x0040152a <+10>: push DWORD PTR ds:0x405028
0x00401530 <+16>: push 0x40300b
0x00401535 <+21>: push 0x403004
0x0040153a <+26>: call 0x40249c <printf>
0x0040153f <+31>: add esp,0xc
0x00401542 <+34>: ret
0x00401543 <+35>: xchg ax,ax
0x00401545 <+37>: xchg ax,ax
0x00401547 <+39>: xchg ax,ax
0x00401549 <+41>: xchg ax,ax
0x0040154b <+43>: xchg ax,ax
0x0040154d <+45>: xchg ax,ax
0x0040154f <+47>: nop
End of assembler dump.
(gdb) b *main+26
Breakpoint 1 at 0x40153a
(gdb) run
Starting program: /mnt/c/NASM/Test.exe
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x40153a
(gdb) disas main
Dump of assembler code for function main:
0x00401520 <+0>: Cannot access memory at address 0
EDIT 2:
I don't know how useful this information might be, but I did find a method that consistently causes an internal error for GDB. Starting execution of the program, then setting the architecture to auto causes an internal error every time I've tried it.
(gdb) file Test.exe
Reading symbols from Test.exe...
(gdb) start
Temporary breakpoint 1 at 0x401520
Starting program: /mnt/c/NASM/Test.exe
warning: Selected architecture i386 is not compatible with reported target architecture i386:x86-64
warning: Architecture rejected target-supplied description
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x401520
(gdb) set architecture auto
warning: Selected architecture i386 is not compatible with reported target architecture i386:x86-64
/mnt/c/Users/Joshua/gdb-10.2/gdb/arch-utils.c:503: internal-error: could not select an architecture automatically
A problem internal to GDB has been detected,
further debugging may prove unreliable.
If the answer to this really is as simple as "GDB built for Linux can't debug applications built for Windows"... I'll be very sad, and also quite annoyed that I was unable to find that info anywhere.

GNU Assembler in Windows Subsystem for Linux fail

I would like to compile "Hello World" in Windows Subsystem for Linux (WSL) with Debian.
.text
.global _start
_start:
movl $len,%edx
movl $msg,%ecx
movl $1,%ebx
movl $4,%eax
int $0x80
movl $0,%ebx
movl $1,%eax
int $0x80
.data
msg:
.ascii "Hello, world!\n"
len = . - msg
If I compile it on a Debian server with
gcc -nostdlib -o hello hello.s
it works, but in WSL it returns this error:
/usr/bin/ld: /tmp/cciVVddg.o: relocation R_X86_64_32 against `.data' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
I also tried
gcc -fPIC -nostdlib -o hello hello.s
There are two problems with your code:
1. Your code is intended to be 32-bit code, but gcc tries to assemble it as 64-bit code. You can fix this by passing -m32 in all stages of assembly and linkage. Please keep in mind that WSL (WSL 1, at least) does not actually support 32-bit code, so you won't be able to run your program even if you manage to assemble it.
2. gcc tries to generate a position-independent executable. To make your code work in such an executable, you need to write position-independent code, which means avoiding any absolute references to the addresses of variables. In 32-bit code this is a bit tricky, and I'm not going to explain it further as 32-bit code won't run on WSL anyway. The linker advises you to recompile with -fPIC because that makes the compiler generate position-independent code from C files, but for assembly files it has no effect. You can fix this issue by linking with -no-pie, which makes the linker generate a normal position-dependent binary. Note that this still doesn't mean that a 32-bit binary is going to run in WSL.

Weird GCC behaviour with ARM assembler. ANDSEQ instruction

If I try to assemble this program:
.text
main:
andseq r1,r3,r2,lsl #13
With the command gcc -c test.s, I get the following error:
Error: bad instruction `andseq r1,r3,r2,lsl#13'
After some tries I replaced andseq with andeqs, and now it compiles fine.
But if I dump the resulting obj file with objdump -d test.o I get this:
Disassembly of section .text:
00000000 <main>:
0: 00131682 andseq r1, r3, r2, lsl #13
Note how the instruction is decoded as andseq ....
Am I missing something? Is this a bug?
My system is Raspbian GNU/Linux 8, and my gcc is: gcc (Raspbian 4.9.2-10) 4.9.2. I have also tested with gcc-8.1.0 (edit: not really, see the edit below), with the same results.
EDIT:
In fact, it seems I'm using the same binutils with gcc 8, so I really only tested this with GNU assembler (GNU Binutils for Raspbian) 2.25. I'll try a more recent assembler.
For compatibility with old assembly files, GNU as defaults to divided syntax for ARM assembly. In divided syntax, andeqs is the correct mnemonic for the instruction you desire. You can issue a .syntax unified directive to select unified syntax, in which andseq is the correct mnemonic.
GNU objdump on the other hand only knows unified syntax, which explains the apparent inconsistency.
For new developments, I advise you to consistently use unified syntax if possible.
There is a good UAL vs pre-UAL mnemonic table in ARMv8 Appendix K6, "Legacy Instruction Syntax for AArch32 Instruction Sets".
One of the entries of that table is:
Pre-UAL syntax    UAL equivalent
AND<c>S           ANDS<c>
where eq is one of the possible condition codes <c>.

Assembling with GCC causes weird relocation error with regards to .data

This is an issue that never used to occur. I'm pretty convinced it's an issue with my package repos (I recently reinstalled my Arch system and this has only just started happening).
I wrote a small hello world in x86_64:
.data
str: .asciz "Test"
.text
.globl main
main:
sub $8, %rsp
mov $str, %rdi
call puts
add $8, %rsp
ret
and then I attempt to assemble and link it using GCC, like I have done many times in the past, with simply:
gcc test.s -o test
and then this error is outputted:
/usr/bin/ld: /tmp/ccAKVV4D.o: relocation R_X86_64_32S against `.data' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
This error has never occurred for me before. I've tried to fix the issue by googling the error message, but the results are all very specific, whereas I'd consider this a general issue. I've tried reinstalling base-devel and the entire GCC toolchain. I dunno what else I can do (please don't suggest using nasm, that's heresy).
I'd like to think I'm missing something obvious, but I've used GCC for my assembly needs for a long time.
The way to get around this error is to generate a non-PIE (non-position-independent) executable:
gcc -no-pie test.s -o test
The reason for this behaviour, as explained by @Ped7g:
Debian switched to PIC/PIE binaries in 64-bit mode, and GCC in your case is trying to link your object as PIC, but it encounters an absolute address in mov $str, %rdi.
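The relocation name in the error hints at the underlying arithmetic. R_X86_64_32S asks the linker to store the symbol's address as a sign-extended 32-bit value, which only works if the final address is known at link time and fits in that range. The C++ sketch below checks that range for two illustrative addresses. Both addresses and the names fits_simm32 and check_fits_simm32 are assumptions chosen for the example, not values from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true if `address` can be encoded as a sign-extended 32-bit
// immediate, which is what an R_X86_64_32S relocation requires.
bool fits_simm32(std::uint64_t address) {
    const auto value = static_cast<std::int64_t>(address);
    return value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max();
}

// Illustrative addresses (assumptions, not taken from the question):
// a low address typical of a non-PIE link, and a high address typical
// of where a PIE might be loaded at run time.
constexpr std::uint64_t kNonPieExample = 0x404028;
constexpr std::uint64_t kPieExample    = 0x555555558028;

void check_fits_simm32() {
    assert(fits_simm32(kNonPieExample));  // fits: an absolute reference works
    assert(!fits_simm32(kPieExample));    // does not fit in 32 bits
}
```

If you would rather keep the default PIE link instead of passing -no-pie, the usual approach is to avoid the absolute address altogether with RIP-relative addressing, for example lea str(%rip), %rdi, and to call library functions through the PLT, as in call puts@PLT.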

Problem building ECOS for "Linux Synthetic" target

I'm trying to build a Synthetic Linux target with ECOS. My software environment:
Ubuntu 11.4
GCC 4.5.2
ECOS 3.0
In the Config Tool I have set up the "Linux Synthetic" target with "all" packages. When I press F7 (build) the compilation starts, but later it says:
/opt/ecos/ecos-3.0/packages/hal/synth/i386linux/v3_0/src/syscall-i386-linux-1.0.S: Assembler messages:
make: Leaving directory `/opt/ecos/linux_build'
/opt/ecos/ecos-3.0/packages/hal/synth/i386linux/v3_0/src/syscall-i386-linux-1.0.S:457: Error: .size expression for __restore_rt does not evaluate to a constant
/opt/ecos/ecos-3.0/packages/hal/synth/i386linux/v3_0/src/syscall-i386-linux-1.0.S:457: Error: .size expression for __restore does not evaluate to a constant
make: [src/syscall-i386-linux-1.0.o.d] Error 1
make: [build] Error 2
The content of the file /opt/ecos/ecos-3.0/packages/hal/synth/i386linux/v3_0/src/syscall-i386-linux-1.0.S from the line 434 is:
// ----------------------------------------------------------------------------
// Special support for returning from a signal handler. In theory no special
// action is needed, but with some versions of the kernel on some
// architectures that is not good enough. Instead returning has to happen
// via another system call.
.align 16
.global cyg_hal_sys_restore_rt
cyg_hal_sys_restore_rt:
movl $SYS_rt_sigreturn, %eax
int $0x80
1:
.type __restore_rt,#function
.size __restore_rt,1b - __restore_rt
.align 8
.global cyg_hal_sys_restore
cyg_hal_sys_restore:
popl %eax
movl $SYS_sigreturn, %eax
int $0x80
1:
.type __restore,#function
.size __restore,1b - __restore
So __restore and __restore_rt are undefined.
I've tried commenting out this part and removing the signal-related packages (the comment says it is signal handler stuff), but it looks to be a base part of the ECOS kernel. The build seems to succeed when these parts are commented out, but when I compile the example apps there are linker errors because of the missing symbols (cyg_hal_sys_restore).
Silly idea, but I've also tried replacing "__restore" with "cyg_hal_sys_restore", and the "..._rt" variant the same way, just to eliminate the undefined symbols (not really expecting the wrong code to work). The result: the build is OK (as there are no undefined symbols), compiling the examples is OK (as there are no missing symbols), but the example a.out segfaults the moment I start it.
Halp, pls., I'm not familiar with inline asm nor ECOS.
The problem seems to be related to binutils. On Debian, a downgrade to 2.20.1-16 worked for me.
http://ecos.sourceware.org/ml/ecos-discuss/2011-06/msg00010.html
EDIT: Follow link, there's a proper fix too.
