Junks ax after expression - gcc

This function is meant to wrap the IN instruction for I/O ports using inline assembly in C:
int func(short port)
{
short data;
__asm__(
".intel_syntax \n\t"
"in byte %0, %1\n\t"
".att_syntax\n\t"
:"=a" (data)
:"dN" (port)
:
);
return data;
}
But when I compile the C code, the assembler reports:
file.c:5: Error: junk `ax' after expression
Here is the compile command:
gcc -ffreestanding -m32 -c file.c -o file.o
I tried viewing the generated assembly with the -S flag, but it seems to be fine.


Why does macOS kill static executables created by clang?

I have a minimal C program for the M1 ARM CPU that returns 42:
void _start() {
asm("mov x0, #42;");
asm("mov x16, #1;");
asm("svc 0x80;");
}
This code compiles after telling clang to use the _start symbol and returns the correct value.
clang -Wl,-e, -Wl,__start test.c -o dyn.out
./dyn.out ; echo $?
42
However this binary still has dynamic links according to otool:
otool -L ./dyn.out
./dyn.out:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)
After telling clang to produce an unsigned static binary, however, macOS immediately kills the binary when I try to run it.
clang -Wl,-e, -Wl,__start -static -nostdlib test.c -o static_no_sign.out
zsh: killed ./static_no_sign.out
Signing the binary before running also produces the same problem.
clang -Wl,-e, -Wl,__start -static -nostdlib test.c -o static_sign.out
codesign -s - static_sign.out
./static_sign.out
zsh: killed ./static_sign.out
The following messages are produced in Console:
taskgated: UNIX error exception: 3
taskgated: no signature for pid=93166 (cannot make code: UNIX[No such process])
But codesign can verify the signature
codesign -v -v static_sign.out
static_sign.out: valid on disk
static_sign.out: satisfies its Designated Requirement
Can anyone clarify why macOS is deciding to kill the clang produced binaries?
Because static binaries are explicitly disallowed on any architecture other than x86_64.
XNU contains this code piece in the Mach-O loader:
case MH_EXECUTE:
if (depth != 1 && depth != 3) {
return LOAD_FAILURE;
}
if (header->flags & MH_DYLDLINK) {
/* Check properties of dynamic executables */
if (!(header->flags & MH_PIE) && pie_required(header->cputype, header->cpusubtype & ~CPU_SUBTYPE_MASK)) {
return LOAD_FAILURE;
}
result->needs_dynlinker = TRUE;
} else if (header->cputype == CPU_TYPE_X86_64) {
/* x86_64 static binaries allowed */
} else {
/* Check properties of static executables (disallowed except for development) */
#if !(DEVELOPMENT || DEBUG)
return LOAD_FAILURE;
#endif
}
break;
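To make the branch structure easier to follow, here is a rough user-space model of that check. It is only a sketch: exec_image_allowed is a hypothetical helper (not an XNU function), the depth test is left out, and the two boolean parameters stand in for pie_required() and the DEVELOPMENT || DEBUG build switch. The flag and CPU type values are the ones from the Mach-O headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in <mach-o/loader.h> and <mach/machine.h>. */
#define MH_DYLDLINK      0x4u
#define MH_PIE           0x200000u
#define CPU_TYPE_X86_64  0x01000007
#define CPU_TYPE_ARM64   0x0100000c

/* Hypothetical helper: returns true when the MH_EXECUTE branch quoted
 * above would let the image through (depth check omitted).
 * pie_required and development_kernel are assumptions standing in for
 * pie_required() and the DEVELOPMENT || DEBUG kernel configuration. */
static bool exec_image_allowed(uint32_t flags, int32_t cputype,
                               bool pie_required, bool development_kernel)
{
    if (flags & MH_DYLDLINK) {
        /* Dynamic executable: only a missing PIE flag can reject it. */
        return (flags & MH_PIE) || !pie_required;
    }
    if (cputype == CPU_TYPE_X86_64) {
        return true;            /* static x86_64 binaries are allowed */
    }
    return development_kernel;  /* other static binaries: dev kernels only */
}
```

With a release kernel (development_kernel false), a static arm64 image falls into the last branch and is rejected, which matches the kill seen in the question.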
If you do the exact same thing on x86_64, it works:
void _start()
{
__asm__ volatile
(
".intel_syntax noprefix\n"
"mov eax, 0x2000001\n"
"mov edi, 42\n"
"syscall"
);
}
% clang -Wl,-e,__start -static -nostdlib t.c -o t -arch x86_64
% ./t
% echo $?
42
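In case the constant 0x2000001 in the x86_64 version looks odd: on x86_64 the syscall number carries a class in its upper bits, and the BSD/Unix class is 2, shifted left by 24. The sketch below just rebuilds that value; unix_syscall and SYS_EXIT_NUMBER are names made up for this illustration, not XNU identifiers.

```c
#include <stdint.h>

#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_UNIX  2u   /* BSD/Unix syscall class */
#define SYS_EXIT_NUMBER     1u   /* hypothetical name; exit is BSD syscall 1 */

/* Combine the Unix class with a BSD syscall number, giving the value
 * that goes into eax before the syscall instruction on x86_64. */
static uint32_t unix_syscall(uint32_t number)
{
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | number;
}
```

The arm64 version in the question passes the plain number 1 in x16 instead, so no class bits are needed there.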

When compiled inb and outb in inline assembly produce "Error: operand type mismatch"

I'm trying to write the simplest possible kernel for a 64-bit architecture, and I'm having trouble with keyboard input.
I'm currently implementing these two functions to manage I/O:
unsigned char inportb (unsigned short _port)
{
unsigned char rv;
__asm__ __volatile__ ("inb %1, %0" : "=a" (rv) : "dN" (_port));
return rv;
}
void outportb (unsigned short _port, unsigned char _data)
{
__asm__ __volatile__ ("outb %1, %0" : : "dN" (_port), "a" (_data));
}
But I'm getting this assembler error (translated from the Spanish locale output):
main.c: Assembler messages:
main.c:51: Error: operand type mismatch for `in'
main.c:61: Error: operand type mismatch for `out'
My guess is that this code (that I got from http://www.osdever.net/bkerndev/Docs/creatingmain.htm) is designed for 32-bit assembly.
Any help on how to solve my problem would be greatly appreciated.
I build and run everything with this script:
#!/bin/bash
nasm -f bin boot.asm -o boot.bin
nasm -f elf64 loader.asm -o loader.o
#cc -m64 -ffreestanding -fno-builtin -nostdlib -c main.c
cc -m64 -masm=intel -c main.c
ld -Ttext 0x100000 -o kernel.elf loader.o main.o
objcopy -R .note -R .comment -S -O binary kernel.elf kernel.bin
dd if=/dev/zero of=image.bin bs=512 count=2880
dd if=boot.bin of=image.bin conv=notrunc
dd if=kernel.bin of=image.bin conv=notrunc bs=512 seek=1
rm ./boot.bin ./kernel.bin ./main.o ./loader.o ./kernel.elf
qemu-system-x86_64 image.bin
By default GCC uses AT&T assembly syntax when generating assembly code from C code. This can be overridden by using the -masm=intel GCC compile option. In the update to your question you have -masm=intel in your GCC command line:
cc -m64 -masm=intel -c main.c
The code you found was written for AT&T syntax, where the source operand of an instruction comes first and the destination second. The -masm=intel option reverses that order. You have two choices. The first is to reverse the operands in the inline assembly so they are destination, source (Intel syntax), like this:
unsigned char inportb (unsigned short _port)
{
unsigned char rv;
__asm__ __volatile__ ("inb %0, %1" : "=a" (rv) : "dN" (_port));
return rv;
}
void outportb (unsigned short _port, unsigned char _data)
{
__asm__ __volatile__ ("outb %0, %1" : : "dN" (_port), "a" (_data));
}
The other option is to remove the -masm=intel option from your GCC command line and keep the code as is. This might be preferable, as a significant amount of OS development code uses AT&T syntax for inline assembly.
Note: You might want to consider using gcc instead of just cc.
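A third possibility, if you want the same source to build with or without -masm=intel, is GCC's dialect-alternative syntax {att|intel} inside the asm template. The sketch below uses a harmless register move instead of in/out so that it can run in user space; copy_u32 is a hypothetical name, and the non-x86 fallback is only there so the example compiles everywhere.

```c
/* Copies src to the result. On x86 the move goes through inline asm
 * written once for both dialects; elsewhere plain C is used so the
 * sketch still builds. copy_u32 is a hypothetical name. */
static unsigned int copy_u32(unsigned int src)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int dst;
    /* AT&T expands to:  movl %1, %0   (source, destination)
     * Intel expands to: mov  %0, %1   (destination, source) */
    __asm__ ("mov{l %1, %0| %0, %1}" : "=r" (dst) : "r" (src));
    return dst;
#else
    return src;
#endif
}
```

Assuming the same constraints as in inportb, the port read could presumably be written with a template like in{b %1, %0| %0, %1}, but I have only exercised the pattern with the move shown here.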

Have GAS generate instruction from inline assembly?

I'm trying to assemble a file that uses ARM's CRC instruction. The assembler rejects it with: Error: selected processor does not support 'crc32b w1,w0,w0'.
There are runtime checks in place, so we are safe with the instruction. The technique works fine on i686 and x86_64. For example, I can assemble a file that uses Intel CRC intrinsics or SHA Intrinsics without -mcrc or -msha (and on a machine without the features).
Here is the test case:
$ cat test.cxx
#include <arm_neon.h>
#define GCC_INLINE_ATTRIB __attribute__((__gnu_inline__, __always_inline__, __artificial__))
#if defined(__GNUC__) && !defined(__ARM_FEATURE_CRC32)
__inline unsigned int GCC_INLINE_ATTRIB
CRC32B(unsigned int crc, unsigned char v)
{
unsigned int r;
asm ("crc32b %w2, %w1, %w0" : "=r"(r) : "r"(crc), "r"((unsigned int)v));
return r;
}
#else
// Use the intrinsic
# define CRC32B(a,b) __crc32b(a,b)
#endif
int main(int argc, char* argv[])
{
return CRC32B(argc, argc);
}
And here is the result:
$ g++ test.cxx -c
/tmp/ccqHBPUf.s: Assembler messages:
/tmp/ccqHBPUf.s:23: Error: selected processor does not support `crc32b w1,w0,w0'
Placing the ASM code in a source file and compiling with different options is not feasible because CRC32B will be used in C++ header files, too.
How do I get GAS to assemble the instruction?
GCC's configuration and options are the reason we are trying to do things this way. Users don't read manuals, so they won't add -march=armv8-a+crc+crypto -mtune=cortex-a53 to CFLAGS and CXXFLAGS.
In addition, distros compile to a "least capable" machine, so we want the hardware acceleration routines available. When the library is provided by a distro like Linaro, both code paths (software CRC and hardware accelerated CRC) will be available.
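For context, the software code path could look something like the sketch below. It assumes the usual bit-reflected CRC-32 polynomial 0xEDB88320, which is what the CRC32B instruction implements; crc32b_soft and crc32_buffer are hypothetical names, not the library's actual functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Software counterpart of one CRC32B step: folds one byte into crc using
 * the reflected CRC-32 polynomial 0xEDB88320, with no inversion of the
 * input or output (the caller does that, as with the instruction). */
static uint32_t crc32b_soft(uint32_t crc, uint8_t v)
{
    crc ^= v;
    for (int bit = 0; bit < 8; ++bit) {
        /* Shift out one bit; apply the polynomial if that bit was set. */
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Conventional CRC-32 of a buffer, built on the byte step above. */
static uint32_t crc32_buffer(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;   /* standard initial value */
    while (len--) {
        crc = crc32b_soft(crc, *p++);
    }
    return ~crc;                  /* standard final inversion */
}
```

The runtime check would then pick between this routine and the hardware CRC32B wrapper.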
The machine is a LeMaker HiKey, which is ARMv8/AArch64. It has an A53 processor with CRC and Crypto (both are optional under the architecture):
$ cat /proc/cpuinfo
Processor : AArch64 Processor rev 3 (aarch64)
processor : 0
...
processor : 7
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: AArch64
GCC lacks most of the usual defines one expects to be present by default:
$ g++ -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)'
#define __aarch64__ 1
#define __AARCH64_CMODEL_SMALL__ 1
#define __AARCH64EL__ 1
Using GCC's -march=native does not work on ARM:
$ g++ -march=native -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)'
cc1: error: unknown value ‘native’ for -march
And Clang:
$ clang++ -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)'
#define __AARCH64EL__ 1
#define __ARM_64BIT_STATE 1
#define __ARM_ACLE 200
#define __ARM_ALIGN_MAX_STACK_PWR 4
#define __ARM_ARCH 8
#define __ARM_ARCH_ISA_A64 1
#define __ARM_ARCH_PROFILE 'A'
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FP 0xe
#define __ARM_FP16_FORMAT_IEEE 1
#define __ARM_FP_FENV_ROUNDING 1
#define __ARM_NEON 1
#define __ARM_NEON_FP 0xe
#define __ARM_PCS_AAPCS64 1
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __ARM_SIZEOF_WCHAR_T 4
#define __aarch64__ 1
GCC version:
$ gcc -v
...
gcc version 4.9.2 (Debian/Linaro 4.9.2-10)
GAS version:
$ as -v
GNU assembler version 2.24 (aarch64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.24
This answer came from Jiong Wang on the Binutils mailing list. It bypasses GAS's architectural requirements and plays well with GCC:
__inline unsigned int GCC_INLINE_ATTRIB
CRC32W(unsigned int crc, unsigned int val)
{
#if 1
volatile unsigned int res;
asm ("\n"
"\t" ".set reg_x0, 0\n"
"\t" ".set reg_x1, 1\n"
"\t" ".set reg_x2, 2\n"
"\t" ".set reg_x3, 3\n"
"\t" ".set reg_x4, 4\n"
"\t" ".set reg_x5, 5\n"
"\t" ".set reg_x6, 6\n"
"\t" ".set reg_x7, 7\n"
"\t" "#crc32w %w0, %w1, %w2\n"
"\t" ".inst 0x1ac04800 | (reg_%2 << 16) | (reg_%1 << 5) | (reg_%0)\n"
: "=r"(res) : "r"(crc), "r"(val)
);
return res;
#else
volatile unsigned int res;
asm (".cpu generic+fp+simd+crc+crypto \n"
"crc32w %w0, %w1, %w2 \n"
: "=r"(res) : "r"(crc), "r"(val));
return res;
#endif
}
The second variant, disabled by the #if/#else block, was suggested by Nick Clifton on the Binutils mailing list. The idea is that GCC generates code using the ISA selected by -march=XXX, so it does not matter if we raise the assembler's capabilities just to get the instruction accepted. We decided to go with Wang's answer because we did not want potential side effects from modifying the .cpu setting.
And the verification with GCC 4.8 and Binutils 2.24:
$ g++ -O1 test.cxx -c
$ objdump --disassemble test.o
test.o: file format elf64-littleaarch64
Disassembly of section .text:
0000000000000000 <main>:
0: 12001c01 and w1, w0, #0xff
4: 1ac14800 crc32w w0, w0, w1
8: d65f03c0 ret
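To see what the .inst expression computes, the same arithmetic can be written as a small C function. This is only an illustration: encode_crc32w is a hypothetical helper, and the register numbers are what the reg_x0 ... reg_x7 symbols resolve to once the operand names are pasted onto the reg_ prefix.

```c
#include <stdint.h>

/* Mirrors the .inst expression: base opcode 0x1ac04800 with the third
 * operand (Rm) shifted to bit 16, the second (Rn) to bit 5, and the
 * destination (Rd) in the low bits.
 * encode_crc32w is a hypothetical helper name. */
static uint32_t encode_crc32w(unsigned rd, unsigned rn, unsigned rm)
{
    return 0x1ac04800u | ((uint32_t)rm << 16) | ((uint32_t)rn << 5) | rd;
}
```

For crc32w w0, w0, w1 this gives 0x1ac14800, the word shown at offset 4 in the disassembly. Note that the table as written only defines reg_x0 through reg_x7, so it presumably needs more .set lines if the compiler picks a higher register.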

gcc warning different when using --preprocessed

When compiling ltrace with icecc we run into a compilation problem. This is the minimal example:
main.c
#include <assert.h>
int main(int argc, char **argv) {
assert(argc != argc);
return 0;
}
test.sh:
#!/bin/bash
set -x
# one step compilation (no warning)
gcc -Wall main.c
# split compilation (warning)
gcc -Wall -E main.c -o main.i
gcc -Wall --preprocessed main.i
output:
++ gcc -Wall main.c
++ gcc -Wall -E main.c -o main.i
++ gcc -Wall --preprocessed main.i
main.c: In function ‘main’:
main.c:4:10: warning: self-comparison always evaluates to false [-Wtautological-compare]
assert(argc != argc);
^~
As you can see, the result differs between compiling in one step and preprocessing and compiling in two steps. Is this intended behavior?
I use gcc 6.3, the issue also appears in gcc 6.2 for ARM. I also cannot ignore this, as the full example uses -Werror.

Why doesn't 'ld' warn when linking '-mthumb-only' object files without '-mthumb'?

In my setup using gcc-arm-none-eabi 4.8 and binutils 2.26 I get pretty undefined behavior when compiling the object files separately with -mthumb but leaving that flag out in the final linking step using ld without getting any warning from the linker. Why is that the case?
The undefined behavior is probably (according to the very helpful FOSS developers I have asked first) due to the default multilib architecture chosen by the linker due to the lack of the flag. However, why doesn't the linker warn about this issue? Can't it easily detect the ISA of the linked functions from the object and library files to determine that something fishy is going on?
Ideally, yes, it should know from the objects being linked. Disassemble and examine the interaction between functions in different objects.
I was recently trying to demonstrate how good the GNU tools and linker are at adding trampolines, and it consistently failed miserably: it would put the trampoline in for one direction but not the other (Thumb to/from ARM). I think this was an assembly to/from C issue, and I didn't use the plethora of directives that the compiler emits, so that was probably the cause.
It is fairly simple to set up some test cases: make an object out of functions built for ARM, make another out of functions built for Thumb, and have them call each other. Put enough structure around it to get the tools to link without error, then disassemble and examine.
armstart.s
.global _start
_start:
bl notmain
b hang
hang: b .
notmain.c
extern unsigned int one ( unsigned int );
int notmain ( void )
{
one(1);
return(0);
}
one.c
extern unsigned int two ( unsigned int x );
unsigned int one ( unsigned int x )
{
return(two(x+5));
}
two.c
extern unsigned int one ( unsigned int x );
unsigned int two ( unsigned int x )
{
return(one(x+7));
}
Makefile
ARMGNU = arm-none-eabi
#ARMGNU = arm-linux-gnueabi
AOPS = --warn --fatal-warnings
COPS = -Wall -O2 -nostdlib -nostartfiles -ffreestanding
all : notmain.bin
clean:
rm -f *.bin
rm -f *.o
rm -f *.elf
rm -f *.list
rm -f *.bc
rm -f *.opt.s
rm -f *.norm.s
rm -f *.hex
armstart.o : armstart.s
$(ARMGNU)-as $(AOPS) armstart.s -o armstart.o
notmain.o : notmain.c
$(ARMGNU)-gcc $(COPS) -c notmain.c -o notmain.o
two.o : two.c
$(ARMGNU)-gcc $(COPS) -mthumb -c two.c -o two.o
one.o : one.c
$(ARMGNU)-gcc $(COPS) -c one.c -o one.o
notmain.bin : memmap armstart.o notmain.o one.o two.o
$(ARMGNU)-ld -o notmain.elf -T memmap armstart.o notmain.o one.o two.o
$(ARMGNU)-objdump -D notmain.elf > notmain.list
$(ARMGNU)-objcopy notmain.elf notmain.hex -O ihex
$(ARMGNU)-objcopy notmain.elf notmain.bin -O binary
Some of the disassembly:
20000024 <one>:
20000024: e92d4010 push {r4, lr}
20000028: e2800005 add r0, r0, #5
2000002c: eb000007 bl 20000050 <__two_from_arm>
20000030: e8bd4010 pop {r4, lr}
20000034: e12fff1e bx lr
20000038 <two>:
20000038: b510 push {r4, lr}
2000003a: 3007 adds r0, #7
2000003c: f000 f804 bl 20000048 <__one_from_thumb>
20000040: bc10 pop {r4}
20000042: bc02 pop {r1}
20000044: 4708 bx r1
20000046: 46c0 nop ; (mov r8, r8)
20000048 <__one_from_thumb>:
20000048: 4778 bx pc
2000004a: 46c0 nop ; (mov r8, r8)
2000004c: eafffff4 b 20000024 <one>
20000050 <__two_from_arm>:
20000050: e59fc000 ldr ip, [pc] ; 20000058 <__two_from_arm+0x8>
20000054: e12fff1c bx ip
20000058: 20000039 andcs r0, r0, r9, lsr r0
2000005c: 00000000 andeq r0, r0, r0
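One detail worth pointing out in the __two_from_arm veneer: the literal it loads is 0x20000039, not 0x20000038. Bit 0 of a bx target selects the instruction set, so the linker stores the address of two with that bit set. A tiny sketch of the convention, with hypothetical helper names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Address to hand to BX for a function at symbol_addr: bit 0 set means
 * "switch to Thumb", bit 0 clear means "switch to ARM".
 * interwork_address is a hypothetical helper name. */
static uint32_t interwork_address(uint32_t symbol_addr, bool is_thumb)
{
    return is_thumb ? (symbol_addr | 1u) : (symbol_addr & ~1u);
}

/* True if a BX to branch_addr would land in Thumb state. */
static bool targets_thumb(uint32_t branch_addr)
{
    return (branch_addr & 1u) != 0;
}
```

So interwork_address(0x20000038, true) yields the 0x20000039 seen in the listing, while the plain branch to one at 0x20000024 stays in ARM state after __one_from_thumb has switched modes with bx pc.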
And the toolchain in this case:
arm-none-eabi-gcc (GCC) 6.1.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
(binutils 2.26.20160125)
Worked quite nicely.
Likewise with gcc 5.4.0 and binutils 2.26.1.
Looking at the differences between readelf for the two objects we see for example:
< 00000004 00000a1d R_ARM_JUMP24 00000000 two
---
> 00000004 00000a0a R_ARM_THM_CALL 00000000 one
"... compiling the object files separately with -mthumb but leaving that flag out in the final linking step using ld without getting any warning from the linker. Why is that the case?"
The resulting executable should work fine on most ARM platforms, except for the microcontroller profiles (Cortex-M). There is even a good reason to mix Thumb code with ARM libraries: code size.
