On Matt Godbolt's Compiler Explorer website, you can compile code using various pre-installed compilers. When using PowerPC gcc 4.8 the registers cannot be distinguished from immediates (for example addi 11,31,16).
However, when the -mregnames option is used, all registers are marked with %r followed by the register index. How do I omit just the % sign to get r1 instead of %r1?
For example, void nop () {} with gcc4.8 PowerPC -O0 -mregnames:
nop():
stwu %r1,-16(%r1)
stw %r31,12(%r1)
mr %r31,%r1
addi %r11,%r31,16
lwz %r31,-4(%r11)
mr %r1,%r11
blr
When targeting PowerPC, you basically have two options for the syntax of assembly listings:
You can either use the IBM syntax (common on IBM assemblers), where the registers do not use any type of special prefix: they are just referred to with numbers. Yes, this makes it difficult to distinguish them from immediates.
Or, you can use GNU/AT&T syntax, which always prefixes registers with % symbols (and an r, in this case). This not only makes it easier to distinguish between registers and immediates, but it also makes it possible to distinguish between integer registers (%r?) and floating-point registers (%f?).
There is no intermediate option, where you get the r (or f) prefix, but no leading %. If you need this, you can do like Jester suggested and post-process the output, using the regular expression %r[0-9]+ for matching.
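A minimal sketch of that post-processing step in Python, assuming the listing is available as text. The function name strip_percent is hypothetical, and the pattern only covers the %r and %f prefixes discussed above; other register classes would need extra alternatives.

```python
import re

# Matches a '%' directly followed by an integer or floating-point register
# name (r0..r31, f0..f31). Only the '%' is dropped; the capture group keeps
# the register name itself.
_REGISTER = re.compile(r"%([rf][0-9]+)\b")


def strip_percent(asm_text):
    """Return asm_text with %rN / %fN rewritten as rN / fN."""
    return _REGISTER.sub(r"\1", asm_text)


# Two lines taken from the listing in the question.
listing = "stwu %r1,-16(%r1)\naddi %r11,%r31,16\n"
print(strip_percent(listing), end="")
```

Immediates such as -16 and 16 are left untouched because the pattern requires the leading %.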
An update:
powerpc-linux-gnu-gcc version 5.4.0 (the default package with Ubuntu 16.04)
When using -mregnames, you can use "%r0" or "r0" or "0" format for a register name in assembly source code files.
For disassembling, powerpc-linux-gnu-objdump defaults to the "r0" format (which I agree is easier to read).
In the example from that webpage, it looks like it is showing the listing output from the compiler, instead of using objdump. I do not know of a way to control the listing output format.
I would like to create a build of my embedded C code which specifically checks that floating point operations aren't introduced into it by accident. I've tried adding +nofp to my [cortex-m3] processor architecture but GCC for ARM doesn't like that (probably because the cortex-m3 doesn't have a floating point unit). I've tried specifying -mfpu=none but that isn't a permitted option. I've tried leaving -lm off the linker command-line but the linker seems too clever to be fooled by that and is compiling code with double in it and resolving pow() anyway.
This post: https://gcc.gnu.org/legacy-ml/gcc-help/2011-07/msg00093.html from 2011 hints that GCC has no such option, since no-one is interested in it, which surprises me as it seems like a common thing to want, at least from an embedded standpoint, to avoid accidental C-library bloat.
Does anyone know of a way to do this with GCC/newlib without me having to go through and manually hack stuff out of the C library file it chooses?
It is not just a library issue. Your target will use soft-fp, and the compiler will supply floating point code to implement arithmetic operators regardless of the library.
The solution I generally apply is to scan the map file for instances of the compiler-supplied floating-point routines. If your code is "fp clean" there will be no such references. The math library and any other code that performs floating-point arithmetic operations will use these operator implementations, so you only need to look for these operator calls and can ignore the Newlib math library functions.
The internal soft-fp routines are listed at https://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html. It is probably feasible to check the map file for fp symbols manually, but you might write yourself a script or tool that scans the map file for these names to check your build. The cross-reference section of the map file will list all modules these symbols are used in, so you can use that to identify where the floating-point code is used.
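A rough sketch of such a scanner in Python. The symbol set is deliberately incomplete and only illustrative: the __addsf3-style names are from the GCC page linked above, and the __aeabi_* names are the ARM EABI equivalents you would expect on a Cortex-M3 toolchain. The function name find_soft_fp and the sample map excerpt are made up for the example.

```python
import re

# Illustrative, incomplete list of soft-float helper names. Extend it from the
# GCC internals page (and the ARM run-time ABI) before relying on it.
SOFT_FP_SYMBOLS = {
    "__addsf3", "__adddf3", "__subsf3", "__subdf3",
    "__mulsf3", "__muldf3", "__divsf3", "__divdf3",
    "__floatsisf", "__floatsidf", "__fixsfsi", "__fixdfsi",
    "__extendsfdf2", "__truncdfsf2",
    "__aeabi_fadd", "__aeabi_dadd", "__aeabi_fsub", "__aeabi_dsub",
    "__aeabi_fmul", "__aeabi_dmul", "__aeabi_fdiv", "__aeabi_ddiv",
    "__aeabi_i2f", "__aeabi_i2d", "__aeabi_f2d", "__aeabi_d2f",
}

# Split each line into identifier-like tokens so that a helper name is only
# reported on an exact match, not as a substring of a longer symbol.
_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def find_soft_fp(map_text):
    """Return {symbol: [1-based line numbers]} for soft-float helpers found."""
    hits = {}
    for lineno, line in enumerate(map_text.splitlines(), start=1):
        for token in _TOKEN.findall(line):
            if token in SOFT_FP_SYMBOLS:
                hits.setdefault(token, []).append(lineno)
    return hits


# Made-up map file excerpt, for illustration only.
sample_map = (
    " .text 0x08000100 0x40 libgcc.a(_arm_addsubdf3.o)\n"
    "       0x08000100 __aeabi_dadd\n"
    "       0x08000140 main\n"
)
for symbol, lines in sorted(find_soft_fp(sample_map).items()):
    print(symbol, "found on map line(s)", lines)
```

In a build script, a non-empty result could be treated as a failure, with the reported line numbers pointing at the map file entries to investigate.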
The Newlib stdio functions support floating-point by default. If your formatted I/O is limited to printf() you can use iprintf() instead or you can rebuild Newlib with FLOATING_POINT undefined to remove floating point support from all but scanf() (no idea why). You can then use the map file technique again to find "banned" formatted I/O functions (although these are likely to also use the floating point operator functions in any case, so you will already have spotted them indirectly).
Another option is to use a different stdio library to override the Newlib versions. There are any number of "tiny printf" implementations available that you could use. If you link such a library as object code, or list its library ahead of Newlib in the link command, it will override the Newlib versions.
Is it possible to write inline assembly (Intel syntax) with GCC or Clang, without needing to understand the clobber list "stuff"?
I'm going to guess "no" because the clobber list "stuff" ensures you don't over-write the register the compiler wrote to (immediately before your inline assembly begins)?
GNU C Basic inline asm statements (no operand/clobber lists) are not recommended for basically anything except maybe the body of an __attribute__((naked)) function. See "Why can't local variable be used in GNU C basic inline asm statements?" (globals can't safely be used either).
https://gcc.gnu.org/wiki/DontUseInlineAsm says to see ConvertBasicAsmToExtended for reasons not to use Basic asm statements. You can't really do anything safely in Basic asm; even asm("cli"); can get reordered with any memory accesses that aren't volatile.
If you're going to use inline asm at all (instead of writing a stand-alone function in asm, or C with intrinsics), you need to describe your string of asm instructions in exact detail to the compiler, in terms of a black box with input and/or output operands, and/or clobbers. See https://stackoverflow.com/tags/inline-assembly/info for links to guides, including some SO answers about using input / output constraints.
Think hard before deciding it's really worth using GNU C inline asm for anything. If you can get the compiler to emit the same instructions another way, that's almost always better. Intrinsics or pure C allow constant-propagation optimization; inline asm doesn't (unless you do stuff like if(__builtin_constant_p(x)) { pure C version } else { inline asm version }).
Intel syntax: in GCC, compile with -masm=intel so your asm template will be part of an Intel-syntax .s, and the compiler will substitute in operands in Intel syntax. (Like dword ptr [rsp] instead of (%rsp) for "m"(my_int)).
In clang I'm not sure there's any convenient way to use Intel-syntax in normal asm statements.
There is one other option though, if you don't care about efficient code (but then why are you using asm?): clang supports -fasm-blocks to allow syntax like MSVC's inefficient style of inline asm. And yes, this uses Intel syntax.
Is there any way to complie a microsoft style inline-assembly code on a linux platform? shows how inefficient the resulting code is: full of compiler-generated instructions to store input variables to memory for the asm{} block to read them. Because MSVC-style asm blocks can't do inputs or outputs in registers. (Clang doesn't support the leave-a-value-in-EAX method for getting a single value out so the output has to be stored/reloaded as well.)
You don't get to specify clobbers for this, so I assume an asm block implies a "memory" clobber, along with clobbers on all registers you write. (Or maybe even just mention.)
I would not recommend this; it's basically not possible to wrap a single instruction or handful of instructions efficiently this way. Only if you're writing a whole loop can you amortize the overhead of getting inputs into an asm{} block.
is there a simple way to strip compiler information from PE file?
Use the program "strip" which comes with fpc (in fpc/bin).
The lazarus one needs to be in the units though (lclbase?), maybe the FPC one too (compiler/version.pas would be my guess). But potentially grepping is difficult because the strings might be made with {$i %%} include meta data constructs.
To work around this, and at least get the unit, one could also try to compile everything to assembler (-a -s), and then grep the generated assembler. The assembler will contain the final form.
Strings can also get added by the linker. On Windows, FPC typically uses its internal (high speed) linkers. You can try to use the external (GNU LD) linker (-Xe) to see if that behaves differently.
I'm using mingw in Windows to compile code in C and assembly, several functions in which have the fastcall calling convention (as Microsoft defines it). If I use __fastcall in the declaration, mingw does what Windows does and name decorates:
An at sign (@) is prefixed to names; an at sign followed by the number of bytes (in decimal) in the parameter list is suffixed to names
This works fine. I have labels in assembly in the form:
.global @myfunction@4
@myfunction@4:
....code....
But this proves a big problem when I port to Linux (x86, 32 bit). Gcc suddenly does not like __fastcall (or __cdecl for that matter) and does not like @ in labels at all. I'm not sure how I can unify the two issues - either get gcc in Linux to like @ or get mingw in Windows to not add the @.
Also: I can use __attribute__((__cdecl__)) in place of __cdecl but I'm puzzled as to where it goes. I assumed before the function name itself but I see people putting it after the declaration and before the semicolon. Can I do either?
Related answer: Adding leading underscores to assembly symbols with GCC on Win32?
Name decoration appears to be a common theme when porting between operating systems, platforms and even processors on the same platform (IA32 to IA64 for example loses the underscore).
The way I solved this was to remove the @ decoration from all the functions that used it, as I didn't need to export them other than for testing. The other functions were redefined from function to _function using macros (that's what macro assemblers are for after all).
In this case I renamed the assembly code from .s to .sx (Windows platform) and used the gcc preprocessor to check for _WIN32 and thus redefine exported global symbols to have leading underscores. The same goes for calls to _calloc and _free.
Is there a site listing the various platforms and their support for GCC's atomic built-ins, for the various GCC versions?
EDIT:
To be more clear:
GCC adds __sync... as intrinsics on platforms it contains support for. On all other platforms it keeps those as normal function declarations but does not supply an implementation. This must be done by some framework.
So the question is: For which platforms does GCC supply which intrinsics without need to add a function implementation?
I'm not aware of such a list; however, http://gcc.gnu.org/projects/cxx0x.html says atomics are supported since GCC 4.4.
GCC libstdc++ implements <atomic> on top of the builtin functions `__sync_fetch_and_add' and friends ( http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Atomic-Builtins.html ).
These functions are expanded either using machine-specific expanders in the machine description of the target (usually in a file named `sync.md') or, lacking such expanders, using a CAS loop. If the presence of a `sync.md' file is any indication of proper atomics support, then you can count in MIPS, i386, ARM, BlackFin, Alpha, PowerPC, IA64 and Sparc.
[Though this is an old question, I thought I should update and complete the answer]
I am not aware of a per-architecture-version and per-gcc-version table, describing supported built-ins.
The __sync built-in functions of gcc have existed since version 4.1 (see, e.g., the gcc 4.1.2 manual). As stated there:
Not all operations are supported by all target processors. If a particular operation cannot be implemented on the target processor, a warning will be generated and a call an external function will be generated. The external function will carry the same name as the builtin, with an additional suffix `_n' where n is the size of the data type.
So, when there is not an implementation for a specific architecture, a compilation warning will appear and, I guess, a link-time error, unless you provide the required function with the appropriate name.
Since gcc 4.7 there are also the __atomic built-ins, and the __sync built-ins are considered legacy.
For example, see how Fedora uses gcc __sync and __atomic here