Inline assembly __sync_fetch_and_add and __sync_add_and_fetch - gcc

The GCC builtin __sync_fetch_and_add is an implementation of the x86 inline assembly:
asm("lock; xaddl %%eax, %2;"
:"=a" (val)
: "a" (val), "m" (*ptr) : )
How can I implement this inline assembly using the addl instruction instead of xaddl?
My other question: what would the x86 inline assembly for the builtin __sync_add_and_fetch look like?
Thanks.

Builtins do not necessarily correspond to a single well-defined chunk of assembly code. In particular, both __sync_add_and_fetch and __sync_fetch_and_add will generate lock addl instead of lock xaddl if the result is not live out of the builtin, and they may generate lock incl if the result is not live out and the second argument is known to have the value 1.
It is not clear what you mean by "how can I implement this inline assembly". Assembly is something that you write or generate, not something that you implement (unless you are writing an assembler).
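To make the difference between the two builtins concrete, here is a minimal sketch. The function names are made up for illustration; only the __sync builtins and the lock xaddl / lock addl instructions come from the discussion above. On x86 with GCC or Clang it uses extended asm; elsewhere it falls back to the builtins so that it still compiles. The asm variants are one plausible hand-written equivalent, not necessarily what the compiler emits.

```c
/* Sketch only: the helper names below are hypothetical, not part of GCC. */

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))

/* Like __sync_fetch_and_add: returns the value *ptr held before the add. */
int fetch_and_add_asm(int *ptr, int val)
{
    /* xadd stores the sum in *ptr and leaves the old *ptr in the register. */
    __asm__ __volatile__("lock; xaddl %0, %1"
                         : "+r"(val), "+m"(*ptr)
                         :
                         : "memory", "cc");
    return val;
}

/* Like __sync_add_and_fetch: returns the value *ptr holds after the add. */
int add_and_fetch_asm(int *ptr, int val)
{
    return fetch_and_add_asm(ptr, val) + val;
}

/* Result not needed: a plain lock addl is enough. */
void atomic_add_asm(int *ptr, int val)
{
    __asm__ __volatile__("lock; addl %1, %0"
                         : "+m"(*ptr)
                         : "ir"(val)
                         : "memory", "cc");
}

#else /* not x86: fall back to the builtins so the sketch still builds */

int fetch_and_add_asm(int *ptr, int val) { return __sync_fetch_and_add(ptr, val); }
int add_and_fetch_asm(int *ptr, int val) { return __sync_add_and_fetch(ptr, val); }
void atomic_add_asm(int *ptr, int val)   { (void)__sync_fetch_and_add(ptr, val); }

#endif
```

In real code the builtins are preferable, since the compiler can then pick lock addl, lock incl or lock xaddl depending on whether the result is used.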

Related

Write inline assembly without clobber lists?

Is it possible to write inline assembly (Intel syntax) with GCC or Clang, without needing to understand the clobber list "stuff"?
I'm going to guess "no" because the clobber list "stuff" ensures you don't over-write the register the compiler wrote to (immediately before your inline assembly begins)?
GNU C Basic inline asm statements (no operand/clobber lists) are not recommended for basically anything except maybe the body of an __attribute__((naked)) function. See "Why can't local variable be used in GNU C basic inline asm statements?" (Globals can't safely be used either.)
https://gcc.gnu.org/wiki/DontUseInlineAsm says to see ConvertBasicAsmToExtended for reasons not to use Basic asm statements. You can't really do anything safely in Basic asm; even asm("cli"); can get reordered with any memory accesses that aren't volatile.
If you're going to use inline asm at all (instead of writing a stand-alone function in asm, or C with intrinsics), you need to describe your string of asm instructions in exact detail to the compiler, in terms of a black box with input and/or output operands, and/or clobbers. See https://stackoverflow.com/tags/inline-assembly/info for links to guides, including some SO answers about using input / output constraints.
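As a rough illustration of the "black box" description, the sketch below shows two Extended asm statements. The helper names are hypothetical. The first has no instructions at all and only a "memory" clobber, which makes it a pure compiler barrier; the second wraps a single x86 add and declares its read-write operand, its input, and the flags clobber. On non-x86 targets the second falls back to plain C.

```c
/* Sketch only: helper names are made up for illustration. */

/* Empty template plus a "memory" clobber: emits no instruction, but the
   compiler may not move memory accesses across it. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* One instruction, fully described to the compiler. */
static inline unsigned add_u32(unsigned a, unsigned b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("addl %1, %0"
            : "+r"(a)    /* read-write operand: a is both input and output */
            : "rm"(b)    /* input: register or memory, compiler's choice   */
            : "cc");     /* add modifies the flags                         */
#else
    a += b;              /* fallback so the sketch builds on other targets */
#endif
    return a;
}
```

Because the add statement is not volatile and its result depends only on its operands, the compiler is free to drop it when the result is unused, which is usually what you want for a pure computation.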
Think hard before deciding it's really worth using GNU C inline asm for anything. If you can get the compiler to emit the same instructions another way, that's almost always better. Intrinsics or pure C allow constant-propagation optimization; inline asm doesn't (unless you do stuff like if(__builtin_constant_p(x)) { pure C version } else { inline asm version }).
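The __builtin_constant_p trick could look something like the following sketch. A 32-bit rotate is used purely as an example, and rotl32_c, rotl32_asm and ROTL32 are hypothetical names. When both arguments are compile-time constants the pure C version is chosen, so the compiler can fold the result; otherwise the inline asm version is used. On non-x86 targets the asm helper simply calls the C one.

```c
#include <stdint.h>

/* Pure C rotate-left: the compiler can constant-fold this. */
static inline uint32_t rotl32_c(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Inline asm rotate-left: opaque to constant propagation. */
static inline uint32_t rotl32_asm(uint32_t x, unsigned n)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* rol takes a variable count in cl, hence the "c" constraint. */
    __asm__("roll %%cl, %0" : "+r"(x) : "c"(n) : "cc");
    return x;
#else
    return rotl32_c(x, n);   /* fallback on other targets */
#endif
}

/* Pick the foldable version when both arguments are known constants.
   __builtin_constant_p does not evaluate its argument. */
#define ROTL32(x, n)                                         \
    (__builtin_constant_p(x) && __builtin_constant_p(n)      \
         ? rotl32_c((x), (n))                                \
         : rotl32_asm((x), (n)))
```

In practice a modern compiler recognizes the C rotate idiom and emits a rotate instruction anyway, which is the point the answer makes: the asm version rarely buys anything.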
Intel syntax: in GCC, compile with -masm=intel so your asm template will be part of an Intel-syntax .s, and the compiler will substitute in operands in Intel syntax. (Like dword ptr [rsp] instead of (%rsp) for "m"(my_int)).
In clang I'm not sure there's any convenient way to use Intel-syntax in normal asm statements.
There is one other option though, if you don't care about efficient code (but then why are you using asm?): clang supports -fasm-blocks to allow syntax like MSVC's inefficient style of inline asm. And yes, this uses Intel syntax.
"Is there any way to compile a microsoft style inline-assembly code on a linux platform?" shows how inefficient the resulting code is: it is full of compiler-generated instructions that store input variables to memory so the asm{} block can read them, because MSVC-style asm blocks can't take inputs or produce outputs in registers. (Clang doesn't support the leave-a-value-in-EAX method for getting a single value out, so the output has to be stored/reloaded as well.)
You don't get to specify clobbers for this, so I assume an asm block implies a "memory" clobber, along with clobbers on all registers you write. (Or maybe even just mention.)
I would not recommend this; it's basically not possible to wrap a single instruction or handful of instructions efficiently this way. Only if you're writing a whole loop can you amortize the overhead of getting inputs into an asm{} block.

Using a user defined entry point in assembly x86-64 nasm when compiling with gcc

I recently started learning assembly and was wondering if it is possible for us to have our own defined entry point for an assembly code when compiling with gcc?
For example the standard code that compiles with gcc is
global main
section .data
section .bss
section .text
main:
I would like to change the entry point to a more defined name such as "addition", something like this below.
global addition
section .data
section .bss
section .text
addition:
The reason I'm using gcc to compile in the first place is that I'm using C library functions such as printf and scanf in my assembly code, and every time I tried to change the entry point, I got an "undefined reference to main" error.
If you are writing in assembly and not using the C runtime library, then you can call your entry point whatever you want. You tell the linker what the name of the entry point is, using either the gcc command line option -Wl,--entry=<symbol> or the ENTRY directive in the linker script. The linker writes the address of this entry point in the executable file.
If you are using the C runtime library, then the entry point in the executable file needs to be the entry point of the C runtime library, so that it can perform initialization. The startup code that provides this entry point is traditionally called crt0 (on Linux with glibc the symbol itself is _start). When the startup code finishes initializing, it calls main, so in this case, you cannot change the name.
You can put multiple labels on the same address. So you can stick the main label at whatever place you want the CRT startup code to call.
global main
main:
addition:
lea eax, [rdi+rdi] ; return argc*2
ret
I checked, and GDB chooses to show main in the disassembly for the block of code following the label, regardless of which one you declare first. (global addition doesn't help either.)
Or if you want to be able to change one line at the top of your file to select which function is the main entry point, you could maybe do
%define addition main
I'm not sure if NASM lets you create an alias or weak alias for a symbol, like GAS does with
.weakref main, addition
(see "Call a function in another object file without using PLT within a shared library?").

Assembly ".set" directive emitting symbol

In some kernel-mode assembly source I have a line that looks like this:
; excerpt #1
.set __framesize, ROUND_TO_STACK(localvarsize)
(localvarsize is a parameter to a C-preprocessor macro, if you’re wondering.) I assume that __framesize is a compile-time variable that is usable in .if statements, and is then discarded. However, I find references to a symbol named __framesize in the symbol table and disassembly of my kernel. The symbol is defined (as output by nm -m) as such:
; excerpt #2
0000000000000000 (absolute) non-external __framesize
The usage of __framesize in compiler-generated assembly is as such:
; excerpt #3
movq %gs:__framesize, %rax
movq 0x140(%rax), %r15
Given what I understand of my compiler and my kernel, excerpt #3 should be emitted as movq %gs:0x140, %r15, and that code should work. (The code that is actually being emitted from the C as excerpt #3 is causing a triple fault on the second line.)
I have two questions:
Should this __framesize symbol be emitted into my binary by the assembler when used in this fashion? If possible, how can I suppress it?
Would this usage of __framesize cause a problem like what is discussed above?
I am using GAS assembler syntax and the Xcode 7.1.1 assembler, and a Mach-O output format, if it is useful.
The GNU as manual says that .set modifies the value (i.e. address) and/or type of an existing symbol. It's synonymous with .equ, so it can be used to set or modify an assembler macro variable, or to mess around with symbols which are also labels.
If __framesize is showing up in the object file, then it's probably declared somewhere else.
Try looking at the disassembly output, to see what really happened.

Why I cannot compile the assembly codes for x64 platform with VC2010?

I am practicing mixing assembly with C++ code. I can compile the mixed code for the Win32 platform without any problem, as the following code illustrates:
int main()
{
char alphabet = 'X';
printf ("Type letter = ");
__asm
{
mov ah, 02
mov dl, [alphabet]
int 21h
}
printf ("\n");
return (0);
}
However, when I try to compile the above code for the x64 platform, it fails. The error message I get is as follows:
error C4235: nonstandard extension used : '__asm' keyword not supported on this architecture
I use VC2010 for compiling, and I was wondering why VC2010 does not support assembly language compiling and what I should do in this situation. Thanks!
The compiler simply does not support inline assembly in 64-bit code.
Your options:
write assembly code in separate .asm files and assemble and link them together with the rest of the project
include in your program pre-compiled assembly code as data in some array and execute it (you'll need to make sure the assembly code is relocatable, that is, it can be executed when placed at an arbitrary location, and you'll need to change the memory protection for the pages underneath the array to executable)
use intrinsic functions if they are sufficient
don't use assembly at all
And as it's been mentioned, chances of int 21h function 2 working in a Windows program are exactly zero. That API is only available to DOS programs.
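For this particular program the last option is the practical one. DOS function 02h simply writes the character in DL to standard output, so a plain C call does the same job on any platform. A minimal sketch, where show_letter is a hypothetical helper name:

```c
#include <stdio.h>

/* Replaces the __asm block from the question with standard C. */
int show_letter(char letter)
{
    printf("Type letter = ");
    int written = putchar(letter);  /* what int 21h / AH=02h did under DOS */
    printf("\n");
    return written;                 /* the character written, or EOF on error */
}
```

Calling show_letter('X') from main reproduces the intended behavior of the original code in both 32-bit and 64-bit builds.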

Equivalent for GCC's naked attribute

I've got an application written in pure C, mixed with some functions that contain pure ASM. Naked attribute isn't available for x86 (why? why?!) and my asm functions don't like when prologue and epilogue is messing with the stack. Is it somehow possible to create a pure assembler function that can be referenced from C code parts? I simply need the address of such ASM function.
Just use asm() outside a function block. The argument of asm() is simply ignored by the compiler and passed directly on to the assembler. For complex functions a separate assembly source file is the better option to avoid the awkward syntax.
Example:
#include <stdio.h>
asm("_one: \n\
movl $1,%eax \n\
ret \n\
");
int one();
int main() {
printf("result: %d\n", one());
return 0;
}
PS: Make sure you understand the calling conventions of your platform. Often you cannot just copy/paste assembly code.
PPS: If you care about performance, use extended asm instead. Extended asm essentially inlines the assembly code into your C/C++ code and is much faster, especially for short assembly functions. For larger assembly functions a separate assembly source file is preferable, so this answer is really a hack for the rare case that you need a function pointer to a small assembly function.
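The _one label in the example matches targets that prefix C symbols with an underscore; on ELF targets such as Linux the C name one maps to the symbol one instead. One way to sidestep the difference, sketched below under the assumption of GCC or Clang, is to pin the symbol name with an asm label on the declaration. The names asm_one and call_through_pointer are made up for illustration, and non-x86 targets get a plain C fallback so the sketch still builds.

```c
/* The asm label (a GNU C extension) fixes the assembler-level symbol name,
   so no platform-specific underscore prefix is applied to it. */
int one(void) __asm__("asm_one");

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Top-level asm: passed straight to the assembler, no prologue or epilogue. */
__asm__(".text\n"
        ".globl asm_one\n"
        "asm_one:\n"
        "    movl $1, %eax\n"   /* return value goes in eax */
        "    ret\n");
#else
int one(void) { return 1; }     /* fallback on other targets */
#endif

/* The function's address can be taken like any other C function. */
int call_through_pointer(void)
{
    int (*fn)(void) = one;
    return fn();
}
```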
Good news everyone. GCC developers finally implemented __attribute__((naked)) for x86. The feature will be available in GCC 8.
Certainly, just create a .s file (assembly source), which is run through gas (the assembler) to create a normal object file.

Resources