"Inconsistent Operand Constraints" with Inline ASM using GCC

Getting this error when trying to compile this source file using GCC:
https://github.com/wolf9466/cpuminer-multi/blob/master/cryptonight_aesni.c
"cryptonight_aesni.c:162:4: error: inconsistent operand constraints"
Specifically:
uint64_t hi, lo;
// hi,lo = 64bit x 64bit multiply of c[0] and b[0]
__asm__("mulq %3\n\t"
: "=d" (hi),
"=a" (lo)
: "%a" (c[0]),
"rm" (b[0])
: "cc" );
It is very difficult to find out what this error even means, let alone how to fix it.

The mulq instruction in this code is a 64-bit x86 instruction. All of its operands are 64-bit values, which cannot fit in 32-bit registers when compiling for a 32-bit x86 platform.
– Michael Petch
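One way to sidestep the constraint problem described in the comment above is to avoid inline asm for the multiply. The following is a minimal sketch, assuming GCC or Clang; mul64x64 is a hypothetical helper name and is not part of cpuminer-multi. Where the compiler provides unsigned __int128 (64-bit targets), it lets the compiler emit the widening multiply itself; elsewhere it falls back to four 32-bit partial products.

```c
#include <stdint.h>

/* Hypothetical helper (not from cpuminer-multi): full 64x64 -> 128-bit
 * unsigned multiply. Returns the low 64 bits and stores the high 64 bits
 * in *hi. */
static uint64_t mul64x64(uint64_t a, uint64_t b, uint64_t *hi)
{
#if defined(__SIZEOF_INT128__)
    /* 64-bit GCC/Clang targets: the compiler picks the widening multiply. */
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    return (uint64_t)p;
#else
    /* 32-bit targets: combine four 32x32 -> 64 partial products. */
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Middle column plus the carry out of p0; this sum fits in 64 bits. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return (mid << 32) | (uint32_t)p0;
#endif
}
```

In the code from the question, this would be called as lo = mul64x64(c[0], b[0], &hi); and should build for both 32-bit and 64-bit x86, although the 32-bit path will be slower than a single mulq.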

Related

NVCC Compiler Crashes after hours of compilation

For some reason, NVCC is crashing when trying to compile a GPU program with very long double-precision arithmetic expressions, of the form
// given double precision arrays A[ ], F[ ],
__global__ void myKernel(double *A, double *F, long n){
//.......thread ids
A[t+1] = A[t]*F[t+1] + ...... (order of million of terms).... + A[t-1]*F[t]
}
The same code does get compiled successfully with GCC (around 30 minutes compiling) and even executes correctly.
The crash occurs after more than 20 hours of compilation time, with the following error:
nvcc error : 'cicc' died due to signal 11 (Invalid memory reference)
nvcc error : 'cicc' core dumped
make: *** [Makefile:28: obj/EquationAlfa.cu.o] Error 139
As a side note, if we change the GPU program to 32-bit float, then it does compile correctly, although still taking hours to compile.
The compilation line is this:
nvcc -std=c++14 -c -o obj/EquationAlfa.cu.o src/EquationAlfa.cu
NVCC version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
During compilation, the RAM usage does not exceed 8~10GB, and the computer has 128GB of RAM.
Any help or directions would be greatly appreciated.

ARM GCC inline assembly error on %w operand

I have an ARMv8 inline assembly segment:
/* get leading 0 of cache ways */
__asm__ __volatile__
(
"CLZ %w[shift], %w[maxWay] \n"
: [shift] "=r" (uiShift)
: [maxWay] "r" (uiMaxWay)
);
When compiled with the ARM GCC compiler, it fails with an error on the %w operand.
Interestingly, if I compile with the Linaro compiler, there is no problem.
Is there a problem in ARM GCC compiler, or in my code?
Unlike x86 where the same compiler can produce x86-32 or x86-64 code with -m32 and -m64, you need a separate build of gcc for ARM vs. AArch64.
ARM gcc accepts -march=armv8-a, but it's still compiling in 32-bit ARM mode, not AArch64.
I can reproduce your problem on the Godbolt compiler explorer with AArch64 gcc and ARM gcc. (And I included an example that uses __builtin_clz(uiShift) instead of inline asm, so it compiles to a clz instruction on either architecture.)
BTW, you could have left out the w size override on both operands, and simply use unsigned int for the input and output. Then the same inline asm would work with both ARM and AArch64. (But __builtin_clz is still better, because the compiler understands what it does. e.g. it knows the result is in the range 0..31, which may enable some optimizations.)
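As a rough illustration of the __builtin_clz suggestion above, here is a minimal sketch assuming GCC or Clang and a 32-bit unsigned int. The helper name leading_zeros32 is hypothetical. Because it contains no inline asm, the same source should build with an ARM, AArch64, or x86 compiler.

```c
/* Hypothetical helper: count the leading zero bits of a 32-bit value
 * without inline asm. __builtin_clz is a GCC/Clang builtin whose result
 * is undefined for 0, so that case is handled explicitly. */
static unsigned int leading_zeros32(unsigned int x)
{
    if (x == 0)
        return 32;  /* matches what the ARM CLZ instruction returns for 0 */
    return (unsigned int)__builtin_clz(x);
}
```

With the variables from the question, the call would be uiShift = leading_zeros32(uiMaxWay);.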

Pascal, Ordinal error

I need to run a loop 10 billion times but cannot get it to run. Please help me get this done. I am getting an ordinal error.
program kittu;
var i:qword;
j:qword;
k:qword;
begin
i:= 10000000000;
k:= 0;
for j:=1 to i do
begin
k:=k+1;
end;
writeln(k);
readln();
end.
From the Free Pascal docs for this error message:
Error: Ordinal expression expected The expression must be of ordinal
type, i.e., maximum a Longint. This happens, for instance, when you
specify a second argument to Inc or Dec that doesn’t evaluate to an
ordinal value.
Your loop variable j is declared as qword, which is 64 bits wide. Longint is 32 bits.
The for statement is platform dependent.
Observation: qword is not supported as a counter variable on a 32-bit platform.
However, there seems to be no documentation stating which data types are supported as counter variables.
I tried on both 32-bit and 64-bit platforms:
32-bit:
Changing the declaration of variable j to dword lets it compile successfully.
It must also be compiled in release mode to avoid an error due to overflow.
Compiler: Free Pascal IDE for Win32 for i386
Target CPU: i386
Version 1.0.12 2017/02/13
Compiler Version: 3.0.2
Environment: Win10
edit:
Successfully compiled with the i386 Free Pascal using the x86_64 cross compiler
on 64-bit Win10 (edit2: in the left-hand side's command line)
[Image]
Guess: the counter in the for statement might be kept in a register as an optimization. Under the i386 configuration, a qword is too large for a 32-bit register.
64-bit:
[Image]
But it seems to work fine on a 64-bit platform.
Compiler: Free Pascal Compiler version 3.0.2 [2017/03/18] for x86_64
Environment: Mac OSX 10.11.6

How to address errors from gcc cross compiler for ARM7 target [duplicate]

So I have been doing an assembly tutorial, and I got stuck in the very beginning.
Project name: asmtut.s
The Code:
.text
.global _start
start:
MOV R0, #65
MOV R7, #1
SWI 0
Right from the start I'm greeted by 3 error messages when I run this line: as -o asmtut.o asmtut.s
asmtut.s:6: Error: expecting operand after ','; got nothing
asmtut.s:7: Error: expecting operand after ','; got nothing
asmtut.s:9: Error: no such instruction: 'swi 0'
I'm confused, because this is the exact code in the tutorial, and there it works completely fine.
Can anyone help me figure out what could cause this?
You're trying to use an x86 assembler to assemble ARM code. They use different instruction sets and syntax.
The native gcc and as tools on your x86 Linux system will choke, just like if you tried to compile C++ with a Java compiler or vice versa. For example, # is the comment character in GAS x86 syntax, so mov r0, is a syntax error before it even gets to the point of noticing that r0 isn't a valid x86 register name.
You're following a tutorial for assembly on the Raspberry Pi (an ARM architecture) on an x86-based PC. Either run as on the Raspberry Pi, or install a cross-compile toolchain for Raspberry Pi/ARM.
Some Linux distros have packages that provide arm-linux-gnueabi-as and ...-gcc. For example, https://www.acmesystems.it/arm9_toolchain has details for Ubuntu.
To actually run the resulting binaries, you'd either run them on your ARM system, or you'd need an ARM emulator like qemu-arm. "How to single step ARM assembly in GDB on QEMU?" and "How to run a single line of assembly, then see [R1] and condition flags" have walkthroughs of doing that.

i386 movsxw instruction in x86_64

I am trying to compile Apple's Libm (version 2026, tarball here). The only file that is failing to compile properly is Source/Intel/frexp.s because
/<path>/Libm-2026/Source/Intel/frexp.s:239:5:
error: invalid instruction mnemonic 'movsxw'
movsxw 8+(8 + (32 - 8))(%rsp), %eax
^~~~~~
/<path>/Libm-2026/Source/Intel/frexp.s:291:5:
error: invalid instruction mnemonic 'movsxw'
movsxw 8(%rsp), %eax
^~~~~~
Looking around on the Internet I can only find very scanty details of the movsxw instruction but it does appear to exist for i386 architectures. I am running OS X 10.9.3 with a Core i5 processor. The macro __x86_64__ is predefined, however it seems the __i386__ macro is NOT *.
I was under the impression that the x86_64 instruction set was fully compatible with the i386 set. Is this incorrect? I can only assume that the movsxw instruction does not exist in the x86_64 instruction set, thus my question is: what does it do, and what can I replace it with from the x86_64 instruction set?
*Checked with: clang -dM -E -x c /dev/null
The canonical AT&T syntax for movsxw is movswl, although at least some assembler versions seem to accept the former too.
movsxb : Sign-extend a byte into the second operand
movsxw : Sign-extend a word (16 bits) into the second operand
movsxl : Sign-extend a long (32 bits) into the second operand
movsxw assembles just fine for me in 64-bit mode using gcc/as (4.8.1/2.24). I don't have clang for x86 installed on this machine, but you could try specifying the size of the second operand (i.e. change movsxw to movsxwl, which would be "sign-extend word into long").
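To make "sign-extend word into long" concrete, here is a minimal sketch in C of what the instruction computes. The helper name sign_extend16 is hypothetical. It assumes a two's-complement target and a compiler such as GCC or Clang, where converting an out-of-range value to int16_t wraps (the C standard leaves that conversion implementation-defined).

```c
#include <stdint.h>

/* Hypothetical helper: the C equivalent of a 16-bit to 32-bit
 * sign-extending move (movswl in AT&T syntax). */
static int32_t sign_extend16(uint16_t w)
{
    /* Going through int16_t makes bit 15 the sign bit; widening to
     * int32_t then copies that sign bit into the upper 16 bits. */
    return (int32_t)(int16_t)w;
}
```

For example, the 16-bit pattern 0xFFFF becomes -1 (0xFFFFFFFF as a 32-bit pattern), while 0x7FFF stays 32767.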
