Translate VS inline assembler to GCC inline assembler

I found this C code that uses Visual Studio inline assembly:
ReadFromCMOS (unsigned char array [])
{
unsigned char tvalue, index;
for(index = 0; index < 128; index++)
{
_asm
{
cli /* Disable interrupts*/
mov al, index /* Move index address*/
/* since the 0x80 bit of al is not set, NMI is active */
out 0x70,al /* Copy address to CMOS register*/
/* some kind of real delay here is probably best */
in al,0x71 /* Fetch 1 byte to al*/
sti /* Enable interrupts*/
mov tvalue,al
}
array[index] = tvalue;
}
}
WriteTOCMOS(unsigned char array[])
{
unsigned char index;
for(index = 0; index < 128; index++)
{
unsigned char tvalue = array[index];
_asm
{
cli /* Clear interrupts*/
mov al,index /* move index address*/
out 0x70,al /* copy address to CMOS register*/
/* some kind of real delay here is probably best */
mov al,tvalue /* move value to al*/
out 0x71,al /* write 1 byte to CMOS*/
sti /* Enable interrupts*/
}
}
}
I tried to translate it to GNU inline assembler but failed, mostly because GNU inline asm is messy, uses the archaic AT&T syntax, and is difficult to use.
Here is the code that gives me an error:
void read_cmos(unsigned char array[])
{
unsigned char tvalue, index;
for (index = 0; index < 128; ++index)
{
/* read from CMOS */
asm ("cli; outb %1, $0x70; inb $0x71, %0; sti" : "=a"(tvalue) : "a"(index));
}
array[index] = tvalue;
}

Try something like this:
/* read from CMOS */
asm volatile ("cli; outb %1, $0x70; inb $0x71, %0; sti" : "=a"(tvalue) : "a"(index));
/* write to CMOS */
unsigned char i = index;
asm volatile ("cli; outb %0, $0x70; movb %1, %%al; outb %%al, $0x71; sti" : "+a"(i) : "rm"(tvalue));
Note that using an extra variable for tvalue is optional. You could also specify
"+a"(array[index])
or
"a"(array[index])
directly. What matters is that the expression you pass has a byte-sized type so gcc picks al instead of eax.
Assigning index to i is needed to allow al to be clobbered without changing the value of index. This code should just work. Alternatively, the second set of instructions can also be split up into two:
asm volatile ("cli; outb %0, $0x70" :: "a"(index));
asm volatile ("outb %0, $0x71; sti" :: "a"(tvalue));
This avoids the need for an extra variable and gives the compiler greater flexibility when choosing registers.

Take a look at the (ancient) GCC-inline-assembly HOWTO (geared towards i686 Linux, so probably right on for your use) and check the argument passing and constraints carefully: they let GCC arrange the calling code correctly, e.g. by placing the inputs and outputs in the registers the instructions use. The GCC documentation on inline assembly is also relevant. As I remember it, it is somewhat opaque but much more detailed, covers many more architectures, and is presumably more up to date.
(Sorry, can't place links on my phone. A quick search should give them as first hits.)

Related

Confusion about different clobber description for arm inline assembly

I'm learning ARM inline assembly and am confused about a very simple function that assigns the value of x to y (both are of type int). Why do arm32 and arm64 require different clobber descriptions?
Here is the code:
#include <arm_neon.h>
#include <stdio.h>
void asm_test()
{
int x = 10;
int y = 0;
#ifdef __aarch64__
asm volatile(
"mov %w[in], %w[out]"
: [out] "=r"(y)
: [in] "r"(x)
: "r0" // r0 not working, but r1 or x1 works
);
#else
asm volatile(
"mov %[in], %[out]"
: [out] "=r"(y)
: [in] "r"(x)
: "r0" // r0 works, but r1 not working
);
#endif
printf("y is %d\n", y);
}
int main() {
asm_test();
return 0;
}
Tested on my rooted Android phone: for arm32, r0 generates the correct result but r1 doesn't. For arm64, r1 or x1 generates the correct result and r0 doesn't. Why do arm32 and arm64 differ? What is the concrete rule for this, and where can I find it?

ARM / AArch64 syntax is mov dst, src
Your asm statement only works if the compiler happens to pick the same register for both "=r" output and "r" input (or something like that, given extra copies of x floating around).
Different clobbers simply perturb the compiler's register-allocation choices. Look at the generated asm (gcc -S or on https://godbolt.org/, especially with -fverbose-asm.)
Undefined Behaviour from getting the constraints mismatched with the instructions in the template string can still happen to work; never assume that an asm statement is correct just because it works with one set of compiler options and surrounding code.
BTW, x86 AT&T syntax does use mov src, dst, and many GNU C inline-asm examples / tutorials are written for that. Assembly language is specific to the ISA and the toolchain, but a lot of architectures have an instruction called mov. Seeing a mov does not mean this is an ARM example.
Also, you don't actually need a mov instruction to use inline asm to copy a value. Just tell the compiler you want the input to be in the same register it picks for the output, whatever that happens to be:
// not volatile: has no side effects and produces the same output if the input is the same; i.e. the output is a pure function of the input.
asm (""
: "=r"(output) // pick any register
: "0"(input) // pick the same register as operand 0
: // no clobbers
);
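As a minimal sketch of the matching-constraint idea above, the empty template can be wrapped in a function. The name copy_via_register is hypothetical and used only for illustration; nothing here is specific to ARM, so the same source should build with GCC or Clang on other targets too.

```c
/* Copy a value through a register without emitting any instruction.
 * copy_via_register is a hypothetical helper name for this sketch. */
static inline int copy_via_register(int input)
{
    int output;
    __asm__ (""                /* empty template: no instructions */
             : "=r"(output)    /* operand 0: any general-purpose register */
             : "0"(input));    /* input must be in the same register as operand 0 */
    return output;
}
```

Because the template is empty, the compiler only has to place input in whichever register it chose for output; the "copy" is done entirely by register allocation.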

How do you explain gcc's inline assembly constraints for the IN, OUT instructions of i386?

As far as I can tell, the constraints used in gcc inline assembly tell gcc where input and output variables must go (or must be) in order to generate valid assembly. As the Fine Manual says, "constraints on the placement of the operand".
Here's a specific, working example from a tutorial.
static inline uint8_t inb(uint16_t port)
{
uint8_t ret;
asm volatile ( "inb %1, %0"
: "=a"(ret)
: "Nd"(port) );
return ret;
}
inb is AT&T syntax-speak for the i386 IN instruction that receives one byte from an I/O port.
Here are the specs for this instruction, taken from the i386 manual. Note that port numbers go from 0x0000 to 0xFFFF.
IN AL,imm8 // Input byte from immediate port into AL
IN AX,imm8 // Input word from immediate port into AX
IN EAX,imm8 // Input dword from immediate port into EAX
IN AL,DX // Input byte from port DX into AL
IN AX,DX // Input word from port DX into AX
IN EAX,DX // Input dword from port DX into EAX
Given a statement like uint8_t x = inb(0x80); the assembly output is, correctly, inb $0x80,%al. It used the IN AL,imm8 form of the instruction.
Now, let's say I just care about the IN AL,imm8 form, receiving a uint8_t from a port between 0x00 and 0xFF inclusive. The only difference between this and the working example is that port is now a uint8_t template parameter (to make it effectively a constant) and the constraint is now "N".
template<uint8_t port>
static inline uint8_t inb()
{
uint8_t ret;
asm volatile ( "inb %1, %0"
: "=a"(ret)
: "N"(port) );
return ret;
}
Fail!
I thought that the "N" constraint would mean, "you must have a constant unsigned 8-bit integer for this instruction", but clearly it does not because it is an "impossible constraint". Isn't the uint8_t template param a constant unsigned 8-bit integer?
If I replace "N" with "Nd", I get a different error:
./test.h: Assembler messages:
./test.h:23: Error: operand type mismatch for `in'
In this case, the assembler output is inb %dl, %al which obviously is not valid.
Why would this only work with "Nd" and uint16_t and not "N" and uint8_t?
EDIT:
Here's a stripped-down version I tried on godbolt.org:
#include <cstdint>
template<uint8_t N>
class Port {
public:
uint8_t in() const {
uint8_t data;
asm volatile("inb %[port], %%al"
:
: [port] "N" (N)
: // clobbers
);
return data;
}
};
void func() {
Port<0x7F>().in();
}
Interestingly, this works fine, except if you change N to anything between 0x80 and 0xFF. On clang this generates a "128 is out of range for constraint N" error. This generates a more generic error in gcc.

Based on how constraints are documented your code should work as expected.
This appears to still be a bug more than a year later. It appears the compilers are converting N from an unsigned value to a signed value and attempting to pass that into an inline assembly constraint. That of course fails when the value being passed into the constraint can't be represented as an 8-bit signed value. The input constraint "N" is supposed to allow an unsigned 8-bit value, so any value between 0 and 255 (0xff) should be accepted:
N
Unsigned 8-bit integer constant (for in and out instructions).
There is a similar bug report to GCC's bugzilla titled "Constant constraint check sign extends unsigned constant input operands".
In one of the related threads it was suggested you can fix this issue by ANDing (&) 0xff to the constant (ie: N & 0xff). I have also found that static casting N to an unsigned type wider than uint8_t also works:
#include <cstdint>
template<uint8_t N>
class Port {
public:
uint8_t in() const {
uint8_t data;
asm volatile("inb %[port], %0"
: "=a"(data)
: [port] "N" (static_cast<uint16_t>(N))
: // clobbers
);
return data;
}
};
void func() {
Port<0x7f>().in();
Port<0x80>().in();
// Port<0x100>().in(); // Fails as expected since it doesn't fit in a uint8_t
}
To test this you can play with it on godbolt.

text mode cursor doesn't appear in qemu vga emulator

I have a problem with the function that updates the cursor position in text mode.
The function definition and declaration are:
#include <sys/io.h>
signed int VGAx = 0,VGAy=0;
void setcursor()
{
uint16_t position = VGAx+VGAy*COLS;
outb(0x0f, 0x03d4);
outb((position<<8)>>8,0x03d5);
outb(0x0e,0x03d4);
outb(position>>8,0x03d5);
}
and the file sys/io.h
static inline unsigned char inb (unsigned short int port)
{
unsigned char value;
asm ("inb %0, %%al":"=rm"(value):"a"(port));
return value;
}
static inline void outb(unsigned char value, unsigned short int port)
{
asm volatile ("outb %%al, $0"::"rm"(value), "a"(port));
}
Before using the function, the cursor was sometimes a blinking underscore and sometimes did not appear at all; after using the function, no cursor appears.
Here is the main function that runs:
#include <vga/vga.h>
int kmain(){
setcursor();
setbgcolor(BLACK);
clc();
setforecolor(BLUE);
terminal_write('h');
setcursor();
return 0;
}
I tried using this function
void enable_cursor() {
outb(0x3D4, 0x0A);
char curstart = inb(0x3D5) & 0x1F; // get cursor scanline start
outb(0x3D4, 0x0A);
outb(0x3D5, curstart | 0x20); // set enable bit
}
which is provided here, but I got this error:
inline asm: operand type mismatch for 'in'
Any help is appreciated.
EDIT
I tried to fix the wrong inb and outb:
static inline unsigned char inb (unsigned short int port)
{
unsigned char value;
asm volatile("inb %1, %0" : "=a"(value) : "Nd"(port));
return value;
}
static inline void outb(unsigned char value, unsigned short int port)
{
asm volatile ("outb %%al, $0"::"Nd"(value), "a"(port));
}
I guess this is the right definition, but still no cursor appeared.
EDIT 2
I followed the given answer and defined the io.h file as the following
static inline unsigned char inb (unsigned short int port)
{
unsigned char value;
asm volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
return value;
}
static inline void outb(unsigned char value, unsigned short int port)
{
asm volatile ("outb %0, %1"::"a"(value), "Nd"(port));
}
I would like to mention that I also added enable_cursor(); to the beginning of kmain. The compile-time error is now fixed, but no cursor appears (which is the main problem).
EDIT 3
I would like to point out that a version of the whole code is available on GitHub, in case anyone wants access to pieces of code that are not shown in the question.

inb and outb Function Bugs
This code for inb is incorrect:
static inline unsigned char inb (unsigned short int port)
{
unsigned char value;
asm ("inb %0, %%al":"=rm"(value):"a"(port));
return value;
}
A few problems with it:
It seems you have the parameters to inb reversed. See the instruction set reference for inb. Remember that in AT&T syntax (that you are using in your GNU Assembler code) the operands are reversed. The instruction set reference shows them in Intel format.
The port number is either specified as an immediate 8-bit value or passed in the DX register. The proper constraint for specifying the DX register or an immediate 8-bit value for inb/outb is Nd. See my Stack Overflow answer here for an explanation of the constraint Nd.
The destination that the value read is returned in is either AL/AX/EAX so a constraint =rm on the output that says an available register or memory address is incorrect. It should be =a in your case.
Your code should be something like:
static inline unsigned char inb (unsigned short int port)
{
unsigned char value;
asm volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
return value;
}
Your assembler template for outb is incorrect:
static inline void outb(unsigned char value, unsigned short int port)
{
asm volatile ("outb %%al, $0"::"rm"(value), "a"(port));
}
A couple problems with it:
The port number is either specified as an immediate 8-bit value or passed in the DX register. The proper constraint for specifying the DX register or an immediate 8-bit value for inb/outb is Nd. See my Stack Overflow answer here for an explanation of the constraint Nd.
The value to output on the port has to be specified in AL/AX/EAX, so a constraint rm on the value that says an available register or memory address is incorrect. It should be a in your case. See the instruction set reference for outb.
The code should probably look something like:
static inline void outb(unsigned char value, unsigned short int port)
{
asm volatile ("outb %0, %1"::"a"(value), "Nd"(port));
}
Enabling and Disabling the Cursor
I had to look up the VGA registers about the cursor and found this document on the cursor start register which says:
Cursor Start Register (Index 0Ah)
-------------------------------------------------
|  7  |  6  |  5  |  4  |  3  |  2  |  1  |  0  |
-------------------------------------------------
|     |     | CD  |   Cursor Scan Line Start    |
-------------------------------------------------
CD -- Cursor Disable
This field controls whether or not the text-mode cursor is displayed. Values are:
0 -- Cursor Enabled
1 -- Cursor Disabled
Cursor Scan Line Start
An important thing is that the cursor is disabled when bit 5 is set. In your GitHub setcursor function you do this:
outb(curstart | 0x20, 0x3D5);
curstart | 0x20 sets bit 5 (0x20 = 0b00100000). If you want to clear bit 5 and enable the cursor, apply bitwise NOT (~) to the bitmask and bitwise AND (&) the result with curstart. It should look like this:
outb(curstart & ~0x20, 0x3D5);
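The bit manipulation above can be isolated in two small helpers, which makes it harder to mix up the enable and disable cases. This is only a sketch: the names cursor_start_enable, cursor_start_disable and CURSOR_DISABLE_BIT are hypothetical, and the value returned still has to be written to the data port with outb as shown above.

```c
#include <stdint.h>

/* Bit 5 (CD) of the Cursor Start Register: 1 = cursor disabled.
 * All names in this sketch are hypothetical. */
#define CURSOR_DISABLE_BIT 0x20u

/* Clear CD so the cursor is shown; the scan line start bits are kept. */
static inline uint8_t cursor_start_enable(uint8_t curstart)
{
    return (uint8_t)(curstart & ~CURSOR_DISABLE_BIT);
}

/* Set CD so the cursor is hidden; the scan line start bits are kept. */
static inline uint8_t cursor_start_disable(uint8_t curstart)
{
    return (uint8_t)(curstart | CURSOR_DISABLE_BIT);
}
```

With these, the corrected call would read outb(cursor_start_enable(curstart), 0x3D5); under the value-first parameter order used in this answer.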
VGA Function Bugs
Once you have the cursor properly enabled it will render the cursor in the foreground color (attribute) for the particular video location it is currently over. One thing I noticed is that your clc routine does this:
vga_deref_80x24(VGAx,VGAy) = \
vga_encode_80x24(' ',BgColor,BgColor);
The thing to observe is that you set the attribute for both the foreground and background colors to BgColor. If you set the bgcolor to black before calling clc, it will flash a black underline cursor on a black background, rendering it invisible at any screen location. For the cursor to be visible it must be at a screen location where the foreground and background are different colors. One way to see if this works is to change the code to:
vga_deref_80x24(VGAx,VGAy) = \
vga_encode_80x24(' ',BgColor,ForeColor);
I think it is a bug that you are clearing with vga_encode_80x24(' ',BgColor,BgColor); I think you meant to use vga_encode_80x24(' ',BgColor,ForeColor);
Now in your kmain function you need to set a ForeColor and BgColor before calling clc, and they must be different colors to make the cursor visible. You have this code:
setbgcolor(BLACK);
clc();
setforecolor(BLUE);
It should now be:
setbgcolor(BLACK);
setforecolor(BLUE);
clc();
Now if the cursor is rendered anywhere on an unwritten location on the screen it will flash BLUE underline on BLACK background.
This should solve your cursor problem. However, I noticed that you also use vga_encode_80x24(' ',BgColor,BgColor); in your VGA scrolldown and terminal_control functions. I think this is a bug as well, and I think you should use vga_encode_80x24(' ',BgColor,ForeColor); instead. You do seem to set it properly in terminal_write.
If you want to change the color of the cursor at any location you could write a function that changes the foreground attribute under the cursor location without changing the background color. Make sure the two attributes (Foreground and background color) are different for the cursor to be visible. If you wish to hide the cursor you can set foreground and background color the same color for the screen location the cursor is currently at.
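To make the foreground/background point concrete, here is a small sketch of how a text-mode cell is usually packed and how to check whether a cursor drawn over it would be visible. It assumes the standard VGA text layout (character in the low byte, attribute in the high byte, foreground in the attribute's low nibble, background in its high nibble). The names vga_cell and vga_cursor_visible are hypothetical and are not the vga_encode_80x24 from the question, whose exact definition is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack a character and two 4-bit colors into one 16-bit text-mode cell.
 * Assumption: fg is the low nibble of the attribute, bg the high nibble.
 * Depending on the mode, bit 7 of the attribute may mean "blink" rather
 * than a bright background, so this sketch is safest with colors 0-7. */
static inline uint16_t vga_cell(char ch, uint8_t bg, uint8_t fg)
{
    uint8_t attr = (uint8_t)(((bg & 0x0F) << 4) | (fg & 0x0F));
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}

/* The cursor is drawn in the cell's foreground color, so it can only be
 * seen when foreground and background differ. */
static inline bool vga_cursor_visible(uint16_t cell)
{
    uint8_t attr = (uint8_t)(cell >> 8);
    return (attr & 0x0F) != (attr >> 4);
}
```

For example, a space cleared with black on black gives a cell where vga_cursor_visible returns false, while blue on black gives one where it returns true.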

The problem is in your outb code. Also be aware of the order of the port and value parameters.
The following works for me:
static inline unsigned char inb (unsigned short int port)
{
unsigned char value;
asm volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
return value;
}
static inline void outb (unsigned short int port, unsigned char value)
{
asm volatile ("outb %b0,%w1": :"a" (value), "Nd" (port));
}
void update_cursor(int x, int y)
{
uint16_t pos = y * 80 + x;
outb(0x3D4, 0x0F);
outb(0x3D5, (uint8_t) (pos & 0xFF));
outb(0x3D4, 0x0E);
outb(0x3D5, (uint8_t) ((pos >> 8) & 0xFF));
}
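The arithmetic inside update_cursor can be checked separately from the port writes. The following sketch splits it into pure helpers; the names cursor_offset, cursor_low_byte, cursor_high_byte and VGA_COLS are hypothetical, and an 80-column mode is assumed as in the code above.

```c
#include <stdint.h>

/* Assumption: 80-column text mode, as in update_cursor above. */
#define VGA_COLS 80

/* Linear cell index of column x, row y. */
static inline uint16_t cursor_offset(int x, int y)
{
    return (uint16_t)(y * VGA_COLS + x);
}

/* Byte written after selecting index 0x0F (cursor location low). */
static inline uint8_t cursor_low_byte(uint16_t pos)
{
    return (uint8_t)(pos & 0xFF);
}

/* Byte written after selecting index 0x0E (cursor location high). */
static inline uint8_t cursor_high_byte(uint16_t pos)
{
    return (uint8_t)((pos >> 8) & 0xFF);
}
```

Masking explicitly avoids relying on implicit truncation when the value is passed to a function that takes an unsigned char.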

How do I ask the assembler to "give me a full size register"?

I'm trying to allow the assembler to give me a register it chooses, and then use that register with inline assembly. I'm working with the program below, and it's segfaulting. The program was compiled with g++ -O1 -g2 -m64 wipe.cpp -o wipe.exe.
When I look at the crash under lldb, I believe I'm getting a 32-bit register rather than a 64-bit register. I'm trying to compute an address (base + offset) using lea, and store the result in a register the assembler chooses:
"lea (%0, %1), %2\n"
Above, I'm trying to say "use a register, and I'll refer to it as %2".
When I perform a disassembly, I see:
0x100000b29: leal (%rbx,%rsi), %edi
-> 0x100000b2c: movb $0x0, (%edi)
So it appears the generated code calculates an address using 64-bit values (rbx and rsi), but saves it to a 32-bit register (edi) that the assembler chose.
Here are the values at the time of the crash:
(lldb) type format add --format hex register
(lldb) p $edi
(unsigned int) $3 = 1063330
(lldb) p $rbx
(unsigned long) $4 = 4296030616
(lldb) p $rsi
(unsigned long) $5 = 10
A quick note on the Input Operands below. If I drop the "r" (2), then I get a compiler error when I refer to %2 in the call to lea: invalid operand number in inline asm string.
How do I tell the assembler to "give me a full size register" and then refer to it in my program?
int main(int argc, char* argv[])
{
string s("Hello world");
cout << s << endl;
char* ptr = &s[0];
size_t size = s.length();
if(ptr && size)
{
__asm__ __volatile__
(
"%=:\n" /* generate a unique label for TOP */
"subq $1, %1\n" /* 0-based index */
"lea (%0, %1), %2\n" /* calculate ptr[idx] */
"movb $0, (%2)\n" /* 0 -> ptr[size - 1] .. ptr[0] */
"jnz %=b\n" /* Back to TOP if non-zero */
: /* no output */
: "r" (ptr), "r" (size), "r" (2)
: "0", "1", "2", "cc"
);
}
return 0;
}
Sorry about these inline assembly questions. I hope this is the last one. I'm not really thrilled with using inline assembly in GCC because of pain points like this (and my fading memory). But it's the only legal way I know to do what I want to do given GCC's interpretation of the qualifier volatile in C.
If interested, GCC interprets C's volatile qualifier as hardware backed memory, and anything else is an abuse and it results in an illegal program. So the following is not legal for GCC:
volatile void* g_tame_the_optimizer = NULL;
...
unsigned char* ptr = ...
size_t size = ...;
for(size_t i = 0; i < size; i++)
ptr[i] = 0x00;
g_tame_the_optimizer = ptr;
Interestingly, Microsoft uses a more customary interpretation of volatile (what most programmers expect - namely, anything can change the memory, and not just memory mapped hardware), and the code above is acceptable.

gcc inline asm is a complicated beast. "r" (2) means allocate an int sized register and load it with the value 2. If you just need an arbitrary scratch register you can declare a 64 bit early-clobber dummy output, such as "=&r" (dummy) in the output section, with void *dummy declared earlier. You can consult the gcc manual for more details.
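A sketch of the early-clobber dummy output described above, applied to the loop from the question. The function name wipe and the operand names are illustrative; the asm path is only compiled for x86-64, and a plain C loop is used elsewhere. Because scratch is declared as void *, GCC allocates a pointer-sized (64-bit) register for it, and size is marked "+r" because the loop modifies it.

```c
#include <stddef.h>

/* Zero size bytes at ptr, from the last byte down to the first.
 * wipe and the operand names are hypothetical. */
static void wipe(char *ptr, size_t size)
{
    if (!ptr || !size)
        return;
#if defined(__x86_64__)
    void *scratch;  /* pointer-sized dummy, so a 64-bit register is chosen */
    __asm__ __volatile__ (
        "1:\n\t"
        "subq $1, %[n]\n\t"                 /* n -= 1; sets ZF when n hits 0 */
        "leaq (%[base], %[n]), %[tmp]\n\t"  /* tmp = &base[n]; flags unchanged */
        "movb $0, (%[tmp])\n\t"             /* base[n] = 0; flags unchanged */
        "jnz 1b\n\t"                        /* loop while n != 0 */
        : [n] "+r"(size),                   /* read and written by the loop */
          [tmp] "=&r"(scratch)              /* early clobber: written before base is last read */
        : [base] "r"(ptr)
        : "cc", "memory");                  /* flags and the buffer are modified */
#else
    /* Fallback for other targets: volatile stores in plain C. */
    volatile char *p = ptr;
    while (size--)
        p[size] = 0;
#endif
}
```

The "memory" clobber tells the compiler that the statement writes to memory it cannot see through the operands, which is what keeps later reads of the buffer from being served from stale values.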
As for the final code snippet, it looks like you want a memory barrier, just as the linked email says. See the manual for an example.
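One common way to express such a barrier with GCC or Clang is an empty asm statement with a "memory" clobber placed after the stores. The sketch below is an illustration of that idiom under the assumption that the goal is to keep the zeroing from being optimized away; the function name secure_zero is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Zero a buffer and keep the compiler from discarding the stores.
 * secure_zero is a hypothetical name for this sketch. */
static void secure_zero(void *ptr, size_t size)
{
    memset(ptr, 0, size);
    /* Empty statement that takes ptr as an input and claims to touch
     * arbitrary memory, so the compiler has to assume the zeroed bytes
     * may be read here and cannot drop the memset as a dead store. */
    __asm__ __volatile__ ("" : : "r"(ptr) : "memory");
}
```

No instruction is emitted for the asm statement itself; it only constrains what the optimizer may assume.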

atomic_inc and atomic_xchg in gcc assembly

I have written the following user-level code snippet to test two sub functions, atomic inc and xchg (refer to Linux code).
I just need to perform operations on 32-bit integers, which is why I explicitly use int32_t.
I assume global_counter will be contended by different threads, while tmp_counter is fine.
#include <stdio.h>
#include <stdint.h>
int32_t global_counter = 10;
/* Increment the value pointed by ptr */
void atomic_inc(int32_t *ptr)
{
__asm__("incl %0;\n"
: "+m"(*ptr));
}
/*
* Atomically exchange the val with *ptr.
* Return the value previously stored in *ptr before the exchange
*/
int32_t atomic_xchg(uint32_t *ptr, uint32_t val)
{
uint32_t tmp = val;
__asm__(
"xchgl %0, %1;\n"
: "=r"(tmp), "+m"(*ptr)
: "0"(tmp)
:"memory");
return tmp;
}
int main()
{
int32_t tmp_counter = 0;
printf("Init global=%d, tmp=%d\n", global_counter, tmp_counter);
atomic_inc(&tmp_counter);
atomic_inc(&global_counter);
printf("After inc, global=%d, tmp=%d\n", global_counter, tmp_counter);
tmp_counter = atomic_xchg(&global_counter, tmp_counter);
printf("After xchg, global=%d, tmp=%d\n", global_counter, tmp_counter);
return 0;
}
My 2 questions are:
Are these two subfunctions written properly?
Will this behave the same when I compile it on a 32-bit or 64-bit platform? For example, could the pointer address have a different length, or could incl and xchgl conflict with the operand?

My understanding of this question is below, please correct me if I'm wrong.
Read-modify-write instructions such as incl and add need a lock prefix to be atomic with respect to other CPUs. The lock prefix makes the processor hold exclusive access to the memory location for the duration of the instruction (historically by asserting the LOCK# signal on the memory bus).
The __xchg function in the Linux kernel uses no lock prefix because xchg with a memory operand always implies lock anyway. http://lxr.linux.no/linux+v2.6.38/arch/x86/include/asm/cmpxchg_64.h#L15
However, the incl used in atomic_inc carries no such implicit lock, so a lock prefix is needed.
http://lxr.linux.no/linux+v2.6.38/arch/x86/include/asm/atomic.h#L105
By the way, I think you need to copy *ptr to a volatile variable to prevent gcc from optimizing it away.
William
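A sketch of the two functions with the points above applied: a lock prefix on incl, no prefix on xchgl, and asm volatile with a "memory" clobber so the compiler neither removes nor reorders the statements. It keeps the question's function names but uses int32_t throughout, which is a choice made for this sketch. The asm path is x86-only; on other targets the sketch falls back to the GCC/Clang __atomic builtins.

```c
#include <stdint.h>

/* Atomically increment *ptr. */
static inline void atomic_inc(int32_t *ptr)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__ ("lock incl %0"   /* lock makes the RMW atomic */
                          : "+m"(*ptr)
                          :
                          : "cc", "memory");
#else
    __atomic_fetch_add(ptr, 1, __ATOMIC_SEQ_CST);
#endif
}

/* Atomically store val in *ptr and return the previous value. */
static inline int32_t atomic_xchg(int32_t *ptr, int32_t val)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__ ("xchgl %0, %1"   /* implicitly locked with a memory operand */
                          : "+r"(val), "+m"(*ptr)
                          :
                          : "memory");
    return val;
#else
    return __atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST);
#endif
}
```

Both instructions operate on 32-bit operands regardless of pointer width, so the same source builds for 32-bit and 64-bit x86; only the addressing of *ptr, which the compiler generates from the "m" constraint, differs.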
