Does QEMU emulate ARM coprocessor - linux-kernel

I need to implement a kernel module that involves reading the ARM Cortex-A9 coprocessor's register:
register int reg asm ("r6");
reg = -2;
volatile printk(KERN_INFO "reg: %d\n", reg);
volatile asm("MRC p15, 0,r6, c1, c0, 2;"); //Read Coprocessor Access Control Register
volatile printk(KERN_INFO "reg: %d\n", reg);
However, when I run this on QEMU, it always prints:
reg: -2
reg: -2
Is this because of my code or is it because of QEMU?
Thanks in advance.

Your code should work fine, though you need to remove volatile from the printk lines, and the asm statement should be written asm volatile, not the other way around. Check the following:
QEMU version. I'm using 2.12 and your code works there, so if you're on an older version, try 2.12 too.
Emulated machine and CPU. I'm not sure whether this affects the CP registers, but I'm using the "virt" machine with no CPU specified; you can try that configuration too.
If this doesn't help, see the details of my configuration below.
My configuration
I run QEMU with the following command:
$ qemu-system-arm -kernel $zimage -initrd $rootfs \
-machine virt -nographic -m 512 \
--append "root=/dev/ram0 rw console=ttyAMA0,115200 mem=512M"
where:
$zimage is path to zImage file (my kernel is linux-mainline on tag v4.18, built with multi_v7_defconfig configuration)
$rootfs is path to CPIO archive with minimal BusyBox rootfs
My kernel module code is as follows:
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
static int __init mrc_init(void)
{
    u32 acr;
    /*
     * Read Coprocessor Access Control Register.
     * See Cortex-A9 TRM for details.
     */
    asm volatile ("mrc p15, 0, %0, c1, c0, 2\n" : "=r" (acr));
    pr_info("ACR = 0x%x\n", acr);
    return 0;
}
static void __exit mrc_exit(void)
{
}
module_init(mrc_init);
module_exit(mrc_exit);
MODULE_AUTHOR("Sam Protsenko");
MODULE_DESCRIPTION("Test MRC on QEMU");
MODULE_LICENSE("GPL");
After loading this module I see the following output in dmesg:
ACR = 0xf00000
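To interpret a value like 0xf00000, it may help to decode it in ordinary C. The sketch below assumes the ARMv7-A layout of the Coprocessor Access Control Register, where coprocessor n is controlled by bits [2n+1:2n]; the function names cpacr_field and cpacr_access_name are hypothetical helpers, not kernel APIs.

```c
#include <stdint.h>

/* Hypothetical helper: access field for coprocessor `cp` (0..13),
 * taken from CPACR bits [2*cp+1 : 2*cp] (ARMv7-A layout, assumed). */
unsigned cpacr_field(uint32_t cpacr, unsigned cp)
{
    return (cpacr >> (2u * cp)) & 0x3u;
}

/* Hypothetical helper: human-readable meaning of a 2-bit access field. */
const char *cpacr_access_name(unsigned field)
{
    switch (field) {
    case 0:  return "access denied";
    case 1:  return "privileged access only";
    case 3:  return "full access";
    default: return "reserved";
    }
}
```

With the value above, cpacr_field(0xf00000, 10) and cpacr_field(0xf00000, 11) both yield 3, which would mean full access to cp10 and cp11 (the VFP/NEON coprocessors), while every other field is 0.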

Related

mmap() RWX page on MacOS (ARM64 architecture)?

I've been trying to map a page that is both writable AND executable.
mov x0, 0 // start address
mov x1, 4096 // length
mov x2, 7 // rwx
mov x3, 0x1001 // flags
mov x4, -1 // file descriptor
mov x5, 0 // offset
movl x16, 0x200005c // mmap
svc 0
This gives me a 0xD error code (EACCES, which the documentation unhelpfully blames on an invalid file descriptor, although the same documentation says to use '-1'). I think the code is correct: it returns a valid mapping if I just pass 'r--' for permissions.
I know the same code works on Catalina on the x64 architecture. I verified that the same error occurs when SIP is disabled.
For more context, I'm trying to port a FORTH implementation to macOS/ARM64, and this FORTH, like many others, heavily uses self-modifying code and assembles code at runtime. The code that does the assembling/compiling resides in the middle of the newly created code (in fact, part of the compiler will be generated in machine language as part of running FORTH), so it's very hard, if not infeasible, to separate the FORTH JIT compiler (if you can call it that) from the generated code.
Now, I really don't want to end up with the answer "Apple thinks they know better than you, no FORTH for you!", but that is what it looks like so far. Thanks for any help!
You need to toggle the thread's access to the mapped memory between writable and executable; it cannot be both at the same time. I think it is actually possible to do both with the same memory using two different threads, but I haven't tried.
Before you write to the memory you mmap, call this:
pthread_jit_write_protect_np(0);
sys_icache_invalidate(addr, size);
Then when you are done writing to it you can switch back again like this:
pthread_jit_write_protect_np(1);
sys_icache_invalidate(addr, size);
This is the full code I am using right now
#include <stdio.h>
#include <sys/mman.h>
#include <pthread.h>
#include <libkern/OSCacheControl.h>
#include <stdlib.h>
#include <stdint.h>
uint32_t* c_get_memory(uint32_t size) {
    int prot = PROT_READ | PROT_WRITE | PROT_EXEC;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_JIT;
    int fd = -1;
    int offset = 0;
    uint32_t* addr = 0;
    addr = (uint32_t*)mmap(0, size, prot, flags, fd, offset);
    if (addr == MAP_FAILED){
        printf("failure detected\n");
        exit(-1);
    }
    pthread_jit_write_protect_np(0);
    sys_icache_invalidate(addr, size);
    return addr;
}
void c_jit(uint32_t* addr, uint32_t size) {
    pthread_jit_write_protect_np(1);
    sys_icache_invalidate(addr, size);
    void (*foo)(void) = (void (*)())addr;
    foo();
}
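The write-then-execute toggle above can also be wrapped in a pair of helpers. This is only a sketch under stated assumptions: jit_alloc and jit_install are hypothetical names, and on anything other than Apple Silicon the code swaps in plain mprotect() and the GCC/Clang builtin __builtin___clear_cache, which is a different mechanism from MAP_JIT and is included only so the same sequence can be tried elsewhere.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#if defined(__APPLE__) && defined(__aarch64__)
#include <pthread.h>
#include <libkern/OSCacheControl.h>
#endif

/* Hypothetical helper: map `size` bytes for generated code; NULL on failure. */
void *jit_alloc(size_t size)
{
#if defined(__APPLE__) && defined(__aarch64__)
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_JIT, -1, 0);
#else
    /* Assumption: no MAP_JIT here, so start out read/write only. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
#endif
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: copy `len` bytes of code into the region, leaving it
 * executable (and not writable) afterwards. Returns 0 on success, -1 on error. */
int jit_install(void *dst, size_t region_size, const void *code, size_t len)
{
    if (len > region_size)
        return -1;
#if defined(__APPLE__) && defined(__aarch64__)
    pthread_jit_write_protect_np(0);        /* this thread may write */
    memcpy(dst, code, len);
    pthread_jit_write_protect_np(1);        /* this thread may execute */
    sys_icache_invalidate(dst, len);
#else
    if (mprotect(dst, region_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(dst, code, len);
    if (mprotect(dst, region_size, PROT_READ | PROT_EXEC) != 0)
        return -1;
    __builtin___clear_cache((char *)dst, (char *)dst + len);
#endif
    return 0;
}
```

A caller would allocate once with jit_alloc, call jit_install each time new code is generated, and only then cast the address to a function pointer.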

How do I ask the assembler to "give me a full size register"?

I'm trying to let the assembler give me a register it chooses, and then use that register in inline assembly. I'm working with the program below, and it's segfaulting. The program was compiled with g++ -O1 -g2 -m64 wipe.cpp -o wipe.exe.
When I look at the crash under lldb, I believe I'm getting a 32-bit register rather than a 64-bit register. I'm trying to compute an address (base + offset) using lea, and store the result in a register the assembler chooses:
"lea (%0, %1), %2\n"
Above, I'm trying to say "use a register, and I'll refer to it as %2".
When I perform a disassembly, I see:
0x100000b29: leal (%rbx,%rsi), %edi
-> 0x100000b2c: movb $0x0, (%edi)
So it appears the generated code calculates an address using 64-bit values (rbx and rsi), but saves it to a 32-bit register (edi) that the assembler chose.
Here are the values at the time of the crash:
(lldb) type format add --format hex register
(lldb) p $edi
(unsigned int) $3 = 1063330
(lldb) p $rbx
(unsigned long) $4 = 4296030616
(lldb) p $rsi
(unsigned long) $5 = 10
A quick note on the Input Operands below. If I drop the "r" (2), then I get a compiler error when I refer to %2 in the call to lea: invalid operand number in inline asm string.
How do I tell the assembler to "give me a full size register" and then refer to it in my program?
int main(int argc, char* argv[])
{
    string s("Hello world");
    cout << s << endl;
    char* ptr = &s[0];
    size_t size = s.length();
    if(ptr && size)
    {
        __asm__ __volatile__
        (
            "%=:\n" /* generate a unique label for TOP */
            "subq $1, %1\n" /* 0-based index */
            "lea (%0, %1), %2\n" /* calculate ptr[idx] */
            "movb $0, (%2)\n" /* 0 -> ptr[size - 1] .. ptr[0] */
            "jnz %=b\n" /* Back to TOP if non-zero */
            : /* no output */
            : "r" (ptr), "r" (size), "r" (2)
            : "0", "1", "2", "cc"
        );
    }
    return 0;
}
Sorry about these inline assembly questions. I hope this is the last one. I'm not really thrilled with using inline assembly in GCC because of pain points like this (and my fading memory). But it's the only legal way I know to do what I want to do, given GCC's interpretation of the qualifier volatile in C.
If interested, GCC interprets C's volatile qualifier as hardware backed memory, and anything else is an abuse and it results in an illegal program. So the following is not legal for GCC:
volatile void* g_tame_the_optimizer = NULL;
...
unsigned char* ptr = ...
size_t size = ...;
for(size_t i = 0; i < size; i++)
ptr[i] = 0x00;
g_tame_the_optimizer = ptr;
Interestingly, Microsoft uses a more customary interpretation of volatile (what most programmers expect - namely, anything can change the memory, and not just memory mapped hardware), and the code above is acceptable.
gcc inline asm is a complicated beast. "r" (2) means allocate an int sized register and load it with the value 2. If you just need an arbitrary scratch register you can declare a 64 bit early-clobber dummy output, such as "=&r" (dummy) in the output section, with void *dummy declared earlier. You can consult the gcc manual for more details.
As for the final code snippet, it looks like you want a memory barrier, just as the linked email says. See the manual for an example.
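As a rough illustration of both suggestions, the sketch below rewrites the loop with a pointer-sized early-clobber scratch output and a "memory" clobber. The function name wipe and the variable scratch are hypothetical, the x86-64 path assumes GCC or Clang with AT&T syntax, and other targets fall back to a plain loop followed by an empty asm statement acting as a compiler barrier.

```c
#include <stddef.h>

/* Hypothetical helper: zero ptr[0..size-1] so the stores are not optimized away. */
void wipe(char *ptr, size_t size)
{
    if (!ptr || !size)
        return;
#if defined(__x86_64__)
    void *scratch;  /* pointer-sized, so a full 64-bit register is allocated */
    __asm__ __volatile__ (
        "1:\n\t"
        "subq $1, %[n]\n\t"            /* n--, sets ZF when n reaches 0 */
        "leaq (%[p], %[n]), %[t]\n\t"  /* t = &ptr[n]; lea does not touch flags */
        "movb $0, (%[t])\n\t"          /* ptr[n] = 0; mov does not touch flags */
        "jnz 1b\n\t"                   /* loop until the subq produced zero */
        : [n] "+&r" (size),            /* read and written by the asm */
          [t] "=&r" (scratch)          /* early-clobber scratch register */
        : [p] "r" (ptr)
        : "cc", "memory");             /* flags and memory are modified */
    (void)scratch;
#else
    for (size_t i = 0; i < size; i++)
        ptr[i] = 0;
    /* Compiler barrier: tells the optimizer memory reachable via ptr is used. */
    __asm__ __volatile__ ("" : : "r" (ptr) : "memory");
#endif
}
```

Declaring size as a read-write operand, rather than as an input listed in the clobbers, is what lets the compiler know the asm changes it; the early-clobber marker keeps the scratch register from being shared with ptr.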

AVR inline assembly: registers to variables?

I'm currently trying to write some code that checks the value of SRAM at a certain address, and then executes some C code if it matches. This is running on an atmega32u4 AVR chip. Here is what I have so far:
volatile char a = 0;
void setup(){
}
void loop(){
    asm(
        "LDI r16,77\n" //load value 77 into r16
        "STS 0x0160,r16\n" //copy r16 value into RAM location 0x0160
        "LDS r17,0x0160\n" //copy value of RAM location 0x0160 into register r17
        //some code to copy value r17 to char a?
    );
    if(a == 77){
        //do something
    }
}
I'm having trouble figuring out the part where I transition from assembly back to C. How do I get the value inside register r17 and put it into a variable in the C code?
I did find this code snippet, however I don't quite understand how that works, or if that is the best way to approach this.
__asm__ __volatile__ (
    " ldi __tmp_reg__, 77" "\n\t"
    " sts 0x0160, __tmp_reg__" "\n\t"
    " lds %0, 0x0160" "\n\t"
    : "=r" (a)
    :
);
See here for how to write inline assembly. Unless you have a very specific reason in mind, you should let the compiler take care of the variables for you. Even though you declared a in your code to be volatile, it could very well be bound to any of the 32 registers in the GP register file of the AVR core. This essentially means the variable is never stored in RAM. If you really want to know what your compiler is doing, disassemble the final object file with avr-objdump -S and study it.

Loading SSE registers

I'm working on a homework project for an OS development class. One task is to save the context of the SSE registers on an interrupt. Saving and restoring the context is easy (fxsave/fxrstor), but I have a problem with testing. I want to put some sample data into one of the registers, but all I get is interrupt 6. Here is the code:
// load some SSE registers
struct Vec4 {
    int x, y, z, w;
} vec = { 0, 1, 2, 3 };
asm volatile ( "movl %0, %%eax"
    : /* no output */
    : "r"( &vec )
    :
);
asm volatile ( "movups (%eax), %xmm0" );
I searched the internet for a solution. All I found is that it might have something to do with the effective address space, but I don't know what that is.
You need to use a memory operand as a constraint in the inline assembly. This is much better than generating the address yourself (as you tried with the & operator) and loading it into a register, because the latter will not work if the address is RIP-relative or relocatable.
asm volatile ( "movups %0, %%xmm0"
    : /* no output */
    : "m"( vec )
    :
);
And in an asm statement that has operands, you need to use two "%%" before register names.
Read more about gcc's constraints here: http://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints . The title is somewhat misleading, as this concept is far from simple :-)
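For a quick check that the memory-operand form behaves as described, the following sketch loads a vector into xmm0 through an "m" input and stores it back through an "=m" output. The function name roundtrip_xmm0 is hypothetical; the asm path assumes an x86 target with SSE enabled (in a kernel, that also means the CR0/CR4 setup has been done), and other targets simply copy the struct.

```c
struct Vec4 {
    int x, y, z, w;
};

/* Hypothetical helper: move `in` through xmm0 and return what came back. */
struct Vec4 roundtrip_xmm0(struct Vec4 in)
{
    struct Vec4 out = { 0, 0, 0, 0 };
#if defined(__SSE__)
    __asm__ __volatile__ (
        "movups %1, %%xmm0\n\t"   /* load 16 bytes from the memory operand */
        "movups %%xmm0, %0"       /* store them to the output memory operand */
        : "=m" (out)
        : "m" (in)
        : "xmm0");                /* tell the compiler xmm0 is overwritten */
#else
    out = in;                     /* assumption: no SSE, plain copy instead */
#endif
    return out;
}
```

Listing xmm0 as a clobber matters once the compiler itself starts using SSE registers around the statement.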
I found out what the problem is. Execution of SSE instructions must be enabled by setting some flags in the CR0 and CR4 registers. More info here: http://wiki.osdev.org/SSE
You're making this way harder than it needs to be - just use the intrinsics in the *mmintrin.h headers, e.g.
#include <emmintrin.h>
__m128i vec = _mm_set_epi32(3, 2, 1, 0);
If you need to put this in a specific XMM register then use the above example as a starting point, then generate asm, e.g. using gcc -S and use the generated asm as a template for your own code.
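One detail worth checking when using the intrinsic is the argument order: _mm_set_epi32 takes the highest lane first. The sketch below stores the vector back to memory to make that visible; make_vec is a hypothetical name, and the non-SSE2 branch is only a stand-in so the snippet still builds on other targets.

```c
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: build the vector from the example and write its
 * four 32-bit lanes to out[0..3], lowest lane first. */
void make_vec(int out[4])
{
#if defined(__SSE2__)
    __m128i vec = _mm_set_epi32(3, 2, 1, 0);   /* arguments: lane 3 ... lane 0 */
    _mm_storeu_si128((__m128i *)out, vec);     /* unaligned 16-byte store */
#else
    /* Assumption: no SSE2 available, fill the same values by hand. */
    for (int i = 0; i < 4; i++)
        out[i] = i;
#endif
}
```

After the call, out holds 0, 1, 2, 3 in that order, matching the { 0, 1, 2, 3 } initializer of the struct in the question.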

Rewrite Intel-style assembly code into GCC inline assembly

How do I write this assembly code as inline assembly? Compiler: gcc (i586-elf-gcc). The GAS syntax confuses me. Please tell me how to write this as inline assembly that works with gcc.
.set_video_mode:
mov ah,00h
mov al,13h
int 10h
.init_mouse:
mov ax,0
int 33h
I have similar routines in assembly. I wrote them as separate assembly routines so that I can call them from my C program. I need to call these and some more interrupts from C itself.
I also need to put values in certain registers depending on which interrupt routine I'm calling. Please tell me how to do it.
All I want to do is call interrupt routines from C. It's OK for me to do it using int86(), but I don't have the source code of that function.
I want int86() so that I can call interrupts from C.
I am developing my own tiny OS, so I have no restrictions on calling interrupts or on direct hardware access.
I've not tested this, but it should get you started:
void set_video_mode (int x, int y) {
    register int ah asm ("ah") = x;
    register int al asm ("al") = y;
    asm volatile ("int $0x10"
        : /* no outputs */
        : /* no inputs */
        : /* clobbers */ "ah", "al");
}
I've put in two 'clobbers' as an example, but you'll need to set the correct list of clobbers so that the compiler knows you've overwritten register values (maybe none).
First, keep in mind GCC doesn't support 16-bit code yet, so you'll end up compiling 32-bit code that runs in 16-bit mode, which is very inefficient but doable (it is used, for example, by Linux and SeaBIOS). It can be done with the following at the beginning of each file:
__asm__ (".code16gcc");
Newer GCC versions (since 4.9 IIRC) support the -m16 flag that does the same thing.
Also, there's no mouse driver available unless you load one before your kernel runs init_mouse.
You seem to be using an API commonly available in several x86 DOS systems.
asm can take care of the register assignments, so the code can be reduced to:
void set_video_mode(int mode)
{
    mode &= 255;
    __asm__ __volatile__ (
        "int $0x10"
        : "+a" (mode) /* %eax = mode & 255 => %ah = 0, %al = mode */
    );
}
void init_mouse(void)
{
    /* XXX it is really important to check the IDT entry isn't 0 */
    int tmp = 0;
    __asm__ __volatile__ (
        "int $0x33"
        : "+a" (tmp) /* %eax = 0*/
        :: "ebx" /* %ebx is also clobbered by DOS mouse drivers */
    );
}
The asm statement is documented in the GCC manual, although perhaps not in enough depth, and the manual lacks x86 examples. The outputs (after the first colon) have a distinctively obscure syntax, while the rest is far easier to understand (the second colon introduces the inputs and the third the clobbered registers, flags and/or memory).
The outputs must be prefixed with =, meaning you don't care about the previous value the operand may have had, or +, meaning you want to use it as an input too. In this context we use an output with + instead of an input because the value is modified by the interrupt, and you're not allowed to list input registers in the clobber list (because the compiler is forbidden from using clobbered registers for operands).

Resources