With CONFIG_FSL_BOOKE (P1020 RDB) and kernel 2.6.31 I need to reserve 1MB of RAM at some fixed location (it doesn't matter where) that is pristine, meaning it is not touched by U-Boot or the bootmem allocator, so that the RAM contents survive a warm reboot. The caveat is that I cannot change U-Boot to use CONFIG_PRAM/mem=.
Compiling a relocatable kernel is not an option in arch/powerpc 2.6.31, and memmap= is not supported in arch/powerpc/kernel/setup_32.c.
Ideally this area should be reserved and not L1 d-cached, so that it can be used to store ramoops from interrupt context.
Is there any way to move _end out to 0x600000 before bootmem to create a hole that is not touched by anyone? That is, to trick the kernel into thinking that _end is farther out?
In vmlinux.lds.S I tried something like:
. = ALIGN(PAGE_SIZE);
_end = . ;
PROVIDE32 (end = .);
Changed to
. = ALIGN(PAGE_SIZE);
_start_unused_ram = .;
. = ALIGN(0x400000);
_end = . ;
PROVIDE32 (end = .);
However, the area between __bss_stop and 0x400000 was overwritten.
The best solution would be to add the area of memory as a reserved region in the device tree.
That way it will be reserved early during boot and should not be touched by the kernel.
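For example, a 1MB block can be carved out with a /memreserve/ entry at the top of the board's .dts; on powerpc the kernel reads the FDT reserve map early, before the bootmem allocator is populated, and keeps its hands off that range. The address below is only a placeholder and would have to be chosen to fit the P1020 RDB memory map:
/dts-v1/;
/* Placeholder address/size: reserve 1MB at physical 0x00600000. */
/memreserve/ 0x00600000 0x00100000;
/ {
        /* ... rest of the P1020 RDB device tree ... */
};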
I'm working on a driver where ranges of device memory are mapped into user space (via an IOCTL) for the application to write to. This works:
vma->vm_flags |= VM_DONTCOPY;
vma->vm_flags |= VM_DONTEXPAND;
down_write(&current->mm->mmap_sem);
ret = vm_iomap_memory(vma, from, sz_required);
up_write(&current->mm->mmap_sem);
where from is a physical address obtained from pci_resource_start() with some offset added to it.
The application also needs to read from the device, so I increase the size of the region mmapped by the application by PAGE_SIZE, allocate a page with dma_alloc_coherent(), and try to insert it at the end of the vma, but that returns EBUSY. What am I doing wrong? I should be able to stitch together multiple physical ranges into a single vma, both real memory and device memory, or is that not supported?
In the new code a page is allocated like this; dma_addr is passed to the device so it knows where to write:
dma = dma_alloc_coherent(&device, PAGE_SIZE, &dma_addr, GFP_KERNEL);
memset(dma, 0xfe, PAGE_SIZE);
set_memory_wb((unsigned long)dma, 1);
And the mapping code is changed to:
vma->vm_flags |= VM_DONTCOPY;
vma->vm_flags |= VM_DONTEXPAND;
vma->vm_flags |= VM_MIXEDMAP;
down_write(&current->mm->mmap_sem);
ret = vm_iomap_memory(vma, from, sz_required);
up_write(&current->mm->mmap_sem);
down_write(&current->mm->mmap_sem);
ret = vm_insert_page(vma, vma->vm_end - PAGE_SIZE, virt_to_page(dma));
up_write(&current->mm->mmap_sem);
The kernel is 4.15 on x86_64
Got it working by following the "hack" in "Map multiple kernel buffer into contiguous userspace buffer?".
Before vm_iomap_memory() I decrement vma->vm_end by PAGE_SIZE and restore the old value afterwards. I also switched from dma_alloc_coherent() to alloc_page() followed by dma_map_page().
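A minimal sketch of that workaround, assuming from, sz_required and dma_page are set up elsewhere as in the question (the ioctl plumbing, error paths and names are placeholders):
#include <linux/mm.h>
#include <linux/sched.h>

static int map_device_and_page(struct vm_area_struct *vma,
                               phys_addr_t from, size_t sz_required,
                               struct page *dma_page)
{
        unsigned long saved_end = vma->vm_end;
        int ret;

        vma->vm_flags |= VM_DONTCOPY | VM_DONTEXPAND | VM_MIXEDMAP;

        down_write(&current->mm->mmap_sem);

        /* vm_iomap_memory() maps the whole vma and fails if the vma is
         * larger than the physical range, so hide the last page while the
         * device range is mapped, then restore vm_end. */
        vma->vm_end -= PAGE_SIZE;
        ret = vm_iomap_memory(vma, from, sz_required);
        vma->vm_end = saved_end;

        /* Back the final page of the vma with the CPU page that was
         * allocated with alloc_page() and handed to the device via
         * dma_map_page(). */
        if (!ret)
                ret = vm_insert_page(vma, vma->vm_end - PAGE_SIZE, dma_page);

        up_write(&current->mm->mmap_sem);
        return ret;
}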
Not a solution I'm satisfied with, though. There has to be a better way, perhaps a fault handler in vm_ops? Although that seems counter-productive considering I know exactly what I will be mapping and where.
It appears to be working on x86_64 and aarch64
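For reference, the vm_ops fault-handler idea mentioned above might look roughly like this on 4.15 (my_priv, last_pgoff and dma_page are hypothetical names; only the last page of the vma would be backed this way):
#include <linux/mm.h>

struct my_priv {
        struct page *dma_page;    /* page shared with the device */
        unsigned long last_pgoff; /* page offset of the last page in the vma */
};

static int my_vm_fault(struct vm_fault *vmf)
{
        struct my_priv *priv = vmf->vma->vm_private_data;

        if (vmf->pgoff != priv->last_pgoff)
                return VM_FAULT_SIGBUS;

        /* Hand the pre-allocated page to the fault core; it takes over the
         * reference grabbed here. */
        get_page(priv->dma_page);
        vmf->page = priv->dma_page;
        return 0;
}

static const struct vm_operations_struct my_vm_ops = {
        .fault = my_vm_fault,
};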
I'd like to create a program with a special section at the end of virtual memory, so I wanted to write a linker script something like this:
/* ... */
.section_x 0xffff0000 : {
_start_section_x = .;
. = . + 0xffff;
_end_section_x = .;
}
The problem is that gcc/ld/glibc seem to place the stack at this location by default for a 32-bit application, even if it overlaps a known section. The above code zeroes out the stack, causing an exception. Is there any way to tell the linker to use another virtual memory location for the stack? (I'd also like to ensure the heap doesn't span this section of virtual memory...)
I hate answers that presume or even ask whether the question is wrong but, if you need a 64k segment, why can't you just allocate one at startup?
Why could you possibly need a fixed address within your process address space? I've been doing a lot of different kinds of coding for almost 30 years, and I haven't seen the need for a fixed address since the advent of protected memory.
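As a minimal sketch of that suggestion (the size and the section_x name are just placeholders), a 64k region can be grabbed once at startup and its address remembered, instead of hard-wiring 0xffff0000 into the link:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define SECTION_X_SIZE 0x10000   /* 64k */

static void *section_x;          /* set once at startup, used everywhere else */

int main(void)
{
        section_x = mmap(NULL, SECTION_X_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (section_x == MAP_FAILED) {
                perror("mmap");
                return EXIT_FAILURE;
        }
        printf("section_x at %p\n", section_x);
        return 0;
}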
I'm currently learning about ARM64 architecture, and I have trouble understanding how Linux can boot. From the linker script, we can see that:
SECTIONS {
. = PAGE_OFFSET + TEXT_OFFSET;
.head.text : {
_text = .;
HEAD_TEXT
}
the initial _text symbol is very high, something like 0xffffffc000080000. To my knowledge, this address is a physical address, but that seems impossible on a board with 8GB of RAM. How is this possible? Am I missing something?
EDIT: The Linux documentation states that:
- Caches, MMUs
The MMU must be off.
I am a newbie in the Linux kernel. Today I have a question about some Linux kernel 2.6.11 memory management code (please check my code comments for my question) in do_anonymous_pages(); the code slice is the following:
if (write_access) {
pte_unmap(page_table);
spin_unlock(&mm->page_table_lock);
page = alloc_page(GFP_HIGHUSER | __GFP_ZERO);
spin_lock(&mm->page_table_lock);
page_table = pte_offset_map(pmd, addr);
mm->rss++;
entry = maybe_mkwrite(pte_mkdirty(mk_pte(page,
vma->vm_page_prot)), vma);
lru_cache_add_active(page);
SetPageReferenced(page);
set_pte(page_table, entry); /* here we just set the new pte entry */
pte_unmap(page_table); /* why unmap the PTE we just mapped and set?? */
spin_unlock(&mm->page_table_lock);
return VM_FAULT_MINOR;
}
If you read how the page_table was populated in the first place you will see it was pte_offset_map-ed first. It should be no surprise that there is a matching pte_unmap.
The page_table thingy IS NOT the pte thingy which is set here.
Rather, on certain architectures the kernel has a very limited address space. For instance, i386 can address 4GB of memory, typically split into 3GB for userspace and 1GB for the kernel. But all kernel memory typically does not fit into 1GB, so the problem is worked around by temporarily mapping and unmapping various pages as needed. Page tables of userspace processes, as can be seen, are subject to this behaviour. These macros don't map or unmap anything on amd64, which has a big enough address space for the kernel to permanently map all physical memory.
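Roughly, and only as a paraphrase of the 32-bit x86 headers of that era (not an exact copy of the 2.6.11 source), the two flavours look like this: with CONFIG_HIGHPTE the page-table page may live in highmem and has to be temporarily mapped with kmap_atomic() before its PTEs can be touched, and unmapped again afterwards.
#ifdef CONFIG_HIGHPTE
/* Page-table page may be in highmem: map it into a per-CPU fixmap slot. */
#define pte_offset_map(dir, address) \
        ((pte_t *)kmap_atomic(pmd_page(*(dir)), KM_PTE0) + pte_index(address))
#define pte_unmap(pte)  kunmap_atomic((pte), KM_PTE0)
#else
/* Page-table page is in lowmem: it is permanently mapped, nothing to undo. */
#define pte_offset_map(dir, address) \
        ((pte_t *)page_address(pmd_page(*(dir))) + pte_index(address))
#define pte_unmap(pte)  do { } while (0)
#endif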
I have code that fails when calling ioremap() for a 4M region. Trying to debug the reason, I found that ioremap tries to allocate contiguous addresses with a very large alignment (depending on the size of the area you want to map). The code that computes this alignment is in the __get_vm_area_node() function (mm/vmalloc.c) and looks like this:
if (flags & VM_IOREMAP) {
int bit = fls(size);
if (bit > IOREMAP_MAX_ORDER)
bit = IOREMAP_MAX_ORDER;
else if (bit < PAGE_SHIFT)
bit = PAGE_SHIFT;
align = 1ul << bit;
}
On ARM, IOREMAP_MAX_ORDER is defined as 23. This means that in my case ioremap doesn't just need 4M of contiguous addresses in the vmalloc area; the allocation also has to be aligned to 4M.
I wasn't able to find any information on why this alignment is needed. I even tried git blame to see the commit that introduced it, but the code seems to be older than the git history, so I couldn't find anything.
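To see what alignment actually falls out of that snippet for a given request, here is a tiny userspace re-creation of the computation; the PAGE_SHIFT and IOREMAP_MAX_ORDER values are assumptions taken from the question, and fls() below just mimics the kernel helper:
#include <stdio.h>

#define PAGE_SHIFT        12
#define IOREMAP_MAX_ORDER 23   /* value reported in the question for ARM */

/* Mimics the kernel's fls(): 1-based index of the most significant set bit. */
static int fls(unsigned long x)
{
        int bit = 0;

        while (x) {
                bit++;
                x >>= 1;
        }
        return bit;
}

static unsigned long ioremap_align(unsigned long size)
{
        int bit = fls(size);

        if (bit > IOREMAP_MAX_ORDER)
                bit = IOREMAP_MAX_ORDER;
        else if (bit < PAGE_SHIFT)
                bit = PAGE_SHIFT;
        return 1ul << bit;
}

int main(void)
{
        /* A request just under 4M lands on a 4M alignment; an exact 4M
         * (a power of two) bumps fls() one bit higher. */
        printf("align for 4M - 4K: 0x%lx\n", ioremap_align((4ul << 20) - 4096));
        printf("align for 4M:      0x%lx\n", ioremap_align(4ul << 20));
        return 0;
}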