Linux kernel driver: physical CPU memory not updated (DMA problem)

I'm using an Orange Pi 3 LTS with an Allwinner H6 ARM CPU, and I'm writing a UART driver that uses DMA for Rx and Tx. I allocated RAM with kmalloc(), which gives me a kernel logical address, and I obtained the corresponding physical address, so I know both the physical address the DMA controller should use and the logical address the kernel driver uses.

The problem is that the physical memory does not seem to be updated after I write through the logical address. For example, my kernel driver has an init() callback (run when the driver is attached to the kernel) and an exit() callback (run when it is detached). In init() I allocate the buffer with kmalloc(), fill it with data through the logical address (the kernel cannot dereference a physical address directly), and then trigger one of the DMA channels by writing to the CPU registers. The DMA engine should fetch a descriptor (a pointer) from physical RAM and transmit data over the UART. But the physical memory appears not to be updated within that init() call: only the logical view is updated, and the registers show that the DMA engine read wrong data.

However, if in init() I only fill the descriptor data into RAM and trigger the DMA later, for example in the exit() callback, it works: physical RAM contains the correct data and the data is sent over the UART as expected.

I don't understand this. Why, within a single kernel callback such as init(), is my write visible only through the logical address? Why doesn't the kernel update physical memory (through the MMU) immediately after the write to the logical address, instead of only after the init() callback returns?
Everything I have tried is described in the problem description above.

I studied the Linux DMA API documentation and finally found the solution.
As was pointed out in the comments, the problem was cache coherency: the CPU writes landed in the data cache, not in RAM, so the DMA engine (which bypasses the cache) read stale data.
Instead of kmalloc(), memory for DMA should be allocated with dma_alloc_coherent(), which returns a kernel logical (virtual) address and, through its third argument, the corresponding DMA (bus/physical) address of a non-cached, coherent mapping. The buffer is later released with dma_free_coherent().
Here is my example/test code, which works for me: physical memory is now updated immediately together with the logical view inside kernel memory space. It allocates 1024 bytes of RAM.
#include <linux/dma-mapping.h>

static struct device *dev;      /* set elsewhere, e.g. from the platform device */
static dma_addr_t physical_address;
static unsigned int *logical_address;

static void ptr_init(void)
{
	/* Sets both dev->dma_mask and dev->coherent_dma_mask; there is no
	 * need to point dev->dma_mask at a local variable (doing so leaves
	 * a dangling pointer once this function returns). */
	if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)) != 0)
		printk("Mask not OK\n");
	else
		printk("Mask OK\n");

	logical_address = dma_alloc_coherent(dev, 1024, &physical_address, GFP_KERNEL);
	if (logical_address != NULL)
		printk("allocation OK\n");
	else
		printk("allocation NOT OK\n");

	printk("logical address: %p\n", logical_address);
	printk("physical address: %pad\n", &physical_address);
}
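For completeness: if the buffer must remain kmalloc()-allocated (coherent memory can be slower for CPU-heavy access), the other standard fix for this cache-coherency problem is the streaming DMA API, which performs the cache maintenance explicitly at mapping time. A minimal sketch, assuming dev and a kmalloc()'d buf already exist (streaming_tx is a hypothetical helper, not from the original post):

```c
#include <linux/dma-mapping.h>

/* Streaming mapping: the CPU cache is flushed/invalidated explicitly
 * when the buffer is handed over to the device and back. */
static int streaming_tx(struct device *dev, void *buf, size_t len)
{
	dma_addr_t dma_handle;

	/* Flushes the CPU cache lines covering buf so that RAM really
	 * holds the data the CPU wrote before the device reads it. */
	dma_handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, dma_handle))
		return -ENOMEM;

	/* ... program dma_handle into the DMA controller and start it ... */

	/* After the DMA transfer has completed: */
	dma_unmap_single(dev, dma_handle, len, DMA_TO_DEVICE);
	return 0;
}
```

Between dma_map_single() and dma_unmap_single() the buffer belongs to the device; the CPU must not touch it without first calling the dma_sync_single_for_cpu()/dma_sync_single_for_device() pair.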

Related

Limit Linux DMA allocation to specific range

I am working on an SoC whose RAM starts at 0x200M.
For some reason I need to limit DMA allocations to at most 0x220M, so I called dma_set_mask_and_coherent(dev, 0x21FFFFFFF).
I noticed that
dev->dma_mask was not set, while
dev->coherent_dma_mask was set.
Still, calling dma_alloc_coherent returns buffers above the requested limit.
How can I limit the buffer address?

Theoretical question: Fetching a 32-bit variable from 128 bit addressable memory

Suppose I have a processor running the Linux kernel, connected to memory over a 128-bit bus, where one address returns the full 128 bits of data across the bus to the processor.
If I do a 32-bit assignment in user space:
int a = *p;
where p points into that memory, the access will bring the full 128 bits over the bus to the processor.
Is the kernel able to translate this kind of access from user space? What would happen?

Issue Getting Physical Address in Kernel Module - Can't MMAP to that address

I am implementing DMA through memory mapped registers and need the physical address of the buffers used to pass on to the DMA device.
I allocate my memory using a kernel module, then pass the physical address on to the userspace application driving the DMA. I have tried with and without the Page Shift.
float* buffer = (float*) kcalloc(512, sizeof(float), GFP_DMA); //allocate physical memory for DMA
unsigned long physAddr = virt_to_phys(buffer) >> PAGE_SHIFT;
When I write these addresses to my DMA device, no data moves, and when I attempt to mmap() them via /dev/mem the resulting pointers are invalid (segfault on read).
How should I be getting these addresses for the allocated memory?

Why does memset make the virtual memory so large?

I have a process that does a lot of lithography calculation, so I use mmap() to allocate memory for a memory pool. When the process needs a large chunk of memory, I mmap() a chunk; after use I put it in the memory pool, and if a chunk of the same size is needed again, it is taken from the pool directly instead of being mapped again. (I do not allocate all the needed memory and fill the pool at the start of the process.) Between the mmap() calls there are some allocations that do not use mmap(), such as malloc() or new.
Now the question is:
if I use memset() to zero the whole chunk before putting it in the memory pool, the process uses far too much virtual memory, shown below in the format "mmap(size)=virtual address":
mmap(4198400)=0x2aaab4007000
mmap(4198400)=0x2aaab940c000
mmap(8392704)=0x2aaabd80f000
mmap(8392704)=0x2aaad6883000
mmap(67112960)=0x2aaad7084000
mmap(8392704)=0x2aaadb085000
mmap(2101248)=0x2aaadb886000
mmap(8392704)=0x2aaadba89000
mmap(67112960)=0x2aaadc28a000
mmap(2101248)=0x2aaae028b000
mmap(2101248)=0x2aaae0c8d000
mmap(2101248)=0x2aaae0e8e000
mmap(8392704)=0x2aaae108f000
mmap(8392704)=0x2aaae1890000
mmap(4198400)=0x2aaae2091000
mmap(4198400)=0x2aaae6494000
mmap(8392704)=0x2aaaea897000
mmap(8392704)=0x2aaaeb098000
mmap(2101248)=0x2aaaeb899000
mmap(8392704)=0x2aaaeba9a000
mmap(2101248)=0x2aaaeca9c000
mmap(8392704)=0x2aaaec29b000
mmap(8392704)=0x2aaaecc9d000
mmap(2101248)=0x2aaaed49e000
mmap(8392704)=0x2aaafd6a7000
mmap(2101248)=0x2aacc5f8c000
The mmap last - first = 0x2aacc5f8c000 - 0x2aaab4007000 = 8.28G
But if I don't call memset() before putting chunks in the memory pool:
mmap(4198400)=0x2aaab4007000
mmap(8392704)=0x2aaab940c000
mmap(8392704)=0x2aaad2480000
mmap(67112960)=0x2aaad2c81000
mmap(2101248)=0x2aaad6c82000
mmap(4198400)=0x2aaad6e83000
mmap(8392704)=0x2aaadb288000
mmap(8392704)=0x2aaadba89000
mmap(67112960)=0x2aaadc28a000
mmap(2101248)=0x2aaae0a8c000
mmap(2101248)=0x2aaae0c8d000
mmap(2101248)=0x2aaae0e8e000
mmap(8392704)=0x2aaae1890000
mmap(8392704)=0x2aaae108f000
mmap(4198400)=0x2aaae2091000
mmap(4198400)=0x2aaae6494000
mmap(8392704)=0x2aaaea897000
mmap(8392704)=0x2aaaeb098000
mmap(2101248)=0x2aaaeb899000
mmap(8392704)=0x2aaaeba9a000
mmap(2101248)=0x2aaaec29b000
mmap(8392704)=0x2aaaec49c000
mmap(8392704)=0x2aaaecc9d000
mmap(2101248)=0x2aaaed49e000
The mmap last - first = 0x2aaaed49e000 - 0x2aaab4007000= 916M
So the first process runs out of memory and is killed.
In this process, an mmap()ed chunk may not be fully used, or even used at all, after it is allocated. For example, before calibration the process mmaps 67112960 bytes (64M) and then never reads or writes the region, or touches only the first 2M, before putting it in the pool.
I know mmap() only returns a virtual address and that physical memory is allocated lazily, on the first read or write of those addresses.
What confuses me is why the virtual address range grows so much. I'm on CentOS 5.3 with kernel 2.6.18, and I tried the process with both libhoard and glibc (ptmalloc); both show the same behavior.
Has anyone met the same issue before? What is the possible root cause?
Thanks.
VMAs (virtual memory areas, AKA memory mappings) do not need to be contiguous, so a large span between the first and last mapping does not mean that much memory is in use. Your first example actually uses ~256 MB, the second ~246 MB.
Common malloc() implementations use mmap() automatically for large allocations (above a threshold, 128 KiB by default in glibc), freeing the corresponding chunks with munmap(). So you do not need to call mmap() manually for large allocations; your malloc() library will take care of that.
When you mmap() anonymous memory, the kernel returns a COW copy of a special zero page, so it doesn't allocate physical memory until a page is written to. Your zeroing with memset() forces memory to really be allocated; it is better to just return chunks to the allocator and request new ones when you need them.
Conclusion: don't write your own memory management unless the system one has proven inadequate for your needs, and even then use your own only once you have shown it to be noticeably better under a real-life load.

What is the need for Logical Memory?

While studying operating systems I came across the concept of logical memory. Why is there a need for logical memory? How does a CPU generate a logical address? Is the output of the &ptr expression a logical or a physical address? Are logical memory and virtual memory the same thing?
In C and C++, &ptr yields an address as the program sees it, which under a modern OS is a virtual (logical) address, never a physical one. And the CPU does not generate memory; it generates addresses.
On x86 CPUs there are several layers in address calculations and translations. x86 programs operate with logical addresses comprised of two items: a segment selector (this one isn't always specified explicitly in instructions and may come from the cs, ds, ss or es segment register) and an offset.
The segment selector is then translated into a segment base address. In real address mode and in virtual-8086 mode this is done directly (the selector is multiplied by 16); in protected mode the selector is used as an index into a segment descriptor table (the global or local descriptor table, GDT or LDT), from which the base address is pulled.
Then the sum segment base address + offset forms a linear address (AKA virtual address).
If the CPU is in the real address mode, that's the final, physical address.
If the CPU is in the protected mode (or virtual 8086), that linear/virtual address can be further translated into the physical address by means of page tables (if page translation is enabled, of course, otherwise, it's the final physical address as well).
Physical memory is your RAM or ROM (or flash). Virtual memory is physical memory extended by the space of disk storage (could be flash as well as we now have SSDs).
You really need to read up on this topic.
