Why doesn't free execute munmap?

Why doesn't free execute munmap? - linux-kernel

I have the following code:
unsigned char *p = (unsigned char *)valloc(page_size);
if (!p) {
ret = -1;
goto out;
}
printf("valloc: allocated %d bytes, virtual address: %p\n", page_size, p);
memset(p, 0xFF, page_size);
memcpy(p, s, sizeof(s));
trace_mem(p, sizeof(s));
printf("Memory: %p - press any key\n", p);
getchar();
if (ioctl(fd, MY_IOC_PATCH) == -1) {
fprintf(stderr, "ioctl %s error(%d): %s\n ", "MY_IOC_PATCH", errno, strerror(errno));
ret = -1;
goto out;
}
if (p) {
printf("free: freed %d bytes, virtual address: %p\n", page_size, p);
free(p);
}
.........................
Then I use strace to observe system calls: strace ./my_program I get the following:
fstat64(1, {st_mode=S_IFREG|0644, st_size=1533, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7730000
brk(0) = 0x9d81000
brk(0x9da4000) = 0x9da4000
fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb772f000
read(0, "\n", 1024) = 1
ioctl(3, RTC_IRQP_SET, 0x1000) = 0
read(0, "\n", 1024) = 1
ioctl(3, RTC_EPOCH_READ, 0x9d82000) = 0
read(0, "\n", 1024) = 1
close(3) = 0
valloc: allocated 4096 bytes, virtual address: 0x9d82000
After the first IOCTL I don't see munlock. I suppose that free must use munlock to unmap memory, but it doesn't cause. What is the reason for that?

I think that Paramagnetic Croissant's comment, above, qualifies as "the Answer" to this one. It is ordinary practice for malloc() implementations to ask the operating-system for more memory when they need it, but then to never give it back. For any operating-system.
You see, there's really no need to "give it back." Pestering the kernel, asking him to carve out more VM-space and to update the memory-management data structures, is a comparatively expensive operation. But, it doesn't really "cost" much to keep the storage around. (The cost of "releasing them" doesn't gain you anything, especially if you turn right around and have to ask for them again!) So, you just do it once.
If you stop using those pages, they'll eventually get swapped-out, and the physical resource (page frames) will automagically get used for other purposes. "No harm, no foul." But then, if you then suddenly start using that storage again, there's no reason to "pester the kernel" a second (or third) time. The pages just get swapped-in again, and off you go.

malloc/valloc(page size variant of malloc) actually gets the memory addresses from virtual address space. These addresses have mapping to physical address by way of page tables that are specific to a particular process.Thence in my opinion all kernel has to do in case of [vm]alloc is:
1) Attach an anonymous segment to the process.
2) Associate a bunch of virtual address (heap area) entries with physical pages, of course on first use.
In case of "free" it just needs to disassociate the virtual memory entries with the physical pages. Note that since these are anonymous pages it aint need to care where the "data" needs to go, while mmaping a file it may need to stage it back to the disk.
The physical pages are tracked and managed by the memory manager independently and is governed by cache principles (hot, cold color etc). Thus there is no question of free trying to give back memory to the kernel. Since all it got was a virtual address. It will give back the virtual address to the glibc library which should maintain virtual address chunks for use by the specific process.

Related

MMAP buffer kernel writes are not seen by user space

i have a kernel driver which shares a buffer with the user space layer.
Everything seemed to work fine in my VM prototype (Ubuntu, Kernel 5.4) but when i moved my code to the target (same kernel but this is an embedded distro) I can clearly see that Kernel writes to the buffer (using memcpy, or memset) are not reflected in the User space side of the buffer.
Note that, i use direct buffer accesses on both sides. There is no concurrency issue, as the Kernel writes to, then the user space reads from.
I ended up believing this is a cache issue ... as the same code works perfectly in my VM.
The buffer size is 4 * PAGE_SIZE.
It is allocated as follows:
int _size = (SFP_BUFFER_SIZE + (PAGE_SIZE-1)) & ~(PAGE_SIZE-1);
input_buffer = (char*) kzalloc (_size, GFP_KERNEL); // aligned on page boundary
if (!input_buffer) {
dev_dbg(&dev, "open/ENOMEM (input_buffer)\n");
status = -ENOMEM;
goto err_all
When mmap'ing, i used the following code pattern:
vma->vm_ops = &fpgadrv_vm_ops;
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
pfn = virt_to_phys((void*)(input_buffer)) >> PAGE_SHIFT;
if (remap_pfn_range (vma, vma->vm_start, pfn, size, vma->vm_page_prot))
{
printk(KERN_DEBUG "remap page range failed\n");
return -EAGAIN;
}
User space code, and kernel code user memcpy to update the buffer. Note also that I cannot use write/read entry points, as they are already used for very specific operations.
The user code is calling mmap as follows:
buf = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, device_fd, 0);
if (buf == MAP_FAILED)
{
perror("USERDRV:cannot mmap");
return -1; // for testing, ignore the return code and continue
}
and upon IOCTL call, the kernel would fill up the mmap buffer as follows:
case IOCTL_RESET:
printk(KERN_DEBUG "FPGADRV: IOCTL RESET");
// reset the buffer (zero + put back the signature)
memset(input_buffer, 0xA5, SFP_BUFFER_SIZE);
memcpy((void*)(input_buffer), (void*)signature, 10);
break;
Is there something more i should do to make sure the pages are not cached (assuming this is the cause of my pb) ?
Thanks,
Jacques

Simulating (lazy) NAND memory on Windows

I'm running a firmware simulation in a DLL which has simulated NAND (256MB or 1GB). I want to avoid allocating memory for this on the heap and instead allocate using virtual memory.
The memory initially needs to be cleared to 0xFF (like NAND is). However I don't want to pay for that initialization (nor commit un-accessed pages). So ideally it should only allocate upon access. And I do not need to retain the data following exit of the simulation.
Initial ideas are
VirtualAlloc. Not sure but thinking perhaps could use guard page and then trap the exception on first access. Not sure its ideal that a DLL handles such SEH exceptions? Or is there a better way?
Create a big file that's initialized to 0xFF. Then map view of file with copy-on-write.
Anyone know if it is possible to create a file with a callback for providing the initial data?
Think probably 1) the way to go but wondering if that's really the best option.
Edit:
3) I've come up with another method that can avoid exception handler and also avoids creating a huge file:
Create a file that is same size as dwAllocationGranularity (64KiB typically). Fill with 0xFF. Then create multiple copy-on-write views of that in contiguous memory using MapViewOfFileEx + FILE_MAP_COPY (after an initial VirtualAlloc/VirtualFree to get a suitable base address that we can hope to allocate juxtapositioned views). Need to test this a bit more fully - slight concern about potential thread races.. I'm ony actually using a single thread but the CRT does start a few too.
This means that any code that only reads the virtual NAND also does not result in all pages getting committed.

yes, basically 1 is best solution. only i be do next changes - use VEH instead SEH - SEH handler will be called only if you access memory inside it, when in case VEH - access can be ai any context and thread. and instead use guard page, i be initial only reserve region of memory without real allocation. so any access to memory region lead to exception, you handle it in VEH - commit memory and fill with 0xFF pattern. demo code
PVOID g_NandBegin;
SIZE_T g_NandSize = 0x1000000;
LONG NTAPI Vex(::PEXCEPTION_POINTERS ExceptionInfo)
{
::PEXCEPTION_RECORD ExceptionRecord = ExceptionInfo->ExceptionRecord;
if (ExceptionRecord->ExceptionCode == STATUS_ACCESS_VIOLATION &&
ExceptionRecord->NumberParameters > 1)
{
PVOID pv = (PVOID)ExceptionRecord->ExceptionInformation[1];
if ((ULONG_PTR)pv - (ULONG_PTR)g_NandBegin < g_NandSize)
{
SIZE_T RegionSize = 1;
if (0 <= NtAllocateVirtualMemory(NtCurrentProcess(), &pv, 0, &RegionSize, MEM_COMMIT, PAGE_READWRITE))
{
RtlFillMemoryUlong(pv, RegionSize, MAXULONG);
return EXCEPTION_CONTINUE_EXECUTION;
}
}
}
return EXCEPTION_CONTINUE_SEARCH;
}
void dc()
{
if (PVOID pv = AddVectoredExceptionHandler(TRUE, Vex))
{
if (g_NandBegin = VirtualAlloc(0, g_NandSize, MEM_RESERVE, PAGE_READWRITE))
{
ULONG seed = ~GetTickCount();
int n = 0x100;
do
{
if (*(UCHAR*)((PBYTE)g_NandBegin + (((ULONG64)RtlRandomEx(&seed) * g_NandSize) >> 32)) != 0xFF)
{
__debugbreak();
}
} while (--n);
VirtualFree(g_NandBegin, 0, MEM_RELEASE);
}
RemoveVectoredExceptionHandler(pv);
}
}

How to use mmap for devmemon MT7620n

I`m trying to access the GPIOs of a MT7620n via register settings. So far I can access them by using /sys/class/gpio/... but that is not fast enough for me.
In the Programming guide of the MT7620 page 84 you can see that the GPIO base address is at 0x10000600 and the single registers have an offset of 4 Bytes.
MT7620 Programming Guide
Something like:
devmem 0x10000600
from the shell works absolutely fine but I cannot access it from inside of a c Programm.
Here is my code:
#define GPIOCHIP_0_ADDDRESS 0x10000600 // base address
#define GPIO_BLOCK 4
volatile unsigned long *gpiochip_0_Address;
int gpioSetup()
{
int m_mfd;
if ((m_mfd = open("/dev/mem", O_RDWR)) < 0)
{
printf("ERROR open\n");
return -1;
}
gpiochip_0_Address = (unsigned long*)mmap(NULL, GPIO_BLOCK, PROT_READ|PROT_WRITE, MAP_SHARED, m_mfd, GPIOCHIP_0_ADDDRESS);
close(m_mfd);
if(gpiochip_0_Address == MAP_FAILED)
{
printf("mmap() failed at phsical address:%d %s\n", GPIOCHIP_0_ADDDRESS, strerror(errno));
return -2;
}
return 0;
}
The Output I get is:
mmap() failed at phsical address:268436992 Invalid argument
What do I have to take care of? Do I have to make the memory accessable before? I´m running as root.
Thanks
EDIT
Peter Cordes is right, thank you so much.
Here is my final solution, if somebody finds a bug, please tell me ;)
#define GPIOCHIP_0_ADDDRESS 0x10000600 // base address
volatile unsigned long *gpiochip_0_Address;
int gpioSetup()
{
const size_t pagesize = sysconf(_SC_PAGE_SIZE);
unsigned long gpiochip_pageAddress = GPIOCHIP_0_ADDDRESS & ~(pagesize-1); //get the closest page-sized-address
const unsigned long gpiochip_0_offset = GPIOCHIP_0_ADDDRESS - gpiochip_pageAddress; //calculate the offset between the physical address and the page-sized-address
int m_mfd;
if ((m_mfd = open("/dev/mem", O_RDWR)) < 0)
{
printf("ERROR open\n");
return -1;
}
page_virtual_start_Address = (unsigned long*)mmap(NULL, pagesize, PROT_READ|PROT_WRITE, MAP_SHARED, m_mfd, gpiochip_pageAddress);
close(m_mfd);
if(page_virtual_start_Address == MAP_FAILED)
{
printf("ERROR mmap\n");
printf("mmap() failed at phsical address:%d %d\n", GPIOCHIP_0_ADDDRESS, strerror(errno));
return -2;
}
gpiochip_0_Address = page_virtual_start_Address + (gpiochip_0_offset/sizeof(long));
return 0;
}

mmap's file offset argument has to be page-aligned, and that's one of the documented reasons for mmap to fail with EINVAL.
0x10000600 is not a multiple of 4k, or even 1k, so that's almost certainly your problem. I don't think any systems have pages as small as 512B.
mmap a whole page that includes the address you want, and access the MMIO registers at an offset within that page.
Either hard-code it, or maybe do something like GPIOCHIP_0_ADDDRESS & ~(page_size-1) to round down an address to a page-aligned boundary. You should be able to do something that gets the page size as a compile-time constant so it still compiles efficiently.

Error while reading other process memory

I'm using ReadProcessMemory to read a single byte out of a process i've created.
Since i'm attaching as a debugger, i'm reading addresses that are being executed now (or in the near past).
but i get a 299 error for ReadProcessMemory via GetLastError() on some addresses only (some works fine..)
On the cases i get an error, i call VirtualQueryEx, and the memInfo protect is 0x1, while the type & baseAddress are 0x0 (but the region size is some normal number), also VirtualQueryEx isn't failing..
if i call VirtualProtectEx for those cases, i get error 487 (Attempt to access invalid address).
i thought maybe the address i'm trying to read is paged out, thus all the errors, but it doesn't seem right since, as i've already mentioned, its an address that was executed recently.
ideas anyone?

You want to loop through all the memory, calling VirtualQueryEx() to make sure the memory is valid before calling ReadProcessMemory()
You need to make sure that MEMORY_BASIC_INFORMATION.State is MEM_COMMIT in most cases.
This whole operation can be easy to screw up, because you didn't supply any code I will provide a working solution that should work in 95% of situations. It's ok for bytesRead to be different than regionSize, you just need to handle that situation correctly. You don't need to take permissions in most cases using VirtualProtect because all valid memory should have read access.
int main()
{
DWORD procid = GetProcId("notepad.exe");
unsigned char* addr = 0;
HANDLE hProc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, procid);
MEMORY_BASIC_INFORMATION mbi;
while (VirtualQueryEx(hProc, addr, &mbi, sizeof(mbi)))
{
if (mbi.State == MEM_COMMIT && mbi.Protect != PAGE_NOACCESS)
{
char* buffer = new char[mbi.RegionSize]{ 0 };
SIZE_T bytesRead = 0;
if (ReadProcessMemory(hProc, addr, buffer, mbi.RegionSize, &bytesRead))
{
if (bytesRead)
{
//scan from buffer[0] to buffer[bytesRead]
}
else
{
//scan from buffer[0] to buffer[mbi.RegionSize]
}
}
delete[] buffer;
}
addr += mbi.RegionSize;
}
CloseHandle(hProc);
}
Reminder: This is just a PoC to teach you the concept.

mmap /dev/mem, read performance is very slow

I have written a test program which is like this:
fd = open("/dev/mem", O_RDWR);
src = mmap(0x0, 0x1000000, PROT_READ, MAP_SHARED, fd, 0x80000000);/* 0x80000000 is physical start address of DDR on my A8-cortex platform */
dst = malloc(0x1000000);
start_time = get_time()
memcpy(dst, src, 0x1000000);
end_time = get_time();
print_speed();
On my ARM A8-cortex based board, it gives me about 400MB/s. Then I changed the above test program, src buffer is also alllocated by malloc, test again, now it gives me about 1400MB/s, about 3~4X faster.
I try to figure out the reason. First, I suspect that the src memory is uncached through mmap, so I check out the code in driver/cha/mem.c in kernel. In mmap_mem function, I use printk to print the page attribute of maped address, vma->vm_pgoff shows 0x10f, so it is not uncached.
Further more, I change the code and set it uncached type through vma->vm_pgoff = pgprot_nocached(vma->vm_pgoff) and test again, the result is about 30MB/s. So, we can defenitly confirm that /dev/mem maped memory is surely cached, but its read performance is very slow as compared to malloced memory.
Then how to explain this test result?

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio