Memory limit for mmap - linux-kernel

I am trying to mmap a char device. It works for 65536 bytes. But I get the following error if I try for more memory.
mmap: Resource temporarily unavailable
I want to mmap 1MB memory for a device. I use alloc_chrdev_region, cdev_init, cdev_add for the char device. How can I mmap memory larger than 65K? Should I use block device?

Using the MAP_LOCKED flag in the mmap call can cause this error. The used mlock can return EAGAIN if the amount of memory can not be locked.
From man mmap:
MAP_LOCKED (since Linux 2.5.37) Lock the pages of the mapped region
into memory in the manner of mlock(2). This flag is ignored in older
kernels.
From man mlock:
EAGAIN:
Some or all of the specified address range could not be
locked.

Did you implement *somedevice_mmap()* file operation?
static int somedev_mmap(struct file *filp, struct vm_area_struct *vma)
{
/* Do something. You probably need to use ioremap(). */
return 0;
}
static const struct file_operations somedev_fops = {
.owner = THIS_MODULE,
/* Initialize other file operations. */
.mmap = somedev_mmap,
};

Related

What is the purpose of the function "blk_rq_map_user" in the NVME disk driver?

I am trying to understand the nvme linux drivers. I am now tackling the function nvme_user_submit_cmd, which I report partially here:
static int nvme_submit_user_cmd(struct request_queue *q,
struct nvme_command *cmd, void __user *ubuffer,
unsigned bufflen, void __user *meta_buffer, unsigned meta_len,
u32 meta_seed, u32 *result, unsigned timeout)
{
bool write = nvme_is_write(cmd);
struct nvme_ns *ns = q->queuedata;
struct gendisk *disk = ns ? ns->disk : NULL;
struct request *req;
struct bio *bio = NULL;
void *meta = NULL;
int ret;
req = nvme_alloc_request(q, cmd, 0, NVME_QID_ANY);
[...]
if (ubuffer && bufflen) {
ret = blk_rq_map_user(q, req, NULL, ubuffer, bufflen,
GFP_KERNEL);
[...]
The ubufferis a pointer to some data in the virtual address space (since this comes from an ioctl command from a user application).
Following blk_rq_map_user I was expecting some sort of mmap mechanism to translate the userspace address into a physical address, but I can't wrap my head around what the function is doing. For reference here's the call chain:
blk_rq_map_user -> import_single_range -> blk_rq_map_user_iov
Following those function just created some more confusion for me and I'd like some help.
The reason I think that this function is doing a sort of mmap is (apart from the name) that this address will be part of the struct request in the struct request queue, which will eventually be processed by the NVME disk driver (https://lwn.net/Articles/738449/) and my guess is that the disk wants the physical address when fulfilling the requests.
However I don't understand how is this mapping done.
ubuffer is a user virtual address, which means it can only be used in the context of a user process, which it is when submit is called. To use that buffer after this call ends, it has to be mapped to one or more physical addresses for the bios/bvecs. The unmap call frees the mapping after the I/O completes. If the device can't directly address the user buffer due to hardware constraints then a bounce buffer will be mapped and a copy of the data will be made.
Edit: note that unless a copy is needed, there is no kernel virtual address mapped to the buffer because the kernel never needs to touch the data.

how to read a register in device driver?

in a linux device driver, in the init function for the device, I tried reading an address (which is SMMUv3 device for arm64) like below.
uint8_t *addr1;
addr1 = ioremap(0x09050000, 0x20000);
printk("SMMU_AIDR : 0x%X\n", *(addr1 + 0x1c));
but I get Internal error: synchronous external abort: 96000010 [#1] SMP error.
Is it not permitted to map an address to virtual address using ioremap and just reading that address?
I gave a fixed value 0x78789a9a to SMMU IDR[2] register. (at offset 0x8, 32 bit register. This is possible because it's qemu.)
SMMU starts at 0x09050000 and it has address space 0x20000.
__iomem uint32_t *addr1 = NULL;
static int __init my_driver_init(void)
{
...
addr1 = ioremap(0x09050000, 0x20000); // smmuv3
printk("SMMU_IDR[2] : 0x%X\n", readl(addr1 +0x08/4));
..}
This is the output when the driver is initialized.(The value is read ok)
[ 453.207261] SMMU_IDR[2] : 0x78789A9A
The first problem was that the access width was wrong for that address. Before, it was defined as uint8_t *addr1; and I used printk("SMMU_AIDR : 0x%X\n", *(addr1 + 0x1c)) so it was reading byte when it was not allowed by the SMMU model.
Second problem (I think this didn't cause the trap because arm64 provides memory mapped io) was that I used memory access(pointer dereferencing) for memory mapped IO registers. As people commented, I should have used readl function. (Mainly because to make the code portable. readl works also for iomap platforms like x86_64. using the mmio adderss as pointer will not work on such platforms. I later found that readl function takes care of the memory barrier problem too).
ADD : I fixed volatile to __iomem for variable addr1.(thanks #0andriy)

Freeing platform driver device struct [duplicate]

I have found devm_kzalloc() and kzalloc() in device driver programmong. But I don't know when/where to use these functions. Can anyone please specify the importance of these functions and their usage.
kzalloc() allocates kernel memory like kmalloc(), but it also zero-initializes the allocated memory. devm_kzalloc() is managed kzalloc(). The memory allocated with managed functions is associated with the device. When the device is detached from the system or the driver for the device is unloaded, that memory is freed automatically. If multiple managed resources (memory or some other resource) were allocated for the device, the resource allocated last is freed first.
Managed resources are very helpful to ensure correct operation of the driver both for initialization failure at any point and for successful initialization followed by the device removal.
Please note that managed resources (whether it's memory or some other resource) are meant to be used in code responsible for the probing the device. They are generally a wrong choice for the code used for opening the device, as the device can be closed without being disconnected from the system. Closing the device requires freeing the resources manually, which defeats the purpose of managed resources.
The memory allocated with kzalloc() should be freed with kfree(). The memory allocated with devm_kzalloc() is freed automatically. It can be freed with devm_kfree(), but it's usually a sign that the managed memory allocation is not a good fit for the task.
In simple words devm_kzalloc() and kzalloc() both are used for memory allocation in device driver but the difference is if you allocate memory by kzalloc() than you have to free that memory when the life cycle of that device driver is ended or when it is unloaded from kernel but if you do the same with devm_kzalloc() you need not to worry about freeing memory,that memory is freed automatically by device library itself.
Both of them does the exactly the same thing but by using devm_kzalloc little overhead of freeing memory is released from programmers
Let explain you by giving example, first example by using kzalloc
static int pxa3xx_u2d_probe(struct platform_device *pdev)
{
int err;
u2d = kzalloc(sizeof(struct pxa3xx_u2d_ulpi), GFP_KERNEL); 1
if (!u2d)
return -ENOMEM;
u2d->clk = clk_get(&pdev->dev, NULL);
if (IS_ERR(u2d->clk)) {
err = PTR_ERR(u2d->clk); 2
goto err_free_mem;
}
...
return 0;
err_free_mem:
kfree(u2d);
return err;
}
static int pxa3xx_u2d_remove(struct platform_device *pdev)
{
clk_put(u2d->clk);
kfree(u2d); 3
return 0;
}
In this example you can this in funtion pxa3xx_u2d_remove(),
kfree(u2d)(line indicated by 3) is there to free memory allocated by u2d
now see the same code by using devm_kzalloc()
static int pxa3xx_u2d_probe(struct platform_device *pdev)
{
int err;
u2d = devm_kzalloc(&pdev->dev, sizeof(struct pxa3xx_u2d_ulpi), GFP_KERNEL);
if (!u2d)
return -ENOMEM;
u2d->clk = clk_get(&pdev->dev, NULL);
if (IS_ERR(u2d->clk)) {
err = PTR_ERR(u2d->clk);
goto err_free_mem;
}
...
return 0;
err_free_mem:
return err;
}
static int pxa3xx_u2d_remove(struct platform_device *pdev)
{
clk_put(u2d->clk);
return 0;
}
there is no kfree() to free function because the same is done by devm_kzalloc()

Basic mmap implementation

I am pretty much new to linux and am writing a custom driver for which I need to use mmap. I have found the sample code in many places as below
static int simple_remap_mmap(struct file *filp, struct vm_area_struct *vma)
{
if (remap_pfn_range(vma, vma->vm_start, vm->vm_pgoff,
vma->vm_end - vma->vm_start,
vma->vm_page_prot))
return -EAGAIN;
vma->vm_ops = &simple_remap_vm_ops;
simple_vma_open(vma);
return 0;
}
But I want to know where exactly does this code specify which portion of kernel memory to map e.g if I have allocated a buffer in the kernel using kmalloc pointed to by *kptr how can i mmap that buffer to user space. Thank you.

Find out the process name by pid in osx kernel extension

I am working on kernel extension and want to find out how to find process name by pid in kernel extension
This code works great in user space
static char procdata[4096];
int mib[3] = { CTL_KERN, KERN_PROCARGS, pid };
procdata[0] = '\0'; // clear
size_t size = sizeof(procdata);
if (sysctl(mib, 3, procdata, &size, NULL, 0)) {
return ERROR(ERROR_INTERNAL);
}
procdata[sizeof(procdata)-2] = ':';
procdata[sizeof(procdata)-1] = '\0';
ret = procdata;
return SUCCESS;
but for the kernel space, there are errors such as "Use of undeclared identifier 'CTL_KERN'" (even if I add #include )
What is the correct way to do it in kernel extension?
The Kernel.framework header <sys/proc.h> is what you're looking for.
In particular, you can use proc_name() to get a process's name given its PID:
/* this routine copies the process's name of the executable to the passed in buffer. It
* is always null terminated. The size of the buffer is to be passed in as well. This
* routine is to be used typically for debugging
*/
void proc_name(int pid, char * buf, int size);
Note however, that the name will be truncated to MAXCOMLEN - 16 bytes.
You might also be able to use the sysctl via sysctlbyname() from the kernel. In my experience, that function doesn't work well though, as the sysctl buffer memory handling isn't expecting buffers in kernel address space, so most types of sysctl will cause a kernel panic if called from a non-kernel thread. It also doesn't seem to work for all sysctls.

Resources