Difference Between Probe and resume functions in Linux - linux-kernel

I am a newbie to Linux. Can someone please explain the differences between these functions and the sequence in which they execute?
I had a look at this question, "Probe method device drivers", and got some idea about probe.
My understanding is that the resume function is called after suspend. Please guide me in understanding the functionality.

The two differ in purpose:
Probe:
Called when the driver is bound to your device for the first time. This happens (a) during boot or (b) when the module is loaded with insmod/modprobe.
Resume:
An optional handler in the driver. You may supply a function for it or leave it out, depending on your driver implementation.
So, in simple words:
- Probe is called only once per device (when the driver is registered and bound to it).
- Resume is called depending on
(a) whether you have supplied a function for the handler, and
(b) if so, each time the system wakes up after a suspend (so it is called n times for n suspends).

I guess there is enough information in the thread you mentioned, but I'll try to explain it in other words.
The probe function is part of the initialization sequence of a Linux device driver. Usually, an init function contains some sort of driver registration call, and one of the Linux layers calls probe() later. Only the driver's author can decide which part of the code should be executed in init() or probe(): it depends on your device's hardware specifications and on the features of the corresponding Linux layer (PCI, SPI, etc.). By the way, your driver is not obliged to use any existing layer, so it is not mandatory to have probe().
Concerning suspend and resume: this pair of functions is needed only when you want to implement energy-saving features for your device. suspend() tells you that you can switch something off (if there is anything to switch off) to save energy. resume() tells you that you should switch it on again. No such options? Then do not implement suspend and resume.
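The call pattern described in these answers can be modelled in plain user-space C. This is only a sketch and not kernel code: `struct fake_driver`, `fake_register()` and `fake_suspend_resume_cycle()` are hypothetical names that stand in for the bus layer and the power-management core. It shows probe running once when the driver is bound, and resume running once per suspend cycle, but only if a handler was supplied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a driver's callbacks; not a kernel API. */
struct fake_driver {
    int (*probe)(void *dev);
    int (*suspend)(void *dev);   /* optional, may be NULL */
    int (*resume)(void *dev);    /* optional, may be NULL */
};

/* Counters so the call sequence can be observed. */
struct fake_device {
    int probe_calls;
    int suspend_calls;
    int resume_calls;
};

int demo_probe(void *dev)   { ((struct fake_device *)dev)->probe_calls++;   return 0; }
int demo_suspend(void *dev) { ((struct fake_device *)dev)->suspend_calls++; return 0; }
int demo_resume(void *dev)  { ((struct fake_device *)dev)->resume_calls++;  return 0; }

/* Models the bus layer binding the driver to a device: probe runs once. */
int fake_register(const struct fake_driver *drv, struct fake_device *dev)
{
    assert(drv != NULL && dev != NULL);
    return drv->probe ? drv->probe(dev) : 0;
}

/* Models one sleep cycle: suspend, then resume, each only if supplied. */
void fake_suspend_resume_cycle(const struct fake_driver *drv,
                               struct fake_device *dev)
{
    assert(drv != NULL && dev != NULL);
    if (drv->suspend)
        drv->suspend(dev);
    if (drv->resume)
        drv->resume(dev);
}

const struct fake_driver demo_driver = { demo_probe, demo_suspend, demo_resume };
```

Calling `fake_register()` once and `fake_suspend_resume_cycle()` three times leaves `probe_calls` at 1 and `resume_calls` at 3. A driver that leaves `suspend` and `resume` as NULL is simply skipped during the cycle.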

Related

How to access physical address during interrupt handler linux

I wrote an interrupt handler in Linux.
As part of the handler logic, I need to access a physical address.
I used the ioremap function, but then I dropped into KDB while the handler was running.
I started to debug it and I saw the code below, which is eventually called by ioremap.
What should I do? Is there any way other than mapping the region beforehand?
If I need to map it beforehand, it means that I will probably need to map and cache a lot of unused area.
BTW, what are the limits for ioremap?
Setting up a new memory mapping is an expensive operation, which typically requires calls to potentially blocking functions (e.g. grabbing locks). So your strategy has two problems:
Calling a blocking function is not possible in your context (there is no kernel thread associated with your interrupt handler, so there is no way for the kernel to resume it if it had to be put to sleep).
Setting up/tearing down a mapping per IRQ would be a bad idea performance-wise (even if we ignore the fact that it can't be done).
Typically, you would set up any mappings you need in a driver's probe() function (or in the module's init() if it's more of a singleton thing). This mapping is then kept in some private device data structure, which is passed as the last argument to some variant of request_irq(), so that the kernel then passes it back as the second argument to the IRQ handler.
Not sure what you mean by "need to map and cache a lot of unused area".
Depending on your particular system, you may end up consuming an entry in your CPU's MMU, or you may just re-use a broader mapping that was set up by whoever wrote the BSP. That's just the cost of doing business on a virtual memory system.
Caching is typically not enabled on I/O memory because of the many side-effects of both reads and writes. For the odd cases when you need it, you have to use ioremap_cached().
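The pattern from this answer (map once at probe time, keep the pointer in private data, get it back in the handler) can be sketched as a user-space model. Everything named `demo_*` here is hypothetical: `demo_request_irq()` only imitates how request_irq() stores the dev_id cookie, and an ordinary array stands in for the region that ioremap() would map in a real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device private data, filled in once at probe time. */
struct demo_priv {
    volatile uint32_t *regs;  /* stands in for the pointer ioremap() returns */
};

typedef int (*demo_irq_handler_t)(int irq, void *dev_id);

/* Hypothetical stand-in for the kernel's IRQ table: one handler, one cookie. */
static demo_irq_handler_t registered_handler;
static void *registered_dev_id;

/* Models request_irq(): remembers the handler and the opaque dev_id cookie. */
int demo_request_irq(demo_irq_handler_t handler, void *dev_id)
{
    assert(handler != NULL);
    registered_handler = handler;
    registered_dev_id = dev_id;
    return 0;
}

/* Models the hardware raising the interrupt. */
int demo_raise_irq(int irq)
{
    assert(registered_handler != NULL);
    return registered_handler(irq, registered_dev_id);
}

/* Handler: no mapping work here, only use of the pointer stored earlier. */
int demo_handler(int irq, void *dev_id)
{
    struct demo_priv *priv = dev_id;

    (void)irq;
    priv->regs[0] |= 1u;  /* e.g. acknowledge the interrupt */
    return 1;             /* stands in for IRQ_HANDLED */
}

/* Probe: do the expensive, possibly sleeping setup exactly once. */
int demo_map_and_register(struct demo_priv *priv, uint32_t *fake_mmio)
{
    priv->regs = fake_mmio;  /* a real driver would call ioremap() here */
    return demo_request_irq(demo_handler, priv);
}
```

The handler never creates or destroys a mapping, so it contains nothing that could sleep. The cost of the mapping is paid once, outside interrupt context.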

What is allowed and not allowed in a Linux device driver?

I have a general question about Linux device drivers. I often get confused about which actions are allowed or not allowed in a Linux device driver.
Are there any rules or some kind of lookup list to follow?
For instance, which of the following examples are not allowed?
msleep(1000);
val = kmalloc(sizeof(*val), GFP_KERNEL);
printk(KERN_ALERT "failed to print\n");
ret = adc_get_val()*0.001;
In Linux device driver programming, it depends on which context you are in. There are two contexts that need to be distinguished:
process context
IRQ context
Sleeping can only be done in process context. From IRQ context you have to schedule the work for later execution instead (there are several mechanisms available for that). This is a complex topic that cannot be described in a paragraph.
Allocating memory can sleep; it depends on the parameters/flags with which kmalloc is invoked.
printk can always be called (once the kernel is up); before that, use early_printk.
I don't know what the function adc_get_val does. It is not part of the Linux kernel. And as has already been commented, float values cannot easily be used in the kernel.
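Since floating point is not readily available in kernel code, a scaling such as `adc_get_val()*0.001` is usually rewritten with integers. The following is a minimal sketch, assuming the raw value is a non-negative integer. `adc_get_val_stub()` and `scale_by_thousandth()` are hypothetical names, and the stub's return value is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: the question's adc_get_val() is not a kernel function. */
int adc_get_val_stub(void)
{
    return 12345;
}

/*
 * Integer equivalent of "raw * 0.001": the result is kept as a whole part
 * and a remainder in thousandths. Assumes raw >= 0.
 */
void scale_by_thousandth(int raw, int *whole, int *thousandths)
{
    assert(whole != NULL && thousandths != NULL);
    *whole = raw / 1000;
    *thousandths = raw % 1000;
}
```

In a driver, the two parts could then be printed with a format such as `"%d.%03d"`, which avoids floating point entirely.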

Making a virtual IOPCIDevice with IOKit

I have managed to create a virtual IOPCIDevice which attaches to IOResources and basically does nothing. I'm able to get existing drivers to register and match to it.
However, when it comes to IO handling, I have some trouble. IO access through the functions described in the IOPCIDevice class (e.g. configRead, ioRead, configWrite, ioWrite) can be handled by my own code. But drivers that use memory mapping and IODMACommand are the problem.
There seem to be two things that I need to manage: IODeviceMemory (described in IOPCIDevice) and DMA transfers.
How could I create an IODeviceMemory that ultimately points to memory/RAM, so that when a driver tries to communicate with the PCI device, it ultimately does nothing or just moves the data to RAM, where my userspace client can handle this data and act as an emulated PCI device?
And could DMA commands then also be directed to my userspace client without touching the source code of existing drivers that use IODMACommand?
Thanks!
Trapping memory accesses
So in theory, to achieve what you want, you would need to allocate a memory region, set its protection bits to read-only (or possibly neither read nor write if a read in the device you're simulating has side effects), and then trap any writes into your own handler function where you'd then simulate device register writes.
As far as I'm aware, you can do this sort of thing in macOS userspace, using Mach exception handling. You'd need to set things up so that page protection fault exceptions from the process you're controlling get sent to a Mach port you control. In that port's message handler, you'd:
check where the access was going to
if it's the device memory, you'd suspend all the threads of the process
switch the thread where the write is coming from to single-step, temporarily allow writes to the memory region
resume the writer thread
trap the single-step message. Your "device memory" now contains the written value.
Perform your "device's" side effects.
Turn off single-step in the writer thread.
Resume all threads.
As I said, I believe this can be done in user space processes. It's not easy, and you can cobble together the Mach calls you need to use from various obscure examples across the web. I got something similar working once, but can't seem to find that code anymore, sorry.
… in the kernel
Now, the other problem is you're trying to do this in the kernel. I'm not aware of any public KPIs that let you do anything like what I've described above. You could start looking for hacks in the following places:
You can quite easily make IOMemoryDescriptors backed by system memory. Don't worry about the IODeviceMemory terminology: these are just IOMemoryDescriptor objects; the IODeviceMemory class is a lie. Trapping accesses is another matter entirely. In principle, you can find out what virtual memory mappings of a particular MD exist using the "reference" flag to the createMappingInTask() function, and then call the redirect() method on the returned IOMemoryMap with a NULL backing memory argument. Unfortunately, this will merely suspend any thread attempting to access the mapping. You don't get a callback when this happens.
You could dig into the guts of the Mach VM memory subsystem, which mostly lives in the osfmk/vm/ directory of the xnu source. Perhaps there's a way to set custom fault handlers for a VM region there. You're probably going to have to get dirty with private kernel APIs though.
Why?
Finally, why are you trying to do this? Take a step back: what is it you're ultimately trying to do with this? It doesn't seem like simulating a PCI device in this way is an end in itself, so is this really the only way to reach the greater goal you're ultimately trying to achieve? See: XY problem

Calling schedule() inside Linux IRQ

I'm making an emulation driver that requires me to call schedule() in ATOMIC contexts in order to make the emulation part work. For now I have this hack that allows me to call schedule() inside ATOMIC (e.g. spinlock) context:
int p_count = current_thread_info()->preempt_count;
current_thread_info()->preempt_count = 0;
schedule();
current_thread_info()->preempt_count = p_count;
But that doesn't work inside IRQs; the system just stops after calling schedule().
Is there any way to hack the kernel in a way to allow me to do it? I'm using Linux kernel 4.2.1 with User Mode Linux
In kernel code you can be either in interrupt context or in process context.
When you are in interrupt context, you cannot call any blocking function (e.g., schedule()) or access the current pointer. That's related to how the kernel is designed, and there is no way to have such functionality in interrupt context. See also this answer.
Depending on your purpose, you can find some strategy that allows you to reach your goal. To me, it sounds strange that you have to call schedule() explicitly instead of relying on the natural kernel flow.
One possible approach follows (but, again, it depends on your specific goal). From the IRQ you can schedule the work on a work queue through schedule_work(). The work queue, by design, executes kernel code in process context. From there, you are allowed to call blocking functions and access the current process data.
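The deferral idea can be illustrated with a small user-space model. This is not the kernel's workqueue API: `demo_schedule_work()` and `demo_run_pending()` are hypothetical functions that only imitate the split between an interrupt handler that queues work without blocking and a worker that runs it later.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QUEUE_LEN 8

typedef void (*demo_work_fn)(void *data);

struct demo_work {
    demo_work_fn fn;
    void *data;
};

static struct demo_work demo_queue[DEMO_QUEUE_LEN];
static size_t demo_head, demo_tail;

/* Models schedule_work(): only records the item, never blocks. */
int demo_schedule_work(demo_work_fn fn, void *data)
{
    assert(fn != NULL);
    if (demo_tail - demo_head == DEMO_QUEUE_LEN)
        return 0;  /* queue full, work not queued */
    demo_queue[demo_tail % DEMO_QUEUE_LEN].fn = fn;
    demo_queue[demo_tail % DEMO_QUEUE_LEN].data = data;
    demo_tail++;
    return 1;
}

/* Models the worker thread: runs queued items later, in "process context". */
size_t demo_run_pending(void)
{
    size_t ran = 0;

    while (demo_head != demo_tail) {
        struct demo_work w = demo_queue[demo_head % DEMO_QUEUE_LEN];

        demo_head++;
        w.fn(w.data);
        ran++;
    }
    return ran;
}

/* The part that might block in a real driver; here it only counts. */
void demo_slow_part(void *data)
{
    (*(int *)data)++;
}

/* Models the IRQ handler: does the minimum and defers the rest. */
void demo_irq(int *counter)
{
    demo_schedule_work(demo_slow_part, counter);
}
```

After two calls to `demo_irq()` the counter is still 0. It only becomes 2 once `demo_run_pending()` has run, which corresponds to the worker executing outside interrupt context.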

IoGetDeviceObjectPointer() fails with no return status

This is my code:
UNICODE_STRING symbol;
WCHAR ntNameBuffer[128];
swprintf(ntNameBuffer, L"\\Device\\Harddisk1\\Partition1");
RtlInitUnicodeString(&symbol, ntNameBuffer);
KdPrint(("OSNVss:symbol is %ws\n",symbol.Buffer));
status = IoGetDeviceObjectPointer(&symbol,
FILE_READ_DATA,
&pDiskFileObject,
&pDiskDeviceObject);
My driver is the next-lower-level driver of \\Device\\Harddisk1\\Partition1.
When I call IoGetDeviceObjectPointer, it fails without returning a status, and the remaining code is not executed.
When I debug this with WinDbg, it breaks in intelpm.sys.
If I change the object name to "\\Device\\Harddisk1\\Partition2" (Partition2 really exists), the call succeeds.
If I change the object name to "\\Device\\Harddisk1\\Partition3" (Partition3 does not exist), it fails and returns status = 0xc0000034, meaning the object name does not exist.
Does anybody know why it fails without returning a status when I use the object "\\Device\\Harddisk1\\Partition1"? Thanks very much!
First and foremost: what are you trying to achieve and what driver model are you using? What bitness and what OS versions are targeted, and on which OS version does it fail? Furthermore: you are at the correct IRQL for the call and are running inside a system thread, right? From which of your driver's entry points (IRP_MJ_*, DriverEntry ...) are you calling this code?
Anyway, was re-reading the docs on this function. Noting in particular the part:
The IoGetDeviceObjectPointer routine returns a pointer to the top object in the named device object's stack and a pointer to the
corresponding file object, if the requested access to the objects can
be granted.
and:
IoGetDeviceObjectPointer establishes a "connection" between the caller
and the next-lower-level driver. A successful caller can use the
returned device object pointer to initialize its own device objects.
It can also be used as an argument to IoAttachDeviceToDeviceStack,
IoCallDriver, and any routine that creates IRPs for lower drivers. The
returned pointer is a required argument to IoCallDriver.
You don't say, but if you are doing this on a 32bit system, it may be worthwhile tracking down what's going on with IrpTracker. However, my guess is that said "connection" or rather the request for it gets somehow swallowed by the next-lower-level driver or so.
It is also hard to say what kind of driver you are writing here (and yes, this can be important).
Try not just breaking at a particular point before or after the fact but rather follow the stack that the IRP would travel downwards in the target device object's stack.
But thinking about it, you probably aren't attached to the stack at all (for whatever reason). Could it be that you actually should be using IoGetDiskDeviceObject instead, in order to get the actual underlying device object (at the bottom of the stack) and not a reference to the top-level object attached?
Last but not least: don't forget you can also ask this question over on the OSR mailing lists. There are plenty of seasoned professionals there who may have run into the exact same problem (assuming you are doing all of the things I asked about correctly).
Thanks everyone, I solved this problem. The cause was that the call is synchronous: when I call IoGetDeviceObjectPointer, it generates a new IRP (IRP_MJ_WRITE) that is passed down from the higher level. When this IRP reaches my driver, the thread that handles the IRP is the same thread that called IoGetDeviceObjectPointer, so it deadlocks.

Resources