I'm new to drivers, so please excuse any inaccuracies.
MSDN and some books about driver design give some direction on how to use the WDM API, but I can't find any literature or resources with a solid description of how an ISR ends up as a final Windows message.
For example, take a keyboard: the device raises an interrupt. The I/O manager creates an IRP and starts passing it down the driver stack. Every filter or function driver can modify the IRP it has just received. But what happens at the end of this process? Which layer or driver gets the parsed IRP, transforms it into a Windows message, and puts it into the OS input queue?
Raw input thread (data received from the driver):
Overview of how Windows processes keyboard input:
Keyboard Input Model:
I am studying the UART driver in the kernel source and want to know which comes first: the device_register() call or the driver_register() call?
For the difference between them, follow this.
In UART probing, we call
uart_register_driver(struct uart_driver *drv)
and, after successful registration,
uart_add_one_port(struct uart_driver *drv, struct uart_port *uport)
Please explain this in detail.
That's actually two questions, but I'll try to address both of them.
who first comes into picture, device_register() or driver_register() call?
As stated in Documentation/driver-model/binding.txt, it doesn't matter in which order you call device_register() and driver_register().
device_register() adds device to device list and iterates over driver list to find the match
driver_register() adds driver to driver list and iterates over device list to find the match
Once a match is found, the matched device and driver are bound and the corresponding probe function in the driver code is called.
If you are still curious which one is called first (even though it doesn't matter): usually it's device_register(), because devices are usually registered in initcalls from core_initcall to arch_initcall, while drivers are usually registered in device_initcall, which is executed later.
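The order independence described above can be illustrated with a small user-space model. This is only a toy sketch, not kernel code: the `toy_*` names are invented for illustration, and matching is reduced to a plain name comparison, whereas the real driver core delegates matching to the bus.

```c
#include <stdio.h>
#include <string.h>

#define TOY_MAX 8

struct toy_driver {
    const char *name;
    int (*probe)(const char *dev_name);   /* returns 0 on success */
};

struct toy_device {
    const char *name;
    const struct toy_driver *driver;      /* NULL until bound */
};

static struct toy_device *toy_devices[TOY_MAX];
static const struct toy_driver *toy_drivers[TOY_MAX];
static int toy_n_devices, toy_n_drivers;
int toy_probe_calls;                      /* counts probe invocations */

/* Bind dev to drv if dev is still unbound, names match and probe succeeds. */
static void toy_try_bind(struct toy_device *dev, const struct toy_driver *drv)
{
    if (dev->driver == NULL && strcmp(dev->name, drv->name) == 0 &&
        drv->probe(dev->name) == 0)
        dev->driver = drv;
}

/* Add the device to the device list, then walk the driver list. */
int toy_device_register(struct toy_device *dev)
{
    if (toy_n_devices == TOY_MAX)
        return -1;
    toy_devices[toy_n_devices++] = dev;
    for (int i = 0; i < toy_n_drivers; i++)
        toy_try_bind(dev, toy_drivers[i]);
    return 0;
}

/* Add the driver to the driver list, then walk the device list. */
int toy_driver_register(const struct toy_driver *drv)
{
    if (toy_n_drivers == TOY_MAX)
        return -1;
    toy_drivers[toy_n_drivers++] = drv;
    for (int i = 0; i < toy_n_devices; i++)
        toy_try_bind(toy_devices[i], drv);
    return 0;
}

/* Example probe function for a hypothetical "toy-uart" driver. */
int toy_uart_probe(const char *dev_name)
{
    printf("probe called for %s\n", dev_name);
    toy_probe_calls++;
    return 0;
}
```

In this model a device registered before its driver stays unbound until the driver shows up, and a device registered afterwards is bound immediately; in both cases probe runs exactly once per device.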
See also:
[1] From where platform device gets it name
[2] Who calls the probe() of driver
[3] module_init() vs. core_initcall() vs. early_initcall()
Difference between uart_register_driver and platform_driver_register?
As you noticed, there are 2 drivers (platform driver and UART driver) for one device. But don't let this confuse you: those are just two driver APIs used in what is, in fact, one driver. The explanation is simple: the UART driver API lacks some functionality we need, and this functionality is implemented in the platform driver API. Here is the responsibility of each API in a regular tty driver:
platform driver API is used for 3 things:
Matching device (described in device tree file) with driver; this way probe function will be executed for us by platform driver framework
Obtaining device information (reading from device tree)
Handling Power Management (PM) operations (suspend/resume)
UART driver API: handling actual UART functionality: read, write, etc.
Let's use drivers/tty/serial/omap-serial.c for driver reference and arch/arm/boot/dts/omap5.dtsi for device reference. Say, for example, we have the following device described in the device tree:
uart1: serial@4806a000 {
    compatible = "ti,omap4-uart";
    reg = <0x4806a000 0x100>;
    interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
    ti,hwmods = "uart1";
    clock-frequency = <48000000>;
};
It will be matched with platform driver in omap-serial.c by "ti,omap4-uart" string (you can find it in driver code). Then, using that platform driver, we can read properties from device tree node above, and use them for some platform stuff (setting up clocks, handling UART interrupt, etc.).
But in order to expose our device as standard TTY device we need to use UART framework (all those uart_* functions). Hence 2 different APIs: platform driver and UART driver.
We're trying to write a driver/API for a custom data acquisition device, which captures several "channels" of data. For the sake of discussion, let's assume this is a several-channel video capture device. The device is connected to the system via an x8 PCIe Gen-1 link, which has a theoretical throughput of 16 Gbps. Our actual data rate will be around 2.8 Gbps (~350 MB/sec).
Because of the data rate requirement, we think we have to be careful about the driver/API architecture. We've already implemented a descriptor based DMA mechanism and the associated driver. For example, we can start a DMA transaction for 256KB from the device and it completes successfully. However, in this implementation we're only capturing the data in the kernel driver, and then dropping it and we aren't streaming the data to the user-space at all. Essentially, this is just a small DMA test implementation.
We think we have to separate the problem into three sections: 1. Kernel driver 2. Userspace API 3. User Code
The acquisition device has a register in the PCIe address space which indicates whether there is data to read for any channel from the device. So, our kernel driver must poll for this bit-vector. When the kernel driver sees this bit set, it starts a DMA transaction. The user application however does not need to know about all these DMA transactions and data, until an entire chunk of data is ready (For example, assume that the device provides us with 16 lines of video data per transaction, but we need to notify the user only when the entire video frame is ready). We need to only transfer entire frames to the user application.
Here was our first attempt:
Our user-side API allows a user application to register a function callback for a "channel".
The user-side API has a "start" function, which can be called by the user application, which uses ioctl to send a start message to the kernel driver.
In the kernel driver, upon receiving the start message, we started a kernel thread, which continuously monitors the "data ready" bit-vector, and when it sees new data, copies it over to a driver-allocated (kmalloc) buffer. It keeps doing this until the size of the collected data reaches the "frame size".
At this point a custom Linux signal (similar to SIGINT, SIGHUP, etc.) is sent to the process which is using the driver. Our API catches this signal and then calls back the appropriate user callback function.
The user callback function calls a function in the API (transfer_data), which uses an ioctl call to send a userspace buffer address to the kernel, and the kernel completes the data transfer by doing a copy_to_user of the channel frame data to userspace.
All of the above is working OK, except that the performance is abysmal. We can only achieve about 2MB/sec of transfer rate. We need to completely re-write this and we're open to any suggestions or pointers to examples.
Other notes:
Unfortunately, we can not change anything in the hardware device. So we must poll for the "data-ready" bit and start DMA based on that bit.
Some people suggested to look at Infiniband drivers as a reference, but we're completely lost in that code.
You're probably way past this now, but if not here's my 2p.
It's hard to believe that your card can't generate interrupts when
it has transferred data. It's got a DMA engine, and it can handle
'descriptors', which are presumably elements of a scatter-gather
list. I'll assume that it can generate a PCIe 'interrupt'; YMMV.
Don't bother trawling the kernel for existing similar drivers. You
might get lucky, but I suspect not.
You need to write a blocking read, which you supply a large memory buffer to. The driver read op (a) gets a list of user pages for your user buffer and locks them in memory (get_user_pages); (b) creates a scatter list with pci_map_sg; (c) iterates through the list (for_each_sg); (d) for each entry writes the corresponding physical bus address and data length to the DMA controller as what I presume you're calling a 'descriptor'.
The card now has a list of descriptors which correspond to the physical bus addresses of your large user buffer. When data arrives at the card, it writes it directly into user space, into your user buffer, while your user-level read is still blocked. When it has finished the descriptor list, the card has to be able to interrupt, or it's useless. The driver responds to the interrupt and unblocks your user-level read.
And that's it. The details are nasty, of course, and poorly documented, but that should be the basic architecture. If you really haven't got interrupts you can set up a timer in the kernel to poll for completion of transfer, but if it is really a custom card you should get your money back.
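To make steps (c) and (d) a bit more concrete, here is a user-space sketch of how a buffer that starts in the middle of a page turns into a list of (bus address, length) descriptors. Everything in it is hypothetical: the `toy_desc` layout stands in for whatever the card actually expects, and the per-page bus addresses are passed in by the caller, whereas in a real driver they would come from the pinned pages and the DMA mapping API (which may also merge adjacent entries).

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical descriptor format: one contiguous chunk on the bus. */
struct toy_desc {
    uint64_t bus_addr;
    uint32_t len;
};

/*
 * page_bus_addr[i] is the bus address of the i-th pinned page of the user
 * buffer; first_offset (< TOY_PAGE_SIZE) is where the buffer starts inside
 * the first page. Returns the number of descriptors written to out, or -1
 * if the pages or the descriptor array are too small for total_len bytes.
 */
int toy_build_descriptors(const uint64_t *page_bus_addr, size_t n_pages,
                          size_t first_offset, size_t total_len,
                          struct toy_desc *out, size_t max_desc)
{
    size_t n = 0;
    size_t offset = first_offset;

    for (size_t page = 0; page < n_pages && total_len > 0; page++) {
        size_t chunk = TOY_PAGE_SIZE - offset;   /* room left in this page */

        if (chunk > total_len)
            chunk = total_len;
        if (n == max_desc)
            return -1;
        out[n].bus_addr = page_bus_addr[page] + offset;
        out[n].len = (uint32_t)chunk;
        n++;
        total_len -= chunk;
        offset = 0;                              /* later pages start at 0 */
    }
    return total_len == 0 ? (int)n : -1;
}
```

An 8 KB buffer that starts 2 KB into its first page therefore needs three descriptors (2 KB, 4 KB, 2 KB), and the pages do not have to be physically contiguous.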
I have been trying to understand how h/w interrupts end up in some user space code, through the kernel.
My research led me to understand that:
1- An external device needs attention from CPU
2- It signals the CPU by raising an interrupt (h/w trace to the CPU or bus)
3- The CPU asserts, saves current context, looks up address of ISR in the
interrupt descriptor table (vector)
4- CPU switches to kernel (privileged) mode and executes the ISR.
Question #1: How did the kernel store ISR address in interrupt vector table? It might probably be done by sending the CPU some piece of assembly described in the CPUs user manual? The more detail on this subject the better please.
Question #2: In user space, how can a programmer write a piece of code that listens for h/w device notifications?
This is what I understand so far.
5- The kernel driver for that specific device has now the message from the device and is now executing the ISR.
Question #3: If the programmer in user space wanted to poll the device, I would assume this would be done through a system call (or at least this is what I understood so far). How is this done? How can a driver tell the kernel to be called upon a specific system call so that it can execute the request from the user? And then what happens, how does the driver give back the requested data to user space?
I might be completely off track here, any guidance would be appreciated.
I am not looking for detailed answers; I am only trying to understand the general picture.
Question #1: How did the kernel store ISR address in interrupt vector table?
The driver calls the request_irq kernel function (see include/linux/interrupt.h and kernel/irq/manage.c), and the Linux kernel registers the handler in the right way according to the rules of the current CPU/arch.
It might probably be done by sending the CPU some piece of assembly described in the CPUs user manual?
On x86, the Linux kernel stores interrupt entry points in the Interrupt Descriptor Table (IDT). Its format is described by the vendor (Intel, volume 3) and also in many resources like http://en.wikipedia.org/wiki/Interrupt_descriptor_table and http://wiki.osdev.org/IDT and http://phrack.org/issues/59/4.html and http://en.wikibooks.org/wiki/X86_Assembly/Advanced_Interrupts. The IDT entries point at the kernel's low-level entry code, which then dispatches to the handler registered with request_irq.
The address of the IDT is held in a special CPU register (IDTR), which is loaded with the LIDT instruction and read back with SIDT.
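As a rough illustration of that format, the sketch below packs one gate descriptor in the 32-bit protected-mode layout (on x86-64 the gates are 16 bytes and look different). The struct and function names are made up for this example and are not the ones the Linux kernel uses; `__attribute__((packed))` assumes GCC or Clang.

```c
#include <stdint.h>

/* One 8-byte gate descriptor of the 32-bit protected-mode IDT. */
struct idt_gate32 {
    uint16_t offset_low;    /* handler address, bits 0..15 */
    uint16_t selector;      /* code segment selector of the handler */
    uint8_t  zero;          /* reserved, must be 0 */
    uint8_t  type_attr;     /* present bit, DPL and gate type */
    uint16_t offset_high;   /* handler address, bits 16..31 */
} __attribute__((packed));

/* Operand of LIDT/SIDT in 32-bit mode: table limit and base address. */
struct idt_ptr32 {
    uint16_t limit;         /* size of the table in bytes, minus 1 */
    uint32_t base;          /* linear address of the first gate */
} __attribute__((packed));

#define GATE_PRESENT 0x80u  /* P bit */
#define GATE_INT32   0x0Eu  /* 32-bit interrupt gate */

/* Build a ring-0 interrupt gate that points at 'handler'. */
struct idt_gate32 make_interrupt_gate(uint32_t handler, uint16_t selector)
{
    struct idt_gate32 g;

    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = selector;
    g.zero        = 0;
    g.type_attr   = GATE_PRESENT | GATE_INT32;   /* 0x8E */
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}

/* Reassemble the handler address from a gate. */
uint32_t gate_handler(const struct idt_gate32 *g)
{
    return ((uint32_t)g->offset_high << 16) | g->offset_low;
}
```

The handler address is split across two fields, which is why kernels use a helper like this instead of writing a plain pointer into the table.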
If the programmer in user space wanted to poll the device, I would assume this would be done through a system call (or at least this is what I understood so far). How is this done? How can a driver tell the kernel to be called upon a specific system call so that it can execute the request from the user? And then what happens, how does the driver give back the requested data to user space?
A driver usually registers a device special file in /dev; pointers to several driver functions are registered for this file as "File Operations". The user-space program opens this file (syscall open), and the kernel calls the driver's open handler; then the program calls the poll or read syscall on this fd, and the kernel calls the *poll or *read function from the driver's file operations (http://www.makelinux.net/ldd3/chp-3-sect-7.shtml). The driver may put the caller to sleep (wait_event*), and the IRQ handler will wake it up (wake_up* - http://www.makelinux.net/ldd3/chp-6-sect-2 ).
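From the user-space side, the sequence above might look like the following sketch. The helper name `wait_and_read` is invented here; the file descriptor would normally come from open() on the driver's /dev node, but any descriptor that supports poll (a pipe, for instance) behaves the same way from the caller's point of view.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Sleep until fd becomes readable or timeout_ms expires, then read from it.
 * Returns the number of bytes read, 0 on timeout (or end of file),
 * and -1 on error.
 */
ssize_t wait_and_read(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);   /* ends up in the driver's poll op */

    if (ready < 0)
        return -1;                           /* poll itself failed */
    if (ready == 0)
        return 0;                            /* nothing arrived in time */
    if (!(pfd.revents & POLLIN))
        return -1;                           /* error or hang-up, no data */
    return read(fd, buf, len);               /* ends up in the driver's read op */
}
```

While the process sits in poll it uses no CPU time; it is the driver's wake_up call, typically issued from the interrupt handler, that makes poll return.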
You can read more about linux driver creation in book LINUX DEVICE DRIVERS (2005) by Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman: https://lwn.net/Kernel/LDD3/
Chapter 3: Char Drivers https://lwn.net/images/pdf/LDD3/ch03.pdf
Chapter 10: Interrupt Handling https://lwn.net/images/pdf/LDD3/ch10.pdf
I am using an FT232RL chip with FTD2XX_NET.dll. I've made a program which writes data to and reads data from an AVR ATmega32 MCU: first it writes data, then it reads the answer.
Now I want an event that tells me when there is unread data available, raised only when the AVR sends data to the FTDI buffer, without forcing my program to loop checking for available data. For my purposes, I want the MCU to send data whenever it wants, and the PC must know when there is new data in the FTDI chip's buffer.
I know that it's impossible for the PC to know when the AVR is sending data to the FTDI. What I mean is that I need some way for my program to know when the FTDI has new unread data in its own buffer.
I don't want to run the read operation over and over in an infinite loop, as I do now.
You should create a read thread which does your reading in the background. Then from that thread you can signal an event to notify another part of your application when you have data. I'm not sure what language you are using, but you should easily be able to find an example of threading and event notification with a Google search.
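Since the language is not stated, here is a language-neutral sketch of that pattern in C with POSIX threads. All names are hypothetical, and `demo_read` is only a stand-in for a blocking read on the real device: it hands out two bytes once and then reports that the device is closed.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Blocking device read: returns bytes read, or 0 once the device is closed. */
typedef size_t (*blocking_read_fn)(unsigned char *buf, size_t cap);

struct rx_notifier {
    pthread_mutex_t  lock;
    pthread_cond_t   data_ready;   /* the "event" the reader signals */
    unsigned char    data[256];
    size_t           len;          /* unread bytes in data */
    bool             closed;
    blocking_read_fn read_fn;
};

void rx_init(struct rx_notifier *n, blocking_read_fn fn)
{
    pthread_mutex_init(&n->lock, NULL);
    pthread_cond_init(&n->data_ready, NULL);
    n->len = 0;
    n->closed = false;
    n->read_fn = fn;
}

/* Background thread: blocks in read_fn, then signals the consumer. */
void *rx_thread(void *arg)
{
    struct rx_notifier *n = arg;
    unsigned char chunk[64];

    for (;;) {
        size_t got = n->read_fn(chunk, sizeof chunk);  /* blocks, lock not held */

        pthread_mutex_lock(&n->lock);
        if (got == 0)
            n->closed = true;
        /* Bytes that do not fit in the buffer are dropped in this sketch. */
        for (size_t i = 0; i < got && n->len < sizeof n->data; i++)
            n->data[n->len++] = chunk[i];
        pthread_cond_signal(&n->data_ready);
        pthread_mutex_unlock(&n->lock);
        if (got == 0)
            return NULL;
    }
}

/* Consumer side: sleeps until data arrives; returns 0 when closed and empty. */
size_t rx_wait(struct rx_notifier *n, unsigned char *out, size_t cap)
{
    pthread_mutex_lock(&n->lock);
    while (n->len == 0 && !n->closed)
        pthread_cond_wait(&n->data_ready, &n->lock);
    size_t count = n->len < cap ? n->len : cap;
    for (size_t i = 0; i < count; i++)
        out[i] = n->data[i];
    for (size_t i = count; i < n->len; i++)     /* keep what was not taken */
        n->data[i - count] = n->data[i];
    n->len -= count;
    pthread_mutex_unlock(&n->lock);
    return count;
}

/* Stand-in for the real device: delivers "hi" once, then reports closed. */
size_t demo_read(unsigned char *buf, size_t cap)
{
    static int calls;
    if (calls++ > 0 || cap < 2)
        return 0;
    buf[0] = 'h';
    buf[1] = 'i';
    return 2;
}
```

The rest of the application only ever calls `rx_wait`, which sleeps on the condition variable instead of spinning; the only thread that blocks on the device is the background reader.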
I have a general question about RS232 software flow control (aka XON/XOFF).
The .NET implementation (and the native Win32 API) both define properties called WriteTimeout / ReadTimeout, a time in ms after which a communication is considered overdue.
Now my problem is this: if I send, let's say, a 5-byte string to the device, I don't see any WriteTimeout, as expected. How is this implemented? Everything I find about software flow control says that XOFF is to be sent when the receive buffer is full, and XON when it is ready to receive again.
But from the behavior I see, I would suspect that the device sends XON after it has processed the 5-byte information that I sent, thus creating the information for Windows to generate the corresponding events.
So when should XON be sent on a two-wire-only RS232 implementation? Only if the buffer was full, to restart receiving? Or after every chunk we processed, to signal that we are "still ready" to receive?
How to implement?
Cheers & thx in advance!
Corelgott
Send an XON any time you are ready to receive data (your receive buffer is empty or nearly so). Send an XOFF any time you cannot accept more incoming data (your receive buffer is full or nearly so). The process is documented on the Wikipedia software flow control page.
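One common way to implement this on the receiving side is with two thresholds (high and low water marks), so that XON is sent only after a previous XOFF and not after every processed chunk. The sketch below assumes that reading of the answer; the struct, function names and threshold values are illustrative, while XON (0x11, DC1) and XOFF (0x13, DC3) are the standard control characters.

```c
#include <stdbool.h>
#include <stddef.h>

#define XON  0x11   /* DC1: sender may resume */
#define XOFF 0x13   /* DC3: sender must pause */

struct flow_ctl {
    size_t capacity;     /* size of the receive buffer */
    size_t high_water;   /* send XOFF when fill reaches this level */
    size_t low_water;    /* send XON when fill drops to this level */
    size_t fill;         /* bytes currently waiting in the buffer */
    bool   paused;       /* true after XOFF, until XON is sent */
};

/* Call after nbytes were queued. Returns XOFF if it must be sent now, else 0. */
int flow_on_receive(struct flow_ctl *fc, size_t nbytes)
{
    fc->fill += nbytes;
    if (fc->fill > fc->capacity)
        fc->fill = fc->capacity;          /* overflow: excess bytes are lost */
    if (!fc->paused && fc->fill >= fc->high_water) {
        fc->paused = true;
        return XOFF;
    }
    return 0;
}

/* Call after nbytes were processed. Returns XON if it must be sent now, else 0. */
int flow_on_consume(struct flow_ctl *fc, size_t nbytes)
{
    fc->fill = nbytes > fc->fill ? 0 : fc->fill - nbytes;
    if (fc->paused && fc->fill <= fc->low_water) {
        fc->paused = false;
        return XON;
    }
    return 0;
}
```

With a 64-byte buffer and thresholds of 48 and 16, a 5-byte message that is received and processed produces no control characters at all; XOFF goes out only when 48 bytes are pending, and XON only once the backlog has dropped to 16.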