how to get net_device and net_device_operation in linux kernel - linux-kernel

I wanted to know tcp/ip sequence on linux kernel.
So, I have checked network code on kernel.
First, I checked kernel code when user calls sock function(for tcp/ip).
when user calls sock function for tcp/ip protocol, system call happen and
System call function calls sock_create function.
And then Sock_create function calls inet_create function using parameter of user sock function.
After that, I checked kernel code when user calls write function for socket(tcp/ip).
most of function operation and object for socket what kerenel needs for transmit is bound in inet_create.
but, I don't know where object of net_device and dev->netdev_ops function is bound.
when is net_device object and ops of netdevice bound and where is they bound ???
follwing sequence is my thinking about socket transmit sequence(for tcp/ip).(linux kernel ver is 3.18.)
write -----user space
----------------------------- System call
= >vfs_write ------------------VFS layer
~~~~>
=> file->f_op->aio_write ( sock_aio_write - bound by system call of user sock function )
=> sock->ops->sendmsg ( inet_sendmsg - bound by inet_create function)
-----------------------------
inet_sendmsg
=>sk->sk_prot->sendmsg ( tcp_sendmsg - bound by inet_create function )
=============================
tcp_sendmsg
~~~~~>
=> icsk->icsk_af_ops->queue_xmit ( ip_queue_xmit : bound by inet_create function )
=============
=> ip_queue_xmit
~~~~~>
=> netdev_start_xmit
=> __netdev_start_xmit
=> ops->ndo_start_xmit ( when is net_device object and ops of netdevice bound and where is they bound ???)
=================
(For example )
=>el3_start_xmit//(device driver)

net_device and dev->netdev_ops are bound in the network interface controller(NIC) driver code, this binding occures when the NIC driver is loaded and it recognizes the hardware. For example here is where it is bound in intel's driver e1000.
Then when you attempt to send a packet from user mode the routing tables determine which NIC to use, and the appropriate driver code is called.

Related

How to find dma_request_chan() failure reason details?

In an external kernel module, using DMA Engine, when calling dma_request_chan() returns an error pointer of value -19, i.e. ENODEV or "No such device".
Now, in the active device tree, I do find a dma-names entry with what I'm trying to get a channel for, so my suspicion is that something else deeper in the forest is already not found.
How do I find out what's wrong?
Background:
I have a Zynq MP Ultrascale+ board here, with an FPGA design which uses AXI VDMA block to provide one channel of data to be received on the Cortex A's Linux, where the data is written to DDR4 by the FPGA and to be read from Linux.
I found that there is a Xilinx DMA driver included in the kernel, in the Xilinx source repo anyway, currently kernel version 5.6.0.
And that that driver has no user space interface, such that an intermediate kernel driver is needed.
This is depicted, and they have an example here: Section "4 DMA Proxy Design". I modified the code in the dma-proxy.c of the zip file linked there such that it uses only the RX channel, i.e. also only tries to request it.
The code for that is here, to not make this post huge:
Modified dma-proxy.c at onlinegdb.com
Line 407 has the function create_channel(), which used to use dma_request_slave_channel() which ditches the error code of the function it wraps, so to see the error, I am using that one instead: dma_request_chan().
The function create_channel() is called in function dma_proxy_probe() # line 470 (the occurences before that are deactivated by compile switch).
So by way of this call, dma_request_chan() will be called with the parameters:
create_channel(pdev, &channels[RX_CHANNEL], "dma_proxy_rx", DMA_DEV_TO_MEM);
The Device Tree for my board has an added node for dma-proxy driver as is shown at the top of the dma-proxy.c
dma_proxy {
compatible ="xlnx,dma_proxy";
dmas = <&axi_dma_0 0>;
dma-names = "dma_proxy_rx";
};
The name "axi_dma_0" matches with the name in the axi DMA device tree node:
axi_dma_0: dma#a0000000 {
#dma-cells = <0x1>;
clock-names = "s_axi_lite_aclk", "m_axi_s2mm_aclk";
clocks = <0x3 0x47 0x3 0x47>;
compatible = "xlnx,axi-dma-7.1", "xlnx,axi-dma-1.00.a";
interrupt-names = "s2mm_introut";
interrupt-parent = <0x1d>;
interrupts = <0x0 0x2>;
reg = <0x0 0xa0000000 0x0 0x1000>;
xlnx,addrwidth = <0x28>;
xlnx,sg-length-width = <0x1a>;
phandle = <0x1e>;
dma-channel#a0000030 {
compatible = "xlnx,axi-dma-s2mm-channel";
dma-channels = <0x1>;
interrupts = <0x0 0x2>;
xlnx,datawidth = <0x40>;
xlnx,device-id = <0x0>;
};
If I now look here:
% cat /proc/device-tree/dma_proxy/dma-names
dma_proxy_rx
Looks like my dma_proxy_rx, that I'm trying to request the channel for, is in there.
Edit:
In the boot log, I see this:
xilinx-vdma a0000000.dma: Please ensure that IP supports buffer length > 23 bits
irq: no irq domain found for interrupt-controller#a0010000 !
xilinx-vdma a0000000.dma: unable to request IRQ 0
xilinx-vdma a0000000.dma: WARN: Device release is not defined so it is not safe to unbind this driver while in use
xilinx-vdma a0000000.dma: Xilinx AXI DMA Engine Driver Probed!!
There are warnings - but in the end, the Xilinx AXI DMA Engine got "probed", meaning the lowest level driver loaded and is ready, right?
So it looks to me like there should be my device, but the kernel disagrees.
I've got the same problem with similar configuration. After digging a lot of kernel source code (especially drivers/dma/xilinx/xilinx_dma.c) I've solved this problem by changing channel number in dmas parameter from 0 to 1 in dma-proxy device tree entry like this:
dma_proxy {
compatible ="xlnx,dma_proxy";
dmas = <&axi_dma_0 1>;
dma-names = "dma_proxy_rx";
};
It seems that dma-proxy example is written for AXI DMA block with both mm2s (channel #0) and s2mm (channel #1) channels. And if we remove mm2s channel from AXI DMA block, the s2mm channel stays #1.

Linux kernel network device driver and skb pointers

I am writing a network device driver.
Kernel 2.6.35.12
The device is supposed to be working when it is connected to a bridge port.
I am trying to intercept ICMPv6 RA and NS messages (Router/ Neighbor solicitation) forwarded to the interface from the bridge.
eth <–> br0 <–> mydevice
In the device start_xmit function I am doing to following:
Check that the protocol field after the Ethernet header is IPV6 (0x86dd)
Check that the ipv6 next header is ICMPv6 and check its type:
__u8 nexthdr = ipv6_hdr(skb)->nexthdr;
if (nexthdr == htons (IPPROTO_ICMPV6))
{
struct icmp6hdr *hdr = icmp6_hdr(skb);
u8 type = hdr->icmp6_type;
if(type == htons (NDISC_NEIGHBOUR_SOLICITATION) || type == htons (NDISC_ROUTER_SOLICITATION))
{
….Do something here…
}
}
When RS/NS are sent from within the device (e.g br0), I see that the code is working right.
The problem is when traffic is forwarded through the bridge from the other port.
I see that the icmp6_hdr(skb) returns an incorrect header.
Debugging some more, it seems that the
skb->network_header and the skb->transport_header are pointing to the same place.
icmp6_hdr is using the transport_header which explain why it is incorrect.
Dumping the skb data it looks that all the headers and payload are at the right offset (also compared it with tcpdump)
I suspect that it might be related to the bridge code, before going to dive into it,
I thought that maybe anyone had come up against anything similar or have any other ideas?
Part of the problem is that you are assuming that Netfilter did anything more than just figure out what was the next header. In my experience (albeit not very long) you want to do something like this:
struct icmp6hdr *icmp6;
// Obviously don't do this unless you check to make sure that it's the right protocol
struct ipv6_hdr *ip6hdr = (struct ipv6_hdr*)skb->network_header;
// You need to move the headers around
// Notice the memory address of skb->data and skb->network_header are the same
// that means that the IP header hasn't been "pulled"
skb->transport_header = skb_pull(skb, sizeof(struct ipv6_hdr));
if(ntohs(ip6hdr->nexthdr) == IPPROTO_ICMPV6) {
icmp6 = (struct icmp6hdr*)skb->transport_header;
// Doing this is more efficient, since you only are calling the
// Network to Host function once
__u8 type = ntohs(hdr->icmp6_type);
switch(type) {
case NDISC_NEIGHBOUR_SOLICITATION:
case NDISC_ROUTER_SOLICITATION:
// Do your stuff
break;
}
}
Hopefully this was helpful. I just started diving into writing Netfilter code, so I am not exactly certain 100%, but I found this out when I was trying to do something similar with IPv4 on the NF_IP_LOCAL_IN hook.

pci_driver.probe function not called so pci_device_id wrong?

I am moving my first steps into Linux Kernel Device Driver development.
I learnt that for pci-e cards I have to call pci_register_driver providing information via an object of type pci_driver ( below an example ).
When I load my module ( via insmod ) If the information passed via .id_table is found than the .probe function is called.
As I am now I cannot see my .probe function called at all ( I added some logging via printk ) so I must assume that the information contained in pci_device_id must be wrong, right?
Is there any way to retrieve this information directly from the hardware itself?
Once I plug my PCI-E card on my Linux box, where I can find all information about it?
Maybe reading BIOS or some file in sys?
Any help is appreciated.
AFG
static struct pci_driver my_driver = {
// other here
.id_table = pci_datatable,
.probe = driver_add
//
};
static struct pci_device_id pci_datatable[] __devinitdata =
{
{ VendorID, PciExp_0041, PCI_ANY_ID, PCI_ANY_ID },
{ 0 },
};
int __devinit DmaDriverAdd(
struct pci_dev * pPciDev,
const struct pci_device_id * pPciEntry
)
{
// my stuff!
}
While the accepted answer does indeed answer the question, I want to elaborate a bit about the probe function not being called.
According to the Documentation/PCI/pci.txt (How To Write Linux PCI Drivers) the probing function is called for all existing PCI devices that are not owned by the other drivers yet. So, even if you have the correct vendor and device IDs you will not see the function being called if the device is owned by another driver.
To see which drivers own which devices run:
lspci -knn
If you temporarily change both vendor ID and device ID to PCI_ANY_ID your probe function will be called for every available (i.e. not owned) device.
The command you want is lspci.
With no arguments it will give you a list of all PCI devices, eg:
$ lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family
03:00.0 Network controller: Intel Corporation Centrino Advanced-N 6205 (rev 34)
...
Then to get the ids, use:
$ lspci -v -n -s 03:00.0
03:00.0 0280: 8086:0085 (rev 34)
Subsystem: 8086:1311
Flags: bus master, fast devsel, latency 0, IRQ 52
You can also find the same information in /sys:
$ cd /sys/bus/pci/devices/0000:03:00.0
$ cat vendor device
0x8086
0x0085
$ cat subsystem_vendor subsystem_device
0x8086
0x1311

What is Target Device of IOCTL_USB_GET_ROOT_HUB_NAME (USB driver specific IOCTL IRQ)

I am a bit confused by the USB IOCTL IOCTL_USB_GET_ROOT_HUB_NAME. What is the target device of it? Although the MSDN WDK doc clearly indicates the target device, I am still confused by the USBVIEW sample provided by the WDK. The reason I'm confused is as follows:
I am new to kernel mode and USB driver writing in Windows and is now studying the USBVIEW sample from the windows driver kit http://msdn.microsoft.com/en-us/library/ff558728(v=vs.85).aspx. The MSDN describes the first step the USBVIEW sample performs as:
Enumerate host controllers and root
hubs. Host controllers have symbolic
link names of the form "HCDx", where x
starts at 0.
Use CreateFile() to open each host
controller symbolic link.
Create a node in the tree view to
represent each host controller.
After a host controller has been
opened, send the host controller an
IOCTL_USB_GET_ROOT_HUB_NAME request to
get the symbolic link name of the root
hub that is part of the host
controller
But, I double checked the usage of IOCTL_USB_GET_ROOT_HUB_NAME in MSDN http://msdn.microsoft.com/en-us/library/ff537326(v=VS.85).aspx
which says:
IOCTL_USB_GET_ROOT_HUB_NAME is a
user-mode I/O control request. This
request targets the USB hub FDO.
Note that the target of the IOCTL_USB_GET_ROOT_HUB_NAME IRP is a USB Hub FDO. However, as described by the USBVIEW sample, we just retreived the host controller symbolic link which means the device object is a host controller device object. How could we send it a IOCTL_USB_GET_ROOT_HUB_NAME IRP? Should we retreive a USB hub FDO somehow first?
I would guess it's an unfortunate copy-paste error. IOCTL_USB_GET_ROOT_HUB_NAME is indeed sent to the host controller and therefore handled by the USB Host Controller FDO.
By the way, just to put you in context:
The term "FDO" only loosely concerns user mode -- it's not like you can access any other "xDO" anyway. If you were to send this IOCTL in kernel mode, then sure, you can send an IOCTL to any specific device object in the device stack ("can" doesn't mean "should", mind you). However, a DeviceIoControl from a user mode application always sends IOCTLs to the top of the device stack (therefore it passes all the filters, the FDO and down to the PDO).
This question was asked on March 28, so I really hope you've solved it by now :)
As the documentation states you will need a handle to the USB host controller but it is not very clear on how you are supposed to get such a handle. In USBView something similar to this function is used to get the device path name by passing GUID_DEVINTERFACE_USB_HOST_CONTROLLER (include initguid.h and usbiodef.h):
vector<wstring> EnumDevices(
_In_ const GUID Guid
)
{
vector<wstring> r;
int index = 0;
HDEVINFO hDevInfo = SetupDiGetClassDevs(&Guid, NULL, NULL, DIGCF_PRESENT | DIGCF_DEVICEINTERFACE);
SP_DEVINFO_DATA DevInfoData;
memset(&DevInfoData, 0, sizeof(SP_DEVINFO_DATA));
DevInfoData.cbSize = sizeof(SP_DEVINFO_DATA);
while (SetupDiEnumDeviceInfo(hDevInfo, index, &DevInfoData)) {
index++;
int jndex = 0;
SP_DEVICE_INTERFACE_DATA DevIntData;
memset(&DevIntData, 0, sizeof(SP_DEVICE_INTERFACE_DATA));
DevIntData.cbSize = sizeof(SP_DEVICE_INTERFACE_DATA);
while (SetupDiEnumDeviceInterfaces(
hDevInfo,
&DevInfoData, &Guid, jndex, &DevIntData
)) {
jndex++;
// Get the size required for the structure.
DWORD RequiredSize;
SetupDiGetDeviceInterfaceDetail(
hDevInfo, &DevIntData, NULL, NULL, &RequiredSize, NULL
);
PSP_DEVICE_INTERFACE_DETAIL_DATA pDevIntDetData = (PSP_DEVICE_INTERFACE_DETAIL_DATA)malloc(
sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA) + RequiredSize
);
memset(pDevIntDetData, 0, sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA) + RequiredSize);
pDevIntDetData->cbSize = sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA);
SetupDiGetDeviceInterfaceDetail(
hDevInfo,
&DevIntData,
pDevIntDetData, RequiredSize,
NULL,
&DevInfoData
);
r.push_back(wstring(pDevIntDetData->DevicePath));
free(pDevIntDetData);
}
}
return r;
}
Keep in mind using the above function you can also request devices of type GUID_DEVINTERFACE_USB_HUB and GUID_DEVINTERFACE_USB_DEVICE which may eliminate any need to interact with the host controller or hubs directly.

What could cause "The MDL is being inserted twice on the same process list"?

We are developing an NDIS protocol and miniport driver. When the driver is in-use and the system hibernates we get a bug check (blue screen) with the following error:
LOCKED_PAGES_TRACKER_CORRUPTION (d9)
Arguments:
Arg1: 00000001, The MDL is being inserted twice on the same process list.
Arg2: 875da420, Address of internal lock tracking structure.
Arg3: 87785728, Address of memory descriptor list.
Arg4: 00000013, Number of pages locked for the current process.
The stack trace is not especially helpful as our driver does not appear in the listing:
nt!RtlpBreakWithStatusInstruction
nt!KiBugCheckDebugBreak+0x19
nt!KeBugCheck2+0x574
nt!KeBugCheckEx+0x1b
nt!MiAddMdlTracker+0xd8
nt!MmProbeAndLockPages+0x629
nt!NtWriteFile+0x55c
nt!KiFastCallEntry+0xfc
ntdll!KiFastSystemCallRet
ntdll!ZwWriteFile+0xc
kernel32!WriteFile+0xa9
What types of issues could cause this MDL error?
It turns out the problem was related to this code in our IRP_MJ_WRITE handler:
/* If not in D0 state, don't attempt transmits */
if (ndisProtocolOpenContext &&
ndisProtocolOpenContext->powerState > NetDeviceStateD0)
{
DEBUG_PRINT(("NPD: system in sleep mode, so no TX\n"));
return STATUS_UNSUCCESSFUL;
}
This meant that we weren't fully completing the IRP and NDIS was likely doing something funny as a result. The addition of a call to IoCompleteRequest fixed the issue.
/* If not in D0 state, don't attempt transmits */
if (ndisProtocolOpenContext &&
ndisProtocolOpenContext->powerState > NetDeviceStateD0)
{
DEBUG_PRINT(("NPD: system in sleep mode, so no TX\n"));
pIrp->IoStatus.Status = STATUS_UNSUCCESSFUL;
IoCompleteRequest(pIrp, IO_NO_INCREMENT);
return STATUS_UNSUCCESSFUL;
}

Resources