Linux kernel network device driver and skb pointers - linux-kernel

I am writing a network device driver.
Kernel 2.6.35.12
The device is supposed to be working when it is connected to a bridge port.
I am trying to intercept ICMPv6 RS and NS messages (Router/Neighbor Solicitation) forwarded to the interface from the bridge.
eth <–> br0 <–> mydevice
In the device's start_xmit function I am doing the following:
Check that the protocol field after the Ethernet header is IPv6 (0x86dd)
Check that the IPv6 next header is ICMPv6 and check its type:
__u8 nexthdr = ipv6_hdr(skb)->nexthdr;
/* nexthdr and icmp6_type are single bytes, so no htons() is applied */
if (nexthdr == IPPROTO_ICMPV6)
{
    struct icmp6hdr *hdr = icmp6_hdr(skb);
    u8 type = hdr->icmp6_type;
    if (type == NDISC_NEIGHBOUR_SOLICITATION || type == NDISC_ROUTER_SOLICITATION)
    {
        ….Do something here…
    }
}
When RS/NS messages are sent from the device itself (e.g. br0), the code works correctly.
The problem appears when traffic is forwarded through the bridge from the other port.
In that case icmp6_hdr(skb) returns an incorrect header.
Debugging further, it seems that skb->network_header and skb->transport_header point to the same place.
icmp6_hdr() uses the transport_header, which explains why the result is incorrect.
Dumping the skb data shows that all the headers and the payload are at the right offsets (I also compared them with tcpdump).
I suspect this is related to the bridge code. Before diving into it, I thought I would ask whether anyone has run into something similar or has any other ideas.

Part of the problem is that you are assuming that Netfilter did anything more than just figure out what was the next header. In my experience (albeit not very long) you want to do something like this:
struct icmp6hdr *icmp6;
// Obviously don't do this unless you check to make sure that it's the right protocol
struct ipv6hdr *ip6hdr = ipv6_hdr(skb);
// Nobody has set the transport header on this path, so
// skb->network_header and skb->transport_header still coincide.
// Point the transport header just past the fixed IPv6 header
// (this assumes there are no extension headers). Unlike skb_pull(),
// this leaves skb->data untouched, which matters in start_xmit.
skb_set_transport_header(skb, skb_network_offset(skb) + sizeof(struct ipv6hdr));
// nexthdr and icmp6_type are single bytes, so no ntohs()/htons() is needed
if (ip6hdr->nexthdr == IPPROTO_ICMPV6) {
    icmp6 = icmp6_hdr(skb);
    switch (icmp6->icmp6_type) {
    case NDISC_NEIGHBOUR_SOLICITATION:
    case NDISC_ROUTER_SOLICITATION:
        // Do your stuff
        break;
    }
}
Hopefully this was helpful. I just started diving into writing Netfilter code, so I am not exactly certain 100%, but I found this out when I was trying to do something similar with IPv4 on the NF_IP_LOCAL_IN hook.
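To make the offsets discussed above concrete, here is a minimal userspace sketch (not kernel code) that finds the ICMPv6 type in a raw Ethernet frame by offset alone, without relying on any precomputed transport header. It assumes an untagged Ethernet frame and no IPv6 extension headers. The function names and macros are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_ETH_HDR_LEN   14      /* dst MAC + src MAC + EtherType, no VLAN tag */
#define FRAME_ETHERTYPE_V6  0x86dd  /* EtherType for IPv6 */
#define FRAME_IPV6_HDR_LEN  40      /* fixed IPv6 header, extension headers ignored */
#define FRAME_NEXTHDR_ICMP6 58      /* IPv6 Next Header value for ICMPv6 */
#define FRAME_ICMP6_RS      133     /* Router Solicitation */
#define FRAME_ICMP6_NS      135     /* Neighbour Solicitation */

/* Returns the ICMPv6 type (0..255), or -1 if the frame is too short
 * or is not IPv6 carrying ICMPv6 directly after the fixed header. */
int icmpv6_type_from_frame(const uint8_t *frame, size_t len)
{
    unsigned ethertype;

    assert(frame != NULL);
    if (len < FRAME_ETH_HDR_LEN + FRAME_IPV6_HDR_LEN + 1)
        return -1;

    /* EtherType is two bytes in network order: assemble it explicitly. */
    ethertype = ((unsigned)frame[12] << 8) | frame[13];
    if (ethertype != FRAME_ETHERTYPE_V6)
        return -1;

    /* Next Header is a single byte at offset 6 of the IPv6 header,
     * so no byte-order conversion applies to it. */
    if (frame[FRAME_ETH_HDR_LEN + 6] != FRAME_NEXTHDR_ICMP6)
        return -1;

    /* The ICMPv6 type is the first byte after the fixed IPv6 header. */
    return frame[FRAME_ETH_HDR_LEN + FRAME_IPV6_HDR_LEN];
}

/* Returns 1 for Router or Neighbour Solicitation, 0 otherwise. */
int is_rs_or_ns(const uint8_t *frame, size_t len)
{
    int type = icmpv6_type_from_frame(frame, len);
    return type == FRAME_ICMP6_RS || type == FRAME_ICMP6_NS;
}
```

Inside a driver the same arithmetic would be applied relative to the network header rather than to a flat buffer. The point is only that the single-byte fields are compared directly and the ICMPv6 header sits 40 bytes past the start of the IPv6 header.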

Related

CopyPipe of DriverKit IOUSBHostInterface fails with kIOReturnError (0xe00002bc)

For my own edification, I'm trying to read some audio data from a USB audio interface using a DriverKit System Extension.
My IOProviderClass is IOUSBHostInterface. I can successfully Open() the interface, but CopyPipe() returns kIOReturnError (0xe00002bc). Why can't I copy the pipe?
To be able to open the interface at all, I had to outmatch AppleUSBAudio so my IOKitPersonalities explicitly match the bConfigurationValue, bInterfaceNumber, idVendor, idProduct, and bcdDevice keys. This list may not be minimal.
In ioreg I can normally see the interfaces (sometimes only my matching one is there, although I think this is a degenerate situation). I see an AppleUserUSBHostHIDDevice child on some of my other interfaces. Could this be the problem? Normally the device has no problem being both USBAudio and HID. I am trying, unsuccessfully, to outmatch HID too.
I was passing the wrong endpoint address to CopyPipe().
To find an endpoint address you need to enumerate through the IOUSBDescriptorHeaders in the IOUSBConfigurationDescriptor and examine the descriptors with bDescriptorType equal to kIOUSBDescriptorTypeEndpoint.
IOUSBGetNextDescriptor() from USBDriverKit/AppleUSBDescriptorParsing.h is made for this and will save you from having to think about pointer manipulation.
If the endpoint is in a different alternate setting, then you need to switch the interface to that one with SelectAlternateSetting().
void
enumerate_configs(const IOUSBConfigurationDescriptor *configDesc) {
    const IOUSBDescriptorHeader *curHeader = NULL;
    while ((curHeader = IOUSBGetNextDescriptor(configDesc, curHeader))) {
        switch (curHeader->bDescriptorType) {
        case kIOUSBDescriptorTypeEndpoint: {
            auto endpoint = (const IOUSBEndpointDescriptor *)curHeader;
            os_log(OS_LOG_DEFAULT, "Endpoint bLength: %{public}i, bDescriptorType: %i, bEndpointAddress: %i, bmAttributes: 0x%x, wMaxPacketSize: %i, bInterval: %i",
                   endpoint->bLength,
                   endpoint->bDescriptorType,
                   endpoint->bEndpointAddress, // pass this to CopyPipe()
                   endpoint->bmAttributes,
                   endpoint->wMaxPacketSize,
                   endpoint->bInterval);
        }
            break;
        default:
            os_log(OS_LOG_DEFAULT, "some other type: %{public}i", curHeader->bDescriptorType);
            break;
        }
    }
}
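For readers who want to see what the descriptor walk amounts to underneath, here is a rough plain-C sketch that does the same traversal over a raw configuration descriptor blob. It is not the DriverKit implementation. It relies only on the standard USB layout, in which every descriptor starts with bLength and bDescriptorType, and an endpoint descriptor (type 0x05) carries bEndpointAddress as its third byte. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define USB_DESC_TYPE_ENDPOINT 0x05  /* bDescriptorType of an endpoint descriptor */

/* Walks a raw configuration descriptor blob and stores each
 * bEndpointAddress in out[]. Returns the number of endpoint descriptors
 * seen; at most max_out addresses are stored. */
size_t collect_endpoint_addresses(const uint8_t *blob, size_t total_len,
                                  uint8_t *out, size_t max_out)
{
    size_t pos = 0;
    size_t found = 0;

    assert(blob != NULL && out != NULL);
    while (pos + 2 <= total_len) {
        uint8_t b_length = blob[pos];      /* first byte: bLength */
        uint8_t b_type = blob[pos + 1];    /* second byte: bDescriptorType */

        /* Malformed length: stop rather than loop forever or overrun. */
        if (b_length < 2 || pos + b_length > total_len)
            break;

        if (b_type == USB_DESC_TYPE_ENDPOINT && b_length >= 3) {
            if (found < max_out)
                out[found] = blob[pos + 2];  /* third byte: bEndpointAddress */
            found++;
        }
        pos += b_length;  /* step to the next descriptor */
    }
    return found;
}
```

Bit 7 of bEndpointAddress is the direction flag, so a value such as 0x81 denotes IN endpoint 1 and 0x02 denotes OUT endpoint 2.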

Why does UDPSocket.send always call getaddrinfo in Ruby?

I just solved a latency issue in our infrastructure that was triggered because this code snippet here triggered a call to getaddrinfo on every run of the code:
sock = UDPSocket.open
sock.send("#{key}|#{value}", 0,
GRAPHITE_SERVER,
STATSD_PORT)
sock.close
Because we use statsd and graphite for high-volume event and stats monitoring, we were effectively triggering numerous calls to getaddrinfo on every API call, and potentially tens of thousands every minute.
I modified this code to use the internal IP address, not the DNS name, of our graphite server, and was able to resolve the latency issue (presumably because the internal AWS VPC DNS server was not equipped to handle such a high volume of requests).
Now that my issue is resolved, I would love to know why the UDP implementation in Ruby is not using a cached IP address value (presumably based on the TTL of the domain name entry). Here is the relevant line and the function in full, you can see the call to rsock_addrinfo just at the end:
static VALUE
udp_send(int argc, VALUE *argv, VALUE sock)
{
    VALUE flags, host, port;
    struct udp_send_arg arg;
    VALUE ret;
    if (argc == 2 || argc == 3) {
        return rsock_bsock_send(argc, argv, sock);
    }
    rb_scan_args(argc, argv, "4", &arg.sarg.mesg, &flags, &host, &port);
    StringValue(arg.sarg.mesg);
    GetOpenFile(sock, arg.fptr);
    arg.sarg.fd = arg.fptr->fd;
    arg.sarg.flags = NUM2INT(flags);
    arg.res = rsock_addrinfo(host, port, rsock_fd_family(arg.fptr->fd), SOCK_DGRAM, 0);
    ret = rb_ensure(udp_send_internal, (VALUE)&arg,
                    rsock_freeaddrinfo, (VALUE)arg.res);
    if (!ret) rsock_sys_fail_host_port("sendto(2)", host, port);
    return ret;
}
I assume this decision is intentional and would love to learn more about the reasons why.
getaddrinfo does not return data about the TTL... because it may not have it at all in fact, as the resolution may not necessarily be done over the DNS (could be hosts file, LDAP, etc. see /etc/nsswitch.conf)
From its manual here is the structure returned:
int getaddrinfo(const char *hostname, const char *servname, const struct addrinfo *hints, struct addrinfo **res);
struct addrinfo {
int ai_flags; /* input flags */
int ai_family; /* protocol family for socket */
int ai_socktype; /* socket type */
int ai_protocol; /* protocol for socket */
socklen_t ai_addrlen; /* length of socket-address */
struct sockaddr *ai_addr; /* socket-address for socket */
char *ai_canonname; /* canonical name for service location */
struct addrinfo *ai_next; /* pointer to next in list */
};
After a successful call to getaddrinfo(), *res is a pointer to a linked list of one or more addrinfo structures.
So it is up to the thing "behind" getaddrinfo to do some caching or not, because getaddrinfo may have used the DNS to retrieve data, or not.
Some specific API for DNS, like getdnsapi will give back to the caller some information on the TTL, see https://getdnsapi.net/documentation/spec/ and example 6.2
6·2 Get IPv4 and IPv6 Addresses for a Domain Name
This example is similar to the previous one, except that it retrieves more information than just the addresses, so it traverses the replies_tree. In this case, it gets both the addresses and their TTLs.
Without any cache layer anywhere, since UDP is stateless, any new send must trigger resolution in some way or form.
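One way an application can avoid the per-send lookup is to resolve the name once and reuse the resulting socket address for every sendto(). The following is a minimal C sketch of that idea, not what Ruby does internally. The struct and function names are hypothetical, and it simply takes the first result getaddrinfo() returns.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper type: one resolved UDP destination kept for reuse. */
struct cached_dest {
    struct sockaddr_storage addr;  /* copy of the resolved socket address */
    socklen_t len;                 /* valid length of addr */
    int family;                    /* AF_INET or AF_INET6 */
};

/* Resolves host:port once and copies the first result into d.
 * Returns 0 on success or the getaddrinfo() error code. */
int cached_dest_resolve(struct cached_dest *d, const char *host, const char *port)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(d != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM;  /* UDP */

    rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0)
        return rc;

    memcpy(&d->addr, res->ai_addr, res->ai_addrlen);
    d->len = res->ai_addrlen;
    d->family = res->ai_family;
    freeaddrinfo(res);
    return 0;
}

/* Sends one datagram to the cached address; no name lookup happens here. */
ssize_t cached_dest_send(int fd, const struct cached_dest *d,
                         const void *buf, size_t n)
{
    return sendto(fd, buf, n, 0, (const struct sockaddr *)&d->addr, d->len);
}
```

Because getaddrinfo() exposes no TTL, a long-running process using this pattern would have to pick its own refresh interval and call cached_dest_resolve() again from time to time, otherwise it keeps sending to a stale address after the record changes.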
You said:
"modified this code to use the internal IP address, not the DNS name"
You should instead install a local (on the box) recursive caching nameserver, such as unbound. All your local applications will benefit from it, and a faster DNS resolution (depending on how /etc/nsswitch.conf, /etc/resolv.conf and /etc/hosts are setup also).
The associated bug report mentioned by @Casper seems, at its core, to be more about IPv6 vs IPv4. That could be solved either by adjusting /etc/gai.conf (or its equivalent) or by some more clever programming around opening the connection, using the so-called "Happy Eyeballs" algorithm. There you resolve both A and AAAA at the same time, which means two parallel DNS queries (the protocol does not let you combine them into one), and use whichever answer comes back first. If you want to be in the modern camp you give AAAA a slight preference: fire the A query only some milliseconds after the AAAA one, to catch the case where AAAA gets no reply at all or a negative one. See RFC 6555 for details.

How does send() work in omnet++

Does the send() in omnet++ set the source address of the packet to the current host address?
Why am I asking? because I'm trying to code a class for a malicious host "Eve" that performs a replay attack.
void MalAODVRouter::handleMessage(cMessage *msg)
{
    cMessage *ReplayMsg = msg->dup();
    AODVRouting::handleMessage(msg);
    capturedMsgs++;
    if (capturedMsgs==10) // One out of every 10 packets (frequency of replay)
    {
        //we can add a delay before sending the copy of the message again (1 time unit)
        sendDelayed(ReplayMsg, 1,"ipOut");
        ReplayedMsgs++;
        std::cout<<"Launched Replay Packet!\n";
        ev<<"Launched Replay Packet!\n";
        this->capturedMsgs=0;
        // }
    }
}
You can see at the beginning of my code snippet that I use dup() to duplicate a packet (msg) that Eve receives while it is on its way to the legitimate destination.
Now, can I send the duplicated packet later and it would be having the original source address OR should I dig deeper into layers to fake the source address to have Bob's address instead of Eve's? like below:
/*UDPPacket *udpPacket = dynamic_cast<UDPPacket *>(msg);
AODVControlPacket *ctrlPacket = check_and_cast<AODVControlPacket *>(udpPacket->decapsulate());
IPv4ControlInfo *udpProtocolCtrlInfo = dynamic_cast<IPv4ControlInfo *>(udpPacket->getControlInfo());
ASSERT(udpProtocolCtrlInfo != NULL);
IPv4Address sourceAddr = udpProtocolCtrlInfo->getSrcAddr(); //get Source Address
IPv4Address destinationAddr = udpProtocolCtrlInfo->getDestAddr(); //get Destination Address
IPv4Address addr = getSelfIPAddress();
if (addr != destinationAddr) // if it is not destined for "Eve"
{
UDPPacket *ReplayUDPPacket = udpPacket;
AODVControlPacket *ReplayCtrlPacket = check_and_cast<AODVControlPacket *>(ReplayUDPPacket->decapsulate());
IPv4ControlInfo *ReplayUDPProtocolCtrlInfo = dynamic_cast<IPv4ControlInfo *>(ReplayUDPPacket->getControlInfo());
ASSERT(ReplayUDPProtocolCtrlInfo != NULL);
ReplayUDPProtocolCtrlInfo->setSrcAddr(sourceAddr); //Forge Source
ReplayUDPProtocolCtrlInfo->setDestAddr(destinationAddr); //Keep Destination
*/
//we can add a delay before sending the copy of the message again (1 time unit)
sendDelayed(ReplayMsg, 1,"ipOut");
ReplayedMsgs++;
std::cout<<"Launched Replay Packet!\n";
ev<<"Launched Replay Packet!\n";
this->capturedMsgs=0;
Does the send() method automatically set the source address of the outgoing packet to the current host address? If so, then my replay attempt is not working...
send() is an OMNeT++ API call. As OMNeT++ is just a generic discrete event simulation framework, it does not know anything about the model code (so it cannot and should not manipulate it). The IP address is defined in the INET framework, so only code from the INET framework can change it.
On the other hand, the modules in the standard host below your module can do whatever they want before the packet is sent out to the network. In this case, the source IP address is determined by the control info that is attached to the packet. dup()-ing the packet copies that information too, so the IP address will be the same.

blk_cleanup_queue() doesn't return on block device deregistration

I'm writing a block device driver for a hot-pluggable PCI memory device on 2.6.43.2-6.fc15 (so LDD3 is out of date with respect to a lot of functions) and I'm having trouble getting the block device de-registration to go smoothly. When the device is removed, I go to tear down the gendisk and request_queue, but it hangs on blk_cleanup_queue(). Presumably there's some queue-related step I have neglected to carry out before that, but I can't see any major consistent differences from the other block drivers in that kernel tree that I am using for reference (memstick, cciss, etc). What are the steps I should carry out before going to tidy up the queue and gendisk?
I am implementing .open, .release, .ioctl in the block_ops as well as a mydev_request(struct request_queue *q) attached with blk_init_queue(mydev_request, &mydev->lock), but I'm not sure exactly how to tidy the queue either when requests occur or when de-registering the block device.
This is caused by not ending the requests that you fetch off the queue. To fix it, end the request as follows:
while ((req = blk_fetch_request(q)) != NULL)
{
    res = mydev_submit_request_sg(mydev, req);
    if (res)
        __blk_end_request_all(req, res);
    else
        __blk_end_request_cur(req, res);
}

What is Target Device of IOCTL_USB_GET_ROOT_HUB_NAME (USB driver specific IOCTL IRQ)

I am a bit confused by the USB IOCTL IOCTL_USB_GET_ROOT_HUB_NAME. What is the target device of it? Although the MSDN WDK doc clearly indicates the target device, I am still confused by the USBVIEW sample provided by the WDK. The reason I'm confused is as follows:
I am new to kernel mode and USB driver writing in Windows and am now studying the USBVIEW sample from the Windows Driver Kit http://msdn.microsoft.com/en-us/library/ff558728(v=vs.85).aspx. The MSDN describes the first step the USBVIEW sample performs as:
Enumerate host controllers and root hubs. Host controllers have symbolic link names of the form "HCDx", where x starts at 0.
Use CreateFile() to open each host controller symbolic link.
Create a node in the tree view to represent each host controller.
After a host controller has been opened, send the host controller an IOCTL_USB_GET_ROOT_HUB_NAME request to get the symbolic link name of the root hub that is part of the host controller
But, I double checked the usage of IOCTL_USB_GET_ROOT_HUB_NAME in MSDN http://msdn.microsoft.com/en-us/library/ff537326(v=VS.85).aspx
which says:
IOCTL_USB_GET_ROOT_HUB_NAME is a user-mode I/O control request. This request targets the USB hub FDO.
Note that the target of the IOCTL_USB_GET_ROOT_HUB_NAME IRP is a USB hub FDO. However, as described by the USBVIEW sample, we have just retrieved the host controller symbolic link, which means the device object is a host controller device object. How could we send it an IOCTL_USB_GET_ROOT_HUB_NAME IRP? Should we retrieve a USB hub FDO somehow first?
I would guess it's an unfortunate copy-paste error. IOCTL_USB_GET_ROOT_HUB_NAME is indeed sent to the host controller and therefore handled by the USB Host Controller FDO.
By the way, just to put you in context:
The term "FDO" only loosely concerns user mode -- it's not like you can access any other "xDO" anyway. If you were to send this IOCTL in kernel mode, then sure, you can send an IOCTL to any specific device object in the device stack ("can" doesn't mean "should", mind you). However, a DeviceIoControl from a user mode application always sends IOCTLs to the top of the device stack (therefore it passes all the filters, the FDO and down to the PDO).
This question was asked on March 28, so I really hope you've solved it by now :)
As the documentation states you will need a handle to the USB host controller but it is not very clear on how you are supposed to get such a handle. In USBView something similar to this function is used to get the device path name by passing GUID_DEVINTERFACE_USB_HOST_CONTROLLER (include initguid.h and usbiodef.h):
vector<wstring> EnumDevices(
    _In_ const GUID Guid
)
{
    vector<wstring> r;
    int index = 0;
    HDEVINFO hDevInfo = SetupDiGetClassDevs(&Guid, NULL, NULL, DIGCF_PRESENT | DIGCF_DEVICEINTERFACE);
    SP_DEVINFO_DATA DevInfoData;
    memset(&DevInfoData, 0, sizeof(SP_DEVINFO_DATA));
    DevInfoData.cbSize = sizeof(SP_DEVINFO_DATA);
    while (SetupDiEnumDeviceInfo(hDevInfo, index, &DevInfoData)) {
        index++;
        int jndex = 0;
        SP_DEVICE_INTERFACE_DATA DevIntData;
        memset(&DevIntData, 0, sizeof(SP_DEVICE_INTERFACE_DATA));
        DevIntData.cbSize = sizeof(SP_DEVICE_INTERFACE_DATA);
        while (SetupDiEnumDeviceInterfaces(
            hDevInfo,
            &DevInfoData, &Guid, jndex, &DevIntData
        )) {
            jndex++;
            // Get the size required for the structure.
            DWORD RequiredSize = 0;
            SetupDiGetDeviceInterfaceDetail(
                hDevInfo, &DevIntData, NULL, 0, &RequiredSize, NULL
            );
            PSP_DEVICE_INTERFACE_DETAIL_DATA pDevIntDetData = (PSP_DEVICE_INTERFACE_DETAIL_DATA)malloc(
                sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA) + RequiredSize
            );
            memset(pDevIntDetData, 0, sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA) + RequiredSize);
            pDevIntDetData->cbSize = sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA);
            SetupDiGetDeviceInterfaceDetail(
                hDevInfo,
                &DevIntData,
                pDevIntDetData, RequiredSize,
                NULL,
                &DevInfoData
            );
            r.push_back(wstring(pDevIntDetData->DevicePath));
            free(pDevIntDetData);
        }
    }
    // Release the device information set once enumeration is done.
    SetupDiDestroyDeviceInfoList(hDevInfo);
    return r;
}
Keep in mind using the above function you can also request devices of type GUID_DEVINTERFACE_USB_HUB and GUID_DEVINTERFACE_USB_DEVICE which may eliminate any need to interact with the host controller or hubs directly.

Resources