Bug in PF_ROUTE on macOS?

I have a question about using PF_ROUTE on macOS to detect IP address changes. Basically, it seems to me that it is broken for IPv4. I have put together a sample program that simply creates the PF_ROUTE socket and then prints out when RTM_NEWADDR, RTM_DELADDR and RTM_IFINFO are received.
What I notice is that when I use a single interface (wifi or ethernet cable) and disconnect the network adapter (disable wifi or unplug the cable) I get nothing at all. If I then reconnect (enable wifi or plug in the cable) I get RTM_NEWADDR but no RTM_IFINFO.
If I have both the wifi and the cable connected at the same time, neither disconnecting nor reconnecting one of the interfaces (e.g. disabling then re-enabling wifi) produces any events at all.
IPv6 seems to work. If I test IPv6 in the same manner, I get an RTM_NEWADDR on connection and RTM_DELADDR on disconnection (the address is the IPv6 link local address - my DHCP server does not serve up IPv6 addresses).
A couple of other side notes: if I try to do if_indextoname(), it doesn't always work. I need to insert a sleep to consistently get the name back (I chose 500 milliseconds; I didn't spend any time trying other values to see if a lower one would work).
Also, if I call getifaddrs() in a loop (with a little sleeping between calls) after receiving the IPv6 RTM_NEWADDR event to try to find the missing IPv4 address, it can take a long time for it to show up in the returned data. I have seen it take up to 8 seconds on my system. Note that the IP address is up and usable long before this, as a continuous ping to an external address readily confirms.
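For concreteness, the polling loop looks roughly like this (a minimal sketch; "en0" would just be an example interface name, and 500 ms is simply the interval I used):
#include <ifaddrs.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>
/* Poll getifaddrs() until an IPv4 address shows up on the named
   interface, or give up after max_tries polls. Returns 0 on success. */
static int wait_for_ipv4(const char *ifname, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        struct ifaddrs *list;
        if (getifaddrs(&list) == 0) {
            for (struct ifaddrs *ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
                if (ifa->ifa_addr && ifa->ifa_addr->sa_family == AF_INET
                        && strcmp(ifa->ifa_name, ifname) == 0) {
                    freeifaddrs(list);
                    return 0;   /* the IPv4 address is now visible */
                }
            }
            freeifaddrs(list);
        }
        usleep(500 * 1000);     /* sleep 500 ms between polls */
    }
    return -1;                  /* the address never showed up */
}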
I have tested this program on a MacBook Pro running 10.13, an iMac running 10.14 and a VM running 10.12 - all behave the same way.
So, my question is: is this a bug in the OS, or do I have a fundamental misunderstanding of how the PF_ROUTE socket is supposed to work?
Thanks,
Kevin
#include <SystemConfiguration/SystemConfiguration.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <net/route.h>
#include <errno.h>
#include <stdio.h>

/* Fields common to the start of every routing message
   (rt_msghdr, ifa_msghdr, if_msghdr, ...) */
struct cmn_msghdr
{
    u_short msglen;
    u_char  version;
    u_char  type;
};

int main(int argc, const char * argv[])
{
    char buf[1024];
    ssize_t len;
    int skt, family = AF_UNSPEC;

    if ( argv[1] && argv[1][0] == '4' )
        family = AF_INET;
    else if ( argv[1] && argv[1][0] == '6' )
        family = AF_INET6;

    // Create a PF_ROUTE socket over which we will receive change messages
    skt = socket( PF_ROUTE, SOCK_RAW, family );
    if ( skt == -1 )
    {
        printf( "ERR: Failed to create PF_ROUTE socket. error %d\n", errno );
        return -1;
    }

    printf( "Watching for %s address changes. Press Ctrl-C to exit\n",
            family == AF_UNSPEC ? "IP" : ( family == AF_INET6 ? "IPv6" : "IPv4" ) );

    // Loop forever waiting for messages
    for (;;)
    {
        len = recv( skt, buf, sizeof(buf), 0 );
        if ( len < 0 )
        {
            switch (errno)
            {
                case EINTR:
                case EAGAIN:
                    printf( "ERR: EINTR or EAGAIN on PF_ROUTE socket\n" );
                    continue;
                default:
                    printf( "ERR: Failed to receive on PF_ROUTE socket. error %d\n", errno );
                    continue;
            }
        }
        if ( (size_t)len < sizeof( struct cmn_msghdr ) )
        {
            printf( "ERR: Data received on PF_ROUTE socket too small: %ld bytes\n", (long)len );
            continue;
        }
        struct cmn_msghdr *hdr = (struct cmn_msghdr *)buf;
        if ( hdr->version != RTM_VERSION )
        {
            printf( "ERR: RTM version %d is not supported\n", hdr->version );
            continue;
        }
        switch( hdr->type )
        {
            case RTM_NEWADDR:
                printf( "RTM_NEWADDR\n" );
                break;
            case RTM_DELADDR:
                printf( "RTM_DELADDR\n" );
                break;
            case RTM_IFINFO:
                printf( "RTM_IFINFO\n" );
                break;
            default:
                // Not a message type we care about
                continue;
        }
    }
    return 0;
}

Related

Do I need to "unbind" pca953x driver from GPIO devices on embedded platform in order to read from device?

I've been having an issue reading from GPIO devices on an embedded device (USRP N310). i2cdetect gives "UU" for the particular devices that I'm trying to reach, indicating that the devices are already claimed by a driver. /sys/bus/i2c/drivers shows that the driver bound to these devices is pca953x. Previously, I was able to read from and write to a GPIO device (TCA6416) on the ZC706 platform; however, when doing a comparison of /sys/bus/i2c/drivers there, I don't see any driver associated with that chip. The code that I'm using is the following:
#include "i2c_dev.hpp"
int main()
{
int i2cfd;
__s32 num;
// Opening i2c adapter 6
printf("Opening bus adapter\n");
i2cfd = open("/dev/i2c-6", O_RDWR);
if ( i2cfd < 0 ) {
printf("Failed to open /dev/i2c-6: %s\n", strerror(errno));
return 1;
}
// Instatiating three objects of IO_Expander class
IO_Expander dba;
// Reading data from the IO Expander
printf("Setting slave address of device\n");
if (ioctl(i2cfd, I2C_SLAVE, 0x20) < 0) {
printf("Error setting slave address:%s\n", strerror(errno));
return 1;
}
printf("Reading data from the IO Expander for DB-A Object\n");
num = dba.read_data(i2cfd, 0x00);
if (num < 0) {
printf("Error reading data: %s\n", strerror(errno));
} else {
printf("The input value is %d\n", num);
}
printf("Leaving DB-A Object\n\n\n");
// Closing the adapter
close(i2cfd);
}
So, am I unable to read from the GPIO devices on the N310 platform because of this pca953x driver? If so, would the correct approach be to "unbind" the pca953x driver from the devices in order to read values from them?
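For what it's worth, the unbind I have in mind would look something like this (a sketch; the device name "6-0020" is an assumption based on adapter 6 and address 0x20 above, so verify what actually appears under /sys/bus/i2c/drivers/pca953x/ first):
#include <stdio.h>
#include <string.h>
#include <errno.h>
/* Detach the pca953x driver from one device by writing the device
   name to the driver's sysfs "unbind" file. Needs root. */
int unbind_pca953x(const char *dev_name)
{
    FILE *f = fopen("/sys/bus/i2c/drivers/pca953x/unbind", "w");
    if (!f) {
        printf("Failed to open unbind file: %s\n", strerror(errno));
        return -1;
    }
    fprintf(f, "%s", dev_name);   /* e.g. "6-0020" */
    fclose(f);
    return 0;
}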

Inconsistent behavior transmitting bursts of UDP packets on Windows 7

I've got two systems, both running Windows 7. The source is 192.168.0.87, the target is 192.168.0.22, they are both connected to a small switch on my desk.
The source is transmitting a burst of 100 UDP packets to the target with this program -
#include <iostream>
#include <vector>
#include <cstring>
using namespace std;

#include <winsock2.h>

int main()
{
    // It's Windows, we need this.
    WSAData wsaData;
    int wres = WSAStartup(MAKEWORD(2,2), &wsaData);
    if (wres != 0) { exit(1); }

    SOCKET s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s == INVALID_SOCKET) { exit(1); }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) { exit(3); }

    int max = 100;

    // build all the packets to send
    typedef vector<unsigned char> ByteArray;
    vector<ByteArray> v;
    v.reserve(max);
    for(int i=0;i<max;i++) {
        ByteArray bytes(150+(i%25), 'a'+(i%26));
        v.push_back(bytes);
    }

    // send all the packets out, one right after the other.
    addr.sin_addr.s_addr = htonl(0xC0A80016); // 192.168.0.22
    addr.sin_port = htons(24105);
    for(int i=0;i<max;++i) {
        if (sendto(s, (const char *)v[i].data(), (int)v[i].size(), 0,
                   (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            cout << "i: " << i << " error: " << WSAGetLastError();
        }
    }
    closesocket(s);
    WSACleanup();
    cout << "Complete!" << endl;
}
Now, on first run I get massive losses of UDP packets (often only 1 will get through!).
On subsequent runs, all 100 make it through.
If I wait for 2 minutes or so, and run again, I'm back to losing most of the packets.
Reception on the target system is done using Wireshark.
I also ran Wireshark at the same time on the source system, and found exactly the same trace as on the target in all cases.
That means that the packets are getting lost on the source machine, rather than being lost in the switch or on the wire.
I also tried running Sysinternals Process Monitor, and found that indeed, all 100 sendto calls do result in appropriate Winsock calls, but not necessarily in packets on the wire.
As near as I can tell (using arp -a), in all cases the target's IP is in the source's ARP cache.
Can anyone tell me why Windows is so inconsistent in how it treats these packets? I get that in my actual application I've just got to rate limit my sends a bit, but I'd like to understand why it works sometimes and not others.
Oh yes, and I also tried swapping the systems for send and receive, with no change in behavior.
Most probably the client is overrunning the UDP send buffer, perhaps while the ARP protocol is resolving the target MAC address. You say that you lose datagrams on the first run and after waiting 2 minutes or more. Why don't you check with Wireshark what happens in that first run (whether ARP frames are sent/received)?
If that is the problem, you could apply one of these two alternatives:
1. Before running, make sure the ARP entry is there.
2. Send the first datagram, wait 1 second or less, then send the burst (see the sketch below).
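A sketch of alternative 2, assuming the same socket, address, and packet vector as the program above (the 100 ms pause is an assumption; anything long enough for the ARP reply to arrive should do):
#include <winsock2.h>
#include <vector>
// Send one primer datagram to trigger ARP resolution, pause briefly,
// then send the rest of the burst.
void send_burst_with_primer(SOCKET s, const sockaddr_in &addr,
                            const std::vector<std::vector<unsigned char> > &v)
{
    if (v.empty()) return;
    sendto(s, (const char *)v[0].data(), (int)v[0].size(), 0,
           (const struct sockaddr *)&addr, sizeof(addr));   // primer
    Sleep(100);   // wait for ARP to resolve the target's MAC
    for (size_t i = 1; i < v.size(); ++i)                   // the real burst
        sendto(s, (const char *)v[i].data(), (int)v[i].size(), 0,
               (const struct sockaddr *)&addr, sizeof(addr));
}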

Mac OS X prism header pcap

I'm trying to capture packets in monitor mode on my Mac for research purposes. From these packets I need some specific information, e.g. the RSSI. Unfortunately, the link type says DLT_IEEE802_11_RADIO, but I actually expected DLT_PRISM_HEADER, because monitor mode should be turned on. This is a problem, because the radiotap header does not provide the RSSI value or the other fields I need.
Here is my code (I leave out the callback method and so forth):
#include <pcap/pcap.h>
#include <stdio.h>
#include <stdlib.h>

/* Packet-handling callback, omitted here for brevity */
void process_packet(u_char *args, const struct pcap_pkthdr *header, const u_char *packet);

int main(int argc, char *argv[])
{
    pcap_t *handle;                 /* Session handle */
    char *dev;                      /* The device to sniff on */
    char errbuf[PCAP_ERRBUF_SIZE];  /* Error string */

    /* Define the device */
    dev = pcap_lookupdev(errbuf);
    if(dev == NULL) {
        printf("Couldn't find default device: %s\n", errbuf);
        exit(EXIT_FAILURE);
    }
    printf("Device: %s\n", dev);

    //handle = pcap_open_live(dev, 1562, 1, 500, errbuf);
    handle = pcap_create(dev, errbuf);
    if(handle == NULL) {
        printf("pcap_create failed: %s\n", errbuf);
        exit(EXIT_FAILURE);
    }

    /* set monitor mode on */
    if(pcap_set_rfmon(handle, 1) != 0) {
        printf("monitor mode not available\n");
        exit(EXIT_FAILURE);
    }
    pcap_set_snaplen(handle, 2048);  // Set the snapshot length to 2048
    pcap_set_promisc(handle, 1);     // Turn promiscuous mode on
    pcap_set_timeout(handle, 512);   // Set the timeout to 512 milliseconds

    int status = pcap_activate(handle);
    if(status != 0) {
        printf("activation failed: %d\n", status);
    }
    printf("link-type: %s\n", pcap_datalink_val_to_name(pcap_datalink(handle)));

    int loop = pcap_loop(handle, 1, process_packet, NULL);
    if(loop != 0) {
        printf("loop terminated before exhaustion: %d\n", loop);
    }

    /* And close the session */
    pcap_close(handle);
    return(0);
}
So, does anybody know why I am receiving radiotap rather than Prism headers, and what I should do instead? Again, I am coding under OS X.
From these packets I need some specific information, e.g. the RSSI.
Then, unless the driver will let you request PPI headers rather than radiotap headers - use pcap_list_datalinks() in monitor mode after calling pcap_activate() and, if that includes DLT_PPI, set the link-layer header type to DLT_PPI with pcap_set_datalink() - you're out of luck. If you can request PPI headers, then you might be able to get RSSI values from that header; see the PPI specification.
Unfortunately, the link type says DLT_IEEE802_11_RADIO, but I actually expected DLT_PRISM_HEADER, because monitor mode should be turned on.
There is no reason whatsoever, on an arbitrary operating system with an arbitrary Wi-Fi device and driver, to expect that you'll get Prism headers in monitor mode. If you get radio information at all, you get whatever header the driver writer supplies. These days, drivers tend to use radiotap - Linux mac80211 drivers, most *BSD drivers, and OS X drivers all do.
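A sketch of the datalink check suggested above, to be run after pcap_activate() on the handle from the program in the question:
#include <pcap/pcap.h>
/* Ask the driver which link-layer header types it offers in monitor
   mode, and switch to PPI if it is one of them. Returns 0 if PPI is
   now selected, -1 otherwise. */
static int try_switch_to_ppi(pcap_t *handle)
{
    int *dlts = NULL;
    int n = pcap_list_datalinks(handle, &dlts);
    int result = -1;   /* assume PPI is not offered */
    for (int i = 0; i < n; i++) {
        if (dlts[i] == DLT_PPI) {
            result = pcap_set_datalink(handle, DLT_PPI);   /* 0 on success */
            break;
        }
    }
    pcap_free_datalinks(dlts);
    return result;
}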

Upper limit to UDP performance on windows server 2008

It looks like from my testing I am hitting a performance wall on my 10Gb network. I seem to be unable to read more than 180-200k packets per second. Looking at perfmon or Task Manager, I can receive up to a million packets/second, if not more. Testing 1 socket, or 10, or 100 doesn't seem to change this limit of 200-300k packets a second. I've fiddled with RSS and the like without success. Unicast vs. multicast doesn't seem to matter, and overlapped I/O vs. synchronous doesn't make a difference either. The size of the packets doesn't matter either. There just seems to be a hard limit to the number of packets Windows can copy from the NIC to the buffer. This is a Dell R410. Any ideas?
#include "stdafx.h"
#include <WinSock2.h>
#include <ws2ipdef.h>
static inline void fillAddr(const char* const address, unsigned short port, sockaddr_in &addr)
{
memset( &addr, 0, sizeof( addr ) );
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = inet_addr( address );
addr.sin_port = htons(port);
}
int _tmain(int argc, _TCHAR* argv[])
{
#ifdef _WIN32
WORD wVersionRequested;
WSADATA wsaData;
int err;
wVersionRequested = MAKEWORD( 1, 1 );
err = WSAStartup( wVersionRequested, &wsaData );
#endif
int error = 0;
const char* sInterfaceIP = "10.20.16.90";
int nInterfacePort = 0;
//Create socket
SOCKET m_socketID = socket( AF_INET, SOCK_DGRAM, IPPROTO_UDP );
//Re use address
struct sockaddr_in addr;
fillAddr( "10.20.16.90", 12400, addr ); //"233.43.202.1"
char one = 1;
//error = setsockopt(m_socketID, SOL_SOCKET, SO_REUSEADDR , &one, sizeof(one));
if( error != 0 )
{
fprintf( stderr, "%s: ERROR setsockopt returned %d.\n", __FUNCTION__, WSAGetLastError() );
}
//Bind
error = bind( m_socketID, reinterpret_cast<SOCKADDR*>( &addr ), sizeof( addr ) );
if( error == -1 )
{
fprintf(stderr, "%s: ERROR %d binding to %s:%d\n",
__FUNCTION__, WSAGetLastError(), sInterfaceIP, nInterfacePort);
}
//Join multicast group
struct ip_mreq mreq;
mreq.imr_multiaddr.s_addr = inet_addr("225.2.3.13");//( "233.43.202.1" );
mreq.imr_interface.s_addr = inet_addr("10.20.16.90");
//error = setsockopt( m_socketID, IPPROTO_IP, IP_ADD_MEMBERSHIP, reinterpret_cast<char*>( &mreq ), sizeof( mreq ) );
if (error == -1)
{
fprintf(stderr, "%s: ERROR %d trying to join group %s.\n", __FUNCTION__, WSAGetLastError(), "233.43.202.1" );
}
int bufSize = 0, len = sizeof(bufSize), nBufferSize = 10*1024*1024;//8192*1024;
//Resize the buffer
getsockopt(m_socketID, SOL_SOCKET, SO_RCVBUF, (char*)&bufSize, &len );
fprintf(stderr, "getsockopt size before %d\n", bufSize );
fprintf(stderr, "setting buffer size %d\n", nBufferSize );
error = setsockopt(m_socketID, SOL_SOCKET, SO_RCVBUF,
reinterpret_cast<const char*>( &nBufferSize ), sizeof( nBufferSize ) );
if( error != 0 )
{
fprintf(stderr, "%s: ERROR %d setting the receive buffer size to %d.\n",
__FUNCTION__, WSAGetLastError(), nBufferSize );
}
bufSize = 1234, len = sizeof(bufSize);
getsockopt(m_socketID, SOL_SOCKET, SO_RCVBUF, (char*)&bufSize, &len );
fprintf(stderr, "getsockopt size after %d\n", bufSize );
//Non-blocking
u_long op = 1;
ioctlsocket( m_socketID, FIONBIO, &op );
//Create IOCP
HANDLE iocp = CreateIoCompletionPort( INVALID_HANDLE_VALUE, NULL, NULL, 1 );
HANDLE iocp2 = CreateIoCompletionPort( (HANDLE)m_socketID, iocp, 5, 1 );
char buffer[2*1024]={0};
int r = 0;
OVERLAPPED overlapped;
memset(&overlapped, 0, sizeof(overlapped));
DWORD bytes = 0, flags = 0;
// WSABUF buffers[1];
//
// buffers[0].buf = buffer;
// buffers[0].len = sizeof(buffer);
//
// while( (r = WSARecv( m_socketID, buffers, 1, &bytes, &flags, &overlapped, NULL )) != -121 )
//sleep(100000);
while( (r = ReadFile( (HANDLE)m_socketID, buffer, sizeof(buffer), NULL, &overlapped )) != -121 )
{
bytes = 0;
ULONG_PTR key = 0;
LPOVERLAPPED pOverlapped;
if( GetQueuedCompletionStatus( iocp, &bytes, &key, &pOverlapped, INFINITE ) )
{
static unsigned __int64 total = 0, printed = 0;
total += bytes;
if( total - printed > (1024*1024) )
{
printf( "%I64dmb\r", printed/ (1024*1024) );
printed = total;
}
}
}
while( r = recv(m_socketID,buffer,sizeof(buffer),0) )
{
static unsigned int total = 0, printed = 0;
if( r > 0 )
{
total += r;
if( total - printed > (1024*1024) )
{
printf( "%dmb\r", printed/ (1024*1024) );
printed = total;
}
}
}
return 0;
}
I am using Iperf as the sender and comparing the amount of data received to the amount of data sent: iperf.exe -c 10.20.16.90 -u -P 10 -B 10.20.16.51 -b 1000000000 -p 12400 -l 1000
Edit: doing iperf to iperf, the performance is closer to 180k or so without dropping (8 MB client-side buffer). If I am doing TCP I can do about 200k packets/second. Here's what's interesting, though: I can do far more than 200k with multiple TCP connections, but multiple UDP connections do not increase the total (I test UDP performance with multiple iperfs, since a single iperf with multiple threads doesn't seem to work). All hardware acceleration is turned on in the drivers. It seems like UDP performance is simply subpar?
I've been doing some UDP testing with similar hardware as I investigate the performance gains that can be had from using the Winsock Registered I/O network extensions, RIO, in Windows 8 Server. For this I've been running tests on Windows Server 2008 R2 and on Windows Server 8.
I've yet to get to the point where I've begun testing with our 10Gb cards (they've only just arrived) but the results of my earlier tests and the example programs used to run them can be found here on my blog.
One thing that I might suggest is that with a simple test like the one you show, where there's very little work being done on each datagram, you may find that old-fashioned synchronous I/O is faster than the IOCP design; the IOCP design pulls ahead as the workload per datagram rises and you can fully utilise multiple threads.
Also, are your test machines wired back to back (i.e. without a switch), or do they run through a switch? If the latter, could the issue be down to the performance of your switch rather than your test machines? If you're using a switch, or have multiple NICs in the server, can you run multiple clients against the server? Could the issue be on the client rather than the server?
What CPU usage are you seeing on the sending and receiving machines? Have you looked at the machines' CPU usage with Process Explorer? This is more accurate than Task Manager. Which CPU is handling the NIC interrupts? Can you improve things by binding them to another CPU, or by changing the affinity of your test program to run on another CPU (see the sketch below)? Is your IOCP example spreading its threads across multiple NUMA nodes, or are you locking all of them to one node?
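As a minimal sketch of that affinity experiment (which core to pick is machine-specific; core 2 here is an arbitrary choice for illustration):
#include <windows.h>
// Pin the current thread to core 2 so the receive loop does not share
// a core with whichever CPU services the NIC interrupts.
void pin_receive_thread()
{
    SetThreadAffinityMask(GetCurrentThread(), 1 << 2);
}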
I'm hoping to get to run some more tests next week and will update my answer when I have done so.
Edit: For me the problem was due to the fact that the NIC drivers had "flow control" enabled and this caused the sender to run at the speed of the receiver. This had some undesirable "non-paged pool" usage characteristics and turning off flow control allows you to see how fast the sender can go (and the difference in network utilisation between the sender and receiver clearly shows how much data is being lost). See my blog posting here for more details.

Socket message sometimes not sent on Windows 7 / 2008 R2

When sending two UDP messages to a computer on Windows 7, it looks like sometimes the first message is not sent at all. Has anyone else experienced this?
The test code below demonstrates the issue on my machine. When I run the test program and watch all UDP traffic to 10.10.42.22, I see the second UDP message being sent, but the first UDP message is not sent. If I immediately run the program again, then both UDP messages are sent.
It doesn't fail every time, but it usually happens if I wait a couple minutes before running the test again.
#include <iostream>
#include <cstring>
#include <winsock2.h>

int main()
{
    WSADATA wsaData;
    WSAStartup( MAKEWORD(2,2), &wsaData );

    sockaddr_in addr;
    memset( &addr, 0, sizeof( addr ) );   // zero the struct, including sin_zero
    addr.sin_family = AF_INET;
    addr.sin_port = htons( 52383 );
    addr.sin_addr.s_addr = inet_addr( "10.10.42.22" );

    SOCKET s = socket( AF_INET, SOCK_DGRAM, IPPROTO_UDP );
    if ( sendto( s, "TEST1", 5, 0, (SOCKADDR *) &addr, sizeof( addr ) ) != 5 )
        std::cout << "first message not sent" << std::endl;
    if ( sendto( s, "TEST2", 5, 0, (SOCKADDR *) &addr, sizeof( addr ) ) != 5 )
        std::cout << "second message not sent" << std::endl;
    closesocket( s );

    WSACleanup();
    return 0;
}
The problem here is basically the same as this post and it has to do with section 2.3.2.2 of RFC 1122:
2.3.2.2 ARP Packet Queue
The link layer SHOULD save (rather than discard) at least one (the latest) packet of each set of packets destined to the same unresolved IP address, and transmit the saved packet when the address has been resolved.
It looks like opening a new socket for every UDP message is a workaround.
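A sketch of that workaround, using the same address setup as the test program above (whether it helps will depend on the stack's ARP queueing behaviour):
#include <winsock2.h>
#include <iostream>
// Send each datagram on a freshly created socket instead of reusing one.
static void send_on_fresh_socket(const sockaddr_in &addr, const char *msg, int len)
{
    SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (s == INVALID_SOCKET) return;
    if (sendto(s, msg, len, 0, (const struct sockaddr *)&addr, sizeof(addr)) != len)
        std::cout << "message not sent" << std::endl;
    closesocket(s);
}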
