Bursty UDP packet loss and increasing `SndbufErrors` - linux-kernel

I have a server application that sends UDP packets at 200 Mbps. The output Ethernet interface is 1000 Mbps, yet UDP packets are lost in bursts at irregular intervals. I noticed that the SndbufErrors counter in /proc/net/snmp increases whenever the packet loss occurs. The packet loss does not happen if the UDP packets are sent to the loopback interface.
No error is returned by udp.send.
I have dug into the Linux kernel source, but I get lost when I reach the routing subsystem.
What does SndbufErrors mean? Why does the number increase?
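(For illustration, a minimal sketch, not from the question: checking and enlarging a UDP socket's send buffer with SO_SNDBUF, whose exhaustion, or that of the device transmit queue behind it, is often associated with a growing SndbufErrors counter. The 4 MB figure is an arbitrary example, not a recommendation.)

```c
/* Sketch: inspecting and enlarging a UDP socket's send buffer.
   The 4 MB request below is an arbitrary example value. */
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    int sndbuf = 0;
    socklen_t len = sizeof(sndbuf);

    getsockopt(s, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len);
    printf("default SO_SNDBUF: %d bytes\n", sndbuf);

    /* Request a larger buffer; the kernel caps the value at
       net.core.wmem_max unless SO_SNDBUFFORCE is used (CAP_NET_ADMIN). */
    sndbuf = 4 * 1024 * 1024;
    setsockopt(s, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));

    getsockopt(s, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len);
    printf("effective SO_SNDBUF: %d bytes\n", sndbuf);
    return 0;
}
```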

Related

Significantly different LAN transmit speed

I have two machines, one Mac and one Linux, on the same local network. I tried transferring files using one of them as an HTTP server, and it turned out that the download speed differed considerably depending on which one acted as the server. With the Mac as the server, the download speed was around 3 MB/s; the other way around, it was about 12 MB/s. I then used iperf3 to test the speed between them and got a similar result:
When Mac was the server and Linux the client:
```
[ ID] Interval           Transfer     Bitrate          Retr
[  5] 0.00-10.00  sec    28.7 MBytes  2942 KBytes/sec  1905   sender
[  5] 0.00-10.00  sec    28.4 MBytes  2913 KBytes/sec         receiver
```
When Linux was the server and Mac the client:
```
[ ID] Interval           Transfer     Bandwidth
[  4] 0.00-10.00  sec    162 MBytes   16572 KBytes/sec        sender
[  4] 0.00-10.00  sec    161 MBytes   16526 KBytes/sec        receiver
```
I asked a friend to do the download test for me, and he told me the speeds were both around 1 MB/s on his two Macs, which was far below the router's capacity. How could this happen?
This isn't going to be much of an answer, but it will probably be long enough that it is not going to fit into a comment.
Your observation of "bogus TCP header length" is very interesting; I had never seen it in a capture before, so I wanted to check exactly what it means. It means that the Wireshark TCP dissector can't make any sense of the segment, because the TCP header length field is less than the minimum TCP header length.
So it seems you have an invalid TCP segment. The only two causes I know of are that it was somehow erroneously constructed (e.g. a bug or an intrusion attempt) or that it was corrupted.
I have certainly created plenty of invalid segments when working with raw sockets, and I have seen plenty of forged segments that were not standards-conforming, but neither seems likely to be the case in your situation.
So, based on my experience, it seems most likely that it was somehow corrupted. Although, if it was a transmitted packet in the capture, then you are actually sending an invalid segment; in what follows, I'm assuming it was a received segment.
So where could it have been corrupted? The first mystery is that you are seeing it at all in a capture. If it had been corrupted in the network, the Frame Check Sequence (FCS, a CRC) shouldn't match, and it should have been discarded.
However, it is possible to configure your NIC/Driver to deliver segments with an invalid FCS. On linux you would check/configure these settings with ethtool and the relevant parameters are rx-fcs and rx-all (sorry, I don't know how to do this on a Mac). If those are both "off," your NIC/Driver should not be sending you segments with an invalid FCS and hence they wouldn't appear in a capture.
Since you are seeing the segments with an invalid TCP header length in your capture, and assuming your NIC/Driver is configured to drop segments with an invalid FCS, then your NIC saw a valid segment on the wire, and the segment was either corrupted before the FCS was calculated by a transmitter (usually done in the NIC), or corrupted after the FCS was validated by the receiving NIC.
In both these cases, there is a DMA transfer over a bus (e.g. PCI-e) between CPU memory and the NIC. I'm guessing there is a hardware problem causing corruption here, but I'm not so confident in this guess as I have little information to go on.
You might try getting a capture on both ends to compare what is transmitted with what is received (particularly for the segments with invalid TCP header lengths). You can match segments across the two captures using the ID field in the IP header (assuming that doesn't get corrupted as well); a small sketch of doing this programmatically is below.
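(For concreteness, a minimal C sketch, assuming libpcap, plain Ethernet framing without VLAN tags, and a placeholder file name, that dumps the IP ID of each packet so the two captures can be lined up. A tshark or Wireshark display filter would do the same job.)

```c
/* Sketch: list IP IDs from a capture file so two captures can be
   matched packet-by-packet. Assumes an Ethernet link type and IPv4;
   build with -lpcap. "capture.pcap" is a placeholder name. */
#include <stdio.h>
#include <pcap/pcap.h>
#include <netinet/ip.h>
#include <arpa/inet.h>

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *p = pcap_open_offline("capture.pcap", errbuf);
    if (!p) { fprintf(stderr, "%s\n", errbuf); return 1; }

    struct pcap_pkthdr *hdr;
    const u_char *data;
    while (pcap_next_ex(p, &hdr, &data) == 1) {
        if (hdr->caplen < 14 + sizeof(struct ip))
            continue;                       /* too short for Eth + IP */
        const struct ip *iph = (const struct ip *)(data + 14);
        if (iph->ip_v != 4)
            continue;                       /* skip non-IPv4 */
        printf("IP ID: 0x%04x len=%u\n",
               ntohs(iph->ip_id), ntohs(iph->ip_len));
    }
    pcap_close(p);
    return 0;
}
```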
Good luck figuring it out!

On Windows, is WSASendTo() faster than sendto()?

Is WSASendTo() somehow faster than sendto() on Windows?
Is UDP sendto() faster with a non-blocking socket (if there is space in the send buffer)?
Similar to this question:
Faster WinSock sendto()
From my profiling, sending is network-bound with a blocking socket: for example, on a 100 Mbit network both send about 38,461 datagrams of 256 bytes per second, which is what the network allows. I was wondering if anyone has a preference between the two, speed-wise.
Sending from localhost to itself on 127.0.0.1, it seems to handle about 250k sends/s, which should be about 64 MB/s on a 3 GHz PC.
Blocking (i.e. without FIONBIO set) seems about twice as fast: with non-blocking set, it seems to drop to 32 MB/s if I retry on EWOULDBLOCK.
I don't need to do any heavy-duty UDP broadcasting; I'm only wondering about the most efficient way, if anyone has any deep-seated "feelings"?
Also, could there be some sort of transmission moderation taking place in network card drivers? Is there a maximum number of datagrams sendable on, say, a gigabit card? Would it tolerate, for example, 100k sends/s, or moderate it somehow?
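(Not an authoritative answer, but for concreteness, here is a minimal Winsock sketch of the loop being compared: toggle the FIONBIO flag to switch between the blocking case and the retry-on-WSAEWOULDBLOCK non-blocking case. The target address and port are placeholders.)

```c
/* Sketch: UDP send loop on Winsock, blocking vs. non-blocking.
   The 127.0.0.1:9000 target is a placeholder; error handling is
   reduced to the essentials. */
#include <winsock2.h>
#include <ws2tcpip.h>
#include <string.h>

#pragma comment(lib, "ws2_32.lib")

int main(void)
{
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;

    SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9000);              /* hypothetical port */
    inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

    u_long nonblocking = 1;                  /* set to 0 for blocking */
    ioctlsocket(s, FIONBIO, &nonblocking);

    char payload[256] = {0};
    for (long i = 0; i < 1000000; i++) {
        while (sendto(s, payload, sizeof(payload), 0,
                      (struct sockaddr *)&dst,
                      (int)sizeof(dst)) == SOCKET_ERROR) {
            if (WSAGetLastError() != WSAEWOULDBLOCK)
                goto out;                    /* real error: stop */
            /* send buffer full: busy-wait retry, as in the question */
        }
    }
out:
    closesocket(s);
    WSACleanup();
    return 0;
}
```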

How to let the kernel send out an Ethernet frame larger than 1514 bytes?

Here's a network performance issue. On my board there's a gigabit Ethernet PHY, and the Tx speed is much poorer than the Rx speed when I test network bandwidth with iperf. Comparing the packets captured with Wireshark, I can see that the board always sends out Ethernet frames of 1514 bytes, while it can receive larger Ethernet frames of up to 64 KB.
This is why the Tx performance is poorer than the Rx performance.
iperf sends 128 KB of data per send() call, but the kernel always segments it into 1514-byte frames before handing it to the network driver.
I traced skb->len while sending data; the log is below. I guess there's some kernel feature that can send larger Ethernet frames, but which one is it?
I tried changing the MTU to 8000 with ifconfig eth0 mtu 8000, but there was no improvement.
```
[  128.449334] TCP: Gang tcp_sendmsg 1176 msg->msg_iter.count=31216,size_goal=65160,copy=11640,max=65160
[  128.449377] TCP: Gang tcp_transmit_skb skb->len=46336
[  128.449406] Gang ip_output skb-len=46388
[  128.449416] Gang ip_finish_output2 skb->len=46388
[  128.449422] Gang sch_direct_xmit skb->len=46402
[  128.449499] Gang dev_hard_start_xmit skb->len=1514
[  128.449503] Gang dwmac_xmit skb->len=1514
[  128.449522] Gang dev_hard_start_xmit skb->len=1514 <>
[  128.449528] Gang dwmac_xmit skb->len=1514
```
What you're seeing (TX 1500 and RX 65K) is most likely due to TCP LRO and LSO: Large Receive Offload and Large Send Offload. Rather than having the OS segment or reassemble the packets, this work is passed off to the NIC to reduce the load on the CPU and improve overall performance.
You can use ethtool to verify whether either is set, and to enable or disable the offload function.
Using ethtool -k eth0, I found that tx-tcp-segmentation is off [fixed] (the [fixed] marker means the driver does not allow the setting to be toggled).
To enable it, NETIF_F_TSO needs to be turned on in the MAC driver; a sketch of what that typically looks like is below.
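(For reference, a minimal sketch of how a netdev driver typically advertises TSO. The function name is hypothetical, and where the flags are set depends on the particular MAC driver; NETIF_F_SG is included because scatter-gather is a prerequisite for TSO.)

```c
/* Sketch: advertising TSO from a network driver, e.g. in its probe
   path. "ndev" is the driver's struct net_device; the exact hook
   point depends on the specific MAC driver. */
#include <linux/netdevice.h>

static void example_enable_tso(struct net_device *ndev)
{
    /* Advertise TSO so the stack hands the driver large TCP skbs
       (up to ~64 KB) instead of pre-segmented 1514-byte frames.
       The hardware must actually be able to segment them. */
    ndev->hw_features |= NETIF_F_SG | NETIF_F_TSO;
    ndev->features    |= NETIF_F_SG | NETIF_F_TSO;
}
```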
Unfortunately, my driver crashes after enabling this feature, but that is another problem.
Thank you, Jeff S.

NS-3 TCP vs. UDP throughput

I'm a new NS-3 user. I'm trying to measure and verify the throughput of a TCP wireless network. When experimenting with "ht-wifi-network.cc" (http://www.nsnam.org/doxygen-release/ht-wifi-network_8cc_source.html) from the example files, I used the default settings, which give a UDP flow, and then tried a TCP flow. I noticed two things:
Throughput is very low compared with the data rate: UDP gets 22.78 out of 65 Mbps and TCP gets 11.73 out of 65 Mbps. Is this how the result should look? I was expecting at least 30 Mbps out of 65 Mbps.
UDP throughput is almost twice the TCP throughput, but I expected TCP throughput to be higher.
Can somebody help and explain why? Thanks!

C socket - Difference between sending and receiving time

I'm working with two devices whose clocks are correctly synchronized (offset less than 1 ms). I need to send 180 KB over WiFi (estimated bandwidth is about 20 Mb/s).
I'm using the C function send() (with TCP) on the sender and recv() on the receiver. Since the two clocks are synchronized, I expect the sending time and the receiving time to be the same (without taking the propagation time into account).
However, I found that the receiving time is 10-15 ms longer than the sending time, and considering that the estimated sending/receiving time should be about 60 ms, this difference is quite large. I don't think the problem is due to processing in the TCP stack on the receiver.
Any idea?
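(For concreteness, a minimal sketch, under the stated assumptions of synchronized clocks and an already-connected TCP socket, of how the two times might be measured. Note the comment on the sender side: send() returning only means the data was queued in the kernel send buffer, not that it reached the wire, which by itself can explain part of the gap.)

```c
/* Sketch: timestamping a 180 KB TCP transfer on both ends with
   CLOCK_REALTIME (clocks assumed synchronized). "sock" is an
   already-connected TCP socket; error handling is abbreviated. */
#include <time.h>
#include <sys/types.h>
#include <sys/socket.h>

#define TOTAL (180 * 1024)

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Sender: the timer stops when the last send() returns, i.e. when the
   tail of the data fits in the kernel's send buffer, which can be
   before the data is actually transmitted. */
double timed_send(int sock, const char *buf)
{
    double t0 = now_sec();
    size_t off = 0;
    while (off < TOTAL) {
        ssize_t n = send(sock, buf + off, TOTAL - off, 0);
        if (n <= 0) return -1;
        off += (size_t)n;
    }
    return now_sec() - t0;   /* "sending time" */
}

/* Receiver: loop until all bytes have arrived. */
double timed_recv(int sock, char *buf)
{
    double t0 = now_sec();
    size_t off = 0;
    while (off < TOTAL) {
        ssize_t n = recv(sock, buf + off, TOTAL - off, 0);
        if (n <= 0) return -1;
        off += (size_t)n;
    }
    return now_sec() - t0;   /* "receiving time" */
}
```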
