C socket - Difference between sending and receiving time - performance

I'm working with two devices, that have their clock correctly synchronized (offset less than 1 ms). I need to send 180KB, using WiFi (estimated bandwidth is about 20Mb/s).
I'm using the C function send (with TCP) on the sender, the recv on the receiver. Since the two clock are synchronized, I expect that the sending time and the receiving time should be the same (without taking into account the propagation time).
However, I obtained that the receiving time is 10ms-15ms higher than the sending time, and considering that the estimated sending/receiving time should be about 60ms, this difference is quite high. I don't think that the problem is due to the processing through the TCP stack on the receiver.
Any idea?

Related

Synchronisation for audio decoders

There's a following setup (it's basically a pair of TWS earbuds and a smartphone):
2 audio sink devices (or buds), both are connected to the same source device. One of these devices is primary (and is responsible for handling connection), other is secondary (and simply sniffs data).
Source device transmits a stream of encoded data and sink device need to decode and play it in sync with each other. There problem is that there's a considerable delay between each receiver (~5 ms # 300 kbps, ~10 ms # 600 kbps and # 900 kbps).
It seems that synchronisation mechanism which is already implemented simply doesn't want to work, so it seems that my only option is to implement another one.
It's possible to send messages between buds (but because this uses the same radio interface as sink-to-source communication, only small amount of bytes at relatively big interval could be transferred, i.e. 48 bytes per 300 ms, maybe few times more, but probably not by much) and to control the decoder library.
I tried the following simple algorithm: secondary will send every 50 milliseconds message to primary containing number of decoded packets. Primary would receive it and update state of decoder accordingly. The decoder on primary only decodes if the difference between number of already decoded frame and received one from peer is from 0 to 100 (every frame is 2.(6) ms) and the cycle continues.
This actually only makes things worse: now latency is about 200 ms or even higher.
Is there something that could be done to my synchronization method or I'd be better using something other? If so, what would be the best in such case? Probably fixing already existing implementation would be the best way, but it seems that it's closed-source, so I cannot modify it.

On Windows, is WSASendTo() faster than sendto()?

Is WSASendTo() somehow faster than sendto() on Windows?
Is UDP sendto() faster with a non-blocking socket (if there is space in the send buffer)?
Similar to this question :
Faster WinSock sendto()
From my profiling, the send is network bound with blocking socket, i.e. for example with 100 mbit network both send about 38461 datagrams of size 256 bytes/s which is the network speed allowable, I was wondering if anyone has any preference over the 2 speed wise.
sending from localhost to itself on 127.0.0.1 it seems to handle about 250 k send / s which should be about 64 mbyte/s on a 3 ghz pc
it seems 2 times faster blocking, i.e. without FIONBIO set, i.e. with non blocking set it seems to drop to 32 mbyte/s if I retry on EWOULDBLOCK
I don't need to do any heavy duty UDP broadcasting, only wondering the most efficient way if anyone has any deep set "feelings" ?
Also could there be some sort of transmission moderation taking place on network card drivers could there be a maximum datagrams sendable on a gigabit card say would it tolerate for example 100k sends/s or moderate somehow ?

Veins delay does not change with beacon frequency or number of nodes

I'm trying to simulate an imergancy breaking application using veins and analyze its performance. Research papers on 802.11p shows that as beacon frequency and number of vehicles increase delay should increase considerably due to mac layer delay of the protocol ( for 50 vehicles at 8Hz - about 300ms average delay).
But when simulating application with veins delay values does not show much different ( it ranges from 1ms-4ms).I've checked the Mac layer functionality and it appears that the channel is idle most of the time. So when a packet reaches Mac layer the channel has already been idle for more than the DIFS so packet gets sent quickly. I tried increasing packet size and reducing bitrate. It increase the previous delay by a certain amount. But drastic increase of delay due to backoff process cannot be seen.
Do you know why this happens ???
As you use 802.11p the default data rate on the control channel is 6Mbits (source: ETSI EN 302 663)
750Mbyte/s = 750.000bytes/s
Your beacons contain 500bytes. So the transmission of a beacon takes about 0.0007 seconds. As you have about 50 cars in your multi lane scenario and for example they are sending beacons with a 10 hertz frequency, it takes about 0.35s from 1 second to transmit your 500 beacons.
In my opinion, this are to less cars to create your mentioned effect, because the channel is idling for about 60% of the time.

What is the reason to big overhead while send data using spidev

I'm using spidev driver (linux embedded) to send data through spi communication.
I sent 7 bytes of data (in 1 Mb clock rate using "write" command) and I noticed that it takes me approximately 200 microseconds to complete that operation (I used scope to verify that the clock rate is correct).
Time of sending that data should be 56 microseconds + some overhead but it seems to me too much.
What can be the reason for that overhead?
Is it connected to the switch between user space and kernel space? or is it connected to spidev implementation?

Why wait SIFS time before sending ACK?

Question about the MAC-protocol of 802.11 Wifi.
We have learned that when a station has received the data it waits for SIFS time. Then it sends the packet. When searching online the reason that is always mentioned is to give ACK packets a higher priority. This is understandable since a station first has to wait DIFS time when it wants to send normal data (and DIFS is larger than SIFS).
But why wait at all? Why not immediately send the ACK? The station knows the data has arrived and the CRC is correct, so why wait?
It is theoretically possible to know that the CRC is correct at the exact end of the received data from the wire, but in practice, you need to accumulate all the samples in the last block in order to run the IFFT, deconvolution, FEC, and then, finally, at the very end, after finally getting the input data out of the waveform, do you know that the CRC is correct. Also, you sometimes need to turn on transmit circuitry to send the ACK, which can hamper receive performance. If each step in the processing chain were instantaneous, and if the transmit circuitry definitely didn't interfere with the receive circuitry, and if there were no lead-time necessary for building the waveform for the ACK, it'd be possible to send the ACK immediately after getting the last bit of the wave-form. But, while each element in this chain takes some deterministic time, it is not instantaneous. SIFS gives the receiver time to get the data from the PHY, verify it, and send the response.
Disclaimer: I'm more familiar with Homeplug than 802.11.
It is like that because Distributed Coordination Function (DCF) and Point Coordination Function (PCF) mode can coexist within one cell. That is a base station may use polling while at the same time the cell can use disitributed coordination using CSMA/CA.
So during SIFS, control frames or next fragment may be sent. During PIFS, PCF frames may be sent and during DIFS DCF frames may be sent. During SIFS and PIFS, PCF can work its magic.
Even though not all base stations support PCF all stations must wait since some may support it.
Update:
The way I understand this now is that during SIFS the station may send RTS,CTS or ACK and have enough time to switch back to receiving mode before the sender starts to transmit. If that's correct, it will send ACK during SIFS. Then it will change to receive mode and wait until SIFS elapses. When SIFS has elapsed the transmitter will start sending.
Also, PCF is controlled by PIFS which comes after SIFS and before DIFS and is therefor not relevant for this discussion (my mistake). That is, SIFS < PIFS < DIFS < EIFS.
Sources: This PDF (page 8) and Computer Networks by Andrew S. Tanenbaum
SIFS = RTT (based on PHY Transmission rate) + FRAME PROCESSING DELAY AT RECEIVER (PHY PROCESSING DELAY + MAC PROCESSING DELAY) + FRAME PROCESSING DELAY (FOR COMPOSING RESPONSE CTS/ACK)+ RF TUNER DELAY (CHANGE FROM RX to TX)
A the Transmitter side, after last PHY Symbol (of RTS, e.g), the time required to change to RX mode (at RF). So, I would see SIFS as a RX side calculation than a TX side.
I can't say for sure but it sounds like an optimization strategy similar to IP. If you don't require an ACK for every data packet, it makes sense to hold off for a bit so that, if more data packets arrive, you can acknowledge them all with a single ACK.
Example: client sends 400 packets really fast to the server. Rather than the server sending back 400 ACKs, it can simply wait until the client takes a breather before sending a single ACK back. Combined with the likelihood that the client will take a breather even under heavy load (it has to as its unacknowledged-packets buffer fills up), this would be workable.
This is possible in systems where the ACK(n) means "I've received everything up to and including packet # n.
You'll get better performance and less traffic by using such a strategy. As long as the wait-before-sending-ack time on the receiver is less than the retransmit-if-no-ack-before time on the sender (taking transmission delays into account), there should be no problem.
Quick crash-course on 802.11:
802.11 is a essentially a giant system of timers. The most common implementations of 802.11 utilize the distributed coordination function, DCF. The DCF allows for nodes to come in and out of the range of a radio channel being used for 802.11 and coordinate in a distributed fashion who should be sending and receiving data (ignoring hidden and exposed node problems for this discussion). Before any node can begin sending data on the channel they all must wait a period of DIFS, in which the channel is determined to be idle, if it is idle during a DIFS period the first node to grab the channel begins transmitting. In standard 802.11, i.e. non-802.11e implementations and non 802.11n, every single data packet that gets transmitted needs to be acknowledged by a physical layer, PHY, acknowledgment packet, irregardless of the upper layer protocol being used. After a data packet gets sent a SIFS time period needs to expire, after SIFS expires control frames destined for the node that has "taken" control of the channel may be sent, in this instance and acknowledgment frame is transmitted. SIFS allows the node that sent the data packet to switch from transmitting to receiving mode. If a packet does get lost and no ACK is received after SIFS/ACK timeout occurs, then exponential back-off is invoked. Exponential back-off, a.k.a contention window (CW), begins at a value CWmin, in some linux implementation this is 15 slot times, where a slot time varies depending on the 802.11 protocol that is being used. The CW value is then chosen from 1 to whatever the upper limit that has been calculated for CW. If the current packet was lost, then the CW is incremented from 15 to 30, and then a random value is chosen between 1 and 30. Every-time there is a consecutive lose the CW doubles up to 1023, at which point it hits a limit. Once a packet is received successfully the CW is reset back to CWmin.
In regards to 802.11n / 802.11e:
Every data packet still needs to be acknowledged, but when using 802.11e (implemented into 802.11n) multiple data packets can be aggregated together in two different ways A-MSDU or A-MPDU. A-MSDU is a jumbo-frame that has one checksum for the entire aggregated packet being sent, within it are many sub-frames that contain each of the data frames that needed to be sent. If there is any error in the A-MSDU frame and it needs to be retransmitted, then every sub-frame is required to be resent. However, when using A-MPDU, each sub-frame has a small header and checksum that allow for any sub-frame that has an error in it to be retransmitted by itself/within another aggregated frame the next time the sending nodes gains the channel. With these aggregated packet sending schemes there is the notion of the block-ack. The block-ack contains a bitmap of the frames from a starting sequence number that were just sent in the aggregated packet and received correctly or incorrectly. Usage of aggregated frame sending greatly improves throughput performance as more data can be sent per channel acquisition by a sending node, also allowing out-of-order packet sending. However, out-order packet sending greatly complicates the 802.11 MAC layer.
SIFS=D+M+Rx/Tx
Where,
D=(Receiver delay (RF delay) and decoding of physical layer convergence procedure (PLCP) preamble/header)
M=(MAC processing delay)
Rx/Tx=(transceiver turnaround time)
Above all the delays can not be avoided so It has to wait SIFS time before sending acknowledgement

Resources