ZeroMQ buffer size v/s High Water Mark - zeromq

In the zeromq socket options, we have flags for both a high water mark and a buffer size.
For sending, it's ZMQ_SNDHWM and ZMQ_SNDBUF.
Can someone explain the difference between these two?

Each one controls some other thing:
ZMQ_SNDBUF: Set kernel transmit buffer size
The ZMQ_SNDBUF option shall set the underlying kernel transmit buffer size for the socket to the specified size in bytes. A ( default ) value of -1 means leave the OS default unchanged.
where man 7 socket says: ( credits go to #Matthew Slatery )
[...]
SO_SNDBUF
Sets or gets the maximum socket send buffer in bytes. The ker-
nel doubles this value (to allow space for bookkeeping overhead)
when it is set using setsockopt(), and this doubled value is
returned by getsockopt(). The default value is set by the
wmem_default sysctl and the maximum allowed value is set by the
wmem_max sysctl. The minimum (doubled) value for this option is
2048.
[...]
NOTES
Linux assumes that half of the send/receive buffer is used for internal
kernel structures; thus the sysctls are twice what can be observed on
the wire.
[...]
whereas
ZMQ_SNDHWM: Set high water mark for outbound messages
The ZMQ_SNDHWM option shall set the high water mark for outbound messages on the specified socket. The high water mark is a hard limit on the maximum number of outstanding messages ØMQ shall queue in memory for any single peer that the specified socket is communicating with. A ( non-default ) value of zero means no limit.
If this limit has been reached the socket shall enter an exceptional state and depending on the socket type, ØMQ shall take appropriate action such as blocking or dropping sent messages. Refer to the individual socket descriptions in zmq_socket(3) for details on the exact action taken for each socket type.
ØMQ does not guarantee that the socket will accept as many as ZMQ_SNDHWM messages, and the actual limit may be as much as 60-70% lower depending on the flow of messages on the socket.
Verba docent, Exempla trahunt:
Having an infrastructure setup, where zmq.PUB side leaves all the settings in their default values, and such sender will have about 20, 200, 2000 zmq.SUB abonents listening to the sender, there would be soon depleted the default, unmodified O/S kernel buffer space, as each .bind()/.connect() relation ( each abonent ) will try to "stuff" as much as about the total cummulative sum of ~ 1000 * aVarMessageSIZE [Bytes] of data, once having been broadcast in a .send( ..., ZMQ_DONTWAIT ) manner & if the O/S would not have provided a sufficient buffer space -- there we go --
"Houston, we have a problem..."If the message cannot be queued on the socket, the zmq_send() function shall fail with errno set to EAGAIN.
Q.E.D.

Related

PIC SPI configuration questions

I have some questions related to SPIxCON registers of SPI. I use PIC18F26K83.
1) There is a SPIxTCNTH: SPI TRANSFER COUNTER MSB REGISTER. And I can set first 3 bits on it which counters the bits to be transmitted. And according to datasheet it is writable bit. According to the datasheet it counts bits that will be transmitted then why is it writable? Do I need to write it according to bits that I will send? Or is it there to inform user.
2) There is SPIxTWIDTH: SPI TRANSFER WIDTH REGISTER. In case of BMODE=1, it is
Size (in bits) of each transfer counted by the transfer counter
I will be sending values such as 1.1 or 2.3 to DAC. In this case what should I set it to? Is there a standart value for this register?
3) I couldn't get what are FIFO registers are for according to datasheet we cannot control them by software. Isn't it like a buffer? So If I am writing to transmit register faster than transmission speed, the transmit data will be put into the FIFO. And one by one they will be transmitted. Am I correct? I do not need to anything rather than writing to the transmit buffer.
4) I read but could not understand the polarity bits in SPIxCON1. Is it okay if I do not touch these bits in the control register? I do not want to mess up.
5) How can I select slaves? There is a SSET (Slave select enable bit) in the SPIxCON2 register. I can make it 1, but then how can I select the slave?
Thank you for your answers. I am a newbie. Sorry for the simple and maybe non sense questions. Or I can simply show my configuration code, but I believe it would be harder to analyse.
1) The transfer counter (when in use) is written to with the number of bytes, or partial bytes, to send or receive (depending on mode). So you'd set it, if you are using it (BMODE=0 or TXR=0) to the number of bytes that you are expecting to send or receive.
2) You'd need to look at your binary representation of those numbers to see how many bits you'd be sending in each case. Standard value is a full byte.
3) the FIFOs are hidden elements, writing to the SPIxTXB or reading from the SPIxRXB registers accesses the respective FIFO. the FIFOs are only two bytes deeps so you'd still need to check for overrun if you are sending fast TXWE bit (iirc) but if you have lots of data to transfer fast I'd recommend using DMA to do the transfer then you'd just be setting it up and letting it go and can do other things until it is finished.
4) I think the polarity bits just set the line level during idle state to either high or low. It should be set the same for everyone (masters and slaves).
5) If you only have one slave you can tie that line to the slaves enable line. If you have more than one slave you'll need to set up a gpio line for each and (for each one) OR the signals together and attach the OR output to the slave enable (if it is active low, which it usually is). Make sure only one slave is active at a time. Doing a daisy chain of slaves can be done as well. I haven't worked with that kind of setup.

Gianfar Linux Kernel Driver Maximum Receive/Transmit Size

I have been trying to understand the code for the gianfar linux ethernet driver and was having difficulty understanding fragemented pages. I understand the maximum transmission size is 9600 bytes, however does this include fragments ?
Is it possible to send and received transmissions that are larger in size (e.g. 14000 bytes) if they are split among multiple fragements ?
Thank you in advance
9600 is a jumbo frame maximum size. The maximum MTU ("jumbo MTU") size is 9600 - 14 = 9586 bytes. Also, if I recall correctly, MTU never includes 4-byte FCS.
So, 9586 must be simply the maximum Ethernet "payload" size which can be put on wire. It's a limitation with respect to a single Ethernet frame. So, if you have a larger chunk of data ("transmission"), you might be able to "slice" it and produce multiple Ethernet frames from it (to be precise, multiple independent skb-s), each fitting the MTU size. So, in this case you will have multiple independent Ethernet frames to be handed over to the network driver. The interconnection between these frames will only be detectable on the IP header level, i.e., if you peek at IP header of the 1st frame you will be able to see "more fragments" flag indicating that the next frame contains an IP packet which is the next fragment of the original (large) chunk of data. But from the driver's point of view such frames should remain independent.
However, if you mean "skb fragments" rather than "IP fragments", then putting a 14000 byte frame into multiple fragments ("data fragments") of a single skb might not be helpful with respect to the MTU (say, you've configured the jumbo MTU on the interface). Because these fragments are just smaller chunks of contiguous memory containing different parts of the same Ethernet frame. And the driver just makes multiple descriptors pointing to these chunks of memory. The hardware will pick them to send a single frame. And if the HW sees that the overall frame length is bigger than the maximum MTU, it might decline the transmission. Exact behaviour in this case is a topic for a separate talk.

what is line sequential buffer length in informatica?

I need explanation for below questions.
what is line sequential buffer length in informatica?
how integration service handles when allocated buffer is full?
Line sequential buffer length in informatica is a property of session which specifies the accepted length of bytes from an individual record of a flat file source. The default size is 1024 as can be seen in the attached screenshot.
To improve the performance of a session generally the size is decreased.
When the allocated buffer is full, the integration service stops the execution of the session and logs the error message in the session logs.

Serial port output buffer size in Windows 7

Unix serial ports have a large output buffer. Write calls return immediately as long as there's space in the buffer. When there isn't enough space, a blocking write waits until the buffer is emptied to some low level.
In Windows 7 SP1, the built-in 16550 serial port behaves as if there's no output buffer. It seems writes block until the data is output from the port. If there is a buffer, it's even smaller than the 16 bytes set in Device Manager (in Advanced Settings for COM1). The SetupComm function lets me specify recommended sizes for input and output buffers. However, the output buffer size doesn't seem to change any behaviour, and GetCommProperties always sets the dwCurrentTxQueue field to zero. The only thing SetupComm can do is increasing the size of the input buffer.
The documentation for SetupComm specifically permits the device driver to ignore the requested values.
Your best bet is to use overlapped I/O and handle the buffering yourself.

Why wait SIFS time before sending ACK?

Question about the MAC-protocol of 802.11 Wifi.
We have learned that when a station has received the data it waits for SIFS time. Then it sends the packet. When searching online the reason that is always mentioned is to give ACK packets a higher priority. This is understandable since a station first has to wait DIFS time when it wants to send normal data (and DIFS is larger than SIFS).
But why wait at all? Why not immediately send the ACK? The station knows the data has arrived and the CRC is correct, so why wait?
It is theoretically possible to know that the CRC is correct at the exact end of the received data from the wire, but in practice, you need to accumulate all the samples in the last block in order to run the IFFT, deconvolution, FEC, and then, finally, at the very end, after finally getting the input data out of the waveform, do you know that the CRC is correct. Also, you sometimes need to turn on transmit circuitry to send the ACK, which can hamper receive performance. If each step in the processing chain were instantaneous, and if the transmit circuitry definitely didn't interfere with the receive circuitry, and if there were no lead-time necessary for building the waveform for the ACK, it'd be possible to send the ACK immediately after getting the last bit of the wave-form. But, while each element in this chain takes some deterministic time, it is not instantaneous. SIFS gives the receiver time to get the data from the PHY, verify it, and send the response.
Disclaimer: I'm more familiar with Homeplug than 802.11.
It is like that because Distributed Coordination Function (DCF) and Point Coordination Function (PCF) mode can coexist within one cell. That is a base station may use polling while at the same time the cell can use disitributed coordination using CSMA/CA.
So during SIFS, control frames or next fragment may be sent. During PIFS, PCF frames may be sent and during DIFS DCF frames may be sent. During SIFS and PIFS, PCF can work its magic.
Even though not all base stations support PCF all stations must wait since some may support it.
Update:
The way I understand this now is that during SIFS the station may send RTS,CTS or ACK and have enough time to switch back to receiving mode before the sender starts to transmit. If that's correct, it will send ACK during SIFS. Then it will change to receive mode and wait until SIFS elapses. When SIFS has elapsed the transmitter will start sending.
Also, PCF is controlled by PIFS which comes after SIFS and before DIFS and is therefor not relevant for this discussion (my mistake). That is, SIFS < PIFS < DIFS < EIFS.
Sources: This PDF (page 8) and Computer Networks by Andrew S. Tanenbaum
SIFS = RTT (based on PHY Transmission rate) + FRAME PROCESSING DELAY AT RECEIVER (PHY PROCESSING DELAY + MAC PROCESSING DELAY) + FRAME PROCESSING DELAY (FOR COMPOSING RESPONSE CTS/ACK)+ RF TUNER DELAY (CHANGE FROM RX to TX)
A the Transmitter side, after last PHY Symbol (of RTS, e.g), the time required to change to RX mode (at RF). So, I would see SIFS as a RX side calculation than a TX side.
I can't say for sure but it sounds like an optimization strategy similar to IP. If you don't require an ACK for every data packet, it makes sense to hold off for a bit so that, if more data packets arrive, you can acknowledge them all with a single ACK.
Example: client sends 400 packets really fast to the server. Rather than the server sending back 400 ACKs, it can simply wait until the client takes a breather before sending a single ACK back. Combined with the likelihood that the client will take a breather even under heavy load (it has to as its unacknowledged-packets buffer fills up), this would be workable.
This is possible in systems where the ACK(n) means "I've received everything up to and including packet # n.
You'll get better performance and less traffic by using such a strategy. As long as the wait-before-sending-ack time on the receiver is less than the retransmit-if-no-ack-before time on the sender (taking transmission delays into account), there should be no problem.
Quick crash-course on 802.11:
802.11 is a essentially a giant system of timers. The most common implementations of 802.11 utilize the distributed coordination function, DCF. The DCF allows for nodes to come in and out of the range of a radio channel being used for 802.11 and coordinate in a distributed fashion who should be sending and receiving data (ignoring hidden and exposed node problems for this discussion). Before any node can begin sending data on the channel they all must wait a period of DIFS, in which the channel is determined to be idle, if it is idle during a DIFS period the first node to grab the channel begins transmitting. In standard 802.11, i.e. non-802.11e implementations and non 802.11n, every single data packet that gets transmitted needs to be acknowledged by a physical layer, PHY, acknowledgment packet, irregardless of the upper layer protocol being used. After a data packet gets sent a SIFS time period needs to expire, after SIFS expires control frames destined for the node that has "taken" control of the channel may be sent, in this instance and acknowledgment frame is transmitted. SIFS allows the node that sent the data packet to switch from transmitting to receiving mode. If a packet does get lost and no ACK is received after SIFS/ACK timeout occurs, then exponential back-off is invoked. Exponential back-off, a.k.a contention window (CW), begins at a value CWmin, in some linux implementation this is 15 slot times, where a slot time varies depending on the 802.11 protocol that is being used. The CW value is then chosen from 1 to whatever the upper limit that has been calculated for CW. If the current packet was lost, then the CW is incremented from 15 to 30, and then a random value is chosen between 1 and 30. Every-time there is a consecutive lose the CW doubles up to 1023, at which point it hits a limit. Once a packet is received successfully the CW is reset back to CWmin.
In regards to 802.11n / 802.11e:
Every data packet still needs to be acknowledged, but when using 802.11e (implemented into 802.11n) multiple data packets can be aggregated together in two different ways A-MSDU or A-MPDU. A-MSDU is a jumbo-frame that has one checksum for the entire aggregated packet being sent, within it are many sub-frames that contain each of the data frames that needed to be sent. If there is any error in the A-MSDU frame and it needs to be retransmitted, then every sub-frame is required to be resent. However, when using A-MPDU, each sub-frame has a small header and checksum that allow for any sub-frame that has an error in it to be retransmitted by itself/within another aggregated frame the next time the sending nodes gains the channel. With these aggregated packet sending schemes there is the notion of the block-ack. The block-ack contains a bitmap of the frames from a starting sequence number that were just sent in the aggregated packet and received correctly or incorrectly. Usage of aggregated frame sending greatly improves throughput performance as more data can be sent per channel acquisition by a sending node, also allowing out-of-order packet sending. However, out-order packet sending greatly complicates the 802.11 MAC layer.
SIFS=D+M+Rx/Tx
Where,
D=(Receiver delay (RF delay) and decoding of physical layer convergence procedure (PLCP) preamble/header)
M=(MAC processing delay)
Rx/Tx=(transceiver turnaround time)
Above all the delays can not be avoided so It has to wait SIFS time before sending acknowledgement

Resources