Is there a way to query whether the send buffer of a NIC is full or not in the Linux kernel? I don't mean the buffer size, I mean the current usage of the buffer.
Thanks
I have a system that includes an AST BMC. The BMC can provide a KVM for a user to control the host.
The BMC can give the user a KVM because it has access to the host frame buffer.
On the host, the frame buffer is physical memory mapped over the PCI bus, and you can find its address using lspci.
On the BMC I have found the memory ranges that hold the frame buffer values.
My question is: how exactly does the BMC have access to these values? Who is responsible for giving the BMC access to these exact memory ranges?
Thank you very much.
I'm trying to understand everything that happens in between the time a packet reaches the NIC until the time the packet is received by the target application.
Assumption: buffers are big enough to hold an entire packet. [I know it is not always the case, but I don't want to introduce too many technical details]
One option is:
1. A packet reaches the NIC.
2. An interrupt is raised.
3. The packet is transferred from the NIC buffer to the OS's memory by means of DMA.
4. Another interrupt is raised and the OS copies the packet from its buffer to the relevant application.
The problem with the above is when there is a short burst of data and the kernel can't keep up with the pace. Another problem is that every packet triggers an interrupt, which sounds very inefficient to me.
I know that to solve at least one of the above problems several buffers are used [a ring buffer]. However, I don't understand the mechanism that makes this work.
Suppose that:
1. A packet arrives at the NIC.
2. DMA is triggered and the packet is transferred to one of the buffers [from the ring buffer].
3. Handling of the packet is then scheduled for a later time [bottom half].
Will this work?
Is this what happens in real NIC drivers within the Linux kernel?
According to this slideshare, the correct sequence of actions is:
1. The network device receives frames, and these frames are transferred to the DMA ring buffer.
2. After this transfer is made, an interrupt is raised to let the CPU know.
3. In the interrupt handler routine, the CPU transfers the data from the DMA ring buffer to the CPU network input queue for processing at a later time.
4. The bottom half of the handler routine processes the packets from the CPU network input queue and passes them to the appropriate layers.
The slight variation here, compared to a traditional DMA transfer, is in when the CPU gets involved. Here the CPU is involved after the data has been transferred to the DMA ring buffer, unlike a traditional DMA transfer, where an interrupt is raised as soon as data is available and the CPU is expected to initialize the DMA device with the appropriate memory locations before the transfer of data can happen.
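To connect this to actual driver code: below is a minimal sketch of the pattern in its modern Linux form (NAPI), which is how the "bottom half" part is usually done today. All mydev_* names are made-up placeholders, not a real driver's API; napi_schedule(), napi_complete() and netif_receive_skb() are the real kernel calls.

#include <linux/interrupt.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

struct mydev_priv {
    struct napi_struct napi;
    /* ... ring pointers, register mappings, etc. ... */
};

/* Assumed hardware-specific helpers, not shown. */
static void mydev_disable_rx_irq(struct mydev_priv *priv);
static void mydev_enable_rx_irq(struct mydev_priv *priv);
static bool mydev_ring_has_packet(struct mydev_priv *priv);
static struct sk_buff *mydev_take_packet(struct mydev_priv *priv);

static irqreturn_t mydev_interrupt(int irq, void *dev_id)
{
    struct mydev_priv *priv = dev_id;

    /* Top half: acknowledge the NIC, mask further RX interrupts,
     * and schedule the bottom half (the NAPI poll routine). */
    mydev_disable_rx_irq(priv);
    napi_schedule(&priv->napi);
    return IRQ_HANDLED;
}

static int mydev_poll(struct napi_struct *napi, int budget)
{
    struct mydev_priv *priv = container_of(napi, struct mydev_priv, napi);
    int done = 0;

    /* Bottom half: walk the DMA ring and hand completed frames
     * to the network stack, at most 'budget' packets per call. */
    while (done < budget && mydev_ring_has_packet(priv)) {
        struct sk_buff *skb = mydev_take_packet(priv);
        netif_receive_skb(skb);
        done++;
    }

    if (done < budget) {
        /* Ring drained: stop polling and re-enable interrupts. */
        napi_complete(napi);
        mydev_enable_rx_irq(priv);
    }
    return done;
}

Note how this addresses both problems from the question: a burst of packets sits safely in the ring until the poll routine catches up, and while polling is active the RX interrupt stays masked, so packets no longer trigger one interrupt each.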
Read this as well: https://www.safaribooksonline.com/library/view/linux-device-drivers/0596000081/ch13s04.html
At the moment, the transmit and receive packet size is defined by a macro
#define PKT_BUF_SZ (VLAN_ETH_FRAME_LEN + NET_IP_ALIGN + 4)
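For reference, the arithmetic works out as follows, assuming the standard mainline header values (worth confirming against your tree):

/* Standard kernel header values:
 *   ETH_FRAME_LEN      = 1514   (Ethernet header + payload, no FCS)
 *   VLAN_HLEN          = 4
 *   VLAN_ETH_FRAME_LEN = ETH_FRAME_LEN + VLAN_HLEN = 1518
 *   NET_IP_ALIGN       = 2
 * PKT_BUF_SZ = 1518 + 2 + 4 = 1524 bytes
 */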
So PKT_BUF_SZ comes to 1524 bytes, and the NIC I have can handle incoming packets from the network that are <= 1524 bytes. Anything bigger than that causes the system to crash or, worse, reboot. I am using Linux kernel 2.6.32 on RHEL 6.0, with a custom FPGA NIC.
Is there a way to change PKT_BUF_SZ dynamically by getting the size of the incoming packet from the NIC? Will it add to the overhead? Should the hardware drop the packets before they reach the driver?
Any help/suggestion will be much appreciated.
This isn't something that can be answered without knowledge of the specific controller. They all work differently in details.
Some Broadcom NICs, for example, have different-sized pools of buffers from which the controller selects an appropriate one based on the frame size: for example, a pool of small (256-byte) buffers, a pool of standard-size (roughly 1536-byte) buffers, and a pool of jumbo buffers.
Some Intel NICs have allowed a list of fixed-size buffers together with a maximum frame size; the controller then pulls as many consecutive buffers as needed (I'm not sure Linux ever supported this mode, though -- it's much more complicated for software to handle).
But the most common model, which most NICs use (and in fact, I believe all of the commercial ones can be used this way), is to expect an entire frame to fit in a single buffer, so your single buffer size needs to accommodate the largest frame you will receive.
Given that your NIC is a custom FPGA one, only its designers can advise you on the specifics you're asking about. If Linux is crashing when larger packets come through, then most likely either your allocated buffer size is not as large as you are telling the NIC it is (leading to an overflow), or the NIC has a bug that causes it to write into some other memory area.
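As for changing the size dynamically: drivers normally don't resize buffers per packet; instead they recompute the buffer size when the MTU changes and rebuild the receive ring. A minimal sketch of that pattern, assuming a hypothetical driver (the mydev_* helpers and the rx_buf_sz field are made up, not your driver's actual API):

#include <linux/etherdevice.h>
#include <linux/if_vlan.h>
#include <linux/netdevice.h>

struct mydev_priv {
    unsigned int rx_buf_sz;
    /* ... ring state ... */
};

/* Assumed hardware-specific helpers, not shown. */
static void mydev_stop_rx(struct mydev_priv *priv);
static void mydev_start_rx(struct mydev_priv *priv);
static void mydev_realloc_rx_ring(struct mydev_priv *priv);

/* Hooked up as .ndo_change_mtu in struct net_device_ops. */
static int mydev_change_mtu(struct net_device *dev, int new_mtu)
{
    struct mydev_priv *priv = netdev_priv(dev);

    if (netif_running(dev))
        mydev_stop_rx(priv);

    /* Room for Ethernet + VLAN headers, FCS and IP alignment. */
    priv->rx_buf_sz = new_mtu + VLAN_ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN;
    mydev_realloc_rx_ring(priv);
    dev->mtu = new_mtu;

    if (netif_running(dev))
        mydev_start_rx(priv);
    return 0;
}

And to your last question: yes, the hardware should be configured to drop frames larger than the buffer size it has been given; letting oversized frames reach the DMA engine is exactly the overflow scenario described above.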
I have a PCI device that reads memory allocated by dma_alloc_coherent.
In the kernel documentation it says:
"You may however need to make sure to flush the processor's write buffers before telling devices to read that memory"
How exactly do I do that? How do I flush the write buffers so that the device reads the correct data?
Use wmb() to make sure all the writes to memory have completed before the write to the device that tells it to start the DMA.
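For example, here is a minimal sketch of the pattern (the descriptor layout and doorbell register are made-up illustrations, not a real device's programming model; wmb() and iowrite32() are the real kernel primitives):

#include <linux/dma-mapping.h>
#include <linux/io.h>

/* Hypothetical DMA descriptor living in dma_alloc_coherent() memory. */
struct my_desc {
    u64 buf_addr;   /* bus address of the data buffer */
    u32 len;
    u32 flags;      /* e.g. an "owned by device" bit */
};

static void my_start_dma(struct my_desc *desc, void __iomem *doorbell,
                         dma_addr_t buf, u32 len)
{
    /* Fill in the descriptor in coherent memory. */
    desc->buf_addr = buf;
    desc->len = len;
    desc->flags = 1;

    /* Flush the CPU's write buffers: all the stores above must be
     * visible to the device before it is told to fetch the descriptor. */
    wmb();

    /* Ring the doorbell; the device may start reading immediately. */
    iowrite32(1, doorbell);
}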
On the modern x86/x86_64 platform, given the MMIO mechanism, are DMA operations used to move data between the MMIO address space and the memory address space? In the Linux kernel I see that there is a dma_addr_t definition. Is this type used for MMIO addresses?
In general, a DMA operation just refers to a device other than the CPU accessing memory. On x86, there are not separate MMIO and RAM address spaces -- everything is unified. Some examples of typical DMA operations:
A network card might receive a packet from the network and use DMA to write the packet contents into the system's RAM.
A SATA controller might get a write command and use DMA to read the data to send to the hard disk from system RAM.
A graphics card might use DMA to read texture data from system RAM into its own video memory. The video memory is visible to the system CPU through a PCI BAR (MMIO), but that's not really relevant here.
The dma_addr_t type holds a "bus address" in Linux. The address that, for example, a PCI device (like a NIC / SATA controller / GPU) sees a given part of memory mapped at can be different than the address the CPU uses. So Linux has the abstraction of "DMA mapping" to handle this difference.
In the first example above, the network stack would allocate a buffer in RAM, and then pass it to a dma_map function to get a bus address that it hands to the NIC. The NIC would use that address to write the packet into memory.
In older x86 systems, there wasn't really any difference between the physical address that the CPU used and the bus address that external devices used, and the dma_map functions were pretty much NOPs. However, with modern technologies like VT-d, the bus address that a PCI device uses might be completely different than the CPU's physical address, and so it is important to do the DMA mapping and use a dma_addr_t for all addresses that are used by external DMA devices.
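To make the first example concrete, here is a hedged sketch of the mapping step (the function name and surrounding context are illustrative; dma_map_single() and dma_mapping_error() are the real kernel APIs):

#include <linux/dma-mapping.h>
#include <linux/skbuff.h>

/* Map a freshly allocated receive buffer so the NIC can DMA into it.
 * 'dev' is the struct device of the PCI NIC. */
static dma_addr_t map_rx_buffer(struct device *dev, struct sk_buff *skb,
                                size_t len)
{
    dma_addr_t bus_addr;

    /* The returned bus address is what the device must use; it may
     * differ from the CPU physical address (e.g. behind VT-d/IOMMU). */
    bus_addr = dma_map_single(dev, skb->data, len, DMA_FROM_DEVICE);
    if (dma_mapping_error(dev, bus_addr))
        return 0;   /* simplified error handling for the sketch */

    /* Hand 'bus_addr' to the NIC, e.g. by writing it into an RX
     * descriptor. Call dma_unmap_single() before the CPU touches the
     * data after the device has written it. */
    return bus_addr;
}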