What is the reason for the big overhead when sending data using spidev? - embedded-linux

I'm using the spidev driver (embedded Linux) to send data over SPI.
I sent 7 bytes of data (at a 1 MHz clock rate, using the "write" call) and noticed that the operation takes approximately 200 microseconds to complete (I used a scope to verify that the clock rate is correct).
Sending that data should take 56 microseconds plus some overhead, but 200 microseconds seems like too much to me.
What could be the reason for that overhead?
Is it related to the switch between user space and kernel space, or to the spidev implementation?
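The 56 microsecond figure follows from the bit count and the clock rate. A minimal sketch of that arithmetic, assuming 8 bits per word and no gaps between words (the function name is illustrative, not part of spidev):

```python
def spi_wire_time_us(num_bytes, clock_hz, bits_per_word=8):
    """Ideal time on the wire in microseconds.

    Assumes back-to-back words: no chip-select setup time and no
    inter-word gaps, so this is a lower bound, not a prediction.
    """
    return num_bytes * bits_per_word * 1_000_000 / clock_hz


wire_us = spi_wire_time_us(7, 1_000_000)   # 7 bytes at 1 MHz
measured_us = 200.0                        # value reported in the question
overhead_us = measured_us - wire_us        # time not spent clocking bits

print(f"wire time: {wire_us:.0f} us, remaining overhead: {overhead_us:.0f} us")
```

With the numbers from the question, about 144 microseconds of the measured 200 are spent on something other than clocking bits out.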

Related

On Windows, is WSASendTo() faster than sendto()?

Is WSASendTo() somehow faster than sendto() on Windows?
Is UDP sendto() faster with a non-blocking socket (if there is space in the send buffer)?
This is similar to the question:
Faster WinSock sendto()
From my profiling, the send is network bound with a blocking socket. For example, on a 100 Mbit network both calls send about 38461 datagrams of 256 bytes per second, which is what the network speed allows. I was wondering whether anyone has a preference between the two, speed-wise.
Sending from localhost to itself on 127.0.0.1, it seems to handle about 250k sends/s, which should be about 64 MB/s on a 3 GHz PC.
Blocking seems 2 times faster, i.e. without FIONBIO set. With non-blocking set, it seems to drop to 32 MB/s if I retry on EWOULDBLOCK.
I don't need to do any heavy-duty UDP broadcasting; I'm only wondering which way is the most efficient, if anyone has any deep-set "feelings".
Also, could there be some sort of transmission moderation taking place in network card drivers? Could there be a maximum number of datagrams sendable on a gigabit card? Would it tolerate, for example, 100k sends/s, or moderate somehow?
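The throughput figures quoted above can be cross-checked with a little arithmetic. This is only a sketch using the numbers from the question; the variable and function names are made up for illustration:

```python
def payload_rate_bytes_per_s(datagrams_per_s, payload_bytes):
    """Application payload carried per second."""
    return datagrams_per_s * payload_bytes


# 100 Mbit LAN case: 38461 datagrams/s of 256 bytes each.
lan_payload = payload_rate_bytes_per_s(38461, 256)

# Loopback case: about 250k sends/s of 256 bytes each.
loopback_payload = payload_rate_bytes_per_s(250_000, 256)

# Bytes on the wire per datagram implied by a saturated 100 Mbit/s link.
# Whatever exceeds the 256-byte payload is headers and framing.
implied_wire_bytes = 100_000_000 / 8 / 38461

print(lan_payload, loopback_payload, round(implied_wire_bytes, 1))
```

The loopback figure works out to 64,000,000 bytes/s, matching the 64 MB/s in the question, and the LAN figure implies roughly 325 bytes on the wire per 256-byte datagram.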

STM32F411 I need to send a lot of data by USB with high speed

I'm using an STM32F411 with the USB CDC library, and the maximum speed for this library is ~1 Mb/s.
I'm creating a project where I have 8 microphones connected to ADC lines (this part works fine). I need a 16-bit signal, so I increase the resolution by adding up 16 samples from one line (the ADC gives only a 12-bit signal). In my project I need 96k 16-bit samples per second for one line, so that is 0.768M samples for all 8 lines. This signal needs 12000 Kb of space, but the STM32 has only 128 Kb of SRAM, so I decided to send about 120 blocks of 100 Kb each per second.
The conclusion is that I need ~11.72 Mb/s to send this.
The problem is that I'm unable to do that because USB CDC limits me to ~1 Mb/s.
The question is how to increase the USB speed to 12 Mb/s on the STM32F4. I need a hint or a library.
Or should I maybe set up an "audio device" in CubeMX?
If the small b in your question means byte, the answer is: it is not possible, as your micro has FS USB, whose maximum speed is 12 Mbit per second.
If it means bit, your 1 Mb (bit) speed assumption is wrong. But you will still not reach a 12 Mbit payload transfer rate.
You may try to write your own class (only if b means bit), but I'm afraid you will not find a ready-made library. You will also need to write the device driver on the host computer.
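The numbers in the question can be reproduced as follows. This is a rough sketch that assumes "Kb" and "Mb" in the question are binary multiples of bits (1024-based), since that is what makes the quoted 12000 and 11.72 figures come out:

```python
CHANNELS = 8
SAMPLE_RATE_HZ = 96_000            # samples per second per microphone line
BITS_PER_SAMPLE = 16
USB_FS_SIGNALLING_BPS = 12_000_000 # full-speed USB raw bit rate

samples_per_s = CHANNELS * SAMPLE_RATE_HZ          # 0.768M samples
required_bps = samples_per_s * BITS_PER_SAMPLE     # payload bits per second
required_kibit = required_bps / 1024               # the "12000 Kb" figure
required_mibit = required_bps / 1024 / 1024        # the "~11.72 Mb/s" figure

# Even before protocol overhead, the payload exceeds the raw bus rate.
fits_on_full_speed_usb = required_bps < USB_FS_SIGNALLING_BPS

print(required_bps, required_kibit, required_mibit, fits_on_full_speed_usb)
```

Under that assumption the stream needs 12,288,000 bit/s of payload, which is already above the 12 Mbit/s full-speed signalling rate, consistent with the answer that the payload rate cannot be reached.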

ACP and DMA, how they work?

I'm using an ARM Cortex-A53 platform that has an ACP component, and I'm trying to use DMA to transfer data through the ACP.
According to the ARM TRM, if I understand it correctly, the data size is limited to 64 bytes for each DMA transfer when using the ACP.
If so, does this limitation make DMA unusable? It seems pointless to configure a DMA descriptor just to transfer 64 bytes each time.
Or should the DMA automatically divide its transfer length into many ACP-size-limited (64-byte) packets, without any software intervention?
I need an expert to explain how ACP and DMA work together.
Something in the interface between the DMA and the ACP's AXI port should automatically divide the transfer into transfers of appropriate length as needed. For the Cortex-A53 ACP, AXI transfers are limited to 64 B (perhaps intentionally one cache line).
From https://developer.arm.com/documentation/ddi0500/e/level-2-memory-system/acp/transfer-size-support :
x byte INCR request characterized by:(some list of limitations)
Note the use of INCR instead of FIXED. INCR automatically increments the address according to the size of the transfer, while FIXED does not. This makes it simple for the peripheral to break a large transfer into a series of multiple INCR transfers.
However, do note that on the Cortex-A53 the transfer size (x in the quote) is fixed at 16-byte or 64-byte aligned transfers. If the DMA sends an inappropriately sized transfer (because it is misconfigured or the correct size is unsupported), the AXI will emit a SLVERR. If the buffer is not appropriately aligned, I think this also causes a SLVERR.
Lastly, the on-chip network routing must support connecting the DMA to the ACP at chip design time. In my experience this is more commonly done for network accelerators and FPGA fabric glue, and less often for low-speed peripherals like UART/SPI/I2C.
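To illustrate the splitting described above, here is a small software model of how one large transfer could be broken into aligned 64-byte INCR bursts. It is only a sketch of the idea: the function name is hypothetical, the real splitting happens in hardware, and raising an error here merely stands in for the SLVERR case.

```python
def split_into_acp_bursts(start_addr, length, burst_bytes=64):
    """Model a large transfer as a list of (address, size) bursts.

    Assumption: only 16-byte or 64-byte aligned bursts are accepted,
    as described in the answer. Anything else is rejected here, which
    is where real hardware would be expected to respond with SLVERR.
    """
    if burst_bytes not in (16, 64):
        raise ValueError("burst size must be 16 or 64 bytes")
    if start_addr % burst_bytes or length % burst_bytes:
        raise ValueError("address and length must be burst-aligned")
    # Each burst starts where the previous one ended (INCR behaviour).
    return [(start_addr + offset, burst_bytes)
            for offset in range(0, length, burst_bytes)]


bursts = split_into_acp_bursts(0x8000_0000, 256)
for addr, size in bursts:
    print(hex(addr), size)
```

A 256-byte buffer at an aligned address becomes four consecutive 64-byte bursts, with no software involvement between them in the real system.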

FTDI driver (Windows) FT_Write() issue with large (1KB) chunk - (version 2.12.16.0)

My application on a PC sends a file (2 MB) in chunks of 1 KB to an embedded device.
I use the FTDI Windows driver with the classic FT_Write() API function, as my code is cross-platform.
Note: the issues below appear when I use a 1 KB chunk size. A smaller chunk (I tried 64 bytes) works fine.
The problem is that every couple of hundred packets the function returns "0 bytes sent" and gets stuck. I found a workaround: purging both Tx and Rx, followed by a ResetDevice() call, recovers the chip. The problem still happens every couple of hundred packets, but at least I can send the whole file (2 MB).
But when I use a USB isolator (http://www.bb-elec.com/Products/USB-Connectivity/USB-Isolators/Compact-USB-Port-Guardian.aspx)
the workaround fails.
I believe my workaround is not a graceful solution.
Note: I use a large chunk because of a suggestion I found in the FTDI application note below:
When writing data to an FTDI device, as much data as possible should
be buffered in the application and written to the device in a single
write function call (either WriteFile for a VCP application using the
Win32 API, FT_Write if using the D2XX classic interface or
FT_WriteFile if using the D2XX FT_W32 interface). The result of this
is that the data will be written to the device with 64 bytes per USB
packet.
Any idea what's the proper fix for these issues? Is it related to FTDI initialization? My driver version is 2.12.16.0 (3/9/2016).
I also saw the same problem of the FT_Write() API not working right if too much data was passed,
while working on the library for my USB device Nusbio.
I mostly work in Synchronous Bitbanging mode rather than UART, but after all it is the same
hardware, driver and API.
There is the USB 2.0 specification and the FTDI FT232RL specification, and then there is the
reality of electrons and bits. The expected transfer speeds never really match, at
least at first. In other words, it is complicated (see more below in my referenced blog post).
In 2015 I was under the impression that with the FTDI FT232RL chip a size of 384 bytes worked well,
and that number comes from the chip datasheet (128-byte receive buffer and 256-byte transmit buffer).
Using a size of 500 bytes would still work, but above 600 bytes things would not work.
I later used the FT231X chip, which has a larger buffer (1k: 512-byte receive buffer and 512-byte transmit buffer),
and was able to transfer 1k and 2k buffers of data with FT_Write(), thereby more than doubling my transfer speed.
But above 2k things would not work.
In 2016 I read everything you can read about FTDI USB 2.0 full-speed chips and came to the
conclusion that FT_Write should support up to 64K (see the datasheets for the following chips:
FT232RL, FT231X, FT232H, FT260, FT4222).
I also did some research on serial port communication from .NET faster than 115200 baud.
Somehow I was able to update my C# library to send data in buffers of 32k in FT_Write(), and it is
working with the FT232RL and FT231X chips, but I can't tell you what changed.
I probably did not completely understand the ins and outs of USB 2.0 full-speed FTDI technology.
For example, let's say you are using the FT232RL and transferring 384 bytes at a time with
FT_Write(). Knowing that there is at least a 1 millisecond latency in USB 2.0 full speed whatever you
do, from a USB point of view you are transferring 384*1000/1024, that is 375 KB/s in theory
(that would be the maximum). That said, what baud rate does your embedded device support?
What baud rate is used?
The FT232RL max baud rate is 900 000 baud, which would give you only 900000/(1+8+1) = 90 000 bytes/s, about 87 KB/s.
Right away you can tell there is going to be a problem; maybe the FTDI driver takes care of
it, maybe not. I can't tell.
Redo the math based on the baud rate supported by your embedded device and a 384-byte buffer
sent 1000 times per second, then slow down your USB speed with a sleep() to match your baud rate.
That is where I would start.
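The "redo the math" step can be sketched like this. The helper names are made up for illustration, the 10-bit frame (1 start, 8 data, 1 stop) and the 900 000 baud figure are taken from the answer above, and the resulting delay is only a starting point for experiments, not a verified fix:

```python
def uart_bytes_per_s(baud, bits_per_frame=10):
    """Bytes per second the UART side can drain (1 start + 8 data + 1 stop)."""
    return baud / bits_per_frame


def pacing_delay_s(chunk_bytes, baud, bits_per_frame=10):
    """Time the UART needs to drain one chunk: sleep at least this long."""
    return chunk_bytes / uart_bytes_per_s(baud, bits_per_frame)


CHUNK_BYTES = 384
FILE_BYTES = 2 * 1024 * 1024                   # the 2 MB file from the question

usb_side_bytes_per_s = CHUNK_BYTES * 1000      # one chunk per 1 ms
uart_side_bytes_per_s = uart_bytes_per_s(900_000)
delay_s = pacing_delay_s(CHUNK_BYTES, 900_000)
num_chunks = -(-FILE_BYTES // CHUNK_BYTES)     # ceiling division
min_total_s = FILE_BYTES / uart_side_bytes_per_s

print(f"sleep about {delay_s * 1000:.2f} ms per chunk, "
      f"{num_chunks} chunks, at least {min_total_s:.1f} s in total")
```

With these figures the USB side could deliver 384,000 bytes/s while the UART drains only 90,000 bytes/s, so each 384-byte chunk needs a pause of a little over 4 ms. Substitute the baud rate your device actually uses.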

C socket - Difference between sending and receiving time

I'm working with two devices whose clocks are correctly synchronized (offset less than 1 ms). I need to send 180 KB over WiFi (the estimated bandwidth is about 20 Mb/s).
I'm using the C function send (with TCP) on the sender and recv on the receiver. Since the two clocks are synchronized, I expect the sending time and the receiving time to be the same (without taking the propagation time into account).
However, I found that the receiving time is 10-15 ms higher than the sending time, and considering that the estimated sending/receiving time should be about 60 ms, this difference is quite high. I don't think the problem is due to processing in the TCP stack on the receiver.
Any idea?
