I have one of the Zynq development boards (Z7020), and I am running Linux on its hardware cores. I want to be able to control logic that I will program into the FPGA portion of the Zynq, using a GUI that runs on the hardware cores and is displayed on the connected touch screen.
Would I just send interrupts to the FPGA as I select options or start/stop a task from the GUI?
How do I return either an indication that a task has finished, or possibly some data, from the FPGA back to the hardware cores?
The most direct communication path between the CPUs and the programmable logic is the AXI memory interconnect, which enables the processors to send read and write requests to the programmable logic.
You can implement registers or FIFOs in your programmable logic and control the logic by writing to the registers or enqueuing data into the FIFOs. The programmable logic can return data to the processors via registers or by enqueuing into memory-mapped FIFOs that are dequeued by the processors.
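As a rough illustration of the register approach, here is a minimal sketch in C. The register layout (a control, a status and a result register) is purely hypothetical; your own AXI slave defines the real one. On Linux the `regs` pointer would typically come from `mmap()` on a UIO device or `/dev/mem`, which is an assumption about your setup and is left out so the helpers can be tried against a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register map of a custom AXI4-Lite slave in the PL.
 * Word offsets and bit positions are assumptions for illustration only. */
enum {
    REG_CTRL   = 0,  /* bit 0: start the task          */
    REG_STATUS = 1,  /* bit 0: task has finished       */
    REG_RESULT = 2   /* data returned by the PL logic  */
};
#define CTRL_START  (1u << 0)
#define STATUS_DONE (1u << 0)

/* Ask the logic to start working. */
static void pl_start_task(volatile uint32_t *regs)
{
    assert(regs != NULL);
    regs[REG_CTRL] = CTRL_START;
}

/* Non-zero once the logic has set its "done" flag. */
static int pl_task_done(volatile uint32_t *regs)
{
    assert(regs != NULL);
    return (regs[REG_STATUS] & STATUS_DONE) != 0;
}

/* Fetch the value the logic left in its result register. */
static uint32_t pl_read_result(volatile uint32_t *regs)
{
    assert(regs != NULL);
    return regs[REG_RESULT];
}
```

A GUI callback would call `pl_start_task()`, then either poll `pl_task_done()` or wait for an interrupt before calling `pl_read_result()`.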
It can be helpful for the programmable logic to interrupt the CPU when there is something for the CPU to do.
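One way to receive such an interrupt in a user-space program is Linux's UIO framework; that choice is my assumption, not something the question requires. With UIO, a blocking 4-byte `read()` on the device file returns when an interrupt has occurred and yields the running event count, and with drivers that support interrupt control, a 4-byte write of 1 re-enables the interrupt. A minimal sketch, taking an already opened file descriptor:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Block until the next interrupt is reported on 'fd' (for example an
 * opened /dev/uio0; the device name is an assumption).
 * Returns 0 and stores the event count on success, -1 on error. */
static int wait_for_pl_irq(int fd, uint32_t *event_count)
{
    assert(event_count != NULL);
    ssize_t n = read(fd, event_count, sizeof *event_count);
    return n == (ssize_t)sizeof *event_count ? 0 : -1;
}

/* Re-enable the interrupt after it has been handled. */
static int rearm_pl_irq(int fd)
{
    uint32_t enable = 1;
    ssize_t n = write(fd, &enable, sizeof enable);
    return n == (ssize_t)sizeof enable ? 0 : -1;
}
```

The GUI can run `wait_for_pl_irq()` in a worker thread so that the interface stays responsive while the logic is busy.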
Interrupts and AXI interconnect between the processors and the programmable logic are documented in the Zynq Technical Reference Manual.
Related
I want to transmit a small static UDP packet upon receiving a trigger signal from an FPGA via GPIO. This has to be done within around 1 microsecond, with low latency and no jitter. My setup consists of an FPGA card connected to an NXP processor via a PCIe lane.
My current experimentation showed that even starting the transmit from the GPIO interrupt handler in the kernel typically exhibits too high a jitter to be useful for the application (about one microsecond should be doable). As I am not familiar with DPDK, I wanted to ask whether it can be of any help in this situation.
Can I use DPDK to do the following?
Prepare the UDP payload in Buffer.
Push the buffer to DPAA2.
Poll periodically for the GPIO from FPGA over mmaped area on PCIe in DPDK application.
Trigger the transmit of buffer in DPAA2 (and not CPU DDR memory).
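The polling part of the steps above could look roughly like the sketch below. It only shows the edge detection on a memory-mapped trigger word; `tx_hook_t` is a hypothetical callback standing in for whatever actually queues the prepared buffer (in a DPDK application it would wrap the transmit call). It says nothing about whether the 1 microsecond budget can be met.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transmit hook: stands in for the real "send the
 * prepared buffer" call of the application. */
typedef void (*tx_hook_t)(void *ctx);

/* Remembers the previous sample so only 0 -> 1 transitions fire. */
struct edge_detector {
    uint32_t prev;
};

/* One poll iteration: sample the mapped trigger word and call 'send'
 * on a rising edge of the masked bit. Returns 1 if a packet was sent. */
static int poll_trigger_once(volatile const uint32_t *trigger, uint32_t mask,
                             struct edge_detector *det,
                             tx_hook_t send, void *ctx)
{
    assert(trigger != NULL && det != NULL && send != NULL);
    uint32_t now = *trigger & mask;
    int fired = (now != 0 && det->prev == 0);
    if (fired)
        send(ctx);
    det->prev = now;
    return fired;
}

/* Example hook used for trying the loop out: just counts the calls. */
static void count_tx(void *ctx)
{
    ++*(unsigned *)ctx;
}
```

A real application would call `poll_trigger_once()` in a tight loop on a dedicated core, with `trigger` pointing into the mmapped PCIe region.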
Question: instead of the application issuing the transmit through DPDK's rte_eth_tx_burst, the FPGA should interact directly with the networking hardware to queue the packet. Can DPDK on NXP do this for my use case?
note: If DPDK is not going to help, I think I would need to map an IO portal of the DPAA2 management complex directly into the FPGA. But according to the documentation from NXP, they do not consider DPAA2 a public API (unlike USDPAA) and only support it through e.g. DPDK.
I'm building something on a zybo board, so using a Zynq device.
I'd like to write into main memory from the CPU, and read from it with the FPGA in order to write the CPU results out to another device.
I'm pretty sure that I need to use the AXI bus to do this, but I can't work out the best approach to the problem. Do I:
Make a full AXI peripheral myself? Presumably a master which issues read requests to main memory, and then has them fulfilled. I'm finding it quite hard to find resources on how to actually make an AXI peripheral. Where would I start looking for straightforward explanations?
Use one of the Xilinx IP cores to handle the AXI bus for me, but there are quite a few of them, and I'm not sure of the best one to use.
Whatever it is, it needs to be fast, and it needs to be able to do large reads from the DDR memory on my board. That memory needs to also be writable by the CPU.
Thanks!
An easy option is to use the AXI-Stream FIFO component in your block diagram. Then you can code up an AXI-Stream slave to receive the data. So the ARM would write via AXI to the FIFO, and your component would stream data out of the FIFO. No need to do any AXI work.
Take a look at Xilinx's PG080 for details.
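On the ARM side, pushing a packet into the FIFO comes down to a few register writes. The sketch below uses the transmit register offsets as I understand them from PG080 (vacancy at 0x0C, data port at 0x10, length at 0x14); please check them against the product guide for your core version. The `fifo` pointer is assumed to be the mapped base address of the core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the transmit-side registers (verify against PG080). */
#define FIFO_TDFV 0x0Cu  /* transmit FIFO vacancy, in 32-bit words     */
#define FIFO_TDFD 0x10u  /* transmit FIFO data write port              */
#define FIFO_TLR  0x14u  /* transmit length in bytes; starts the send  */

/* Write 'count' words into the FIFO and release them as one packet.
 * Returns 0 on success, -1 if the FIFO does not have enough room. */
static int fifo_send(volatile uint32_t *fifo, const uint32_t *words,
                     uint32_t count)
{
    assert(fifo != NULL && words != NULL);
    if (fifo[FIFO_TDFV / 4] < count)
        return -1;
    for (uint32_t i = 0; i < count; i++)
        fifo[FIFO_TDFD / 4] = words[i];   /* every write pushes one word */
    fifo[FIFO_TLR / 4] = count * 4u;      /* length write starts the packet */
    return 0;
}
```

Your logic then only sees an ordinary AXI-Stream master on the other side of the FIFO.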
If you have access to the Vivado HLS tool, then transferring data from main memory to FPGA memory (e.g., BRAM) using a burst scheme would be one solution.
You just need to use memcpy in your code, and the synthesis tool then automatically generates the AXI master IP, which is very fast and reliable.
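A minimal sketch of what such an HLS function might look like is shown below. The function name, the block size and the summing step are made up for illustration, and the pragma options are a typical pattern as I remember it, so check them against the HLS user guide for your tool version. A normal C compiler ignores the `#pragma HLS` lines, which lets you try the function on a PC first.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_WORDS 256  /* hypothetical burst size, in 32-bit words */

/* Copy one block from DDR into local (BRAM) storage with memcpy, then
 * add up the first n words. The memcpy is what the tool can turn into
 * a burst read on the AXI master interface. */
int sum_block(const int *ddr, int n)
{
#pragma HLS INTERFACE m_axi     port=ddr offset=slave depth=256
#pragma HLS INTERFACE s_axilite port=n
#pragma HLS INTERFACE s_axilite port=return
    int local[BLOCK_WORDS];
    int sum = 0;

    /* Bounds check; it also tells the tool the maximum trip count. */
    assert(n >= 0 && n <= BLOCK_WORDS);

    memcpy(local, ddr, sizeof local);   /* burst read from main memory */
    for (int i = 0; i < n; i++)
        sum += local[i];
    return sum;
}
```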
Option 1: Create your own AXI master. You would probably need to create an AXI slave for configuration purposes as well.
I found this article quite helpful to get started with AXI:
http://silica.com/wps/wcm/connect/88aa13e1-4ba4-4ed9-8247-65ad45c59129/SILICA_Xilinx_Designing_a_custom_axi_slave_rev1.pdf?MOD=AJPERES&CVID=kW6xDPd
And of course, the full AXI reference specification is here:
http://www.gstitt.ece.ufl.edu/courses/fall15/eel4720_5721/labs/refs/AXI4_specification.pdf
Option 2: Use the Xilinx AXI DMA component to set up DMA transfers between DDR memory and AXI streams. You would need to interface your logic to the "AXI streams" of the Xilinx DMA component. AXI streams are typically easier to implement than a new high-performance AXI master.
This approach supports very high bandwidths, and can do both continuous streams and packet-based transfers. It also supports metadata for each packet.
The Xilinx AXI DMA component is here:
http://www.xilinx.com/products/intellectual-property/axi_dma.html
Xilinx also provides software drivers for this.
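To give an idea of what such a driver does underneath, here is a minimal sketch of starting a memory-to-stream (MM2S) transfer by writing the registers directly. It assumes the core is configured without scatter-gather (direct register mode) and uses the register offsets as I understand them from the AXI DMA product guide; verify them for your core version. `phys_addr` must be a physical address, and the buffer has to be flushed from the CPU cache (or be uncached) before the transfer starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MM2S channel registers, byte offsets (verify against the product guide). */
#define MM2S_DMACR  0x00u  /* control register              */
#define MM2S_DMASR  0x04u  /* status register               */
#define MM2S_SA     0x18u  /* source address (lower 32 bit) */
#define MM2S_LENGTH 0x28u  /* transfer length in bytes      */

#define DMACR_RS    (1u << 0)  /* run/stop         */
#define DMASR_IDLE  (1u << 1)  /* channel is idle  */

/* Start reading 'nbytes' from DDR at 'phys_addr' onto the AXI stream. */
static void dma_mm2s_start(volatile uint32_t *dma, uint32_t phys_addr,
                           uint32_t nbytes)
{
    assert(dma != NULL);
    dma[MM2S_DMACR / 4] |= DMACR_RS;   /* put the channel in run state     */
    dma[MM2S_SA / 4] = phys_addr;      /* where the data lives in DDR      */
    dma[MM2S_LENGTH / 4] = nbytes;     /* writing LENGTH launches the read */
}

/* Non-zero once the channel reports that it is idle again. */
static int dma_mm2s_idle(volatile uint32_t *dma)
{
    assert(dma != NULL);
    return (dma[MM2S_DMASR / 4] & DMASR_IDLE) != 0;
}
```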
I know that we convert the GPIO to an IRQ, but I want to understand what the advantage of doing so is.
If we need an interrupt, why can't we have a dedicated interrupt line in the first place and use it directly as an interrupt?
What is the advantage of using GPIO as IRQ?
If I get your question, you are asking why even bother having a GPIO? The other answers show that someone may not even want the IRQ feature of an interrupt. Typical GPIO controllers can configure an I/O as either an input or an output.
Many GPIO pads have the flexibility to be open drain. With an open drain configuration, you may have a bidirectional 'BUS' and data can be both sent and received. Here you need to change from an input to an output. You can imagine this if you bit-bash I2C communications. This type of use may be fine if the I2C is only used to initialize some other interface at boot.
Even if the interface is not bi-directional, you might wish to capture on each edge. Various peripherals use zero crossing and a timer to decode a signal. For example a laser bar code reader, a magnetic stripe reader, or a bit-bashed UART might look at the time between zero crossings. Is the time double a bit width? Is the line high or low; then shift previous value and add two bits. In these cases you have to look at the signal to see whether the line is high or low. This can happen even if polarity shouldn't matter as short noise pulses can cause confusion.
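The "time between crossings" decoding can be sketched as follows. This is only an illustration with made-up names: `dt` is the measured time since the previous edge, `bit_time` the nominal bit width in the same unit, and `level` the level the line held during that interval, which is exactly why you need to be able to read the pin and not just take the interrupt.

```c
#include <assert.h>
#include <stdint.h>

/* Shift as many bits of value 'level' into 'bits' as fit into the
 * interval 'dt'. Intervals shorter than half a bit (e.g. noise pulses)
 * round down to zero bits and leave 'bits' unchanged. */
static uint32_t shift_in_run(uint32_t bits, int level,
                             uint32_t dt, uint32_t bit_time)
{
    assert(bit_time > 0);
    uint32_t n = (dt + bit_time / 2) / bit_time;  /* nearest bit count */
    while (n-- > 0)
        bits = (bits << 1) | (level ? 1u : 0u);
    return bits;
}
```

For instance, an interval of twice the bit width with the line high adds two 1 bits to the previous value.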
So even for the case where you have only the input as an interrupt, the current level of the signal is often very useful. If this GPIO interrupt happens to be connected to an Ethernet controller and active high means data is ready, then you don't need to have the 'I/O' feature. However, this case is using the GPIO interrupt feature as glue logic. Often this signalling will be integrated into a dedicated module. The case where you only need the interrupt is typically some custom hardware to detect a signal (case open, power disconnect, etc) which is not industry standard.
The ARM SOC vendor has no idea which case above the OEM might use. The SOC vendor gives lots of flexibility as the transistors on the die are cheap compared to the wire bond/pins on the package. It means that you, who only use the interrupt feature, get economies of scale (and a cheaper part) because others might be using these features and the ARM SOC vendor gets to distribute the NRE cost among more people.
In a perfect world, there is maybe no need for this. Not so long ago, when transistors were more expensive, some lines behaved only as interrupts (some M68k CPUs have this). Historically the ARM only has a single interrupt line with one common routine (the Cortex-M are different). So the interrupt source has to be determined by reading another register. As the hardware needs to capture the state of the line on the ARM, it is almost free to add the 'input controller' portion.
Also, for this reason, all of the ARM Linux GPIO drivers have a macro to convert from a GPIO pin to an interrupt number as they are usually one-to-one mapped. There is usually a single 'GIC' interrupt for the GPIO controller. There is a 'GPIO' interrupt controller which forms a tree of interrupt controllers with the GIC as the root. Typically, the GPIO irq numbers are Max GIC IRQ + port *32 + pin; so the GPIO irq numbers are just appended to the 'GIC' irq numbers.
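The numbering scheme in the last sentence is plain arithmetic. A small sketch follows; the function name and the example GIC limit are hypothetical, and real kernels use whatever their GPIO driver defines.

```c
#include <assert.h>

#define PINS_PER_PORT 32u

/* IRQ number of a GPIO pin when the GPIO IRQs are appended after the
 * GIC IRQs: max_gic_irq + port * 32 + pin. */
static unsigned gpio_irq_number(unsigned max_gic_irq, unsigned port,
                                unsigned pin)
{
    assert(pin < PINS_PER_PORT);
    return max_gic_irq + port * PINS_PER_PORT + pin;
}
```

With a made-up GIC limit of 160, pin 5 on port 2 would get IRQ number 160 + 2 * 32 + 5 = 229.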
If you were designing a bespoke ASIC for one specific system you could indeed do precisely that - only implement exactly what you need.
However, most processors/SoCs are produced as commodity products, so more flexibility allows them to be integrated in a wider variety of systems (and thus sell more). Given modern silicon processes, chip size tends to be constrained by the physical packaging, so pin count is at an absolute premium. Therefore, allowing pins to double up as either I/O or interrupt sources depending on the needs of the user offers more functionality in a given space, or the same functionality in less space, depending on which way you look at it.
It is not about "converting" anything - on a typical processor or microcontroller, a number of peripherals are connected to an interrupt controller; GPIO is just one of those peripherals. It is also by no means universally true; different devices have different capabilities, but in any case you are simply configuring a GPIO pin to generate an interrupt - that's a normal function of the GPIO not a "conversion".
Prior to ARM Cortex, ARM did not define an interrupt controller, and the core itself had only two interrupt sources (IRQ and FIQ). A vendor-defined interrupt controller was required to multiplex the single IRQ over multiple peripherals. ARM Cortex defines an interrupt controller and a more flexible interrupt architecture; it is possible to achieve a zero-latency interrupt from a GPIO, so there is no real advantage in having a dedicated interrupt line. Having one might also mean adding external signal-conditioning circuitry that is often already incorporated in the GPIO on the die.
DMA controllers are present on disks and networking devices, so they can transfer data to main memory directly. Then what is the use of the DMA controller inside the processor chip? Also, I would like to know: if there are different buses (I2C, PCI, SPI) outside of the processor chip and only one bus (AXI) inside the processor, how does this work? (Shouldn't it result in some bottleneck?)
The on-chip DMA can take over the task of copying data from devices to memory and vice versa for simple devices that cannot implement a DMA of their own. Examples of such devices might be a mouse, a keyboard, a sound card, a Bluetooth device, etc. These devices have simple logic, and their requests are multiplexed and sent to a single general-purpose DMA on the chip.
Peripherals with high bandwidths like GPU cards, Network Adapters, Hard Disks implement their own DMA that communicates with the chip's bus in order to initiate uploads and downloads to the system's memory.
"if there are different buses (i2c, pci, spi) outside of processor chip and only one bus (AXI) inside processor. how does this work? (shouldn't it result in some bottleneck)"
That's actually simple. The on-chip AXI bus is much faster: it runs at a much higher frequency (equal to or in the same range as the CPU's frequency) and so has a much higher bandwidth than the aggregated bandwidths of I2C + PCI + SPI. Of course, multiple hardware elements compete for the AXI bus, but usually priorities and various optimization techniques are implemented.
From Wikipedia:
Direct memory access (DMA) is a feature of computerized systems that allows certain hardware subsystems to access main system memory independently of the central processing unit (CPU). [...] A DMA controller can generate memory addresses and initiate memory read or write cycles. It contains several processor registers that can be written and read by the CPU. These include a memory address register, a byte count register, and one or more control registers.
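To make the register description concrete, here is a tiny software model of such a controller. It is not any real device: the struct layout and the control bits are invented for illustration, and the copy loop stands in for what the hardware does on its own once the CPU has programmed the registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented control bits of the model. */
#define DMA_CTRL_START (1u << 0)  /* set by the CPU to start a transfer  */
#define DMA_CTRL_DONE  (1u << 1)  /* set by the controller when finished */

/* The registers the CPU programs: where to write, how much, and go. */
struct dma_model {
    uint8_t *mem_addr;    /* memory address register */
    size_t   byte_count;  /* byte count register     */
    uint32_t control;     /* control register        */
};

/* What the controller does without CPU involvement: move bytes from
 * the device into memory, advancing the address and counting down. */
static void dma_model_run(struct dma_model *dma, const uint8_t *device_data)
{
    assert(dma != NULL && device_data != NULL);
    if (!(dma->control & DMA_CTRL_START))
        return;
    while (dma->byte_count > 0) {
        *dma->mem_addr++ = *device_data++;
        dma->byte_count--;
    }
    dma->control = DMA_CTRL_DONE;
}
```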
I just want to know the difference between I/O ports and I/O memory, because I am quite confused. It would also be great if someone could explain their use, by which I mean when I/O ports are preferred and when I/O memory is preferred.
There is no conceptual difference between memory regions and I/O regions: both of them are accessed by asserting electrical signals on the address bus and control bus.
While some CPU manufacturers implement a single address space in their chips, others decided that peripheral devices are different from memory and, therefore, deserve a separate address space. Some processors (most notably the x86 family) have separate read and write electrical lines for I/O ports and special CPU instructions to access ports.
Linux implements the concept of I/O ports on all computer platforms it runs on, even on platforms where the CPU implements a single address space. The implementation of port access sometimes depends on the specific make and model of the host computer (because different models use different chipsets to map bus transactions into memory address space).
Even if the peripheral bus has a separate address space for I/O ports, not all devices map their registers to I/O ports. While use of I/O ports is common for ISA peripheral boards, most PCI devices map registers into a memory address region. This I/O memory approach is generally preferred, because it doesn't require the use of special-purpose processor instructions; CPU cores access memory much more efficiently, and the compiler has much more freedom in register allocation and addressing-mode selection when accessing memory.
More Details at http://www.makelinux.net/ldd3/chp-9-sect-1