I am confused about port mapping and ISRs.
I am following an article which mentions that hardware ports are mapped to memory from 0x00000000 to 0x000003FF.
So we can talk to the microcontroller of that hardware through these port numbers using the IN and OUT instructions, OK.
But then what is the IVT? I read that the IVT contains the addresses of interrupt service routines.
Everything is mixed up in my mind.
When we use IN/OUT with a port number, does the CPU check the IVT, and how do the microcontrollers know their numbers?
When hardware registers are mapped to memory locations, this is called memory-mapped I/O (MMIO).
Hardware is accessed by reading and writing data or commands in its registers. With memory-mapped I/O, the CPU does not use dedicated I/O instructions; it reads and writes particular memory addresses that are mapped to the hardware registers. Communication between the hardware and the CPU therefore happens through ordinary reads and writes to specific memory locations. The IN and OUT instructions, by contrast, address a separate port I/O space on x86, and the CPU does not consult the IVT when executing them.
When a piece of hardware is installed, it is given a fixed set of memory locations for memory-mapped I/O, and these locations are recorded. In addition, every interrupting device has an ISR whose address is stored in the IVT. When a device interrupts the CPU, the CPU uses the interrupt vector number to find that device's ISR address in the IVT. Once the hardware that needs servicing has been identified this way, the ISR communicates with it through memory-mapped I/O, using the fixed memory locations that were allocated to that hardware. Note that on real-mode x86 the IVT itself occupies 0x00000 to 0x003FF (256 entries of 4 bytes), which may be the range the article is actually describing.
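To make the IVT lookup concrete, here is a minimal Python sketch of the arithmetic, assuming the classic real-mode x86 layout: 256 entries of 4 bytes each (a 16-bit offset followed by a 16-bit segment), starting at physical address 0. The function names and the sample table contents are made up for illustration; this models only the lookup, not a real CPU.

```python
import struct

IVT_BASE = 0x00000      # the real-mode IVT starts at physical address 0
ENTRY_SIZE = 4          # each entry: 16-bit offset, then 16-bit segment
NUM_VECTORS = 256

def ivt_entry_address(vector):
    """Physical address of the IVT entry for an interrupt vector."""
    if not 0 <= vector < NUM_VECTORS:
        raise ValueError("vector must be in 0..255")
    return IVT_BASE + vector * ENTRY_SIZE

def isr_physical_address(memory, vector):
    """Read one entry from a memory image and resolve segment:offset."""
    addr = ivt_entry_address(vector)
    offset, segment = struct.unpack_from("<HH", memory, addr)
    return (segment << 4) + offset   # real-mode address translation

# Hypothetical memory image: vector 9 points at F000:E987.
memory = bytearray(1024)
struct.pack_into("<HH", memory, ivt_entry_address(9), 0xE987, 0xF000)
```

The last entry ends at 255 * 4 + 4 = 0x400, which is why the table spans 0x000 to 0x3FF. Nothing in this lookup involves port numbers: IN and OUT never touch the table.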
I want to understand how a CPU works and so I want to know how it communicates with a PCIe card.
Which instructions does the CPU use to initialize a PCIe port and then read and write to it?
For example OUT or MOV.
A CPU mainly communicates with PCIe cards through memory ranges they expose. This memory may be small for network or sound cards, and very large for graphics cards. Integrated GPUs also have a small memory of their own but mostly share main memory. Most other cards also have read/write access to main memory.
To set up a PCIe device, the OS writes to its configuration space. On x86, the BIOS or bootloader provides the location of this data. PCI devices are connected in a tree, which may include switches and bridges on larger computers; the tree can be displayed with lspci -t. Thunderbolt can even connect external devices. This is why the OS needs to recursively "probe" the tree to find PCI devices and configure them.
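The recursive probe can be sketched in a few lines of Python. This is a toy model under stated assumptions: the topology dictionary, the device names and the `probe` function are invented for illustration, and a real switch exposes more bridge functions than shown here. The `ecam_offset` helper uses the standard PCIe memory-mapped configuration layout (bus in bits 27:20, device in bits 19:15, function in bits 14:12).

```python
# Toy topology: (bus, device, function) -> description.
# A bridge entry names the secondary bus that sits behind it.
TOPOLOGY = {
    (0, 0, 0): {"name": "host bridge"},
    (0, 1, 0): {"name": "PCIe root port", "secondary_bus": 1},
    (1, 0, 0): {"name": "switch upstream port", "secondary_bus": 2},
    (2, 0, 0): {"name": "network card"},
    (2, 1, 0): {"name": "sound card"},
    (0, 2, 0): {"name": "graphics card"},
}

def ecam_offset(bus, dev, func, reg=0):
    """Byte offset of a config register inside the ECAM window."""
    return (bus << 20) | (dev << 15) | (func << 12) | reg

def probe(bus, found=None):
    """Depth-first scan: visit every function on a bus, descend through bridges."""
    if found is None:
        found = []
    for dev in range(32):          # up to 32 devices per bus
        for func in range(8):      # up to 8 functions per device
            entry = TOPOLOGY.get((bus, dev, func))
            if entry is None:
                continue           # nothing answers at this address
            found.append(((bus, dev, func), entry["name"]))
            if "secondary_bus" in entry:
                probe(entry["secondary_bus"], found)
    return found

devices = probe(0)
```

In a real OS the dictionary lookup is replaced by a configuration-space read of the vendor ID, and the secondary bus number is read from (or assigned in) the bridge's own configuration registers.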
Synchronization uses interrupts and ring buffers. The device can send a prenegotiated interrupt to the CPU when it's done doing work. The CPU writes work to a ring buffer. It then writes another memory location that contains the head pointer. This memory location is located on the device so it can listen to writes there and wake up when there is work to do.
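The ring-buffer handshake described above can be sketched as follows. This is a simplified Python model, not any particular device's protocol: `ToyDevice`, `Driver` and `write_doorbell` are hypothetical names, the "device" runs synchronously inside the doorbell write, and the interrupt is reduced to a counter.

```python
class ToyDevice:
    """Stands in for a PCIe device that owns the doorbell register."""
    def __init__(self, ring):
        self.ring = ring
        self.tail = 0            # next slot the device will consume
        self.completed = []
        self.interrupts = 0

    def write_doorbell(self, head):
        # A write to this 'register' wakes the device up.
        size = len(self.ring)
        while self.tail != head:
            self.completed.append(self.ring[self.tail])
            self.tail = (self.tail + 1) % size
        self.interrupts += 1     # tell the CPU that the work is done

class Driver:
    """CPU side: fills the ring in memory, then rings the doorbell."""
    def __init__(self, size=8):
        self.ring = [None] * size
        self.head = 0
        self.device = ToyDevice(self.ring)

    def submit(self, *commands):
        size = len(self.ring)
        for cmd in commands:
            nxt = (self.head + 1) % size
            if nxt == self.device.tail:
                raise BufferError("ring full")
            self.ring[self.head] = cmd
            self.head = nxt
        self.device.write_doorbell(self.head)  # one doorbell write per batch
```

Batching several commands before a single doorbell write is the point of the design: the ring itself lives in ordinary memory, and only the head-pointer update has to cross the bus to the device.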
Most of the interaction with modern devices uses MOV instead of OUT. The I/O port concept is very old and not well suited to the massive amounts of data on modern systems. Having devices expose their functionality as a type of memory, rather than through a separate mechanism, allows vectorized variants of MOV to move 32 bytes or similar at a time. Graphics cards and modern network cards that support offload can also use their own hardware to write results back to main memory when instructed to do so. The CPU can then read the results later when it is free, again using MOV.
Before this memory access works, the OS needs to set up the memory mapping properly. On the device side, the mapping is set in the PCI configuration space through the BARs (Base Address Registers). On the CPU side, it is set up in the page tables. CPUs usually have caches to keep data close because access to RAM is slower. This is a problem when data must actually reach a PCI device, so the OS marks the relevant memory as write-through or even uncacheable to make sure that writes go out to the device instead of staying in the cache.
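As an illustration of what a BAR holds, the sketch below decodes the flag bits of a 32-bit BAR value and computes the region size from the usual "write all ones, read back" probe. The function names are my own and the sample values in the usage note are hypothetical; for a 64-bit BAR only the low half is handled here.

```python
def decode_bar(value):
    """Decode the low bits of a 32-bit PCI Base Address Register value."""
    if value & 0x1:                       # bit 0 set: I/O port space
        return {"space": "io", "base": value & 0xFFFFFFFC}
    width = {0b00: 32, 0b10: 64}.get((value >> 1) & 0x3)
    return {
        "space": "memory",
        "width": width,                   # None for other encodings
        "prefetchable": bool(value & 0x8),
        "base": value & 0xFFFFFFF0,       # low 4 bits are flags, not address
    }

def bar_size(readback):
    """Size implied by the value read back after writing all ones to a memory BAR."""
    mask = readback & 0xFFFFFFF0          # strip the flag bits
    return (~mask + 1) & 0xFFFFFFFF       # two's complement of the mask
```

For example, a read-back of 0xFFFF0000 means the device ignores the low 16 address bits, so the region is 64 KiB and must be aligned to 64 KiB.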
The term "Resizable BAR" is often used in GPU vendors' marketing. What they are selling is the ability to map a larger region of the card's memory at once. Without it, the OS has had to work through a limited window, repeatedly unmapping and remapping it to reach different parts of that memory. This illustrates how much access to PCIe devices relies on MOV.
DMA controllers are present on disks and networking devices, so those can transfer data to main memory directly. What, then, is the use of the DMA controller inside the processor chip? I would also like to know: if there are different buses (I2C, PCI, SPI) outside the processor chip and only one bus (AXI) inside the processor, how does this work? Shouldn't it result in a bottleneck?
The on-chip DMA controller can take over the task of copying data from devices to memory and vice versa for simple devices that cannot implement a DMA engine of their own. Such devices might include a mouse, a keyboard, a sound card or a Bluetooth device. These devices have simple logic, and their requests are multiplexed and sent to a single general-purpose DMA controller on the chip.
High-bandwidth peripherals such as GPU cards, network adapters and hard disks implement their own DMA, which communicates with the chip's bus to initiate transfers to and from system memory.
"if there are different buses (i2c, pci, spi) outside of processor chip and only one bus (AXI) inside processor. how does this work? (shouldn't it result in some bottleneck)"
That's actually simple. The on-chip AXI bus is much faster: it runs at a much higher frequency (equal to or in the same range as the CPU's frequency), so its bandwidth is much higher than the combined bandwidth of I2C, PCI and SPI. Of course, multiple hardware blocks compete for the AXI bus, but this is usually handled with arbitration priorities and various optimization techniques.
From Wikipedia:
Direct memory access (DMA) is a feature of computerized systems that allows certain hardware subsystems to access main system memory independently of the central processing unit (CPU). [...] A DMA controller can generate memory addresses and initiate memory read or write cycles. It contains several processor registers that can be written and read by the CPU. These include a memory address register, a byte count register, and one or more control registers.
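The registers named in the quote can be modelled in a short Python sketch. This is a toy, not any real controller: the class name, the `start`/`run` split and the single `busy` flag standing in for the control registers are all assumptions made for illustration.

```python
class ToyDmaController:
    """Models the registers named above; the layout is invented for illustration."""
    def __init__(self, memory):
        self.memory = memory       # bytearray standing in for system RAM
        self.address = 0           # memory address register
        self.count = 0             # byte count register
        self.busy = False          # one bit of a control/status register

    def start(self, address, count):
        # The CPU programs the registers, then is free to do other work.
        self.address, self.count, self.busy = address, count, True

    def run(self, source):
        """Move bytes from a device-side source into memory, one per cycle."""
        for byte in source:
            if self.count == 0:
                break
            self.memory[self.address] = byte
            self.address += 1      # the controller generates the next address itself
            self.count -= 1
        self.busy = False          # a real controller would raise an interrupt here
```

The CPU's involvement is limited to the `start` call; the address generation and the memory writes in `run` are what the controller does on its own.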
I understand that PCI and PCIe devices can be configured by the CPU (via code in the BIOS or OS) to respond to certain physical memory addresses by writing to specific areas of the device's configuration space.
In fact, the Linux kernel has quite a complicated algorithm for doing this, taking into account many requirements of the device (memory alignment, DMA capabilities, etc.).
Seeing that software seems to be in control of if, when and where this memory is mapped, my question is: How can a piece of software control mapping of physical memory?
After this configuration, the PCI device will know to respond to the given address range, but how does the CPU know that it should go on the PCI bus for those specific addresses that were just dynamically decided?
The northbridge is programmed with the address range(s) that are to be routed to the memory controller(s).
All other addresses go to the external bus.
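A minimal sketch of that routing decision, assuming a hypothetical machine with 16 GiB of RAM split around a hole below 4 GiB. The ranges, the table format and the `route` function are invented for illustration; real chipsets hold this information in their own registers.

```python
# Hypothetical decode table, as a northbridge might be programmed:
# (start, end_exclusive, target). Anything not listed goes to the external bus.
RAM_RANGES = [
    (0x0000_0000, 0xC000_0000, "memory controller"),      # low 3 GiB of RAM
    (0x1_0000_0000, 0x4_4000_0000, "memory controller"),  # 13 GiB remapped above 4 GiB
]

def route(address):
    """Return which side of the chipset a physical address is sent to."""
    for start, end, target in RAM_RANGES:
        if start <= address < end:
            return target
    return "external bus"
```

With this table, an access to 0xFEB00000 falls in the hole between the two RAM ranges and is therefore sent out to the bus, where the PCI device whose BAR covers that address responds.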
It is based on the address mapping information that the CPU has.
A 64-bit processor can in principle address 2^64 bytes (in practice, fewer physical address lines are implemented).
RAM today is around 16 GB, which is 2^34 bytes.
So all the devices the CPU has (including legacy PCI and PCIe devices) and their config space can be mapped to addresses outside the range occupied by RAM.
Any I/O to this space can be forwarded to the respective device.
In our case, when the CPU finds that the address it wants to access belongs to a PCI or PCIe device, it forwards the request to the host bridge (00:00.0; run lspci on a Linux box and you will see the host bridge with this BDF).
Once the host bridge determines that the target device is behind it, the request (I/O or memory) is converted into the appropriate TLP.
On modern x86/x86_64 platforms, given the MMIO mechanism, are DMA operations used to move data between the MMIO address space and the memory address space? In the Linux kernel, I see that there is a dma_addr_t definition. Is this type used for MMIO addresses?
In general, a DMA operation just refers to a device other than the CPU accessing memory. On x86, there are not separate MMIO and RAM address spaces -- everything is unified. Some examples of typical DMA operations:
A network card might receive a packet from the network and use DMA to write the packet contents into the system's RAM.
A SATA controller might get a write command and use DMA to read the data to send to the hard disk from system RAM.
A graphics card might use DMA to read texture data from system RAM into its own video memory. The video memory is visible to the system CPU through a PCI BAR (MMIO), but that's not really relevant here.
The dma_addr_t type holds a "bus address" in Linux. The address that, for example, a PCI device (like a NIC / SATA controller / GPU) sees a given part of memory mapped at can be different than the address the CPU uses. So Linux has the abstraction of "DMA mapping" to handle this difference.
In the first example above, the network stack would allocate a buffer in RAM, and then pass it to a dma_map function to get a bus address that it hands to the NIC. The NIC would use that address to write the packet into memory.
In older x86 systems, there wasn't really any difference between the physical address that the CPU used and the bus address that external devices used, and the dma_map functions were pretty much NOPs. However, with modern technologies like VT-d, the bus address that a PCI device uses might be completely different than the CPU's physical address, and so it is important to do the DMA mapping and use a dma_addr_t for all addresses that are used by external DMA devices.
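The difference between a CPU physical address and a bus address can be sketched with a toy IOMMU in Python. All names here (`ToyIommu`, `dma_map`, `dma_unmap`, `translate`) are illustrative and are not the Linux kernel API; the model also only handles buffers that fit within one 4 KiB page.

```python
PAGE = 4096

class ToyIommu:
    """Maps bus (I/O virtual) pages to CPU physical pages."""
    def __init__(self):
        self.table = {}             # bus page number -> physical page number
        self.next_bus_page = 0x100  # arbitrary starting point for bus addresses

    def dma_map(self, phys_addr):
        """Return a bus address the device can use to reach phys_addr."""
        bus_page = self.next_bus_page
        self.next_bus_page += 1
        self.table[bus_page] = phys_addr // PAGE
        return bus_page * PAGE + phys_addr % PAGE

    def dma_unmap(self, bus_addr):
        del self.table[bus_addr // PAGE]

    def translate(self, bus_addr):
        """What the hardware does when the device issues a DMA access."""
        page = self.table.get(bus_addr // PAGE)
        if page is None:
            raise PermissionError("DMA to unmapped bus address blocked")
        return page * PAGE + bus_addr % PAGE
```

Without an IOMMU, `dma_map` would simply return `phys_addr` unchanged, which is the "pretty much NOP" case described above. With one, the device only ever sees the bus address, and an access after `dma_unmap` is rejected.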
I just want to know the difference between I/O ports and I/O memory, because I am quite confused. It would be great if someone could also explain their use, by which I mean when I/O ports are preferred and when I/O memory is preferred.
There is no conceptual difference between memory regions and I/O regions: both of them are accessed by asserting electrical signals on the address bus and control bus.
While some CPU manufacturers implement a single address space in their chips, others decided that peripheral devices are different from memory and, therefore, deserve a separate address space. Some processors (most notably the x86 family) have separate read and write electrical lines for I/O ports and special CPU instructions to access ports.
Linux implements the concept of I/O ports on all computer platforms it runs on, even on platforms where the CPU implements a single address space. The implementation of port access sometimes depends on the specific make and model of the host computer (because different models use different chipsets to map bus transactions into memory address space).
Even if the peripheral bus has a separate address space for I/O ports, not all devices map their registers to I/O ports. While use of I/O ports is common for ISA peripheral boards, most PCI devices map registers into a memory address region. This I/O memory approach is generally preferred, because it doesn't require the use of special-purpose processor instructions; CPU cores access memory much more efficiently, and the compiler has much more freedom in register allocation and addressing-mode selection when accessing memory.
More Details at http://www.makelinux.net/ldd3/chp-9-sect-1
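As a rough illustration of the two address spaces on x86, the following Python toy keeps port I/O and memory in separate dictionaries. The class and method names are made up, the default read values are assumptions, and real hardware is of course not a dictionary; the point is only that the same number names different things in the two spaces.

```python
class ToyX86Bus:
    """Two independent address spaces, as on x86; names are illustrative."""
    def __init__(self):
        self.io_ports = {}   # 16-bit port number -> value (IN/OUT space)
        self.memory = {}     # physical address -> value (load/store space)

    def out(self, port, value):
        if not 0 <= port <= 0xFFFF:
            raise ValueError("x86 port numbers are 16 bits wide")
        self.io_ports[port] = value & 0xFF

    def inp(self, port):
        return self.io_ports.get(port, 0xFF)  # unclaimed ports assumed to read 0xFF

    def store(self, address, value):
        self.memory[address] = value & 0xFF

    def load(self, address):
        return self.memory.get(address, 0)
```

Writing to port 0x3F8 with `out` and to address 0x3F8 with `store` touches two unrelated locations. On a platform with a single address space only the `load`/`store` pair would exist, and device registers would simply occupy some of the addresses.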