What are the advantages of 16-bit addressing on IEEE 802.15.4 networks?

The only advantage I can think of for using 16-bit instead of 64-bit addressing on an IEEE 802.15.4 network is that 6 bytes are saved in each frame. There might be a small win for memory-constrained devices as well (microcontrollers), especially if they need to keep a list of many addresses.
But there are a few drawbacks:
- A coordinator must be present to deal out short addresses
- There is a big risk of conflicting addresses
- A device might be assigned a new address without other nodes knowing
Are there any other advantages of short addressing that I'm missing?

You are correct in your reasoning: it saves 6 bytes, which is a non-trivial amount given the frame size limit. The same is done with PanId vs. ExtendedPanId addressing.
You are inaccurate about some of your other points though:
The coordinator does not assign short addresses. A device randomly picks one when it joins the network.
Yes, there is a 1/65000 or so chance of a collision. When this happens, BOTH devices pick a new short address and notify the network that there was an address conflict. (In practice I've seen this happen all of twice in 6 years.)
This is why the binding mechanism exists. You create a binding using the 64-bit address. When transmission fails to a short address, the 64-bit address can be used to relocate the target node and correct the routing.
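
To make the mechanism concrete, here is a minimal sketch of the stochastic selection described above, assuming a hypothetical rand16() entropy source; 0xFFFF (broadcast) and 0xFFFE ("no short address") are reserved values in 802.15.4:

    #include <stdint.h>

    extern uint16_t rand16(void);  /* hypothetical: returns 16 bits of entropy */

    /* Randomly pick a short address, skipping the reserved values. */
    uint16_t pick_short_address(void)
    {
        uint16_t addr;
        do {
            addr = rand16();
        } while (addr == 0xFFFF || addr == 0xFFFE);
        return addr;
    }

On a conflict, both devices would call this again and announce the address change to the network.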

The short (16-bit) and simple (8-bit) addressing modes and the PAN ID Compression option allow a considerable saving of bytes in any 802.15.4 frame. You are correct that these savings are a small win for the memory-constrained devices that 802.15.4 is designed to run on; however, the main goal of these savings is their effect on radio usage.
The original design goals for 802.15.4 were along the lines of 10-metre links, 250 kbit/s, and low-cost, battery-operated devices.
The maximum frame length in 802.15.4 is 127 bytes (aMaxPHYPacketSize). The "full" addressing modes in 802.15.4 consist of a 16-bit PAN ID and a 64-bit Extended Address for both the transmitter and the receiver. This amounts to 20 bytes, or about 15% of the available bytes in the frame. If these long addresses had to be used all of the time, there would be a significant impact both on the amount of application data that could be sent in any frame AND on the energy used to operate the radio transceivers in both Tx and Rx.
The 802.15.4 MAC layer defines an association process that can be used to negotiate and use shorter addressing mechanisms. The addressing that is typically used is a single 16-bit PAN ID and two 16-bit short addresses, which amounts to 6 bytes, or about 5% of the available bytes.
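
To put numbers on this, here is a small worked example (assuming the 127-byte aMaxPHYPacketSize limit; the exact header layout also depends on other frame control fields):

    #include <stdio.h>

    int main(void)
    {
        const int frame_max = 127;            /* aMaxPHYPacketSize, in octets */
        const int full_hdr  = 2 + 8 + 2 + 8;  /* PAN ID + 64-bit address, both ends */
        const int short_hdr = 2 + 2 + 2;      /* one PAN ID + two 16-bit short addresses */

        printf("full:  %d bytes = %.0f%% of the frame\n", full_hdr,  100.0 * full_hdr  / frame_max);
        printf("short: %d bytes = %.0f%% of the frame\n", short_hdr, 100.0 * short_hdr / frame_max);
        return 0;
    }

This prints 20 bytes (about 16%) versus 6 bytes (about 5%), in line with the figures above.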
On your list of drawbacks:
Yes, a coordinator must hand out short addresses. How the addresses are created and allocated is not specified, but the MAC layer does have mechanisms for notifying the layers above it that there are conflicts.
The risk of conflicts is not large, as there are 65533 possible addresses to be handed out and 802.15.4 is only worried about "Layer 2" links (NB: 0xFFFF and 0xFFFE are special values). These addresses are not routable/internetworking addresses (well, not from 802.15.4's perspective).
Yes, I guess a device might get a new address without the other nodes knowing, but I have a hunch this question has more to do with ZigBee's addressing than with the 802.15.4 MAC addressing. Unfortunately, I do not know much about ZigBee's addressing, so I can't comment much here.
I think it is important to point out that 802.15.4 is a layer 1 and layer 2 specification, while ZigBee is layer 3 and up, i.e. ZigBee sits on top of 802.15.4.
This table is not 100% accurate, but I find it useful to think of 802.15.4 in this context:
+---------------+------------------+------------+
| Application   | HTTP / FTP / Etc | COAP / Etc |
+---------------+------------------+------------+
| Transport     | TCP / UDP        |            |
+---------------+------------------+   ZigBee   |
| Network       | IP               |            |
+---------------+------------------+------------+
| Link / MAC    | WiFi / Ethernet  | 802.15.4   |
|               | Radio            | Radio      |
+---------------+------------------+------------+

Related

ACP and DMA: how do they work?

I'm using an ARM A53 platform; it has an ACP component, and I'm trying to use DMA to transfer data through the ACP.
According to the ARM TRM, if I understand it correctly, the DMA transfer data size is limited to 64 bytes for each DMA transfer when using the ACP.
If so, does this limitation make DMA unusable? It seems pointless to configure a DMA descriptor only to transfer 64 bytes each time.
Or should the DMA automatically divide its transfer length into many ACP-size-limited (64-byte) packets, without any software intervention?
I need an expert to explain how ACP and DMA work together.
Something in the interface between the DMA and the ACP's AXI port should automatically divide the transfer length as needed into transfers of appropriate length. For the Cortex-A53 ACP, AXI transfers are limited to 64 B (perhaps intentionally one cacheline).
From https://developer.arm.com/documentation/ddi0500/e/level-2-memory-system/acp/transfer-size-support :
x byte INCR request characterized by: (some list of limitations)
Note the use of INCR instead of FIXED. INCR automatically increments the address according to the size of the transfer, while FIXED does not. This makes it simple for the peripheral to break a large transfer into a series of multiple INCR transfers.
However, do note that on the Cortex-A53 the transfer size (x in the quote) is fixed at 16- or 64-byte aligned transfers. If the DMA sends an inappropriately sized transfer (because it is misconfigured or the correct size is unsupported), the AXI port will emit a SLVERR. If the buffer is not appropriately aligned, I think this also causes a SLVERR.
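
If the DMA engine cannot do the splitting itself, the driver can. A minimal sketch (dma_submit_64 is a hypothetical hook that issues one 64-byte INCR transfer):

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical driver hook: issue one 64-byte INCR transfer toward the ACP. */
    extern void dma_submit_64(uintptr_t src, uintptr_t dst);

    /* Split a large copy into 64-byte transfers. Assumes src, dst and len
     * are all 64-byte aligned, as the Cortex-A53 ACP requires. */
    void acp_copy(uintptr_t src, uintptr_t dst, size_t len)
    {
        for (size_t off = 0; off < len; off += 64)
            dma_submit_64(src + off, dst + off);
    }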
Lastly, the on-chip network routing must support connecting the DMA to the ACP, which is decided at chip design time. In my experience this is more commonly done for network accelerators and FPGA fabric glue, and tends to be connected less often for low-speed peripherals like UART/SPI/I2C.

AXI4 (Lite) Narrow Burst vs. Unaligned Burst Clarification/Compatibility

I'm currently writing an AXI4 master that is supposed to support AXI4 Lite (AXI4L) as well.
My AXI4 master is receiving data from a 16-bit interface. This is on a Xilinx Spartan 6 FPGA and I plan on using the EDK AXI4 Interconnect IP, which has a minimum WDATA width of 32 bits.
At first I wanted to use narrow bursts, i.e. AWSIZE = x"01" (2 bytes in transfer). However, I found that Xilinx's AXI Reference Guide UG761 states that "narrow bursts [are] supported but [...] not recommended." Unaligned transactions are supposed to be supported.
This had me thinking. Say I start an unaligned burst:
AWLEN = x"01" (2 beats)
AWSIZE = x"02" (4 bytes in transfer")
And do the following:
AX (32-bit word #0: send hi16)
XB (32-bit word #1: send lo16)
Where A, B are my 16-bit words that start off at an unaligned (2-byte aligned) address. X means WSTRB is deasserted for the indicated 16 bits.
Is this supported, or does this fall under the category "narrow burst" even though AWSIZE = x"02" (4 bytes in transfer) as opposed to AWSIZE = x"01" (2 bytes in transfer)?
Now, if this was just for AXI4, I would probably not care as much about this use case, because AXI4 peripherals are required to use the WSTRB signals. However, the AXI Reference Guide UG761 states "[AXI4L] Slaves interface can elect to ignore WSTRB (assume all bytes valid)."
I read here that many (but not all; and there is no list?) Xilinx AXI4L peripherals do elect to ignore WSTRB.
Does this mean that I'm essentially barred from doing narrow bursts ("not recommended") as well as unaligned bursts ("WSTRB can be ignored"), or is there an easy way to offload some of the implementation work from my master onto the interconnect, guaranteeing proper system behavior when accessing AXI4L peripherals?
Your example is not a narrow burst, and should work.
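
Spelled out, the write strobes for your two beats would be (a sketch, with 0xAAAA/0xBBBB standing in for the 16-bit words A and B on a little-endian 32-bit bus):

    #include <stdint.h>

    /* One beat on a 32-bit data bus: WSTRB bit i validates byte lane i of WDATA. */
    struct beat { uint32_t wdata; uint8_t wstrb; };

    const struct beat example[2] = {
        { 0xAAAA0000u, 0xC /* 0b1100: A in the high 16 bits of word 0 */ },
        { 0x0000BBBBu, 0x3 /* 0b0011: B in the low 16 bits of word 1  */ },
    };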
The reason narrow bursts are not recommended is that they give sub-optimal performance. Both narrow bursts and data realignment cost area and are not recommended IMHO. However, DRE (data realignment) has a minimal bandwidth cost, while narrow bursts have a real one. If your AXI port is 100 MHz and 32 bits wide, you have 3.2 Gbit/s maximum throughput; if you use 16-bit narrow bursts 50% of the time, your maximum throughput is reduced to 2.4 Gbit/s (32 bits x 50 MHz + 16 bits x 50 MHz). Also, I'm not sure AXI-Lite supports narrow bursts or data realignment.
Your example has 2 major flaws. First, it requires 3 data beats to transfer 32 bits, which is worse than a narrow burst (I don't think AXI is smart enough to cancel the last beat when WSTRB is all 0). Second, you can't burst more than two 16-bit words at a time, which will hurt your AXI infrastructure's performance if you have a lot of data to transfer.
The best way to deal with this is to concatenate the 16-bit words into 32-bit words in your block. Then you buffer these 32-bit words and burst them when you have enough. This is the high-performance AXI way to do it.
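
Conceptually (the real implementation would be RTL; axi_burst is a hypothetical stand-in for your bus-side logic), the pack-and-buffer approach looks like this:

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical sink: bursts n full 32-bit words over AXI4. */
    extern void axi_burst(const uint32_t *words, size_t n);

    #define BURST_WORDS 16

    static uint32_t buf[BURST_WORDS];
    static size_t   nwords;
    static int      have_lo;
    static uint16_t lo;

    /* Concatenate incoming 16-bit halves into 32-bit words and
     * only burst once a full buffer is ready. */
    void push16(uint16_t half)
    {
        if (!have_lo) {
            lo = half;
            have_lo = 1;
            return;
        }
        buf[nwords++] = ((uint32_t)half << 16) | lo;  /* hi:lo packing is an arbitrary choice */
        have_lo = 0;
        if (nwords == BURST_WORDS) {
            axi_burst(buf, nwords);
            nwords = 0;
        }
    }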
However, if you receive data as 16 bits, it seems you would be better off using AXI-Stream, which supports 16-bit data but doesn't have the notion of addresses. You can map an AXI-Stream to AXI4 using Xilinx's IP cores. Either AXI Datamover or AXI DMA can do that. Both do the same thing (in fact, AXI DMA includes a Datamover), but AXI DMA is controlled through an AXI-Lite interface while the Datamover is controlled through additional AXI-Streams.
As a final note, the Xilinx cores never require narrow bursts or DRE. If you need DRE in AXI DMA, it's done by the AXI DMA core and not by the AXI Interconnect. Also, these cores are clear source, so you can easily check how they operate.

How does a CPU really request data in a computer?

I was wondering how exactly a CPU requests data in a computer. In a 32-bit architecture, I thought that a computer would put a destination on the address bus and would receive 4 bytes on the data bus. I recently read about memory alignment in computers and it confused me. I read that the CPU has to read memory twice to access an address that is not a multiple of 4. Why is that? The address bus lets it access an address that is not a multiple of 4.
The address bus itself, even in a 32-bit architecture, is usually not 32 bits in size. E.g. the Pentium's address bus was 29 bits. Given that the processor has a full 32-bit address range, in the Pentium's case that means each slot in memory is eight bytes wide. So reading a value that straddles two of those slots means two reads rather than one, and alignment prevents that from happening.
Other processors (including other implementations of the 32-bit Intel architecture) have different memory access word sizes but the point is generic.
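
A quick way to see the effect is to compute which bus slots a load touches (using the eight-byte slots of the Pentium example; the addresses here are arbitrary):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const uint32_t slot = 8;      /* 64-bit data bus => 8-byte slots */
        uint32_t addr = 0x1006;       /* unaligned 4-byte load */

        uint32_t first = addr / slot;        /* slot holding the first byte */
        uint32_t last  = (addr + 3) / slot;  /* slot holding the last byte  */

        printf("load touches %u bus slot(s)\n", (unsigned)(last - first + 1));
        return 0;
    }

A 4-byte read at 0x1006 straddles two slots and so needs two bus reads; the same read at 0x1004 fits in one.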

Little Endian Cabling?

I have a doubt about the endianness concept. Please don't refer me to Wikipedia; I've already read it.
Isn't endianness just the two ways that the hardware cabling (between memory and registers, through the data bus) can be implemented in a system?
In my understanding, the picture below shows a little-endian implementation (follow the horizontal line from a memory address (e.g. 4000) and then the vertical line to reach the low/high part of the register).
As you can see, low memory addresses are physically connected to the low part of the 4-byte register.
I think it is not related at all to READ and WRITE instructions in any language (e.g. LDR in ARM).
1-byte memory address:
- 4000 value:XX ------------------|
- 4001 value:XX ---------------| |
- 4002 value:XX ------------| | |
- 4003 value:XX ---------| | | |
| | | |
general-purpose register:XX XX XX XX
Yes and no. (I can't see your diagram, but I think I understand what you're asking.) The way data lines are physically connected in the hardware can determine/control whether the representation in memory is treated as big or little endian. However, there is more to it than this; little endian is a means of representation, so, for instance, data stored on magnetic storage (in a file) might be coded using little-endian or big-endian representation, and obviously at this level the hardware is not important.
Furthermore, some 8 bit microcontrollers can perform 16 bit operations, which are performed at the hardware level using two separate memory accesses. They can therefore use either little or big endian representation independent of bus design and ALU connection.
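
In other words, on such a machine the convention lives entirely in software. A minimal sketch of both conventions, assembled from two single-byte accesses:

    #include <stdint.h>

    /* The same two consecutive bytes, read with either convention. */
    uint16_t load16_le(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
    uint16_t load16_be(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }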

Independent memory channels: What does it mean for a programmer

I read the Datasheet for an Intel Xeon Processor and saw the following:
The Integrated Memory Controller (IMC) supports DDR3 protocols with four
independent 64-bit memory channels with 8 bits of ECC for each channel (total of
72-bits) and supports 1 to 3 DIMMs per channel depending on the type of memory
installed.
I need to know what exactly this means from a programmer's view.
The documentation on this seems to be rather sparse and I don't have someone from Intel at hand to ask ;)
Can this memory controller execute 4 loads of data simultaneously from non-adjacent memory regions (and fetch each datum from up to 3 memory DIMMs)? I.e. 4x64 bits, striped from up to 3 DIMMs, e.g.:
| X | _ | X | _ | X | _ | X |
(X is loaded data, _ an arbitrarily large region of unloaded data)
Or can this IMC execute 1 load which will fetch up to 1x256 bits from a contiguous memory region?
| X | X | X | X | _ | _ | _ | _ |
This seems to be implementation-specific, depending on the compiler, OS, and memory controller. The standard is available at: http://www.jedec.org/standards-documents/docs/jesd-79-3d . It seems that if your controller is fully compliant, there are specific bits that can be set to indicate interleaved or non-interleaved mode. See pages 24, 25 and 143 of the DDR3 spec, but even in the spec the details are light.
For the i7/i5/i3 series specifically, and likely for all newer Intel chips, the memory is interleaved as in your first example. For these newer chips, and presumably a compiler that supports it, yes, one Asm/C/C++-level call to load something large enough to be interleaved/striped would initiate the required number of independent hardware channel-level loads to each channel of memory.
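
As a toy model of that striping (real controllers may interleave at a different granularity or hash more address bits; this is purely illustrative):

    #include <stdint.h>

    /* Four channels interleaved at 64-byte (cacheline) granularity. */
    static unsigned channel_of(uintptr_t addr)
    {
        return (unsigned)((addr / 64) % 4);
    }

Consecutive cachelines then land on channels 0, 1, 2, 3, 0, ..., so one sufficiently large load is spread across all four channels.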
In the Triple-channel section of the Multi-channel memory architecture page on Wikipedia there is a small list of CPUs that do this; the list is likely incomplete: http://en.wikipedia.org/wiki/Multi-channel_memory_architecture
