Understanding flash composition of Arduino bootloader - makefile

I'm trying to understand the Arduino bootloader. I've found some intro here: Arduino Bootloader, but I need more details and because of it I'm asking for help.
During my research i found a start point: OK, Arduino (atmega family) has a specific block of flash dedicated to bootloader. Once the mcu has a bootloader, it can download the new program via serial and store it on flash, at address 0x00.
Let's asign the atmega328p for this question.
#1 - If you take a look at datasheet page 343, you will see a table, showing some info about the bootloader size:
By this table I understood: if i set BOOTSZ1/0 to 0/0, I can have a 2K bootloader and it will be stored in flash stack: 0x3800 ~ 0x3FFF.
#2 - If you open the hex file of ATMEGA328_BOOTLOADER generated by Arduino, you will see the bootloader stored at:
:10**7800**000C94343C0C94513C0C94513C0C94513CE1
to
:10**7FF0**00FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF91
If you consider 7FF0 - 7800 you will get 7F0 (2K bytes of program)
#3 - If you open the makefile (C:\Program Files (x86)\Arduino\hardware\arduino\avr\bootloaders\atmega) you will see this argument for atmega328:
atmega328: LDSECTION = --section-start=.text=0x7800
Where 0x7800 matches the hexa file from bootloader.
Questions:
1- Why the datasheet tells me that I have an special place for bootloader and the makefile of Arduino force it to be stored in a different place?
2- What means the line of an hexa file?
:10E000000D9489F10D94B2F10D94B2F10D94B2F129
:10 : (?)
E000 : Address
00 : (?)
0D9489F10D94B2F10D94B2F10D94B2F1 : data (?)
29 : CRC (?)

First of all, I'm not sure we have the same atmel datasheet; your table was 27-7 on my one, but anyway the data is the same.
Disclaimer: I'm not 100% sure about this, but I'm pretty confident.
I think that the "problem" here is that "words". Reading in the overview of the core they wrote
Most AVR instructions have a single 16-bit word format. Every program memory address contains a 16- or 32-bit instruction
This means that instructions are 2 byte wide. So your address 0x3800 corresponds to 0x7000 byte from the start of the memory, while the end of the bootloader 0x3FFF corresponds to the bytes 0x7FFE and 0x7FFF.
This is confirmed by the memory size: 0x7FFF corresponds to 32k, which is the available flash. Since the bootloader is at the end of the memory (as the pictures from the DS imply), this means that 0x3FFF should be the last memory cell.
EDIT:
When I answered, I did not read the second part of the question. The format is the so-called Intel Binary format. See Wikipedia for the detailed information. Here is a brief summary.
You correctly identified the field division:
: : Start code
10 : Number of data bytes (in this case: 16)
E000 : Starting address of the data bytes
00 : Record type (in this case: data)
0D9489F10D94B2F10D94B2F10D94B2F1 : data payload (in this case: 16 bytes)
29 : CRC
The record type can be
00: data
01: end of file
02: extended segment address
03: start segment address
04: extended linear address
05: start linear address
The CRC is obtained summing all the bytes (starting from the length one, so the very beginning), then getting the LSB and finally making the 2 complement. In this case the sum is 2519, which is 0x9D7 in hex. The lower byte, D7, corresponds to 215. Its 2 complement can be calculated as 256 - x, so 41, which means 0x29 in hex.

Related

Question about memory space in microprocessor

My teacher has given me the question to differentiate the maximum memory space of 1MB and 4GB microprocessor. Does anyone know how to answer this question apart from size mentioned difference ?
https://i.stack.imgur.com/Q4Ih7.png
A 32-bit microprocessor can address up to 4 GB of memory, because its registers can contain an address that is 32 bits in size. (A 32-bit number ranges from 0 to 4,294,967,295‬). Each of those values can represent a unique memory location.
The 16-bit 8086, on the other hand, has 16-bit registers which only range from 0 to 65,535. However, the 8086 has a trick up its sleeve- it can use memory segments to increase this range up to one megabyte (20 bits). There are segment registers whose values are automatically bit-shifted left by 4 then added to the regular registers to form the final address.
For example, let's look at video mode 13h on the 8086. This is the 256-color VGA standard with a resolution of 320x200 pixels. Each pixel is represented by a single byte and the desired color is stored in that byte. The video memory is located at address 0xA0000, but since this value is greater than 16 bits, typically the programmer will load 0xA000 into a segment register like ds or es, then load 0000 into si or di. Once that is done, the program can read from [ds:si] and write to [es:di] to access the video memory. It's important to keep in mind that with this memory addressing scheme, not all combinations of segment and offset represent a unique memory location. Having es = A100/di = 0000 is the same as es=A000/di=1000.

CRC32 CRC peripheral on STM 32 : byte and word streams of same data give different results

I am using the STM32 ARM CRC peripheral and getting different CRC codes for the same data when fed in as bytes compared to when fed in as words.
Using the byte word length and a small word aligned data string:
const char *ts4 = "The quick brown fox jumped over the lazy brown dog."; // 52 CHARS divisible by 4;
This, with a buffer size of strlen(ts4) gives a CRC32 of ~ 0x01fba559
0xfe045aa6.
It was then configured the CRC for WORD size (setting buffer size to strlen(ts4)/4) and the DMA engine was pointed at the CRC data register. It gave a different CRC result, ~ 0xf2bd1910 0x0d42e6ef, so I called it, again for WORD size using the HAL_CALCULATE method (to ensure the DMA was working as expected). This again gave ~ 0xf2bd1910 0x0d42e6ef.
Does the CRC32 algorithm give different results for different word size inputs ? I don't really want to tie the DMA engine up transferring bytes. Is there an equivalent `C' function that calculates CRC32 with a 32 bit WORD input ? I have tried reversing the order of the bytes in the word but this does not solve it (I thought it might have been a big/little endian problem).
That's 51 characters, not 52. That length over 4 would give 12, not 13. The CRC of the first 48 characters would be expected to be different than the CRC of the 51 characters.
Also I'd think that you would need to assure that the string starts on a word boundary.

AVR Internal Data bus width

I got a doubt about the width of internal data bus of AVR controllers connected to flash memory. I was mainly referring to Atmega328. Datasheet says (Page 17) "Since all AVR instructions are 16 or 32 bits wide, the Flash is organized as
2/4/8/16K x 16.". That means flash memory data bus width must be 16 bit? I could not see anywhere mentioning about 16 bit wide program memory data bus (Of course internal to the controller). But bus for RAM seems to be again 8 bit. Just want a clarification.
BitThe 8-bit AVR family is based on a (modified) Harvard architecture, where you have dedicated program and data storages. The data path to program memory is indeed 16-bit, while it is 8-bit only to data memory.
The funny part is, that in the beginning Atmel points out, that these are 8-bit CPUs. This makes them look very competitive when compared to other 8-bit products like 8051 or Rabbit. Due to the 16-bit program data path the AVRs perform very well in benchmark tests. Later, when 8-bit sounds a bit old-fashioned, Atmel decided to call them 8/16-bit CPUs.
Figure 7.1 on page 9 of the data sheet/complete shows that the flash isn't at all connected to the (8 bit) data bus but only to an address bus. The "data" of the flash memory primarily goes into the instruction register and by use of the LPM instruction this data is transferred into a register. Note that when writing data to the flash you always write 16 bit (R1:R0) addressed by the Z pointer (SPM instruction) ... and that the SPM instruction cannot be expressed in "clock cycles" (pg. 617)

Stm32 cortex-m3 memory remap

I am working on a stm32l152 now.
My boot up vector table is located on flash 0x0800 0000, where there is a valid reset handler vector and stack pointer. The rest of the exception/interrupt vectors are just endless loops.
Then I setup another vector table in ram, starting at 0x2000 0000. This vector table will have all necessary vectors.
My problem is that after doing a memory remap to map 0x0000 0000 to 0x2000 0000, and when my interrupt fires off, it seems the mcu is still looking for the vectors in 0x0800 0000. I have confirmed this by changing my related-vector in the flash table to that of the one in the ram table. If the flash table related-vector points to a endless loop, my program will loop endlessly. Also, I confirmed my memory remap is correct by writing/read back some memory locations across 0x0000 0000, 0x0800 0000, 0x2000 0000.
Next I use the other method of changing the VTOR in the mcu to offset the vector table by 0x2000 0000. Now, it works and the mcu will find the vector in the ram. Note that in this method, I did not do any above memory remapping.
My question is: can I use memory remap to relocate my vector table (Without changing VTOR)?
What other uses are there for memory remapping?
Can I write to 0x0000 0000 (mapped to 0x0800 0000 flash) and modify flash during runtime?
You probably have done it right in your first attempt. However, SystemInit() function supplied by the IDE silently sets VTOR = 0x8000000, so the table at the beginning of the flash is used regardless of your memory remap settings.

Why do segments begin on paragraph boundaries?

In real mode segmented memory model, a segment always begins on a paragraph boundary. A paragraph is 16 bytes in size so the segment address is always divisible by 16. What is the reason/benefit of having segments on paragraph boundaries?
It's not so much a benefit as an axiom - the 8086's real mode segmentation model is designed at the hardware level such that a segment register specifies a paragraph boundary.
The segment register specified the base address of the upper 16 bits of the 8086's 20 bit address space, the lower 4 bits of that base address were in essence forced to zero.
The segmented architecture was one way to have the 8086's 16-bit register architecture be able to address a full megabyte(!) of address space (which requires 20 bits of addressing).
For a little more history, the next step that Intel took in the x86 architecture was to abstract the segment registers from directly defining the base address. That was the 286 protected mode, where the segment register held a 'selector' that instead of defining the bits for a physical base address was an index into a a set of descriptor tables that held information about the physical address, permissions for access to the physical memory, and other things.
The segment registers in modern 32-bit or larger x86 processors still do that. But with the address offsets being able to specify a full 32-bits of addressing (or 64-bits on the x64 processors) and page tables able to provide virtual memory semantics within the segment defined by a selector, the programming model essentially does away with having to manipulate segment registers at the application level. For the most part, the OS sets the segment registers up once, and nothing else needs to deal with them. So programmers generally don't even need to know they exist anymore.
The 8086 had 20 address lines. The segment was mapped to the top 16, leaving an offset of the bottom 4 lines, or 16 addresses.
The Segment register stores the address of the memory location where that segment starts. But Segment registers stores 16 bit information. This 16 bit is converted to 20 bits by appending 4 bits of 0 to the right end of the address. If a segment register contains 1000H then it is left shifted to get 10000H. Now it is 20 bits.
While converting we added 4 bits of 0 to the end of the address. So every segment in memory must begin with memory location where the last 4 bits are 0.
For ex:
If a segment starts at 10001H memory location, we cannot access it because the last 4 bits are not 0.
Any address in segment register will be appended with 4 bits in right end to convert to 20bits. So there is no way of accessing such an address.

Resources