gcc for ARM - move code and stack - gcc

I am working on a project for an ARM Cortex-M3 (SiLabs) SOC. I need to move the interrupt vector [edit] and code away from the bottom of flash to make room for a "boot loader". The boot loader starts at address 0 to come up when the core comes out of reset. Its function is to validate the main image, loaded at a higher address and possibly replace that main image with a new one.
Therefore, the boot loader would have its vector table at 0, followed by its code. At a higher, fixed address, say 8KB, would be the main image, starting with its vector table.
I have found this page which describes the Vector Table Offset Register which the boot loader can use (with interrupts masked, obviously) to point the hardware to the new vector table.
My question is how to get the "main" image linked so that it will work when written to flash, starting not at zero. I'm not familiar with ARM assembly but I assume the code is not position independent.
I'm using SiLabs's Precision32 IDE which uses gcc for the toolchain. I've found how to add linker flags. My question is what gcc flag(s) will provide the change to the base of the vector table and code.
Thank you.

vectors.s
.cpu cortex-m3
.thumb
.word 0x20008000 /* stack top address */
.word _start /* Reset */
.word hang
.word hang
/* ... */
.thumb_func
hang: b .
.thumb_func
.globl _start
_start:
bl notmain
b hang
notmain.c
extern void fun ( unsigned int );
void notmain ( void )
{
fun(7);
}
fun.c
void fun ( unsigned int x )
{
}
Makefile
hello.elf : vectors.s fun.c notmain.c memmap
arm-none-eabi-as vectors.s -o vectors.o
arm-none-eabi-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding -mthumb -mcpu=cortex-m3 -march=armv7-m -c notmain.c -o notmain.o
arm-none-eabi-gcc -Wall -O2 -nostdlib -nostartfiles -ffreestanding -mthumb -mcpu=cortex-m3 -march=armv7-m -c fun.c -o fun.o
arm-none-eabi-ld -o hello.elf -T memmap vectors.o notmain.o fun.o
arm-none-eabi-objdump -D hello.elf > hello.list
arm-none-eabi-objcopy hello.elf -O binary hello.bin
so if the linker script (memmap is the name I used) looks like this
MEMORY
{
rom : ORIGIN = 0x00000000, LENGTH = 0x40000
ram : ORIGIN = 0x20000000, LENGTH = 0x8000
}
SECTIONS
{
.text : { *(.text*) } > rom
.bss : { *(.bss*) } > ram
}
since all of the above is .text, no .bss nor .data, the linker takes the objects as listed on the ld command line and places them starting at that address...
mbly of section .text:
00000000 <hang-0x10>:
0: 20008000 andcs r8, r0, r0
4: 00000013 andeq r0, r0, r3, lsl r0
8: 00000011 andeq r0, r0, r1, lsl r0
c: 00000011 andeq r0, r0, r1, lsl r0
00000010 <hang>:
10: e7fe b.n 10 <hang>
00000012 <_start>:
12: f000 f801 bl 18 <notmain>
16: e7fb b.n 10 <hang>
00000018 <notmain>:
18: 2007 movs r0, #7
1a: f000 b801 b.w 20 <fun>
1e: bf00 nop
00000020 <fun>:
20: 4770 bx lr
22: bf00 nop
So for this to work you have to be careful to put your bootstrap code first on the command line. But you can also do things like this with a linker script.
the order appears to matter, listing the specific object files first then the generic .text later
MEMORY
{
romx : ORIGIN = 0x00000000, LENGTH = 0x1000
romy : ORIGIN = 0x00010000, LENGTH = 0x1000
ram : ORIGIN = 0x00030000, LENGTH = 0x1000
bob : ORIGIN = 0x00040000, LENGTH = 0x1000
ted : ORIGIN = 0x00050000, LENGTH = 0x1000
}
SECTIONS
{
abc : { vectors.o } > romx
def : { fun.o } > ted
.text : { *(.text*) } > romy
.bss : { *(.bss*) } > ram
}
and we get this
00000000 <hang-0x10>:
0: 20008000 andcs r8, r0, r0
4: 00000013 andeq r0, r0, r3, lsl r0
8: 00000011 andeq r0, r0, r1, lsl r0
c: 00000011 andeq r0, r0, r1, lsl r0
00000010 <hang>:
10: e7fe b.n 10 <hang>
00000012 <_start>:
12: f00f fff5 bl 10000 <notmain>
16: e7fb b.n 10 <hang>
Disassembly of section def:
00050000 <fun>:
50000: 4770 bx lr
50002: bf00 nop
Disassembly of section .text:
00010000 <notmain>:
10000: 2007 movs r0, #7
10002: f03f bffd b.w 50000 <fun>
10006: bf00 nop
the short answer is with gnu tools you use a linker script to manipulate where things end up, I assume you want these functions to be in the rom at some specified location. I dont quite understand exactly what you are doing. But if for example you are trying to put something simple like the branch to main() in the flash initially with main() being deeper in the flash, then somehow either through whatever code is deeper in the flash or through some other method, then later you erase and reprogram only the stuff near zero. you will still need a simple branch to main() the first time. you can force what I am calling vectors.o to be at address zero then .text can be deeper in the flash putting all of the reset of the code basically there, then leave that in flash and replace just the stuff at zero.
like this
MEMORY
{
romx : ORIGIN = 0x00000000, LENGTH = 0x1000
romy : ORIGIN = 0x00010000, LENGTH = 0x1000
ram : ORIGIN = 0x00030000, LENGTH = 0x1000
bob : ORIGIN = 0x00040000, LENGTH = 0x1000
ted : ORIGIN = 0x00050000, LENGTH = 0x1000
}
SECTIONS
{
abc : { vectors.o } > romx
.text : { *(.text*) } > romy
.bss : { *(.bss*) } > ram
}
giving
00000000 <hang-0x10>:
0: 20008000 andcs r8, r0, r0
4: 00000013 andeq r0, r0, r3, lsl r0
8: 00000011 andeq r0, r0, r1, lsl r0
c: 00000011 andeq r0, r0, r1, lsl r0
00000010 <hang>:
10: e7fe b.n 10 <hang>
00000012 <_start>:
12: f00f fff7 bl 10004 <notmain>
16: e7fb b.n 10 <hang>
Disassembly of section .text:
00010000 <fun>:
10000: 4770 bx lr
10002: bf00 nop
00010004 <notmain>:
10004: 2007 movs r0, #7
10006: f7ff bffb b.w 10000 <fun>
1000a: bf00 nop
then leave the 0x10000 stuff and replace 0x00000 stuff later.
Anyway, the short answer is linker script you need to craft a linker script to put things where you want them. gnu linker scripts can get extremely complicated, I lean toward the simple.
If you want to place everything at some other address, including your vector table, then perhaps something like this:
hop.s
.cpu cortex-m3
.thumb
.word 0x20008000 /* stack top address */
.word _start /* Reset */
.word hang
.word hang
/* ... */
.thumb_func
hang: b .
.thumb_func
ldr r0,=_start
bx r0
and this
MEMORY
{
romx : ORIGIN = 0x00000000, LENGTH = 0x1000
romy : ORIGIN = 0x00010000, LENGTH = 0x1000
ram : ORIGIN = 0x00030000, LENGTH = 0x1000
bob : ORIGIN = 0x00040000, LENGTH = 0x1000
ted : ORIGIN = 0x00050000, LENGTH = 0x1000
}
SECTIONS
{
abc : { hop.o } > romx
.text : { *(.text*) } > romy
.bss : { *(.bss*) } > ram
}
gives this
Disassembly of section abc:
00000000 <hang-0x10>:
0: 20008000 andcs r8, r0, r0
4: 00010013 andeq r0, r1, r3, lsl r0
8: 00000011 andeq r0, r0, r1, lsl r0
c: 00000011 andeq r0, r0, r1, lsl r0
00000010 <hang>:
10: e7fe b.n 10 <hang>
12: 4801 ldr r0, [pc, #4] ; (18 <hang+0x8>)
14: 4700 bx r0
16: 00130000
1a: 20410001
Disassembly of section .text:
00010000 <hang-0x10>:
10000: 20008000 andcs r8, r0, r0
10004: 00010013 andeq r0, r1, r3, lsl r0
10008: 00010011 andeq r0, r1, r1, lsl r0
1000c: 00010011 andeq r0, r1, r1, lsl r0
00010010 <hang>:
10010: e7fe b.n 10010 <hang>
00010012 <_start>:
10012: f000 f803 bl 1001c <notmain>
10016: e7fb b.n 10010 <hang>
00010018 <fun>:
10018: 4770 bx lr
1001a: bf00 nop
0001001c <notmain>:
1001c: 2007 movs r0, #7
1001e: f7ff bffb b.w 10018 <fun>
10022: bf00 nop
then you can change the vector table to 0x10000 for example.
if you are asking a different question like having the bootloader at 0x00000, then the bootloader modifies the flash to add an application say at 0x20000, then you want to run that application, there are simpler solutions that dont necessarily require you to modify the location of the vector table.

Related

Getting an label address to a register on THUMB assembly - Armv5

I am trying to get the the address of a label in thumb assembly and I am having some trouble.
I already read this post but that cannot help me and I will explain why.
I am writing an simple program with Thumb assembly ( unfortunately I cannot use Thumb2 ).
Let's consider this code:
.arch armv5te
.syntax unified
.text
.thumb
.thumb_func
thumbnow:
0x0 PUSH {LR}
0x2 LDR R0, =loadValues
0x4 POP {PC}
.align
loadValues:
0x8 .word 0xdeadbee1
0xC .word 0xdeadbee2
0x10 .word 0xdeadbee3
I am using the arm-linux-gnueabi toolchain to assemble that.
My microcontroller doesn't have an MMU so the memory address are static, no virtual pages etc.
The thing that I am trying to do is to make R0 having the value of 0x8 here so that then I can access the three words like this:
LDR R1, [R0]
LDR R2, [R0,#4]
LDR R3, [R0,#8]
This is not possible with LDR though because the value in the word is not possible to fit in a MOV command. The documentation of the assembler states that if the value cannot fit in a MOV command then it will put the value in a literal pool.
So my question is, is it possible in Thumb assembly to get the actual address of the label if the content of the address cannot fit in a MOV command?
Starting with this
.thumb
ldr r0,=hello
adr r0,hello
nop
nop
nop
nop
hello:
.word 0,1,2,3
gives this unlinked
00000000 <hello-0xc>:
0: 4806 ldr r0, [pc, #24] ; (1c <hello+0x10>)
2: a002 add r0, pc, #8 ; (adr r0, c <hello>)
4: 46c0 nop ; (mov r8, r8)
6: 46c0 nop ; (mov r8, r8)
8: 46c0 nop ; (mov r8, r8)
a: 46c0 nop ; (mov r8, r8)
0000000c <hello>:
c: 00000000 andeq r0, r0, r0
10: 00000001 andeq r0, r0, r1
14: 00000002 andeq r0, r0, r2
18: 00000003 andeq r0, r0, r3
1c: 0000000c andeq r0, r0, r12
linked
00001000 <hello-0xc>:
1000: 4806 ldr r0, [pc, #24] ; (101c <hello+0x10>)
1002: a002 add r0, pc, #8 ; (adr r0, 100c <hello>)
1004: 46c0 nop ; (mov r8, r8)
1006: 46c0 nop ; (mov r8, r8)
1008: 46c0 nop ; (mov r8, r8)
100a: 46c0 nop ; (mov r8, r8)
0000100c <hello>:
100c: 00000000 andeq r0, r0, r0
1010: 00000001 andeq r0, r0, r1
1014: 00000002 andeq r0, r0, r2
1018: 00000003 andeq r0, r0, r3
101c: 0000100c andeq r1, r0, r12
both ways r0 will return the address to the start of data from which you can then offset into that data from the caller or wherever.
Edit
.thumb
adr r0,hello
nop
nop
nop
arm-none-eabi-as so.s -o so.o
so.s: Assembler messages:
so.s:2: Error: address calculation needs a strongly defined nearby symbol
So the tool won't turn that into a load from the pool for you.
For what you want to do I think the pc relative add (adr) is the best you are going to get. You can try other toolchains as all of this is language and toolchain specific (assembly language is defined by the assembler not the target and for each toolchain (with an assembler) there can be differences in the language). Over time within gnu, how the linker and assembler worked together has changed, the linker patches up things it didn't used to.
You could of course go into the linker and add code to it to perform this optimization, the problem is most likely that by link time the linker is looking to resolve an address in the pool which is easy for it to do it doesn't have to change the instruction, the assembler would have to leave information for the linker that this is not just a fill this memory location with an address thing, either you modify gas to allow adr to work, and then if the linker cant resolve it within the instruction then the linker bails out with an error.'
Or you could just hard-code what you want and maintain it. I am not sure why the adr solution isn't adequate.
mov r0,#8 is a valid thumb instruction.

Does arm-none-eabi-ld rewrite the bl instruction?

I'm trying to understand why some Cortex-M0 code behaves differently when it is linked versus unlinked. In both cases it is loaded to 0x20000000. It looks like despite my best efforts to generate position independent code by passing -fPIC to the compiler, the bl instruction appears to differ after the code has passed through the linker. Am I reading this correctly, is that just a part of the linker's job in ARM Thumb, and is there a better way to generate a position independent function call?
Linked:
20000000:
20000000: 0003 movs r3, r0
20000002: 4852 ldr r0, [pc, #328]
20000004: 4685 mov sp, r0
20000006: 0018 movs r0, r3
20000008: f000 f802 bl 20000010
2000000c: 46c0 nop ; (mov r8, r8)
2000000e: 46c0 nop ; (mov r8, r8)
Unlinked:
00000000:
0: 0003 movs r3, r0
2: 4852 ldr r0, [pc, #328]
4: 4685 mov sp, r0
6: 0018 movs r0, r3
8: f7ff fffe bl 10
c: 46c0 nop ; (mov r8, r8)
e: 46c0 nop ; (mov r8, r8)
start.s
.globl _start
_start:
.word 0x20001000
.word reset
.word hang
.word hang
.thumb
.thumb_func
reset:
bl notmain
.thumb_func
hang:
b .
notmain.c
unsigned int x;
unsigned int fun ( unsigned int );
void notmain ( void )
{
x=fun(x+5);
}
fun.c
unsigned int y;
unsigned int fun ( unsigned int z )
{
return(y+z+1);
}
memmap
MEMORY
{
ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > ram
.bss : { *(.bss*) } > ram
}
build
arm-none-eabi-as start.s -o start.o
arm-none-eabi-gcc -fPIC -O2 -c -mthumb fun.c -o fun.o
arm-none-eabi-gcc -fPIC -O2 -c -mthumb notmain.c -o notmain.o
arm-none-eabi-ld -T memmap start.o notmain.o fun.o -o so.elf
produces
20000000 <_start>:
20000000: 20001000 andcs r1, r0, r0
20000004: 20000011 andcs r0, r0, r1, lsl r0
20000008: 20000015 andcs r0, r0, r5, lsl r0
2000000c: 20000015 andcs r0, r0, r5, lsl r0
20000010 <reset>:
20000010: f000 f802 bl 20000018 <notmain>
20000014 <hang>:
20000014: e7fe b.n 20000014 <hang>
...
20000018 <notmain>:
20000018: b510 push {r4, lr}
2000001a: 4b06 ldr r3, [pc, #24] ; (20000034 <notmain+0x1c>)
2000001c: 4a06 ldr r2, [pc, #24] ; (20000038 <notmain+0x20>)
2000001e: 447b add r3, pc
20000020: 589c ldr r4, [r3, r2]
20000022: 6823 ldr r3, [r4, #0]
20000024: 1d58 adds r0, r3, #5
20000026: f000 f809 bl 2000003c <fun>
2000002a: 6020 str r0, [r4, #0]
2000002c: bc10 pop {r4}
2000002e: bc01 pop {r0}
20000030: 4700 bx r0
20000032: 46c0 nop ; (mov r8, r8)
20000034: 00000032 andeq r0, r0, r2, lsr r0
20000038: 00000000 andeq r0, r0, r0
2000003c <fun>:
2000003c: 4b03 ldr r3, [pc, #12] ; (2000004c <fun+0x10>)
2000003e: 4a04 ldr r2, [pc, #16] ; (20000050 <fun+0x14>)
20000040: 447b add r3, pc
20000042: 589b ldr r3, [r3, r2]
20000044: 681b ldr r3, [r3, #0]
20000046: 3301 adds r3, #1
20000048: 1818 adds r0, r3, r0
2000004a: 4770 bx lr
2000004c: 00000010 andeq r0, r0, r0, lsl r0
20000050: 00000004 andeq r0, r0, r4
Disassembly of section .got:
20000054 <.got>:
20000054: 20000068 andcs r0, r0, r8, rrx
20000058: 2000006c andcs r0, r0, ip, rrx
Disassembly of section .got.plt:
2000005c <_GLOBAL_OFFSET_TABLE_>:
...
Disassembly of section .bss:
20000068 <x>:
20000068: 00000000 andeq r0, r0, r0
2000006c <y>:
2000006c: 00000000 andeq r0, r0, r0
when it wants to find the global variable x what it appears to have done is it takes the program counter and a linker supplied/modfied offset 0x32 and uses that to find the entry in the global offset table. then takes an offset from that to find X. same for Y. so it appears that when you relocate you will need to modify the global offset table at runtime or load time depending.
If I get rid of those global variables, other than the vector table which is hardcoded and not PIC (and wasnt compiled anyway), this is all position independent.
20000000 <_start>:
20000000: 20001000 andcs r1, r0, r0
20000004: 20000011 andcs r0, r0, r1, lsl r0
20000008: 20000015 andcs r0, r0, r5, lsl r0
2000000c: 20000015 andcs r0, r0, r5, lsl r0
20000010 <reset>:
20000010: f000 f802 bl 20000018 <notmain>
20000014 <hang>:
20000014: e7fe b.n 20000014 <hang>
...
20000018 <notmain>:
20000018: b508 push {r3, lr}
2000001a: 2005 movs r0, #5
2000001c: f000 f804 bl 20000028 <fun>
20000020: 3006 adds r0, #6
20000022: bc08 pop {r3}
20000024: bc02 pop {r1}
20000026: 4708 bx r1
20000028 <fun>:
20000028: 3001 adds r0, #1
2000002a: 4770 bx lr
back to this version
unsigned int y;
unsigned int fun ( unsigned int z )
{
return(y+z+1);
}
position independent
00000000 <fun>:
0: 4b03 ldr r3, [pc, #12] ; (10 <fun+0x10>)
2: 4a04 ldr r2, [pc, #16] ; (14 <fun+0x14>)
4: 447b add r3, pc
6: 589b ldr r3, [r3, r2]
8: 681b ldr r3, [r3, #0]
a: 3301 adds r3, #1
c: 1818 adds r0, r3, r0
e: 4770 bx lr
10: 00000008 andeq r0, r0, r8
14: 00000000 andeq r0, r0, r0
not position independent
00000000 <fun>:
0: 4b02 ldr r3, [pc, #8] ; (c <fun+0xc>)
2: 681b ldr r3, [r3, #0]
4: 3301 adds r3, #1
6: 1818 adds r0, r3, r0
8: 4770 bx lr
a: 46c0 nop ; (mov r8, r8)
c: 00000000 andeq r0, r0, r0
the code has to do a bit more work to access the external variable. position dependent, some work because it is external but not as much. the linker will fill in the required items to make it work...to link it...
the elf file contains information for the linker to know to do this.
Relocation section '.rel.text' at offset 0x1a4 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000010 00000a19 R_ARM_BASE_PREL 00000000 _GLOBAL_OFFSET_TABLE_
00000014 00000b1a R_ARM_GOT_BREL 00000004 y
or
Relocation section '.rel.text' at offset 0x174 contains 1 entries:
Offset Info Type Sym.Value Sym. Name
0000000c 00000a02 R_ARM_ABS32 00000004 y
notmain had these PIC
Relocation section '.rel.text' at offset 0x1cc contains 3 entries:
Offset Info Type Sym.Value Sym. Name
0000000e 00000a0a R_ARM_THM_CALL 00000000 fun
0000001c 00000b19 R_ARM_BASE_PREL 00000000 _GLOBAL_OFFSET_TABLE_
00000020 00000c1a R_ARM_GOT_BREL 00000004 x
and without.
Relocation section '.rel.text' at offset 0x198 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
00000008 00000a0a R_ARM_THM_CALL 00000000 fun
00000014 00000b02 R_ARM_ABS32 00000004 x
so in short the toolchain is doing its job, you dont need to re-do its job. And note this has nothing to do with arm or thumb. any time you use the object and linker model and allow for external items from an object the linker has to patch things up to glue the code together. thats just how it works.

How to explicitly assign a section in C code (like .text, .init, .fini) (mainly for arm)?

I am trying to make an embedded system. I have some C code, however, before the main function runs, some pre-initialization is needed. Is there a way to tell the gcc compiler, that a certain function is to be put in the .init section rather than the .text section?
this is the code:
#include <stdint.h>
#define REGISTERS_BASE 0x3F000000
#define MAIL_BASE 0xB880 // Base address for the mailbox registers
// This bit is set in the status register if there is no space to write into the mailbox
#define MAIL_FULL 0x80000000
// This bit is set in the status register if there is nothing to read from the mailbox
#define MAIL_EMPTY 0x40000000
struct Message
{
uint32_t messageSize;
uint32_t requestCode;
uint32_t tagID;
uint32_t bufferSize;
uint32_t requestSize;
uint32_t pinNum;
uint32_t on_off_switch;
uint32_t end;
};
struct Message m =
{
.messageSize = sizeof(struct Message),
.requestCode =0,
.tagID = 0x00038041,
.bufferSize = 8,
.requestSize =0,
.pinNum = 130,
.on_off_switch = 1,
.end = 0,
};
void _start()
{
__asm__
(
"mov sp, #0x8000 \n"
"b main"
);
}
/** Main function - we'll never return from here */
int main(void)
{
uint32_t mailbox = MAIL_BASE + REGISTERS_BASE + 0x18;
volatile uint32_t status;
do
{
status = *(volatile uint32_t *)(mailbox);
}
while((status & 0x80000000));
*(volatile uint32_t *)(MAIL_BASE + REGISTERS_BASE + 0x20) = ((uint32_t)(&m) & 0xfffffff0) | (uint32_t)(8);
while(1);
}
EDIT: using __attribute__(section("init")) doesn't seem to be working
Dont understand why you think you need a .init section for baremetal. A complete working example for a pi zero (using .init)
start.s
.section .init
.globl _start
_start:
mov sp,#0x8000
bl centry
b .
so.c
unsigned int data=5;
unsigned int bss;
unsigned int centry ( void )
{
return(0);
}
so.ld
MEMORY
{
ram : ORIGIN = 0x8000, LENGTH = 0x1000
}
SECTIONS
{
.init : { *(.init*) } > ram
.text : { *(.text*) } > ram
.bss : { *(.bss*) } > ram
.data : { *(.data*) } > ram
}
build
arm-none-eabi-as start.s -o start.o
arm-none-eabi-gcc -O2 -c so.c -o so.o
arm-none-eabi-ld -T so.ld start.o so.o -o so.elf
arm-none-eabi-objdump -D so.elf
Disassembly of section .init:
00008000 <_start>:
8000: e3a0d902 mov sp, #32768 ; 0x8000
8004: eb000000 bl 800c <centry>
8008: eafffffe b 8008 <_start+0x8>
Disassembly of section .text:
0000800c <centry>:
800c: e3a00000 mov r0, #0
8010: e12fff1e bx lr
Disassembly of section .bss:
00008014 <bss>:
8014: 00000000 andeq r0, r0, r0
Disassembly of section .data:
00008018 <data>:
8018: 00000005 andeq r0, r0, r5
Note if you do it right you dont need to init .bss in the bootstrap (put .data after .bss and make sure there is at least one item in .data)
hexdump -C so.bin
00000000 02 d9 a0 e3 00 00 00 eb fe ff ff ea 00 00 a0 e3 |................|
00000010 1e ff 2f e1 00 00 00 00 05 00 00 00 |../.........|
0000001c
if you want them in separate places then yes your linker script gets instantly more complicated as well as your bootstrap (with lots of room for error).
The only thing the extra work that .init buys you here IMO is that you can re-arrange the linker command line
arm-none-eabi-ld -T so.ld so.o start.o -o so.elf
get rid of .init all together
Disassembly of section .text:
00008000 <_start>:
8000: e3a0d902 mov sp, #32768 ; 0x8000
8004: eb000000 bl 800c <centry>
8008: eafffffe b 8008 <_start+0x8>
0000800c <centry>:
800c: e3a00000 mov r0, #0
8010: e12fff1e bx lr
Disassembly of section .bss:
00008014 <bss>:
8014: 00000000 andeq r0, r0, r0
Disassembly of section .data:
00008018 <data>:
8018: 00000005 andeq r0, r0, r5
No issues, works fine. Just have to know that with gnu ld (and probably others) if you dont call out something in the linker script then it fills things in in the order presented (on the command line).
Whether or not a compiler uses sections or what they are named is compiler specific, so you have to dig into the compiler specific options to see if there are any to change the defaults. Using C to bootstrap C is more work than it is worth, gcc will accept assembly files if it is a Makefile issue you are having problems with, very rarely is there a reason to use inline assembly when you can use real assembly and have something more reliable and maintainable. In real assembly then these things are trivial.
.section .helloworld
.globl _start
_start:
mov sp,#0x8000
bl centry
b .
Disassembly of section .helloworld:
00008000 <_start>:
8000: e3a0d902 mov sp, #32768 ; 0x8000
8004: ebfffffd bl 8000 <_start>
8008: eafffffe b 8008 <bss>
Disassembly of section .text:
00008000 <centry>:
8000: e3a00000 mov r0, #0
8004: e12fff1e bx lr
Disassembly of section .bss:
00008008 <bss>:
8008: 00000000 andeq r0, r0, r0
Disassembly of section .data:
0000800c <data>:
800c: 00000005 andeq r0, r0, r5
real assembly is generally used for bootstrapping, no compiler games required, no needing to go back and maintain the code every so often because of compiler games, porting is easier, etc.

unexpected behaviour of ldr [pc, #value]

I have built a simple application using GNU gcc (4.8.1429) for ARM (ATsam4LC4B: cortexM4 core, 256K flash) that:
loads at 0x3F000 (linker option -Ttext=0x3F000) and initializes,
copies itself (mirrors) at 0
jumps to the mirror Reset_Handler() (near 0) using assembly
the mirror application begins to run as expected
The problem raises when the flash memory location of the mirror main() call (flash location 0x532) is reached:
The instruction ldr [pc, #32] that is located there, loads r3 with 0x3F565 (thus calling the original main()), instead of the expected 0x565 (so that to call the mirror main() as expected).
This happens even if all the registers containing 0x3F... are zeroed prior to the ldr instruction.
The debugger reports the PC register to be 0x534 as expected and in compliance to the flash region within the stepping is performed.
Can anybody please explain to me what is going on?
Thank you in advance.
The program is compiled with [-mthumb -nostdlib -nostartfiles -ffreestanding -msingle-pic-base -mno-pic-data-is-text-relative -mlong-calls -mpoke-function-name] flags and linked with the [-Wl,--gc-sections -mcpu=cortex-m4 -Ttext=0x3f000 -nostdlib -nostartfiles -ffreestanding -Wl,-dead-strip -Wl,-static -Wl,--entry=Reset_Handler -Wl,--cref -mthumb] flags. The linker script defines rom start=0.
The region of interest within the produced .lss file contains (omitted lines are marked as ...):
...
void Reset_Handler(void)
{
...
/* Initialize the relocate segment */
...
/* Clear the zero segment */
...
/* Set the vector table base address */
/* Initialize the C library */
// __libc_init_array();
/* Branch to main function */
main();
3f532: 4b08 ldr r3, [pc, #32] ; (3f554 <Reset_Handler+0x5c>)
3f534: 4798 blx r3
3f536: e7fe b.n 3f536 <Reset_Handler+0x3e>
3f538: 20000000 .word 0x20000000
3f53c: 0003f59c .word 0x0003f59c
3f540: 20000004 .word 0x20000004
3f544: 20000004 .word 0x20000004
3f548: 20000008 .word 0x20000008
3f54c: e000ed00 .word 0xe000ed00
3f550: 0003f000 .word 0x0003f000
3f554: 0003f565 .word 0x0003f565
3f558: 6e69616d .word 0x6e69616d
3f55c: 00 .byte 0x00
3f55d: 00 .byte 0x00
3f55e: bf00 nop
3f560: ff000008 .word 0xff000008
0003f564 <main>:
int main (void)
{
3f564: b510 push {r4, lr}
...
asm("bx %0"::"r"(*(unsigned*)0x3F004 - 0x3F000));
3f580: 4b05 ldr r3, [pc, #20] ; (3f598 <main+0x34>)
3f582: 681b ldr r3, [r3, #0]
3f584: f5a3 337c sub.w r3, r3, #258048 ; 0x3f000
3f588: 4718 bx r3
}
return 0;
}
3f58a: 2000 movs r0, #0
3f58c: bd10 pop {r4, pc}
3f58e: bf00 nop
3f590: e0001000 .word 0xe0001000
3f594: 0003f329 .word 0x0003f329
3f598: 0003f004 .word 0x0003f004
...
#Notlikethat answered my question.
I was confused assuming that pc+32 address should be loaded to r3 directly, instead of the contents of this address.

Generating %pc relative address of constant data

Is there a way to have gcc generate %pc relative addresses of constants? Even when the string appears in the text segment, arm-elf-gcc will generate a constant pointer to the data, load the address of the pointer via a %pc relative address and then dereference it. For a variety of reasons, I need to skip the middle step. As an example, this simple function:
const char * filename(void)
{
static const char _filename[]
__attribute__((section(".text")))
= "logfile";
return _filename;
}
generates (when compiled with arm-elf-gcc-4.3.2 -nostdlib -c
-O3 -W -Wall logfile.c):
00000000 <filename>:
0: e59f0000 ldr r0, [pc, #0] ; 8 <filename+0x8>
4: e12fff1e bx lr
8: 0000000c .word 0x0000000c
0000000c <_filename.1175>:
c: 66676f6c .word 0x66676f6c
10: 00656c69 .word 0x00656c69
I would have expected it to generate something more like:
filename:
add r0, pc, #0
bx lr
_filename.1175:
.ascii "logfile\000"
The code in question needs to be partially position independent since it will be relocated in memory at load time, but also integrate with code that was not compiled -fPIC, so there is no global offset table.
My current work around is to call a non-inline function (which will be done via a %pc relative address) to find the offset from the compiled location in a technique similar to how -fPIC code works:
static intptr_t
__attribute__((noinline))
find_offset( void )
{
uintptr_t pc;
asm __volatile__ (
"mov %0, %%pc" : "=&r"(pc)
);
return pc - 8 - (uintptr_t) find_offset;
}
But this technique requires that all data references be fixed up manually, so the filename() function in the above example would become:
const char * filename(void)
{
static const char _filename[]
__attribute__((section(".text")))
= "logfile";
return _filename + find_offset();
}
Hmmm, maybe you have to compile it as -fPIC to get PIC. Or simply write it in assembler, assembler is a lot easier than the C you are writing.
00000000 :
0: e59f300c ldr r3, [pc, #12] ; 14
4: e59f000c ldr r0, [pc, #12] ; 18
8: e08f3003 add r3, pc, r3
c: e0830000 add r0, r3, r0
10: e12fff1e bx lr
14: 00000004 andeq r0, r0, r4
18: 00000000 andeq r0, r0, r0
0000001c :
1c: 66676f6c strbtvs r6, [r7], -ip, ror #30
20: 00656c69 rsbeq r6, r5, r9, ror #24
Are you getting the same warning I am getting?
/tmp/ccySyaUE.s: Assembler messages:
/tmp/ccySyaUE.s:35: Warning: ignoring changed section attributes for .text

Resources