Set Flash Memory Location to certain values on MSP430 using GCC - gcc

How do you set a few bytes of flash to a certain value, so that the flash programmer writes them when programming the MCU, using the MSP GCC toolchain? For example, the TI C/C++ compiler toolchain includes an assembler, and the following lines of assembly set the memory locations to the desired values:
;----------------------------------------------------------------------
.sect ".BSLSIG"
.retain
;----------------------------------------------------------------------
.word 0xFFFF ; 0x17F0
BslProtectVecLoc .word BSL_Protect ; 0x17F2 address of function
PBSLSigLoc .word 03CA5h ; 0x17F4 1st BSL signature
SBSLSigLoc .word 0C35Ah ; 0x17F6 2nd BSL signature
.word 0xFFFF ; 0x17F8
BslEntryLoc .word BSL_Entry_JMP ; 0x17FA BSL_Entry_JMP
Is there a way to do something similar using msp GCC toolchain?

The GNU assembler has the same mechanisms.
For example, here is how the MSP430 startup code puts the address of the startup code into the reset vector:
.section ".resetvec", "a"
__msp430_resetvec_hook:
.word __crt0_start
As with the TI compiler, this requires the section (here: .resetvec) to be defined in the linker script.
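If you would rather stay in C, GCC's section attribute can do the same job for constant data. The following is only a minimal sketch: the section name .bslsig and the array name bsl_signature are made up for illustration, and the section still has to be mapped to the right flash address in the linker script.

```c
#include <stdint.h>

/*
 * Place two constant words into a named section.
 * ".bslsig" is a hypothetical section name: the linker script must
 * define an output section with this name at the desired flash address.
 * "used" asks the compiler to emit the object even if nothing
 * references it; the linker script may additionally need KEEP() if
 * unused sections are garbage-collected at link time.
 */
__attribute__((section(".bslsig"), used))
const uint16_t bsl_signature[2] = {
    0x3CA5, /* 1st BSL signature */
    0xC35A  /* 2nd BSL signature */
};
```

Entries that hold function addresses (like BSL_Protect above) are easier to keep in an assembly file, as in the .resetvec example.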

Related

Branching to a c symbol from thumb inline assembly

I'm on a Cortex-M0+ device (Thumb only) and I'm trying to dynamically generate some code in ram and then jump to it, like so:
uint16_t code_buf[18];
...
void jump() {
register volatile uint32_t* PASET asm("r0") = &(PA->OUTSET.reg);
register volatile uint32_t* PACLR asm("r1") = &(PA->OUTCLR.reg);
register uint32_t set asm("r2") = startset;
register uint32_t cl0 asm("r3") = clears[0];
register uint32_t cl1 asm("r4") = clears[1];
register uint32_t cl2 asm("r5") = clears[2];
register uint32_t cl3 asm("r6") = clears[3];
register uint32_t dl0 asm("r8") = delays[0];
register uint32_t dl1 asm("r9") = delays[1];
register uint32_t dl2 asm("r10") = delays[2];
register uint32_t dl3 asm("r11") = delays[3];
asm volatile (
"bl code_buf\n"
: [set]"+r" (set) : [PASET]"r" (PASET), [PACLR]"r" (PACLR), [cl0]"r" (cl0), [cl1]"r" (cl1), [cl2]"r" (cl2), [cl3]"r" (cl3), [dl0]"r" (dl0), [dl1]"r" (dl1), [dl2]"r" (dl2), [dl3]"r" (dl3) : "lr"
);
}
The code in code_buf will use the arguments passed via registers (that's why I'm forcing specific registers).
This code compiles fine, but when I look at the disassembly the branch instruction has been changed to
a14: f004 ebb0 blx 0x5178
Which would try to switch the cpu to ARM mode and cause a HardFault. Is there a way to force the assembler to keep the branch as a simple bl?
So it turns out that the toolchain I was using (gcc 4.8) is buggy and makes two errors: it interprets code_buf as an ARM address, and it produces a bogus blx label, which isn't even legal on a Cortex-M0+. I updated it to 6.3.1 and the inline asm was converted to a bl label as it was supposed to.
From section 4.1.1 of the ARMv6-M Architecture Reference Manual:
Thumb interworking is held as bit [0] of an interworking address.
Interworking addresses are used in the following instructions: BX,
BLX, or POP that loads the PC.
ARMv6-M only supports the Thumb
instruction Execution state, therefore the value of address bit [0]
must be 1 in interworking instructions, otherwise a fault occurs. All
instructions ignore bit [0] and write bits [31:1]:’0’ when updating
the PC.
The target of your branch, code_buf, will be word-aligned (possibly double-word aligned) so bit 0 will be clear in its address. The key is to ensure that bit 0 is set before you branch, and then even if the toolchain selects an interworking instruction you'll remain in thumb mode.
I don't have a development environment in front of me to test this, but I would suggest casting to a pointer-to-single-byte type and using pointer arithmetic to set bit 0:
uint8_t *thumb_target = ((uint8_t *)code_buf) + 1;
asm volatile (
"bl thumb_target\n"
: [set]"+r" (set) : [PASET]"r" (PASET), [PACLR]"r" (PACLR), [cl0]"r" (cl0), [cl1]"r" (cl1), [cl2]"r" (cl2), [cl3]"r" (cl3), [dl0]"r" (dl0), [dl1]"r" (dl1), [dl2]"r" (dl2), [dl3]"r" (dl3) : "lr"
);
Edit: The above doesn't work, as Peter Cordes points out, because a local variable can't be used in inline ASM in this context. Not being well-versed in gcc's inline ASM, I won't attempt to fix it.
I have now had a chance to test the supplied code though, and gcc 7.2.1 with -S -mtune=cortex-m0plus -fomit-frame-pointer generates a BL not a BLX.
Edit 2: The documentation (section A6.7.14) suggests that only the register-target version of BLX is present in the ARMv6-M architecture (this is in common with the ARMv7 devices I'm most familiar with) and so it looks to me as if the fault is caused not by an attempt to switch to ARM mode but by an illegal instruction. Is your compiler correctly configured?
IDK why your assembler would be changing bl into blx. Mine doesn't, using arm-none-eabi-gcc 7.3.0 on Arch Linux. arm-none-eabi-as --version shows Binutils 2.30.
unsigned short code_buf[18];
void jump() {
asm("bl code_buf");
asm("blx code_buf"); // still assembles to BL, not BLX
// asm("blx jump");
// asm("bl jump");
}
compiled with arm-none-eabi-gcc -O2 -nostdlib arm-bl.c -mcpu=cortex-m0plus -mthumb (I made a linked executable with -nostdlib so I could see actual branch displacements, not placeholders).
Disassembling with arm-none-eabi-objdump -d a.out shows
00008000 <jump>:
8000: f010 f804 bl 1800c <__data_start>
8004: f010 f802 bl 1800c <__data_start>
8008: 4770 bx lr
800a: 46c0 nop ; (mov r8, r8)
Your f004 ebb0 may be a Thumb2 encoding for BLX. I don't know why you're getting it.
The Thumb encoding for bl is documented in section 5.19 of this ARM7TDMI ISA manual ("long branch with link"), but that manual doesn't mention a Thumb encoding for blx at all (because it's only Thumb, not Thumb 2). The Thumb bl encoding stores the branch displacement right-shifted by 1 (i.e. without the low bit), and always stays in Thumb mode.
It's actually two separate instructions: one which puts the high bits of the displacement (an 11-bit field, shifted left by 12) into LR, and another which branches and updates LR to the return address. (This 2-instruction hack allows Thumb1 to work without Thumb2 32-bit instructions). Both instructions start with f, so your disassembly shows that you got something else; the first 16-bit chunk of f004 ebb0 is the LR setup, but ebb0 doesn't match any Thumb 1 instruction.
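As a rough check of that encoding, the sketch below decodes the two halfwords of a bl and reproduces the targets from the objdump listing above. The function name thumb_bl_target is made up for illustration, and it assumes the classic two-halfword form with an 11-bit offset field in each half; it does not handle the extra bits that Thumb-2 adds for longer ranges.

```c
#include <stdint.h>

/*
 * Decode a classic Thumb BL pair.
 *   hi: first halfword  (11110 + upper 11 offset bits)
 *   lo: second halfword (11111 + lower 11 offset bits)
 * The combined offset is stored right-shifted by 1, and it is
 * relative to the address of the first halfword plus 4.
 */
static uint32_t thumb_bl_target(uint32_t insn_addr, uint16_t hi, uint16_t lo)
{
    int32_t off = (int32_t)(((uint32_t)(hi & 0x7FFu) << 12) |
                            ((uint32_t)(lo & 0x7FFu) << 1));

    /* The offset is a signed 23-bit value: sign-extend from bit 22. */
    if (off & (1 << 22))
        off -= (1 << 23);

    return insn_addr + 4u + (uint32_t)off;
}
```

For the listing above, thumb_bl_target(0x8000, 0xf010, 0xf804) and thumb_bl_target(0x8004, 0xf010, 0xf802) both give 0x1800c, matching the disassembly.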
Possibly asm("bl code_buf+1" : ...); or blx code_buf+1 could work, if the +1 convinces the assembler to treat it as a Thumb target. But you might need to use asm to get a .thumb_func directive applied to code_buf somehow to keep your assembler happy.
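Another option, untested on real hardware, is to compute the Thumb address in C and branch through a register instead of a label. The helper below is only a sketch; thumb_fn and thumb_entry are hypothetical names, and casting an integer to a function pointer is implementation-defined (it behaves as expected with GCC).

```c
#include <stdint.h>

typedef void (*thumb_fn)(void);

/*
 * Return the address of `buf` with bit 0 set, so that a
 * register-form BX/BLX to it stays in Thumb state.
 */
static thumb_fn thumb_entry(const void *buf)
{
    return (thumb_fn)((uintptr_t)buf | 1u);
}
```

On the target, the result could be passed as an "r" input operand to something like "blx %[target]", since the register form of BLX does exist on ARMv6-M. Note that this costs one more register on top of the ones already pinned.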

Who decides the location of a symbol on ARM

I'm looking at arch/arm/boot/compressed/head.S in the Linux kernel.
My board is an odroid-S. I looked at the symbol table I got from readelf -S arch/arm/boot/compressed/vmlinux, and at this code:
LC0: .word LC0 @ r1
.word __bss_start @ r2
.word _end @ r3
.word zreladdr @ r4
.word _start @ r5
.word _got_start @ r6
.word _got_end @ ip
.word user_stack+4096 @ sp
But their addresses are not sequential.
For example, the value of LC0 is 0000013c
but __bss_start is 0031a734.
Can anybody tell me what determines the values of these symbols?
When compiling a bare-metal piece of software like an OS or bootloader, one has a platform-specific linker script that specifies which address each section goes to. The linker script is written according to the platform's memory map.
When the operating system loads an executable, the OS loader reads the various sections in your ELF file and ensures that each gets placed into the correct region of the process's memory map. The loader then fixes up any addresses that were left unresolved at link time, as required.
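To see that such symbols get their values from the linker rather than from any source file, here is a small C sketch. It assumes a GNU-style toolchain on Linux, whose default linker script defines __bss_start and _end; the helper name bss_to_end_size is made up for illustration.

```c
#include <stdint.h>

/*
 * These symbols are not defined in any C or assembly source.
 * The linker script assigns their addresses at link time
 * (assumption: the default GNU ld script, which provides both).
 */
extern char __bss_start[];
extern char _end[];

/* Distance in bytes between the start of .bss and the end of the image. */
static uintptr_t bss_to_end_size(void)
{
    return (uintptr_t)_end - (uintptr_t)__bss_start;
}
```

The same mechanism explains the question's numbers: LC0 is a label inside the code, while __bss_start is placed wherever the linker script puts the start of .bss, so the two values need not be close to each other.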

basic assembly not working on Mac (x86_64+Lion)?

Here is the code (exit.s):
.section .data,
.section .text,
.globl _start
_start:
movl $1, %eax
movl $32, %ebx
syscall
When I execute "as exit.s -o exit.o && ld exit.o -o exit -e _start && ./exit",
the result is "Bus error: 10" and the output of "echo $?" is 138.
I also tried the example from the accepted answer to this question: Process command line in Linux 64 bit
and still get "bus error"...
First, you are using the old 32-bit Linux kernel calling convention on Mac OS X, which absolutely doesn't work.
Second, syscalls in Mac OS X are structured in a different way - they all have a leading class identifier and a syscall number. The class can be Mach, BSD or something else (see here in the XNU source) and is shifted 24 bits to the left. Normal BSD syscalls have class 2 and thus begin from 0x2000000. Syscalls in class 0 are invalid.
As per §A.2.1 of the SysV AMD64 ABI, also followed by Mac OS X, the syscall id (together with its class on XNU!) goes in %rax (or in %eax, as the high 32 bits are unused on XNU). The first argument goes in %rdi, the next in %rsi, and so on. %rcx is used by the kernel and its value is destroyed, which is why all functions in libc.dylib save it into %r10 before making syscalls (similarly to the kernel_trap macro from syscall_sw.h).
Third, code sections in Mach-O binaries are called __text and not .text as in Linux ELF, and they reside in the __TEXT segment, collectively referred to as (__TEXT,__text) (nasm automatically translates .text as appropriate if Mach-O is selected as the target object type) - see the Mac OS X ABI Mach-O File Format Reference. Even if you get the assembly instructions right, putting them in the wrong segment/section leads to a bus error. You can either use the .section __TEXT,__text directive (see here for directive syntax), or use the (simpler) .text directive, or drop it altogether, since it is assumed if no -n option was supplied to as (see the manpage of as).
Fourth, the default entry point for the Mach-O ld is called start (although, as you've already figured out, it can be changed via the -e linker option).
Given all the above you should modify your assembler source to read as follows:
; You could also add one of the following directives for completeness
; .text
; or
; .section __TEXT,__text
.globl start
start:
movl $0x2000001, %eax
movl $32, %edi
syscall
Here it is, working as expected:
$ as -o exit.o exit.s; ld -o exit exit.o
$ ./exit; echo $?
32
Adding more explanation on the magic number. I made the same mistake by applying the Linux syscall number to my NASM.
From the xnu kernel sources in osfmk/mach/i386/syscall_sw.h (search SYSCALL_CLASS_SHIFT).
/*
* Syscall classes for 64-bit system call entry.
* For 64-bit users, the 32-bit syscall number is partitioned
* with the high-order bits representing the class and low-order
* bits being the syscall number within that class.
* The high-order 32-bits of the 64-bit syscall number are unused.
* All system classes enter the kernel via the syscall instruction.
Syscalls are partitioned:
#define SYSCALL_CLASS_NONE 0 /* Invalid */
#define SYSCALL_CLASS_MACH 1 /* Mach */
#define SYSCALL_CLASS_UNIX 2 /* Unix/BSD */
#define SYSCALL_CLASS_MDEP 3 /* Machine-dependent */
#define SYSCALL_CLASS_DIAG 4 /* Diagnostics */
As we can see, the tag for BSD system calls is 2. So that magic number 0x2000000 is constructed as:
// 2 << 24
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
(SYSCALL_NUMBER_MASK & (syscall_number)))
As for why the BSD class is the one used in the end, it is probably historical: presumably Apple switched from the Mach kernel to a BSD kernel.
Inspired by the original answer.
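The construction of the magic number can be checked with a few lines of C. This is a sketch rather than the kernel's own code: the macro names and the bsd_syscall helper are made up for illustration, the shift of 24 comes from the answer above, and the 24-bit number mask is an assumption.

```c
#include <stdint.h>

#define CLASS_SHIFT 24          /* class sits in the bits above bit 23 */
#define CLASS_UNIX  2u          /* Unix/BSD class, as listed above */
#define NUMBER_MASK 0x00FFFFFFu /* assumed: low 24 bits hold the number */

/* Build the value that goes into %eax for a BSD-class syscall. */
static uint32_t bsd_syscall(uint32_t number)
{
    return (CLASS_UNIX << CLASS_SHIFT) | (number & NUMBER_MASK);
}
```

With this, bsd_syscall(1) is 0x2000001, the value loaded into %eax for exit in the corrected source above.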

ELF Shared Object in x86-64 Assembly language

I'm trying to create a shared library (*.so) in ASM and I'm not sure that I'm doing it correctly...
My code is:
.section .data
.globl var1
var1:
.quad 0x012345
.section .text
.globl func1
func1:
xor %rax, %rax
# mov var1, %rcx # this is commented
ret
To compile it I run
gcc ker.s -g -fPIC -m64 -o ker.o
gcc ker.o -shared -fPIC -m64 -o libker.so
I can access the variable var1 and call func1 with dlopen() and dlsym() from a program in C.
The problem is with the variable var1. When I try to access it from func1, i.e. uncomment that line, the linker generates an error:
/usr/bin/ld: ker.o: relocation R_X86_64_32S against `var1' can not be used when making a shared object; recompile with -fPIC
ker.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
I don't understand. I've already compiled with -fPIC, so what's wrong?
I've already compiled with -fPIC, so what's wrong?
That part of the error message is for people who are linking compiler-generated code.
You're writing asm by hand, so as datenwolf correctly wrote, when writing a shared library in assembly, you have to take care for yourself that the code is position independent.
This means the file must not contain any 32-bit absolute addresses (because relocation to an arbitrary 64-bit base is impossible). 64-bit absolute relocations are supported, but normally you should only use them for jump tables.
mov var1, %rcx uses a 32-bit absolute addressing mode. You should normally never do this, even in position-dependent x86-64 code. The normal use-cases for 32-bit absolute addresses are: putting an address into a 64-bit register with mov $var1, %edi (zero-extends into RDI), and indexing static arrays: mov arr(,%rdx,4), %edx
mov var1(%rip), %rcx uses a RIP-relative 32-bit offset. It's the efficient way to address static data, and compilers always use this even without -fPIE or -fPIC for static/global variables.
You have basically two possibilities:
Normal library-private static data, like C compilers will make for __attribute__((visibility("hidden"))) long var1;, same as for -fno-PIC.
.data
.globl var1 # linkable from other .o files in the same shared object / library
.hidden var1 # not visible for *dynamic* linking outside the library
var1:
.quad 0x012345
.text
.globl func1
func1:
xor %eax, %eax # return 0
mov var1(%rip), %rcx
ret
Full symbol-interposition-aware code like compilers generate for -fPIC.
You have to use the Global Offset Table. This is how a compiler does it, if you tell it to produce code for a shared library.
Note that this comes with a performance hit because of the additional indirection.
See Sorry state of dynamic libraries on Linux for more about symbol-interposition and the overheads it imposes on code-gen for shared libraries if you're not careful about restricting symbol visibility to allow inlining.
var1@GOTPCREL is the address of a pointer to your var1. The pointer itself is reachable with RIP-relative addressing, while its content (the address of var1) is filled in by the dynamic linker while loading the library. This supports the case where the program using your library defined var1, so var1 in your library should resolve to that memory location instead of the one in the .data or .bss (or .text) of your .so.
.section .data
.globl var1
# without .hidden
var1:
.quad 0x012345
.section .text
.globl func1
func1:
xor %eax, %eax
mov var1@GOTPCREL(%rip), %rcx
mov (%rcx), %rcx
ret
See some additional information at http://www.bottomupcs.com/global_offset_tables.html
An example on the Godbolt compiler explorer of -fPIC vs. -fPIE shows the difference that symbol-interposition makes for getting the address of non-hidden global variables:
movl $x, %eax: 5 bytes, -fno-pie
leaq x(%rip), %rax: 7 bytes, -fPIE and hidden globals or static with -fPIC
movq y@GOTPCREL(%rip), %rax: 7 bytes and a load instead of just ALU, -fPIC with non-hidden globals.
Actually loading always uses x(%rip), except for non-hidden / non-static vars with -fPIC where it has to get the runtime address from the GOT first, because it's not a link-time constant offset relative to the code.
Related: 32-bit absolute addresses no longer allowed in x86-64 Linux? (PIE executables).
A previous version of this answer stated that the DATA and BSS segments could move relative to TEXT when loading a dynamic library. This is incorrect, only the library base address is relocatable. RIP-relative access to other segments within the same library is guaranteed to be ok, and compilers emit code that does this. The ELF headers specify how the segments (which contain the sections) need to be loaded/mapped into memory.
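For comparison, the C equivalent of the first option (library-private data) is sketched below, using GCC's visibility attribute that was mentioned earlier. Unlike the assembly version, this func1 returns the value of var1 instead of 0, purely so that the effect is observable.

```c
/*
 * Hidden visibility: var1 can still be linked against from other
 * object files inside the same shared library, but it is not exported
 * for dynamic linking, so the compiler can address it RIP-relative
 * without going through the GOT.
 */
__attribute__((visibility("hidden"))) long var1 = 0x012345;

/* Exported function that reads the library-private variable. */
long func1(void)
{
    return var1;
}
```

Compiling this with gcc -O2 -fPIC -S should show a plain var1(%rip) access in func1; dropping the attribute is expected to bring back the var1@GOTPCREL indirection.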
I don't understand. I've already compiled with -fPIC, so what's wrong?
-fPIC is a flag that affects how machine code is generated from non-machine code, i.e. which operations to use, during the compilation stage. Assembly is not compiled, though! Each assembly mnemonic maps directly to a machine instruction; your code is not compiled, just transcribed into a slightly different format.
Since you're writing it in assembly, your assembly code must be position independent to be linkable into a shared library. -fPIC has no effect in your case, because it only affects code generation.
OK, I think I found something...
The first solution from drhirsch gives almost the same error, but the relocation type is changed. And the type always ends with 32. Why is that? Why does a 64-bit program use a 32-bit relocation?
I found this from googling: http://www.technovelty.org/code/c/relocation-truncated.html
It says:
For code optimisation purposes, the default immediate size to the mov
instructions is a 32-bit value
So that's the case. My program is 64-bit but the relocation is 32-bit, and all I need is to force it to be 64-bit with the movabs instruction.
This code is assembling and working (access to var1 from internal function func1 and from external C program via dlsym()):
.section .data
.globl var1
var1:
.quad 0x012345
.section .text
.globl func1
func1:
movabs var1, %rax # if one is symbol, other must be %rax
inc %rax
movabs %rax, var1
ret
But I'm in doubt about the Global Offset Table. Must I use it, or is this "direct" access absolutely correct?

programming with NASM in Windows XP

I have the following code which assembles and runs fine on Windows XP 32 bit, 2.09.08 NASM:
; how to compile: nasm -f elf test.asm
; how to link: ld -o test.exe test.o
section .data
section .text
;global _WinMain@16
;_WinMain@16:
;global _start
_start:
mov ax,4
jmp $
According to many tutorials on NASM the asm file needs the following in it:
global _WinMain@16
_WinMain@16:
...
As you can see, my asm file doesn't have that in it (it's commented out; all it has is _start). So why do all of these tutorials mention the need for the global _WinMain@16 stuff when my assembly program doesn't have it and still works?
this is the command to assemble: nasm -f elf test.asm
this is the command to link: ld -o test.exe test.o
There are several types of application on Windows with different entry points depending on which type they are. By link.exe option:
/SUBSYSTEM:CONSOLE - requires main and linking with msvcrXX.dll. These applications run in console windows; if you aren't running an instance of cmd.exe, one will be opened.
/SUBSYSTEM:WINDOWS - WinMain is the starting point. See here. Usually in C, these #include <windows.h> and are linked directly to kernel32.dll. These are GUI apps and are almost definitely linked with user32.dll and possibly advapi32.dll as well.
/SUBSYSTEM:NATIVE - there are two types of application here: drivers and applications. Native NT apps run during Windows startup and require NtProcessStartup as an entry point. There is no libc in native applications. Drivers are different again.
A full list of supported windows subsystems by link.exe is available here.
_start is the symbol windows will actually start your code running at. Normally, libc or the like actually handles _start and does some initial setup, so your program doesn't actually quite start at _main. If you wanted to link with libc you would have problems, since you'd have conflicting symbols with the libc library. If however you never intend to call any functions that are part of the C or C++ standard libraries, you are ok using _start.
Edit yikes I've just noticed this:
; how to compile: nasm -f elf test.asm
; how to link: ld -o test.exe test.o
I assume you're not using the -f elf one. ELF (Executable and Linkable Format) is the Linux format for executables; Windows requires Portable Executable (PE) images. The nasm option is -f win32, or for DOS, nasm -f coff.
Edit 2 just to check, I assembled the code and disassembled it again. I also used mingw. Anyway, I got:
SECTION .text align=16 execute ; section number 1, code
Entry_point:; Function begin
; Note: Length-changing prefix causes delay on Intel processors
mov ax, 4 ; 00401000 _ 66: B8, 0004
?_001: jmp ?_001 ; 00401004 _ EB, FE
; Entry_point End of function
; Note: Length-changing prefix causes delay on Intel processors
mov ax, 4 ; 00401006 _ 66: B8, 0004
?_002: jmp ?_002 ; 0040100A _ EB, FE
The rest of the header appears to be a valid PE format executable with no Entry point specification. I believe therefore that the code is simply "falling through" to the first piece of assembly code to start. I wouldn't advise this behaviour, especially when linking multiple objects as I've no idea what would happen. Do use -entry.
Disassembling the elf object file I get this:
SECTION .data align=4 noexecute ; section number 1, data
SECTION .text align=16 execute ; section number 2, code
_start_here:; Local function
; Note: Length-changing prefix causes delay on Intel processors
mov ax, 4 ; 0000 _ 66: B8, 0004
?_001: jmp ?_001 ; 0004 _ EB, FE
_another_symbol:; Local function
; Note: Length-changing prefix causes delay on Intel processors
mov ax, 4 ; 0006 _ 66: B8, 0004
?_002: jmp ?_002
In other words, there aren't any specific ELF-format headers in it. I believe you're getting lucky on this one; start importing or trying to link with other code modules and things will start to get more tricky.
For Windows / mingw, you want:
nasm -f win32 file.asm
for each file you want to assemble. Substitute win64 for win32 when needed. ld will do fine for linking.
Just a thought - I never explained the @16 part. It is stdcall name decoration: the number after the @ is the size in bytes of the function's arguments (WinMain takes four 4-byte parameters on 32-bit Windows), not an alignment. See this explanation for the why.

Resources