Questions about u-boot relocation feature - gcc

I am using u-boot-2011.12 on my OMAP3 target with the CodeSourcery arm-none-linux-gnueabi cross toolchain. I compiled u-boot, downloaded it onto the target, and booted it; everything went fine. However, I have some questions about the u-boot relocation feature. As far as I know, this feature is based on PIC (position-independent code), which is generated by passing the -fpic flag to gcc, but I can't find -fpic in the compile flags. Without PIC, how can u-boot implement relocation?

Remember when u-boot is running there is no OS yet. It doesn't really need the 'pic' feature used in most user applications. What I'll describe below is for the PowerPC architecture.
u-boot initially runs from non-volatile (NV) memory (NAND or NOR). After it initializes most of the peripherals (especially the RAM), it locates the top of RAM, reserves some area for the global data, then copies itself to RAM. u-boot then branches to the code in RAM and applies the fixups. u-boot is now relocated in RAM.
Look at the start.S file for your architecture and find the relocate_code() function. Then study, study, study...

I found this troubling too, and banged my head around this question for a few hours.
Luckily I stumbled upon the following thread on the u-boot mailing list :
http://lists.denx.de/pipermail/u-boot/2010-October/078297.html
What this says is that, at least on ARM, using -fPIC/-fPIE at COMPILE TIME is not necessary to generate position-independent binaries. It eases the task of the runtime loader by doing as much work up front as possible, but that's all.
Whether you use fPIC or not, you can always use -pic / -pie at LINK TIME, which will move all position-dependent references to a relocation section. Since no processing was performed at COMPILE TIME to add helpers, expect this section to be larger than when using -fPIC.
They conclude that for their purposes using -fPIC does not have any significant advantage over a link-time only solution.
[edit] See commit u-boot 92d5ecba for reference
arm: implement ELF relocations
http://git.denx.de/cgi-bin/gitweb.cgi?p=u-boot.git;a=commit;h=92d5ecba47feb9961c3b7525e947866c5f0d2de5
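To make the "copy, then fix up" idea concrete, here is a minimal sketch in C of what processing a relocation section amounts to. It is not u-boot's actual code (that lives in assembly in start.S); the struct rel_entry layout and the relocate_image() name are assumptions made for illustration, and offsets are taken relative to the image start to keep the example small. The only detail taken from the ARM ELF ABI is the relocation type number 23 (R_ARM_RELATIVE).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of R_ARM_RELATIVE in the ARM ELF ABI. */
#define REL_RELATIVE 23u

/* Hypothetical, simplified relocation record (not the real Elf32_Rel):
 * the offset, from the image start, of a 32-bit word holding an
 * absolute address that was computed for the link-time base. */
struct rel_entry {
    uint32_t offset;
    uint32_t type;
};

/* Copy the image to dst, then add the distance the image moved
 * (new_base - old_base) to every word named by a relative relocation. */
void relocate_image(uint8_t *dst, const uint8_t *src, size_t size,
                    uint32_t old_base, uint32_t new_base,
                    const struct rel_entry *rel, size_t nrel)
{
    uint32_t delta = new_base - old_base;

    memcpy(dst, src, size);

    for (size_t i = 0; i < nrel; i++) {
        uint32_t word;

        if (rel[i].type != REL_RELATIVE)
            continue; /* other relocation types are ignored in this sketch */

        /* memcpy avoids unaligned or aliasing-unsafe accesses */
        memcpy(&word, dst + rel[i].offset, sizeof word);
        word += delta;
        memcpy(dst + rel[i].offset, &word, sizeof word);
    }
}
```

Words that are not listed in the relocation table (plain data, PC-relative branches) are copied unchanged, which is why no compile-time -fPIC is needed for this scheme to work.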

Related

Bare-metal ARM Cortex-A7 newlib crt0 not initializing .bss and .data regions

I'm learning to write bare-metal ARM Cortex-A7 firmware to run on QEMU with semihosting. I know that ARM GCC has a libc implementation called newlib, which supports semihosting for common libc functions. So I'm trying to get newlib to work as well.
After addressing a lot of issues, the code is finally running correctly on QEMU: https://github.com/iNvEr7/qemu-learn/tree/master/semihosting-newlib
(Note: QEMU 5.2.0 seems to have a bug that would crash newlib's semihosting call to HEAPINFO, so to run my code on QEMU, you have to compile QEMU master, and use make run target to run the code with QEMU in a tmux session)
However I'd like to find some answers to some of the problems I encountered when integrating with newlib.
To my understanding, newlib, as a libc implementation, provides a crt0 routine that initializes the application's memory regions, including .bss, .data, heap and stack.
However, from my tests, the crt0 that GCC linked with doesn't initialize the .bss and .data region, and would crash the later crt0 routine because of that.
So I had to write my own initialization code for .bss and .data in order for it to run correctly.
So I want to understand whether I'm doing it the right way. Did I miss something that would instead enable newlib to initialize these regions for me? Or is it conventional to do the initialization myself?
Note: I'm using arm-none-eabi-gcc stable 9-2019-q4-major
It seems like I'm hitting a bug in newlib itself, and my current code is running fine because of some random luck.
So I updated my toolchain to gcc-arm-none-eabi-10-2020-q4-major and tried to compile the same code. This time it crashes again.
So I attached GDB and stepped through the crt0 assembly code trying to figure out why.
It turns out that this line of code is loading the label's address to r1, but it should be loading the content in that label's address, i.e. ldr r1, .LC0 instead of adr r1, .LC0 .
The consequence of this typo is that the data returned from the HEAPINFO semihosting call overwrites other data after that label, which contains information about the memory regions. This in turn affects the .bss initialization code later in the crt0 routine. In my previous test with the older toolchain it happened to run without crashing, but with the latest toolchain the error causes fatal crashes.
I also realized that the QEMU 5.2.0 crash may be caused by this newlib bug rather than by a QEMU problem. Somehow the QEMU master version behaved differently, which made the crash disappear.
I have submitted a patch to newlib. It surprised me that such a fatal mistake can slip through so many years without notice while it can be revealed by a simple hello world program.
Anyway, it seems my question is also answered by my digging. If newlib was working correctly, it should have initialized .bss section. But there's no code in newlib to initialize .data section, and we have to do that manually for bare-metal.
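For reference, the manual initialization mentioned above usually looks something like the sketch below. The function name and its parameters are assumptions for illustration; in a real startup file the five pointers would come from symbols defined in your own linker script (names such as __data_load or __bss_start are a common convention, not something newlib provides). Copying .data only matters when its load address differs from its run address, e.g. when the image lives in flash.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy initialized data from its load address to its run address,
 * then zero the .bss range. All ranges are [start, end) and are
 * assumed to be word-aligned, as linker scripts normally ensure. */
void init_data_and_bss(const uint32_t *data_load,
                       uint32_t *data_start, uint32_t *data_end,
                       uint32_t *bss_start, uint32_t *bss_end)
{
    for (uint32_t *dst = data_start; dst < data_end; )
        *dst++ = *data_load++;

    for (uint32_t *dst = bss_start; dst < bss_end; )
        *dst++ = 0;
}
```

This has to run before any C code that relies on global or static variables, so it is typically called from the reset handler right after the stack pointer is set up.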
Plot twist: I heard back from the newlib mailing list. It turns out that newlib's implementation does conform to the ARM spec:
https://developer.arm.com/documentation/100863/0300/Semihosting-operations/SYS-HEAPINFO--0x16-?lang=en
Where "the PARAMETER REGISTER contains the address of a pointer to a four-field data block."
Instead, it was QEMU that misinterpreted the spec and wrote to the wrong address. I will file an issue with QEMU instead.
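The double indirection in that quoted sentence is easy to get wrong, so here is a small C model of it. This is only a sketch: fake_host_heapinfo() is a made-up stand-in for the debugger/emulator side of the call, and the field names are how I understand the linked Arm page. The point is that the parameter register holds the address of a pointer, and the host must follow that pointer before writing the four fields.

```c
#include <stdint.h>

/* The four-field data block filled in by SYS_HEAPINFO (0x16). */
struct heapinfo_block {
    uint32_t heap_base;
    uint32_t heap_limit;
    uint32_t stack_base;
    uint32_t stack_limit;
};

/* Hypothetical model of the host side: 'param' plays the role of the
 * parameter register, i.e. the ADDRESS OF A POINTER to the block. */
void fake_host_heapinfo(struct heapinfo_block **param,
                        uint32_t heap_base, uint32_t heap_limit,
                        uint32_t stack_base, uint32_t stack_limit)
{
    /* One extra dereference: follow the pointer stored at 'param'. */
    struct heapinfo_block *block = *param;

    block->heap_base = heap_base;
    block->heap_limit = heap_limit;
    block->stack_base = stack_base;
    block->stack_limit = stack_limit;
}
```

In crt0 terms, adr r1, .LC0 passes the address of the word at .LC0 (which itself holds the block's address), matching the first parameter here. A host that skips the dereference writes the four fields on top of the pointer and whatever follows it, which is the corruption described above.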

Simple bootloader for running Linux kernel on a simulator

We have built a simple instruction set simulator for the sparc v8 processor. The model consists of a v8 processor, a main memory and a character input and a character output device. Currently I am able to run simple user-level programs on this simulator which are built using a cross compiler and placed in the modeled main memory directly.
I am trying to get a linux kernel to run on this simulator by building a simplest bootloader. (I'm considering uClinux which is made for mmu-less systems). The uncompressed kernel and the filesystem are both assumed to be present in the main memory itself, and all that my bootloader has to do is pass the relevant information to the kernel and make a jump to the start of the kernel code. I have no experience in OS development or porting linux.
I have the following questions :
What is this bare minimum information that a bootloader has to supply to the kernel ?
How to pass this information?
How to point the kernel to use my custom input/output devices?
There is some documentation available for porting linux to ARM boards, and from this documentation, it seems that the bootloader passes information about the size of RAM etc. via a data structure called ATAGS. How is it done in the case of a Sparc processor? I could not find much documentation for Sparc on the internet. There exists a linux bootloader for the Leon3 implementation of Sparc v8, but I could not find the specific information I was looking for in its code.
I will be grateful for any links that explain the bare minimum information to be passed to a kernel and how to pass it.
Thanks,
-neha

How to link iPad Air app (arm64) against existing armv7 static libraries?

I have compiled armv7 static libraries (lib*.a) and I'm going to compile an iPad Air app (arm64).
I'm getting linker warning and then linker error:
$ lipo -info /Users/user/Documents/dev/src/iOS_Projects/iProject/libMyLib.a
input file /Users/user/Documents/dev/src/iOS_Projects/iProject/libMyLib.a is not a fat file
Non-fat file: /Users/user/Documents/dev/src/iOS_Projects/iProject/libMyLib.a is architecture: armv7
Ld: warning: ignoring file /Users/user/Documents/dev/src/iOS_Projects/iProject/libMyLib.a, file was built for archive which is not the architecture being linked (arm64): /Users/user/Documents/dev/src/iOS_Projects/iProject/libMyLib.a ignoring file
Recompiling the static libs for arm64 is undesirable (and may be impossible). How can I use them?
With difficulty.
You can only switch between AArch32 state and AArch64 state at an exception boundary, so whilst e.g. 64-bit kernel/32-bit userspace is possible, it's impossible to use both in a single process. Since it's an entirely different instruction set/register layout/exception model/etc. there's no 32/64-bit interworking in the style of ARM/Thumb (which are essentially just different encodings of the same instructions).
In general (I'm not familiar with iOS specifics, but I assume it supports "legacy" AArch32 processes as Linux does):
If the libraries are completely integral to your code, your best bet is to simply give in and compile your app as 32-bit.
If you have super-crucial-absolutely-must-be-64-bit code but the library calls are not in the fast path, you could compile them into a 32-bit helper program that you spawn as an additional process and call via some form of IPC.
Otherwise you're looking at the ridiculously impractical prospect of some form of binary translation.
I gather that iOS offers no support for IPC, which rather rules out the second option in this particular case.

Is it possible to generate native x86 code for ring0 in gcc?

I wonder, is there any way to use gcc to generate native x86 code (which can be booted without any OS)?
Yes, the Linux kernel is compiled with GCC and runs in ring 0 on x86.
The question isn't well-formed. Certainly not all of the instructions needed to initialize a modern CPU from scratch can be emitted by gcc alone, you'll need to use some assembly for that. But that's sort of academic because modern CPUs don't actually document all this stuff and instead expect your hardware manufacturer to ship firmware to do it. After firmware initialization, a modern PC leaves you either in an old-style 16 bit 8086 environment ("legacy" BIOS) or a fairly clean 32 or 64 bit (depending on your specific hardware platform) environment called "EFI Boot Services".
Operations in EFI mode are all done using C function pointers, and you can indeed build for this environment using gcc. See the gummiboot boot loader for an excellent example of working with EFI.

what is cross compilation?

what is cross compilation?
Cross-compilation is the act of compiling code for one computer system (often known as the target) on a different system, called the host.
It's a very useful technique, for instance when the target system is too small to host the compiler and all relevant files.
Common examples include many embedded systems, but also typical game consoles.
A cross-compiler runs on one architecture and compiles source code for another architecture.
For example: hello.c
gcc hello.c (gcc is a compiler for x86 architecture.)
arm-cortexa8-linux-gnueabihf-gcc hello.c
(arm-....-gcc is a compiler for the arm architecture.) Here you are compiling on the host PC for a target board (e.g. RPi, BeagleBone, Wega board). In this example arm-cortexa8-linux-gnueabihf-gcc is called the 'cross compiler'.
This process is called cross compilation.
see the link for more info cross compilation
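One way to see the difference between the two compilers above is to let the source report which target it was built for. The sketch below relies on the architecture macros that GCC predefines for each target (__arm__, __aarch64__, __x86_64__, __i386__); the function names target_arch() and print_target() are made up for this example.

```c
#include <stdio.h>

/* Return the name of the architecture this file was COMPILED FOR,
 * which need not be the machine the compiler itself runs on. */
const char *target_arch(void)
{
#if defined(__aarch64__)
    return "aarch64";
#elif defined(__arm__)
    return "arm";
#elif defined(__x86_64__)
    return "x86_64";
#elif defined(__i386__)
    return "i386";
#else
    return "unknown";
#endif
}

/* Print the target name; call this from main() in hello.c. */
void print_target(void)
{
    printf("compiled for: %s\n", target_arch());
}
```

Built with the native gcc on an x86 PC, the string is chosen from the x86 branches; built with arm-cortexa8-linux-gnueabihf-gcc on the same PC, the ARM branch is selected, and the resulting binary only runs on the target board (or an emulator).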
To "cross compile" is to compile source on, say, a Linux box with the intent of running it on a Mac or Windows box. This is usually done using a cross compilation plugin; these are readily available from various web servers across the net. If one installs a cross compilation plugin designed to compile for Windows boxes onto their Linux box, they may compile for a Linux/*NIX box and also have the option to compile and link a Windows-ready executable. This is extremely convenient for a freelance programmer who has access to no more than a single Linux/Windows/Mac box. Note that various cross compilation plugins will allow for multitudes of applications, some of which you may or may not perceive as useful, so a thorough perusal of the plugin's README file is advisable.
Did you have a particular project in mind that you would like to apply the method of cross compilation to?
In a strict sense, it is the compilation of code on one host that is intended to run on another.
Most commonly it is used with reference to compilation for architectures that are not binary-compatible with the host -- for instance, building RISC binaries on a CISC CPU platform, or 64-bit binaries on a 32-bit system. Or, for example, building firmware intended to run on embedded devices (perhaps using the ARM CPU architecture) on Intel PC-based OSs.
A Cross Compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running.
For example, a compiler that runs on a Windows 7 PC but generates code that runs on an Android smartphone is a cross compiler.
A cross compiler is necessary to compile for multiple platforms from one machine.
A platform could be infeasible for a compiler to run on, such as for the microcontroller of an embedded system because those systems contain no operating system.
In paravirtualization one machine runs many operating systems, and a cross compiler could generate an executable for each of them from one main source.
