Using assembly JMP function on x86_64 - gcc

I'm really new to programming (in general - it's pathetic) and some Python-related assembly has cropped up in this app that I'm hacking to run on 64-bit.
Essentially, the code goes like this:
#define FUNCTION(name) \
.globl _##name; \
_##name: \
jmp *(_p_##name)
.text
FUNCTION(name)
The FUNCTION(name) syntax is used about 50 times to define headers for an external Python library as far as I can tell (I'm not going to pretend that I fully understand it, I'm just bugfixing).
Since I'm compiling for x86_64, the following error is spit out by GCC for each FUNCTION(name) instance:
32-bit absolute addressing is not supported for x86-64
cannot do signed 4 byte relocation
How would I go about "fixing" this to run on x86_64?

Grab a copy of the Intel Architecture Software Developer's Manuals. As you're seeing, some forms of the jmp instruction are invalid in 64-bit mode. In particular, the two "Jump far, absolute, address given in operand" forms won't work. You will need to switch to a relative addressing or absolute indirect addressing form of the instruction. Volume 2A of the manual (page 3-549 in my copy) has a huge pile of information about jmp.
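For the macro in the question, the relative form would typically be a RIP-relative indirect jump, i.e. jmp *_p_##name(%rip). Below is a minimal sketch in C with a file-scope asm block, assuming GCC or Clang on an x86-64 ELF target such as Linux and a normal executable build. The names real_target, p_target and trampoline are made up for illustration; other targets fall back to a plain call through the pointer.

```c
/* Hypothetical names throughout; only the jmp form relates to the question. */
int real_target(int x) { return x + 1; }

/* The pointer slot the stub jumps through (plays the role of _p_name). */
int (*p_target)(int) = real_target;

#if defined(__x86_64__) && defined(__ELF__)
/* Stub: indirect jump through p_target, addressed relative to RIP
   rather than through a 32-bit absolute address. */
__asm__(
    ".pushsection .text\n"
    ".globl trampoline\n"
    ".type trampoline, @function\n"
    "trampoline:\n"
    "    jmp *p_target(%rip)\n"
    ".popsection\n"
);
int trampoline(int x);
#else
/* Fallback for other targets: same effect without assembly. */
int trampoline(int x) { return p_target(x); }
#endif
```

Calling trampoline(41) ends up in real_target with the caller's arguments and return address untouched, which is what the original stubs rely on.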

Related

Compiling and linking NASM and 64-bit C code together into a bootloader [duplicate]

This question already has an answer here:
Relocation error when compiling NASM code in 64-bit mode
I made a very simple one-stage bootloader that does two main things: it switches from 16-bit real mode to 64-bit long mode, and it reads the next few sectors from the hard disk, which are used to start the basic kernel.
For the basic kernel, I am trying to write code in C instead of assembly, and I have some questions regarding that:
How should I compile and link the nasm file and the C file?
When compiling the files, should I compile to 16 bit or 64 bit, given that I am switching from 16 to 64 bits?
How would I add more files from either C or assembly to the project?
I rewrote the question to make my goal more clear, so if source code is needed tell me to add it.
Code: https://github.com/LatKid/BasicBootloaderNASMC
Since I am also linking a NASM file with the C file, the linker reports an error from the NASM object file: relocation R_X86_64_16 against `.text' can not be used when making a shared object; recompile with -fPIC
One of your issues is probably inside that NASM assembler file (which you did not show in the initial version of your question). It should contain only position-independent code (PIC), so it must not produce an object file with an R_X86_64_16 relocation. In your edited question, mov sp, main is obviously not PIC; you should use the instruction-pointer-relative data access of x86-64. Also, you cannot define main both in your NASM file and in a C file, and you cannot mix 16-bit mode with 64-bit mode when linking.
Study ELF, then the x86-64 ABI to understand what kind of relocations are permitted in a PIC file (and what constraints an assembler file should follow to produce a PIC object file).
Use objdump(1) & readelf(1) to inspect object files (and shared objects and executables).
Once your NASM code produces a PIC object file, link with gcc and use gcc -v to understand what happens under the hood (you'll see that extra libraries and object files, including the crt0 ones, -lgcc and -lc, are used).
Perhaps you need to understand compilation and linking better. Read Levine's book Linkers and Loaders, Drepper's paper How To Write Shared Libraries, and (about compilation) the Dragon book.
You might want to link with gcc but use your own linker script. See also this answer to a very related question (probably with motivations similar to yours); the references there are highly relevant for you.
PS. Your question lacks motivation and context (it has no MCVE but needs one) and might be some XY problem. I guess you are on Linux. I strongly recommend publishing your actual full code -even buggy- (perhaps on github or gitlab or elsewhere) as free software to get potential help. I strongly recommend using an existing bootloader (probably GRUB) and focus your efforts on your OS code (which should be published as free software, to get some feedback).

Where is _start symbol likely to be defined

I have some startup assembly for RISCV which defines the .text section as beginning at .globl _start.
I know what this is - as a disassembly shows me the address, but I cannot see where it is defined. It's not in the linker script and a grep in the build directories shows it is in various binary files, but I cannot find a definition.
I am guessing this appears in a file somewhere as a function of the architecture, but can anyone tell me where? (This is all being built using RISCV GNU cross compilers on Linux)
Unless you control it yourself, there is usually, at least in the GNU tools world, a file called crt0.s, or perhaps some other name. There should be one per architecture, since it is written in assembly language. It is the default bootstrap: it zeros .bss, copies .data as needed, and so on.
I don't remember whether it is part of the C library (glibc, newlib, etc.) or is added later by the folks who build a toolchain targeting some specific platform.
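To make the bootstrap's job concrete, here is a minimal sketch in C of the memory setup described above. It is an illustration, not the code of any particular crt0: a real startup file is usually assembly, takes its addresses from linker-script symbols, and then calls main. The function name and parameters are hypothetical so the logic can be tried on a host machine.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the data setup a crt0-style bootstrap performs before main().
   In a real startup file these values come from linker-script symbols;
   they are parameters here so the logic can be exercised on a host. */
void startup_init_memory(void *data_dst, const void *data_load, size_t data_size,
                         void *bss_start, size_t bss_size)
{
    memcpy(data_dst, data_load, data_size);  /* copy initialised data to RAM */
    memset(bss_start, 0, bss_size);          /* zero uninitialised data */
}
```

On a bare-metal target the same two steps run once, right after _start, before any C code that depends on global variables.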
It is certainly not required, but it is not uncommon to see _start as the label at the beginning of the binary; it is supposed to be the entry point. So if you have an operating system or loader that uses a binary format with labels present (ELF, etc.), it can load the binary and, instead of branching to the first address, branch to the entry point.
So the _start is merely defined as being at the start of the .text section, and the address of the .text section is defined in the linker script.

Moving a label into 64bit register - inline assembly (GCC / CLANG)

I'm trying to move a label's address into a 64bit register and it won't let me.
I'm getting:
fatal error: error in backend: 32-bit absolute addressing is not supported in 64-bit mode
Here's an example of what I'm trying to do:
asm ("mov $label, %rax"); // Tried movq, movl (No difference)
...
asm volatile("label:");
...
Why won't it let me? Does it only allow moving a label into a 32-bit register?
I have to put that label's address into a 64-bit register; how do I achieve that?
thanks
Try either of these two asm statements:
asm ("movabs $label, %rax");
asm ("lea label(%rip), %rax");
The first one uses a 64-bit immediate operand (and thus a 64-bit absolute relocation), while the second one uses RIP relative addressing. The second choice is probably the best as it's shorter, though it requires that label be within 2^31 bytes.
However, as David Wohlferd noted, your code is unlikely to work.
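A minimal sketch of the second form, applied to a global symbol instead of a label in a separate asm statement. It assumes GCC or Clang on an x86-64 ELF target and a normal executable build; marker and address_of_marker are made-up names, and other targets fall back to plain C.

```c
/* A global object whose address we want in a 64-bit register
   (hypothetical name, stands in for the label in the question). */
int marker = 7;

void *address_of_marker(void)
{
#if defined(__x86_64__) && defined(__ELF__)
    void *addr;
    /* RIP-relative lea: the address is encoded as a signed 32-bit offset
       from the next instruction, so no 32-bit absolute relocation is needed. */
    __asm__("lea marker(%%rip), %0" : "=r"(addr));
    return addr;
#else
    /* Fallback so the sketch still builds on other targets. */
    return &marker;
#endif
}
```

Because the result is an output operand, the compiler picks the 64-bit register itself; there is no need to hard-code %rax.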

GDB doesn't disassemble program running in RAM correctly

I have an application compiled using GCC for an STM32F407 ARM processor. The linker places it in Flash, but it is executed from RAM. A small bootstrap program copies the application from Flash to RAM and then branches to the application's ResetHandler.
memcpy(appRamStart, appFlashStart, appRamSize);
// run the application
__asm volatile (
"ldr r1, =_app_ram_start\n\t" // load a pointer to the application's vectors
"add r1, #4\n\t" // increment vector pointer to the second entry (ResetHandler pointer)
"ldr r2, [r1, #0x0]\n\t" // load the ResetHandler address via the vector pointer
// bit[0] must be 1 for THUMB instructions otherwise a bus error will occur.
"bx r2" // jump to the ResetHandler - does not return from here
);
This all works ok, except when I try to debug the application from RAM (using GDB from Eclipse) the disassembly is incorrect. The curious thing is the debugger gets the source code correct, and will accept and halt on breakpoints that I have set. I can single step the source code lines. However, when I single step the assembly instructions, they make no sense at all. It also contains numerous undefined instructions. I'm assuming it is some kind of alignment problem, but it all looks correct to me. Any suggestions?
It is possible that GDB relies on the symbol table to determine the instruction set mode, which can be Thumb(2) or ARM. When you move code to RAM, it probably can't find this information and falls back to ARM mode.
You can use set arm force-mode thumb in gdb to force Thumb-mode disassembly.
As a side note, if you get illegal instructions when debugging an ARM binary, this is generally the problem, unless the output is nonsense for another reason, such as trying to disassemble data.
I personally find it strange that tools don't try a heuristic approach when disassembling ARM binaries. In auto mode, it shouldn't be hard to try both modes and count the errors to decide which mode to use as a last resort.
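As a cross-check on the bootstrap in the question, the Thumb bit it mentions can be inspected in plain C. This is a small sketch with hypothetical helper names; it assumes the usual Cortex-M layout in which entry 0 of the vector table is the initial stack pointer and entry 1 is the reset handler.

```c
#include <stdint.h>

/* Entry 1 of a Cortex-M vector table holds the reset handler address
   (entry 0 is the initial stack pointer). */
uint32_t reset_handler_entry(const uint32_t *vector_table)
{
    return vector_table[1];
}

/* Bit 0 set marks a Thumb target for bx. */
int is_thumb_entry(uint32_t entry)
{
    return (entry & 1u) != 0;
}

/* Address of the first instruction: the entry with bit 0 cleared. */
uint32_t code_address(uint32_t entry)
{
    return entry & ~UINT32_C(1);
}
```

Bit 0 belongs to the branch target only, so the instructions themselves start at the even address returned by code_address.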

How to debug an assembled program?

I have a program written in assembly that crashes with a segmentation fault. (The code is irrelevant, but is here.)
My question is how to debug an assembly language program with GDB?
When I try running it in GDB and perform a backtrace, I get no meaningful information. (Just hex offsets.)
How can I debug the program?
(I'm using NASM on Ubuntu, by the way if that somehow helps.)
I would just load it directly into gdb and step through it instruction by instruction, monitoring all registers and memory contents as you go.
I'm sure I'm not telling you anything you don't know there but the program seems simple enough to warrant this sort of approach. I would leave fancy debugging tricks like backtracking (and even breakpoints) for more complex code.
As to the specific problem (code paraphrased below):
extern printf
SECTION .data
format: db "%d",0
SECTION .bss
v_0: resb 4
SECTION .text
global main
main:
push 5
pop eax
mov [v_0], eax
mov eax, v_0
push eax
call printf
You appear to be just pushing 5 on to the stack followed by the address of that 5 in memory (v_0). I'm pretty certain you're going to need to push the address of the format string at some point if you want to call printf. It's not going to take too kindly to being given a rogue format string.
It's likely that your:
mov eax, v_0
should be:
mov eax, format
and I'm assuming that there's more code after that call to printf that you just left off as unimportant (otherwise you'll be going off to never-never land when it returns).
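For comparison, here is a minimal C sketch of what the assembly presumably intends, under the assumption that the goal is to print the value stored in v_0 using the "%d" format. The function name print_v0_to is made up, and it formats into a buffer with snprintf instead of calling printf so the result is easy to check.

```c
#include <stddef.h>
#include <stdio.h>

int v_0;  /* counterpart of the 4-byte v_0 in .bss */

/* Stores 5 in v_0 and formats it with "%d" into out.
   Returns what snprintf returns: the length of the formatted text. */
int print_v0_to(char *out, size_t out_size)
{
    v_0 = 5;
    /* The format string comes first, the value to print second. */
    return snprintf(out, out_size, "%d", v_0);
}
```

In 32-bit cdecl assembly, arguments are pushed right to left, so the value is pushed first and the address of the format string last, just before the call.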
You should still be able to assemble with Stabs markers when linking code (with gcc).
I recommend using YASM and assembling with the -g stabs option:
$ yasm -felf64 -mamd64 -g stabs file.asm
This is how I assemble my assembly programs.
NASM and YASM code is interchangeable for the most part (YASM has some extensions that aren't available in NASM, but any NASM code assembles fine with YASM).
I use gcc to link my assembled object files together or while compiling with C or C++ code. When using gcc, I use -gstabs+ to compile it with debug markers.

Resources