Including a separate binary into an ELF executable


I am developing an operating system. I would like to include a small asm program in my main kernel ELF that can serve as the first process to load. I am having trouble getting this to work. The program is initcode.s. I am using the following Makefile, which I modified from the xv6 operating system source for this task:
initcode:
	$(AS) $(ASFLAGS) initcode.s -o initcode.o
	ld $(LDFLAGS) -N -e start -Ttext 0 -o initcode initcode.o
	objcopy --input binary --output elf32-i386 --binary-architecture i386 initcode.out initcode
kernel.elf: $(OBJECTS) initcode
	ld -T link.ld -melf_i386 $(OBJECTS) -o kernel.elf initcode
The kernel compiles and links fine. objcopy also creates markers which enable me to find the binary from within the kernel code. However, the contents of initcode are trashed: they do not resemble what the assembler step produced in initcode.out.
How can I include initcode.s as a separate binary somewhere in my main kernel.elf, with some markers generated so I can find it from within my kernel? Any suggestions?

I can think of two simple methods. There are certainly more complex methods if you prefer.
Write a small utility to convert the initcode binary to an asm file (or even a C file) containing a GAS data section. Then assemble that file and link it with your kernel. Your initcode will then appear as a variable in your kernel.
OR
Simply 'cat' the initcode binary onto the end of your kernel ELF after the link step. Whether this method works depends on whether your loader is happy to support it.
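As a rough illustration of the first method, here is a minimal C sketch of such a conversion utility. It emits a C source file rather than a GAS data section. The function name bin_to_c_array and the generated symbol names (initcode, initcode_len) are assumptions made for this example and do not come from xv6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Read raw bytes from `in` and write a C source fragment to `out` that
 * defines `const unsigned char <name>[]` with those bytes, plus
 * `const unsigned int <name>_len` holding the byte count.
 * Returns 0 on success, -1 on an I/O error or empty input (an empty
 * initializer list would not be valid C before C23).
 * The names used here are illustrative only.
 */
int bin_to_c_array(FILE *in, FILE *out, const char *name)
{
    size_t count = 0;
    int byte;

    assert(in != NULL && out != NULL && name != NULL);

    fprintf(out, "const unsigned char %s[] = {", name);
    while ((byte = fgetc(in)) != EOF) {
        /* Start a new line every 12 bytes to keep the output readable. */
        fprintf(out, "%s0x%02x,",
                (count % 12 == 0) ? "\n    " : " ", (unsigned)byte);
        count++;
    }
    fprintf(out, "\n};\nconst unsigned int %s_len = %zu;\n", name, count);

    if (ferror(in) || ferror(out) || count == 0)
        return -1;
    return 0;
}
```

A small main() that opens the binary in "rb" mode and calls bin_to_c_array(in, stdout, "initcode") would turn this into a command-line tool. The generated file can then be compiled and linked with the kernel, which would declare extern const unsigned char initcode[]; and extern const unsigned int initcode_len; to find the blob.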

Related

avr-gcc Makefile: Why is avr-size foo.out used / required?

# Link
game.out: game.o system.o
	avr-gcc -mmcu=atmega32u2 -Os -Wall $^ -o $@ -lm
	avr-size $@
What is the avr-size $@ used for / required for?
I understand that this is probably needed by the linker, but what does the linker actually do with this piece of information? I haven't been able to find anything online, but maybe I haven't looked hard enough. Any explanations or links to information would be greatly appreciated, cheers.

It's not required at all as part of the build process; it simply displays information about the generated binary. The avr-size program is just the avr version of the size program, and you can find the corresponding manual page online or on your local system. From the description:
The GNU size utility lists the section sizes and the total size
for each of the binary files objfile on its argument list. By
default, one line of output is generated for each file or each
module if the file is an archive.

Getting assembler output from GCC/Clang in LTO mode

Normally, one can get GCC's optimized assembler output from a source file using the -S flag in GCC and Clang, as in the following example.
gcc -O3 -S -c -o foo.s foo.c
But suppose I compile all of my source files using -O3 -flto to enable link-time whole-program optimizations and want to see the final compiler-generated optimized assembly for a function, and/or see where/how code gets inlined.
The result of compiling is a bunch of .o files which are really IR files disguised as object files, as expected. In linking an executable or shared library, these are then smushed together, optimized as a whole, and then compiled into the target binary.
But what if I want assembly output from this procedure? That is, the assembly source that results after link-time optimizations, during the compilation of IR to assembly, and before the actual assembly and linkage into the final executable.
I tried simply adding a -S flag to the link step, but that didn't really work.
I know disassembling the executable is possible, even interleaving with source, but sometimes it's nicer to look at actual compiler-generated assembly, especially with -fverbose-asm.

For GCC, just add -save-temps to the link command:
$ gcc -flto -save-temps ... *.o -o bin/libsortcheck.so
$ ls -1
...
libsortcheck.so.ltrans0.s
For Clang the situation is more complicated. If you use GNU ld (the default, or -fuse-ld=ld) or the gold linker (enabled via -fuse-ld=gold), you need to pass -Wl,-plugin-opt=emit-asm:
$ clang tmp.c -flto -Wl,-plugin-opt=emit-asm -o tmp.s
For newer (11+) versions of the LLD linker (enabled via -fuse-ld=lld), you can generate assembly with -Wl,--lto-emit-asm.

Run two instances of the same C++ program simultaneously

I've got a C++ program with a Makefile, building (g++) and running on Windows cmd. Thing is, sometimes it takes a while to run and save the results, and I want to run it with different parameters at the same time so that I can do something else while I wait for the first instance to finish. It doesn't work though, because of the executable I guess:
>make
g++ -c -o main.o main.cpp
Assembler messages:
Fatal error: can't create main.o: Permission denied
make: *** [main.o] Error 1

You have two problems: The one you ask about, and the reason you ask this question in the first place.
Let's start with the problem you have...
Judging by the Makefile you show, you have it all wrong.
Rules are in the format
target: sources_the_target_depend_on
The target is usually a file that needs to be created. For an object file, that is the name of the object file itself. The source files that the object file depends on should be on the right-hand side.
To take an example from your Makefile (before you edited it away):
graph2: graph2.o
	g++ -g -c graph.cpp -o graph2.o
Here you tell make that the file graph2 depends on the file graph2.o, and then the recipe creates graph2.o rather than graph2. That's wrong. The rule should say that the file graph2.o depends on the file graph.cpp, and then go on to generate graph2.o:
graph2.o: graph.cpp
	g++ -g -c graph.cpp -o graph2.o
This indirectly leads to the problem you have, with this line (deduced from your error and the Makefile):
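main: main.o utils.o graph.o heuristics.o
	g++ -g main.cpp -o main.o utils.o graph.o heuristics.o
This contains the same error as discussed above: you say that the file main depends on main.o, and then the rule creates main.o. Because -o main.o makes main.o the executable itself, a rebuild presumably cannot overwrite it while an instance is still running, hence the "Permission denied" error. Your rule should be
main: main.cpp utils.o graph.o heuristics.o
	g++ -g main.cpp -o main utils.o graph.o heuristics.o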
Note also how I no longer name the executable file main.o, as that is supposed to be used for object files.
Now let's continue with the reason you have the problem in the first place: that you need to edit the code to change data or values.
This is a problem that you need to solve. One common way to solve it is through command-line arguments. If your program parses the command-line arguments passed to it, you can pass it the values that could change from run to run.
How to do this is a whole chapter on its own, so I won't give you any more details. There are plenty of tutorials online.
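For readers who want a starting point anyway, here is a minimal C sketch of the command-line approach (the same code also works when called from a C++ main). The question does not show the real parameters, so the run_params struct, its fields, and the parse_args function are hypothetical names invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical run parameters; replace with whatever changes between runs. */
struct run_params {
    long iterations;
    double threshold;
};

/*
 * Parse "prog <iterations> <threshold>" into *out.
 * Returns 0 on success, -1 on a missing or malformed argument.
 * On failure, *out may have been partially written.
 */
int parse_args(int argc, char **argv, struct run_params *out)
{
    char *end;

    assert(out != NULL);
    if (argc != 3)
        return -1;

    /* strtol reports overflow via errno and trailing junk via `end`. */
    errno = 0;
    out->iterations = strtol(argv[1], &end, 10);
    if (errno != 0 || end == argv[1] || *end != '\0' || out->iterations <= 0)
        return -1;

    errno = 0;
    out->threshold = strtod(argv[2], &end);
    if (errno != 0 || end == argv[2] || *end != '\0')
        return -1;

    return 0;
}
```

The program's main(int argc, char **argv) would call parse_args first and print a usage message if it returns -1. Two instances could then be started with different values, for example main 1000 0.25 and main 5000 0.1, without editing or rebuilding the code in between.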
Lastly, you can simplify your Makefile considerably, by using implicit rules and variables.
I would simply create the Makefile to look something like this
# The compiler to use
CXX = g++
# Flags to pass to the compiler (add warnings when building)
CXXFLAGS = -Wall
# The main executable file to generate
TARGET = main
# List the object files needed to generate the main executable file
OBJECTS = main.o utils.o graph.o heuristics.o
# The all target depends on your main executable file
# Also as the first target in the Makefile, if no specific target is specified
# this will be the one that is used (it's the "default" target for the Makefile)
all: $(TARGET)
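# The main executable file depends on the object files.
# Link explicitly with $(CXX): make's built-in link rule uses $(CC),
# which does not pull in the C++ standard library.
$(TARGET): $(OBJECTS)
	$(CXX) $(LDFLAGS) $^ -o $@
That is really all there is to it. The object files will be built automatically from their respective source files by make's implicit rules, and then the executable program will be linked from the object files listed.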

Named common block in a shared library

I am encountering a problem when I include a Fortran
subroutine in a shared library. This subroutine has a
named common block.
I have a Fortran main program that uses this common block
and links with the shared library.
The behavior is that variables in the common block set in
either the subroutine or main program are not shared between
the two.
I am using gfortran 4.9.3 under MinGW on Windows. Here are the pieces of
my very simple example.
Main program:
program mainp
common/whgc/ivar
ivar = 23
call sharedf
end
Subroutine:
subroutine sharedf
common/whgc/ivar
print *, 'ivar=', ivar
end
Makefile:
FC = gfortran
FFLAGS=-g
all: shltest.dll mainp.exe
shltest.dll: sharedf.o
	$(FC) -shared -o shltest.dll sharedf.o
mainp.exe: mainp.o shltest.dll
	$(FC) -o mainp.exe mainp.o shltest.dll
clean:
	rm *.o mainp.exe shltest.dll
When mainp.exe is run, it prints ivar = 0 instead of the correct ivar = 23.
Here are the results of some experimentation I did with nm.
nm -g mainp.o shows:
...
00000004 C _whgc_
nm on sharedf.o shows the same.
nm -g shltest.dll shows:
...
71446410 B _whgc_
nm -g mainp.exe shows:
...
00406430 B _whgc_
This is the only _whgc_ symbol in mainp.exe.
However, when I run mainp.exe in gdb and set break points in both
mainp and sharedf, I can print the address of ivar at each break point. The addresses
are not the same.
From the behavior it seems clear that GNU ld is not correctly
matching the _whgc_ symbols but I'm unclear about what options
to pass either in the shared library build or the final link to
make it do so?
(Please don't suggest alternatives to common blocks. In my real
application I am dealing with legacy code that uses common blocks.)
EDIT:
I tried my example on Linux/x86 and there the behavior is correct.
Of course on Linux the shared library and executable are ELF format
objects and on Windows/MinGW the format is PE/COFF.

Objcopy, how it makes binary output?

As I'm new to binutils, gcc and others, I have some general questions whose answers I haven't found in the manuals.
I'm using C and assembly (NASM syntax) and I need raw binary files as output. First of all, I compile my code to object files with these parameters:
cc -nostartfiles -nostdlib -c -ffreestanding <input file(s)> ;cc or gcc, it doesn't matter
Then I link all the files using a simple script which only puts the segments in the needed order.
ld -T <script> -o <o.file> <in.file(s)> ;nothing special here
And to get the raw binary I use objcopy
objcopy -O binary <o.file> <bin.file> ;can't be simpler
All in all, I need a binary file containing only the .text and .data segments, with 32-bit code.
1. Can I get what I want this way?
2. Are there other ways to do that? (No matter whether easier or more complicated.)
Thank you for your help.
I have no problems compiling the asm code; almost all my problems are with the C code.

I came across the ld manual page, which says that a /DISCARD/ block excludes everything listed in it from the final output.
So I inserted this block after the .text, .data and .bss blocks:
/DISCARD/ :
{
    *(.comment)
    *(.eh_frame)
    *(.note.GNU-stack)
}
I also added this line at the very beginning of my linker script:
OUTPUT_FORMAT("binary")
Therefore, I do not need to use objcopy anymore.

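For the assembly sources, you can have NASM emit a flat binary directly:
nasm -f bin <SOURCE FILE> -o <OUTPUT FILE>
This produces pure binary output (-f bin selects the flat binary format, which is also NASM's default when no -f option is given).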
