Which pass in GCC handles const strings? - gcc

What is the name of the GCC pass that places string arrays into the .rodata section? I would like to write a plugin that also intercepts the strings in the source code. I know a number of tools in binutils can achieve the same goal, but what if we want to do some postprocessing, for example verifying words?

The read-only data section, also known as .rodata, is emitted after the last of the RTL passes has run. You can see how it works in the file varasm.c, located in the gcc/ directory (the file is named varasm.cc since GCC 12). Look at the function
section *
default_function_rodata_section (tree decl)
and the code below it.
You can also easily add functions here to intercept the data going into the asm file (or some other output file), or write an external function.
The varasm.c file handles the generation of all the assembler code except the instructions of a function. This includes declarations of variables and their initial values.

Related

GCC Linker Script Ignore PHDR / PHDRS?

I'm trying to produce a flat binary file as the output of my code, so I don't want any PHDRs. But no matter how I set things up, with modern GCC versions I can't avoid them.
Either I get the error "PHDR segment not covered by LOAD segment", or, if I define PHDRS in my linker script as follows:
PHDRS
{
header PT_NULL FILEHDR;
text PT_NULL PHDRS;
data PT_NULL FILEHDR;
}
but then throw them away, I get the error "no sections assigned to phdrs".
I can't seem to find any way to force GCC to just trust me and not emit the PHDRs. What can I put in my linker script to tell GCC that I really mean it?
EDIT
I found this: https://sourceware.org/bugzilla/show_bug.cgi?id=25585
If I add the following to my GCC invocation, it seems to output the binary anyway: -Wl,--noinhibit-exec
But, it now includes extra header data in the middle of the binary image.
If you want to generate a flat binary file, you could just set the output format to "binary", for example with OUTPUT_FORMAT(binary) in the linker script or --oformat binary on the ld command line.
I'm not sure it's what you want.
A long time ago, when I was looking into bootloaders, I generated the MBR exactly that way.
I found a solution!
Put an empty PHDRS {...} definition at the top (although it seems you may not need it).
Then, in your SECTIONS, add a /DISCARD/ entry assigned to phdr that contains all the troublesome sections.
/* If we're on a newer compiler */
/DISCARD/: {
*(.interp)
*(.dynsym)
*(.dynstr)
*(.hash)
*(.gnu.hash)
*(.header)
} : phdr

Is it possible to make writeable variables in .text segment using DB directive in NASM?

I've tried declaring variables in .text segment using e.g. file_handle: dd 0.
However, trying to store something in this variable like mov [file_handle], eax results in a write error.
I know I could declare writeable variables in the .data segment, but to make the code more compact I'd like to try it as above.
Is using the stack the only way to store these values (e.g. the file handle), or could I somehow write to my variable above?
Executable code segments are not writable by default. This is a basic security precaution. No, it's not a good idea. But if you insist, as this is a toy project anyway, go ahead.
You can make yours writable by letting the linker know to mark it so, e.g. give the following argument to the MS linker:
link /SECTION:.text,EWR ....
You can actually arrange for the text segment of your Windows process to be mapped read+write+execute; see @Kuba's answer. This might also be possible on Linux with ELF binaries; I think ELF has similar flags for segments.
I think you could also call a Windows function (VirtualProtect) from inside your process to change the mapping of your text segment to read+write+execute.
Overall this sounds like a terrible idea, and you should definitely keep temporaries on the stack like a C compiler would, if you want to avoid having a data page.
Static storage for things you only use in part of the program is wasteful.
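For the Linux case mentioned above, the closest counterpart to VirtualProtect is the POSIX mprotect call. The following is only a minimal sketch, assuming Linux and a kernel policy that permits mappings that are writable and executable at the same time; make_writable is a hypothetical helper name, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: remap the page(s) covering [addr, addr + len) as
   read+write+execute. Returns 0 on success, -1 on failure (errno is set
   by mprotect in that case). */
static int make_writable(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* mprotect requires a page-aligned start address, so round down. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    uintptr_t end = (uintptr_t)addr + len;

    return mprotect((void *)start, end - start,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}
```

Passing the address and size of a variable that was assembled into .text would then allow stores to it. Systems that enforce a strict write-xor-execute policy may reject the call, which is one more reason to prefer the stack.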
No, by default it's not possible to have a writable "variable" in the .text section of an assembly program.
When you write file_handle: dd 0 in the .text section and then assemble, your label file_handle refers to an address located in the text section of your binary. However, the text section is mapped read-only.
If the text section were writable, a program could modify itself while executing.

GDB: Seeing the source code lines?

Does any program compiled with the -g option have its source code available for gdb to list, even if the source code files are unavailable? Also, when you set breakpoints at a line in a program with a complicated multi-file source structure, do you need the names of the source code files?
OP's 1st Question:
Does any program compiled with the -g option have its source code available for gdb to list, even if the source code files are unavailable?
No. The -g debug information records file names and line numbers, not the source text itself. If there is no path to the sources, then you will not see the source.
OP's 2nd Question:
[...] when you set breakpoints at a line in a program with a complicated multi-file source structure, do you need the names of the source code files?
Not always. There are a few ways of setting breakpoints. The only two I remember are breaking on a line or breaking on a function. If you wanted to break on the first line of a function, use
break functionname
If the function lives in a Fortran module (this is the gfortran naming scheme), use
break __modulename_MOD_functionname
The modulename and functionname should be lowercase, no matter how you've declared them in the code. Note the two underscores before the module name. If you are not sure, use nm on the executable to find out what the symbol is.
If you have the source code available and you are using a graphical environment, try ddd. It stops me swearing and takes a lot of guesswork out of gdb. If the source is available, it will show up straight away.

How can I create a custom variable attribute to direct movs into different address spaces?

So, I'm building a custom backend for GCC for a processor. This processor has 4 address spaces: local, global, mmm, and mmr. I want to make it so that when writing C code, you can do this:
int global x = 5;
which would cause the compiler to spit out an instruction like this:
ldi.g %reg, 5
I know that certain processors like Blackfin and MeP do something similar to this, so I figure it's possible to do; however, I have no idea how to do it. The technique that should allow me to do this is a variable attribute.
Any suggestions on how I could go about doing this?
You can add target-specific attributes by registering a table of struct attribute_spec entries using TARGET_ATTRIBUTE_TABLE, as described in the GCC internals documentation. The details of struct attribute_spec can be found in the source (gcc/tree.h; newer releases define it in gcc/tree-core.h).
The handler function named in each entry doesn't need to do anything beyond returning NULL_TREE, although typically it will at least do some error checking. (Read the comments in the header, and look at examples in other targets.)
Later, you can obtain the list of attributes for a declaration tree node with DECL_ATTRIBUTES() (see the internals docs again), and use lookup_attribute() (see the GCC source again) to check whether a given attribute is in the list.
You want references to a symbol to generate different assembly based on your new attributes, so you probably want to use the TARGET_ENCODE_SECTION_INFO hook ("Define this hook if references to a symbol or a constant must be treated differently depending on something about the variable or function named by the symbol") to set a flag on the symbol_ref (as the docs suggest). You can then define a predicate that tests this flag in the machine description (.md) file.

Define a program section in C code (GCC)

In assembly language, it's easy to define a section like:
.section foo
How can this be done in C code? I want to put a piece of C code in a special section rather than .text, so I will be able to put that section in a special location in the linker script.
I'm using GCC.
The C standard doesn't say anything about "sections" in the sense that you mean, so you'll need to use extensions specific to your compiler.
With GCC, you will want to use the section attribute:
extern void foobar(void) __attribute__((section("bar")));
The GCC documentation covers this attribute only briefly, and includes a warning:
Some file formats do not support arbitrary sections so the section attribute is not available on all platforms.
If you need to map the entire contents of a module to a particular section, consider using the facilities of the linker instead.
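As an illustration of the attribute described above, here is a minimal sketch, assuming GCC targeting an ELF platform such as Linux. The section name "bar" and the function name foobar come from the answer; the variable foobar_counter and the section name "bar_data" are made up for this example.

```c
/* Place the definition of foobar in a section named "bar" instead of
   the default .text section. */
__attribute__((section("bar"))) int foobar(void)
{
    return 42;
}

/* The attribute can also be applied to variables with static storage
   duration; "bar_data" is an illustrative name chosen for this sketch. */
__attribute__((section("bar_data"))) int foobar_counter = 1;
```

Running objdump -h on the resulting object file should list both sections, and a linker script can then place them with an input section pattern such as *(bar).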
