Change instruction set in GCC

I want to test some architecture changes to an already existing architecture (x86) using simulators. However, to properly test them and run benchmarks, I might have to make some changes to the instruction set. Is there a way to add these changes to GCC or any other compiler?

Simple solution:
One common approach is to add inline assembly and encode the instruction bytes directly.
For example:
int main()
{
    asm __volatile__ (".byte 0x90\n");
    return 0;
}
compiles (gcc -O3) into:
00000000004005a0 <main>:
  4005a0:  90       nop
  4005a1:  31 c0    xor %eax,%eax
  4005a3:  c3       retq
So just replace 0x90 with your instruction bytes. Of course you won't see the actual instruction in a regular objdump, and the program will likely not run on your system (unless you use one of the NOP combinations), but the simulator should recognize it if it's properly implemented there.
Note that you can't expect the compiler to optimize well around an instruction it doesn't know about, and you should take care to use the inline-assembly clobber/input/output options if the instruction changes state (registers, memory), to ensure correctness. Use optimizations only if you must.
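For instance, here is a minimal sketch of wrapping such an instruction with operand and clobber annotations; the encoding shown is just a 3-byte NOP (0F 1F C0), and the assumption that the instruction reads and writes RAX is a placeholder for your instruction's actual behavior:

#include <stdint.h>

/* Hypothetical wrapper: substitute your own opcode bytes and operands. */
static inline uint64_t my_custom_op(uint64_t x)
{
    uint64_t result;
    __asm__ __volatile__ (
        ".byte 0x0f, 0x1f, 0xc0"   /* placeholder encoding (3-byte NOP) */
        : "=a" (result)            /* assume the instruction writes RAX */
        : "a" (x)                  /* assume the instruction reads RAX  */
        : "memory");               /* drop this if it doesn't touch memory */
    return result;
}

This way the register allocator knows which registers the instruction consumes and produces, even though it has no idea what the bytes mean.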
Complicated solution:
The alternative approach is to implement this in your compiler. It can be done in GCC, but as stated in the comments, LLVM is probably one of the best compilers to play with, since it's designed as a compiler development platform. It's still very complicated, though: LLVM is best suited for the IR optimization stages, and is somewhat less friendly when you try to modify the target-specific backends.
Still, it's doable, and you'll have to do it if you also plan to have your compiler decide when to issue this instruction. I'd suggest starting with the first option, though, to see whether your simulator even works with this addition, and only then spending time on the compiler side.
If and when you do decide to implement this in LLVM, your best bet is to define it as an intrinsic function; there's documentation about this at http://llvm.org/docs/ExtendingLLVM.html

You can add new instructions, or change existing ones, by modifying the group of files in GCC called the "machine description": instruction patterns in the <target>.md file, some supporting code in the <target>.c file, predicates, constraints and so on. All of these live in the $GCCHOME/gcc/config/<target>/ directory. This machinery is used during the step that generates assembly code from RTL. You can also change where instructions get emitted by modifying other, more general GCC source files (SSA tree generation, RTL generation), but all of that is a bit more complicated.
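For a taste of what an instruction pattern looks like, here is a generic define_insn sketch in the style of the GCC internals manual; the constraints and output template are illustrative, not taken from a real port:

(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "add\t%0, %1, %2")

The RTL template describes what the instruction does so the optimizers can reason about it, the empty string is a condition guarding when the pattern may be used, and the final string is the assembly text to emit.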
A simple explanation of what happens:
https://www.cse.iitb.ac.in/grc/slides/cgotut-gcc/topic5-md-intro.pdf

It's doable, and I've done it, but it's tedious. It is basically the process of porting the compiler to a new platform, using an existing platform as a model. Somewhere in GCC there is a file that defines the instruction set, and it goes through various processes during compilation that generate further code and data. It's 20+ years since I did it, so I have forgotten all the details, sorry.

Related

When we get a runtime error in a Swift project, why does Xcode send us to the thread output in assembly language? What's the point?

As you know, when something goes wrong while running a Swift project in Xcode, we are directed to the thread section of the thread debug navigator and faced with some assembly code like this:
I am wondering whether there is any reference, tutorial or tool for understanding this code; there should be a reason that we are directed to it.
Let me be clear: I know how to fix the errors, but it bothers me when I don't understand something like this. I want to know what this code is and how we can use it, or at least understand it.
Thanks :)
Original question: what language is that? That's AT&T-syntax assembly language for x86-64. See https://stackoverflow.com/tags/x86/info for manuals from Intel and other resources, and https://stackoverflow.com/tags/att/info for how AT&T syntax differs from the Intel syntax used in most manuals. (I think the x86 tag wiki has a few AT&T-syntax tutorials.) Most AT&T-syntax disassemblers have an Intel-syntax mode, too, so you can use that if you want asm that matches Intel's manuals.
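For example, here is the same instruction in both syntaxes; AT&T reverses the operand order and prefixes registers with % and immediates with $:

movl $1, %eax      # AT&T syntax: source first, destination last
mov  eax, 1        ; Intel syntax: destination first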
What's the point?
The point is so you can debug your program if you know asm. Or you can show the asm to someone who does understand it, or include it in a bug report.
Did you compile without debug symbols? Or did it crash in library code without symbols? It's normal for a debugger to show you asm if it can't show you source, or if you ask for asm.
If you have debug symbols for your own code, you can at least backtrace into parent functions for which you do have source. (Unless the stack is corrupted.)
Did your program fault on that instruction highlighted in pink? That's a bit odd, since it's loading from static data (a RIP-relative load means the address is a link-time constant).
Did you maybe munmap or mprotect that page of your program's data or text segment so a load would fault? Normally you only get faults when an addressing mode involves a pointer.
(The call *0x1234(%rip) right before it is calling through a function pointer, though. The function pointer is stored in memory, but code fetch after the call executes would fault if it pointed to an unmapped or non-executable page.) But your first image shows you got a SIGABRT, not SIGSEGV, so it's more likely the program aborted on purpose after failing an assertion.
I believe the majority of Swift coders don't know asm.
There's nothing more useful a debugger can do without debug symbols and source files.
Also keep in mind that the majority of debugger authors do know asm, so for them it is an obviously-useful feature / behaviour. They know that many people won't be able to benefit from it, but that some will.
Asm is what's really running on the machine. Without asm, you couldn't find wrong-code compiler bugs, and so on. As far as software bugs go, there is no lower level than asm, so it's not an arbitrary choice of some lower-level layer to stop at.
(Unless there's also a bug in your disassembler or debugger, in which case you need to check the hex machine code.)

Optimizing used registers when using inline ARM assembly in GCC

I want to write some inline ARM assembly in my C code. For this code, I need to use a register or two more than just the ones declared as inputs and outputs to the function. I know how to use the clobber list to tell GCC that I will be using some extra registers to do my computation.
However, I am sure that GCC enjoys the freedom to shuffle around which registers are used for what when optimizing. That is, I get the feeling it is a bad idea to use a fixed register for my computations.
What is the best way to use some extra register that is neither input nor output of my inline assembly, without using a fixed register?
P.S. I was thinking that using a dummy output variable might do the trick, but I'm not sure what kind of weird other effects that will have...
Ok, I've found a source that backs up the idea of using dummy outputs instead of hard registers:
4.8 Temporary registers:
People also sometimes erroneously use clobbers for temporary registers. The right way is to make up a dummy output, and use “=r” or “=&r” depending on the permitted overlap with the inputs. GCC allocates a register for the dummy value. The difference is that GCC can pick a convenient register, so it has more flexibility.
from page 20 of this pdf.
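As a concrete sketch of that technique in GCC inline assembly for ARM (the two-instruction sequence and the operand names are invented purely for illustration):

int scale_add(int a, int b)
{
    int result, tmp;  /* tmp is the dummy output: GCC picks a convenient register */
    __asm__ ("lsl %[t], %[x], #2\n\t"
             "add %[r], %[t], %[y]"
             : [r] "=r" (result), [t] "=&r" (tmp)  /* "=&r": early-clobber, must not overlap the inputs */
             : [x] "r" (a), [y] "r" (b));
    return result;
}

Because tmp is written before the last input is read, it uses "=&r" rather than "=r"; otherwise GCC would be free to assign it the same register as one of the inputs.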
For anyone who is interested in more info on inline assembly with GCC, this website turned out to be very instructive.

How can I get a list of legal ARM opcodes from gcc (or elsewhere)?

I'd like to generate pseudo-random ARM instructions. Via assembler directives, I can tell gcc what mode I'm in, and it will complain if I try a set of opcodes and operands that's not legal in that mode, so it must have some internal listing of what can be done in which mode. Where does that live? Would it be easier to extract that info from LLVM?
Is this question "not even wrong"? Should I try a different approach entirely?
To answer my own question: this is actually really easy to do from arm.md and constraints.md in gcc/config/arm/. I probably spent more time asking this question and answering comments on it than I did figuring this out. Turns out I just need to look for 'TARGET_THUMB1', until I get around to implementing thumb2.
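For example, assuming a checked-out GCC source tree, something like this shows where patterns are guarded by the Thumb-1 condition:

grep -n 'TARGET_THUMB1' gcc/config/arm/arm.md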
For the ARM family the buck stops at the ARM ARM (ARM Architecture Reference Manual). There is an ARM instruction set section and a Thumb instruction set section. Within both, each instruction tells you what generation it belongs to (ARMvX, where X is some number like 4 (ARM7) or 5 (ARM9 time frame), etc.). Since the opcode and pseudocode are listed for each instruction, you should be able to figure out which are real instructions and which, if any, are assembler shorthand for another instruction (push and pop, for example).
With the Cortex-M3 and Thumb-2 in particular you also need to look at the TRM (Technical Reference Manual) as well. ARM has a universal syntax, UAL (Unified Assembler Language), that they are trying to use and that should work for both Thumb and ARM. For example, on ARM you have three-register instructions:
add r1,r1,r2
In Thumb there are only two-register operations:
add r1,r2
The desire is basically to meet in the middle, or more accurately to encourage ARM assemblers to parse Thumb instructions and encode them as the equivalent ARM instruction without complaining. This may have started with Thumb rather than Thumb-2; I have always separated the two syntaxes in my code until recently (and I still generally use ARM syntax for ARM and Thumb syntax for Thumb).
And then, yes, you have to see what the specific assembler tool (in your case, binutils) actually implements. It sounds like you have found the binutils/GNU secret decoder ring.

How to read / write .exe machine code manually?

I am not well acquainted with compiler magic. The act of transforming human-readable code (or the not-really-readable assembly instructions) into machine code is, for me, rocket science combined with sorcery.
I will narrow the subject of this question down to Win32 executables (.exe). When I open these files in a specialized viewer, I can find strings (usually 16 bits per character) scattered at various places, but the rest is just garbage. I suppose the unreadable part (the majority) is machine code (or maybe resources, such as images, etc.).
Is there any straightforward way of reading the machine code? Opening the exe as a file stream and reading it byte by byte, how could one turn these individual bytes into assembly? Is there a straightforward mapping between these instruction bytes and the assembly instructions?
How is the .exe written? Four bytes per instruction? More? Less? I have noticed that some applications can create executable files just like that: for example, in ACDSee you can export a series of images into a slideshow. But this does not necessarily have to be a SWF slideshow; ACDSee is also capable of producing EXEcutable presentations. How is that done?
How can I understand what goes on inside an EXE file?
OllyDbg is an awesome tool that disassembles an EXE into readable instructions and allows you to execute the instructions one-by-one. It also tells you what API functions the program uses and if possible, the arguments that it provides (as long as the arguments are found on the stack).
Generally speaking, CPU instructions are of variable length: some are one byte, others two, some three, some four, and so on. It mostly depends on the kind of data the instruction expects. Some instructions are generalised, like "mov", which tells the CPU to move data from a CPU register to a place in memory, or vice versa. In reality there are many different "mov" instructions: ones for handling 8-bit, 16-bit and 32-bit data, ones for moving data between different registers, and so on.
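For example, here are three valid x86/x86-64 encodings of mov with an immediate value of 1, each a different length (raw bytes on the left, Intel-syntax assembly on the right):

b0 01                   mov al, 1     ; 2 bytes: 8-bit immediate into AL
b8 01 00 00 00          mov eax, 1    ; 5 bytes: 32-bit immediate into EAX
48 c7 c0 01 00 00 00    mov rax, 1    ; 7 bytes: REX.W prefix, sign-extended imm32 into RAX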
You could pick up Dr. Paul Carter's PC Assembly Language tutorial, which is a free entry-level book that talks about assembly and how the Intel 386 CPU operates. Most of it is applicable even to modern consumer Intel CPUs.
The EXE format is specific to Windows. The entry point (i.e. the first executable instruction) is usually found at the same place within the EXE file. It's all kind of difficult to explain at once, but the resources I've provided should help cure at least some of your curiosity! :)
You need a disassembler, which will turn the machine code into assembly language. This Wikipedia link describes the process and provides links to free disassemblers. Of course, as you say you don't understand assembly language, this may not be very informative - what exactly are you trying to do here?
You can use debug from the command line, but that's hard.
C:\WINDOWS>debug taskman.exe
-u
0D69:0000 0E PUSH CS
0D69:0001 1F POP DS
0D69:0002 BA0E00 MOV DX,000E
0D69:0005 B409 MOV AH,09
0D69:0007 CD21 INT 21
0D69:0009 B8014C MOV AX,4C01
0D69:000C CD21 INT 21
0D69:000E 54 PUSH SP
0D69:000F 68 DB 68
0D69:0010 69 DB 69
0D69:0011 7320 JNB 0033
0D69:0013 7072 JO 0087
0D69:0015 6F DB 6F
0D69:0016 67 DB 67
0D69:0017 7261 JB 007A
0D69:0019 6D DB 6D
0D69:001A 206361 AND [BP+DI+61],AH
0D69:001D 6E DB 6E
0D69:001E 6E DB 6E
0D69:001F 6F DB 6F
The executable file you see is Microsoft's PE (Portable Executable) format. It is essentially a container which holds some operating-system-specific data about a program, with the program data itself split into several sections. For example, code, resources and static data are stored in separate sections.
The format of a section depends on what is in it. The code section holds the machine code for the executable's target architecture; in the most common cases this is Intel x86 or AMD64 (same as EM64T) for Microsoft PE binaries. The machine-code format is CISC and originates back to the 8086 and earlier. The important aspect of CISC is that its instruction size is not constant: you have to start reading at the right place to get something valuable out of it. Intel publishes good manuals on the x86/x64 instruction set.
You can use a disassembler to view the machine code directly. In combination with the manuals you can guess the source code most of the time.
And then there are MSIL EXEs: .NET executables holding Microsoft's Intermediate Language. These do not contain machine-specific code, but .NET CIL code. The specifications for that are available online from ECMA.
These can be viewed with a tool such as Reflector.
The contents of the EXE file are described in Portable Executable. It contains code, data, and instructions to the OS on how to load the file.
There is a 1:1 mapping between machine code and assembly. A disassembler program will perform the reverse operation.
There isn't a fixed number of bytes per instruction on i386. Some are a single byte, some are much longer.
Just relating to this question: does anyone still read things like
CD 21?
I remember Sandra Bullock in one show actually reading a screenful of hex numbers and figuring out what the program does. Sort of like the current version of reading Matrix code.
If you do read stuff like CD 21 (the encoding of INT 21h, the DOS system-call interrupt, as seen in the debug listing above), how do you remember all the various combinations?
Win32 exe format on MSDN
I'd suggest taking a bit of Windows C source code, building it, and starting to debug it in Visual Studio. Switch to the disassembly view and step over the commands. You can see how the C code has been compiled into machine code, and watch it run step by step.
If it's as foreign to you as it seems, I don't think a debugger or disassembler is going to help; you need to learn assembler programming first, and study the architecture of the processor (plenty of documentation is downloadable from Intel). Then, since most machine code is generated by compilers, you'll need to understand how compilers generate code. The simplest way is to write lots of small programs and then disassemble them to see what your C/C++ is turned into, as in the sketch below.
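For instance (the file name and function are arbitrary):

/* small.c - compile and disassemble with:
   gcc -O2 -c small.c && objdump -d small.o */
int add_mul(int a, int b)
{
    return (a + b) * 3;
}

Tweak the source, re-run the two commands, and compare how the generated instructions change.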
A couple of books that'll help you understand:
Reversing
Hacking: The Art of Exploitation
To get an idea, set a breakpoint on some interesting code, and then go to the CPU window.
If you are interested in more, it is easier to compile short fragments with Free Pascal using the -al parameter.
FPC allows outputting the generated assembler in a multitude of assembler formats (TASM, MASM, GAS) using the -A parameter, and you can have the original Pascal code interleaved in comments (and more) for easy cross-reference.
Because it is compiler-generated assembler, as opposed to assembler from a disassembled .exe, it is more symbolic and easier to follow.
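For example, something along these lines (I believe -al leaves an assembler listing next to the source, but check your FPC version's documentation):

fpc -al myprog.pas    # then inspect the generated myprog.s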
Familiarity with low-level assembly (and I mean low-level assembly, not "macros" and that bull) is probably a must. If you really want to read the raw machine code directly, you would usually use a hex editor for that. To understand what the instructions do, however, most people use a disassembler to convert them into the appropriate assembly instructions. If you're one of the minority who wants to understand the machine language itself, I think you'd want the Intel® 64 and IA-32 Architectures Software Developer's Manuals. Volume 2 specifically covers the instruction set, which relates to your query about how to read machine code itself and how assembly relates to it.
Both your curiosity and your level of understanding are exactly where I was at one point. I highly recommend Code: The Hidden Language of Computer Hardware and Software. This will not answer all of the questions you ask here, but it will shed light on some of the utterly black-magic aspects of computers. It's a thick book, but highly readable.
ACDSee is probably taking advantage of the fact that .EXE loaders do no error checking on file length or anything beyond the expected portion of the file. Because of this, you can make an .EXE that will open itself and load everything beyond a given point as data. This is useful because you can then make an .EXE that works on a given set of data by just tacking that data onto the end of a suitably written .EXE.
(I have no idea what exactly ACDSee is, so take that with a big grain of salt, but I do know that some programs are generated that way.)
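A minimal sketch of that trick, assuming the builder appended the payload followed by its 4-byte length to the end of the file (the layout is hypothetical):

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    /* Open our own image; argv[0] is good enough for a sketch,
       though it isn't guaranteed to be a full path. */
    FILE *f = fopen(argv[0], "rb");
    if (!f) return 1;

    uint32_t len;
    fseek(f, -4L, SEEK_END);              /* the length lives in the last 4 bytes */
    fread(&len, sizeof len, 1, f);

    fseek(f, -4L - (long)len, SEEK_END);  /* seek to the start of the payload */
    /* ...read 'len' bytes here and treat them as the slideshow data... */

    fclose(f);
    return 0;
}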
Every instruction's machine code is kept in a special memory area within the CPU. Early Intel books gave the machine code for their instructions, so one should try to obtain such books so as to understand this. Obviously, today the machine-code listings are not as easily available. What would be nice is a program which can reverse hex to machine code. Or do it manually!

How to get GCC to use more than two SIMD registers when using intrinsics?

I am writing some code and trying to speed it up using the SSE2/3 SIMD intrinsics. My code is of such a nature that I need to load some data into an XMM register and act on it many times. When I look at the assembler code generated, it seems that GCC keeps flushing the data back to memory in order to reload something else into XMM0 and XMM1. I am compiling for x86-64, so I have 16 registers. Why is GCC using only two, and what can I do to ask it to use more? Is there any way that I can "pin" some value in a register? I added the "register" keyword to my variable definition, but the generated assembly code is identical.
Yes, you can. The Explicit Reg Vars section of the GCC manual describes the syntax you need to pin a variable to a specific register.
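The syntax looks roughly like this (a sketch only; note the GCC manual only guarantees the register choice when the variable is used as an operand of extended asm, so treat it as a strong hint):

#include <emmintrin.h>

void kernel(float *out, const float *in)
{
    /* Ask GCC to keep 'acc' in xmm4 (GNU explicit-register-variable extension). */
    register __m128 acc asm("xmm4") = _mm_loadu_ps(in);
    acc = _mm_add_ps(acc, acc);
    _mm_storeu_ps(out, acc);
}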
If you're getting to the point where you're specifying individual registers for each intrinsic, you might as well just write the assembly directly, especially given gcc's nasty habit of pessimizing intrinsics unnecessarily in many cases.
It sounds like you compiled with optimization disabled, so no variables are kept in registers between C statements, not even int.
Compile with gcc -O3 -march=native to let the compiler make non-terrible asm, optimized for your machine. The default is -O0 with a "generic" target ISA and tuning.
See also Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? for more about why "debug" builds in general are like that, and the fact that register int foo; or register __m128 bar; can stay in a register even in a debug build. But it's much better to actually have the compiler optimize, as well as using registers, if you want your code to run fast overall!
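As a quick experiment (the function and the flags are just for illustration), a loop like the following keeps its accumulator in a single XMM register across iterations at -O3, but stores and reloads it around every statement at -O0:

#include <emmintrin.h>

/* Compare:  gcc -O3 -S sum.c   versus   gcc -O0 -S sum.c
   Assumes n is a multiple of 4. */
float sum4(const float *p, int n)
{
    __m128 acc = _mm_setzero_ps();
    for (int i = 0; i < n; i += 4)
        acc = _mm_add_ps(acc, _mm_loadu_ps(p + i));

    /* horizontal sum of the four lanes */
    __m128 t = _mm_add_ps(acc, _mm_movehl_ps(acc, acc));
    t = _mm_add_ss(t, _mm_shuffle_ps(t, t, 1));
    return _mm_cvtss_f32(t);
}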
