How can I programmatically edit a binary (x86)? - gcc

I'm working on a program to do post-compilation optimization, because I've noticed there are a few special cases that gcc just doesn't optimize well, even at -O3.
Is there a library that would allow me to load an x86 binary into some data structure suitable for editing, and then write it out again? I would also want it to handle updating all the memory offsets, since the edits might change the size of the binary.

I will assume you are looking at the ELF executable format; say so if you were thinking more of PE or Mach-O.
You can indeed find several libraries and tools for loading and modifying ELF files. Among others, here is a short list of such tools:
Elfesteem
ERESI
PatchELF
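To give a feel for what those tools wrap, here is a minimal, hedged sketch in C++ that only reads the ELF header of a 64-bit little-endian binary using <elf.h>. It is not one of the libraries above and does no rewriting at all; resizing sections and fixing up every offset afterwards is exactly the hard part those projects exist for.

    // Minimal sketch: read the ELF header directly with <elf.h> (Linux).
    // Real rewriting (resizing sections, fixing relocations and offsets) is
    // what libraries such as the ones listed above are for.
    #include <elf.h>
    #include <cstdio>
    #include <cstring>
    #include <fstream>
    #include <iterator>
    #include <vector>

    int main(int argc, char** argv) {
        if (argc < 2) { std::fprintf(stderr, "usage: %s <elf-file>\n", argv[0]); return 1; }
        std::ifstream in(argv[1], std::ios::binary);
        std::vector<char> buf((std::istreambuf_iterator<char>(in)),
                              std::istreambuf_iterator<char>());
        if (buf.size() < sizeof(Elf64_Ehdr) || std::memcmp(buf.data(), ELFMAG, SELFMAG) != 0) {
            std::fprintf(stderr, "not an ELF file\n");
            return 1;
        }
        // Interpret the first bytes as the 64-bit ELF header (assumes a 64-bit binary).
        auto* ehdr = reinterpret_cast<const Elf64_Ehdr*>(buf.data());
        std::printf("entry point: 0x%lx\n", (unsigned long)ehdr->e_entry);
        std::printf("%u section headers at offset 0x%lx\n",
                    (unsigned)ehdr->e_shnum, (unsigned long)ehdr->e_shoff);
        return 0;
    }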

Related

Position independent code executable on Windows

Is it possible to build position independent code for Windows?
I am not talking about a DLL, I am talking about an executable (PE).
What I want is for the functions in the program (main, for example) to be mapped at a different memory address between two executions.
Thanks
You should read up on the image base relocation directory in a PE file. I was wondering the same thing you are asking a while back. From the research I did, I couldn't find any compiler that generates completely base-independent code. I ended up writing my program entirely in x86 asm with some tricks to load strings etc. independently of the address. I can tell you it's not worth the trouble. Just remap all the addresses to your target location.
I suggest you start reading this
https://msdn.microsoft.com/en-us/library/ms809762.aspx
You should also check out Icezilon's (not sure how you spell his name) tutorials. They're very helpful
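To make the "remap all the addresses" suggestion concrete, here is a hedged sketch of how a loader walks the base relocation directory of an image that is already mapped into memory. It assumes a 64-bit PE, uses only the standard winnt.h structures, and omits all bounds checking; apply_relocations is a made-up helper name, not an API.

    // Hedged sketch (Windows, <windows.h>): patch absolute addresses in a mapped
    // 64-bit PE image for a load address other than its preferred base.
    #include <windows.h>
    #include <cstdint>

    void apply_relocations(uint8_t* base, uint64_t preferred_base, uint64_t actual_base) {
        auto* dos = reinterpret_cast<IMAGE_DOS_HEADER*>(base);
        auto* nt  = reinterpret_cast<IMAGE_NT_HEADERS64*>(base + dos->e_lfanew);
        auto& dir = nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC];
        uint64_t delta = actual_base - preferred_base;

        uint8_t* p   = base + dir.VirtualAddress;   // RVAs are valid because the image is mapped
        uint8_t* end = p + dir.Size;
        while (p < end) {
            auto* block   = reinterpret_cast<IMAGE_BASE_RELOCATION*>(p);
            auto* entries = reinterpret_cast<uint16_t*>(block + 1);
            size_t count  = (block->SizeOfBlock - sizeof(*block)) / sizeof(uint16_t);
            for (size_t i = 0; i < count; ++i) {
                uint16_t type   = entries[i] >> 12;      // high 4 bits: relocation type
                uint16_t offset = entries[i] & 0x0FFF;   // low 12 bits: offset within the page
                if (type == IMAGE_REL_BASED_DIR64)
                    *reinterpret_cast<uint64_t*>(base + block->VirtualAddress + offset) += delta;
            }
            p += block->SizeOfBlock;
        }
    }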

What is the ".scatterload" file in macOS?

I've downloaded Apple's TextEdit example app (here) and I'm a bit puzzled by one thing I see there: the TextEdit.scatterload file. It contains a list of functions and methods. My guess is that it provides information to the linker as to which functions/methods will be needed, and in what order, when the app launches, and that this is used to order the binary generated by the linker for maximum efficiency. Oddly, I seem to be unable to find any information whatsoever about this file through Google. So. First of all, is my guess as to the function of this file correct? And second, if so, can I generate a .scatterload file for my own macOS app, to make it launch faster? How would I do that? Seems like a good idea! (I am using Objective-C, but perhaps this question is not specific to that, so I'm not going to tag for it here.)
Scatter loading refers to a way of organizing the mapping of your code in memory by specifying which parts of the code must be near which others, and so on. This is done to minimize page faults and the like.
You can read about it here: Improving locality of reference (HTML), or here: Improving locality of reference (PDF).
The .scatterload file is used by the linker to position code in the memory layout of the executable.
Unless your app really needs tight performance tuning, I would not encourage you to look into this.
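If you do want to experiment with something similar for your own app, Apple's linker (ld64) accepts an order file through the -order_file option (the ORDER_FILE build setting in Xcode), which lists symbols in the order they should be laid out. A hedged sketch with made-up symbol names:

    # Hypothetical MyApp.order file: one symbol per line, optionally scoped to an object file.
    # Pass it to the linker with -order_file MyApp.order (ORDER_FILE in Xcode).
    _main
    -[AppDelegate applicationDidFinishLaunching:]
    startup.o:_setup_logging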

What might cause a slight difference in binaries when compiled at different times using make?

I compiled my code using the make utility and got the binaries.
I compiled the code again with a few changes in the makefile (-j inserted at some points) and got a slight difference in the binaries. The difference was reported by Beyond Compare. To check further, I compiled the code again without my changes to the makefile and found that the binaries still differ.
Why does the same code compiled at different times result in slightly different binaries (in size and content)? How should I check whether the changes I have made are legitimate and the binaries are logically the same?
Do ask me for any further explanation.
You haven't said what you're building (C, C++ etc) but I wouldn't be surprised if it's a timestamp.
You could find out the format for the binary type you're building (which will depend on your operating system) and see whether it makes sense for there to be a timestamp in the place which is changing.
It's probably easiest to do this on a tiny sample program which will produce a very small binary, to make it easier to work out what everything means.
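One low-tech way to locate the change, sketched below, is to dump the offsets of the bytes that differ between the two builds and then map those offsets onto the file format with a tool such as objdump or readelf. The program and its usage are only an illustration:

    // Hedged sketch: print every byte offset at which two binaries differ,
    // so the offsets can be matched against section/header layouts afterwards.
    #include <cstdio>
    #include <fstream>
    #include <iterator>
    #include <vector>

    static std::vector<unsigned char> slurp(const char* path) {
        std::ifstream in(path, std::ios::binary);
        std::vector<unsigned char> data((std::istreambuf_iterator<char>(in)),
                                        std::istreambuf_iterator<char>());
        return data;
    }

    int main(int argc, char** argv) {
        if (argc < 3) { std::fprintf(stderr, "usage: %s <build-a> <build-b>\n", argv[0]); return 1; }
        auto a = slurp(argv[1]);
        auto b = slurp(argv[2]);
        size_t n = a.size() < b.size() ? a.size() : b.size();
        for (size_t i = 0; i < n; ++i)
            if (a[i] != b[i])
                std::printf("offset 0x%zx: 0x%02x -> 0x%02x\n", i, (unsigned)a[i], (unsigned)b[i]);
        if (a.size() != b.size())
            std::printf("sizes differ: %zu vs %zu bytes\n", a.size(), b.size());
        return 0;
    }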
ELF object files contain a timestamp for when they are compiled. Thus, you can expect a different object file each time you compile (on Linux or Solaris). You may find the same with other object file formats too.

Patching an EXE using IDA

Say there is a buggy program that contains a sprintf() and I want to change it to snprintf() so it doesn't have a buffer overflow. How do I do that in IDA?
You really don't want to make that kind of change using information from IDA Pro.
Although IDA's disassembly is relatively high quality, it's not high quality enough to support executable rewriting. Converting a call to sprintf into a call to snprintf requires pushing a new argument onto the stack. That requires introducing a new instruction, which shifts the effective address (EA) of everything that follows it in the executable image. Updating those effective addresses requires extremely high quality disassembly. In particular, you need to be able to:
Identify which addresses in the executable are data, and which ones are code
Identify which instruction operands are symbolic (address references) and which instruction operands are numeric.
IDA can't (reliably) give you that information. Also, if the executable is statically linked against the CRT, it may not contain snprintf at all, which would make performing the rewriting by hand VERY difficult.
There are a few potential workarounds. If there is sufficient padding available in (or after) the function making the call, you might be able to get away with only rewriting a single function. Alternatively, if you have access to object files, and those object files were compiled with the /GY switch (assuming you are using Visual Studio) then you may be able to edit the object file. However, editing the object file may still require substantial fix ups.
Presumably, however, if you have access to the object files you probably also have access to the source. Changing the source is probably your best bet.
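For completeness, the source-level change being recommended is a one-liner; the buffer size and input below are made up purely for illustration:

    #include <cstdio>

    int main() {
        const char* user_input = "potentially very long attacker-controlled data";
        char buf[16];
        // Before: std::sprintf(buf, "%s", user_input);  // can overflow buf
        // After: snprintf writes at most sizeof(buf) bytes, including the terminating '\0'.
        std::snprintf(buf, sizeof(buf), "%s", user_input);
        std::puts(buf);
        return 0;
    }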

profile linking times with gcc/g++ and ld

I'm using g++ to compile and link a project consisting of about 15 c++ source files and 4 shared object files. Recently the linking time more than doubled, but I don't have the history of the makefile available to me. Is there any way to profile g++ to see what part of the linking is taking a long time?
Edit: After I noticed that the makefile was using -O3 optimizations all the time, I managed to halve the linking time just by removing that switch. Is there any good way I could have found this without trial and error?
Edit: I'm not actually interested in profiling how ld works. I'm interested in knowing how I can match increases in linking time to specific command line switches or object files.
Profiling g++ will prove futile, because g++ doesn't perform the linking; the linker ld does.
Profiling ld will also likely not show you anything interesting, because linking time is most often dominated by disk I/O, and if your link isn't, you wouldn't know what to make of the profiling data, unless you understand ld internals.
If your link time is noticeable with only 15 files in the link, there is likely something wrong with your development system [1]; either it has a disk that is on its last legs and is constantly retrying, or you do not have enough memory to perform the link (linking is often RAM-intensive), and your system swaps like crazy.
Assuming you are on an ELF based system, you may also wish to try the new gold linker (part of binutils), which is often several times faster than the GNU ld.
[1] My typical links involve 1000s of objects, produce 200+MB executables, and finish in less than 60s.
If you have just hit your RAM limit, you'll be probably able to hear the disk working, and a system activity monitor will tell you that. But if linking is still CPU-bound (i.e. if CPU usage is still high), that's not the issue. And if linking is IO-bound, the most common culprit can be runtime info. Have a look at the executable size anyway.
To answer your problem in a different way: are you making heavy use of templates? For each use of a template with a different type parameter, a new instance of the whole template is generated, so the linker gets more work. To make that actually noticeable, though, you'd need a library that is really heavy on templates. A lot of the libraries from the Boost project qualify - I got template-based code bloat when using Boost.Spirit with a complex grammar. Around 4000 lines of code compiled to a 7.7 MB executable - changing one line doubled the number of specializations required and the size of the final executable. Inlining helped a lot, though, bringing the output down to 1.9 MB.
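As a toy illustration of that effect (the names are made up), each distinct type parameter below forces the compiler to emit another full instantiation that the linker then has to process and deduplicate:

    #include <cstddef>
    #include <string>
    #include <vector>

    // One copy of this function (and of std::vector<T>'s members) is
    // instantiated for every distinct T it is used with.
    template <typename T>
    std::size_t count_default(std::size_t n) {
        std::vector<T> v(n);
        return v.size();
    }

    int main() {
        // Three distinct instantiations: <int>, <double>, <std::string>.
        return static_cast<int>(count_default<int>(1) +
                                count_default<double>(1) +
                                count_default<std::string>(1));
    }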
Shared libraries might be causing other problems; you might want to look at the documentation for -fvisibility=hidden, which will improve your code anyway. From the GCC manual for -fvisibility:
Using this feature can very substantially improve linking and load times of shared object libraries, produce more optimized code, provide near-perfect API export and prevent symbol clashes. It is *strongly* recommended that you use this in any shared objects you distribute.
In fact, the linker normally has to support the possibility that the application, or other libraries, override symbols defined in the library, while typically this is not the intended usage. Note that using it is not free, however; it does require (trivial) code changes (see the sketch below).
The link suggested by the docs is: http://gcc.gnu.org/wiki/Visibility
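A minimal sketch of what those (trivial) code changes usually look like, assuming the shared library is built with -fvisibility=hidden; the API macro and the function names are made up:

    // Compile with something like: g++ -shared -fPIC -fvisibility=hidden mylib.cpp -o libmylib.so
    #define API __attribute__((visibility("default")))

    API int public_entry_point(int x);          // the one symbol exported from the .so

    static int helper(int x) { return x * 2; }  // internal; never exported

    int public_entry_point(int x) {
        return helper(x) + 1;                   // helper is not exported, so the call resolves within the library
    }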
Both gcc and g++ support the -v verbose flag, which makes them output details of the current task.
If you're interested in really profiling the tools, you may want to check out Sysprof or OProfile.

Resources