Interpreting Mach-O Data - macos

I am trying to tinker with the appearance of the Dock in OS X.
I have the raw data from the Dock's Mach-O executable, but I do not know much about the format. I am trying to figure out where I might find the segments/sections where the Dock actually gets drawn. For example, I see all kinds of sections, such as __DATA,__mod_init_func and __DATA,__cfstring, and I am just wondering if there is an easy way to tell which of these sections (or even particular segments) has the data I'm looking for, or a way to decompile the data into a more readable format.

You can't really "decompile" a Mach-O file unless you understand the format in detail. You can get some human-readable content out of the raw data, such as its methods, e.g.
-[AClass anInstance:]
would be something like:
-(id)anInstance:(id)arg1;
I would suggest some other tool for understanding this.
There are a few command-line tools that you could use:
nm /path/to/mach-o // Prints the symbol table (symbol names) of a Mach-O executable
hexdump -C /path/to/mach-o // Shows the raw bytes of a Mach-O executable in hexadecimal
otool -t /path/to/mach-o // Outputs the raw (hexadecimal) contents of the "__TEXT,__text" section of a Mach-O executable (compare this with the hexdump -C output)
otool -tV /path/to/mach-o // Outputs the disassembled (human-readable-ish) "__TEXT,__text" section of a Mach-O executable
But if you really want to understand Mach-O in depth, I suggest downloading Hopper from https://www.hopperapp.com, which is good at showing you the bytes of a Mach-O binary and what they are for. Then have a look at https://www3.nd.edu/~dthain/courses/cse40243/fall2015/intel-intro.html, which will help you understand the compiled (assembly) code and how to read it.
For example:
1. Open Hopper, drag and drop the Mach-O executable into it, then wait for it to load.
2. Run "otool -tV /path/to/mach-o" in Terminal.app.
You can then compare Hopper's output with the Terminal output and begin to piece the differences together. After that, open the website I provided and learn what the instructions in the output do.
I hope this helps a little and gets you started on your search for knowledge.
You're welcome.
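To get a quick overview of which segments and sections a binary contains, the load commands can also be walked by hand. The following is a minimal Python sketch, not a replacement for otool or Hopper. It assumes a thin (non-universal), little-endian, 64-bit Mach-O file, and the function name list_sections is made up for illustration. A universal (fat) binary would have to be thinned first, for example with lipo.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic number of a 64-bit Mach-O header
LC_SEGMENT_64 = 0x19       # load command describing a 64-bit segment


def _name(raw):
    """Decode a fixed-width, NUL-padded 16-byte name field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", "replace")


def list_sections(data):
    """Return [(segname, sectname, file_offset, size), ...].

    Assumption: `data` holds a thin, little-endian, 64-bit Mach-O file.
    """
    magic, _cpu, _sub, _ftype, ncmds, _szcmds, _flags, _res = struct.unpack_from(
        "<8I", data, 0
    )
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")

    sections = []
    pos = 32  # the load commands start right after the 32-byte header
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, pos)
        if cmd == LC_SEGMENT_64:
            # nsects sits at byte 64 of the 72-byte segment_command_64
            nsects = struct.unpack_from("<I", data, pos + 64)[0]
            sect_pos = pos + 72
            for _ in range(nsects):
                sectname, segname, _addr, size, fileoff = struct.unpack_from(
                    "<16s16sQQI", data, sect_pos
                )
                sections.append((_name(segname), _name(sectname), fileoff, size))
                sect_pos += 80  # size of one section_64 entry
        pos += cmdsize
    return sections


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        for seg, sect, off, size in list_sections(f.read()):
            print(f"{seg},{sect}  offset={off}  size={size}")
```

Run it with the path to the executable as the only argument. Note that the section names only say what kind of data is stored (code, C strings, Objective-C metadata and so on); they will not point directly at "the drawing code".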

Related

How to determine if an ELF file is a Go ELF file?

I need to determine whether a given ELF file originated from Go. According to this link:
$ readelf -a traefik.stripped | grep "\.note\.go\.buildid"
Is this in any way inferior to go's native way:
$ go tool buildid traefik.stripped
oPIWoPjqt1P3rttpA3ee/ByNXPhvgS37nIoJY-HYB/8b25JYXrgktA-FYgU5MU/0Posfq41xZW9BEPEG4Ub
Are both methods guaranteed to work on stripped binaries?
"I need to determine whether a given ELF file originated from Go"
That is impossible to do in general. What is and isn't a Go binary is not well defined, and a sufficiently optimized Go binary may end up containing just a few instructions. E.g. on x86_64, you may end up with a single HLT instruction.
"how come strip itself doesn't remove this section?"
This section (indeed every section) is not necessary for execution -- you can remove all sections, and the binary will still work.
This section is present only to help developers identify a particular build. strip doesn't remove it by default because that would defeat the purpose of this section, but it certainly can do so.
"can an innocent go developer build a golang ELF and accidentally remove this (redundant??) section"
Sure. The developer can run a broken version of strip, or he can have aliased strip with strip --strip-all, or he could have used some other ELF post-processing tool, or he could have used UPX, or ...
The mentioned section is a NOTE section:
$ readelf -a traefik.stripped | grep "\.note\.go\.buildid" | sed -n "1,1p"
[11] .note.go.buildid NOTE 0000000000400f9c 00000f9c
And apparently NOTE sections might sometimes be removed for size reductions (related):
objcopy --remove-section=.note.go.buildid traefik.stripped traefik.super.stripped
Removing the mentioned section does not seem to harm the integrity of the whole binary
With the standard Go tools the section should be present, but the Go origin of a binary can be hidden without any malicious intent. For example, using UPX to reduce the size of the binary will completely hide its Go nature, since UPX works with binaries from any language.
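The readelf check above can be approximated without external tools. This is a minimal Python sketch under stated assumptions: it only handles little-endian ELF64 files, and the function names elf_section_names and has_go_buildid_note are invented for illustration. As the answers point out, a missing section does not prove that the binary was not built with Go.

```python
import struct


def elf_section_names(data):
    """Return the section names of a little-endian ELF64 image in `data`."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")

    e_shoff = struct.unpack_from("<Q", data, 40)[0]
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<3H", data, 58)
    if e_shoff == 0 or e_shnum == 0:
        return []  # the section header table has been removed entirely

    def header(index):
        # First six fields: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        return struct.unpack_from("<IIQQQQ", data, e_shoff + index * e_shentsize)

    # The section-name string table is itself a section (index e_shstrndx).
    _, _, _, _, str_off, str_size = header(e_shstrndx)
    strtab = data[str_off:str_off + str_size]

    names = []
    for i in range(e_shnum):
        name_off = header(i)[0]
        end = strtab.find(b"\x00", name_off)
        if end < 0:
            end = len(strtab)
        names.append(strtab[name_off:end].decode("ascii", "replace"))
    return names


def has_go_buildid_note(data):
    """True if a section named .note.go.buildid is present."""
    return ".note.go.buildid" in elf_section_names(data)
```

A typical call would be has_go_buildid_note(open("traefik.stripped", "rb").read()). A False result only means the section is absent, which can happen after objcopy --remove-section or UPX packing.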

Embedding accessible text in binary - post-compilation

I have a rather weird question but I don't really know how to put it or where to start looking from.
My question is not about "embedding" a text file (we already have at compile time) - that is too obvious.
My question is if (and how) I could let's say "package" an existing (created by C) binary with a text file and generate a new... working binary with access to that file.
I'm a Mac user. I know that could work with an .app package and all that. But that's still not what I want. I want to be able to "tweak" an existing binary, add some (accessible - how?) additional text data to it, and the binary remaining absolutely functional.
Is that even possible?
P.S. The only serious tool I've looked into is bsdiff and bspatch but I'm not really sure it's what I'm looking for.
You can definitely do this, but the exact procedure is going to be different for every platform, with a few commonalities. Your tool of choice here is probably going to be llvm-objcopy.
At a high level, you will create a special segment or section in the binary (or both as in the case of MachO) containing the data you want, and then you'll have to parse your own executable to retrieve it. Since you said you're on a Mac, we can start there as an example.
Create the dumbest possible test program as a starting point:
test.c
#include <stdio.h>

int main(int argc, char **argv)
{
    printf("I'm a binary!\n");
    return 0;
}
Compile and run it:
prompt$ clang -o test test.c
prompt$ ./test
I'm a binary!
Now create a text file hello.txt and put some text in it:
Hello world!
You can embed this into the MachO file with llvm-objcopy:
llvm-objcopy --add-section __MAGIC,__magic_section=hello.txt test test
Then check that it still runs:
prompt$ ./test
I'm a binary!
You can verify that the section has been added using otool -l. You can also run strings on the binary, and you should see your text:
prompt$ strings ./test
I'm a binary!
Hello world!
Now you have to be able to retrieve this data. If you were compiling everything in a priori, it would be easy since the linker would make symbols for you marking the start and end of the __magic_section section that you added.
Since you specifically said this has to be an a posteriori step, you're going to have to actually self-parse the MachO file to look up the __magic_section section in the __MAGIC segment that you added. You can find a few different references for parsing MachO files, but you probably also want to make use of built in dyld functionality. See Apple's Reference on dyld utility calls that can for example give you the Mach header for the running process. Linux has similar functionality by way of dl_iterate_phdr.
Once you know where the section is in your original binary, you can retrieve the text.
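The lookup step can be sketched as follows. This is only an illustration in Python that reads the file on disk rather than the loaded image. It assumes a thin, little-endian, 64-bit Mach-O file, and read_section is a hypothetical helper name. A C program would do the same walk over its own header, or use the dyld utilities mentioned above.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic number of a 64-bit Mach-O header
LC_SEGMENT_64 = 0x19       # load command describing a 64-bit segment


def read_section(data, segname, sectname):
    """Return the bytes of section (segname, sectname), or None if absent.

    Assumption: `data` is a thin, little-endian, 64-bit Mach-O file.
    """
    want = (segname.encode("ascii"), sectname.encode("ascii"))
    # magic is at offset 0, ncmds at offset 16 of the 32-byte header
    magic, ncmds = struct.unpack_from("<I12xI", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")

    pos = 32
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, pos)
        if cmd == LC_SEGMENT_64:
            nsects = struct.unpack_from("<I", data, pos + 64)[0]
            sect_pos = pos + 72  # section entries follow the segment command
            for _ in range(nsects):
                sect, seg, _addr, size, fileoff = struct.unpack_from(
                    "<16s16sQQI", data, sect_pos
                )
                if (seg.rstrip(b"\x00"), sect.rstrip(b"\x00")) == want:
                    return data[fileoff:fileoff + size]
                sect_pos += 80
        pos += cmdsize
    return None


if __name__ == "__main__":
    with open("test", "rb") as f:
        text = read_section(f.read(), "__MAGIC", "__magic_section")
    print(text.decode() if text is not None else "section not found")
```

With the test binary from above in the current directory, the script should print the contents of hello.txt if the section was added as described.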
To repeat all of this on Linux, you will do pretty much the same thing, but you'll be working with the ELF file format instead of MachO. The same principles would apply though.
As a side note: this is similar to how code signing works on macOS. A signature is generated and embedded in a dedicated region of the binary (located through the LC_CODE_SIGNATURE load command) to be read by the system on launch.

Rust library for inspecting .rlib binaries

I'm looking for a way to load and inspect .rlib binaries generated by rustc. I've hunted around the standard library without much luck. My assumption is that an .rlib contains all the type information necessary to statically type check programs that "extern crate" it. rustc::metadata is where my hunt ended. I can't quite figure out if the structures available at this point in the compiler are intended as entry points for users, or if they are solely intermediate abstractions depending on a chain of previously initialized data.
Alternatively, If there's a way to dump an .rlib to stdout in a parsable form then that's also fantastic. I tried /usr/bin/nm, but it seemed to be excluding function type signatures. Maybe I'm missing something.
Anyways, I'm working on an editor utility for emacs that I hope at some point will provide contextually relevant information such as available methods, module items and their types, etc. I'd really appreciate any hints anyone has.
The .rlib file is an ar archive file. You can use readelf to read its contents.
Try readelf -s <your_lib>.rlib. The type names may be mangled/decorated by the compiler, so they may not be exactly the same as in the .rs file.
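Because an .rlib is an ar archive, its member list can be read with a few lines of code. The following Python sketch assumes the common Unix ar layout (an 8-byte magic followed by 60-byte member headers). The function name ar_members is made up, and GNU-style long-name references such as "/123" are reported as stored rather than resolved. It only lists members; it does not decode the Rust metadata itself.

```python
def ar_members(data):
    """Return [(name, size), ...] for a Unix ar archive held in `data`."""
    if data[:8] != b"!<arch>\n":
        raise ValueError("not an ar archive")

    members = []
    pos = 8
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        # Header layout: name[16] mtime[12] uid[6] gid[6] mode[8] size[10] magic[2]
        name = header[:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        body = pos + 60
        if name.startswith("#1/"):
            # BSD-style long name: the real name is stored at the start of the data
            length = int(name[3:])
            raw = data[body:body + length].rstrip(b"\x00")
            name = raw.decode("utf-8", "replace")
        members.append((name, size))
        pos = body + size + (size & 1)  # member data is padded to an even offset
    return members


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        for name, size in ar_members(f.read()):
            print(f"{size:>10}  {name}")
```

Run it with the path to the .rlib as the only argument to see which object and metadata members the archive holds.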

How can I add sections to an existing (OS X) executable?

Is there any way of adding sections to an already-linked executable?
I'm trying to code-sign an OS X executable, based on the Apple instructions. These include the instruction to create a suitable section in the binary to be signed, by adding arguments to the linker options:
-sectcreate __TEXT __info_plist Info.plist_path
But: The executable I'm trying to sign is produced using Racket (a Scheme implementation), which assembles a standalone executable from Racket/scheme code by cloning the 'real' racket executable and editing the Mach-O file directly.
So the question is: is there a way I can further edit this executable, to add the section which is required for the code-signing?
Using ld doesn't work when used in the obvious way:
% ld -arch i386 -sectcreate __TEXT __info_plist ./hello.txt racket-executable
ld: in racket-executable, can't link with a main executable
%
That seems fair enough, I suppose. Libtool doesn't have any likely-looking options, and neither does the redo_prebinding command (which is at least a command for editing executables).
The two possibilities suggested by the relevant Racket list were (i) to extend the racket compilation tool to adjust the surgery which is done on the executable (feasible, but scary), or (ii) to create a custom racket executable which has the desired section already in place. Both seem like sledgehammer-and-nut solutions. The macosx-dev list didn't come up with any suggestions.
I think this is infeasible.
There appear to be no stock commands which edit a Mach-O object file (which includes executables). The otool command allows you to view the structure of such a file (use otool -l), but that's about it.
The structure of a Mach-O object file is documented on the Apple reference site. In summary, a Mach-O object file has the following structure:
a header, followed by
a sequence of 'load commands', followed by
a sequence of 'segments' (some of the load commands are responsible for pointing to the offsets of the segments within the file)
The segments contain zero or more 'sections'. The header and load commands are deemed to be in the first segment of the file, before any of that segment's sections. There are a couple of dozen documented load commands, plus others which are defined in the relevant header file but not documented, and which are therefore presumably private.
Adding a section would imply changing the length of a segment. Unless the section were very small, this would require pushing the following segment further into the file. That's a no-no, because a lot of the load commands refer to data within the file with absolute offsets from the beginning of the file (as opposed to, say, the beginning of the segment or section which contains them), so that relocating the segments in a putative Mach-O editor would involve patching up a large number of offsets. That doesn't look easy.
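To make the point about absolute offsets concrete, the sketch below walks the load commands and collects a few of the file offsets they store. It is a minimal Python illustration assuming a thin, little-endian, 64-bit Mach-O file; the function name absolute_file_offsets is invented, and only two of the many load command types (LC_SEGMENT_64 and LC_SYMTAB) are inspected.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # magic number of a 64-bit Mach-O header
LC_SYMTAB = 0x2            # load command locating the symbol and string tables
LC_SEGMENT_64 = 0x19       # load command describing a 64-bit segment


def absolute_file_offsets(data):
    """Return [(description, file_offset), ...] found in the load commands.

    Assumption: `data` is a thin, little-endian, 64-bit Mach-O file.
    """
    magic, ncmds, _sizeofcmds = struct.unpack_from("<I12x2I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")

    found = []
    pos = 32  # header, then the load commands back to back
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, pos)
        if cmd == LC_SEGMENT_64:
            segname = data[pos + 8:pos + 24].rstrip(b"\x00").decode("ascii", "replace")
            fileoff = struct.unpack_from("<Q", data, pos + 40)[0]
            found.append((f"segment {segname}", fileoff))
        elif cmd == LC_SYMTAB:
            symoff, _nsyms, stroff = struct.unpack_from("<3I", data, pos + 8)
            found.append(("symbol table", symoff))
            found.append(("string table", stroff))
        pos += cmdsize
    return found
```

Every value returned is counted from the start of the file, so growing an early segment would mean rewriting all later ones, which is the patching work described above.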
There are one or two Mach-O file viewers on the web, but none which do much that's different from otool, as far as I can see (it's not particularly hard: I wrote most of one this afternoon whilst trying to understand the format). There's at least one Mach-O editor, but it didn't seem to do anything that lipo didn't do (called 'moatool', but the source appears to have disappeared from google code).
Oh well. Time to find a Plan B, I suppose.
The gimmedebugah tool is able to modify the __TEXT,__info_plist section of an existing binary. See https://reverse.put.as/2013/05/28/gimmedebugah-how-to-embedded-a-info-plist-into-arbitrary-binaries/
It is available here: https://github.com/gdbinit/gimmedebugah

Clarification on Binary file (PE/COFF & ELF) formats & terminology

I'm a little confused about the terminology.
A file that is given as input to the linker is called Object File.
The linker produces an Image file, which in turn is used as input by the loader.
I got this from "MS PE & COFF Specification"
Q1. Image file is also referred to as Binary Image, Binary File or just Binary. Right?
Q2. So, according to the terminology above, PE/ELF/COFF are formats of image files and not of object files, right? But http://www.sco.com/developers/gabi/latest/ch4.intro.html says:
This chapter describes the object file format, called ELF (Executable and Linking Format). There are three main types of object files.
A relocatable file holds code and data suitable for linking with other object files to create an executable or a shared object file.
An executable file holds a program suitable for execution; the file specifies how exec(BA_OS) creates a program's process image.
A shared object file holds code and data suitable for linking in two contexts. First, the link editor [see ld(BA_OS)] processes the shared object file with other relocatable and shared object files to create another object file. Second, the dynamic linker combines it with an executable file and other shared objects to create a process image.
This seems contradictory: it says that both object files and image files are ELF files, and it does not differentiate between object and image files but refers to all of them as object files. Isn't that wrong?
Q3. I know that PE is derived from COFF. But why is Microsoft's specification of the PE format named "Microsoft Portable Executable and Common Object File Format Specification"? Do they still support COFF? If so, in which OS? I thought PE completely replaced COFF long ago.
I'm the OP. Everyone's answer is a partial answer, so I'm combining the other answers with what I learnt to complete the answer.
This is the "Generally" used terminology.
A file that is given as input to the linker (output of assembler) is called Object File or Relocatable File.
The linker produces an Image file, which in turn is used as input by the loader. Now, an Image file can either be an Executable file or Library file. These 'Library files' are of two kinds:
Static Library (*.lib files for windows. *.a for linux)
Shared/Dynamic libraries : DLL ( *.dll on windows) & Shared Object file( *.so in Linux)
The term Binary File / Binary can be used to refer to either an ObjectFile or an ImageFile. Undestand depending up on the context. It is a very general term.
Loader when loads the image file into memory. Then it is called Module (I'm not sure about Linux guys, but windows guys call it Module
http://www.gliffy.com/pubdoc/1978433/L.jpg
As I said, these are "Generally" used terminology. There are no strict definitions for the terms 'binary file', 'image file', or 'object file'.
Particularly the term 'object file' might sometimes be used to mean an intermediate file output by the compiler for use by the linker, but in another context might mean an executable file.
Especially on different platforms they might be used to refer to different or similar things. Even when discussing issues on a single platform, one writer might use the terms somewhat differently than another.
On Windows, object files are in COFF format and image files are in PE format (which is built on COFF); on Linux, both object files and image files are in ELF format.
ELF is not only the format of the image file but is also the format of the object file.
Every ELF file starts with an ELF header. The second field of an ELF header is e_type; this field lets us know whether the file is an object file (aka a relocatable in ELF parlance), an image (which can be either an executable or a shared object), or something else (core files are also ELF files).
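Reading e_type takes only a few bytes. The following Python sketch is an illustration rather than a full ELF parser: the function name elf_type and the descriptive strings are made up, while the numeric values are the standard ET_* constants from the ELF specification.

```python
import struct

# Standard e_type values from the ELF specification
ET_NAMES = {
    0: "ET_NONE (no file type)",
    1: "ET_REL (relocatable object file)",
    2: "ET_EXEC (executable file)",
    3: "ET_DYN (shared object, also used for position-independent executables)",
    4: "ET_CORE (core file)",
}


def elf_type(data):
    """Describe the e_type field of the ELF image held in `data` (bytes)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5) gives the byte order: 1 = little, 2 = big
    order = "<" if data[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(order + "H", data, 16)
    return ET_NAMES.get(e_type, hex(e_type))


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(elf_type(f.read(18)))
```

Running it on a .o file produced by gcc -c should report ET_REL, which matches the "relocatable" wording printed by the file command.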
I don't know if there is any bit in header that differentiates an Object file from Image file. It needs to be checked.
"I know that PE is derived from COFF. But why is Microsoft's specification of the PE format named 'Microsoft Portable Executable and Common Object File Format Specification'? Do they still support COFF? If so, in which OS? I thought PE completely replaced COFF long ago."
As far as "PE" vs "COFF", my recollection is that Microsoft used the "COFF" specification as the starting point for the "PE" specification but extended it for their needs. So strictly speaking a "PE" file isn't a "COFF" file, but it's very similar in many ways.
gcc -c will produce a .o file, which is an ELF-format object file, on a Linux system. "ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV)" is how a .o file is described by the file command on my machine.
In regards to Q2 for ELF, ELF is not only the format of the image file but is also the format of the object file.
Every ELF file starts with an ELF header. The second field of an ELF header is e_type; this field lets us know whether the file is an object file (aka a relocatable in ELF parlance), an image (which can be either an executable or a shared object), or something else (core files are also ELF files).
As an aside, I know core dumps on Solaris (and I am guessing other Unix flavors) can be in ELF format.

Resources