How to inspect a MacOS executable file (Mach-O)? - macos

How to see some basic information about a MacOS binary file, specifically what was used to build it, which frameworks it is linked against and what system calls it is using?
I tried nm and otool -L but their output is only partially useful.
For example, what would indicate that a binary was built with xcode or golang compiler?
NB. I am not interested in reverse engineering macos binaries. I just want to know better what is running on my system using just the tools that are already included OOTB.

The compiler used is not a trivial question. Some compilers embed information about themselves, and you sometimes find that by running strings, but it's not guaranteed. That said, the answer on Mac is almost always clang, so it's not usually that hard to get the basics. As an example:
strings iTunes | grep clang
COMPILER=clang-9.0.0
But that's just luck (and might not even be accurate).
IDA Pro does a good job at this, and is the gold standard for reverse engineering work. If this kind of thing is important to you, then IDA Pro is the tool. It's expensive.
To get a list of frameworks that it links at runtime, otool -L is the tool you want. I don't understand what you mean by "isn't very readable." It just prints out all the frameworks, one per line. It's hard to imagine anything more readable than that. What are you looking for here?
otool -L will not tell you what static libraries were used, so "which frameworks it was built against" may be too broad to answer. You can generally find the symbols for well-known static libraries (OpenSSL for instance), but there is no easy way to know precisely what was included in a binary, particularly if debugging information is stripped. (If there's debugging information, then the filepaths will tend to be available, which can tell you a lot more about how it was built.)
There is no easy static way to get a list of all system calls, since those may be buried in libraries. nm is generally what you want, though. It includes a lot of things that aren't "system" calls, though, since it's going to include every symbol linked from external sources (you probably want something like nm -ju). Again, I'm not sure what you mean by "isn't very readable." It's one symbol per line.
In order to get the list of system calls at runtime, run the application with dtruss. It'll output every system call as it's made, and will focus on actual "system" calls (i.e. syscall).

Related

gdb, how to step into c runtime? Where is crt_c.c?

When I'm stepping into debugged program, it says that it can't find crt/crt_c.c file. I have sources of gcc 6.3.0 downloaded, but where is crt_c.c in there?
Also how can I find source code for printf and rand in there? I'd like to step through them in debugger.
Ide is codeblocks, if that's important.
Edit: I'm trying to do so because I'm trying to decrease size of my executable. Going straight into freestanding leaves me with a lot of missing functions, so I intend to study and replace them one by one. I'm trying to do that to make my program a little smaller and faster, and to be able to study assembly output a bit easier.
Also, forgot to mention, I'm on windows, msys2. But answer is still helpful.
How can I find source code for printf and rand in there?
They (printf, rand, etc....) are part of your C standard library which (on Linux) is outside of the GCC compiler. But crt0 is provided by GCC (however, is often not compiled with debug information) and some C files there are generated in the build tree during compilation of GCC.
(on Windows, most of the C standard library is proprietary -inside some DLL provided by MicroSoft- and you are probably forbidden to look into the implementation or to reverse-engineer it; AFAIK EU laws might mention some exception related to interoperability¸ but then you need to consult a lawyer and I am not a lawyer)
Look into GNU glibc (or perhaps musl-libc) if you want to study its source code. libc is generally using system calls (listed in syscalls(2)) provided by the Linux kernel.
I'd like to step through them in debugger.
In practice you won't be able to do that easily, because the libc is provided by your distribution and has generally been compiled without debug information in DWARF format.
Some Linux distributions provide a debuggable variant of libc, perhaps as some libc6-dbg package.
(your question lacks motivation and smells like some XY problem)
I intend to study and replace them one by one.
This is very unrealistic (particularly on Windows, whose system call interface is not well documented) and could take you many years (or perhaps more than a lifetime). Do you have that much time?
Read also Operating Systems: Three Easy Pieces and look into OsDev wiki.
I'm trying to do so because I'm trying to decrease size of my executable.
Wrong approach. A debugger needs debug info (e.g. in DWARF) which will increase the size of the executable (but could later be stripped). BTW standard C functions are in some common shared library (or DLL on Windows) which is used by many processes.
I'm on windows, msys2.
Bad choice. Windows is proprietary. Linux is made of free software (more than ten billions lines of source code, if you consider all useful packages inside a typical Linux distribution), whose source code you could study (even if it would take several lifetimes).

Win32: Get info on statically linked libraries

Assuming you only have access to the final product (i.e. in form of the exe file), how would you go about finding out which libraries/components the developer used to create the application?
In my specific case the question is about an application developed in VC++ using a few third party components and I'm curious which those are.
But I think the question is generally valid, e.g. when it should be proven if a developer is in line with license requirements of a specific library.
So, what you're saying is that if I suspect that a binary is using a certain library, I could try to map the respective function calls and see if I get a result. But there is no shortcut to this and unless I am willing to try out hundreds of mappings or the dev left some information in some strings or other resources, I have little chance of finding this out. Yes?
There is small shortcut, here's what I'd do:
check executable for strings and constants, and try to find out what library is that.
IF used libraries are open-source, compile them on my own and create FLAIR signatures (IDA Pro).
Use generated flair signatures on target executable.
In some situations, that can really work like a charm and can let you distinguish actual code from used libraries.
The IDA Pro Book - Ch 12. Library Recognition Using FLIRT Signatures

What is the difference between "binary install" and "compile and install from source"? Which is better?

I want to install a driver for Ros (robot operating system), and I have two options the binary install and the compile and install from source. I would like to know which installation is better, and what are the advantages and disadvantages of each one.
Source: AKA sourcecode, usually in some sort of tarball or zip file. This is RAW programming language code. You need some sort of compiler (javac for java, gcc for c++, etc.) to create the executable that your computer then runs.
Advantages:
You can see what the source code is which means....
You can edit the end result program to behave differently
Depending on what you're doing, when you compile, you could enable certain optimizations that will work on your machine and ONLY your machine (or one EXACTLY like it). For instance, for some sort of gfx rendering software, you could compile it to enable GPU support, which would increase the rendering speed.
You can create a version of an application for a different OS/Chipset (see Binary below)
Disadvantages:
You have to have your compiler installed
You need to manually install all required libraries, which frequently also need to be compiled (and THEIR libraries need to be installed, etc.) This can easily turn a quick 30-second command into a multi-hour project.
There are any number of things that could go wrong, and if you're not familiar with what the various errors mean, finding support online could be quite difficult.
Binary: This is the actual program that runs. This is the executable that gets created when you compile from source. They typically have all necessary libraries built into them, or install/deploy them as necessary (depending on how the application was written).
Advantages:
It's ready-to-run. If you have a binary designed for your processor and operating system, then chances are you can run the program and everything will work the first time.
Less configuration. You don't have to set up a whole bunch of configuration options to use the program; it just uses a generic default configuration.
If something goes wrong, it should be a little easier to find help online, since the binary is pre-compiled....other people may be using it, which means you are using the EXACT same program as them, not one optimized for your system.
Disadvantages:
You can't see/edit the source code, so you can't get optimizations, or tweak it for your specific application. Additionally, you don't really know what the program is going to do, so there could be nasty surprises waiting for you (this is why Antivirus is useful....although LESS necessary on a linux system).
Your system must be compatible with the Binary. For instance, you can't run a 64-bit application on a 32-bit operating system. You can't run an Intel binary for OS X on an older PowerPC-based G5 Mac.
In summary, which one is "better" is up to you. Only you can decide which one will be necessary for whatever it is you're trying to do. In most cases, using the binary is going to be just fine, and give you the least trouble. Sometimes, though, it is nice to have the source available, if only as documentation.

Do DLLs built with Rust require libgcc.dll on run time?

If I build a DLL with Rust language, does it require libgcc*.dll to be present on run time?
On one hand:
I've seen a post somewhere on the Internet, claiming that yes it does;
rustc.exe has libgcc_s_dw2-1.dll in its directory, and cargo.exe won't run without the dll when downloaded from the http://crates.io website;
On the other hand:
I've seen articles about building toy OS kernels in Rust, so they most certainly don't require libgcc dynamic library to be present.
So, I'm confused. What's the definite answer?
Rust provides two main toolchains for Windows: x86_64-pc-windows-gnu and x86_64-pc-windows-msvc.
The -gnu toolchain includes an msys environment and uses GCC's ld.exe to link object files. This toolchain requires libgcc*.dll to be present at runtime. The main advantage of this toolchain is that it allows you to link against other msys provided libraries which can make it easier to link with certain C\C++ libraries that are difficult to under the normal Windows environment.
The -msvc toolchain uses the standard, native Windows development tools (either a Windows SDK install or a Visual Studio install). This toolchain does not use libgcc*.dll at either compile or runtime. Since this toolchain uses the normal windows linker, you are free to link against any normal Windows native libraries.
If you need to target 32-bit Windows, i686- variants of both of these toolchains are available.
NOTE: below answer summarizes situation as of Sep'2014; I'm not aware if it's still current, or if things have changed to better or worse since then. But I strongly suspect things have changed, given that 2 years have already passed since then. It would be cool if somebody tried to ask steveklabnik about it again, then update below info, or write a new, fresher answer!
Quick & raw transcript of a Rust IRC chat with steveklabnik, who gave me a kind of answer:
Hi; I have a question: if I build a DLL with Rust, does it require libgcc*.dll to be present on run time? (on Windows)
I believe that if you use the standard library, then it does require it;
IIRC we depend on one symbol from it;
but I am unsure.
How can I avoid using the standard library, or those parts of it that do? (and/or do you know which symbol exactly?)
It involves #[no_std] at your crate root; I think the unsafe guide has more.
Running nm -D | grep gcc shows me __gc_personality_v0, and then there is this: What is __gxx_personality_v0 for?,
so it looks like our stack unwinding implementation depends on that.
I seem to recall I've seen some RFCs to the effect of splitting standard library, too; are there parts I can use without pulling libgcc in?
Yes, libcore doesn't require any of that.
You give up libstd.
Also, quoting parts of the unsafe guide:
The core library (libcore) has very few dependencies and is much more portable than the standard library (libstd) itself. Additionally, the core library has most of the necessary functionality for writing idiomatic and effective Rust code. (...)
Further libraries, such as liballoc, add functionality to libcore which make other platform-specific assumptions, but continue to be more portable than the standard library itself.
And fragment of the current docs for unwind module:
Currently Rust uses unwind runtime provided by libgcc.
(The transcript was edited slightly for readability. Still, I'll happily delete this answer if anyone provides something better formatted and more thorough!)

How to determine development tools used to make a Windows application?

I've got a working proprietary application (windows exe) and would like to know which particular toolkit was used to make it. The reason is that I like the widgets it uses and seek to use same library in my project (to buy it if it's proprietary as well).
Just use Process Explorer to see what DLLs the application has loaded. That will be your widget set. Sort the results by folder to roughly group them by manufacturer. You may need to examine the properties of the DLLs for more detailed info as well.
If the library is statically linked you may have to do some deep looking around, maybe you'll get lucky and find a string saying the name of the library or a class/function in it. You can use OllyDbg for this to view strings loaded at runtime, or something like the linux command strings to look through statically, although that wont work if the program decodes itself at startup. If that doesn't work, you'd have to come up with a list of libraries that do what the one you are looking at does, and find some artifacts in the binary that are common between the two. Anyways, better to check the dlls first like Paul Sasik said.
You can use PEiD to identify the compiler, which can be a hint aswel. PEiD also has a nice process explorer.
For instance, Google Chrome uses C:\WINDOWS\SYSTEM32\IEFRAME.DLL :-) Nice isn't it?
(Don't trust it 100%. For instance, my own compiler has the "Morphine 1.2 - 1.3 -> rootkit" description, which I find quite awkward: that's a packer/compiler trace obfuscator.)

Resources