Relationship between (PE <> winnt.h ) and (ELF <> elf32.h) - compilation

What I know:
winnt.h contains the structures and definitions of the PE file and its components (Windows), and elf32.h contains the structures and definitions of the ELF file and each of its components (Linux).
What I think (understanding/observation):
I understand that winnt.h not only contains the PE structures and definitions but also various macros and types (for Windows NT), and that it is a child header file of windows.h (so, based on my understanding, winnt.h has another important application: providing the Win API etc.). However, based on my observation, elf32.h only contains the definitions and structures of the ELF file (and nothing more than that).
My Question: what is the application/functionality of winnt.h when it comes to compiling/interpreting/executing a PE file?
I understand that winnt.h has another application (providing Win API/macros/etc.) and is a perfect guide for understanding/dismantling a PE file, but how is this file specifically used by the OS?
Does the compiler use it to build the PE file from source?
Does the OS use it to interpret the PE file?
*And the same question for elf32.h (or elf64.h) and the ELF file.
Any answer is much appreciated.

Related

Where I can find information about windows linker?

I need to make hand-crafted object files and then link them with each other under Windows. So what are the Windows alternatives to Linux .o files, the GNU linker, and nm?
I've already read http://www.lurklurk.org/linkers/linkers.html. It's a great article, but its analysis of object files covers only Linux. I want to know how translated sources and names become an object file, and then an executable file.

What is the format for debug info in Windows obj files?

I'm messing around with compilers, .obj files, assembly, etc. The .obj file contains info that eventually ends up in the PDB, but I can't find any reference to the format that's used within the debug sections of the .obj file. (I have, however, found a reference to the COFF file format -- so I already know about that).
So: What's the format of the .debug$S and .debug$T sections when the source C file is compiled with the /Zi flag?
This information isn't published (the format used for native PDBs). If you can link the object file into an executable using "link" there are windows "debugging apis" you can use to "interrogate" symbols in an image. However, the format used for object files is not made publicly available.
You could try and reverse engineer it. If you find any info, please share it.

What sections are not loaded by the PE loader?

Are any sections at all not loaded by the PE loader? Or is every section specified in the section headers loaded? In ELF programs, the program headers (also called segments) that are supposed to be loaded are those of type PT_LOAD. Is there anything similar to that in PE programs?
PS. I found the flag IMAGE_SCN_MEM_DISCARDABLE. Are sections flagged with that not loaded?
When a relocation section is available, but the PE image does not need to be relocated, the loader does not load the relocation section. If a PE image has been digitally signed, the file also contains the certificate data. This data is not part of any section: it is located through the certificate table data directory entry, which holds a file offset rather than an RVA, and it is not loaded by the loader. Additionally, if a debug section is available, this is also not loaded by the loader.
Well, DOS Stub is not a Section!
As a general rule, some parts of the PE file can be read but not mapped in memory (like relocations), and some parts are not mapped at all. Debugging information at the end of the file is one example.
Usually, data placed at the end of the file, past all the parts that are meant to be mapped, is not mapped in memory.
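To see which sections carry the IMAGE_SCN_MEM_DISCARDABLE flag mentioned in the question, you can walk the section table yourself. Below is a minimal sketch in Python, assuming the whole file has been read into memory; the function name list_sections is made up for illustration, and the field layouts follow the IMAGE_FILE_HEADER and IMAGE_SECTION_HEADER structures from winnt.h. It only reports the flag; it says nothing about what a particular loader does with it.

```python
import struct

IMAGE_SCN_MEM_DISCARDABLE = 0x02000000  # value from winnt.h


def list_sections(data: bytes):
    """Return (name, characteristics, discardable) for each section header.

    Hypothetical helper for illustration; expects the raw bytes of a PE file.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C of the DOS header) points at the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte IMAGE_FILE_HEADER follows the 4-byte signature.
    (_machine, n_sections, _timestamp, _sym_ptr, _n_syms,
     opt_size, _characteristics) = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4)
    # The section table starts right after the optional header.
    table = e_lfanew + 4 + 20 + opt_size
    sections = []
    for i in range(n_sections):
        # IMAGE_SECTION_HEADER is 40 bytes; Characteristics is the last field.
        fields = struct.unpack_from("<8sIIIIIIHHI", data, table + 40 * i)
        name = fields[0].rstrip(b"\0").decode("ascii", "replace")
        flags = fields[9]
        sections.append((name, flags, bool(flags & IMAGE_SCN_MEM_DISCARDABLE)))
    return sections
```

You would call it as list_sections(open(path, "rb").read()) and look for entries whose third element is True; .reloc is a typical candidate.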

How to distinguish user-defined / library functions from a compiled file?

EDIT:
What I want is to distinguish statically linked library functions and user self-written functions within a compiled file (e.g. PE file).
How to achieve that? (I am thinking of database comparison but I do not know any database.)
By the way, (I have already known long before I asked this question) for dynamically linked library functions, they are just an entry in the import table (of PE).
By library functions, I mean those defined in libraries, such as STL (I know this is a bad name).
By user-defined functions, I mean those written by individual programmers.
Is there any programmatic way to achieve this goal?
Right now I am thinking about comparing binaries with a database, but I do not know any database so far.
Please recommend a database or a different way as an answer. Thank you.
This answer assumes you want to analyze a standard Windows executable that is dynamically linked against other libraries (import .lib files and their associated .dll files, not statically linked). If that is the case, you want to interpret the PE (Portable Executable) file structure.
Here's a good article to get you started, with sample code on dumping the PE header.
You will want to focus on the Import table (.idata section) for external library calls, and the Export table (.edata section) for calls defined inside the executable and marked as exportable (usually this only exists in .dll files).
For static libraries, the .lib file is an archive of object files in the COFF format, and there is the DUMPBIN utility that ships with Visual Studio that you can use to quickly peer into your lib files and even dump the disassembly of the code if you wanted.
The DUMPBIN utility, which is provided with the 32-bit version of
Microsoft Visual C++, combines the abilities of the LINK, LIB, and
EXEHDR utilities. The combination of these tools features the ability
to provide information about the format and symbols provided in
executable, library, and DLL files.
For information on the structure of COFF files, see this article.
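If DUMPBIN is not at hand, the container around the object files in a static library is simple enough to walk by hand. Here is a rough sketch in Python, assuming the library uses the common ar-style archive layout (an "!<arch>\n" signature followed by 60-byte member headers); the function name list_archive_members is made up for illustration. It returns the raw member names, so special entries such as the linker members ("/") and the long-names table ("//") show up as-is.

```python
AR_MAGIC = b"!<arch>\n"


def list_archive_members(data: bytes):
    """Return (raw_name, size) for each member of an ar-style archive.

    Hypothetical helper for illustration; expects the raw bytes of a
    .lib or .a file.
    """
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        # Every member header ends with the two bytes "`\n".
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        name = header[0:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        pos += 60 + size
        pos += pos & 1  # member data is padded to an even offset
    return members
```

Each member that is not one of the special entries is an object file, so this gives you the list of translation units that the library contributes.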
Figuring out if a function call is from a lib or not would be tricky, but from what I remember, most static lib calls in code are actually thunk calls (simple jmp calls to the actual object code copied in from the lib) and are small in size (usually around 5 bytes), while "user defined" ones are not thunks, and are bp-based framed calls.
When your program is linked, library functions and user-defined functions are included file by file.
So if you dump the headers of a PE file and look at the symbol table (using objdump -x if you build with mingw32, or any other tool), you will see the name of a file followed by all the functions taken from it, then another file name and its functions, and so on.
If you have debug information, this may be easier.
Once you have associated each function with a file, you can sort the functions by file name: look at the extension (.c / .lib / .a) or check against a list of files you keep somewhere.
Be careful to eliminate the crt0 files...
However, this is kind of a tricky solution and I'm not sure it will work for every program.

Difference between .dll and .exe?

I want to know the exact difference between the dll and exe file.
I don't know why everybody is answering this question in context of .NET. The question was a general one and didn't mention .NET anywhere.
Well, the major differences are:
EXE
An exe always runs in its own address space i.e., It is a separate process.
The purpose of an EXE is to launch a separate application of its own.
DLL
A dll always needs a host exe to run. i.e., it can never run in its own address space.
The purpose of a DLL is to have a collection of methods/classes which can be re-used from some other application.
DLL is Microsoft's implementation of a shared library.
The file format of DLL and exe is essentially the same. Windows recognizes the difference between DLL and EXE through PE Header in the file. For details of PE Header, You can have a look at this Article on MSDN
EXE:
It's an executable file.
When loading an executable, no export is called, but only the module entry point.
When the system launches a new executable, a new process is created.
The entry point is called in the context of the main thread of that process.
DLL:
It's a Dynamic Link Library
There are multiple exported symbols.
The system loads a DLL into the context of an existing process.
For More Details: http://www.c-sharpcorner.com/Interviews/Answer/Answers.aspxQuestionId=1431&MajorCategoryId=1&MinorCategoryId=1
http://wiki.answers.com/Q/What_is_the_difference_between_an_EXE_and_a_DLL
Reference: http://www.dotnetspider.com/forum/34260-What-difference-between-dll-exe.aspx
The difference is that an EXE has an entry point, a "main" method that will run on execution.
The code within a DLL needs to be called from another application.
There are a few more differences regarding the structure you could mention.
Both DLL and EXE share the same file structure - Portable Executable, or PE. To differentiate between the two, one can look in the Characteristics member of IMAGE_FILE_HEADER inside IMAGE_NT_HEADERS. For a DLL, it has the IMAGE_FILE_DLL (0x2000) flag turned on. The IMAGE_FILE_EXECUTABLE_IMAGE (0x2) flag is set for any valid image, DLLs included, so an EXE is an image that has 0x2 set and IMAGE_FILE_DLL clear.
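The Characteristics check can be tried without any Windows headers. Below is a minimal sketch in Python, assuming the file contents are already in memory; the helper names pe_characteristics and is_dll are made up for illustration, while the flag values are the ones from winnt.h.

```python
import struct

IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002  # values from winnt.h
IMAGE_FILE_DLL = 0x2000


def pe_characteristics(data: bytes) -> int:
    """Return IMAGE_FILE_HEADER.Characteristics from raw PE bytes.

    Hypothetical helper for illustration.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C of the DOS header) points at the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Characteristics is the last 2-byte field of the 20-byte
    # IMAGE_FILE_HEADER, which starts right after the 4-byte signature.
    (flags,) = struct.unpack_from("<H", data, e_lfanew + 4 + 18)
    return flags


def is_dll(data: bytes) -> bool:
    """True if the IMAGE_FILE_DLL bit is set, regardless of file extension."""
    return bool(pe_characteristics(data) & IMAGE_FILE_DLL)
```

For example, is_dll(open(path, "rb").read()) should tell a renamed DLL from a real EXE, since only the header bit is consulted.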
PE files consist of some headers and a number of sections. There's usually a section for code, a section for data, a section listing imported functions and a section for resources. Some sections may contain more than one thing. The header also describes a list of data directories that are located in the sections. Those data directories are what enables Windows to find what it needs in the PE. But one type of data directory that an EXE will never have (unless you're building a frankenstein EXE) is the export directory. This is where DLL files have a list of functions they export and can be used by other EXE or DLL files. On the other side, each DLL and EXE has an import directory where it lists the functions and DLL files it requires to run.
Also in the PE headers (IMAGE_OPTIONAL_HEADER) is the ImageBase member. It specifies the virtual address at which the PE assumes it will be loaded. If it is loaded at another address, some pointers could point to the wrong memory. As EXE files are amongst the first to be loaded into their new address space, the Windows loader can assure a constant load address and that's usually 0x00400000. That luxury doesn't exist for a DLL. Two DLL files loaded into the same process can request the same address. This is why a DLL has another data directory called Base Relocation Directory that usually resides in its own section - .reloc. This directory contains a list of places in the DLL that need to be rebased/patched so they'll point to the right memory. Most EXE files don't have this directory, but some old compilers do generate them.
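To check these fields on a real file, here is a small sketch in Python that reads ImageBase and tests whether the export and base relocation directories are present. It assumes the whole file is in memory; the function name optional_header_info and the returned dictionary keys are made up for illustration, while the offsets follow the PE32 and PE32+ layouts of IMAGE_OPTIONAL_HEADER.

```python
import struct

IMAGE_DIRECTORY_ENTRY_EXPORT = 0     # indices from winnt.h
IMAGE_DIRECTORY_ENTRY_BASERELOC = 5


def optional_header_info(data: bytes):
    """Return ImageBase and export/relocation presence from raw PE bytes.

    Hypothetical helper for illustration.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the signature and the 20-byte file header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
        count_off, dirs_off = opt + 92, opt + 96
    elif magic == 0x20B:    # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
        count_off, dirs_off = opt + 108, opt + 112
    else:
        raise ValueError("unknown optional header magic")
    (count,) = struct.unpack_from("<I", data, count_off)

    def directory(index):
        # Each data directory entry is (VirtualAddress, Size), 8 bytes.
        if index >= count:
            return (0, 0)
        return struct.unpack_from("<II", data, dirs_off + 8 * index)

    _, export_size = directory(IMAGE_DIRECTORY_ENTRY_EXPORT)
    _, reloc_size = directory(IMAGE_DIRECTORY_ENTRY_BASERELOC)
    return {
        "image_base": image_base,
        "has_exports": export_size != 0,
        "has_relocations": reloc_size != 0,
    }
```

Running it over a few EXE and DLL files from the same toolchain is a quick way to see how often each kind actually carries exports and relocations.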
You can read more on this topic at MSDN.
This answer was a little more detailed than I thought but read it through.
DLL:
In most cases, a DLL file is a library. There are a couple of types of libraries, dynamic and static - read about the difference. DLL stands for dynamic link library which tells us that it's a part of the program but not the whole thing. It's made of reusable software components (library) which you could use for more than a single program. Bear in mind that it's always possible to use the library source code in many applications using copy-paste, but the idea of a DLL/Static Library is that you could update the code of a library and at the same time update all the applications using it - without compiling.
For example:
Imagine you're creating a Windows GUI component like a Button. In most cases you'd want to re-use the code you've written because it's a complex but common component. You want many applications to use it but you don't want to give them the source code. You can't copy-paste the code for the button in every program, so you decide you want to create a DL-Library (DLL).
This "button" library is required by EXEcutables to run, and without it they will not run because they don't know how to create the button, only how to talk to it.
Likewise, a DLL cannot be executed - run, because it's only a part of the program but doesn't have the information required to create a "process".
EXE:
An executable is the program. It knows how to create a process and how to talk to the DLL. It needs the DLL to create a button, and without it the application doesn't run - ERROR.
hope this helps....
Both DLL and EXE use the Portable Executable (PE) format.
A Dynamic-link library (DLL) is a library and therefore cannot be executed directly. If you try to run it you will get an error about a missing entry point. It needs an entry point (main function) to get executed; that entry point can be provided by any application or exe. DLL binding occurs at run-time. That is why it's called a "Dynamic Link" library.
An Executable (EXE) is a program that can be executed. It has its own entry point. A flag inside the PE header indicates which type of file it is (irrespective of the file extension). The PE header has a field where the entry point for the program resides. In DLLs it isn't used (or at least not as an entry point).
Many tools are available for checking header information. The only difference that makes the two behave differently is the bit in the header, as shown in the diagram below.
An EXE file has only a single main entry point, meaning it is an isolated application; when the system launches an exe, a new process is created. DLLs have many entry points, so when an application uses one, no new process is started. A DLL can be reused and versioned. DLLs reduce storage space, as different programs can use the same dll.
Dll v/s Exe
1) A DLL file is a dynamic link library which can be used by EXE files and other DLL files.
An EXE file is an executable file which runs in a separate process managed by the OS.
2) DLLs are not directly executable. They are separate files containing functions that can be called by programs and other DLLs to perform computations and functions.
An EXE is a program that can be executed, e.g. a Windows program.
3) Reusability
DLL: It can be reused by other applications, as long as the coder knows the names and parameters of the functions and procedures in the DLL file.
EXE: Only for a specific purpose.
4) A DLL shares the process and memory space of the calling application, while an EXE creates its own separate process and memory space.
5) Uses
DLL: You want many applications to use it but you don't want to give them the source code. You can't copy-paste the code for the button in every program, so you decide to create a DL-Library (DLL).
EXE: When we work with project templates like Windows Forms Applications, Console Applications, WPF Applications and Windows Services, they generate an exe assembly when compiled.
6) Similarities:
Both DLL and EXE are binary files that have a complex nested structure defined by the Portable Executable format, and they are not intended to be editable by users.
Two things: the extension and the header flag stored in the file.
Both files are PE files. Both contain the exact same layout. A DLL is a library and therefore can not be executed. If you try to run it you'll get an error about a missing entry point. An EXE is a program that can be executed. It has an entry point. A flag inside the PE header indicates which file type it is (irrelevant of file extension). The PE header has a field where the entry point for the program resides. In DLLs it isn't used (or at least not as an entry point).
One minor difference is that in most cases DLLs have an export section where symbols are exported. EXEs should never have an export section since they aren't libraries but nothing prevents that from happening. The Win32 loader doesn't care either way.
Other than that they are identical. So, in summary, EXEs are executable programs while DLLs are libraries loaded into a process and contain some sort of useful functionality like security, database access or something.
The .exe is the program. The .dll is a library that a .exe (or another .dll) may call into.
What sakthivignesh says can be true in that one .exe can use another as if it were a library, and this is done (for example) with some COM components. In this case, the "slave" .exe is a separate program (strictly speaking, a separate process - perhaps running on a separate machine), but one that accepts and handles requests from other programs/components/whatever.
However, if you just pick a random .exe and .dll from a folder in your Program Files, odds are that COM isn't relevant - they are just a program and its dynamically-linked libraries.
Using Win32 APIs, a program can load and use a DLL using the LoadLibrary and GetProcAddress API functions, IIRC. There were similar functions in Win16.
COM is in many ways an evolution of the DLL idea, originally conceived as the basis for OLE2, whereas .NET is the descendant of COM. DLLs have been around since Windows 1, IIRC. They were originally a way of sharing binary code (particularly system APIs) between multiple running programs in order to minimise memory use.
An EXE is visible to the system as a regular Win32 executable. Its entry
point refers to a small loader which initializes the .NET runtime and tells
it to load and execute the assembly contained in the EXE.
A DLL is visible to the system as a Win32 DLL but most likely without any
entry points. The .NET runtime stores information about the contained
assembly in its own header.
dll is a collection of reusable
functions where as an .exe is an
executable which may call these
functions
An exe is an executable program, whereas a DLL is a file that can be loaded and executed by programs dynamically.
● .exe and .dll are the compiled versions of C# code, which are also called assemblies.
● .exe is a stand-alone executable file, which means it can be executed directly.
● .dll is a reusable component which cannot be executed directly; it requires other programs to execute it.
For those looking for a concise answer:
If an assembly is compiled as a class library and provides types for other assemblies to use, then it has the file extension .dll (dynamic link library), and it cannot be executed standalone.
Likewise, if an assembly is compiled as an application, then it has the file extension .exe (executable) and can be executed standalone. Before .NET Core 3.0, console apps were compiled to .dll files and had to be executed by the dotnet run command or a host executable. - Source
Difference in DLL and EXE:
1) DLL is an In-Process Component which means running in the same memory space as the client process. EXE is an Out-Process Component which means it runs in its own separate memory space.
2) The DLL contains functions and procedures that other programs can use (promotes reusability) while EXE cannot be shared with other programs.
3) DLL cannot be directly executed as they're designed to be loaded and run by other programs. EXE is a program that is executed directly.
The major difference between a DLL and an EXE is that a DLL hasn't got an entry point and an EXE does. If you are familiar with C++, you can see that a built EXE has a main() entry function and a DLL doesn't :)

Resources