Are the names of COFF Data Directories fixed? - windows

I have a PE file (notepad). The NumberOfRvaAndSizes value in the optional header is 0x10, and there are 16 DataDirectory entries, as expected.
The documentation says that this value can change (though I've never seen it), which would mean there could be more or fewer than 16 entries.
Immediately after, there's a list of 16 data directories, complete with names.
Are these names just always the same, in that exact order?
If there are fewer, will it always be whatever directories are at the end that will be missing?
If there are greater than 16, what names are they assigned?

It's always a matter of specification vs. implementation.
Are these names just always the same, in that exact order?
As for the names (I guess you are referring to the section names?), no, they can change. You can name sections whatever you want, although most implementations (i.e. linkers) keep the conventional names (e.g. .reloc for the relocations). The data directory entries themselves carry no name in the file: each one is only an RVA and a size, identified by its index.
The order is fixed; you can refer to them by their numbers.
If there are fewer, will it always be whatever directories are at the end that will be missing?
I'm not sure a valid PE (one that can be loaded by an actual supported system) can have fewer than 16 data directories. It might be possible, though, as the location of the section headers is probably calculated using FILE_HEADER.SizeOfOptionalHeader.
The reference implementation for loading a PE file (the Windows loader) is not open source, so it's not easy to answer this question.
My guess is that it could work: it's like trying to load a Win2K PE on a Windows 10 system (given that it imports functions that are still present on Windows 10). It would be as if the CLR data directory were just not there.
If there are greater than 16, what names are they assigned?
You can't have more than 16 data directories because the maximum number is 16. I'm pretty sure the Windows loader would not load a PE file with more than 16 data directories.
The documentation says that this value can change (though I've never seen it), which would mean there could be more or fewer than 16 entries.
The number is fixed at 16 right now. For example, the last addition was the CLR data directory, which was added to load the CLR with the introduction of .NET. Before that, the number was 15, so yes, the value can change and will not always be 16, but this doesn't mean it changes between PEs. What I mean is that, at a given time, for a supported system, all PEs will have the same number of data directories.
My guess is that, at the time of the introduction of .NET (with the CLR data directory), there were PEs with 15 data directories and others with 16. The Windows loader was probably patched to account for the two different numbers. Right now it is probable that the number is fixed at 16.
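To make "refer to them by their numbers" concrete, here is a minimal Python sketch that reads NumberOfRvaAndSizes and pairs each entry with its conventional name. The function name read_data_directories is made up for illustration. The name list and the field offsets (92 for PE32, 108 for PE32+) are taken from the PE/COFF specification. What a real loader does with a count other than 16 is not covered here, so the handling of extra entries is only an assumption.

```python
import struct

# Conventional names for indices 0..15, in specification order.
# These names are NOT stored in the file; only the index identifies an entry.
DATA_DIRECTORY_NAMES = [
    "Export Table", "Import Table", "Resource Table", "Exception Table",
    "Certificate Table", "Base Relocation Table", "Debug", "Architecture",
    "Global Ptr", "TLS Table", "Load Config Table", "Bound Import",
    "IAT", "Delay Import Descriptor", "CLR Runtime Header", "Reserved",
]


def read_data_directories(image: bytes):
    """Return (index, name, rva, size) for every data directory entry."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = pe_offset + 4 + 20  # optional header follows the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic == 0x10B:        # PE32
        count_offset = opt + 92
    elif magic == 0x20B:      # PE32+
        count_offset = opt + 108
    else:
        raise ValueError("unknown optional header magic")
    (count,) = struct.unpack_from("<I", image, count_offset)  # NumberOfRvaAndSizes
    entries = []
    for i in range(count):
        rva, size = struct.unpack_from("<II", image, count_offset + 4 + 8 * i)
        # Assumption: entries past index 15 have no documented name.
        name = DATA_DIRECTORY_NAMES[i] if i < 16 else f"(unnamed #{i})"
        entries.append((i, name, rva, size))
    return entries
```

Calling it on the bytes of an executable (for example open(path, "rb").read(), with path pointing at any PE file you have) gives one tuple per entry. With fewer than 16 entries the list simply stops early, which matches the idea that the missing directories would be the ones at the end.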

Related

Visual Studio embed large resource file (almost 4gb)

I am trying to embed a large resource file (almost 4 GB); it's a .dat file. However, I am running into an issue where it throws an error:
"Error reading resource 'Sx64.x-none.dat' -- 'Specified argument was out of the range of valid values.'"
It appears there is a limit on the size of an embedded resource in Visual Studio. Is there a way to increase the maximum size, or some other workaround? I am trying not to use a linked resource or have another file copied around with the exe.
While in the PE format specification the SizeOfImage value is a 32-bit unsigned integer and can theoretically handle up to 4 GiB, in practice the limit for an executable file is lower. Another Stack Overflow user has tested this behavior. However, it is still possible to make a larger executable that works (on 64-bit Windows only), but the data must be kept outside the image sections, appended at the end of the file, so the loader won't attempt to map it. This is bad practice, and I suggest, as others did in the comments, shipping the data in a separate file alongside your executable.
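For illustration only, here is a rough Python sketch of how data appended after the image sections could be located: it walks the section table and returns the file offset where the last section's raw data ends. The function name overlay_offset is hypothetical, and the sketch assumes nothing else (such as a certificate table from code signing) follows the sections.

```python
import struct


def overlay_offset(image: bytes) -> int:
    """File offset just past the last section's raw data."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe,) = struct.unpack_from("<I", image, 0x3C)          # e_lfanew
    if image[pe:pe + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", image, pe + 6)
    (opt_size,) = struct.unpack_from("<H", image, pe + 20)  # SizeOfOptionalHeader
    table = pe + 24 + opt_size                              # first section header
    end = table + 40 * num_sections                         # at least past the headers
    for i in range(num_sections):
        # SizeOfRawData is at offset 16 and PointerToRawData at offset 20
        # within each 40-byte section header.
        raw_size, raw_ptr = struct.unpack_from("<II", image, table + 40 * i + 16)
        if raw_size:
            end = max(end, raw_ptr + raw_size)
    return end
```

Everything from that offset to the end of the file is not described by any section, so the loader does not map it; a program would have to open its own file and read from there. This does not make the approach any less fragile than the answer says.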

JVM - bytecode content difference after compiling

I recently saw a behaviour which made me ask this on SO. I was hoping that people would be able to share their findings too.
Would a class file (bytecode) be different if the same unchanged source file is compiled using JDK 1.8 u66 and JDK 1.8 u121? What I mean is the following:
1) I compile an application using JDK 1.8 u66
2) I make changes to 1 or 2 files and recompile using JDK 1.8 u66.
Could I expect some of the unchanged class files to have different binary content even though they haven't changed?
My reason for asking is that when I took a hash of a file which wasn't changed as part of my steps above, the two versions had the same size on disk, but the hash was totally different. I used WinMerge to compare the two versions: the size was reported as identical, but the binary contents differed. The following is what I compared using WinMerge (the blue-marked item was related to my source name, so I had to mask it out); please observe the difference at 208 and 248.
Is this expected? If so, could someone please point me to the literature which explains this?
Countless reasons exist why the same Java source file may be compiled to different bytes by different compilers, and different versions of the same compiler should indeed be seen as different compilers. Even for the exact same compiler there is no guarantee that the bytes are identical.
One such reason is that all references in the code (other than opcodes and bytecode offsets) are indirected through the constant pool. The order of entries in the constant pool is not specified, and hence it may change, leading to all references using a different index.
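As a rough way to check whether two class files differ only in constant pool layout, the following Python sketch extracts the Utf8 entries of the constant pool so they can be compared as sorted lists instead of byte by byte. The function name constant_pool_utf8 is made up for this example. The tag sizes follow the class file format in the JVMS, and the sketch does not look at anything beyond the constant pool.

```python
import struct

# Payload sizes in bytes for the fixed-size constant pool tags.
FIXED_SIZES = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
               12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}


def constant_pool_utf8(data: bytes):
    """Return (major_version, list of raw Utf8 constants in file order)."""
    magic, _minor, major, count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file")
    pos = 10          # first constant pool entry
    index = 1         # constant pool indices start at 1
    strings = []
    while index < count:
        tag = data[pos]
        pos += 1
        if tag == 1:  # CONSTANT_Utf8: u2 length followed by the bytes
            (length,) = struct.unpack_from(">H", data, pos)
            strings.append(data[pos + 2:pos + 2 + length])
            pos += 2 + length
        elif tag in FIXED_SIZES:
            pos += FIXED_SIZES[tag]
        else:
            raise ValueError(f"unknown constant pool tag {tag}")
        # Long and Double constants occupy two constant pool slots.
        index += 2 if tag in (5, 6) else 1
    return major, strings
```

If sorted(strings) is equal for both files while the files themselves differ, reordered constant pool entries are a plausible explanation for a same-size, different-hash result like the one in the question. It is a hint, not a proof, since other parts of the file are not compared.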
See also that JVMS has a section titled Compiling for the Java Virtual Machine, which, however, starts by saying:
The numbered sections in this chapter are not normative
As a result, reasoning works only in one direction: same bytes implies same source code, but different bytes doesn't necessarily imply different source code.
JDK-8067422, as linked from one comment, gives an example where even the same compiler can produce different bytes for the same source file (perhaps due to a different set of source files being compiled in the same compiler invocation). As per the JLS and JVMS this is legal, just inconvenient.

How can an executable be this small in file size?

I've been generating payloads with Metasploit and experimenting with the different templates. One of the templates you can use for your payload is exe-small. The payload I've been generating is windows/meterpreter/reverse_tcp. With the normal exe template it has a file size of around 72 KB, whereas exe-small outputs a payload of 2.4 KB. Why is this? And how could I apply this to my programming?
The smallest possible PE file is just 97 bytes - and it does nothing (just return).
The smallest runnable executable today is 133 bytes, because Windows requires kernel32 to be loaded. Executing a PE file with no imports is not possible.
At that size it can already download a payload from the Internet by specifying a UNC path in the import table.
To achieve such a small executable, you have to:
- implement it in assembler, mainly to get rid of the C runtime
- decrease the file alignment, which is 512 by default according to the PE specification
- remove the DOS stub that prints the message "This program cannot be run in DOS mode"
- merge some of the PE parts into the MZ header
- remove the data directory
The full description is available in a larger research blog post called TinyPE.
For EXEs this small, most of the space is typically used by the icon. The icon usually contains several sizes and color depths, which you could get rid of if you do not mind having an "old, rusty" icon, or no icon at all.
Signing the EXE also uses some 4k of space.
As an example for a small EXE, see never10 by grc. There is a details page which highlights the above points:
https://www.grc.com/never10/details.htm
in the last paragraph:
A final note: I'm a bit annoyed that “Never10” is as large as it is at
85 kbyte. The digital signature increases the application's size by
4k, but the high-resolution and high-color icons Microsoft now
requires takes up 56k! So without all that annoying overhead, the app
would be a respectable 25k. And, yes, of course I wrote it in
assembly language.
Disclaimer: I am not affiliated with grc in any way.
There is little need for an executable to be big, except when it contains what I call code spam: code not actually critical to the functionality of the program/exe. This is valid for other files too. Compare a manually written HTML page with one written in FrontPage. That's spam code.
I remember my good old DOS files that were all a few KB in size and performed practically any needed task in the OS. One of my .exes (actually a .com) was only 20 bytes in size.
Just think of it this way: just as in some situations a large majority of the files in a Windows installation can be removed and the OS can still function perfectly, it's the same with .exe files: large parts of the code are either useless, serve a purpose unrelated to the program's objective, or are added intentionally (see below).
The peak of this aberration is the code added nowadays to the .exe files of some games that use advanced copy protection, which can make the files as large as dozens of MB. The code actually needed to run the game is practically under 10% of the full code.
A file size of 72 KB, as in your example, can be quite sufficient to do practically anything to a Windows OS.
To apply this to your programming, that is, to make very small .exes, keep things simple. Don't add unnecessary code just for the looks of it, or because you think you will use that part of the program at some point.

Is dual mode executable possible?

A bit of history... I have 3 systems that I spend time on, a DOS 6.22 system, a Windows 95 system, and a modern Windows 7 (64-bit) system. When I upgraded to Win7-64, some of my favorite command line utilities stopped working, so I decided to re-write them myself. The only 2 compilers I have are Borland Turbo C++ 3.0 and Visual Studio 2008, and they worked fine for building 2 versions, a DOS 16-bit, and a Windows 7 32-bit (could have built 64-bit too, I guess.) The problem came with my Win95 system. The DOS version works fine there, but since I spent the time to support LFNs in the Win7 build, I wanted it with my Win95 system. So, after a lot of research, I found and purchased Visual Studio 6 (last one with Win95 support according to what I researched,) copied the code over (had to rewrite sections, of course,) and it compiled just fine, and works :)
The problem occurred the next time I had to boot my Win95 system in DOS mode. The program stopped working (of course,) because Win95 wasn't loaded. I don't really want to have 2 copies of the program installed (needing 2 different file names,) so I was hoping there was a way to link the 2 versions together into one file. If I execute it in DOS, instead of it saying it requires windows, it would just jump to the DOS section of the program. That way, it would be a single program, with LFN support if Win95 is loaded, and without if Win95 isn't loaded. Since the Win95 version also works fine in Win7-64, it would probably also produce a single version that works on all 3 systems (which would be an added bonus.)
I did some web searches, and couldn't find anything germane to what I'm looking for. So I have no idea if it is even possible. I may have to get yet another compiler, but considering how old it would have to be, I could probably afford it. My web searches did result in information that leads me to believe that it "should" be possible, though. It would just require a different exe header than the one Windows compilers put in. It may require that I re-write the DOS version for 32-bit and use a DOS extender (for protected mode, assuming I can't find a way to include it in the file itself.) That would be acceptable (though not ideal.) I would much rather have 16-bit code in the DOS section, and 32-bit code in the Windows section (for the most compatibility.)
Does anyone have any information about something like this? If you could just point me in the right direction it would be greatly appreciated.
I don't know if it has been continued in Windows 7 executables, but back in Win95 the executable (EXE) actually had two entry points -- one "normal" one that DOS would find, and a second one that Windows would use. The DOS entry point was usually a very simple default that would just print "This is a Windows program" and exit. You can actually override this default, and have the linker use your own code, however it is very limited.
What I'd recommend doing is add logic to your DOS 6.22 version (e.g. "sed") that would check the OS level & if it meets the right criteria, pass the parameters along to a second executable (e.g. "sedx") that uses features from the "newer" OS.
The documentation for Visual Studio 6 describes the /STUB option here, simply point this at the DOS version of your program.
I don't have VS6 handy, so I can't be too specific, but in the project settings GUI, there should be an "additional options" setting in the linker section.
Well the answer is the /stub option in the Linker you are using for your Windows code. Some additional information for anyone who finds the question later.... I had to do several days of web searches to find that there doesn't appear to be another answer to my particular problem.
/STUB requires that the DOS mode executable have a header of at least 40 hex (64) bytes. After fighting with multiple compilers that "DO" give you a header of the right size (Borland Turbo C++ won't), and not being able to convert my code, I had to get sneaky/fancy. BTW, Visual C 1.52c (the last Visual C that supports DOS) will make a correct header, as will Open Watcom.
If you are faced with the same issue I was (the compiler you used won't make a header of the correct size, and your code is too compiler-specific to convert easily), you can do what I ended up doing. I used Open Watcom to write a tiny ("Hello World") Windows program using my exe with the short (Borland-created) header as the stub. Open Watcom will adjust the header automatically. I then used a hex editor to read the header information to get the ending address of the stub, and a partial file copier to copy only that part of the program to a file I named "stub.exe" (stripping off the Windows code). Using the same hex editor I zeroed out the PE pointer in the header. I now had a working DOS exe that would also work as a stub. I took my stub to my Windows compiler and linked it in. It works great, all features fully realized :)
FYI: information needed to strip the Windows portion and zero the PE pointer.
- The first byte is offset 0 (of course, but some people may not realize that and think it's byte 1). Also remember that most hex editors (by their very name) give you numbers in hexadecimal.
- Offsets 2 and 3: the number of bytes in the last block of the DOS portion of the file, in low byte, high byte format. That is, offset 2 is the low byte and offset 3 is the high byte. Take them, reverse them, and you get a number from 0 to 511 (0 to 1FF hex). 0 means the entire 512-byte (200 hex) block is used.
- Offsets 4 and 5 (again low/high): the number of 512-byte (200 hex) blocks in the DOS portion. Remember to reverse the bytes, and that the last block may be only a partial block. So subtract one, multiply by 512 (200 hex), and add the number from offsets 2-3 to get how many bytes are in the DOS portion. (If the number at offsets 2-3 is 0, the last block is full, so skip the subtraction and just multiply by 512.) Since you are counting from 0, subtract 1 from that total; you then know to copy only bytes 0 through that offset to your stub exe.
- Offsets 60-63 (hex 3C-3F): the pointer to the start of the PE (Portable Executable) portion of the file (the part that Windows uses). It is a 4-byte value, low byte first. It should be just past the end of the DOS portion (mine was padded with a few zeroes). Its value isn't important at this point, as we are turning those bytes into 0s anyway (the PE portion has been stripped). You can use it as confirmation that you have selected the correct "end of DOS" offset, though.
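The arithmetic above can also be scripted instead of done by hand in a hex editor. The following Python sketch is only an illustration of those steps, not the procedure actually used here; the function names dos_portion_size and extract_stub are made up.

```python
import struct

BLOCK = 512  # size of one block counted by the MZ header (200 hex)


def dos_portion_size(exe: bytes) -> int:
    """Length in bytes of the DOS portion, from MZ header offsets 2-3 and 4-5."""
    if exe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # Both fields are 16-bit little-endian ("low byte, high byte").
    last_block_bytes, blocks = struct.unpack_from("<HH", exe, 2)
    if last_block_bytes == 0:       # 0 means the last block is completely used
        return blocks * BLOCK
    return (blocks - 1) * BLOCK + last_block_bytes


def extract_stub(exe: bytes) -> bytes:
    """Copy only the DOS portion and zero the PE pointer at 3C-3F hex."""
    size = dos_portion_size(exe)
    if size < 0x40:
        raise ValueError("header too short to hold the PE pointer")
    stub = bytearray(exe[:size])    # bytes 0 .. size-1
    stub[0x3C:0x40] = b"\0\0\0\0"   # the PE portion is gone, so clear the pointer
    return bytes(stub)
```

For example, a header with 3 blocks and 90 hex bytes in the last block gives (3 - 1) * 512 + 144 = 1168 bytes, so bytes 0 through 1167 are copied. Writing the result of extract_stub to a file such as stub.exe would correspond to the hex editor and partial copy steps.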
The tools I used are:
Open WatCom at http://www.openwatcom.org/index.php/Main_Page
and
Part Copy at http://www.virtualobjectives.com.au/utilitiesprogs/partcopy.htm
I have no idea where to find the Hex Editor I used. I used CEdit, a DOS program I really like, but have been unable to find on the net. Have to use DOSBox with it as Win7 won't run it, though. There are probably other compilers that do the same thing, and probably tons of partial file copiers available. These are the tools I used.

How can I find the physical address of a file?

I'm using the GoAsm assembler on a Windows 7 - 64 bit OS and I'll be asking you a few (not so dumb) questions.
First question :
How can I find the physical address of a file ?
Let's suppose file "Text.txt" is at the root of my C:\ partition.
Is there a way to get the exact memory address where this file is ?
Second question :
Is it possible to call a routine which does the same thing as if I had invoked a C function ?
(i.e. : Consider a C function "WriteToScreen"; is it possible to have the same function, but in assembler format, that is, without needing high-level invokes to do that work ?)
Third question :
Are there somewhere on the net some include files for GoAsm containing useful routines like (move, copy, edit, erase) commands ? I first thought of MS-DOS interrupts, but I can't get them to work without crashing the program. I guess they are just not compatible with Windows, even though the command prompt acts like MS-DOS... ?
Fourth question :
I've heard from different sources, and seen for myself, that NASM works pretty badly on Win7 x64. Is that true, or am I doing it the wrong way ?
1
A hard drive, from a logical point of view, can be seen as a sequence of "blocks" (the more common name is sectors). How these blocks are organized physically on the disk can be disregarded. The driver must of course know how to get at the data, but you send a modern hard drive "high level" commands that, as far as you know, are not strongly related to where the data physically is (you can say "read block 123", but there's no external evidence of where that block lives).
This way, however, you can "name" a block with a number and say, e.g., that block 0 is the MBR. Each block contains a number of bytes (512, 1024...). Not all used blocks contain the actual data of a file; there is meta-information of all sorts, depending on the filesystem, but also related to the "structure" of the drive (I mean partitions).
A file located on a hard drive is not automatically loaded into memory, so it has no memory address. Once you read it, pieces of it, if not all of it, are of course copied into the memory you provide, but that address is not an intrinsic property of the file. (Filesystems retrieve the blocks belonging to the file and "show" them as we are used to seeing them, as a single "unit", the file.)
Summarizing: files have no memory address. The physical address could be the set of blocks holding the data (and metadata, like inodes) of the file, or just the first block (but if a block of data is N, N+1 might not belong to the same file; the blocks need not be next to one another). To know them, you have to analyse the structure of the filesystem you use. I don't know if there's an API to retrieve them easily, but in the worst case you can analyse the source code of the filesystem... good luck!
2
C functions are translated into assembly. If you respect the C calling convention, you can write a "C function" directly in assembly. Try reading this and this for x86.
3
You can call the Windows API from asm. Forget MS-DOS: MS-DOS is dead, MS-DOS is not Windows, and cmd is a sort of "emulation"... indeed no, not an emulation, just a command line interface that resembles the one MS-DOS users were used to. But it is not exactly the same, i.e. there are no MS-DOS system interrupts you can use. Iczelion's assembly tutorials, though old, could be an interesting resource. (If the links expire, try the Wayback Machine.)
4
I do not own Win7 and have never installed NASM on Windows, so I can't say anything about it.
For the first question just drag the file into the address bar in the browser

Resources